Introduction

The final phase for generating Synthetic Data is to run the .vip file with the associated configuration spreadsheet that was created previously and then following the steps below.  If you haven't started at the beginning, this is the link to begin: start at the beginning of Synthetic Test Data Generation. 

At this point, you should have already Created/Associated the Data Generation Configuration sheet with the associated .vip file.  If not, please see previous article.

  • Open VIP and create a new flow

  • Save the file (as filename.vip) in the location of the Configuration Sheet and the other relevant files.

  • Go To Tools->Accelerators->DataGeneration

  • Browse to the Data Generation Configuration sheet, Click Process (Make sure the Configuration sheet is closed before doing this).  The Configuration sheet is read which generates the flow to generate the data.

Set Argument Values

The Arguments are created by the user when the Database is created and should follow VB.Net naming conventions.

  • Once the flow is generated, you can look through the arguments section by selecting Arguments on the bottom of the screen.

    • Example (see image below): If we want to generate 2 Customers, we set paritem_Customer_ to 2 and if we want to generate 2 Products, we set paritem_Product to 2 and if we want to generate 2 Orders we set value paritem_Order_ to 2 and if we want to use 3 OrderItems, we set the value of paritem_OrderItem to 3.

Full List of Application SampleCommerce Arguments here:

Execute the flow

This section explains how to execute the VIP Flow which uses the Configuration spreadsheet to Generate Synthetic Data. [If you haven't Configured the Configuration Control sheet already, see section on Configuring the Configuration spreadsheet.] This process is described with regard to the SampleCommerce Database.

Examine the SampleCommerce Database before beginning

Before we execute this flow, let's look back at our database using the SQL Server Management tool, so that we can compare the added data after we execute our flow.

We can see in the screenshot of the first database query below that there are:

  • 27 Suppliers

  • 39 Products

If we look at the screenshot of the second database query below, we can also see that there are:

  • 32 Customers

  • 46 Orders

  • 45 Total Amounts

Next we Execute the data generation in VIP by going to menu Action->Execute in VIP.

Examine the Database for New Entries

 If we look again at our database, we can see the added database items. There is now 1 added Supplier and 2 additional products:

  • 28 Suppliers

  • 41 Products

If we compare the output of the second database query below, we can see that we have 2 new Customers, 2 new Orders and 4((2 new Orders)*(2 OrderItems)) new TotalAmounts which is what we specified in the Argument section of our flow.

  • 36 Customers

  • 50 Orders

  • 49 Total Amounts

If we look at the last 4 elements in the Order section, we can see that the last column has the same figure as the last 4 entries in the TotalAmount section thanks to our EventHook creation and specification. This shows how we used our business logic to enhance the test data that we wanted to create.

Eureka! We have now completed the process of Synthetic Test Data Generation.

Summary of the Data Generation Execution Process

  1. Look at the (excel) Database Data using SQL Database Manager before executing the flow

  2. Open the newly created Flow in VIP

  3. Go to Tools–>Accelerators–>Data Generation

  4. Execute Flow in VIP (the one that uses/(associated with) the Configuration spreadsheet)

  5. Look at the Database Data again using SQL Database Manager to confirm that the new Data was generated.


Fixing Errors and Adding Parameters

Adding Parameters

It is possible to add VIP Arguments/Values by opening  VIP, to identify which Argument you would like to add or change.  Then open a text editor such as Notepad++. You can then add the Argument details in the file.

Example  of entering an Argument into a file:

This text file should be saved as XML. You can store multiple Arguments in the same file.

Parameters are also used to control Variables.

Locating Errors in VIP Flows and side-by-side in the Control Spreadsheet

In general, an Error in the Flow is indicated by a red circle with an exclamation mark in it.

Fixing Error with Country

This Error is caused because a Table Name and a Column Name in the Configuration sheet have the same name.  This is not allowed.  In order to fix the problem an '_' is added to the variable name.

Fix

Open the Supplier Sheet in the Configuration Spreadsheet

Change  .Country should to  ._Country

Fixing an IF Expression in VIP and the Control Spreadsheet

Using a Function Parameter

Whenever an 'IF' Expression is used, the Value must be Typecast.

Fix

Check the Type using the Expression Editor in VIP.

Example: We see that the Type is set to 'Boolean'

It is necessary to Cast the InArgument using CTYPE((…………), Boolean)

Also be sure to put the above Type Casting statement in the Product sheet, FunctionParameters Column of the Configuration spreadsheet:

Product IsDiscontinued "CTYPE((……), Boolean)"

Using a Wrap Expression to change Values after Generation

This demonstrates how to use a Wrap Expression to generate new data based on previously generated data.

Open Order sheet and look at the WRAP_IN_EXPRESSION Column

add:

CTYPE(VALUE, Boolean)  statement

In the In the (IF) Function Parameter Line add the IF statement as above (fixing an IF Expression)

This will have the same effect as in the previous section.

Updating the VIP flow to implement the fixes

In VIP, Save the Flow by going to File->Save and Override the Flow that you have.

Next, Open a blank Flow by going to

Tools–>Accelorator–>DataGeneration–>Browse

Select the Spreadsheet and Process

This may take a bit of time.


Using a Parameter Spreadsheet to Specify new Parameters and run a new Generation

Create a new spreadsheet containing the new parameters and values. Here is an example:

  • Generate Data by running the VIP Flow without the added/changed Data.

  • Look at the results (without changes) in the Database Manager such as SQL Manager.

  • Change/add any parameters you would like in the Configuration spreadsheet.

  • Specify the Parameter sheet name in the VIP Flow

  • Execute the VIP Flow with the new added Parameters.

This process allows the user to:

  • Control how many rows they want to generate based on the Parameters file

  • Control Arguments to pass using the Parameters sheet

  • Parameters can be exposed and a form can be created in the VIP Modeller which can invoke the underlying Flow to Generate the Data.

Example:  If we had to create Random Order Items for each Order then we can override our parItem_OrderItem_variable and get a Random integer between 1 and 10 and tell it to re-evaluate itself after each Order is generated, so that when an Order is generated it will re-evaluate that value and use a random value for OrderItem.


Extras - Repeater and BulkCopy

  • Using Repeater

parRepeater Allows the user to store parameter CSVs like 1.csv, 2.csv……up to [parRepeater].csv, then in the flow individual parameters can be read using parCurrentRepeater & ".csv" which will change the underlying parameters. So it offers flexibility to control quite a few things – not only parameters. This is one example but it's up to user how they would like to use it. DataType is Boolean.
parParametersFile is the parameter which will be modified in EventHook Assign by overriding it's value to be evaluated based on parCurrentRepeater.

  • Bulk copy for SQL SERVER

Bulk copy is fast load of synthetically generated data. So when you want to load large volume of synthetic data it's good however it's only available for SQL Server. Also user loses the ability to perform action on individual row that is being generated so BCP has it's own limitation. DataType is Integer.

TEST DATA AUTOMATION SYNTHETIC TEST DATA GENERATION