Data Subsetting from Test Data Automation reads data from a source database and inserts it into a staging database. The connection details for the Subset are first provided in a Basic Control Spreadsheet, along with the basic Subset Criteria.

This Basic Control Spreadsheet defines a Basic Subset Job. An automated VIP workflow performs the Actions involved in a Basic Subset Job: it creates the staging database and any schema(s), and then executes the Basic Subset. "Soft" Keys, Primary or Foreign, can also be defined manually at this point, along with any Found Criteria.

The Basic Subset Job generates a set of Subset Results, which are populated into an Advanced Control Spreadsheet. The Advanced Control Spreadsheet contains the Basic Subset Criteria, as well as Subset Rules and Additional Parameters that can be toggled on or off. These additional parameters allow you to define an Advanced Subset Job. Herein lies the power of Data Subsetting with Test Data Automation: from a small number of user-defined parameters, it models the database and provides a set of rules for building coherent Data Subsets that reflect the relationships in the source database. This significantly reduces the complexity and manual effort associated with subsetting.

Each time a new Action is run, it overwrites any relevant information in the Control Spreadsheet. This process allows you to define a data model and Subset Criteria rapidly and iteratively. You can work towards the ideal data set, or iteratively gain a greater understanding of your data.

The Control Spreadsheet and VIP Server Controller are used to run and refine a Subset iteratively, until you are happy with its results. Once the Subset Job has been defined, the Actions and associated variables can be exposed in self-service web forms. This enables other users to trigger the Subset from a simple, self-service web portal.

Note: You can also parameterize Actions in a Command Script and execute them from the command line rather than from the VIP Server Controller. For this, a standard set of editable .CMD scripts is shipped with the VIP Server Controller installation.

During a Subset, the automation issues SQL as it crawls from table to table. Recursion means that the Subset returns to the beginning and crawls again, repeating until the completion criteria are fulfilled.

Recursion and database "Crawling" mean that the subsetted data retains the referential integrity of the original data: the automation pulls in all related data needed for a complete Subset, moving "up" to parent tables and "down" to child tables. If too much data is produced, Subset Rules and tables can be toggled on or off in the Advanced Control Spreadsheet.
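
The crawl described above can be illustrated with a minimal Python sketch. This is not the Test Data Automation implementation; it only shows the general idea of following foreign keys "up" and "down" repeatedly until a pass adds no new rows. The table names, the FOREIGN_KEYS list, and the functions build_demo_db and crawl_subset are all hypothetical, and an in-memory SQLite database stands in for the source database.

```python
import sqlite3

# Hypothetical foreign keys: (child_table, child_column, parent_table, parent_column).
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
]


def build_demo_db():
    """Create a small in-memory source database (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
        INSERT INTO order_items VALUES (100, 10), (101, 11), (102, 12);
        """
    )
    return conn


def _rowids(conn, sql, params=()):
    """Run a query and return the first column of every row as a set."""
    return {row[0] for row in conn.execute(sql, params)}


def _related(conn, target, target_col, source, source_col, source_ids):
    """Rows of `target` whose key matches a selected row of `source`."""
    if not source_ids:
        return set()
    marks = ",".join("?" for _ in source_ids)
    sql = (
        f"SELECT rowid FROM {target} WHERE {target_col} IN "
        f"(SELECT {source_col} FROM {source} WHERE rowid IN ({marks}))"
    )
    return _rowids(conn, sql, sorted(source_ids))


def crawl_subset(conn, seed_table, seed_where, foreign_keys):
    """Return {table: set of rowids} reachable from the seed criteria.

    Identifiers and the seed criteria are interpolated into the SQL, so this
    sketch assumes they come from a trusted definition, not from user input.
    """
    selected = {
        seed_table: _rowids(conn, f"SELECT rowid FROM {seed_table} WHERE {seed_where}")
    }
    changed = True
    while changed:  # completion criterion: a full pass that adds no new rows
        changed = False
        for child, ccol, parent, pcol in foreign_keys:
            child_ids = set(selected.get(child, set()))
            parent_ids = set(selected.get(parent, set()))
            # "Up": parents referenced by the selected child rows.
            up = _related(conn, parent, pcol, child, ccol, child_ids)
            # "Down": children that reference the selected parent rows.
            down = _related(conn, child, ccol, parent, pcol, parent_ids)
            for table, found in ((parent, up), (child, down)):
                current = selected.setdefault(table, set())
                if not found <= current:
                    current |= found
                    changed = True
    return selected


if __name__ == "__main__":
    subset = crawl_subset(build_demo_db(), "customers", "id = 1", FOREIGN_KEYS)
    for table, ids in subset.items():
        print(table, sorted(ids))
```

In this sketch, seeding on one customer pulls in that customer's orders and order items, so no selected row is left pointing at a missing parent. It also shows why a Subset can grow: every additional relationship gives the crawl another path to follow, which is where toggling rules and tables off becomes useful.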