This documentation has been updated - please see: Subsetting (VIP version) - Process Overview

A Basic Subset is run using a series of pre-defined, automated Actions. These Actions are informed by the Basic Control Spreadsheet and are executed either using the VIP Server Controller or the Command Line. An overview of both methodologies was provided in this subsection of the Knowledge Base.

This subsection provides instructions on running the individual Actions, set out in the order in which they are run to perform a Basic Subset. Each article provides instructions for running the Action using the VIP Server Controller, which is the recommended approach. An appendix to each article provides instructions for running the Action from the Command Line.

The actions to run the Basic Subset are:

  1. The GETMETADATA Action retrieves the metadata from the Source Data that is needed to run the Subset. It is a composite action, made up of three actions. The three actions can be run together as a single action, "GETMETADATA", or can be run separately. GETMETADATA is recommended for simplicity and speed. Running the actions individually is valuable for closer analysis, learning, and debugging.

    1. The TABLES Action retrieves metadata related to the tables in the Source Database.

    2. The GETKEYS Action retrieves key metadata from the Source Database.

    3. The FINDIDENTITYCOLUMNS Action retrieves the Identity Columns for each specified table.

  2. The PREPENV Action creates tables and indexes in the Staging Database.

  3. The BUILDMODEL Action creates the rules to drive the Subset.

  4. The SUBSET Action writes data to the Staging Database.
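The ordering above, including the choice between the composite GETMETADATA and its three constituent actions, can be sketched in a few lines of Python. This is only an illustrative model of the sequence described in this article: the names `COMPOSITE_ACTIONS`, `BASIC_SUBSET_ORDER`, and `expand_actions` are hypothetical and are not part of VIP. It does not invoke the VIP Server Controller or the Command Line.

```python
# Illustrative model only: these structures are assumptions made for this
# sketch and do not correspond to any VIP API or file format.
COMPOSITE_ACTIONS = {
    "GETMETADATA": ["TABLES", "GETKEYS", "FINDIDENTITYCOLUMNS"],
}

# The four top-level actions, in the order given in this article.
BASIC_SUBSET_ORDER = ["GETMETADATA", "PREPENV", "BUILDMODEL", "SUBSET"]


def expand_actions(order, run_individually=False):
    """Return the actions to run, in order.

    When run_individually is True, each composite action is replaced by
    its constituent actions; otherwise the composite is kept as one step.
    """
    steps = []
    for action in order:
        if run_individually and action in COMPOSITE_ACTIONS:
            steps.extend(COMPOSITE_ACTIONS[action])
        else:
            steps.append(action)
    return steps


if __name__ == "__main__":
    # Composite run: four steps.
    print(expand_actions(BASIC_SUBSET_ORDER))
    # Individual run: six steps, useful for closer analysis and debugging.
    print(expand_actions(BASIC_SUBSET_ORDER, run_individually=True))
```

Either list preserves the same overall order; only the granularity of the metadata step changes.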

The recommended approach for running these actions is as follows:

  1. Configure the Basic Control Spreadsheet, as set out in the previous subsection.

  2. Parameterize GETMETADATA in the VIP Server Controller. You can use the default "Executor.cfg" as a quick starting point for this.

  3. Save the parameterized action as a Config File.

  4. Re-use the Config File to run the remaining Basic Subset Actions. You can re-use the Config File for each, often changing only one mandatory parameter each time (the Action).
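The idea of re-using one saved configuration and changing only the Action can be sketched as follows. The dictionary here is a stand-in for a Config File, not its real format: every key and value in `base_config` (including the file name and connection names) is an assumption invented for illustration, as is the helper `config_for`.

```python
from copy import deepcopy

# Stand-in for a saved Config File. All keys and values are hypothetical;
# the real "Executor.cfg" format is not shown in this article.
base_config = {
    "action": "GETMETADATA",
    "control_spreadsheet": "BasicControl.xlsx",  # hypothetical file name
    "source_connection": "SourceDB",             # hypothetical connection name
    "staging_connection": "StagingDB",           # hypothetical connection name
}


def config_for(action, base):
    """Copy the base configuration, changing only the action to run."""
    cfg = deepcopy(base)
    cfg["action"] = action
    return cfg


# The remaining Basic Subset Actions, in the order they are run.
remaining = ["PREPENV", "BUILDMODEL", "SUBSET"]
configs = [config_for(action, base_config) for action in remaining]

for cfg in configs:
    print(cfg["action"], cfg["control_spreadsheet"])
```

Copying the base rather than editing it in place keeps the original GETMETADATA configuration available if that action needs to be re-run.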

Run these actions in order, using either the composite GETMETADATA or the three individual actions that constitute it. Together, the actions auto-populate additional sheets in the Control Spreadsheet, creating an Advanced Control Spreadsheet. This includes the Subset Rules needed to produce coherent Subsets of inter-related data. The rules are formulated automatically.

The Subset Rules can then be toggled on/off. The relationships that will be fulfilled by the Data Subset can also be toggled. This enables you to define Advanced Subsets, iteratively including tables and relationships until you get a coherent data set of the right size.
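To illustrate why toggling relationships changes the size of the resulting data set, the following sketch treats each relationship as a link between two tables with an on/off flag, and collects the tables reachable from a starting table through enabled links. This is a simplified mental model, not a description of how VIP evaluates Subset Rules: the table names, the `enabled` field, and the `tables_in_subset` function are all hypothetical.

```python
from collections import deque

# Hypothetical relationships between tables; "enabled" models the on/off toggle.
relationships = [
    {"child": "Orders", "parent": "Customers", "enabled": True},
    {"child": "OrderLines", "parent": "Orders", "enabled": True},
    {"child": "Reviews", "parent": "Customers", "enabled": False},
]


def tables_in_subset(start, rels):
    """Return the tables reachable from start through enabled relationships.

    Relationships are followed in both directions (child to parent and
    parent to child), which is an assumption of this sketch.
    """
    included = {start}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for rel in rels:
            if not rel["enabled"]:
                continue
            if table in (rel["child"], rel["parent"]):
                for other in (rel["child"], rel["parent"]):
                    if other not in included:
                        included.add(other)
                        queue.append(other)
    return included


# With the Reviews relationship toggled off, Reviews is left out.
before = tables_in_subset("Customers", relationships)

# Toggling it on brings Reviews into the data set on the next iteration.
relationships[2]["enabled"] = True
after = tables_in_subset("Customers", relationships)

print(sorted(before))
print(sorted(after))
```

Repeating this kind of toggle-and-check loop mirrors the iterative approach described above: include a relationship, inspect the result, and stop when the data set is coherent and of the right size.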

Note: Subset Rules are formulated to generate a set of data that fulfils the relationships in the Source Database. They therefore create a data set in which the data reflects the Primary and Foreign Key relationships. However, the core actions involved in running a Subset (GETMETADATA, PREPENV, BUILDMODEL, and SUBSET) do not implement these Keys in the Staging Database. To implement them, you must perform Post-Subset actions to add Keys.