Data Dictionary


    Article summary

    Understand your data landscape

    Build a complete picture of your data landscape to support effective test data management strategies and privacy by design.

    Document details

    Purpose

    To learn how to start cataloguing data.

    Audience

    Anyone using the platform, or anyone who needs a guide to start using the Data Catalogue.

    Requirements

    Access to the Curiosity Dashboard.

    Data catalogue

    Definitions are used to track and store detailed information about the databases or files we work with. In this guide, we’ll walk through connecting to a pre-configured database, scanning it, capturing its details, and using the insights gained to make informed decisions about the next steps in managing and optimising your data.

    • Catalogue: Automatically map, catalogue, and visualise all data assets across your organisation.

    • Sensitive data identification: Detect and classify sensitive data such as Personally Identifiable Information and Protected Health Information. Understand where your most critical data is stored and how it’s being used to ensure compliance with regulations.

    • Relationship mapping: Uncover how your data assets are interconnected. Automatically map relationships between tables, fields, and systems, providing a complete picture of how data flows through your organisation.

    • Data gap analysis: Identify gaps and missing data elements that could hinder your analytics, testing, or compliance strategies. Our platform provides detailed reports, and synthetic data generation capabilities, to ensure that every scenario is accounted for.

    Database Connections

    To create a new database connection, navigate to the Data Dictionary and open the Databases tab.

    Click on +New Connection Profile

    Users often don’t realise they are on the wrong tab when searching for Databases. Make sure you are on the Databases tab.

    Provide a Name & Description

    The Connection tab has all the relevant connection details that need to be completed.

      • DBMS Type - Drop down of supported database types.

      • Host

      • Port

      • Database

      • Schemas (You can apply a regular expression here to choose which schemas to include)

      • Tables (You can apply a regular expression here to choose which tables to include)
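
    As a rough illustration of how the Schemas and Tables filters might behave, the sketch below applies a regular expression to a list of names. The schema and table names, the select_names helper, and the choice of full-string matching are all assumptions made for the example; the platform's own regular expression dialect and matching rules may differ, so it is worth checking a pattern against a small scan first.

```python
import re

def select_names(names, pattern):
    """Return the names that the pattern matches in full (an assumed rule)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

# Hypothetical schema and table names, for illustration only.
schemas = ["public", "audit", "staging_2023", "staging_2024"]
tables = ["customers", "accounts", "tmp_load", "transactions"]

# Include the public schema plus any staging schema with a four-digit suffix.
included_schemas = select_names(schemas, r"public|staging_\d{4}")

# Include every table except those whose name starts with "tmp_".
included_tables = select_names(tables, r"(?!tmp_).*")

print(included_schemas)
print(included_tables)
```

    Narrow patterns like these keep a scan focused on the schemas and tables you actually intend to catalogue.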

    The Security tab holds the Username & Password used to connect to the database.

    When finished, click OK to create the connection.

    Within the Data Catalogue you will now have a database connection ready to use.

    Scan Database

    This process will scan the database and store a version of its metadata within the platform’s catalogue. The metadata will then be available for later activities such as Masking or Data Generation.

    To scan the database, navigate to the Data Dictionary → Databases and click on the newly set up database connection.

    Click on Run Scan (Native)

    This will start the scan process as a job, which should run to completion.

    When this job completes, you will have a scanned database to review. It will show the schemas, tables, and columns.
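
    To make the idea of scanned metadata concrete, the sketch below collects table and column details from a small in-memory SQLite database. It is not the platform's scan implementation: the scan_metadata helper, the two example tables, and the shape of the result are assumptions chosen only to show the kind of information (tables, columns, data types, keys) that a metadata scan records without reading any row data.

```python
import sqlite3

def scan_metadata(conn):
    """Collect table and column metadata without reading any row data."""
    catalogue = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # pragma_table_info yields: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(
            "SELECT * FROM pragma_table_info(?)", (table,)
        ).fetchall()
        catalogue[table] = [
            {"column": col[1], "type": col[2], "primary_key": bool(col[5])}
            for col in columns
        ]
    return catalogue

# Hypothetical two-table database, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        balance NUMERIC
    );
    """
)

catalogue = scan_metadata(conn)
for table, columns in catalogue.items():
    print(table, columns)
```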

    To view the Scan details, click on ‘Scan #1’.

    If the database is updated, you can scan it again. Each scan is kept, so you will then have multiple versions of scans.
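
    One reason to keep several scan versions is to see what changed between them. The sketch below compares two scans represented as plain dictionaries. The diff_scans helper and the dictionary layout (table name mapped to column name mapped to data type) are hypothetical, and are not an export format or feature of the platform.

```python
def diff_scans(old, new):
    """Compare two scans given as {table: {column: data_type}} mappings."""
    changes = []
    for table in sorted(old.keys() | new.keys()):
        old_cols = old.get(table, {})
        new_cols = new.get(table, {})
        for column in sorted(old_cols.keys() | new_cols.keys()):
            before = old_cols.get(column)
            after = new_cols.get(column)
            if before is None:
                changes.append((table, column, "added"))
            elif after is None:
                changes.append((table, column, "removed"))
            elif before != after:
                changes.append((table, column, f"type changed: {before} -> {after}"))
    return changes

# Hypothetical metadata from a first and a second scan of the same database.
scan_1 = {"customers": {"id": "integer", "email": "varchar"}}
scan_2 = {
    "customers": {"id": "bigint", "email": "varchar", "phone": "varchar"},
    "accounts": {"id": "integer"},
}

changes = diff_scans(scan_1, scan_2)
for change in changes:
    print(change)
```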

    The available schemas and some associated information will be presented. In this case, we’d like to see the public schema in more detail. Click on public to learn more.

    The Schema Details, including column details, foreign keys, and references, are now displayed.

    Clicking on any table will show further details for that table.

    Column information and data types will often start to drive the decisions made about Masking or Data Generation routines.
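
    As a loose example of how column names and data types can feed such decisions, the sketch below labels each column with a suggested next step. The name pattern, the type list, and the "mask", "review" and "keep" labels are illustrative assumptions, not the platform's sensitive data detection rules.

```python
import re

# Hypothetical hints that a column may hold personal data.
SENSITIVE_NAME = re.compile(r"name|email|phone|dob|birth|ssn|address", re.IGNORECASE)

# Hypothetical binary types that are hard to assess from metadata alone.
OPAQUE_TYPES = {"BLOB", "BYTEA"}

def suggest_action(column_name, data_type):
    """Suggest a next step for a column from its name and data type."""
    if SENSITIVE_NAME.search(column_name):
        return "mask"
    if data_type.upper() in OPAQUE_TYPES:
        return "review"
    return "keep"

# Hypothetical columns taken from a scanned table.
columns = [
    ("email", "varchar"),
    ("date_of_birth", "date"),
    ("avatar", "bytea"),
    ("balance", "numeric"),
]

suggestions = {name: suggest_action(name, dtype) for name, dtype in columns}
for name, action in suggestions.items():
    print(name, action)
```

    A name-based heuristic like this will miss some columns and over-flag others, so treat its output as a starting point for review rather than a final classification.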

    Create Definition

    Definitions are used to track and store detailed information about the databases or files we work with.

    To create a new Definition, navigate to the Data Dictionary and click +New Definition.

    Provide a Name & Description and choose Database. Click Next Step when ready.

    Pick Existing Connection and choose the previously set up connection profile, in this case Curiosity Bank Connection.

    If the database connection already has a scan, it will show here. If no scan is available, or you are unsure, check Trigger new scan.

    Click Next Step, then Finish.

    Once completed you will have a new Definition linked to a Connection Profile with a completed Scan.

    When finished, click Go to Definition.

    You will see the current Version, Table Details & additional actions you can now perform against the Definition.

    The Version and Schema drop-downs within the Content tab hold the scanned information. Use them to choose a different version or schema.

    You can also visualize the table relationships within the definition.

    Click on the tree icon to view the diagram and choose Entity Relationship Diagram.

    The diagram will then be generated.

    This will present the relationships found, which need to be considered, and likely maintained, when running a masking or data generation routine.
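
    One practical consequence of these relationships is ordering: a parent table usually needs to be handled before the tables that reference it. The sketch below derives such an order from a set of foreign keys using Python's standard graphlib module. The table names and the foreign_keys mapping are hypothetical, and this is only one way to reason about the diagram, not a description of how the platform orders its routines.

```python
from graphlib import TopologicalSorter

# Hypothetical foreign keys: each table maps to the tables it references.
foreign_keys = {
    "customers": set(),
    "accounts": {"customers"},
    "transactions": {"accounts"},
}

def processing_order(fks):
    """Return tables so that every referenced table comes before its dependants."""
    return list(TopologicalSorter(fks).static_order())

order = processing_order(foreign_keys)
print(order)
```

    If the relationships contain a cycle, static_order raises graphlib.CycleError, which is a useful signal that those tables need special handling.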

    In the top right, additional actions are possible including:

    • Toggle Collapse/Expand the image

    • Settings

    • Relayout

    • Download as PNG

    • Search

    • Zoom In / Out

    • Fit to screen

    Settings

    Some databases and files have large structures with many tables and columns. Settings allow you to customise what is displayed, and how much of it. Configure the number of tables displayed and the columns you wish to see here to make large databases easier to view.

