Skip to main content

SOLE Generator

The SOLE Generator is a CLI (Command-line Interface) tool that helps you generate a YAML configuration for SOLE objects from existing DDL statements. This tool comes in handy when you want to start managing your existing Snowflake databases through the DataOps.live platform. Instead of creating your database.template.yml file manually, you can do that by using this tool.

dataops solegen concepts

Below is a list of all the objects that are currently supported:

Prerequisites

First, install it as part of the DataOps CLI package.

Before using the tool, you need to have the Snowflake DDL statements for the objects you want to generate the configuration for. To get that, you can simply run the GET_DDL command in your Snowflake account. For example:

select get_ddl('DATABASE', 'SALES_RECORD', true);

The output of the GET_DDL command is the actual SQL CREATE statement of the specific SOLE object including its children objects. In the example above, the output will include the CREATE statements for all objects that belong to the SALES_RECORD database (schemas, tables, procedures and so on).

tip

For more accurate parsing, enable the use_fully_qualified_names_for_recreated_objects property of the GET_DDL command.

Usage

The SOLE Generator comes as part of the DataOps CLI tool. If you haven't installed it already, you can do so by following the steps explained in DataOps CLI.

The available options for the SOLE Generator CLI are listed in the image below:

dataops-solegen-help __shadow__

You can check them out any time by running the following command in your Terminal window:

dataops solegen --help

Once you have the tool set, you can start generating the YAML configuration for your SOLE objects.

Generate SOLE configuration from a raw DDL statement

The SOLE Generator lets you generate SOLE configuration from raw SQL CREATE queries. You can do that by passing the SQL query as an argument to the dataops solegen command. For example:

dataops solegen "create schema SALES_RECORD.SALES comment='This is a test DataOps.live schema.';"

The YAML configuration generated by the SOLE Generator for the example above would be:

databases:
SALES_RECORD:
schemas:
SALES:
comment: This is a test DataOps.live schema.
manage_mode: all

Generate SOLE configuration from DDL files

Parsing raw DDL statements is useful when you want to generate configuration for a single object. But, if you want to parse multiple objects at the same time, you can pass a DDL file, or even a whole directory to the SOLE Generator.

Generate configuration from a single DDL file

To parse a single DDL file, use:

dataops solegen /path/to/file/database.sql

If the content of the DDL file for your database is:

database.sql
create or replace database SALES_RECORD DATA_RETENTION_TIME_IN_DAYS=5;

create or replace schema SALES_RECORD.SALES;

create or replace TABLE SALES_RECORD.SALES.CUSTOMER (
CUSTOMERID NUMBER(38,0) NOT NULL,
ACCOUNTNUMBER VARCHAR(16777216),
FULLNAME VARCHAR(16777216),
MODIFIEDDATE TIMESTAMP_NTZ(9) NOT NULL
);

create or replace TABLE SALES_RECORD.SALES.PRODUCT (
PRODUCTID NUMBER(38,0) NOT NULL,
DESCRIPTION VARCHAR(100),
UNITPRICE NUMBER(24,2) NOT NULL,
MODIFIEDDATE TIMESTAMP_NTZ(9) NOT NULL
);

Its corresponding YAML configuration generated by the tool would be:

databases:
SALES_RECORD:
data_retention_time_in_days: "5"
schemas:
SALES:
tables:
CUSTOMER:
columns:
CUSTOMERID:
type: NUMBER(38,0)
nullable: false
ACCOUNTNUMBER:
type: VARCHAR(16777216)
nullable: true
FULLNAME:
type: VARCHAR(16777216)
nullable: true
MODIFIEDDATE:
type: TIMESTAMP_NTZ(9)
nullable: false
change_tracking: false
manage_mode: all
PRODUCT:
columns:
PRODUCTID:
type: NUMBER(38,0)
nullable: false
DESCRIPTION:
type: VARCHAR(100)
nullable: true
UNITPRICE:
type: NUMBER(24,2)
nullable: false
MODIFIEDDATE:
type: TIMESTAMP_NTZ(9)
nullable: false
change_tracking: false
manage_mode: all
is_managed: false
is_transient: false
manage_mode: all
namespacing: both
manage_mode: all

Generate configuration from a directory

To parse a whole directory containing multiple DDL files, run:

dataops solegen -d /path/to/directory

This command merges the SOLE configuration of all DDL files from the input directory into a single output.

If the input directory contains subdirectories, the files from those subdirectories will be parsed too.

Additionally, you can:

  • Use the -e option to specify the file format that needs parsing. To filter *.sql files only, run:

    dataops solegen -d /path/to/directory -e .sql
  • Use the -s option to inherit the schema name from the subdirectory if the schema isn't defined.

    If a SQL file is located in a subdirectory and the object name does not include the schema, then the subdirectory name would be used as the schema name.

    If a SQL file is located in a subdirectory and the object name already includes the schema, then the --inherit-schema (-s) flag is ignored.

    dataops solegen -d /path/to/directory -s

    For example, if we have a folder structure as follows:

    Snowflake
    |- TABLES
    | |- SCHEMA1
    | | |- TABLE1.sql
    | | |- TABLE2.sql
    | |- SCHEMA2
    | | |- TABLE3.sql
    | | |- TABLE4.sql

    You can run the SOLE Generator with the -s option as follows:

    dataops solegen -d /path/to/directory -s > tables.yaml

    The resulting tables.yaml file will contain table definitions that specify the schema based on the subdirectory name:

    databases:
    '{{ env.DATAOPS_DATABASE }}':
    schemas:
    schema_1:
    tables:
    TABLE1:
    sql: |-
    -- contents of TABLE1.sql
    TABLE2:
    sql: |-
    -- contents of TABLE2.sql
    schema_2:
    tables:
    TABLE3:
    sql: |-
    -- contents of TABLE3.sql
    TABLE4:
    sql: |-
    -- contents of TABLE4.sql

Output

By default, the SOLE Generator outputs the YAML configuration to the standard output (in the Terminal window). To write the configuration into a YAML file, you can use the -o option of the dataops solegen command:

dataops solegen /path/to/input/file/database.sql -o /path/to/output/file/database.yml

If the output file already exists, you can overwrite it by adding the force flag:

dataops solegen /path/to/input/file/database.sql -o /path/to/output/file/database.yml -f