
Preinstalled CLI Tools in DataOps.live Develop

Feature release status: PubPrev

Here is a list of the command-line utilities that ship by default with the DataOps development environment.

danger

The tools installed in the DataOps development environment are subject to change without warning.

asdf

This command-line tool is a version manager for multiple languages that lets you switch easily between different versions. It comes preinstalled in the development environment and can manage versions of languages such as Python, Ruby, Node.js, and more.

Configuration/Authentication

None required.
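Usage example

A typical asdf workflow looks like this (the Python version shown is purely illustrative):

```shell
# Show installed plugins, then install and pin a runtime version
asdf plugin list
asdf install python 3.11.4   # version number is illustrative
asdf global python 3.11.4    # set the default version for your user
asdf current python          # confirm the active version
```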

Additional information

Read more about The Multiple Runtime Version Manager.

aws CLI

This unified tool allows you to manage AWS services from the command line. It comes pre-installed in the development environment and can be used to manage a wide range of AWS services, including EC2 instances, S3 buckets, and more.

Configuration/Authentication

AWS credentials are required.
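One way to supply credentials is via environment variables (all values below are placeholders; `aws configure` is an alternative), after which any service command works:

```shell
# Placeholder credentials -- substitute your own or use `aws configure`
export AWS_ACCESS_KEY_ID="<access-key-id>"
export AWS_SECRET_ACCESS_KEY="<secret-access-key>"
export AWS_DEFAULT_REGION="eu-west-1"

# Verify the credentials resolve to an identity, then list S3 buckets
aws sts get-caller-identity
aws s3 ls
```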

Additional information

Read more about AWS Command Line Interface.

before_script.sh

Resides at: /dataops-cde/scripts/

This is a cut-down version of the full before_script used in a DataOps pipeline, mainly to populate key environment-specific variables.

It is designed to be used by other parts of the development environment to ensure that the correct account, database, and other targets are set. However, it can also be useful to run directly to verify that the expected variables are populated.

DATAOPS_DATABASE is the most critical of these.

Configuration/Authentication

None required.

Usage example

/dataops-cde/scripts/before_script.sh

Produces:

DATAOPS_PREFIX=DATAOPS_TDO_22
DATAOPS_ENV_NAME=FB_DP_VERSIONING
DATAOPS_ENV_NAME_PROD=PROD
DATAOPS_ENV_NAME_QA=QA
DATAOPS_ENV_NAME_DEV=DEV
DATAOPS_BRANCH_NAME_PROD=main
DATAOPS_BRANCH_NAME_QA=qa
DATAOPS_BRANCH_NAME_DEV=dev
DATAOPS_DATABASE=DATAOPS_TDO_22_FB_DP_VERSIONING
DATABASE=DATAOPS_TDO_22_FB_DP_VERSIONING
DATAOPS_DATABASE_MASTER=DATAOPS_TDO_22_PROD
DATAOPS_NONDB_ENV_NAME=FB_DP_VERSIONING
DATAOPS_BEFORE_SCRIPT_HAS_RUN=YES
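If you want these variables available in your current shell rather than only printed, you can source the script instead of executing it (a sketch, assuming the script exports the variables it reports):

```shell
# Source rather than execute so the variables persist in this shell
source /dataops-cde/scripts/before_script.sh
echo "Target database: $DATAOPS_DATABASE"
```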

Additional information

Read more about the DataOps custom before script.

If you have created a custom before_script.sh in your DataOps project and want to use that within your DataOps development environment, don't hesitate to get in touch with our Support team.

dataops CLI

This tool provides DataOps utilities from the command line, such as generating SOLE configuration from raw DDL (dataops sole gen), as shown in the example below.

Configuration/Authentication

None required.

Usage example

dataops sole gen "create schema SALES_RECORD.SALES comment='This is a test DataOps.live schema.';"

Produces:

dataops INFO Reading config from create schema SALES_RECORD.SALES comment='This is a test DataOps.live schema.';
dataops INFO Extracting data from a raw DDL query.
dataops INFO Parsing schema object: SALES.
databases:
  SALES_RECORD:
    schemas:
      SALES:
        comment: This is a test DataOps.live schema.
        is_managed: false
        is_transient: false
        manage_mode: all
dataops INFO Written output to stdout

Additional information

Read more about the DataOps CLI.

dataops-render

Resides at: /dataops-cde/scripts/

This is the same render engine that runs in jobs within a pipeline to render the .template.xxx files.

You can learn more about it in the DataOps template rendering section.

Configuration/Authentication

None required.

Usage example

/home/gitpod/.pyenv/versions/3.8.13/bin/python /dataops-cde/scripts/dataops-render -o --file $STREAMLIT_FOLDER/orders/app.template.py render-template

Produces:

/workspace/truedataops-22/dataops/streamlit/orders/app.py [exists, overwritten]

Additional information

Read more about the DataOps template rendering.

dbt CLI

Configuration/Authentication

~/.dbt/profiles.yml is built/rebuilt every time dbt is run using the following variables:

  • DATAOPS_MATE_SECRET_ACCOUNT
  • DATAOPS_MATE_SECRET_PASSWORD
  • DATAOPS_MATE_SECRET_USER
  • DATAOPS_MATE_ROLE
  • DATAOPS_MATE_WAREHOUSE
note

The old variable names (those starting with DBT_) remain compatible and functional. However, we recommend transitioning to the new names starting with DATAOPS_MATE_ to stay current with the latest updates.
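As a sketch, setting these variables manually and verifying the resulting connection might look like this (all values are placeholders; the development environment normally provides them for you):

```shell
# Placeholder values -- normally populated by the development environment
export DATAOPS_MATE_SECRET_ACCOUNT="<account-identifier>"
export DATAOPS_MATE_SECRET_USER="<username>"
export DATAOPS_MATE_SECRET_PASSWORD="<password>"
export DATAOPS_MATE_ROLE="<role>"
export DATAOPS_MATE_WAREHOUSE="<warehouse>"

# ~/.dbt/profiles.yml is rebuilt from these variables on each run
cd dataops/modelling && dbt debug
```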

Usage example

cd dataops/modelling && dbt parse

Produces:

15:11:22  Running with dbt=1.2.1
15:11:22 Start parsing.
15:11:22 Dependencies loaded
15:11:22 ManifestLoader created
15:11:22 Manifest loaded
15:11:22 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 3 unused configuration paths:
- models.MyTemplate.dataops_meta
- models.MyTemplate.dataops_meta.pipelines
- snapshots

15:11:22 Manifest checked
15:11:23 Flat graph built
15:11:23 Manifest loaded
15:11:23 Performance info: target/perf_info.json
15:11:23 Done.

Additional information

Read more about the dbt CLI.

glab CLI

Configuration/Authentication

If the environment variable DATAOPS_ACCESS_TOKEN is set, then glab is automatically authenticated.
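For example, to set the token for the current shell and confirm authentication (the token value is a placeholder):

```shell
# Placeholder token value -- use a personal access token
export DATAOPS_ACCESS_TOKEN="<your-access-token>"
# Confirm the CLI is authenticated
glab auth status
```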

Usage example

glab ci list

Produces:

Showing 30 pipelines on dataops-demo-project/truedataops-22 (Page 1)

(success) • #851831 kaggledev (about 1 day ago)
(success) • #851817 kaggledev (about 1 day ago)
(success) • #851748 kaggledev (about 1 day ago)
(success) • #851730 kaggledev (about 1 day ago)
(success) • #851483 kaggledev (about 1 day ago)
(canceled) • #851482 kaggledev (about 1 day ago)
(success) • #851422 kaggledev (about 1 day ago)
(success) • #851415 kaggledev (about 1 day ago)
(canceled) • #851413 kaggledev (about 1 day ago)
(success) • #851284 kaggledev (about 1 day ago)
(canceled) • #851282 kaggledev (about 1 day ago)
(canceled) • #851281 kaggledev (about 1 day ago)
(canceled) • #851279 kaggledev (about 1 day ago)
(canceled) • #851276 kaggledev (about 1 day ago)
(skipped) • #851275 kaggledev (about 1 day ago)
(manual) • #851264 kaggledev (about 1 day ago)
(manual) • #851075 kaggledev (about 1 day ago)
(failed) • #851074 kaggle_dev (about 1 day ago)
(manual) • #851038 dp_versioning (about 1 day ago)
(manual) • #850819 dp_versioning_sean (about 1 day ago)
(manual) • #850766 dpversioning (about 1 day ago)
(manual) • #849322 dp_versioning (about 1 day ago)
(manual) • #849268 dp_versioning (about 1 day ago)
(manual) • #849172 dp_versioning (about 1 day ago)
(manual) • #848585 dp_versioning (about 1 day ago)
(manual) • #848390 dp_versioning (about 2 days ago)
(manual) • #848388 dp_versioning (about 2 days ago)
(manual) • #848380 dp_versioning (about 2 days ago)
(manual) • #848313 dp_versioning (about 2 days ago)
(manual) • #847459 dp_versioning (about 2 days ago)

Additional information

Read more about the GitLab CLI.

jupyter-notebook

This Visual Studio Code (VS Code) extension comes pre-installed in the DataOps development environment. It allows you to create, edit, and run shareable notebooks directly from within VS Code.

With the Jupyter notebook extension, you can easily manage your notebooks, run code cells, and view the output directly in VS Code. You can also use the extension to launch Jupyter servers and manage your kernel environments.

Configuration/Authentication

Authentication is only required if your notebook itself needs it, depending on what it does; if so, configure it within the notebook or application code.

Additional information

Read more about the Jupyter Notebooks in VS Code.

pre-commit

This tool helps identify issues before committing changes to your project. Pre-commit hooks are scripts or commands that run automatically before a commit.

Configuration/Authentication

Configure pre-commit hooks by creating a pre-commit config file at your project's root. The pre-commit config file describes what repositories and hooks are installed.

Here is an example configuration:

.pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-yaml
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets

This example config file specifies two hooks:

  • check-yaml: checks that all YAML files are valid
  • detect-secrets: checks that no secrets are committed to the repository

For more information on these particular hooks, see their respective projects on GitHub.

Usage

To enable pre-commit for your project run:

pre-commit install

Now the tool will run automatically on every commit against changed files.

note

Remember, to enable the tool, you must run pre-commit install every time you clone a project.

To run the hooks manually across all files, not just files that have changed, run:

pre-commit run --all-files
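You can also run a single hook by its id, for example only the YAML check from the sample configuration above:

```shell
# Run just the check-yaml hook against every file in the repository
pre-commit run check-yaml --all-files
```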

You can update your hooks to the latest version automatically by running:

pre-commit autoupdate

This will update your .pre-commit-config.yaml with the latest version for each repo.

Additional information

Read more about using pre-commit hooks in your DataOps project.

Read more about the full official pre-commit documentation.

snowpark

Snowpark allows you to write code in your preferred language and run it directly on Snowflake. The DevReady environment comes fully preconfigured with the main libraries and tools required for Snowpark development and testing.

Configuration/Authentication

Snowflake credentials are required.

Usage examples

See Snowpark use cases for usage examples.

Additional information

Read more about Snowpark.

snowsql CLI

Configuration/Authentication

None required. The DataOps development environment sets the SNOWSQL environment variables based on the current MATE credentials.
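For reference, SnowSQL reads connection settings from the standard SNOWSQL_* environment variables; set manually, they would look like this (all values are placeholders, and you normally never set them yourself):

```shell
# Placeholder values -- the development environment sets these from the
# current MATE credentials
export SNOWSQL_ACCOUNT="<account-identifier>"
export SNOWSQL_USER="<username>"
export SNOWSQL_PWD="<password>"
export SNOWSQL_ROLE="<role>"
export SNOWSQL_WAREHOUSE="<warehouse>"
export SNOWSQL_DATABASE="<database>"
```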

Usage example

snowsql -q "select * from STARSCHEMA.DIM_CUSTOMER limit 5;"

Produces:

* SnowSQL * v1.2.26
Type SQL statements or !help
+-------+---------------------+-----------+--------------+------------------+-------------------+------------+---------------+--------------+--------+---------------+----------------------+-----------------+----------------+-----------------+---------------+-------------------+---------------------+
| TITLE | FULLNAME | ADDRESSID | PHONENUMBER | TOTALPURCHASEYTD | DATEFIRSTPURCHASE | BIRTHDATE | MARITALSTATUS | YEARLYINCOME | GENDER | TOTALCHILDREN | NUMBERCHILDRENATHOME | EDUCATION | OCCUPATION | NUMBERCARSOWNED | HOMEOWNERFLAG | COUNTRYREGIONCODE | NAME |
|-------+---------------------+-----------+--------------+------------------+-------------------+------------+---------------+--------------+--------+---------------+----------------------+-----------------+----------------+-----------------+---------------+-------------------+---------------------|
| Mr. | David R. Robinett | 29177 | 238-555-0100 | 16.01 | 2003-09-01 | 1961-02-23 | M | 25001-50000 | M | 4 | 0 | Graduate Degree | Clerical | 0 | 1 | DE | Nordrhein-Westfalen |
| Ms. | Rebecca A. Robinson | 16400 | 648-555-0100 | 4 | 2004-06-05 | 1965-06-11 | M | 50001-75000 | F | 3 | 3 | Bachelors | Professional | 1 | 1 | AU | Victoria |
| Ms. | Dorothy B. Robinson | 19867 | 423-555-0100 | 4730.04 | 2002-04-07 | 1954-09-23 | S | 75001-100000 | M | 2 | 0 | Partial College | Skilled Manual | 2 | 0 | AU | Victoria |
| Ms. | Carol Ann F. Rockne | 17009 | 439-555-0100 | 2435.4018 | 2001-10-27 | 1943-07-15 | M | 25001-50000 | M | 1 | 0 | Bachelors | Clerical | 0 | 1 | GB | England |
| Mr. | Scott M. Rodgers | 13003 | 989-555-0100 | 1647 | 2002-04-18 | 1968-05-15 | M | 50001-75000 | M | 2 | 2 | Bachelors | Professional | 1 | 1 | AU | Queensland |
+-------+---------------------+-----------+--------------+------------------+-------------------+------------+---------------+--------------+--------+---------------+----------------------+-----------------+----------------+-----------------+---------------+-------------------+---------------------+
5 Row(s) produced. Time Elapsed: 0.124s
Goodbye!

Additional information

Read more about snowsql.

sole-validate

This tool helps you validate your SOLE configuration locally before running a SOLE pipeline in the data product platform. It allows you to catch potential SOLE configuration issues before running the pipeline, reducing your development time.

Configuration/Authentication

None required.

Usage examples

See SOLE compilation and validation for usage examples.

Additional information

Read more about SOLE compilation and validation.

sqlfluff

A SQL linter (and fixer); see the SQLFluff documentation for full details.

Configuration/Authentication

None required.

Usage example

To run linting and report on the results:

cd dataops/modelling/models
sqlfluff lint --dialect snowflake

To run linting, excluding certain rules and report on the results:

cd dataops/modelling/models
sqlfluff lint --dialect snowflake --exclude-rules L036,L014

SQLFluff can also be used to fix certain issues:

cd dataops/modelling/models
sqlfluff fix --dialect snowflake --exclude-rules L036,L014
note

When you run the fix command, it may modify many models from the current directory downwards, potentially hundreds or even thousands. Stage and commit only the modified models you want, and rerun any changed models before committing to validate that they still work as expected.
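Rather than repeating flags on every invocation, the same defaults can live in a .sqlfluff file at the project root (a sketch using the options from the examples above):

```ini
[sqlfluff]
dialect = snowflake
exclude_rules = L036,L014
```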

sqlfmt

A SQL formatter; see the sqlfmt documentation for full details.

Configuration/Authentication

None required.

Usage example

cd dataops/modelling/models
sqlfmt .
note

Running the formatter changes all the models from the current directory downwards, potentially hundreds or even thousands. Stage and commit only the modified models you want, and rerun any changed models before committing to validate that they still work as expected.
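Before formatting in place, you can preview what would change (flags per the sqlfmt documentation; verify them against your installed version):

```shell
cd dataops/modelling/models
# Show a diff of proposed changes without writing them
sqlfmt --diff .
# Exit non-zero if any file would be reformatted (useful in CI)
sqlfmt --check .
```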

streamlit

Configuration/Authentication

Configuration or authentication is done within your Streamlit application.

Usage example

cd dataops/streamlit/orders/ && streamlit run app.py

Produces:

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.


You can now view your Streamlit app in your browser.

Network URL: http://10.0.5.2:8501
External URL: http://18.202.147.41:8501

See Streamlit use cases for usage examples.

Additional information

Read more about Streamlit.