
DataOps Environments

By default, DataOps.live is set up to use the following environments, configured from the project variables in the DataOps Reference Project. For clarity and consistency across the data product platform, environment names always appear in capital letters and branch names in lowercase.

[Figure: environments diagram]

Production

The production environment PROD is associated with the project's production branch main (value of variable DATAOPS_BRANCH_NAME_PROD), and its pipeline is always the first to run. This pipeline sets up the primary production database and other key resources in Snowflake. Because production is a persistent, long-running branch, its resources are almost always long-running as well. These resources, especially the production database, are therefore not usually recreated in the data product platform.

Careful!

For all new projects, the first-ever pipeline to run must be in production (the main/master branch) so that when feature branch pipelines run, they can clone the production database.

By default, all production Snowflake objects will have the suffix PROD (value of variable DATAOPS_ENV_NAME_PROD). For instance, the primary production database is typically named DATAOPS_PROD.

Quality assurance

Like production, the QA environment is backed by a long-running branch with persistent resources in Snowflake. Its standard branch name is qa (value of variable DATAOPS_BRANCH_NAME_QA) and, unless configured otherwise, the environment uses the suffix QA (value of variable DATAOPS_ENV_NAME_QA) for its database and other account-level Snowflake resources. For instance, the primary quality assurance database is typically named DATAOPS_QA.

Development

Whereas PROD and QA are long-running environments, DataOps.live treats the DEV environment as more transient, refreshing all resources and recreating the primary development database as a clone of the production database every time a pipeline runs.

Therefore, pipelines that run in the dev branch (value of variable DATAOPS_BRANCH_NAME_DEV) look slightly different from those in the higher environments (like PROD and QA): ingestion jobs are generally omitted automatically because the cloned production data is already available.
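The default mapping from branch to environment to database name described in the preceding sections can be sketched in Python as follows. This is an illustration only, not DataOps.live's implementation. The function name resolve_environment, the DATAOPS database prefix (inferred from the example names DATAOPS_PROD and DATAOPS_QA), and the variable DATAOPS_ENV_NAME_DEV (assumed by analogy with the PROD and QA variables) are assumptions.

```python
# Illustrative sketch only; not DataOps.live's actual implementation.

# Defaults as described in the text. DATAOPS_ENV_NAME_DEV is an assumption,
# named by analogy with the PROD and QA variables.
DEFAULTS = {
    "DATAOPS_BRANCH_NAME_PROD": "main",
    "DATAOPS_BRANCH_NAME_QA": "qa",
    "DATAOPS_BRANCH_NAME_DEV": "dev",
    "DATAOPS_ENV_NAME_PROD": "PROD",
    "DATAOPS_ENV_NAME_QA": "QA",
    "DATAOPS_ENV_NAME_DEV": "DEV",
}

# Assumption: prefix inferred from the example names DATAOPS_PROD and DATAOPS_QA.
DATABASE_PREFIX = "DATAOPS"


def resolve_environment(branch, variables=None):
    """Return (environment name, database name) for a long-running branch.

    Returns None for any branch that does not match the configured
    production, QA, or development branch names.
    """
    # Project variables, if given, override the defaults.
    settings = {**DEFAULTS, **(variables or {})}
    for env_key in ("PROD", "QA", "DEV"):
        if branch == settings[f"DATAOPS_BRANCH_NAME_{env_key}"]:
            env_name = settings[f"DATAOPS_ENV_NAME_{env_key}"]
            return env_name, f"{DATABASE_PREFIX}_{env_name}"
    return None
```

With the defaults, resolve_environment("qa") yields the environment QA and the database name DATAOPS_QA. Passing a dictionary such as {"DATAOPS_BRANCH_NAME_PROD": "master"} models a project whose production branch is master instead of main.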

Feature branches

Remember that in a new DataOps project, the PROD environment sets up the primary production database in Snowflake. When the feature branch pipeline runs, it clones the production database.

Behavioral changes upon cloning

With the Snowflake 2023_07 bundle, planned for release in January 2024, cloning a table also clones its load history. As a result, files are not reloaded, and data is not duplicated in the cloned table. You can override this behavior using the COPY option FORCE = TRUE.

See Snowflake Documentation for more information.
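As a rough illustration of the override mentioned above, the following Python sketch builds a Snowflake COPY INTO statement with the FORCE = TRUE copy option. The helper name copy_into_statement and the table and stage names are hypothetical, and the sketch only renders the SQL text; it does not connect to Snowflake or validate identifiers.

```python
# Illustrative sketch only: renders SQL text, does not execute it.

def copy_into_statement(table, stage, force=False):
    """Build a COPY INTO statement for a table and a named stage.

    The table and stage names are inserted as-is, so callers are
    responsible for passing trusted, valid identifiers.
    """
    statement = f"COPY INTO {table} FROM @{stage}"
    if force:
        # FORCE = TRUE asks Snowflake to load all files, regardless of
        # whether the load history says they were loaded before.
        statement += " FORCE = TRUE"
    return statement + ";"


# Hypothetical table and stage names.
print(copy_into_statement("orders", "orders_stage", force=True))
```

Reloading with FORCE = TRUE can duplicate rows in the target table, so it is usually worth using only when a reload is actually intended.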

DataOps.live designates any branch named differently from those detailed in the preceding topics as a feature branch. There is no mandated naming convention, but customers are encouraged to implement a naming policy and standard that fits their processes.

Databases and other account-level objects in Snowflake managed by a feature branch will take a naming suffix from a simplified version of the branch name. For example, a branch named DWH-12345 will result in a suffix of FB_DWH12345, which will, in turn, name the feature branch database DATAOPS_FB_DWH12345.
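The naming in the example above can be approximated with a short Python sketch. The exact simplification rule is not spelled out here, so the sketch assumes that simplifying means dropping every character that is not a letter or digit and upper-casing the rest, which reproduces the DWH-12345 example. The function names and the DATAOPS prefix parameter are hypothetical.

```python
import re

# Illustrative sketch only; the real simplification rule may differ.

def feature_branch_suffix(branch):
    """Derive a feature branch suffix such as FB_DWH12345 from a branch name."""
    # Assumption: keep only ASCII letters and digits, then upper-case.
    simplified = re.sub(r"[^A-Za-z0-9]", "", branch).upper()
    return f"FB_{simplified}"


def feature_branch_database(branch, prefix="DATAOPS"):
    """Combine the database prefix with the feature branch suffix."""
    return f"{prefix}_{feature_branch_suffix(branch)}"
```

Under this assumption, feature_branch_suffix("DWH-12345") returns FB_DWH12345 and feature_branch_database("DWH-12345") returns DATAOPS_FB_DWH12345, matching the example in the text.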

Resources

To learn more about namespacing for a given environment, refer to the SOLE Fundamentals Guide and the SOLE Namespace and Environment Management Reference.