
Data Product Glossary

Data product terms and definitions

Here, you'll find a collection of terms and definitions related to data products. Whether you're manually building the data product components or using the data product creator, this glossary serves as a valuable resource for understanding key concepts and terminology.

AI Assist chat

An AI-powered copilot that improves your user experience on the data product platform. Assist integrates into the data product creation workflow, guiding users in building models by suggesting code based on natural language prompts.

Clone database

A clone database, also known as a "zero-copy clone", captures a snapshot of Snowflake databases and schemas. You get a new object structure without the overhead of copying the actual data. Changes made to either the source or the cloned object do not affect the other.
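The behavior described above can be illustrated with a toy model. The sketch below is not how Snowflake implements cloning; `ToyDatabase` and its methods are hypothetical names used only to show the idea that a clone shares the existing data until one side writes, after which the two diverge independently.

```python
class ToyDatabase:
    """Toy model of zero-copy cloning: table names map to shared, immutable row tuples."""

    def __init__(self, tables=None):
        # Each table name points to an immutable tuple of rows.
        self._tables = dict(tables or {})

    def clone(self):
        # Copy only the name -> rows mapping (the "metadata"); row tuples are shared.
        return ToyDatabase(self._tables)

    def insert(self, table, row):
        # A write builds a new tuple for this database only;
        # the other side keeps pointing at the old tuple.
        self._tables[table] = self._tables.get(table, ()) + (row,)

    def rows(self, table):
        return self._tables.get(table, ())


source = ToyDatabase({"orders": (("o1", 10), ("o2", 25))})
clone = source.clone()

# Right after cloning, both databases reference the very same row storage.
shares_storage = clone.rows("orders") is source.rows("orders")

# Writes on either side stay local to that side.
clone.insert("orders", ("o3", 40))
source.insert("orders", ("o4", 5))
```

After the two inserts, each database holds three rows: the two shared originals plus its own new row.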

For more information, see Databases.

Data product manifest

A data product manifest is a document that outlines the properties and metadata of a data product, giving consumers the essential information they need to use it. DataOps.live embeds data quality and accuracy directly into the manifest.

For more information, see Data Product Manifest.

Data product specification

A data product specification is a definition file that includes all the properties necessary to build the data product.

For more information, see Data Product Specification.

Data product orchestrator

The DataOps.live Data Product orchestrator generates data products in our CI/CD process. These data products contain valuable insights and transformations applied to raw data. They are typically represented as YAML configurations, accompanied by various parameters that govern their behavior. The Data Product orchestrator enriches the data product specification with metadata from the pipeline run and adds all objects and tests that are part of the data product.
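As a rough illustration of the enrichment step, the sketch below merges a specification with details from a pipeline run. Every field name here (`pipeline_run`, `objects`, `tests`, `pipeline_id`, and so on) is a hypothetical placeholder, not the DataOps.live schema, and `enrich_specification` is an assumed helper rather than a platform function.

```python
import copy


def enrich_specification(spec: dict, run_metadata: dict, objects: list, tests: list) -> dict:
    """Return a new dict: the specification plus pipeline-run details.

    All keys added here are hypothetical and chosen for illustration only.
    """
    enriched = copy.deepcopy(spec)  # leave the input specification untouched
    enriched["pipeline_run"] = dict(run_metadata)
    enriched["objects"] = list(objects)
    enriched["tests"] = list(tests)
    return enriched


spec = {"name": "customer_orders", "version": "0.1.0"}
enriched = enrich_specification(
    spec,
    run_metadata={"pipeline_id": "12345", "branch": "main"},
    objects=["ANALYTICS.ORDERS"],
    tests=[{"name": "orders_not_null", "status": "pass"}],
)
```

The input specification is copied rather than mutated, so the same specification could be enriched again by a later pipeline run.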

For more information, see Data Products Administration.

Dataset

A dataset is a collection of data objects, such as tables, views, functions, schemas, and descriptions. The goal of a data product is to make such data accessible, consumable, insightful, and actionable to address specific, targeted questions.

Input port

A designated endpoint or interface through which data is received or ingested into another data product.

Output port

An output port is a connection point made available by a data product, such as an S3 bucket, a Snowflake share, or a Snowflake role, among other options. The definition of an output port includes all the necessary information for a consumer to establish a connection, excluding credentials.

Source data

The initial database tables where the data is collected, generated, or sourced before being organized into a dataset for analysis or use.

Template project

When you initiate data product creation with the data product creator, it's advisable to begin with a new DataOps project. This introduces a template project preconfigured with essential components, such as a standardized structure, mandatory directories and files, and many best-practice configurations necessary for the data product project.

Service Level Objective (SLO)

An SLO is a collection of objectives and specific targets that a data product aims to achieve. For example, an SLO may include objectives declaring that data is no more than 4 hours old or that 98% of rows pass quality checks.

Service Level Indicator (SLI)

An SLI is a metric used to assess the level of service that a data product provides. Indicators, which are computed during pipeline execution, include the current percentage of passed tests, the most recent update date, and the count of rows that fail quality checks.
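The relationship between the two terms can be sketched in a few lines: SLIs are the measurements, and SLOs are the targets they are compared against. The example below uses the targets mentioned under Service Level Objective (data no more than 4 hours old, 98% of rows passing); the `Slo`, `Sli`, and `evaluate` names and the sample numbers are assumptions made for illustration, not part of the platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Slo:
    """Targets the data product aims to meet."""
    max_age: timedelta = timedelta(hours=4)
    min_pass_rate: float = 0.98


@dataclass
class Sli:
    """Measurements taken from a pipeline run (hypothetical structure)."""
    last_updated: datetime
    rows_total: int
    rows_failed: int

    @property
    def pass_rate(self) -> float:
        # Treat an empty dataset as fully passing to avoid dividing by zero.
        if self.rows_total == 0:
            return 1.0
        return (self.rows_total - self.rows_failed) / self.rows_total


def evaluate(slo: Slo, sli: Sli, now: datetime) -> dict:
    """Compare each indicator with its objective."""
    return {
        "freshness_met": now - sli.last_updated <= slo.max_age,
        "quality_met": sli.pass_rate >= slo.min_pass_rate,
    }


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sli = Sli(last_updated=now - timedelta(hours=3), rows_total=1000, rows_failed=15)
result = evaluate(Slo(), sli, now)
```

Here the data is 3 hours old and 98.5% of rows pass, so both objectives are met.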

DataOps terms and definitions

Here, you'll find a collection of terms and definitions related to DataOps.live. This glossary serves as a valuable resource for understanding key concepts and terminology.

Access level

The access level in the DataOps.live platform determines the permissions users have on groups and on the template project of the data product creator. Users with the Developer primary role have full access to the template project used as a base for the data product creator and can invite other users to the project.

DataOps group

A group is a collection of projects and groups/sub-groups, all contained within an account. Group owners can give other DataOps users access to each group as members, each with a different privilege level.

For more information, see DataOps Core Concepts.

DataOps project

A DataOps project on the DataOps.live platform is primarily a Git-compliant code repository that contains configurations allowing code to be merged and pipelines to be run. You can create projects at the top level of an account. However, creating projects within groups/sub-groups is good practice.

For more information, see DataOps Core Concepts.

DataOps runner

A Docker container that picks up and runs the jobs within data product pipelines.

For more information, see Runner Overview.

DataOps Template project

When you start working with our data product platform, a template project with all required configurations is available by default. Starting with this template project is recommended. The standard structure, mandatory directories and files, and many best-practice configurations necessary for the DataOps project are predefined in the template project.

DataOps.live Develop (DevReady)

A browser-based tool that automates development environments for each of your data product tasks in seconds.

For more information, see What is DevReady.

MATE project

DataOps.live typically has a single Modelling and Transformation Engine (MATE) project per DataOps project (repository), located in /dataops/modelling. A MATE project is a complete set of YAML, SQL, and Python files that define all of its sources and transformations.

For more information, see MATE Core Concepts.