MATE Project Documentation

Documentation is a critical part of MATE, as it is of every other component of the DataOps platform. This page focuses on MATE and explains how to build MATE docs.

Why?

Good, relevant, well-written documentation reduces stakeholder and user dependence on the data team and improves collaboration and self-service, one of the seven pillars of #TrueDataOps. However, documentation is often given a lower priority than writing code, usually because it is created in a separate tool. We have solved this challenge by automating our documentation function and keeping the docs themselves as close to the code as possible.

We have achieved this in two ways: model descriptions in YAML files and doc blocks.

Documenting Models in YAML Files

The documentation of MATE models occurs in YAML files inside the modelling directory (/dataops/modelling/models).

tip

These docs are written in the same YAML file where your MATE tests are configured.

By way of example, let's document the stg_product_types and stg_orders models described in Using SQL to Build MATE Models.

Let's assume that we haven't yet configured any MATE tests for these models, so we will have to create a new YAML file.

As the following YAML file shows, you add a description under each model. You can also add a description under each column.

/dataops/modelling/models/stg_product_types.yml
version: 2

models:
  - name: stg_product_types
    description: This model contains one unique product type per row
    columns:
      - name: product_type_id
        description: Unique key for stg_product_types
      - name: product_type_code
        description: Primary key for stg_product_types
      - name: product_type_description

  - name: stg_orders
    description:
    columns:
      - name: order_id
      - name: product_type
      - name: items_ordered
      - name: order_date

Doc Blocks

Doc blocks are used to create longer, more descriptive documentation. They are written in Markdown files (.md) stored in the same directory as the model YAML files (/dataops/modelling/models), and their Markdown is rendered in the generated documentation.

The workflow to build our product_types doc block is as follows:

  • Create a new file called product_types.md to document the different product types in the stg_product_types model.
  • Add the required text wrapped in {% docs <doc name> %} and {% enddocs %}
  • Save the file
  • Call the doc block in a model's YAML file

This code snippet shows how to create a doc block.

product_types.md
{% docs product_types %}

The product type will be one of the following values:

| Type | Description |
| ------------- | ------------------------------------------------------------------------------------------------------- |
| toy_trains | This product type categorizes all the toy trains irrespective of their brand, size, shape, and color |
| toy_cars | This product type categorizes all the toy cars irrespective of their brand, size, shape, and color |
| toy_dolls | This product type categorizes all the toy dolls irrespective of their brand, size, shape, and color |
| toy_airplanes | This product type categorizes all the toy airplanes irrespective of their brand, size, shape, and color |

{% enddocs %}
tip

You can create one file per doc block or add multiple doc blocks to a single file. The key is to open each block with {% docs <doc name> %}, using a unique name, and to close it with {% enddocs %}.

Lastly, to refer to a doc block, use the statement "{{ doc('<doc name>') }}" as a model or column description.

For instance:

/dataops/modelling/models/stg_product_types.yml
version: 2

models:
  - name: stg_orders
    description:
    columns:
      - name: order_id
      - name: product_type
        description: "{{ doc('product_types') }}"
      - name: items_ordered
      - name: order_date
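
Because a doc block can serve as a model description as well as a column description, the same call can also sit at the model level. The fragment below is a minimal sketch under one assumption: a doc block named stg_orders_overview has been defined in a Markdown file in /dataops/modelling/models. That name is hypothetical and is not part of the example project.

```yaml
version: 2

models:
  - name: stg_orders
    # Assumption: a doc block named stg_orders_overview exists in a .md file
    # in the models directory. The name is hypothetical.
    description: "{{ doc('stg_orders_overview') }}"
    columns:
      - name: product_type
        # Reuses the product_types doc block defined in product_types.md.
        description: "{{ doc('product_types') }}"
```

Keeping the quotes around the {{ doc(...) }} expression, as in the example above, stops the YAML parser from treating the opening brace as the start of a mapping.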

Generating the Documentation

By default, project documentation is generated automatically at the end of a pipeline run. The code for the job that does this is similar to the following YAML snippet:

generate_model_docs:
  extends:
    - .modelling_and_transformation_base
    - .agent_tag
  variables:
    TRANSFORM_ACTION: DOCS
  stage: "Generate Docs"
  script:
    - /dataops
  artifacts:
    when: always
    name: modelling_and_transformation
    paths:
      - $TRANSFORM_PROJECT_PATH/target
  icon: ${TRANSFORM_ICON}
note

The artifacts in this job must not be changed, or this job's documentation will not show up as part of the automated documentation.

Viewing the Documentation

The following steps show how to view the project documentation.

1. View Documentation

To open the automated documentation, go to CI/CD -> Pipelines and select the View Documentation menu option on the right side of the relevant pipeline row.

[Image: view-documentation]

2. Project Overview

This opens a new interface that shows the project overview.

[Image: docs-overview]

3. Model Relationships

This interface also lets you view model relationships.

[Image: model-relationships]

4. Model Details

Lastly, it lets you drill down into a model's details.

[Image: model-details]