Skip to main content

dbt 1.5 Migration

Running dbt as MATE action

As dbt version 1.5 becomes the default for the MATE orchestrator and MATE packages, consider running dbt as a MATE action in your DataOps projects unless it is absolutely necessary. Find below the reasoning behind this.

Macro changes

Rename of dbt_utils.surrogate_key macro

The macro surrogate_key was renamed to generate_surrogate_key as part of the 1.0.0 release of dbt_utils. You can read more about the release at this Migration guide.

Due to this change, usage of the macro surrogate_key would lead to the following error:

Warning: `dbt_utils.surrogate_key` has been replaced by `dbt_utils.generate_surrogate_key`.
The new macro treats null values differently to empty strings. To restore the behavior of the original macro, add a global variable in dbt_project.yml called `surrogate_key_treat_nulls_as_empty_strings` to your dbt_project.yml file with a value of True.

To resolve this error, you must replace the macro surrogate_key with generate_surrogate_key.

Failing configuration:

{{ config(alias='CUSTOMER') }}

SELECT
C_CUSTKEY,
C_NAME,
C_ADDRESS,
C_NATIONKEY,
C_PHONE,
C_ACCTBAL,
C_MKTSEGMENT,
C_COMMENT,
{{dbt_utils.surrogate_key(['C_CUSTKEY','C_NAME','C_ADDRESS'])}} AS C_ID
FROM {{ source('snowflake_sample_data_tpch_sf1', 'CUSTOMER') }}

Successful configuration:

{{ config(alias='CUSTOMER') }}

SELECT
C_CUSTKEY,
C_NAME,
C_ADDRESS,
C_NATIONKEY,
C_PHONE,
C_ACCTBAL,
C_MKTSEGMENT,
C_COMMENT,
{{dbt_utils.generate_surrogate_key(['C_CUSTKEY','C_NAME','C_ADDRESS'])}} AS C_ID
FROM {{ source('snowflake_sample_data_tpch_sf1', 'CUSTOMER') }}
note

There is a behavior change between the two macros. surrogate_key() treated null values and blank strings the same, which could lead to duplicate keys being created. generate_surrogate_key() does not have this flaw. If needed, it's possible to opt into the legacy behavior by setting the following variable in your dbt project:

dbt_project.yml
vars:
surrogate_key_treat_nulls_as_empty_strings: true #turn on legacy behavior

Rename of dbt_utils.not_null_where test

Some functionalities that were part of dbt_utils were migrated to dbt-core as part of the dbt_utils' 1.0 release. You can read more about the release at the Migration duide.

This change updated one of the tests in dbt_utils, dbt_utils.not_null_where. Using the test would lead to the following error:

'dict object' has no attribute 'test_not_null_where'. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with "dbt deps".

The test is now available as native functionality in dbt-core as not_null with the where parameter.

Failing configuration:

version: 2
models:
- name: samples_customer
columns:
- name: C_CUSTKEY
description: ""
tests:
- unique
- not_null
- name: C_NATIONKEY
description: ""
tests:
- dbt.not_null_where:
where: "C_CUSTKEY > 1"

Successful configuration:

version: 2
models:
- name: samples_customer
columns:
- name: C_CUSTKEY
description: ""
tests:
- unique
- not_null
- name: C_NATIONKEY
description: ""
tests:
- not_null:
where: "C_CUSTKEY > 1"

apply_grants Macro

dbt 1.5 introduced a new macro apply_grants which can lead to an error if you have a macro with the same name in your project. The error would be in the following format

macro 'dbt_macro__apply_grants' takes no keyword argument 'should_revoke'

You can resolve this by renaming the macro in your project.

Failing configuration:

{% macro apply_grants(roles,
privileges='SELECT',
view_privileges='SELECT',
database=target.database,
schema=this.schema,
relation=this.name,
override_master_check=false,
include_sensitive=false,
exclusion_list=[]) %}

{% endmacro %}

Successful configuration:

{% macro apply_grants_local(roles,
privileges='SELECT',
view_privileges='SELECT',
database=target.database,
schema=this.schema,
relation=this.name,
override_master_check=false,
include_sensitive=false,
exclusion_list=[]) %}

{% endmacro %}

This change also requires renaming the usage of your macro to the new name. For more information about macro rename, visit the section Resolving macros collisions when changing dbt versions.

Macros collisions when changing dbt versions

When you decide to upgrade to a new dbt version, it is essential to be mindful of potential macro collisions that may arise. Different dbt versions often come with updates to bundled and custom-installed third-party packages. This may lead to conflicts in the macros used within your dbt project.

A macro collision occurs when discrepancies between macro definitions exist in different package versions. For instance, consider a scenario where a macro named my_macro is currently being used in your dbt project. In dbt version 1.0.9, the my_macro macro is defined within the my_package package version 0.8.0. However, if you upgrade to dbt version 1.4, the my_package package will be updated to version 1.0.0, which might result in changes to the my_macro macro. These changes could include alterations to the macro's parameters or implementation. Additionally, it is possible that the my_macro has been renamed to my_macro_updated or is entirely missing, which could lead to job failures.

When you encounter macro conflicts while upgrading your dbt version, there are specific steps you can take to resolve them effectively. This topic addresses these conflicts to ensure a smooth transition and an uninterrupted operation.

Macro has changed between package versions:

Option 1: Update the model

  1. Examine the new macro's input requirements in the upgraded package version.
  2. Modify the model that uses the macro to align with the new macro's input requirements.

Option 2: Overwrite the macro

  1. If you prefer to retain the old macro version, overwrite the new version.
  2. Create a new file named my_macro.sql in the /dataops/modelling/macros folder of your dbt project. This action replaces any built-in macro with the same name, preserving your desired behavior.

Macro has a new name between package versions:

Option 1: Update the model call

  1. Identify the new name of the macro (e.g., my_macro_v2) from the upgraded package version.
  2. Modify your model to call the macro with the new name, ensuring seamless integration.

Option 2: Create an alias

  1. Create an alias to continue using the old macro name.
  2. Create a new file named my_macro.sql in the /dataops/modelling/macros folder.
  3. Inside the model files, include the macro call using the new macro name (my_macro_v2). This alias approach will allow you to use the macro with its previous name while using the updated version.

Following these guidelines, you can effectively manage macro collisions when upgrading dbt versions. Understanding the potential issues and employing the appropriate solutions will help keep your dbt project running smoothly and prevent any disruptions caused by conflicting macros. Stay proactive and ensure a seamless transition when adapting to new dbt versions.

Empty tests keyword

MATE jobs would fail if their configuration contains an empty tests: keyword in your YAML file with an error message in the following format:

Parsing Error
Invalid models config given in models/customer.yml @ models: {'name': 'customer', 'tests': None, 'description': 'List of all customers', 'original_file_path': 'models/customer.yml', 'yaml_key': 'models', 'package_name': 'DataOps'} - at path ['tests']: None is not of type 'array'

You can resolve this either by removing the empty tests: keyword or setting it to an empty list like tests: [].

Failing configuration:

version: 2
models:
- name: customer
tests:
description: List of all customers

Successful configuration:

version: 2
models:
- name: customer
description: List of all customers

Deprecated package dlx_modelling_and_transformation_libs removed

As part of the dbt 1.5 update, we have removed the deprecated package dlx_modelling_and_transformation_libs. Consequently, using this package will result in an error formatted as below:

Encountered an error while running operation: Compilation Error
'dlx_modelling_and_transformation_libs' is undefined

> in macro

All functionalities of the package are still available in the package dataops.

To fix this error, replace all instances of dlx_modelling_and_transformation_libs with dataops in the macro or models that are still using the old name.

Failing configuration:

{% macro get_all_users(
) %}

{%- set get_users_sql -%}
select * from SNOWFLAKE.ACCOUNT_USAGE.USERS;
{%- endset -%}
{{ log("get_users_sql: " ~ get_users_sql ,true)}}
{{ dlx_modelling_and_transformation_libs.execute_sql(get_users_sql) }}

{% endmacro %}

Successful configuration:

{% macro get_all_users(
) %}

{%- set get_users_sql -%}
select * from SNOWFLAKE.ACCOUNT_USAGE.USERS;
{%- endset -%}
{{ log("get_users_sql: " ~ get_users_sql ,true)}}
{{ dataops.execute_sql(get_users_sql) }}

{% endmacro %}