When actively developing MATE models in a feature branch, especially with larger projects, it can be frustrating to have to wait for each pipeline to build all the project's models.
You can alleviate this by adding a runner script that detects which models have changed since your branch's previous successful pipeline, and then calling that script from a tweaked version of the Build all Models job (or your project's equivalent).
In your project, create a runner script to detect the model changes:
runner-scripts/30-detect-model-changes.sh
api_response=$(curl -s --header "PRIVATE-TOKEN: $DATAOPS_ACCESS_TOKEN" "https://app.dataops.live/api/v4/projects/$CI_PROJECT_ID/pipelines?ref=$CI_COMMIT_REF_NAME&status=success&order_by=id&sort=desc")
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "api_response: $api_response"; fi
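# The API call sorts pipelines newest first, so the first array entry holds the latest successful pipeline's commit
last_commit=$(jq -r '.[0].sha' <<< "$api_response")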
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "last_commit: $last_commit"; fi
echo "Grepping the diff against the last pipeline's commit for: $FIND_CHANGED_FILE"
found_files=$(git diff --name-only $last_commit | grep -i "$FIND_CHANGED_FILE")
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "found_files: $found_files"; fi
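while IFS= read -r line; do
  # Assumed loop body (missing from the source): collect each changed file's model name, taken as its file name without .sql
  if [[ -n "$line" ]]; then changed_models="$changed_models $(basename "$line" .sql)"; fi
done <<< "$found_files"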
changed_models=$(echo "$changed_models" | xargs)
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "changed_models: $changed_models"; fi
if [[ -n "$changed_models" ]]; then
  expose_key TRANSFORM_MODEL_SELECTOR "$changed_models"
else
  # No changed models found: fall back to the placeholder selector "_"
  expose_key TRANSFORM_MODEL_SELECTOR "_"
fi
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "TRANSFORM_MODEL_SELECTOR: $TRANSFORM_MODEL_SELECTOR"; fi
This script finds the commit SHA of your branch's previous successful pipeline and computes a diff of filenames between that commit and the current one. The list of filenames is then searched for a given path pattern (FIND_CHANGED_FILE), in this case dataops/modelling/models/.*\.sql, to find any changed SQL models, and the resulting list of model names is passed straight into TRANSFORM_MODEL_SELECTOR.
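The filtering step described above can be tried offline. The following is a minimal sketch, not the runner script itself: `models_from_files` is a hypothetical helper, and the file paths are made-up sample data standing in for real `git diff --name-only` output. It assumes that a model's name is its file name without the `.sql` extension.

```shell
#!/bin/bash
# Sketch of the filter-and-rename step, run against a hard-coded file list
# instead of real `git diff --name-only` output.

# Hypothetical helper (not part of DataOps): turn a newline-separated list of
# changed paths into a space-separated list of model names.
models_from_files() {
  local pattern="$1" files="$2" models="" line
  while IFS= read -r line; do
    # Keep only paths matching the pattern (case-insensitive, like grep -i).
    if grep -qi -- "$pattern" <<< "$line"; then
      models="$models $(basename "$line" .sql)"
    fi
  done <<< "$files"
  # xargs with no command trims and collapses the surrounding whitespace.
  echo "$models" | xargs
}

# Made-up sample diff output: two models and one unrelated file.
sample_diff=$'dataops/modelling/models/staging/stg_orders.sql\nREADME.md\ndataops/modelling/models/marts/dim_customer.sql'

changed=$(models_from_files 'dataops/modelling/models/.*\.sql' "$sample_diff")

# Fall back to "_" when nothing matched, as the runner script does.
echo "TRANSFORM_MODEL_SELECTOR would be: ${changed:-_}"
```

With the sample list, only the two paths under dataops/modelling/models/ survive the filter, and README.md is dropped.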
Edit your existing Build all Models job (or equivalent) to ensure it will only run on non-feature branches (dev, qa and master).
pipelines/includes/local_includes/modelling_and_transformation/build_all_models.yml
Build all Models:
  stage: Data Transformation
  # ... keep the rest of the existing job definition ...
  rules:
    - if: '$CI_COMMIT_REF_NAME == "master" || $CI_COMMIT_REF_NAME == "qa" || $CI_COMMIT_REF_NAME == "dev"'
Add a new job as a copy of Build all Models, adding the following lines to the new job:
pipelines/includes/local_includes/modelling_and_transformation/build_all_models.yml
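Build all Models: ...

Build Changed Models ONLY:
  stage: Data Transformation
  variables:
    # Pattern the runner script greps for in the diff (placement here is an assumption; the source states only the value)
    FIND_CHANGED_FILE: dataops/modelling/models/.*\.sql
  script:
    # Add these two lines before the script entries copied from Build all Models
    - cp $CI_PROJECT_DIR/runner-scripts/30-detect-model-changes.sh /runner-scripts/
    - chmod +x /runner-scripts/30-detect-model-changes.sh
  rules:
    - if: '$CI_COMMIT_REF_NAME != "master" && $CI_COMMIT_REF_NAME != "qa" && $CI_COMMIT_REF_NAME != "dev"'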
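The two rule conditions are complementary, so exactly one of the two jobs runs for any given branch. As a rough illustration only, the branch test can be mirrored in a small shell function; `job_for_branch` is a hypothetical name used for this sketch and is not part of DataOps or the pipeline configuration.

```shell
#!/bin/bash
# Hypothetical helper mirroring the two rule conditions: the protected
# branches get the full build, every other branch gets the changed-only build.
job_for_branch() {
  case "$1" in
    master|qa|dev) echo "Build all Models" ;;
    *)             echo "Build Changed Models ONLY" ;;
  esac
}

# Example branch names; feature/new-model is made up for illustration.
for branch in master qa dev feature/new-model; do
  echo "$branch -> $(job_for_branch "$branch")"
done
```

Checking a few branch names this way before committing can help catch a rule that accidentally matches both jobs or neither.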
In the same way, if you want tests to only run on changed models, also update MATE testing jobs with this logic.
Update variables.yml to include a reference to your DataOps access token (needed for the API call in the runner script):
pipelines/includes/config/variables.yml
When your pipeline runs in a feature branch, the new versions of the MATE jobs will run, only building/testing models that have changed since the previous pipeline's commit.