How to Detect if a File Has Changed
Sometimes, you may need to trigger a specific action in a pipeline only if a particular file in the project has changed since the last pipeline run in that environment.
This kind of logic is a little at odds with the idempotent properties that DataOps pipelines are generally expected to exhibit, so only use this technique if there is a valid reason to do so.
A valid reason for this approach may be where a DataOps pipeline is triggering an external system/API that has usage limits or a per-action cost model. In that case, it would be cost-effective to only interact with that system if something has changed.
Because this approach determines whether the file has changed from inside a running job, you cannot use rules logic to include or exclude jobs from the pipeline.
In your project, create a runner script to detect the file change:
runner-scripts/30-detect-file-change
#!/bin/bash

# List the successful pipelines for the current branch, most recent first
api_response=$(curl -s --header "PRIVATE-TOKEN: $ACCESS_TOKEN" "https://app.dataops.live/api/v4/projects/$CI_PROJECT_ID/pipelines?ref=$CI_COMMIT_REF_NAME&status=success&order_by=id&sort=desc")
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "api_response: $api_response"; fi
# Take the commit SHA of the first (most recent) pipeline in the response
last_commit=$(jq -r '.[0].sha' <<< "$api_response")
if [[ -n "$DATAOPS_DEBUG" ]]; then echo "last_commit: $last_commit"; fi

if git diff --name-only "$last_commit" | grep "$FIND_CHANGED_FILE"; then
  echo "File $FIND_CHANGED_FILE HAS changed since the last commit, setting variable FILE_HAS_CHANGED"
  expose_key FILE_HAS_CHANGED '1'
else
  echo "File $FIND_CHANGED_FILE has NOT changed since the last commit"
fi
This script uses the variables ACCESS_TOKEN (a DataOps user access token) and FIND_CHANGED_FILE (the file path to examine). It sets the variable FILE_HAS_CHANGED if the specified file has changed since the last successful pipeline that ran on the same branch.
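The grep in the script above treats FIND_CHANGED_FILE as a pattern and matches it anywhere in the list of changed paths, so a value such as config.yml would also match old/config.yml.bak. If an exact path check is preferred, one possible sketch is shown below. The function name file_changed_since is hypothetical and not part of DataOps; it relies only on git diff --quiet, which exits with status 1 when differences exist.

```shell
#!/bin/bash

# Hypothetical helper (not a DataOps built-in).
# Succeeds only when the given path differs between the working tree
# and the given commit.
file_changed_since() {
  local commit="$1" path="$2"
  local status=0
  # --quiet implies --exit-code: 0 = no differences, 1 = differences.
  # Any other status (for example an unknown commit) is an error and is
  # treated here as "not changed".
  git diff --quiet "$commit" -- "$path" || status=$?
  [[ $status -eq 1 ]]
}
```

In runner-scripts/30-detect-file-change, the condition could then read: if file_changed_since "$last_commit" "$FIND_CHANGED_FILE"; then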
Create another runner script (or adapt your existing runner script) that uses the FILE_HAS_CHANGED variable to decide whether to run the job's main activity or stop the script. Here is an example:
runner-scripts/50-do-something
#!/bin/bash
echo "Checking to see if the specified file has changed since the last commit"
if [[ -n "$FILE_HAS_CHANGED" ]]; then
  echo "The file HAS changed, let's do this thing"
else
  echo "No, the file did not change, we shall exit"
  # Exit code 0 stops this script without failing the job
  exit 0
fi
echo "Here is where we do this thing......."
Alternatively, exit with a non-zero code if you want the job, and therefore the pipeline, to fail when the file has not changed. Otherwise, the pipeline will continue after this job.
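As a minimal sketch of the failing variant, the check can be wrapped in a small function that returns a non-zero status when FILE_HAS_CHANGED is empty or unset. The name require_file_change is hypothetical and used only for illustration.

```shell
#!/bin/bash

# Hypothetical helper (not a DataOps built-in).
# Returns 0 when FILE_HAS_CHANGED is set and non-empty, 1 otherwise.
require_file_change() {
  if [[ -n "${FILE_HAS_CHANGED:-}" ]]; then
    echo "The file HAS changed, continuing"
    return 0
  fi
  echo "No, the file did not change, failing the job" >&2
  return 1
}
```

A runner script could call require_file_change || exit 1 before its main activity, so that an unchanged file fails the job instead of ending it quietly.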
Create a job to run your scripts. Here is an example:
pipelines/includes/local_includes/sample_job.yml
Do Something If File Has Changed:
  stage: Data Transformation
  script:
    - cp $CI_PROJECT_DIR/runner-scripts/* /runner-scripts/
When this job runs, it will copy both runner scripts into the runner, running in sequence as part of the