Secrets Manager Orchestrator

Professional
Enterprise

Image: $DATAOPS_SECRETSMANAGER_RUNNER_IMAGE

The Secrets Manager orchestrator allows a DataOps pipeline to retrieve passwords, keys, and other sensitive information from a remote secrets manager service and add them to the pipeline's vault.

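This orchestrator currently supports:

  • AWS Secrets Manager
  • AWS SSM Parameter Store
  • Azure Key Vault
  • HashiCorp Vault
  • Custom secrets manager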

Secrets are stored in and managed by a third-party remote secrets manager rather than on the orchestrator host. You configure the Secrets Manager orchestrator with the secrets' location. At runtime, it fetches the secrets and inserts them into the DataOps vault, making them available to all the jobs in the pipeline.

Usage

You can use the orchestrator in different ways:

Basic usage

pipelines/includes/local_includes/secrets_management_jobs/secrets_manager_load.yml
"Load Secrets":
  extends:
    - .agent_tag
  stage: "Vault Initialisation"
  image: $DATAOPS_SECRETSMANAGER_RUNNER_IMAGE
  variables:
  script:
    - /dataops
  icon: ${SECRETSMANAGER_ICON}

Using JSON

With the supported technologies, you can store a JSON object in a secret (an entity within the secrets manager) rather than individual values. Use the parameter SECRETS_EXPAND_JSON to make the orchestrator expand the JSON object and merge its data into the DataOps vault, as follows:

  1. Set SECRETS_EXPAND_JSON to 1 (or True).
  2. Add an example secret named SNOWFLAKE and its values.

SNOWFLAKE
{
  "TRANSFORM": {
    "USERNAME": "DATAOPS_ADMIN",
    "PASSWORD": "abcde12345"
  }
}

The result in the vault looks like this:

SNOWFLAKE:
  TRANSFORM:
    USERNAME: DATAOPS_ADMIN
    PASSWORD: abcde12345
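
As a rough illustration of this expansion, and not the orchestrator's actual implementation, the following Python sketch parses a JSON secret and merges it into a nested dictionary that stands in for the vault. The function names `deep_merge` and `expand_json_secret`, the in-memory `vault` dictionary, and the pre-existing `ACCOUNT` entry are assumptions made for this example.

```python
import json
from typing import Any


def deep_merge(target: dict[str, Any], source: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge source into target, in place, and return target."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = value
    return target


def expand_json_secret(vault: dict[str, Any], name: str, raw_value: str) -> dict[str, Any]:
    """Parse a JSON secret and merge it into the vault under the secret's name."""
    parsed = json.loads(raw_value)
    return deep_merge(vault, {name: parsed})


# Hypothetical existing vault content, to show that merging keeps other keys.
vault = {"SNOWFLAKE": {"ACCOUNT": "example_account"}}
secret_value = '{"TRANSFORM": {"USERNAME": "DATAOPS_ADMIN", "PASSWORD": "abcde12345"}}'

expand_json_secret(vault, "SNOWFLAKE", secret_value)
print(vault["SNOWFLAKE"]["TRANSFORM"]["USERNAME"])  # DATAOPS_ADMIN
```

Merging recursively, rather than overwriting the top-level key, is a design choice of this sketch so that unrelated entries under SNOWFLAKE survive.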

Troubleshooting

When using the Secrets Manager with a DataOps runner deployed on AWS EC2, review the necessary Instance Metadata Service Version 2 (IMDSv2) configuration changes.

Supported secrets managers

We currently support the secrets managers below. For details about best practices when using secrets managers, see Security best practices.

AWS Secrets Manager

To correctly merge secrets into the vault, the key names must be a fully-namespaced vault path in the usual dotted notation. For example, the keys in the AWS secret must be SNOWFLAKE.TRANSFORM.USERNAME and SNOWFLAKE.TRANSFORM.PASSWORD, as displayed in figure 1 below, to end up with a vault that looks like this:

SNOWFLAKE:
  TRANSFORM:
    USERNAME: DATAOPS_ADMIN
    PASSWORD: abcde12345

Figure 1: AWS Secret Keys

[Image: secrets_mgr_example.png]
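
The mapping from dotted key names to the nested vault structure can be sketched in Python as follows. This is an illustration under assumptions, not the orchestrator's code: `nest_dotted_keys` is a hypothetical helper, and the dictionary `aws_secret` stands in for the key/value pairs of the AWS secret.

```python
from typing import Any


def nest_dotted_keys(flat: dict[str, Any]) -> dict[str, Any]:
    """Turn {'A.B.C': value} pairs into a nested {'A': {'B': {'C': value}}} dictionary.

    This sketch does not handle conflicting paths (a key that is both a value
    and a namespace).
    """
    nested: dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Stand-in for the key/value pairs stored in the AWS secret.
aws_secret = {
    "SNOWFLAKE.TRANSFORM.USERNAME": "DATAOPS_ADMIN",
    "SNOWFLAKE.TRANSFORM.PASSWORD": "abcde12345",
}
vault = nest_dotted_keys(aws_secret)
print(vault["SNOWFLAKE"]["TRANSFORM"]["USERNAME"])  # DATAOPS_ADMIN
```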

To retrieve a single secret, or a specific list of secrets, from the AWS Secrets Manager, set the value of SECRETS_SELECTION to the secret's name (or a comma-separated list of names), as displayed in the AWS console.

AWS Secrets Manager supports the automatic rotation of secrets for extra security. Learn how to set up an automatic rotation for AWS Secrets Manager secrets using the console in the AWS documentation.

AWS SSM Parameter Store

To correctly merge secrets into the vault from the AWS SSM Parameter Store, the parameter names must match a fully-namespaced vault path, with an optional prefix. The slashes in the name are automatically replaced with dots.

For example, a parameter named /SNOWFLAKE/TRANSFORM/USERNAME is stored in the vault under the key SNOWFLAKE.TRANSFORM.USERNAME.

If the parameter name has a prefix, as in /dataops/SNOWFLAKE/TRANSFORM/USERNAME, you can remove it with the variable SECRETS_STRIP_PREFIX. For this example, use the value /dataops/.

To select a subset of parameters using a path prefix, set the SECRETS_SELECTION value to the path. For example:

SECRETS_SELECTION: /dataops/
SECRETS_STRIP_PREFIX: /dataops/
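
The name conversion described in this section can be sketched in Python as follows. The helper `ssm_name_to_vault_key` is hypothetical and only mirrors the documented behavior (strip the prefix, then replace slashes with dots); how the orchestrator implements it internally is not shown here.

```python
def ssm_name_to_vault_key(name: str, strip_prefix: str = "") -> str:
    """Convert an SSM parameter name to a dotted vault key, removing an optional prefix."""
    if strip_prefix and name.startswith(strip_prefix):
        name = name[len(strip_prefix):]
    # Drop leading/trailing slashes, then use dots as the namespace separator.
    return name.strip("/").replace("/", ".")


print(ssm_name_to_vault_key("/SNOWFLAKE/TRANSFORM/USERNAME"))
# SNOWFLAKE.TRANSFORM.USERNAME
print(ssm_name_to_vault_key("/dataops/SNOWFLAKE/TRANSFORM/USERNAME", strip_prefix="/dataops/"))
# SNOWFLAKE.TRANSFORM.USERNAME
```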

Azure Key Vault

To merge secrets into the vault correctly, the secret names must match a fully-namespaced vault path. As secret names in Key Vault cannot contain dots, you must use dashes as separators.

For example, a secret named SNOWFLAKE-TRANSFORM-USERNAME will be stored in the vault under the key SNOWFLAKE.TRANSFORM.USERNAME.

By default, the Secrets Manager orchestrator retrieves all secrets in the specified vault. To retrieve only a single secret, or several specific secrets, set the variable SECRETS_SELECTION to the secret's name or to a comma-separated list of names.

Warning

The DataOps vault complies with YAML syntax, so you must store secrets in a hierarchical format that follows the YAML syntax rules. Otherwise, retrieving the secret's value results in an error.

The example below shows an intermediate namespace that also holds a value (the THREE: three part). This is not supported and results in an error:

ONE:
  TWO:
    THREE: three
      FOUR: four
    FIVE: five
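
To check secret names for this kind of conflict before loading them, a validation step along the following lines may help. It is a sketch under assumptions: `insert_secret` is a hypothetical helper that builds a nested dictionary from dotted keys and raises an error when a key would be both a value and a namespace. It does not reproduce the orchestrator's own error handling.

```python
from typing import Any


def insert_secret(vault: dict[str, Any], dotted_key: str, value: Any) -> None:
    """Insert a value at a dotted path, refusing paths that mix values and namespaces."""
    *parents, leaf = dotted_key.split(".")
    node = vault
    for part in parents:
        child = node.setdefault(part, {})
        if not isinstance(child, dict):
            raise ValueError(f"'{part}' in '{dotted_key}' already holds a value")
        node = child
    if isinstance(node.get(leaf), dict):
        raise ValueError(f"'{dotted_key}' is already a namespace")
    node[leaf] = value


vault: dict[str, Any] = {}
insert_secret(vault, "ONE.TWO.THREE", "three")
try:
    # THREE already holds the value "three", so it cannot also contain FOUR.
    insert_secret(vault, "ONE.TWO.THREE.FOUR", "four")
    conflict = None
except ValueError as error:
    conflict = str(error)
print(conflict)
```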

HashiCorp Vault

As with the AWS services, secret names can use slash separators, which are converted to dots when the secrets are loaded into DataOps. Currently, the Vault integration supports KV version 1 and KV version 2 secrets. When using KV version 2, DataOps pulls the latest version of each secret by default.

Use the parameter SECRETS_SELECTION to return a subset of the secrets in the selected mount point.

Authentication sequence

When you add multiple authentication parameters, they will be tried in the following order:

  1. JWT
  2. Token
  3. Username/password
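
The ordering above can be expressed as a small Python sketch. The parameter names come from the HashiCorp Vault parameters of this orchestrator; the function `auth_methods_to_try` and the sample values are assumptions for illustration. The sketch only lists the configured methods in the documented order and says nothing about how the orchestrator reacts when an attempt fails.

```python
from typing import Mapping


def auth_methods_to_try(settings: Mapping[str, str]) -> list[str]:
    """List the configured authentication methods in the documented order."""
    methods: list[str] = []
    if settings.get("SECRETS_HASHICORP_VAULT_JWT_ROLE"):
        methods.append("jwt")
    if settings.get("SECRETS_HASHICORP_VAULT_TOKEN"):
        methods.append("token")
    if settings.get("SECRETS_HASHICORP_VAULT_USERNAME") and settings.get(
        "SECRETS_HASHICORP_VAULT_PASSWORD"
    ):
        methods.append("userpass")
    return methods


# Sample values only; the insertion order of the dictionary does not matter.
example_settings = {
    "SECRETS_HASHICORP_VAULT_TOKEN": "example-token",
    "SECRETS_HASHICORP_VAULT_JWT_ROLE": "dataops",
}
print(auth_methods_to_try(example_settings))  # ['jwt', 'token']
```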

Prerequisites to using JWT

If you use the JWT provided by DataOps in the variable CI_JOB_JWT, make sure to have the following configuration in HashiCorp Vault:

  • JWT authentication enabled and configured to use jwks_url="https://app.dataops.live/-/jwks" and bound_issuer="app.dataops.live"
  • A policy that allows list and read capabilities to the secrets that DataOps will be using
  • A role that uses the above policy and adds bound claims for the project/namespace IDs and the branch/environment, as required

Example JWT Policy
{
  "role_type": "jwt",
  "policies": ["dataops_read"],
  "token_explicit_max_ttl": 60,
  "user_claim": "user_email",
  "bound_claims_type": "glob",
  "bound_claims": {
    "project_id": "12345"
  }
}

You can view the token by using this example job:

Example JWT Job
Example JWT Job:
  extends: .agent_tag
  image: $DATAOPS_UTILS_RUNNER_IMAGE
  stage: Pipeline Initialisation
  script:
    - echo "$CI_JOB_JWT" > CI_JOB_JWT.txt
  artifacts:
    paths:
      - CI_JOB_JWT.txt
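
Once you have the token, you may want to look at its claims when deciding which bound claims to configure. The Python sketch below decodes the payload segment of a JWT without verifying the signature, so it is suitable for inspection only. The sample token is made up inside the example; the claim names `project_id` and `user_email` are taken from the role example above, and the helper names are hypothetical.

```python
import base64
import json
from typing import Any


def decode_jwt_claims(token: str) -> dict[str, Any]:
    """Decode the payload segment of a JWT without verifying its signature."""
    payload_segment = token.split(".")[1]
    padding = "=" * (-len(payload_segment) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_segment + padding))


def _encode_segment(data: dict[str, Any]) -> str:
    """Helper for this example only: base64url-encode a JSON object without padding."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build a made-up, unsigned token so the example runs without a real CI_JOB_JWT.
sample_token = ".".join([
    _encode_segment({"alg": "none"}),
    _encode_segment({"project_id": "12345", "user_email": "user@example.com"}),
    "signature-placeholder",
])

claims = decode_jwt_claims(sample_token)
print(claims["project_id"])  # 12345
```

To inspect a real token, pass the contents of CI_JOB_JWT.txt to `decode_jwt_claims` instead of `sample_token`.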

Custom secrets manager

Using a custom Python script to load secrets into a DataOps pipeline is also possible. The script will reside in the DataOps project (or a reference project) and can have any name. The script will function as a Python module, providing a single function named get_secrets which accepts no parameters and returns a dictionary of key-value pairs. For example:

scripts/custom_secrets.py
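from typing import Any


def get_secrets() -> 'dict[str, Any]':
    """Return the secrets to load, as a dictionary of key-value pairs."""
    secrets = {}

    # some_other_function_to_get_password() is a placeholder for your own retrieval logic
    secrets['SNOWFLAKE.SOLE.PASSWORD'] = some_other_function_to_get_password()
    ...

    return secrets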

Potential uses include adding support for customer-specific, in-house built, or other unsupported secrets management technologies or providing a more complex, customizable interface to secrets management.
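
As one possible shape for such a script, the sketch below collects secrets from environment variables. Everything specific in it is an assumption for illustration: the `INHOUSE_SECRET_` prefix, the double-underscore convention for namespace separators, and the demo value. Only the `get_secrets` name, its lack of parameters, and its dictionary return value follow the contract described above.

```python
import os
from typing import Any

ENV_PREFIX = "INHOUSE_SECRET_"  # hypothetical naming convention, not part of DataOps


def get_secrets() -> dict[str, Any]:
    """Collect secrets from environment variables that start with ENV_PREFIX.

    INHOUSE_SECRET_SNOWFLAKE__SOLE__PASSWORD becomes SNOWFLAKE.SOLE.PASSWORD.
    """
    secrets: dict[str, Any] = {}
    for name, value in os.environ.items():
        if name.startswith(ENV_PREFIX):
            vault_key = name[len(ENV_PREFIX):].replace("__", ".")
            secrets[vault_key] = value
    return secrets


# Demo value only, set here so the example runs on its own.
os.environ["INHOUSE_SECRET_SNOWFLAKE__SOLE__PASSWORD"] = "example-password"
result = get_secrets()
print(sorted(result))
```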

Get Support

To maintain a seamless customer experience using a custom Python script with DataOps.live, we recommend you reach out to our Support team to help with your development.

Security best practices

When using any of the supported secrets managers, follow security best practices for the secrets and credentials you create. These include the following:

  • Safeguard your root credentials. Don't generate any access keys for the root user or use the root user for any deployments or everyday operations.
  • Apply the principle of least privilege to the permissions granted by any IAM (Identity and Access Management) policy: only grant the permissions required to perform a specific task.
  • Use managed credentials where possible. For example, for AWS, this would be an instance profile attached to an EC2 instance or any AWS services that support a configurable IAM role. For Azure, you can use managed identities with supported services.

Supported parameters

The following sections describe the supported parameters:

General parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| SECRETS_EXPAND_JSON | Optional. Defaults to False | Handle compound secrets stored as JSON by merging the whole structure into the DataOps Vault |
| SECRETS_MANAGER | Optional. Defaults to AWS_SECRETS_MANAGER | Specify one of the values: AWS_SECRETS_MANAGER (AWS Secrets Manager), AWS_PARAMETER_STORE (AWS SSM Parameter Store), AZURE_KEY_VAULT (Azure Key Vault), HASHICORP_VAULT (HashiCorp Vault), CUSTOM (custom script), NONE (none) |
| SECRETS_SELECTION | Optional | Comma-separated list of selectors. For each selector, specify a name to retrieve a single secret (AWS Secrets Manager) or the name prefix to retrieve (AWS Parameter Store). Otherwise, all available secrets are retrieved. |
| SECRETS_SELECTION_FILTER | Optional | Specify a prefix or substring to match against secret names and retrieve a subset of secrets |
| SECRETS_STRIP_PREFIX | Optional | Remove a prefix from key names (SSM only) |
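
The way a comma-separated SECRETS_SELECTION value narrows the set of retrieved secrets can be sketched in Python as follows. Both helpers are hypothetical, and trimming whitespace around selectors is an assumption of this sketch rather than documented behavior. An empty selection returns everything, matching the "otherwise, all available secrets are retrieved" rule.

```python
from typing import Iterable


def parse_selection(raw: str) -> list[str]:
    """Split a comma-separated SECRETS_SELECTION value into individual selectors."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def select_names(available: Iterable[str], raw_selection: str, by_prefix: bool = False) -> list[str]:
    """Return the names matching the selection, or all names if the selection is empty.

    by_prefix=True mimics prefix selection (as for AWS SSM Parameter Store);
    by_prefix=False mimics selection by exact name.
    """
    selectors = parse_selection(raw_selection)
    if not selectors:
        return list(available)
    if by_prefix:
        return [name for name in available if any(name.startswith(s) for s in selectors)]
    return [name for name in available if name in selectors]


names = ["/dataops/SNOWFLAKE/TRANSFORM/USERNAME", "/other/API/KEY"]
print(select_names(names, "/dataops/", by_prefix=True))
# ['/dataops/SNOWFLAKE/TRANSFORM/USERNAME']
print(select_names(names, ""))
# ['/dataops/SNOWFLAKE/TRANSFORM/USERNAME', '/other/API/KEY']
```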

AWS-specific parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| SECRETS_AWS_REGION | Optional. Defaults to eu-west-2 | Use this AWS region |
| SECRETS_AWS_USE_ROLE | Optional. Defaults to False | Set to True to use implicit authentication from this orchestrator's EC2 instance role |
| SECRETS_AWS_ACCESS_KEY_LOCATION | Optional. Defaults to AWS.DEFAULT.S3_KEY | Use keys from this vault location when authenticating with AWS |
| SECRETS_AWS_SECRET_KEY_LOCATION | Optional. Defaults to AWS.DEFAULT.S3_SECRET | Use keys from this vault location when authenticating with AWS |

Azure-specific parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| SECRETS_AZURE_CLIENT_SECRET_LOCATION | Optional. Defaults to AZURE.DEFAULT.CLIENT_SECRET | Set this parameter inside a directory in your runner. If not using a managed identity, this is the DataOps Vault location of the client secret for authentication. |
| SECRETS_AZURE_USE_MANAGED_IDENTITY | Optional. Defaults to 1 | Use the managed identity associated with the orchestrator's VM to authenticate with the Key Vault. Set to 0 (zero) to use client secret instead. |
| SECRETS_AZURE_TENANT_ID | Optional | If not using a managed identity, this is the Azure tenant ID |
| SECRETS_AZURE_CLIENT_ID | Optional | If not using a managed identity, this is the Azure client ID |
| SECRETS_AZURE_KEY_VAULT_URL | Optional | URL to the Key Vault instance to access |

HashiCorp Vault-specific parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| SECRETS_HASHICORP_VAULT_MOUNT_POINT | REQUIRED | Vault mount point to use |
| SECRETS_HASHICORP_VAULT_URL | REQUIRED | URL of the Vault instance that will be used |
| SECRETS_HASHICORP_VAULT_NAMESPACE | Optional | Vault namespace to connect to |
| SECRETS_HASHICORP_VAULT_JWT_ROLE | Optional | If specified, JWT authentication will be attempted using the CI_JOB_JWT and the role with this name |
| SECRETS_HASHICORP_VAULT_TOKEN | Optional | If provided, token authentication will be attempted |
| SECRETS_HASHICORP_VAULT_USERNAME | Optional | If provided (along with SECRETS_HASHICORP_VAULT_PASSWORD), userpass authentication will be attempted |
| SECRETS_HASHICORP_VAULT_PASSWORD | Optional | Required (along with SECRETS_HASHICORP_VAULT_USERNAME) for userpass authentication |
| SECRETS_HASHICORP_VAULT_KV_VERSION | Optional | Default value is 2; set it to 1 only if you use a version 1 vault mount point |

Custom script-specific parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| SECRETS_CUSTOM_SCRIPT | REQUIRED | Location of the Python script that retrieves secrets (absolute path or relative to project root) |

Example jobs

Standard Load Secrets job

This is the standard Load Secrets job used in all pipelines, as defined in the DataOps Reference Project.

pipelines/includes/local_includes/secrets_manager_jobs/load_secrets.yml
Load Secrets:
  extends:
    - .agent_tag
  stage: Vault Initialisation
  image: $DATAOPS_SECRETSMANAGER_RUNNER_IMAGE
  script: /dataops
  icon: ${SECRETSMANAGER_ICON}

Retrieve single secret

Note

Since pipelines only include a single Load Secrets job, the parameters for this orchestrator are usually managed in the project's main variables.yml file. Therefore, it is unnecessary to reproduce the entire Load Secrets job here, as all configuration is done via the parameters detailed above.

To select a single secret (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault) or a single parameter prefix (AWS SSM Parameter Store):

variables.yml
variables:
  ...
  SECRETS_SELECTION: path/to/secret

Retrieve multiple secrets

To select multiple secrets (AWS Secrets Manager, Azure Key Vault) or multiple parameter prefixes (AWS SSM Parameter Store):

variables.yml
variables:
  ...
  SECRETS_SELECTION: path/to/secret,path/to/another

Note

The HashiCorp Vault integration does not yet support this syntax.