Secrets Manager Orchestrator
Professional Enterprise
Image | $DATAOPS_SECRETSMANAGER_RUNNER_IMAGE |
---|---|
The Secrets Manager orchestrator allows a DataOps pipeline to retrieve passwords, keys, and other sensitive information from a remote secrets manager service and seamlessly adds them to the pipeline's vault.
This orchestrator currently supports:
- AWS Secrets Manager
- AWS Systems Manager Parameter Store
- Azure Key Vault
- HashiCorp Vault, Open Source, Cloud, and Enterprise editions
Secrets are stored in and managed by a third-party remote secrets manager rather than on the orchestrator host. You configure the Secrets Manager orchestrator with the secrets' location so that it fetches them and inserts them into the DataOps vault at runtime, making them available to all jobs in the pipeline.
Usage
You can use the orchestrator in different ways:
Basic usage
"Load Secrets":
  extends:
    - .agent_tag
  stage: "Vault Initialisation"
  image: $DATAOPS_SECRETSMANAGER_RUNNER_IMAGE
  variables:
  script:
    - /dataops
  icon: ${SECRETSMANAGER_ICON}
Using JSON
With any of the supported technologies, you can store a JSON object, rather than an individual value, in a secret (an entity within the secrets manager). Use the parameter SECRETS_EXPAND_JSON to enable the orchestrator to expand and merge the data from a JSON object into the DataOps vault, as follows:
- Set SECRETS_EXPAND_JSON to 1 (or True).
- Add an example secret named SNOWFLAKE with the following value:
{
  "TRANSFORM": {
    "USERNAME": "DATAOPS_ADMIN",
    "PASSWORD": "abcde12345"
  }
}
The result in the vault looks like this:
SNOWFLAKE:
  TRANSFORM:
    USERNAME: DATAOPS_ADMIN
    PASSWORD: abcde12345
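As a rough illustration of what the expansion amounts to, the following Python sketch parses the JSON value of a secret and nests it under the secret's name. The helper name expand_json_secret is hypothetical, and the fallback for non-JSON values is an assumption; this is not the orchestrator's actual implementation.

```python
import json
from typing import Any


def expand_json_secret(name: str, value: str) -> dict[str, Any]:
    """Nest the parsed JSON value of a secret under the secret's name.

    Hypothetical helper, for illustration only.
    """
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        # Assumption: a value that is not valid JSON is kept as a plain string.
        parsed = value
    return {name: parsed}


# The example secret named SNOWFLAKE from the text above.
secret_value = '{"TRANSFORM": {"USERNAME": "DATAOPS_ADMIN", "PASSWORD": "abcde12345"}}'
vault_fragment = expand_json_secret("SNOWFLAKE", secret_value)
```

Here vault_fragment holds the same nested structure as the vault content shown above.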
Troubleshooting
When using the Secrets Manager with a DataOps runner deployed on AWS EC2, review the necessary Instance Metadata Service Version 2 (IMDSv2) configuration changes.
Supported secrets managers
We currently support the following secrets managers. For details about best practices when using secrets managers, see Security best practices.
AWS Secrets Manager
To correctly merge secrets into the vault, the key names must be a fully-namespaced vault path in the usual dotted notation. For example, the keys in the AWS secret must be SNOWFLAKE.TRANSFORM.USERNAME and SNOWFLAKE.TRANSFORM.PASSWORD, as displayed in figure 1 below, to end up with a vault that looks like this:
SNOWFLAKE:
  TRANSFORM:
    USERNAME: DATAOPS_ADMIN
    PASSWORD: abcde12345
Figure 1: AWS Secret Keys
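The way dotted key names map onto the nested vault structure can be sketched in Python as follows. The function name merge_dotted_keys is hypothetical, and the sketch only illustrates the naming rule; it does not handle conflicting keys and is not the orchestrator's own code.

```python
from typing import Any


def merge_dotted_keys(flat: dict[str, str]) -> dict[str, Any]:
    """Turn keys such as 'A.B.C' into nested dictionaries {'A': {'B': {'C': ...}}}."""
    nested: dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create each intermediate namespace on first use.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


aws_secret = {
    "SNOWFLAKE.TRANSFORM.USERNAME": "DATAOPS_ADMIN",
    "SNOWFLAKE.TRANSFORM.PASSWORD": "abcde12345",
}
vault = merge_dotted_keys(aws_secret)
```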
To retrieve a single secret, or a specific list of secrets, from the AWS Secrets Manager, set the value of SECRETS_SELECTION to the secret's name (or a comma-separated list of names), as displayed in the AWS console.
AWS Secrets Manager supports the automatic rotation of secrets for extra security. Learn how to set up an automatic rotation for AWS Secrets Manager secrets using the console in the AWS documentation.
AWS SSM Parameter Store
To correctly merge secrets into the vault from the AWS SSM Parameter Store, the parameter names must match a fully-namespaced vault path, optionally preceded by a prefix. The slashes in the name are automatically replaced with dots.
For example, a parameter named /SNOWFLAKE/TRANSFORM/USERNAME is stored in the vault under the key SNOWFLAKE.TRANSFORM.USERNAME.
If the parameter name has a prefix like /dataops/SNOWFLAKE/TRANSFORM/USERNAME, you can remove it by using the variable SECRETS_STRIP_PREFIX. For the above example key, use the value /dataops/.
To select a subset of parameters using a path prefix, set the SECRETS_SELECTION value to the path. For example:
SECRETS_SELECTION: /dataops/
SECRETS_STRIP_PREFIX: /dataops/
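The naming rule for parameters can be sketched in Python as follows. The function parameter_name_to_vault_key is a hypothetical name used for illustration, and the exact handling of leading slashes is an assumption based on the examples above.

```python
def parameter_name_to_vault_key(name: str, strip_prefix: str = "") -> str:
    """Map an SSM parameter name to a dotted vault key.

    strip_prefix plays the role of SECRETS_STRIP_PREFIX.
    """
    if strip_prefix and name.startswith(strip_prefix):
        name = name[len(strip_prefix):]
    # Drop any leading or trailing slash, then replace the separators with dots.
    return name.strip("/").replace("/", ".")


plain_key = parameter_name_to_vault_key("/SNOWFLAKE/TRANSFORM/USERNAME")
stripped_key = parameter_name_to_vault_key(
    "/dataops/SNOWFLAKE/TRANSFORM/USERNAME", strip_prefix="/dataops/"
)
```

Both calls yield the vault key SNOWFLAKE.TRANSFORM.USERNAME.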
Azure Key Vault
To merge secrets into the vault correctly, the secret names must match a fully namespaced vault path. As secret names in Key Vault cannot contain dots, you must use dashes as separators.
For example, a secret named SNOWFLAKE-TRANSFORM-USERNAME is stored in the vault under the key SNOWFLAKE.TRANSFORM.USERNAME.
By default, the Secrets Manager orchestrator retrieves all secrets in the specified vault. However, you can retrieve only a single secret, or a comma-separated list of secrets, by setting the variable SECRETS_SELECTION to the secret names.
The DataOps vault complies with YAML syntax, so you must store secrets in a hierarchical format that follows the YAML syntax rules. Otherwise, retrieving the secret's value results in an error.
The following example shows that an intermediate namespace with a value, the THREE: three part, is not supported and will result in an error:
Valid hierarchical secret structure:
ONE:
  TWO:
    THREE:
      FOUR: four
      FIVE: five
Invalid hierarchical secret structure:
ONE:
  TWO:
    THREE: three
      FOUR: four
      FIVE: five
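One way to spot this problem before running a pipeline is to check the secret names for any name that is also the namespace of another name. The following Python sketch does this for dash-separated Key Vault names; find_namespace_conflicts is a hypothetical helper and not part of the orchestrator.

```python
def find_namespace_conflicts(secret_names: list[str], separator: str = "-") -> list[str]:
    """Return the names that hold a value and are also a namespace for other names."""
    names = set(secret_names)
    conflicts = []
    for name in sorted(names):
        prefix = name + separator
        # A conflict exists if any other secret lives "below" this one.
        if any(other.startswith(prefix) for other in names):
            conflicts.append(name)
    return conflicts


valid_names = ["ONE-TWO-THREE-FOUR", "ONE-TWO-THREE-FIVE"]
invalid_names = ["ONE-TWO-THREE", "ONE-TWO-THREE-FOUR", "ONE-TWO-THREE-FIVE"]

valid_conflicts = find_namespace_conflicts(valid_names)
invalid_conflicts = find_namespace_conflicts(invalid_names)
```

For the valid structure the result is empty; for the invalid structure it contains ONE-TWO-THREE.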
HashiCorp Vault
As with the AWS services, secret names can use slash separators, which are converted to dots when the secrets are loaded into DataOps. Currently, the Vault integration supports KV version 1 and KV version 2 secrets. When using KV version 2, DataOps by default pulls the latest version of any specific secret.
Use the parameter SECRETS_SELECTION to return a subset of the secrets in the selected mount point.
Authentication sequence
When you set the parameters for more than one authentication method, the methods are tried in the following order:
- JWT
- Token
- Username/password
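The ordering can be pictured with the following Python sketch, which derives the list of methods to try from the HashiCorp Vault parameters documented on this page. The function auth_methods_to_try is hypothetical, and how the orchestrator reacts to a failed attempt is not covered here.

```python
from typing import Mapping


def auth_methods_to_try(env: Mapping[str, str]) -> list[str]:
    """Return the configured authentication methods in the documented order."""
    methods = []
    if env.get("SECRETS_HASHICORP_VAULT_JWT_ROLE"):
        methods.append("jwt")
    if env.get("SECRETS_HASHICORP_VAULT_TOKEN"):
        methods.append("token")
    # Userpass needs both the username and the password.
    if env.get("SECRETS_HASHICORP_VAULT_USERNAME") and env.get("SECRETS_HASHICORP_VAULT_PASSWORD"):
        methods.append("userpass")
    return methods


# Placeholder values for illustration only.
example_env = {
    "SECRETS_HASHICORP_VAULT_TOKEN": "example-token",
    "SECRETS_HASHICORP_VAULT_USERNAME": "example-user",
    "SECRETS_HASHICORP_VAULT_PASSWORD": "example-password",
}
order = auth_methods_to_try(example_env)
```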
Prerequisites to using JWT
If you use the JWT provided by DataOps in the variable CI_JOB_JWT, make sure to have the following configuration in HashiCorp Vault:
- JWT authentication enabled and configured to use jwks_url="https://app.dataops.live/-/jwks" and bound_issuer="app.dataops.live"
- A policy that allows list and read capabilities to the secrets that DataOps will be using
- A role that uses the above policy and adds bound claims to the project/namespace IDs, branch/environment as required
{
  "role_type": "jwt",
  "policies": ["dataops_read"],
  "token_explicit_max_ttl": 60,
  "user_claim": "user_email",
  "bound_claims_type": "glob",
  "bound_claims": {
    "project_id": "12345"
  }
}
You can view the token by using this example job:
Example JWT Job:
  extends: .agent_tag
  image: $DATAOPS_UTILS_RUNNER_IMAGE
  stage: Pipeline Initialisation
  script:
    - echo "$CI_JOB_JWT" > CI_JOB_JWT.txt
  artifacts:
    paths:
      - CI_JOB_JWT.txt
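Once you have the token, you may want to read its claims to decide which bound claims to configure. The following Python sketch decodes the payload of a JWT using only the standard library. It does not verify the signature, so use it for inspection only. The helper names and the demo token are hypothetical; the claims in a real CI_JOB_JWT may differ from the one shown here.

```python
import base64
import json
from typing import Any


def decode_jwt_claims(token: str) -> dict[str, Any]:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload_b64 = parts[1]
    # JWT segments are base64url-encoded without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict[str, Any]) -> str:
    """Encode a dictionary as an unpadded base64url JSON segment (demo only)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A dummy token built purely to demonstrate the decoding.
demo_token = ".".join([_b64url({"alg": "none"}), _b64url({"project_id": "12345"}), "signature"])
claims = decode_jwt_claims(demo_token)
```

To inspect a real token, pass the contents of the CI_JOB_JWT.txt artifact to decode_jwt_claims.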
Custom secrets manager
Using a custom Python script to load secrets into a DataOps pipeline is also possible. The script will reside in the DataOps project (or a reference project) and can have any name. The script will function as a Python module, providing a single function named get_secrets, which accepts no parameters and returns a dictionary of key-value pairs. For example:
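from typing import Any

def get_secrets() -> 'dict[str, Any]':
    secrets = {}
    # some_other_function_to_get_password() is a placeholder for your own retrieval logic
    secrets['SNOWFLAKE.SOLE.PASSWORD'] = some_other_function_to_get_password()
    ...
    return secrets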
Potential uses include adding support for customer-specific, in-house built, or other unsupported secrets management technologies or providing a more complex, customizable interface to secrets management.
To maintain a seamless customer experience using a custom Python script with DataOps.live, we recommend you reach out to our Support team to help with your development.
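For a self-contained starting point, the following sketch shows one possible shape of such a module. It assumes a hypothetical in-house source that delivers all secrets as one JSON document in an environment variable named IN_HOUSE_SECRETS_JSON; both the variable and the _flatten helper are illustrative assumptions, not part of DataOps. Only the get_secrets function name and the dotted key format come from the description above.

```python
import json
import os
from typing import Any


def _flatten(data: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dictionaries into dotted keys such as 'SNOWFLAKE.SOLE.PASSWORD'."""
    flat: dict[str, Any] = {}
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(_flatten(value, path))
        else:
            flat[path] = value
    return flat


def get_secrets() -> dict[str, Any]:
    """Entry point: takes no parameters and returns a dictionary of key-value pairs."""
    # IN_HOUSE_SECRETS_JSON is a hypothetical variable standing in for your own secrets source.
    raw = os.environ.get("IN_HOUSE_SECRETS_JSON", "{}")
    return _flatten(json.loads(raw))
```

Replace the body of get_secrets with calls to your own secrets backend as needed.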
Security best practices
Whichever of the supported secrets managers you use, the secrets and credentials you create should follow security best practices. This includes doing the following:
- Safeguard your root credentials. Don't generate any access keys for the root user or use the root for any deployments or everyday operations.
- Apply a policy of least privilege to the permissions granted by any IAM (Identity and Access Management) policy: only grant the permissions required to perform a specific task.
- Use managed credentials where possible. For example, for AWS, this would be an instance profile attached to an EC2 instance or any AWS services that support a configurable IAM role. For Azure, you can use managed identities with supported services.
Supported parameters
The following sections describe the supported parameters:
- General Parameters
- AWS-Specific Parameters
- Azure-Specific Parameters
- HashiCorp Vault-Specific Parameters
- Custom Script-Specific Parameters
General parameters
Parameter | Required/Default | Description |
---|---|---|
SECRETS_EXPAND_JSON | Optional. Defaults to False | Handle compound secrets stored as JSON by merging the whole structure into the DataOps Vault |
SECRETS_MANAGER | Optional. Defaults to AWS_SECRETS_MANAGER | Specify one of the following values: AWS_SECRETS_MANAGER (AWS Secrets Manager), AWS_PARAMETER_STORE (AWS SSM Parameter Store), AZURE_KEY_VAULT (Azure Key Vault), HASHICORP_VAULT (HashiCorp Vault), CUSTOM (custom script), NONE (none) |
SECRETS_SELECTION | Optional | Comma-separated list of selectors. For each selector, specify a name to retrieve a single secret (AWS Secrets Manager) or the name prefix to retrieve (AWS Parameter Store). Otherwise, all available secrets are retrieved. |
SECRETS_SELECTION_FILTER | Optional | Specify a prefix or substring to match against secret names and retrieve a subset of secrets |
SECRETS_STRIP_PREFIX | Optional | Remove a prefix from key names (SSM only) |
AWS-specific parameters
Parameter | Required/Default | Description |
---|---|---|
SECRETS_AWS_REGION | Optional. Defaults to eu-west-2 | Use this AWS region |
SECRETS_AWS_USE_ROLE | Optional. Defaults to False | Set to True to use implicit authentication from this orchestrator's EC2 instance role |
SECRETS_AWS_ACCESS_KEY_LOCATION | Optional. Defaults to AWS.DEFAULT.S3_KEY | Use the access key from this vault location when authenticating with AWS |
SECRETS_AWS_SECRET_KEY_LOCATION | Optional. Defaults to AWS.DEFAULT.S3_SECRET | Use the secret key from this vault location when authenticating with AWS |
Azure-specific parameters
Parameter | Required/Default | Description |
---|---|---|
SECRETS_AZURE_CLIENT_SECRET_LOCATION | Optional. Defaults to AZURE.DEFAULT.CLIENT_SECRET | Set this parameter inside a directory in your runner. If not using a managed identity, this is the DataOps Vault location of the client secret for authentication. |
SECRETS_AZURE_USE_MANAGED_IDENTITY | Optional. Defaults to 1 | Use the managed identity associated with the orchestrator's VM to authenticate with the Key Vault. Set to 0 (zero) to use client secret instead. |
SECRETS_AZURE_TENANT_ID | Optional | If not using a managed identity, this is the Azure tenant ID |
SECRETS_AZURE_CLIENT_ID | Optional | If not using a managed identity, this is the Azure client ID |
SECRETS_AZURE_KEY_VAULT_URL | Optional | URL to the Key Vault instance to access |
HashiCorp Vault-specific parameters
Parameter | Required/Default | Description |
---|---|---|
SECRETS_HASHICORP_VAULT_MOUNT_POINT | REQUIRED | Vault mount point to use |
SECRETS_HASHICORP_VAULT_URL | REQUIRED | URL of the Vault instance that will be used |
SECRETS_HASHICORP_VAULT_NAMESPACE | Optional | Vault namespace to connect to |
SECRETS_HASHICORP_VAULT_JWT_ROLE | Optional | If specified, JWT authentication will be attempted using the CI_JOB_JWT and the role with this name |
SECRETS_HASHICORP_VAULT_TOKEN | Optional | If provided, token authentication will be attempted |
SECRETS_HASHICORP_VAULT_USERNAME | Optional | If provided (along with SECRETS_HASHICORP_VAULT_PASSWORD), userpass authentication will be attempted |
SECRETS_HASHICORP_VAULT_PASSWORD | Optional | Required (along with SECRETS_HASHICORP_VAULT_USERNAME) for userpass authentication |
SECRETS_HASHICORP_VAULT_KV_VERSION | Optional. Defaults to 2 | Set to 1 only if you use a version 1 vault mount point |
Custom script-specific parameters
Parameter | Required/Default | Description |
---|---|---|
SECRETS_CUSTOM_SCRIPT | REQUIRED | Location of the Python script that retrieves secrets (absolute path or relative to project root) |
Example jobs
Standard Load Secrets job
This job is the standard Load Secrets job used in all pipelines, as defined in the DataOps Reference Project.
Load Secrets:
  extends:
    - .agent_tag
  stage: Vault Initialisation
  image: $DATAOPS_SECRETSMANAGER_RUNNER_IMAGE
  script: /dataops
  icon: ${SECRETSMANAGER_ICON}
Retrieve single secret
Since pipelines only include a single Load Secrets job, the parameters for this orchestrator are usually managed in the project's main variables.yml file. Therefore, it is unnecessary to reproduce the entire Load Secrets job here, as all configuration is done via the parameters detailed above.
To select a single secret (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault) or a single parameter prefix (AWS SSM Parameter Store):
variables:
  ...
  SECRETS_SELECTION: path/to/secret
Retrieve multiple secrets
To select multiple secrets (AWS Secrets Manager, Azure Key Vault) or multiple parameter prefixes (AWS SSM Parameter Store):
variables:
  ...
  SECRETS_SELECTION: path/to/secret,path/to/another
The HashiCorp Vault integration does not yet support this syntax.
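To make the comma-separated format concrete, the following Python sketch splits a SECRETS_SELECTION value into individual selectors. The function parse_selection is hypothetical, and trimming whitespace around each selector is an assumption rather than documented behavior.

```python
def parse_selection(value: str) -> list[str]:
    """Split a comma-separated SECRETS_SELECTION value into its selectors."""
    # Assumption: surrounding whitespace and empty entries are ignored.
    return [item.strip() for item in value.split(",") if item.strip()]


selectors = parse_selection("path/to/secret,path/to/another")
```

An empty value produces an empty list, which corresponds to retrieving all available secrets.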