# State Management
Before we look at how SOLE manages a Snowflake object's state, it is crucial to understand what state is in the context of DataOps and SOLE.
State (or local state) is the "condition" of all the Snowflake objects at a given point in time.
In other words, SOLE asks the following questions when inspecting the organization's Snowflake infrastructure:
- What objects exist?
- What do they look like, or how are they structured?
In practice, SOLE maps its configuration to the real-world, physical Snowflake ecosystem, allowing it to reuse the existing state of the objects during subsequent DataOps pipeline runs.
SOLE saves this information to a local state file in the persistent cache of the DataOps Runner's host system and uses it during the PLAN and PLAN-DESTROY lifecycle actions.
## SOLE state management
The concept of state management builds on this definition of local state within the DataOps ecosystem. In short, SOLE requires local state information to function, making state management an integral part of the Snowflake Object Lifecycle Engine's processes.
Each environment and branch maintains its own set of states, and SOLE writes a local state file for each object group. Depending on the size of the organization and the number of Snowflake objects SOLE manages, this can mean a significant number of state files to maintain.
These state files are stored on the host system at the following path:

```
/<cache_directory>/persistent_cache/<dataops_project_name>/<branch_name>/snowflakelifecycle-runner/<resource_group>/<resource_group>.tfstate
```
The values in angle brackets (`<>`) translate into the following variables:

- `cache_directory`: The host system cache directory
  - The default is `/agent_cache`
  - To retrieve the value for your DataOps Runner, see the volume mounts in the `/srv/<agent_name>/config/config.toml` file
  - Refer to the DataOps Runner initial setup info
- `dataops_project_name`: The project name in lower-case
- `branch_name`: The branch name in lower-case
- `resource_group`: The object group name
State is maintained for all object groups, except for grants.
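As an illustration, the path template above can be assembled as follows. The concrete values are hypothetical examples, not your runner's actual configuration:

```python
# Sketch: assembling a SOLE state file path from the template above.
# All values are hypothetical examples.
cache_directory = "/agent_cache"        # host system cache directory (default)
dataops_project_name = "my-project"     # project name in lower-case
branch_name = "main"                    # branch name in lower-case
resource_group = "databases"            # object group name

state_file = (
    f"{cache_directory}/persistent_cache/{dataops_project_name}/"
    f"{branch_name}/snowflakelifecycle-runner/{resource_group}/"
    f"{resource_group}.tfstate"
)
print(state_file)
# /agent_cache/persistent_cache/my-project/main/snowflakelifecycle-runner/databases/databases.tfstate
```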
## State reset
If the state file is corrupted or does not match the actual Snowflake infrastructure, pipeline failures can occur.
In such scenarios, you can reset the local state file to trigger a fresh inspection of Snowflake. Trigger the reset by setting the variable `LIFECYCLE_STATE_RESET` to `1`.
Resetting the local state file deletes the existing local state for the specified resource group and re-imports all managed Snowflake objects.
You can configure the state reset at the project level in `pipelines/includes/config/variables.yml` to reset the state of all object groups, or at an individual object group level as a parameter to the PLAN or PLAN-DESTROY jobs.
State reset for an individual object group is not supported in AGGREGATE jobs.
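For example, a project-level reset might be configured like this in `pipelines/includes/config/variables.yml` (a minimal sketch; the `variables:` key follows the standard DataOps variables file layout, and your file will contain other entries):

```yaml
# pipelines/includes/config/variables.yml
# Project-level state reset: applies to all object groups.
variables:
  LIFECYCLE_STATE_RESET: 1
```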
## Multi-tenant support
SOLE supports multi-tenant configurations through environment-based cache directory isolation. This feature allows multiple tenants or environments to maintain separate state files within the same DataOps infrastructure, preventing state conflicts and ensuring proper isolation between different deployment contexts.
### Enabling multi-tenant support
Multi-tenant support is controlled by environment variables and modifies the cache directory structure to include an environment-specific subdirectory.
To enable multi-tenant support, configure the following environment variables:
- `DATAOPS_SOLE_ENABLE_MULTI_TENANT_SUPPORT`: Set to enable multi-tenant mode
  - Accepted values: any non-empty value except `0`, `false`, `False`, or `no`
  - Examples: `1`, `true`, `True`, `yes`, `enabled`
- `DATAOPS_SOLE_ENV_NAME`: Specifies the environment name for cache directory isolation
  - This variable is required when multi-tenant support is enabled
  - The value is used to create a unique cache subdirectory for the environment
  - Examples: `tenant1`, `customer_a`, `production_eu`
### Cache directory structure
When multi-tenant support is disabled (the default behavior), state files are stored at:

```
<DATAOPS_PERSISTENT_BRANCH_CACHE_DIR>/<RUNNER>/
```

When multi-tenant support is enabled, state files are stored at:

```
<DATAOPS_PERSISTENT_BRANCH_CACHE_DIR>/<RUNNER>/<env_name>/
```

where `<env_name>` is the value of the `DATAOPS_SOLE_ENV_NAME` environment variable.
### Configuration example
Set these environment variables in your project configuration or pipeline settings:

```yaml
DATAOPS_SOLE_ENABLE_MULTI_TENANT_SUPPORT: "1"
DATAOPS_SOLE_ENV_NAME: "tenant_production"
```
This configuration creates isolated cache directories for each tenant or environment, ensuring that state files do not conflict between different deployments.
### Validation and error handling
SOLE validates the multi-tenant configuration at runtime:
- If `DATAOPS_SOLE_ENABLE_MULTI_TENANT_SUPPORT` is enabled but `DATAOPS_SOLE_ENV_NAME` is not set, SOLE fails with a fatal error
- The error message indicates: `"Multi-tenant support is enabled but 'DATAOPS_SOLE_ENV_NAME' environment variable is not set"`
This validation ensures that multi-tenant configurations are properly defined before pipeline execution.
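This check can be sketched as follows. The function name is hypothetical, and treating an unset variable as disabled is an assumption based on the accepted values listed above; this is not SOLE's actual code:

```python
import os

# Values treated as "disabled" per the accepted-values list above.
# Assumption: an unset variable also counts as disabled.
DISABLED = {"", "0", "false", "False", "no"}

def validate_multi_tenant_config() -> None:
    """Fail fast when multi-tenant mode is enabled without an environment name."""
    enabled = (
        os.environ.get("DATAOPS_SOLE_ENABLE_MULTI_TENANT_SUPPORT", "") not in DISABLED
    )
    if enabled and not os.environ.get("DATAOPS_SOLE_ENV_NAME"):
        raise RuntimeError(
            "Multi-tenant support is enabled but 'DATAOPS_SOLE_ENV_NAME' "
            "environment variable is not set"
        )
```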
### Use cases
Multi-tenant support is beneficial in scenarios such as:
- Multi-customer SaaS deployments: Isolate state for different customer environments
- Geographic separation: Separate state files for different regions (e.g., `US`, `EU`, `APAC`)
- Business unit isolation: Maintain separate states for different departments or business units
- Testing environments: Create isolated test environments without affecting production state
When migrating to multi-tenant mode, ensure that DATAOPS_SOLE_ENV_NAME is consistently set across all pipeline runs for the same environment to maintain state continuity.
Changing the DATAOPS_SOLE_ENV_NAME value will result in a different cache directory path, effectively creating a new state context. Ensure this is intentional when modifying this variable.