Migrating to SOLE June 2025
Overview
With the May 2025 5-latest release of SOLE (Snowflake Object Lifecycle Engine), we introduced several changes to improve consistency with Snowflake and the overall user experience. These changes include important YAML configuration changes, such as adding new object parameters and altering or removing some existing ones.
This guide shows you how to update your existing SOLE configuration to ensure uninterrupted execution and successful migration to the most recent version.
Since June 2025, the changes are also available in the 5-stable production release.
If you encounter problems:
- use the migration script,
- review the breaking changes below, or
- reach out to support@dataops.live
New parameters in existing SOLE objects
New parameter in view
The view object now supports a new copy_grants parameter, which defaults to false. Set this parameter to true in your configuration to retain the access permissions from the original view when you recreate it.
For more information and examples, check the view documentation.
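For example, a minimal sketch of a Data Products-style view definition that preserves grants on recreation (the view name, the database and schema references, and the statement parameter shown here are illustrative assumptions; check the view documentation for the exact set of supported parameters):

- view:
    name: SAMPLE_VIEW
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    copy_grants: true
    statement: "SELECT * FROM SAMPLE_TABLE"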
New constraint types in table and hybrid table
We added new unique and foreign_keys parameters to the Table and Hybrid Table objects. You can start setting these parameters in your YAML configuration to define relationship constraints between objects.
To set either of the new constraints, follow this syntax:
- Unique

- table:
    name: SAMPLE
    unique:
      name: "UQ_CUSTOMER_EMAIL"
      keys:
        - CUSTOMER_EMAIL
        - PREFERRED_CATEGORY

- Foreign Keys

- table:
    name: SAMPLE
    foreign_keys:
      - name: "FK_CUSTOMER_PRODUCT"
        keys:
          - PREFERRED_PRODUCT_ID
        referenced_table: rel(hybrid_table.PRODUCT_CATALOG)
        referenced_columns:
          - PRODUCT_ID
The same syntax applies to both Table and Hybrid Table objects.
For more information and examples, check:
- Table unique constraint
- Table foreign keys constraint
- Hybrid Table unique constraint
- Hybrid Table foreign keys constraint
New parameter in SCIM integration
In the SCIM Integration object, we added a new parameter named enabled. Snowflake requires this parameter and sets its default value to true to maintain compatibility. You may override it in your YAML configuration as needed.
- Classic Configuration

scim_integrations:
  SCIM_INTEGRATION_1:
    provisioner_role: "AAD_PROVISIONER"
    scim_client: "AZURE"
    enabled: false

- Data Products Configuration

- scim_integration:
    name: SCIM_INTEGRATION_1
    provisioner_role: "AAD_PROVISIONER"
    scim_client: "AZURE"
    enabled: false
For more information and examples, check the SCIM Integration documentation.
Breaking changes
YAML configuration changes
Besides the new parameters, we also made some changes to the YAML configuration structure of existing parameters. The following sections outline the changes you need to make in your YAML configuration files.
You must apply these changes to your YAML configuration files to ensure compatibility with the latest version of SOLE.
Changes in the schedule parameter of task
We updated the schedule parameter of the task object to support a more structured format. You can still set it to a cron expression or an interval in minutes, but with an updated syntax.
- Old YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule: "5 MINUTE"

- New YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule:
      MINUTES: 5
To use a cron expression, you can set the schedule parameter as follows:
- Old YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule: "*/5 * * * *"

- New YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule:
      using_cron: "*/5 * * * *"
Changes in the signature parameter of row access policy
We replaced the signature parameter of the Row Access Policy object with an arguments parameter that defines each column together with its data type.
- Old YAML Configuration

- row_access_policies:
    name: <row-access-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    signature:
      EMPL_ID: "VARCHAR"
      EMPL_SAL: "VARCHAR"

- New YAML Configuration

- row_access_policies:
    name: <row-access-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    arguments:
      EMPL_ID:
        type: VARCHAR
      EMPL_SAL:
        type: VARCHAR
Changes in the value_data_type parameter of masking policy
We replaced the value_data_type parameter of the Masking Policy object with an arguments parameter, where the policy input is defined as a named argument with its data type.
- Old YAML Configuration

- masking_policy:
    name: <masking-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    value_data_type: "VARCHAR"

- New YAML Configuration

- masking_policy:
    name: <masking-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    arguments:
      VAL:
        type: "VARCHAR"
Note on file format parameters
When using the field_optionally_enclosed_by parameter in CSV file formats, you must use one of the literal values: NONE, ', or ". Using octal representations (like \042) will cause validation errors.
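As an illustration, a CSV file format that encloses fields in double quotes could look like the following sketch (the object name and the format_type parameter name are assumptions for illustration; only field_optionally_enclosed_by comes from the note above, so check the file format documentation for the exact parameter names):

- file_format:
    name: SAMPLE_CSV_FORMAT
    format_type: CSV  # assumed parameter name; see the file format documentation
    field_optionally_enclosed_by: "\""  # literal double quote, not an octal code such as \042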
Removed parameters in existing SOLE objects
To stay consistent with the corresponding Snowflake objects, we removed support for some existing parameters in SOLE. If you want to continue setting some of them, you can still do so by using SOLE hooks.
These parameters will not appear in the generated resource files, even if you define them in the YAML configuration.
Removed parameter in view
We refined the behavior of the or_replace parameter of views. SOLE no longer explicitly supports this parameter. Instead, when you change a configuration that requires recreation, SOLE implicitly handles the CREATE OR REPLACE logic. You only need to set copy_grants to true if you wish to preserve existing grants during the recreation. This change simplifies the resource lifecycle by making the or_replace behavior implicit and context-aware.
For more information and examples, check the view documentation.
Removed parameter in resource monitor
We no longer support the set_for_account parameter in the Resource Monitor object. In the future, we will add back support for assigning a resource monitor to the account using a different approach. In the meantime, if you want to assign a resource monitor to the account, you can do so using SOLE hooks.
Let's say you have the RESOURCE_MONITOR_1 object defined in your SOLE configuration as follows:
- resource_monitor:
    name: RESOURCE_MONITOR_1
    frequency: "YEARLY"
    start_timestamp: "2025-07-15"
    end_timestamp: "2028-07-15"
    notify_triggers:
      - 40
    suspend_triggers:
      - 50
    suspend_immediate_triggers:
      - 90
Use the following SOLE hook to assign the resource monitor to the account:
database_level_hooks:
  post_hooks:
    - command: "ALTER ACCOUNT SET RESOURCE_MONITOR = RESOURCE_MONITOR_1;"
      environment: snowflake
Check the Resource Monitor documentation for more information about the supported parameters.
Removed parameter in SAML integration
For security reasons, we removed support for the saml2_snowflake_x509_cert parameter in the SAML Integration object. When you omit this parameter, Snowflake automatically generates it when creating the object. However, if you want to set it yourself explicitly, you can use SOLE hooks.
Assuming you already have a SAML Integration object named SAML_INTEGRATION_1 defined in your SOLE configuration, your hook to set the saml2_snowflake_x509_cert should look something like this:
database_level_hooks:
  post_hooks:
    - command: "ALTER SECURITY INTEGRATION SAML_INTEGRATION_1 SET SAML2_SNOWFLAKE_X509_CERT = '{{ env.SAML2_SNOWFLAKE_X509_CERT }}';"
      environment: snowflake
Of course, you need to store the saml2_snowflake_x509_cert value in your secrets vault before referencing it in the hook.
Check the SAML Integration documentation for more information about the supported parameters.
Removed parameter in user
The has_rsa_public_key parameter in the user object is obsolete, and Snowflake no longer supports it. Hence, SOLE no longer supports it. However, the behavior of the rsa_public_key parameter hasn't changed, so you can keep using it to set the RSA public key for the user.
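For example, a user that keeps using an RSA public key might be defined as in this sketch (the user name and the vault variable are illustrative assumptions, not part of the official examples):

- user:
    name: SAMPLE_USER
    rsa_public_key: "{{ env.SAMPLE_USER_RSA_PUBLIC_KEY }}"  # illustrative secret reference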
Check the user documentation for more information about the supported parameters.
Automating the migration to the new configuration
If you encounter any issues related to the new SOLE backend, we recommend running the Python migration script sole_config_converter.py on your SOLE configuration:
#!/usr/bin/env python3
"""
YAML Configuration Converter
============================
Automatically converts old YAML configuration format to new format.
Usage: python sole_config_converter.py <config_file_path>
Note: A backup file will be created automatically before conversion.
"""
import sys
import re
import yaml
from pathlib import Path
def convert_schedule(schedule_value):
    """Convert old schedule format to new format"""
    if not isinstance(schedule_value, str):
        return schedule_value
    schedule_clean = schedule_value.strip().strip('"\'')
    # Check if it's a cron expression (5 fields separated by spaces)
    # More robust pattern that validates cron structure
    cron_fields = schedule_clean.split()
    if len(cron_fields) == 5:
        # Validate each field contains valid cron characters
        cron_char_pattern = r'^[*\d\-,/]+$'
        if all(re.match(cron_char_pattern, field) for field in cron_fields):
            return {"using_cron": schedule_clean}
    # Check if it's time unit format (e.g., "5 MINUTE")
    time_pattern = r'^(\d+)\s+(SECOND|SECONDS|MINUTE|MINUTES|HOUR|HOURS|DAY|DAYS)$'
    match = re.match(time_pattern, schedule_clean, re.IGNORECASE)
    if match:
        number = int(match.group(1))
        unit = match.group(2).upper()
        unit_mapping = {
            'SECOND': 'SECONDS', 'SECONDS': 'SECONDS',
            'MINUTE': 'MINUTES', 'MINUTES': 'MINUTES',
            'HOUR': 'HOURS', 'HOURS': 'HOURS',
            'DAY': 'DAYS', 'DAYS': 'DAYS'
        }
        new_unit = unit_mapping.get(unit)
        if new_unit:
            return {new_unit: number}
    # Return unchanged if no pattern matches
    return schedule_value
def convert_signature_to_arguments(signature_dict):
    """Convert signature format to arguments format"""
    if not isinstance(signature_dict, dict):
        return signature_dict
    arguments = {}
    for key, value in signature_dict.items():
        arguments[key] = {"type": value}
    return arguments
def convert_masking_policy(policy_dict):
    """Convert masking policy value_data_type to arguments format"""
    if not isinstance(policy_dict, dict):
        return policy_dict
    new_policy = policy_dict.copy()
    if 'value_data_type' in new_policy:
        value_type = new_policy.pop('value_data_type')
        new_policy['arguments'] = {
            'VAL': {
                'type': value_type
            }
        }
    return new_policy
def convert_templated_yaml(content):
    """Convert templated YAML using precise text-based regex patterns"""
    # 1. Convert schedule: "X UNIT" to a nested schedule block with UNITS: X (more precise pattern)
    def replace_time_schedule(match):
        indent = match.group(1)
        quote_char = match.group(2)
        number = match.group(3)
        unit = match.group(4).upper()
        unit_mapping = {
            'SECOND': 'SECONDS', 'SECONDS': 'SECONDS',
            'MINUTE': 'MINUTES', 'MINUTES': 'MINUTES',
            'HOUR': 'HOURS', 'HOURS': 'HOURS',
            'DAY': 'DAYS', 'DAYS': 'DAYS'
        }
        new_unit = unit_mapping.get(unit, unit)
        return f'{indent}schedule:\n{indent}  {new_unit}: {number}'
    # 2. Convert schedule: "cron expression" to a nested schedule block with using_cron: "cron expression"
    def replace_cron_schedule(match):
        indent = match.group(1)
        quote_char = match.group(2)
        cron_expr = match.group(3)
        # Check if it's a valid cron (5 fields with valid characters)
        cron_fields = cron_expr.strip().split()
        if len(cron_fields) == 5:
            cron_char_pattern = r'^[*\d\-,/]+$'
            if all(re.match(cron_char_pattern, field) for field in cron_fields):
                return f'{indent}schedule:\n{indent}  using_cron: {quote_char}{cron_expr}{quote_char}'
        return match.group(0)  # Return unchanged if not valid cron
    # 3. Convert signature: KEY: VALUE lines to arguments: KEY: type: VALUE
    def replace_signature_line(match):
        indent = match.group(1)
        key = match.group(2)
        value = match.group(3).strip('"\'')
        return f'{indent}{key}:\n{indent}  type: {value}'
    # 4. Convert value_data_type: TYPE (in masking_policies context) to a nested arguments/VAL/type block
    def replace_value_data_type(match):
        indent = match.group(1)
        quote_char = match.group(2) if match.group(2) else ''
        data_type = match.group(3)
        return f'{indent}arguments:\n{indent}  VAL:\n{indent}    type: {quote_char}{data_type}{quote_char}'
    # Apply conversions with more precise patterns
    # 1. Time-based schedule conversion (tasks context)
    content = re.sub(
        r'^(\s*)schedule:\s*(["\'])(\d+)\s+(SECOND|SECONDS|MINUTE|MINUTES|HOUR|HOURS|DAY|DAYS)\2\s*$',
        replace_time_schedule,
        content,
        flags=re.MULTILINE | re.IGNORECASE
    )
    # 2. Cron-based schedule conversion (tasks context) - only if not already converted
    content = re.sub(
        r'^(\s*)schedule:\s*(["\'])([^"\']+)\2\s*$',
        replace_cron_schedule,
        content,
        flags=re.MULTILINE
    )
    # 3. Convert signature block in a more targeted way
    def convert_signature_section(match):
        full_match = match.group(0)
        indent = match.group(1)
        # Replace signature: with arguments:
        result = re.sub(r'^(\s*)signature:\s*$', r'\1arguments:', full_match, flags=re.MULTILINE)
        # Convert the field lines within this section only
        lines = result.split('\n')
        converted_lines = []
        in_arguments = False
        for line in lines:
            if 'arguments:' in line:
                in_arguments = True
                converted_lines.append(line)
            elif in_arguments and line.strip() and ':' in line and not line.strip().startswith('comment'):
                # This is a field line under arguments
                match_field = re.match(r'^(\s+)(\w+):\s*(["\']?[^"\'\n]+["\']?)\s*$', line)
                if match_field:
                    field_indent = match_field.group(1)
                    field_name = match_field.group(2)
                    field_value = match_field.group(3).strip('"\'')
                    converted_lines.append(f'{field_indent}{field_name}:')
                    converted_lines.append(f'{field_indent}  type: {field_value}')
                else:
                    converted_lines.append(line)
            else:
                if line.strip() and not line.startswith(' ') and 'arguments:' not in line:
                    in_arguments = False
                converted_lines.append(line)
        return '\n'.join(converted_lines)
    # Apply signature conversion to row_access_policies sections
    content = re.sub(
        r'^(\s*)signature:\s*\n((?:\s+\w+:\s*["\']?[^"\'\n]+["\']?\s*\n)*)',
        convert_signature_section,
        content,
        flags=re.MULTILINE
    )
    # 5. Value data type conversion (masking_policies context only)
    content = re.sub(
        r'^(\s*)value_data_type:\s*(["\']?)([^"\'\n\r]+)\2\s*$',
        replace_value_data_type,
        content,
        flags=re.MULTILINE
    )
    return content
def process_data(data):
    """Process YAML data and apply conversions"""
    if isinstance(data, dict):
        new_data = {}
        for key, value in data.items():
            if key == 'schedule' and isinstance(value, str):
                # Convert task schedule
                new_data[key] = convert_schedule(value)
            elif key == 'signature' and isinstance(value, dict):
                # Convert row access policy signature to arguments
                new_data['arguments'] = convert_signature_to_arguments(value)
            elif key == 'masking_policies' and isinstance(value, dict):
                # Convert masking policies
                new_policies = {}
                for policy_name, policy_config in value.items():
                    new_policies[policy_name] = convert_masking_policy(policy_config)
                new_data[key] = new_policies
            else:
                new_data[key] = process_data(value)
        return new_data
    elif isinstance(data, list):
        return [process_data(item) for item in data]
    else:
        return data
def main():
    """Main conversion function"""
    if len(sys.argv) < 2:
        print("Usage: python sole_config_converter.py <config_file_path>")
        sys.exit(1)
    config_path = sys.argv[1]
    try:
        # Validate file exists
        if not Path(config_path).exists():
            print(f"Error: Configuration file '{config_path}' not found")
            sys.exit(1)
        # Validate file extension
        if not config_path.lower().endswith(('.yml', '.yaml')):
            print("Error: File must be a YAML file (.yml or .yaml)")
            sys.exit(1)
        # Load file content
        try:
            with open(config_path, 'r', encoding='utf-8') as file:
                file_content = file.read()
        except UnicodeDecodeError:
            print("Error: File encoding issue. Please ensure file is UTF-8 encoded")
            sys.exit(1)
        # Validate file is not empty
        if not file_content.strip():
            print("Error: Configuration file is empty")
            sys.exit(1)
        # Check if file contains templating syntax
        has_templates = '{{' in file_content or '{%' in file_content
        if has_templates:
            # Process templated file using text-based conversion
            converted_content = convert_templated_yaml(file_content)
        else:
            # Process regular YAML file
            try:
                config = yaml.safe_load(file_content)
                if config is None:
                    print("Error: Configuration file is empty or invalid")
                    sys.exit(1)
                converted_config = process_data(config)
                converted_content = yaml.dump(converted_config, default_flow_style=False, indent=2, sort_keys=False)
            except yaml.YAMLError as e:
                print(f"Error: Invalid YAML format - {e}")
                sys.exit(1)
        # Save with backup (handle existing backup)
        backup_path = config_path + '.backup'
        if Path(backup_path).exists():
            # Create numbered backup if .backup already exists
            counter = 1
            while Path(f"{backup_path}.{counter}").exists():
                counter += 1
            backup_path = f"{backup_path}.{counter}"
        Path(config_path).rename(backup_path)
        try:
            with open(config_path, 'w', encoding='utf-8') as file:
                file.write(converted_content)
        except Exception as e:
            # Restore backup if write fails
            Path(backup_path).rename(config_path)
            print(f"Error: Failed to write converted file - {e}")
            sys.exit(1)
        print("Conversion completed")
    except PermissionError:
        print(f"Error: Permission denied accessing '{config_path}'")
        sys.exit(1)
    except Exception as e:
        print(f"Error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()
Running the SOLE June 2025 migration script
To run the script:
- Save the script as sole_config_converter.py.
- Run the script using the following command:
python sole_config_converter.py <config_file_path>
Example:
python sole_config_converter.py dataops/sole/databases.yml
Resolving database cloning issues
When migrating to the new SOLE backend, you may encounter compatibility issues with databases created using the from_database parameter. This typically manifests as the following error:
Error: failed to upgrade the state with database created from database, please use snowflake_database instead.
Disclaimer: Right now, database cloning is not supported. They can be imported into the mentioned resource,
but any difference in behavior from standard database won't be handled (and can result in errors)
Root cause
This error occurs because the new SOLE backend uses an updated statefile schema version that is incompatible with the previous database cloning implementation.
Solution
To resolve this compatibility issue, reset the lifecycle state by setting the LIFECYCLE_STATE_RESET environment variable to 1. This forces SOLE to rebuild the state from your current configuration, ensuring compatibility with the new backend.
You can set this variable at runtime while running the pipeline by providing it as an environment variable when triggering the pipeline execution.
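For example, one way to do this is to declare the variable alongside your other pipeline variables, as in the following sketch (this assumes your pipeline configuration accepts a top-level variables block; you can equally supply the variable through the UI when you trigger the run):

variables:
  LIFECYCLE_STATE_RESET: "1"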
After applying the state reset, your pipeline will rebuild the state using the new schema format and resolve the database cloning compatibility issue.