Migrating to SOLE June 2025
Overview
With the May 2025 5-latest release of SOLE (Snowflake Object Lifecycle Engine), we introduced several changes to improve consistency with Snowflake and the overall user experience. These changes include important YAML configuration changes, such as adding new object parameters and altering or removing some existing ones.
This guide shows you how to update your existing SOLE configuration to ensure uninterrupted execution and successful migration to the most recent version.
Since June 2025, the changes are also available in the 5-stable production release.
If you encounter problems:
- use the migration script,
- review the breaking changes below, or
- reach out to support@dataops.live
New parameters in existing SOLE objects
New parameter in view
The view object now supports a new copy_grants parameter, which defaults to false. Set this parameter to true in your configuration to retain the access permissions from the original view when you recreate it.
For more information and examples, check the view documentation.
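For example, a minimal sketch of a Data Products-style view definition that preserves grants on recreation (the view name, the database and schema references, and the statement parameter shown here are illustrative assumptions; check the view documentation for the exact set of supported parameters):

- view:
    name: SAMPLE_VIEW
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    copy_grants: true
    statement: "SELECT * FROM SAMPLE_TABLE"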
New constraint types in table and hybrid table
We added new unique and foreign_keys parameters to the Table and Hybrid Table objects. You can start setting these parameters in your YAML configuration to define relationship constraints between objects.
To set either of the new constraints, follow this syntax:
- Unique

- table:
    name: SAMPLE
    unique:
      name: "UQ_CUSTOMER_EMAIL"
      keys:
        - CUSTOMER_EMAIL
        - PREFERRED_CATEGORY

- Foreign Keys

- table:
    name: SAMPLE
    foreign_keys:
      - name: "FK_CUSTOMER_PRODUCT"
        keys:
          - PREFERRED_PRODUCT_ID
        referenced_table: rel(hybrid_table.PRODUCT_CATALOG)
        referenced_columns:
          - PRODUCT_ID
The same syntax applies to both Table and Hybrid Table objects.
For more information and examples, check:
- Table unique constraint
- Table foreign keys constraint
- Hybrid Table unique constraint
- Hybrid Table foreign keys constraint
New parameter in SCIM integration
In the SCIM Integration object, we added a new parameter named enabled. Snowflake requires this parameter and sets its default value to true to maintain compatibility. You may override it in your YAML configuration as needed.
- Classic Configuration

scim_integrations:
  SCIM_INTEGRATION_1:
    provisioner_role: "AAD_PROVISIONER"
    scim_client: "AZURE"
    enabled: false

- Data Products Configuration

- scim_integration:
    name: SCIM_INTEGRATION_1
    provisioner_role: "AAD_PROVISIONER"
    scim_client: "AZURE"
    enabled: false
For more information and examples, check the SCIM Integration documentation.
Breaking changes
YAML configuration changes
Besides the new parameters, we also made some changes to the YAML configuration structure of existing parameters. The following sections outline the changes you need to make in your YAML configuration files.
You must apply these changes to your YAML configuration files to ensure compatibility with the latest version of SOLE.
Changes in the schedule parameter of task
We updated the schedule parameter of the task object to support a more structured format. You can still set it to a cron expression or an interval in minutes, but with an updated syntax.
- Old YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule: "5 MINUTE"

- New YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule:
      MINUTES: 5
To use a cron expression, you can set the schedule parameter as follows:
- Old YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule: "*/5 * * * *"

- New YAML Configuration

- task:
    name: <task-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    schedule:
      using_cron: "*/5 * * * *"
Changes in the signature parameter of row access policy
We replaced the signature parameter of the Row Access Policy object with an arguments parameter that defines each column together with its data type.
- Old YAML Configuration

- row_access_policies:
    name: <row-access-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    signature:
      EMPL_ID: "VARCHAR"
      EMPL_SAL: "VARCHAR"

- New YAML Configuration

- row_access_policies:
    name: <row-access-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    arguments:
      EMPL_ID:
        type: VARCHAR
      EMPL_SAL:
        type: VARCHAR
Changes in the value_data_type parameter of masking policy
We replaced the value_data_type parameter of the Masking Policy object with an arguments parameter, where the policy input is defined as a named argument with its data type.
- Old YAML Configuration

- masking_policy:
    name: <masking-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    value_data_type: "VARCHAR"

- New YAML Configuration

- masking_policy:
    name: <masking-policy-name>
    database: rel(database.<database-name>)
    schema: rel(schema.<schema-name>)
    arguments:
      VAL:
        type: "VARCHAR"
Note on file format parameters
When using the field_optionally_enclosed_by parameter in CSV file formats, you must use one of the literal values: NONE, ', or ". Using octal representations (like \042) will cause validation errors.
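As an illustration, a CSV file format that encloses fields in double quotes could look like the following sketch (the object name and the format_type parameter name are assumptions for illustration; only field_optionally_enclosed_by comes from the note above, so check the file format documentation for the exact parameter names):

- file_format:
    name: SAMPLE_CSV_FORMAT
    format_type: CSV  # assumed parameter name; see the file format documentation
    field_optionally_enclosed_by: "\""  # literal double quote, not an octal code such as \042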
Removed parameters in existing SOLE objects
To stay consistent with the corresponding Snowflake objects, we removed support for some existing parameters in SOLE. If you want to continue setting some of them, you can still do so by using SOLE hooks.
These parameters will not appear in the generated resource files, even if you define them in the YAML configuration.
Removed parameter in view
We refined the behavior of the or_replace parameter of views. SOLE no longer explicitly supports this parameter. Instead, when you change a configuration that requires recreation, SOLE implicitly handles the CREATE OR REPLACE logic. You only need to set copy_grants to true if you wish to preserve existing grants during the recreation. This change simplifies the resource lifecycle by making the or_replace behavior implicit and context-aware.
For more information and examples, check the view documentation.
Removed parameter in resource monitor
We no longer support the set_for_account parameter in the Resource Monitor object. In the future, we will add back support for assigning a resource monitor to the account using a different approach. In the meantime, if you want to assign a resource monitor to the account, you can do so using SOLE hooks.
Let's say you have the RESOURCE_MONITOR_1 object defined in your SOLE configuration as follows:
- resource_monitor:
    name: RESOURCE_MONITOR_1
    frequency: "YEARLY"
    start_timestamp: "2025-07-15"
    end_timestamp: "2028-07-15"
    notify_triggers:
      - 40
    suspend_triggers:
      - 50
    suspend_immediate_triggers:
      - 90
Use the following SOLE hook to assign the resource monitor to the account:
database_level_hooks:
  post_hooks:
    - command: "ALTER ACCOUNT SET RESOURCE_MONITOR = RESOURCE_MONITOR_1;"
      environment: snowflake
Check the Resource Monitor documentation for more information about the supported parameters.
Removed parameter in SAML integration
For security reasons, we removed support for the saml2_snowflake_x509_cert parameter in the SAML Integration object. When you omit this parameter, Snowflake automatically generates it when creating the object. However, if you want to set it yourself explicitly, you can use SOLE hooks.
Assuming you already have a SAML Integration object named SAML_INTEGRATION_1 defined in your SOLE configuration, your hook to set the saml2_snowflake_x509_cert should look something like this:
database_level_hooks:
  post_hooks:
    - command: "ALTER SECURITY INTEGRATION SAML_INTEGRATION_1 SET SAML2_SNOWFLAKE_X509_CERT = '{{ env.SAML2_SNOWFLAKE_X509_CERT }}';"
      environment: snowflake
Of course, you need to store the saml2_snowflake_x509_cert value in your secrets vault before referencing it in the hook.
Check the SAML Integration documentation for more information about the supported parameters.
Removed parameter in user
The has_rsa_public_key parameter in the user object is obsolete, and Snowflake no longer supports it. Hence, SOLE no longer supports it. However, the behavior of the rsa_public_key parameter hasn't changed, so you can keep using it to set the RSA public key for the user.
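For example, a user that keeps using an RSA public key might be defined as in this sketch (the user name and the vault variable are illustrative assumptions, not part of the official examples):

- user:
    name: SAMPLE_USER
    rsa_public_key: "{{ env.SAMPLE_USER_RSA_PUBLIC_KEY }}"  # illustrative secret reference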
Check the user documentation for more information about the supported parameters.
Automating the migration to the new configuration
If you encounter any issues related to the new SOLE backend, we recommend running the Python migration script sole_config_converter.py on your SOLE configuration:
#!/usr/bin/env python3
"""
YAML Configuration Converter
============================
Automatically converts old YAML configuration format to new format.
Usage: python sole_config_converter.py <config_file_path>
Note: A backup file will be created automatically before conversion.
"""
import sys
import re
import yaml
from pathlib import Path
def convert_schedule(schedule_value):
    """Convert old schedule format to new format"""
    if not isinstance(schedule_value, str):
        return schedule_value
    schedule_clean = schedule_value.strip().strip('"\'')
    # Check if it's a cron expression (5 fields separated by spaces)
    # More robust pattern that validates cron structure
    cron_fields = schedule_clean.split()
    if len(cron_fields) == 5:
        # Validate each field contains valid cron characters
        cron_char_pattern = r'^[*\d\-,/]+$'
        if all(re.match(cron_char_pattern, field) for field in cron_fields):
            return {"using_cron": schedule_clean}
    # Check if it's time unit format (e.g., "5 MINUTE")
    time_pattern = r'^(\d+)\s+(SECOND|SECONDS|MINUTE|MINUTES|HOUR|HOURS|DAY|DAYS)$'
    match = re.match(time_pattern, schedule_clean, re.IGNORECASE)
    if match:
        number = int(match.group(1))
        unit = match.group(2).upper()
        unit_mapping = {
            'SECOND': 'SECONDS', 'SECONDS': 'SECONDS',
            'MINUTE': 'MINUTES', 'MINUTES': 'MINUTES',
            'HOUR': 'HOURS', 'HOURS': 'HOURS',
            'DAY': 'DAYS', 'DAYS': 'DAYS'
        }
        new_unit = unit_mapping.get(unit)
        if new_unit:
            return {new_unit: number}
    # Return unchanged if no pattern matches
    return schedule_value
def convert_signature_to_arguments(signature_dict):
    """Convert signature format to arguments format"""
    if not isinstance(signature_dict, dict):
        return signature_dict
    arguments = {}
    for key, value in signature_dict.items():
        arguments[key] = {"type": value}
    return arguments
def convert_masking_policy(policy_dict):
    """Convert masking policy value_data_type to arguments format"""
    if not isinstance(policy_dict, dict):
        return policy_dict
    new_policy = policy_dict.copy()
    if 'value_data_type' in new_policy:
        value_type = new_policy.pop('value_data_type')
        new_policy['arguments'] = {
            'VAL': {
                'type': value_type
            }
        }
    return new_policy
def convert_templated_yaml(content):
    """Convert templated YAML using precise text-based regex patterns"""
    # 1. Convert schedule: "X UNIT" to a nested schedule block with UNITS: X (more precise pattern)
    def replace_time_schedule(match):
        indent = match.group(1)
        quote_char = match.group(2)
        number = match.group(3)
        unit = match.group(4).upper()
        unit_mapping = {
            'SECOND': 'SECONDS', 'SECONDS': 'SECONDS',
            'MINUTE': 'MINUTES', 'MINUTES': 'MINUTES',
            'HOUR': 'HOURS', 'HOURS': 'HOURS',
            'DAY': 'DAYS', 'DAYS': 'DAYS'
        }
        new_unit = unit_mapping.get(unit, unit)
        return f'{indent}schedule:\n{indent}  {new_unit}: {number}'
    # 2. Convert schedule: "cron expression" to a nested schedule block with using_cron: "cron expression"
    def replace_cron_schedule(match):
        indent = match.group(1)
        quote_char = match.group(2)
        cron_expr = match.group(3)
        # Check if it's a valid cron (5 fields with valid characters)
        cron_fields = cron_expr.strip().split()
        if len(cron_fields) == 5:
            cron_char_pattern = r'^[*\d\-,/]+$'
            if all(re.match(cron_char_pattern, field) for field in cron_fields):
                return f'{indent}schedule:\n{indent}  using_cron: {quote_char}{cron_expr}{quote_char}'
        return match.group(0)  # Return unchanged if not valid cron
    # 3. Convert signature: KEY: VALUE lines to arguments: KEY: type: VALUE
    def replace_signature_line(match):
        indent = match.group(1)
        key = match.group(2)
        value = match.group(3).strip('"\'')
        return f'{indent}{key}:\n{indent}  type: {value}'
    # 4. Convert value_data_type: TYPE (in masking_policies context) to a nested arguments/VAL/type block
    def replace_value_data_type(match):
        indent = match.group(1)
        quote_char = match.group(2) if match.group(2) else ''
        data_type = match.group(3)
        return f'{indent}arguments:\n{indent}  VAL:\n{indent}    type: {quote_char}{data_type}{quote_char}'
    # Apply conversions with more precise patterns
    # 1. Time-based schedule conversion (tasks context)
    content = re.sub(
        r'^(\s*)schedule:\s*(["\'])(\d+)\s+(SECOND|SECONDS|MINUTE|MINUTES|HOUR|HOURS|DAY|DAYS)\2\s*$',
        replace_time_schedule,
        content,
        flags=re.MULTILINE | re.IGNORECASE
    )
    # 2. Cron-based schedule conversion (tasks context) - only if not already converted
    content = re.sub(
        r'^(\s*)schedule:\s*(["\'])([^"\']+)\2\s*$',
        replace_cron_schedule,
        content,
        flags=re.MULTILINE
    )
    # 3. Convert signature block in a more targeted way
    def convert_signature_section(match):
        full_match = match.group(0)
        indent = match.group(1)
        # Replace signature: with arguments:
        result = re.sub(r'^(\s*)signature:\s*$', r'\1arguments:', full_match, flags=re.MULTILINE)
        # Convert the field lines within this section only
        lines = result.split('\n')
        converted_lines = []
        in_arguments = False
        for line in lines:
            if 'arguments:' in line:
                in_arguments = True
                converted_lines.append(line)
            elif in_arguments and line.strip() and ':' in line and not line.strip().startswith('comment'):
                # This is a field line under arguments
                match_field = re.match(r'^(\s+)(\w+):\s*(["\']?[^"\'\n]+["\']?)\s*$', line)
                if match_field:
                    field_indent = match_field.group(1)
                    field_name = match_field.group(2)
                    field_value = match_field.group(3).strip('"\'')
                    converted_lines.append(f'{field_indent}{field_name}:')
                    converted_lines.append(f'{field_indent}  type: {field_value}')
                else:
                    converted_lines.append(line)
            else:
                if line.strip() and not line.startswith(' ') and 'arguments:' not in line:
                    in_arguments = False
                converted_lines.append(line)
        return '\n'.join(converted_lines)
    # Apply signature conversion to row_access_policies sections
    content = re.sub(
        r'^(\s*)signature:\s*\n((?:\s+\w+:\s*["\']?[^"\'\n]+["\']?\s*\n)*)',
        convert_signature_section,
        content,
        flags=re.MULTILINE
    )
    # 5. Value data type conversion (masking_policies context only)
    content = re.sub(
        r'^(\s*)value_data_type:\s*(["\']?)([^"\'\n\r]+)\2\s*$',
        replace_value_data_type,
        content,
        flags=re.MULTILINE
    )
    return content
def process_data(data):
    """Process YAML data and apply conversions"""
    if isinstance(data, dict):
        new_data = {}
        for key, value in data.items():
            if key == 'schedule' and isinstance(value, str):
                # Convert task schedule
                new_data[key] = convert_schedule(value)
            elif key == 'signature' and isinstance(value, dict):
                # Convert row access policy signature to arguments
                new_data['arguments'] = convert_signature_to_arguments(value)
            elif key == 'masking_policies' and isinstance(value, dict):
                # Convert masking policies
                new_policies = {}
                for policy_name, policy_config in value.items():
                    new_policies[policy_name] = convert_masking_policy(policy_config)
                new_data[key] = new_policies
            else:
                new_data[key] = process_data(value)
        return new_data
    elif isinstance(data, list):
        return [process_data(item) for item in data]
    else:
        return data
def main():
    """Main conversion function"""
    if len(sys.argv) < 2:
        print("Usage: python sole_config_converter.py <config_file_path>")
        sys.exit(1)
    config_path = sys.argv[1]
    try:
        # Validate file exists
        if not Path(config_path).exists():
            print(f"Error: Configuration file '{config_path}' not found")
            sys.exit(1)
        # Validate file extension
        if not config_path.lower().endswith(('.yml', '.yaml')):
            print("Error: File must be a YAML file (.yml or .yaml)")
            sys.exit(1)
        # Load file content
        try:
            with open(config_path, 'r', encoding='utf-8') as file:
                file_content = file.read()
        except UnicodeDecodeError:
            print("Error: File encoding issue. Please ensure file is UTF-8 encoded")
            sys.exit(1)
        # Validate file is not empty
        if not file_content.strip():
            print("Error: Configuration file is empty")
            sys.exit(1)
        # Check if file contains templating syntax
        has_templates = '{{' in file_content or '{%' in file_content
        if has_templates:
            # Process templated file using text-based conversion
            converted_content = convert_templated_yaml(file_content)
        else:
            # Process regular YAML file
            try:
                config = yaml.safe_load(file_content)
                if config is None:
                    print("Error: Configuration file is empty or invalid")
                    sys.exit(1)
                converted_config = process_data(config)
                converted_content = yaml.dump(converted_config, default_flow_style=False, indent=2, sort_keys=False)
            except yaml.YAMLError as e:
                print(f"Error: Invalid YAML format - {e}")
                sys.exit(1)
        # Save with backup (handle existing backup)
        backup_path = config_path + '.backup'
        if Path(backup_path).exists():
            # Create numbered backup if .backup already exists
            counter = 1
            while Path(f"{backup_path}.{counter}").exists():
                counter += 1
            backup_path = f"{backup_path}.{counter}"
        Path(config_path).rename(backup_path)
        try:
            with open(config_path, 'w', encoding='utf-8') as file:
                file.write(converted_content)
        except Exception as e:
            # Restore backup if write fails
            Path(backup_path).rename(config_path)
            print(f"Error: Failed to write converted file - {e}")
            sys.exit(1)
        print("Conversion completed")
    except PermissionError:
        print(f"Error: Permission denied accessing '{config_path}'")
        sys.exit(1)
    except Exception as e:
        print(f"Error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()
Running the SOLE June 2025 migration script
To run the script:
- Save the script as sole_config_converter.py.
- Run the script using the following command:
python sole_config_converter.py <config_file_path>
Example:
python sole_config_converter.py dataops/sole/databases.yml
Resolving database cloning issues
When migrating to the new SOLE backend, you may encounter compatibility issues with databases created using the from_database parameter. This typically manifests as the following error:
Error: failed to upgrade the state with database created from database, please use snowflake_database instead.
Disclaimer: Right now, database cloning is not supported. They can be imported into the mentioned resource,
but any difference in behavior from standard database won't be handled (and can result in errors)
Root cause
This error occurs because the new SOLE backend uses an updated statefile schema version that is incompatible with the previous database cloning implementation.
Solution
To resolve this compatibility issue, reset the lifecycle state by setting the LIFECYCLE_STATE_RESET environment variable to 1. This forces SOLE to rebuild the state from your current configuration, ensuring compatibility with the new backend.
You can set this variable at runtime while running the pipeline by providing it as an environment variable when triggering the pipeline execution.
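For example, one way to do this is to declare the variable alongside your other pipeline variables, as in the following sketch (this assumes your pipeline configuration accepts a top-level variables block; you can equally supply the variable through the UI when you trigger the run):

variables:
  LIFECYCLE_STATE_RESET: "1"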
After applying the state reset, your pipeline will rebuild the state using the new schema format and resolve the database cloning compatibility issue.