File Format
You can provide configuration to Snowflake Object Lifecycle Engine for the following operations with file format:
- Manage the lifecycle of new and existing file formats
- Manage grants of file format
Usage
- Classic Configuration
- Data Products Configuration
databases:
<database-name>:
schemas:
<schema-name>:
file_formats:
<format-name>:
<configuration-key>: <value>
grants:
<privilege>:
- <role-name>
- <role-name>
- file_format:
name: <format-name>
<configuration-key>: <value>
database: rel(database.<database-name>)
schema: rel(schema.<schema-name>)
grants:
<privilege>:
- rel(role.<role-name>)
- rel(role.<role-name>)
Supported parameters
The engine supports the parameters listed below.
Configuration Key | Required/Optional | Data Types and Values | Description |
---|---|---|---|
format_type | Required | String: For info on format-specific options, see CSV, JSON, AVRO, ORC, PARQUET, and XML | Specifies the format of the data files to load |
comment | Optional | String | Specifies a comment for the file format |
deleted | Optional | Boolean: True enables deletion prevention, False does nothing | Specifies what objects are allowed to be deleted |
grants | Optional | Map: See Supported file_format Grants to Roles | List of Privileges and Roles to which privileges are granted on the current file format |
manage_mode | Optional | String: all (default), none , grants | Configures what properties to manage for the file format. See Changing Manage Mode before changing the value. |
Additional parameters
format_type
set to CSV
Configuration Key | Required/Optional | Data Types and Values | Description |
---|---|---|---|
binary_format | Optional | String: HEX (default), BASE64 , UTF8 | Defines the encoding format for binary input or output |
compression | Optional | String: AUTO (default), GZIP , BZ2 , BROTLI , ZSTD , DEFLATE , RAW_DEFLATE , NONE | Specifies the current compression algorithm for the data file |
date_format | Optional | String: AUTO (default) | Defines the format of date values in the data files |
encoding | Optional | String: UTF8 (default) | Specifies the character set of the source data when loading it into a table |
empty_field_as_null | Optional | Boolean: TRUE (default) | Specifies whether to insert SQL NULL for empty fields in an input file, which are represented by two successive delimiters |
error_on_column_count_mismatch | Optional | Boolean | Specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input file does not match the number of columns in the corresponding table |
escape | Optional | String: NONE (default) | Specifies the character set of the source data when loading it into a table |
escape_unenclosed_field | Optional | String: (\\) (default) | Single character string used as the escape character for unenclosed field values only. Note: Due to a parsing issue, double backslashes (\\ ) are converted to single backslashes (\ ). |
field_delimiter | Optional | String: , (default) | Specifies one or more single-byte or multibyte characters that separate fields in the data file |
field_optionally_enclosed_by | Optional | String | Character used to enclose strings |
file_extension | Optional | String | Specifies the extension for files unloaded to a stage |
null_if | Optional | List/array of string: \\N (default) | Converts to and from SQL NULL. Note: Due to a parsing issue, double backslashes ( \\ ) are converted to single backslashes (\ ). This results in DataOps always detecting a change between the target configuration and Snowflake and applying an update to the file format. |
parse_header | Optional | Boolean: FALSE (default) | Specifies whether to use the first row in the data files to determine column names |
record_delimiter | Optional | String: \n (default) | Specifies one or more single-byte or multibyte characters that separate records in the data file |
replace_invalid_characters | Optional | Boolean | Specifies whether to replace invalid UTF-8 characters with the Unicode replacement character |
skip_byte_order_mark | Optional | Boolean | Specifies whether to skip the BOM (byte order mark) if present in a data file |
skip_header | Optional | Integer: 0 (default) | Number of lines at the start of the file to skip |
skip_blank_lines | Optional | Boolean: FALSE (default) | Specifies to skip any blank lines encountered in the data files |
time_format | Optional | String: AUTO (default) | Format of time values in the data files |
timestamp_format | Optional | String: AUTO (default) | Format of timestamp values in the data files |
trim_space | Optional | Boolean: FALSE (default) | Specifies whether to remove white space from fields |
info
For more information, see the Snowflake Docs.
format_type
set to JSON
Configuration Key | Required/Optional | Data Types and Values | Description |
---|---|---|---|
allow_duplicate | Optional | Boolean: FALSE (default) | Allows duplicate object field names |
binary_format | Optional | String: HEX (default), BASE64 , UTF8 | Defines the encoding format for binary input or output |
compression | Optional | String: AUTO (default), GZIP , BZ2 , BROTLI , ZSTD , DEFLATE , RAW_DEFLATE , NONE | Specifies the current compression algorithm for the data file |
date_format | Optional | String: AUTO (default) | Defines the format of date values in the data files |
deleted | Optional | Boolean: True enables deletion prevention, False does nothing | Specifies what objects are allowed to be deleted |
enable_octal | Optional | Boolean: FALSE (default) | Enables parsing of octal numbers |
file_extension | Optional | String | Specifies the extension for files unloaded to a stage |
ignore_utf8_errors | Optional | Boolean: FALSE (default) | Specifies whether UTF-8 encoding errors produce error conditions |
null_if | Optional | List/array of string: \\N (default) | Converts to and from SQL NULL. Note: Due to a parsing issue, double backslashes ( \\ ) are converted to single backslashes (\ ). This results in DataOps always detecting a change between the target configuration and Snowflake and applying an update to the file format. |
replace_invalid_characters | Optional | Boolean: FALSE (default) | Specifies whether to replace invalid UTF-8 characters with the Unicode replacement character |
strip_outer_array | Optional | Boolean: FALSE (default) | Tells the JSON parser to remove outer brackets |
skip_byte_order_mark | Optional | Boolean: FALSE (default) | specifies whether to skip the BOM (byte order mark) if present in a data file |
strip_null_values | Optional | Boolean: FALSE (default) | Tells the JSON parser to remove object fields or array elements containing null values |
time_format | Optional | String: AUTO (default) | Format of time values in the data files |
timestamp_format | Optional | String: AUTO (default) | Defines the format of timestamp values in the data files |
trim_space | Optional | Boolean: FALSE (default) | specifies whether to remove white space from fields |
info
For more information, see the Snowflake Docs.
format_type
set to AVRO
Configuration Key | Required/Optional | Data Types and Values | Description |
---|---|---|---|
compression | Optional | String: AUTO , GZIP , BZ2 , BROTLI , ZSTD , DEFLATE , RAW_DEFLATE , NONE | Specifies the current compression algorithm for the data file |
null_if | Optional | List/array of string | Converts to and from SQL NULL |
trim_space | Optional | Boolean | specifies whether to remove white space from fields |
info
For more information, see the Snowflake Docs.
format_type
set to ORC
Configuration Key | Required/Optional | Data Types and Values | Description |
---|---|---|---|
null_if | Optional | List/array of string | Converts to and from SQL NULL |
trim_space | Optional | Boolean | specifies whether to remove white space from fields |
info
For more information, see the Snowflake Docs.