DataOps Runner for Snowpark Container Services Compute Pool Recommendations
Understand the cost trade-offs of different compute pool sizes and counts
We recommend using a single CPU_X64_S compute pool as a default. This pool hosts the runner and should be enough to handle most pipelines.
It is possible to use different compute pools for the runner and the jobs, which can make sense for certain workloads. However, this is only cost-efficient when the secondary compute pool spends enough time suspended. For example, if you had only one pipeline that ran once an hour, it may make sense to run it on a secondary compute pool that is suspended whenever no pipelines are running.
If your pipelines execute frequently enough that a secondary compute pool is rarely suspended, it is more economical to operate a single, larger compute pool sized to meet your job requirements.
The example below (prices as of Aug 24, 2024) illustrates when using a separate compute pool is cost-effective.
Running a pipeline on a single CPU_X64_S pool for a day costs 2.64 Snowflake credits.
Running a CPU_X64_XS pool for the runner and a CPU_X64_S pool for the jobs for a day costs 4.08 credits.
In this example, the secondary CPU_X64_S pool would have to be suspended for a minimum of 14 hours per day to be cheaper than running a single CPU_X64_S pool for the full 24 hours.
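The break-even point above can be derived from the hourly credit rates. The sketch below is an illustration only: the per-hour rates are assumptions back-calculated from the daily figures in the example (1.44 credits/day for CPU_X64_XS, 2.64 credits/day for CPU_X64_S), not authoritative Snowflake pricing.

```python
# Break-even sketch for a secondary compute pool.
# Rates are assumptions derived from the worked example above
# (prices as of Aug 24, 2024), not official pricing.
XS_RATE = 0.06  # credits/hour for CPU_X64_XS (1.44 credits / 24 h)
S_RATE = 0.11   # credits/hour for CPU_X64_S  (2.64 credits / 24 h)
HOURS_PER_DAY = 24

# Option A: a single CPU_X64_S pool running all day.
single_pool_cost = S_RATE * HOURS_PER_DAY  # 2.64 credits

def two_pool_cost(secondary_active_hours: float) -> float:
    """Cost of an always-on CPU_X64_XS runner pool plus a CPU_X64_S
    secondary pool that is active for the given number of hours."""
    return XS_RATE * HOURS_PER_DAY + S_RATE * secondary_active_hours

# The two-pool setup is cheaper when:
#   XS_RATE * 24 + S_RATE * h  <  S_RATE * 24
# Solving for h gives the maximum daily active hours for the secondary pool.
break_even_active_hours = (single_pool_cost - XS_RATE * HOURS_PER_DAY) / S_RATE
print(f"Secondary pool must be active under "
      f"{break_even_active_hours:.1f} h/day")
print(f"i.e. suspended at least "
      f"{HOURS_PER_DAY - break_even_active_hours:.1f} h/day")
```

This yields roughly 13.1 hours of required suspension per day, which rounds up to the 14-hour figure above when counting in whole hours.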
The same trend holds for a secondary pool of any size: a single, larger compute pool will usually cost less. The only exceptions are scenarios where the secondary compute pool can be suspended for long periods.