
CDE Setup and Configuration

Feature release status badge: PriPrev

danger

The DataOps CDE is in private preview, so the service may be unavailable for periods of time. Occasionally, elements of the service may be rebuilt, in which case items such as your stored credentials are removed and you will need to re-add them. The CDE is completely isolated, so this poses no risk to the rest of DataOps. As good development practice dictates, whether you develop on a local machine or in a Cloud Development Environment, never rely on it for safe storage of configuration or code. Always create a new branch before making any changes, and regularly commit your changes to this branch.

Initial setup within the DataOps SaaS application

To let the DataOps CDE work automatically with your DataOps project so that you can start coding right away, first complete the following setup steps:

  1. Under the top-level Projects menu, click on your chosen project to open its details.

  2. From the project menu, click DataOps CDE.

  3. If this is the first time you have done this, a confirmation message is displayed. Click Enable DataOps Cloud Development Environment.

    This brings you back to your project, but now the DataOps CDE is enabled for you.

  4. Create a .gitpod.yml file in the root directory of your repository with the following content:

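    tasks:
      # A single task that runs the DataOps CDE setup script at workspace start
      - name: Setup
        before: |
          /dataops-cde/scripts/dataops_cde_setup.sh
    # Docker image used for the workspace
    image: dataopslive/dataops-gitpod-workspace:5-stable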

    Here, tasks defines how the CDE prepares and builds your project and how it starts the project's development server, and image specifies the Docker image used for workspaces.

    note

    The DataOps CDE works per branch, so this file must exist in each branch. Once the file has made its way into your main branches, this is no longer an issue. When you are starting out, however, if you open the CDE and it doesn't look or behave as you expect, you have most likely opened a branch where this file is missing or not configured correctly.

  5. Ensure that the .gitignore file in the repository root includes the following:

    dataops/modelling/dbt_packages
    .vscode/settings.json
    snowflake.log

    note

    Likewise, a .gitignore file with these entries should be present in each of your branches. Otherwise, the DataOps CDE will not look or behave as you expect.
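
Because both files must be present on every branch, a quick check after switching branches can save a confusing CDE session. The following is a minimal sketch, not part of DataOps: check_cde_files is a hypothetical helper that looks for .gitpod.yml and for the three .gitignore entries listed above, and it assumes each entry appears on its own line exactly as written.

```shell
# Hypothetical helper (not part of DataOps): report what a checkout is
# missing for the DataOps CDE, based on the two setup steps above.
check_cde_files() {
  local repo_root="${1:-.}"
  local status=0
  local entry

  if [ ! -f "$repo_root/.gitpod.yml" ]; then
    echo "missing file: .gitpod.yml"
    status=1
  fi

  for entry in "dataops/modelling/dbt_packages" ".vscode/settings.json" "snowflake.log"; do
    # -x matches whole lines only, -F treats the entry as a literal string.
    # A missing .gitignore is reported as three missing entries.
    if ! grep -qxF -- "$entry" "$repo_root/.gitignore" 2>/dev/null; then
      echo "missing .gitignore entry: $entry"
      status=1
    fi
  done

  return "$status"
}
```

Called as check_cde_files . from the repository root, it prints one line per missing item and returns a non-zero status if anything is missing.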

Initial setup with the DataOps CDE

  1. Click DataOps CDE again to open a workspace in the DataOps CDE. If you haven't done this recently, you may be asked to reauthenticate.

  2. In the new window, click Continue with app.data....

    The first time you authenticate, you may see an authorization request (although this requirement is being removed in newer DataOps CDE releases).

    You need to click Authorized to allow the DataOps CDE environment to authenticate back to the DataOps SaaS platform with your identity.

  3. In the dialog box that opens, select only the default editor, VS Code Browser.

    Other editors may not work, and they will not provide the optimized DataOps developer experience. You should now see the DataOps CDE workspace.

Initial setup of credentials

Credentials for working with Snowflake are captured as variables.

  1. Set the key variables for the environment.

    info

    These credentials are used to connect to Snowflake for MATE model testing and execution. The credentials set in the variables should not belong to the same user as the service account you use for DataOps pipelines.

    We recommend that you use the same role as the service account. If you test and run models from the DataOps CDE using a different role, the permissions may not match once you commit this code and the DataOps pipeline job tries to access the same objects.

    You can set the variables from the DataOps CDE Admin UI at https://code.dataops.live/variables, or at https://code.dev.dataops.live/variables for the preview environment.

    You can also get there from within the DataOps CDE workspace by clicking Gitpod in the bottom left, choosing Gitpod: Open Settings from the dropdown list, and then selecting Variables from the left panel of the new window.

    Read more about Variables and Scopes.

    Alternatively, you can set the variables from a terminal within DataOps CDE by using the gp env command, e.g.:

    gp env DBT_ENV_SECRET_ACCOUNT="<account name>"
    gp env DBT_ENV_SECRET_PASSWORD="<password>"
    gp env DBT_ENV_SECRET_USER="<username>"
    gp env DBT_ENV_ROLE="DATAOPS_WRITER" # In a default project this will be DATAOPS_WRITER
    gp env DBT_ENV_WAREHOUSE="DATAOPS_TRANSFORMATION" # In a default project this will be DATAOPS_TRANSFORMATION

    Run these commands directly in the DataOps CDE terminal.

    However you enter your variables, you should end up with all five set: DBT_ENV_SECRET_ACCOUNT, DBT_ENV_SECRET_PASSWORD, DBT_ENV_SECRET_USER, DBT_ENV_ROLE, and DBT_ENV_WAREHOUSE.

  2. Once you have set these variables, you need to start a new workspace to pick up these changes. The easiest way to do this is to delete the workspace and create a new one.
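
To confirm that a new workspace has picked up the credentials, you can check that each variable is set without printing its value. This is a minimal sketch under the assumption that variables set through the Variables page or gp env are exposed as environment variables in the new workspace's terminal; missing_cde_vars is a hypothetical helper, not a DataOps or Gitpod command.

```shell
# Hypothetical helper (not a DataOps or Gitpod command): list any of the five
# variables from the gp env commands above that are unset or empty.
# Only variable names are printed, never their values.
missing_cde_vars() {
  local name value
  local missing=0
  for name in DBT_ENV_SECRET_ACCOUNT DBT_ENV_SECRET_USER DBT_ENV_SECRET_PASSWORD \
              DBT_ENV_ROLE DBT_ENV_WAREHOUSE; do
    # Indirect lookup: read the variable whose name is stored in $name.
    # The names come from the fixed list above, so eval is safe here.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

If the function prints nothing and returns zero, all five variables are visible to the terminal. Any name it prints is a variable to revisit before testing MATE models.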

Optional MATE Configuration

The DataOps CDE builds a profiles.yml file assuming the profile name is snowflake_operations. If this is not the case for you, you can set DBT_PROFILE_NAME. For example:

gp env DBT_PROFILE_NAME=my_snowflake_profile_name
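
Whether you need DBT_PROFILE_NAME depends on the profile your dbt project asks for. The sketch below compares the profile key in a dbt_project.yml file with the name the CDE is described as using (DBT_PROFILE_NAME, or the snowflake_operations default). The helper names are hypothetical, the path to dbt_project.yml is passed in because its location in your repository is an assumption, and the sed-based parsing only handles a simple top-level profile: line.

```shell
# Hypothetical helper: the profile name described above, assuming an empty
# DBT_PROFILE_NAME is treated the same as an unset one.
effective_dbt_profile() {
  echo "${DBT_PROFILE_NAME:-snowflake_operations}"
}

# Hypothetical helper: print the value of the top-level "profile:" key in the
# given dbt_project.yml, with trailing comments, whitespace, and quotes removed.
project_profile_name() {
  sed -n -e 's/^profile:[[:space:]]*//p' "$1" \
    | head -n 1 \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' \
    | tr -d "\"'"
}

# Succeeds when the project's profile matches the name the CDE would use.
profile_matches() {
  [ "$(project_profile_name "$1")" = "$(effective_dbt_profile)" ]
}
```

For example, profile_matches dataops/modelling/dbt_project.yml || echo "set DBT_PROFILE_NAME" flags a mismatch; the path shown is illustrative and may differ in your project.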

Testing the setup

Congratulations, you should now be fully set up to use the DataOps CDE. Clicking DataOps CDE now takes you into a clean, ready-to-develop workspace within a few seconds.