
Environment Variables


Information can be passed to and from Workflows, and between the jobs within them.

One form of this is our Workflow inputs and outputs, which can be datasets, volumes, or strings.

Another is environment variables. These are used when we want to pass information from the Workflow YAML to some other computation being called, such as authenticating to a private GitHub repository or running a Python .py script.

Why use environment variables and not just arguments to a script?

Some reasons are:

  • Environment variables can hold other information, like secrets, that is harder to pass securely otherwise.

  • They can be defined as applying to all Workflow jobs (under defaults:) or per-job, under a job's name.

  • Sets of arguments may be used several times. For example, the potentially large number of hyperparameters for training an ML model may be made explicit as environment variables, then reused to train more than one model, varying some parameters while leaving the rest unchanged.

  • Using environment variables makes larger setups like this easier to handle than passing long argument lists more than once, helping ensure that different model invocations receive the same settings when they should (see the sketch after this list).
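
As a minimal sketch of that reuse pattern, assuming a hypothetical training script that takes all of its hyperparameters from HP_-prefixed environment variables (in the style of the tutorial example below):

import os

# Collect every HP_-prefixed environment variable into one dictionary, so the same
# training script can be invoked by several Workflow jobs that share most settings
# and override only a few.
def collect_hyperparameters():
    return {
        key[len('HP_'):].lower(): value
        for key, value in os.environ.items()
        if key.startswith('HP_')
    }

hyperparams = collect_hyperparameters()
# e.g. {'final_epochs': '50', 'final_lr': '0.1'}; the values arrive as strings
# and still need casting, as described below.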

Specifying environment variables in Workflows

An environment variable global to all of a Workflow's jobs goes under env: in the defaults: field. A commonly used example is a secret holding the user's API key; a code block using one to define the environment variable HELLO might look like

defaults:
  env:
    HELLO: secret:hello
  resources:
    instance-type: P4000
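
Inside a job's script, the secret's resolved value then behaves like any other environment variable. A minimal sketch of reading it in Python, assuming the HELLO name from the example above:

import os

# Gradient resolves the secret:hello reference and exposes its value to the job
# as the plain environment variable HELLO
api_key = os.environ['HELLO']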

Common job-specific environment variables are information to be passed to a script; for example, the model hyperparameters from our Deep Learning Recommender tutorial appear in this code:

RecommenderTrain:
  needs:
    - CloneRecRepo
  inputs:
    repoRec: CloneRecRepo.outputs.repoRec
  env:
    HP_FINAL_EPOCHS: '50'
    HP_FINAL_LR: '0.1'
  outputs:
    trainedRecommender:
      type: dataset
      with:
        ref: recommender
  uses: script@v1
  with:
    script: |-
      cp -R /inputs/repoRec /Deep-Learning-Recommender-TF
      cd /Deep-Learning-Recommender-TF
      python workflow_train_model.py
    image: tensorflow/tensorflow:2.4.1-jupyter

Here, the number of epochs to train the final model, HP_FINAL_EPOCHS: '50', and the model's learning rate, HP_FINAL_LR: '0.1', are used by the workflow_train_model.py Python script that the job calls.

Using values of Workflow environment variables in scripts

To use the values of Workflow environment variables in a script, read them as part of the code. In the recommender case, we assign them to variables like this:

import os

hp_final_epochs = int(os.environ.get('HP_FINAL_EPOCHS'))
hp_final_lr = float(os.environ.get('HP_FINAL_LR'))

Python's os.environ reads the values, and we need to cast them to the correct data type (integer, float, etc.) for the model to understand them.
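
Note that os.environ.get returns None when a variable is unset, so the int() cast above would then fail with a TypeError. A slightly more defensive sketch (the default values here are illustrative, not part of the tutorial):

import os

# Fall back to illustrative defaults when a variable is not set in the Workflow YAML
hp_final_epochs = int(os.environ.get('HP_FINAL_EPOCHS', '10'))
hp_final_lr = float(os.environ.get('HP_FINAL_LR', '0.01'))

# Or fail fast with a clear error when a required variable is missing
try:
    hp_final_lr = float(os.environ['HP_FINAL_LR'])
except KeyError as err:
    raise RuntimeError(f'missing required environment variable: {err}') from None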

Advanced Usage

For more advanced situations, Python has libraries such as env_config. This overlaps somewhat with what Workflows can do, because it also allows you to declare environment variables. But it has clearer handling of issues like data types and errors, and it can handle variables that are lists (useful for hyperparameter tuning) or more complex structures.

It can read declared variables from a file, e.g., this test.sh from env_config's GitHub page:

#!/usr/bin/env bash

# comments are ignored

HIDDEN_VARIABLE="value not parsed"
export VISIBLE_VARIABLE_1="this value will be available"

function ignored_function {
   # lines inside a function that do not start with export are ignored
   :
}

# variables inside strings are not expanded; the value will contain the literal $OTHER_VARIABLE
export VARIABLE_CONTAINING_REFERENCE="$OTHER_VARIABLE"

which is read in by

from env_config import Config, parse_int

# uses the value of CONFIG_FILE as the file name to load variables from
config = Config(filename_variable='CONFIG_FILE', defer_raise=False)

# visible_variable_1 is declared under the 'test' tag and the current tag is 'test',
# so its value will be loaded from test.sh
config.declare('visible_variable_1', parse_int(), ('test',), 'test')

# visible_variable_2 is declared under the 'default' tag and is not available in the
# config file; it will be ignored because the current tag is 'test'
config.declare('visible_variable_2', parse_int(), ('default',), 'test')

This could potentially be integrated with Workflows to similarly handle a set of Workflow environment variables.
