Understanding Inputs & Outputs

A Gradient Workflow is composed of a series of steps. These steps specify how to orchestrate computational tasks. Each step can communicate with other steps through what are known as inputs and outputs.

There are three types of inputs and outputs. Understanding how these function will help you craft concise and elegant Workflows.

  • Datasets

  • Volumes

  • Strings

Datasets

The dataset type leverages the Gradient platform's native data primitive. Datasets are not limited to any single kind of data: a dataset can contain anything from pretrained models to generated images to configuration files. Datasets are versioned, and Workflows can consume existing dataset versions as well as produce and tag new versions of existing datasets.

Note: Datasets must be created before they are referenced in a Workflow. See Create Datasets for the Workflow for more information.

Scenario 1: Consuming a dataset that already exists within Gradient

inputs:
    my-dataset: 
        type: dataset
        with:
            ref: my-dataset-id

Scenario 2: Generating a new dataset version from a Workflow step

my-job:
  uses: container@v1
  with:
    args:
      - bash
      - '-c'
      - cp -R /my-trained-model /outputs/my-dataset
    image: bash:5
  outputs:
    my-dataset:
      type: dataset
      with:
        ref: my-dataset-id

my-dataset-id can be either the dataset's actual ID (a 15-character string such as def123ghi456jkl, optionally with a version ID appended) or the dataset's name.
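
For example, assuming a dataset that was created with the name my-training-data (an illustrative name), the same input could reference the dataset by name rather than by ID:

inputs:
    my-dataset:
        type: dataset
        with:
            ref: my-training-data   # illustrative: referencing the dataset by name instead of by ID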

Volumes

Unlike GitHub Actions, for example, the steps of a Gradient Workflow will often execute on different compute nodes. To facilitate passing data between these nodes, Gradient Actions expose the notion of volumes and volume passing.

Volumes enable actions such as the @git-checkout action. Volumes can be defined as input volumes, output volumes, or both. When a volume is an output it is mounted at /outputs and is writable; when a volume is an input it is mounted at /inputs and is read-only.

Note: Volumes are currently limited to 5 GB of data. If you need more space, we recommend using Datasets.

Here is how you would define an output volume:

    outputs:
      my-volume:
        type: volume

In this example a volume is first created as an output and then used as an input in a subsequent job step:

defaults:
  resources:
    instance-type: P4000

jobs:
  job1:
    uses: container@v1
    with:
      args:
      - bash
      - -c
      - echo hello > /outputs/my-volume/testfile1; echo "wrote testfile1 to volume"
      image: bash
    outputs:
      my-volume:
        type: volume
  job2:
    needs:
    - job1
    uses: container@v1
    with:
      args:
      - bash
      - -c
      - cat /inputs/my-volume/testfile1
      image: bash
    inputs:
      my-volume: job1.outputs.my-volume

A volume can currently be used as an output only in the job that creates it; subsequent jobs can consume it only as an input. This limitation is planned to be removed in the future.
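
For instance, extending the example above, an additional job could still consume job1's volume as an input, but it could not re-declare that volume as one of its own outputs (the job name job3 here is illustrative):

  job3:
    needs:
    - job1
    uses: container@v1
    with:
      args:
      - bash
      - -c
      - cat /inputs/my-volume/testfile1   # reading the volume again is fine; re-exporting it as an output is not yet supported
      image: bash
    inputs:
      my-volume: job1.outputs.my-volume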

Strings

In some cases, you may need to pass a single value between Workflow steps. The string type makes this possible.

Scenario 1: Passing a string as a Workflow-level input

inputs:
  my-string:
    type: string
    with:
      value: "my string value"

jobs:
  job-1:
    resources:
      instance-type: P4000
    uses: container@v1
    with:
      args:
      - bash
      - -c
      - cat /inputs/my-string
      image: bash:5
    inputs:
      my-string: workflow.inputs.my-string

Scenario 2: Passing a string between job steps

defaults:
  resources:
    instance-type: P4000

jobs:
  job-1:
    uses: container@v1
    with:
      args:
      - bash
      - -c
      - echo "string output from job-1" > /outputs/my-string; echo job-1 finished
      image: bash:5
    outputs:
      my-string:
        type: string
  job-2:
    uses: container@v1
    with:
      args:
      - bash
      - -c
      - cat /inputs/my-string
      image: bash:5
    needs:
    - job-1
    inputs:
      my-string: job-1.outputs.my-string

Scenario 3: Creating a model from a dataset and passing the model ID as a string to a subsequent step

Note: There is no native Gradient Action for Model Deployments today. Instead, you can use the Gradient SDK to create and manage your inference endpoints.

To run this example you will need to: a) create a dataset named test-model and upload valid TensorFlow model files to it; b) define a secret named MY_API_KEY containing your Gradient CLI API key; and c) substitute your own clusterId in the deployments create step.

defaults:
  resources:
    instance-type: P4000

jobs:
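  # Register the model files from the test-model dataset as a Gradient model
  # named my-model and output the new model's ID as a string.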
  UploadModel:
    uses: create-model@v1
    with:
      name: my-model
      type: Tensorflow
    inputs:
      model:
        type: dataset
        with:
          ref: test-model
    outputs:
      model-id:
        type: string
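  # Use the Gradient CLI (via the paperspace/gradient-sdk image) to create a
  # TFServing deployment from the model ID produced by UploadModel.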
  DeployModel:
    needs:
    - UploadModel
    inputs:
      model-id: UploadModel.outputs.model-id
    env:
      PAPERSPACE_API_KEY: secret:MY_API_KEY
    uses: container@v1
    with:
      command: bash
      args:
      - -c
      - >-
       gradient deployments create
       --clusterId cl1234567
       --deploymentType TFServing
       --modelId $(cat /inputs/model-id)
       --name "Sample Deployment"
       --machineType P4000
       --imageUrl tensorflow/serving:latest-gpu
       --instanceCount 1
      image: paperspace/gradient-sdk
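
Once the Workflow has completed, you can verify that the endpoint was created, for example with the Gradient CLI (shown as an illustrative check; it assumes the CLI is authenticated with your API key):

gradient deployments list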

