View and Query Metrics

How to view and query metrics in the GUI and the CLI

Notebooks, Experiments & Deployment Metrics

Experiments and Jobs

To view metrics for a Job, open its "Metrics" tab. To view metrics for an Experiment, click the individual Jobs that represent its worker nodes and open the "Metrics" tab of each one.

Deployments

Deployment metrics are a Gradient Private Cluster feature. Contact Sales for inquiries!

To view Deployment metrics, navigate to the Metrics tab of the individual deployment.

Notebooks

Notebook metrics are a Gradient Private Cluster feature. Contact Sales for inquiries!

A minimal view of Notebook metrics appears in the Notebook header. From there you can also open a more detailed display.

To query the metrics for a given workload, use the CLI. The example below uses Experiments; the syntax for Deployments and Notebooks is the same.

Usage: gradient experiments metrics [OPTIONS] COMMAND [ARGS]...

  Read experiment metrics

Options:
  --help  Show this message and exit.

Commands:
  get     Get experiment metrics
  stream  Watch live experiment metrics

Get a single value for a given metric:

Syntax

gradient experiments metrics get --id <experiment id>

Parameters

Usage: gradient experiments metrics get [OPTIONS]

  Get experiment metrics. Shows CPU and RAM usage by default

Options:
  --id TEXT                       ID of the experiment  [required]
  --metric [cpuPercentage|memoryUsage|gpuMemoryFree|gpuMemoryUsed|gpuPowerDraw|gpuTemp|gpuUtilization|gpuMemoryUtilization]
                                  One or more metrics that you want to read.
                                  Defaults to cpuPercentage and memoryUsage
  --interval TEXT                 Interval
  --start [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]
                                  Timestamp of first time series metric to
                                  collect
  --end [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]
                                  Timestamp of last time series metric to
                                  collect
  --apiKey TEXT                   API key to use this time only
  --optionsFile PATH              Path to YAML file with predefined options
  --createOptionsFile PATH        Generate template options file
  --help                          Show this message and exit.
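
The --start and --end options accept the three timestamp formats listed above. As a minimal sketch, the Python snippet below formats a time window in one of those formats so the values can be pasted into a command. The helper name time_window is hypothetical, and whether the CLI interprets the timestamps as UTC or as local time is not stated here, so treat the time zone as an assumption to verify.

```python
from datetime import datetime, timedelta

# One of the three formats listed for --start and --end in the help text.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def time_window(end, hours):
    """Return (start, end) strings for a window of `hours` ending at `end`.

    Hypothetical helper for illustration; it is not part of the Gradient CLI.
    """
    start = end - timedelta(hours=hours)
    return start.strftime(TIMESTAMP_FORMAT), end.strftime(TIMESTAMP_FORMAT)


# A fixed end time keeps the example reproducible.
start, end = time_window(datetime(2020, 6, 1, 12, 0, 0), hours=2)
print(f"--start {start} --end {end}")
# prints: --start 2020-06-01T10:00:00 --end 2020-06-01T12:00:00
```

The printed fragment can be appended to a gradient experiments metrics get --id <experiment id> command.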

Example command to get CPU usage as a percentage:

gradient experiments metrics get --id <experiment id> --metric cpuPercentage
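
When the command is assembled from a script, it can help to validate metric names before calling the CLI. The following is a sketch only: build_get_command is a hypothetical helper, the metric names are copied from the --metric choices in the help text, and it assumes that several metrics are requested by repeating the --metric flag. The workload argument is "experiments" as in the commands on this page; the subcommand names for Deployments and Notebooks are assumed to follow the same pattern.

```python
import shlex

# Metric names copied from the --metric choices in the help text.
METRICS = (
    "cpuPercentage", "memoryUsage", "gpuMemoryFree", "gpuMemoryUsed",
    "gpuPowerDraw", "gpuTemp", "gpuUtilization", "gpuMemoryUtilization",
)


def build_get_command(workload, workload_id, metrics=()):
    """Build the argument list for `gradient <workload> metrics get`.

    Hypothetical helper for illustration; it only assembles arguments
    and does not run the CLI.
    """
    unknown = [m for m in metrics if m not in METRICS]
    if unknown:
        raise ValueError(f"unknown metrics: {unknown}")
    argv = ["gradient", workload, "metrics", "get", "--id", workload_id]
    for metric in metrics:
        # Assumption: several metrics are requested by repeating --metric.
        argv += ["--metric", metric]
    return argv


argv = build_get_command(
    "experiments", "<experiment id>", ["cpuPercentage", "gpuUtilization"]
)
# shlex.join (Python 3.8+) quotes the placeholder ID for safe display.
print(shlex.join(argv))
```

The returned list can be passed to subprocess.run once the placeholder ID is replaced with a real one.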

Stream metrics:

Syntax

gradient experiments metrics stream --id <experiment id>

Parameters

Usage: gradient experiments metrics stream [OPTIONS]

  Get experiment metrics. Shows CPU and RAM usage by default

Options:
  --id TEXT                       ID of the experiment  [required]
  --metric [cpuPercentage|memoryUsage|gpuMemoryFree|gpuMemoryUsed|gpuPowerDraw|gpuTemp|gpuUtilization|gpuMemoryUtilization]
                                  One or more metrics that you want to read.
                                  Defaults to cpuPercentage and memoryUsage
  --interval TEXT                 Interval
  --start [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]
                                  Timestamp of first time series metric to
                                  collect
  --end [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]
                                  Timestamp of last time series metric to
                                  collect
  --apiKey TEXT                   API key to use this time only
  --optionsFile PATH              Path to YAML file with predefined options
  --createOptionsFile PATH        Generate template options file
  --help                          Show this message and exit.
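
Because stream keeps running until it is interrupted, a script that consumes it needs to read the output incrementally rather than wait for the process to exit. The sketch below shows one way to do that in Python. The helper stream_lines is hypothetical, and the format of the streamed output is not documented on this page, so the lines are passed through unparsed. A stand-in child process is used so the example runs without the CLI installed.

```python
import subprocess
import sys


def stream_lines(argv):
    """Run argv and yield each line of its standard output as it arrives.

    Hypothetical helper for illustration; it is not part of the Gradient CLI.
    """
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


# Real usage would pass the stream command from this page, for example:
#   ["gradient", "experiments", "metrics", "stream", "--id", "<experiment id>"]
# The stand-in below only prints two lines and exits.
demo = [sys.executable, "-c", "print('line 1'); print('line 2')"]
for line in stream_lines(demo):
    print(line)
```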


Last updated 4 years ago