Metrics

This guide describes the different types of evaluation metrics and how you can view them.


Last updated 3 years ago

Gradient workloads can record metrics that are available both in real time and after the workload completes. Gradient displays these metrics in the web UI, and they can also be queried or streamed from the CLI.

Gradient can log three kinds of metrics: system (hardware) metrics, framework metrics, and custom user metrics.

Note: Framework and custom metrics are only available in a Gradient Private Cluster. Contact Sales for inquiries!

System metrics

All Gradient workloads, such as Experiments and Deployments, track CPU, memory, and network usage. If the machine is equipped with a GPU, GPU usage is tracked as well.

Framework metrics

For example, accuracy and mean squared error are two common metrics for classification and regression, respectively.
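
To make the two metrics named above concrete, here is a minimal sketch in plain Python. The function names `accuracy` and `mean_squared_error` and the sample data are illustrative assumptions, not part of Gradient; in practice your framework would normally compute these values for you.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (classification)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions (regression)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative data only
labels = [1, 0, 1, 1]
predicted_labels = [1, 0, 0, 1]
print(accuracy(labels, predicted_labels))  # 3 of 4 correct -> 0.75

targets = [2.0, 4.0, 6.0]
predictions = [2.5, 4.0, 5.0]
print(mean_squared_error(targets, predictions))
```

Higher accuracy is better, while lower mean squared error is better, which is worth keeping in mind when reading the charts for each metric.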

If your deployment uses TF Serving, some metrics such as tensorflow:core:direct_session_runs, tensorflow:cc:saved_model:load_attempt_count, etc. will be logged automatically.

Custom metrics

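You can log custom user metrics from inside an experiment or deployment using the Python CLI utils (the gradient_utils package), which are based on the Prometheus Python Client. Here's a trivial example:

from datetime import datetime, timedelta
from random import randint
from time import sleep

from gradient_utils.metrics import MetricsLogger

# Labels attached to every metric pushed by this logger
logger = MetricsLogger(grouping_key={'ProjectA': 'SomeLabel'})

# Register one gauge and one counter by name
logger.add_gauge("Gauge")
logger.add_counter("Counter")

# The run duration is an arbitrary choice for this example
endAt = datetime.now() + timedelta(seconds=60)

while datetime.now() <= endAt:
    randNum = randint(1, 100)
    logger["Gauge"].set(randNum)  # a gauge can be set to any value
    logger["Counter"].inc()       # a counter only ever increases
    logger.push_metrics()         # push the current values
    sleep(5)                      # arbitrary pause between pushes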
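
The difference between the two metric types used here follows Prometheus conventions: a gauge holds a value that can go up or down, while a counter only increases. The sketch below illustrates those semantics with two small stand-in classes. `SketchGauge` and `SketchCounter` are hypothetical names written for this illustration; they are not the gradient_utils or Prometheus classes and do not push anything anywhere.

```python
class SketchGauge:
    """Hypothetical stand-in for a gauge: a value that can move in either direction."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        # Overwrite the current value with the latest reading
        self.value = float(value)


class SketchCounter:
    """Hypothetical stand-in for a counter: a value that only grows."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1):
        # Reject negative increments so the counter never decreases
        if amount < 0:
            raise ValueError("a counter can only be incremented by a non-negative amount")
        self.value += amount


gauge = SketchGauge()
counter = SketchCounter()

# Illustrative readings only
for reading in [42, 17, 88]:
    gauge.set(reading)  # keeps only the most recent reading
    counter.inc()       # counts how many readings were seen

print(gauge.value)    # 88.0
print(counter.value)  # 3.0
```

As a rule of thumb, a gauge suits values such as a current loss or queue length, and a counter suits totals such as processed batches or handled requests.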

Figure: System Metrics showing CPU and Memory Usage