Metrics Overview
This guide describes the different types of evaluation metrics and how you can view them.
Experiments and Deployments on Gradient can record metrics, which are available both in real time and after a run has finished. Gradient displays these metrics in the web UI, and they can also be queried or streamed via the CLI.
We log three different kinds of metrics: hardware metrics, framework metrics, and custom user metrics.
Note: Framework and custom metrics are only available in a Gradient Private Cluster. Contact Sales for inquiries!
All Gradient workloads, such as Experiments and Deployments, monitor and track CPU, memory, and network usage. If the machine is equipped with a GPU, GPU metrics are tracked as well.
Framework metrics are evaluation metrics reported by your ML framework. For example, accuracy and mean squared error are two common metrics for classification and regression, respectively.
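As a quick illustration of these two metrics, here is a minimal sketch using scikit-learn (chosen here purely as an example; any framework's metric functions follow the same pattern, and the data values are placeholders):

```python
from sklearn.metrics import accuracy_score, mean_squared_error

# Classification: accuracy is the fraction of predictions matching the labels.
y_true_cls = [0, 1, 1, 0]
y_pred_cls = [0, 1, 0, 0]
print(accuracy_score(y_true_cls, y_pred_cls))  # 0.75

# Regression: MSE is the mean squared difference between predictions and targets.
y_true_reg = [2.0, 3.5, 4.0]
y_pred_reg = [2.5, 3.0, 4.0]
print(mean_squared_error(y_true_reg, y_pred_reg))  # ~0.1667
```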
If your deployment uses TF Serving, some metrics, such as `tensorflow:core:direct_session_runs` and `tensorflow:cc:saved_model:load_attempt_count`, will be logged automatically.
You can log custom user metrics from inside an experiment or deployment using the Python CLI utils, which are based on the Prometheus Python Client.
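Here's a trivial example, sketched directly against the underlying `prometheus_client` package (the metric name, pushgateway address, and job label are illustrative assumptions; the Gradient utils may expose their own wrappers around these calls):

```python
# Minimal sketch using the Prometheus Python Client directly.
# The gateway address, job name, and metric name below are placeholders.
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()

# A Gauge holds a value that can go up or down, e.g. an accuracy or loss.
accuracy = Gauge("train_accuracy", "Model accuracy on the training set",
                 registry=registry)
accuracy.set(0.92)

# Push the collected metrics to a Prometheus pushgateway so they can be
# scraped and displayed.
push_to_gateway("localhost:9091", job="my_experiment", registry=registry)
```

Inside a Gradient workload the push target would be whatever endpoint the utils configure for you; the overall pattern of registering a metric, setting its value, and pushing is the same.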