Gradient Docs

Getting Started with Notebooks


Last updated 4 years ago

Objectives

  • Launch a Jupyter Notebook with a single click

  • Access existing datasets available within Gradient

  • Train a machine learning model

  • Save the model for inferencing

Introduction

Gradient provides one-click access to Jupyter Notebooks. You can choose pre-configured environments to launch Notebook instances or create a container with custom environments.

In this walkthrough, we will launch a Jupyter Notebook to train a logistic regression model based on the MNIST dataset. Gradient comes with a set of datasets that are readily available at the /datasets location. Instead of downloading the MNIST dataset to local storage, we will access the existing dataset.

We will also learn how to save the model for inferencing by persisting the final joblib file to the /storage location.

Launching a Notebook Instance

Select the Gradient product and then click on the Notebooks tab to create a new notebook.

After naming your notebook, the next step is to choose a pre-configured environment or container.

Within the container picker, select Filter > All and then locate the container called Jupyter Notebook Data Science Stack. This container will come with the core modules needed for our model.

In the next step, choose the machine type. Since this model doesn't require a high-end multi-GPU machine, a low-cost instance is sufficient. Select the Free GPU machine to use a free GPU instance.

The notebook will now provision. The notebook status will be Provisioning or Pending for a minute or two.

We are now ready to launch the Notebook. Choose the Python 3 option under the Notebooks section.

Rename the Notebook to something meaningful. With that, we are ready to train the model.

Training the Model

Start by importing the modules. We are using Scikit-learn and related modules for this model. Since the environment doesn't include the joblib module, we will install it before importing it. This is a one-time task that runs at the beginning of the training job.
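A first cell along these lines covers the setup. This is a sketch; the exact module list in the original notebook may differ slightly:

```python
# The Data Science Stack container does not ship with joblib, so install it
# once from a notebook cell before importing it:
# !pip install joblib

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
import joblib
```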

Next, we will create a couple of helper functions that load the dataset and change its shape into the format Scikit-learn expects.

We will now load the MNIST dataset from the /datasets location. You can browse these files within the Jupyter environment.

The loadMNIST helper function loads the dataset and converts it into a NumPy array.
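A minimal loadMNIST can parse the IDX-format MNIST files directly. The function name follows the tutorial, but the exact file names and layout under /datasets/mnist are assumptions:

```python
import struct
import numpy as np

def loadMNIST(image_path, label_path):
    """Parse IDX-format MNIST files (e.g. train-images-idx3-ubyte under
    /datasets/mnist) into NumPy arrays of images and labels."""
    with open(image_path, "rb") as f:
        # Header: magic number, image count, rows, cols (big-endian uint32).
        _, n, rows, cols = struct.unpack(">IIII", f.read(16))
        images = np.frombuffer(f.read(), dtype=np.uint8).reshape(n, rows, cols)
    with open(label_path, "rb") as f:
        # Header: magic number, label count (big-endian uint32).
        _, n = struct.unpack(">II", f.read(8))
        labels = np.frombuffer(f.read(), dtype=np.uint8)
    return images, labels
```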

Let's verify that the dataset loaded correctly by visualizing a few randomly chosen data points.
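One way to spot-check the data is a small matplotlib grid. This sketch uses stand-in arrays in place of the loaded MNIST data:

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe; in Jupyter, %matplotlib inline works instead
import matplotlib.pyplot as plt
import numpy as np

# Stand-in arrays shaped like the loaded MNIST data.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=100)

# Plot nine randomly chosen digits with their labels as titles.
idx = rng.choice(len(images), size=9, replace=False)
fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for ax, i in zip(axes.flat, idx):
    ax.imshow(images[i], cmap="gray")
    ax.set_title(int(labels[i]))
    ax.axis("off")
fig.savefig("sample_digits.png")
```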

Before we pass the training and test data to Scikit-learn's LogisticRegression object, we need to reshape it.
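The reshape is a one-liner per split: flatten each 28x28 image into a 784-feature row. Stand-in arrays are used here; the real ones come from the loadMNIST helper:

```python
import numpy as np

# Stand-in arrays; the real MNIST splits are 60000 and 10000 images of 28x28.
train_images = np.zeros((60, 28, 28), dtype=np.uint8)
test_images = np.zeros((10, 28, 28), dtype=np.uint8)

# Flatten each image into a 784-feature row and scale pixels to [0, 1].
X_train = train_images.reshape(len(train_images), -1) / 255.0
X_test = test_images.reshape(len(test_images), -1) / 255.0
```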

We are now ready to fit a logistic regression model to the data.
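Fitting then looks like the following. The solver and max_iter settings are illustrative assumptions, and stand-in data replaces the reshaped MNIST arrays:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in features and labels; in the notebook these are the reshaped MNIST arrays.
rng = np.random.default_rng(0)
X_train = rng.random((200, 64))
y_train = rng.integers(0, 10, 200)

clf = LogisticRegression(solver="lbfgs", max_iter=200)
clf.fit(X_train, y_train)
```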

Let’s call the predict method to see how accurate our model is. We will use the output of this to generate a confusion matrix.

This prints a confusion matrix of the predicted digits against the true labels.
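A sketch of this step, with a stand-in model and held-out split in place of the trained MNIST model and its test data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Stand-in model and held-out split; in the notebook, clf, X_test, and y_test
# already exist from the earlier cells.
rng = np.random.default_rng(1)
X = rng.random((300, 64))
y = rng.integers(0, 10, 300)
clf = LogisticRegression(max_iter=200).fit(X[:250], y[:250])

# Predict on the held-out rows and tabulate predictions vs. true labels.
y_pred = clf.predict(X[250:])
cm = confusion_matrix(y[250:], y_pred)
print(cm)
```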

Finally, we will persist the trained model to /storage/mnist so we can access it later. A model saved there as model.pkl is available to other Notebooks and Jobs launched within your account.
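Persisting the model comes down to a joblib.dump call. This sketch uses a stand-in model and falls back to a local directory when /storage is not mounted, as it would be outside Gradient:

```python
import os
import joblib
from sklearn.linear_model import LogisticRegression

# Stand-in for the trained MNIST model.
model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

# /storage is Gradient's persistent storage; fall back locally elsewhere.
out_dir = "/storage/mnist" if os.path.isdir("/storage") else "mnist"
os.makedirs(out_dir, exist_ok=True)
model_path = os.path.join(out_dir, "model.pkl")
joblib.dump(model, model_path)
```

Reloading it later from any Notebook or Job is the mirror image: `joblib.load(model_path)`.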

You can use the Jupyter environment to navigate to the /storage/mnist directory to find the saved model.

This example is available in our ML Showcase! You can clone the notebook there.

You can also download the completed Jupyter Notebook and upload it to the VM.