Gradient Docs
1.0.0

Public Datasets Repository


Last updated 5 years ago

This feature is only available in the hosted Gradient version. Contact Sales to learn more.

Jobs and notebooks have access to a read-only directory that is mounted at /datasets. This directory includes the following public datasets (with many more to come).
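As a rough sketch of how a job or notebook might check what is mounted, the snippet below lists the subdirectories of /datasets. The function name `list_public_datasets` is hypothetical and not part of any Gradient API; the only detail taken from this page is the /datasets mount point. It assumes each dataset lives in its own top-level directory.

```python
from pathlib import Path


def list_public_datasets(root="/datasets"):
    """Return the sorted names of the directories found directly under root.

    Hypothetical helper: only the /datasets mount point comes from the docs.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # Outside a hosted Gradient job or notebook the mount may not exist.
        return []
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in list_public_datasets():
        print(name)
```

Because the directory is read-only, code that needs to modify or unpack files would have to write its output somewhere else.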

List of Public Datasets

| Name | Path | Description | Source |
| --- | --- | --- | --- |
| Fast.ai | /datasets/fastai/ | Paperspace's Fast.ai template is built for getting up and running with the popular Fast.ai MOOC, Practical Deep Learning for Coders. | http://files.fast.ai/data/ |
| CelebA | /datasets/celebA/ | CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. | http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html |
| LSUN | /datasets/lsun/ | Contains around one million labeled images for each of 10 scene categories and 20 object categories. | http://lsun.cs.princeton.edu/2017/, http://www.yf.io/p/lsun |
| MNIST | /datasets/mnist/ | The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. | http://yann.lecun.com/exdb/mnist/ |
| COCO | /datasets/coco | | http://cocodataset.org/ |
| Selfie | /datasets/selfie | The Selfie dataset contains 46,836 selfie images annotated with 36 different attributes divided into several categories. | http://crcv.ucf.edu/data/Selfie/ |
| StyleGan | /datasets/stylegan | StyleGan is a style-based generator architecture for generative adversarial networks. This dataset allows the generator to produce photographs of people. | https://github.com/NVlabs/stylegan |
| OpenSLR | /datasets/openslr | Open Speech and Language Resources. | https://www.openslr.org/resources.php |
| Self Driving Demo | /datasets/self-driving-demo-data | A dataset by comma.ai that includes over 33 hours of commute on California's 280 highway. | https://github.com/commaai/comma2k19 |
| Sentiment140 | /datasets/sentiment140 | Sentiment140 allows you to discover the sentiment of a brand, product, or topic on Twitter. | http://cs.stanford.edu/people/alecmgo/trainingandtestdata.zip |
| Tiny-imagenet-200 | /datasets/tiny-imagenet-200 | A subset of the ImageNet dataset created by the Stanford CS231n course. It spans 200 image classes with 500 training examples per class, plus 50 validation and 50 test examples per class. | http://cs231n.stanford.edu/tiny-imagenet-200.zip |
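The paths listed above can be kept in one place and checked before a training script starts. The following is a minimal sketch, not an official Gradient utility: `PUBLIC_DATASET_PATHS` and `missing_datasets` are hypothetical names, and the internal file layout of each dataset is not documented here, so the check only confirms that each directory exists.

```python
from pathlib import Path

# Paths copied from the list of public datasets on this page.
PUBLIC_DATASET_PATHS = {
    "Fast.ai": "/datasets/fastai/",
    "CelebA": "/datasets/celebA/",
    "LSUN": "/datasets/lsun/",
    "MNIST": "/datasets/mnist/",
    "COCO": "/datasets/coco",
    "Selfie": "/datasets/selfie",
    "StyleGan": "/datasets/stylegan",
    "OpenSLR": "/datasets/openslr",
    "Self Driving Demo": "/datasets/self-driving-demo-data",
    "Sentiment140": "/datasets/sentiment140",
    "Tiny-imagenet-200": "/datasets/tiny-imagenet-200",
}


def missing_datasets(paths=PUBLIC_DATASET_PATHS):
    """Return the names whose directories are not present on this machine."""
    return sorted(name for name, path in paths.items() if not Path(path).is_dir())


if __name__ == "__main__":
    absent = missing_datasets()
    if absent:
        print("Not mounted here:", ", ".join(absent))
    else:
        print("All listed public datasets are available.")
```

Run outside a hosted Gradient job or notebook, this would be expected to report every dataset as missing, since the /datasets mount only exists there.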