Files and storage
This guide explains the file management and storage architecture of Gradient Notebooks
Every notebook in Gradient has a file management interface.
The file manager within the notebook does not represent the full file structure of the notebook.
The full file structure of a notebook is as follows:
/notebooks is the directory that contains the files commonly displayed in the file manager of the notebook IDE.
/datasets is a persistent directory where public datasets are stored. Public datasets include a handful of datasets that Gradient makes available out of the box, such as MNIST.
/storage is a shared persistent directory and is accessible by any user who is part of the current team. It is the primary method for sharing data across notebooks and users. In the case of the Private Workspace team, the /storage volume cannot be shared with other users.
/storage/notebooks is a legacy volume that is no longer in use on newer Paperspace accounts.
For more information on persistent storage, please continue to the next sections.
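To see which of the directories listed above exist in a given environment, a quick check from a terminal can help. The following is a minimal sketch: dir_status is a hypothetical helper name, and the output depends on the account and notebook in use.

```shell
# Report whether a directory exists in the current environment.
dir_status() {
  if [ -d "$1" ]; then echo "present"; else echo "missing"; fi
}

# Directory names are taken from the list above.
for dir in /notebooks /datasets /storage /storage/notebooks; do
  printf '%-20s %s\n' "$dir" "$(dir_status "$dir")"
done
```

On newer accounts, /storage/notebooks would be expected to show as missing, since it is described as a legacy volume.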
For an overview of the directory layout, refer to Introduction to the file structure of Gradient Notebooks.
Files stored in the file manager are persisted across notebook sessions. This is the same directory that is represented by the yellow box labeled { notebook IDE } in the previous section.
The notebook must be in the Running state to display files. Offline file view for notebooks is currently under development.
Use the file management tab to upload data, organize files and folders, and download files stored in a notebook.
Additional options such as renaming, duplicating, and deleting files and folders are available by clicking the menu icon on the individual entity.
There are multiple ways to upload files to a notebook, which are discussed in the following sections.
The simplest way to upload to a notebook is to click the Upload icon in the file manager.
The Upload feature in the file manager provides upload capability for many (but not all) situations. If we need to upload a large number of files or a large dataset, we are better served using tools in code.
Note that a notebook must be in the Running or online state to upload data.
To upload a large number of files or a large amount of data, it is best to use command-line tools such as curl, Wget, or gdown.
As an example, we can use Wget to download the Stanford Dogs dataset to our notebook. The command downloads the dataset to our current folder.
That's all there is to it! We can also run the same command from the terminal if we are on the Pro or Growth subscription plans.
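The Wget download described above can be sketched as follows. This is a minimal sketch, not the exact command from the original guide: DATASET_URL is a placeholder rather than the real address of the Stanford Dogs dataset, and fetch_dataset and saved_name are hypothetical helper names. The -c flag asks Wget to resume a partial download, and -P sets the target directory.

```shell
# DATASET_URL is a placeholder, not the real dataset address; replace it before use.
DATASET_URL="${DATASET_URL:-https://example.com/path/to/dataset.tar}"

# Download a URL into a target directory (default: the current folder).
# -c resumes a partially downloaded file instead of starting over.
fetch_dataset() {
  wget -c -P "${2:-.}" "$1"
}

# Name of the file that wget would save for a given URL.
saved_name() {
  basename "$1"
}

echo "Would save $(saved_name "$DATASET_URL") from $DATASET_URL"
# Uncomment once DATASET_URL points at a real file:
# fetch_dataset "$DATASET_URL"
```

In a notebook cell, the equivalent is a single line that starts with !, such as !wget -c followed by the URL.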
To download a file from a notebook, use the Download feature located in the three-dot menu in the file manager.
As with all data operations, a notebook must be running in order to download data.
Data can be shared between users on a team and between notebooks that belong to users on a team.
Access to shared persistent storage must be done through code, either via the notebook terminal or via a code cell within a notebook, as there is currently no way to access shared persistent storage from the GUI.
We can access shared persistent storage from a code cell within a notebook using the ! operator and issuing our bash commands on a single line connected with the && operator. Each ! line runs in its own shell, so a cd on one line does not carry over to the next.
For example, to create a new directory within our persistent /storage directory from a notebook code cell, we change into /storage and run mkdir on the same line, joined with &&.
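The reason for chaining commands on one line can be shown outside a notebook. The sketch below uses sh -c to mimic the way each ! line starts its own short-lived shell; it is an illustration under that assumption, and the final comment shows what the corresponding notebook cell might look like.

```shell
# Each "!" line in a notebook cell runs in its own short-lived shell.
# sh -c is used here to mimic that behaviour in an ordinary terminal.
start_dir="$(pwd)"

sh -c 'cd /'                       # the directory change ends with this shell
separate="$(sh -c 'pwd')"          # a new shell starts from the original directory
chained="$(sh -c 'cd / && pwd')"   # chained with &&, pwd runs after the cd

echo "separate lines: $separate"
echo "one line:       $chained"

# The equivalent notebook cell for the example in the text would be:
#   !cd /storage && mkdir data && ls
```

Because && only runs the next command when the previous one succeeded, a failed cd stops the chain before anything is created in the wrong place.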
We can also access persistent storage via the terminal, as described in the next section.
The terminal feature requires Gradient Pro or Gradient Growth subscriptions.
To access persistent storage in a Gradient Notebooks terminal, we can use the cd command to change into the persistent directory /storage.
Let's say we'd like to create a new persistent directory called data. We can accomplish this by running mkdir data from within /storage.
We can now use the directory located at /storage/data to store any files we need to access across users and notebooks.
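The terminal steps above can be sketched as follows. This is a minimal sketch under a few assumptions: STORAGE_ROOT and notes.txt are hypothetical names used only for illustration, mkdir -p is used so the command can be repeated safely, and the fallback to a temporary directory exists only so the sketch can be tried outside Gradient.

```shell
# Fall back to a temporary directory when /storage is absent or read-only
# (for example, outside Gradient), so the sketch can be tried anywhere.
STORAGE_ROOT="${STORAGE_ROOT:-/storage}"
[ -d "$STORAGE_ROOT" ] && [ -w "$STORAGE_ROOT" ] || STORAGE_ROOT="$(mktemp -d)"

# Terminal steps from the text: change into storage, then create "data".
cd "$STORAGE_ROOT"
mkdir -p data

# notes.txt is a hypothetical file used only for illustration.
echo "shared across notebooks" > data/notes.txt

# Another notebook on the same team could read it back by its absolute path.
cat "$STORAGE_ROOT/data/notes.txt"
```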
Storage in Gradient is scoped to the team level. By default, the storage allowance depends on the subscription tier: 5 GB, 15 GB, 50 GB, or unlimited (∞ GB).
Excess storage is billed at $0.29 per GB per month, prorated for the duration of the month.
As an example, if we are on the Pro plan, which grants us 15 GB of storage, and we use 50 GB of storage for an entire month, we will be billed (50 - 15) * 0.29 = $10.15 on top of our normal bill.
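The arithmetic above can be reproduced with a small helper. This is a sketch for a full month of usage only: overage_cost is a hypothetical function name, the $0.29 rate comes from the text, and proration for partial months is not modelled.

```shell
# Overage cost for a full month. Arguments: GB used, GB included in the plan.
# The rate of $0.29 per GB per month comes from the text; proration is not modelled.
overage_cost() {
  awk -v used="$1" -v included="$2" -v rate=0.29 'BEGIN {
    extra = used - included
    if (extra < 0) extra = 0      # no charge when usage is within the allowance
    printf "%.2f\n", extra * rate
  }'
}

overage_cost 50 15   # the Pro plan example from the text: prints 10.15
overage_cost 10 15   # usage within the allowance: prints 0.00
```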
To view storage utilization, visit the Storage tab in the workspace settings.
[Screenshot: the Storage tab for a new team that is not yet using any volume storage]
[Screenshot: the Storage tab for a Private Workspace team that is using a good amount of storage]
If we expect to be billed for storage overages, we can use the Utilization tab to explore our storage use further.
To upload a large number of files or a large amount of data, we recommend command-line tools such as curl, Wget, or gdown.
These tools are simple to run in a notebook cell using the ! operator or in the terminal, and they handle large data transfers more reliably than the GUI.
To upload data to shared persistent storage, try changing into a directory within /storage before running the upload command.
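One way to combine the two steps is a helper that changes directory and downloads in a subshell, so the caller's working directory is left untouched. This is a sketch under assumptions: upload_to_storage is a hypothetical name, and the directory and URL in the commented call are placeholders.

```shell
# Download a URL into a target directory, for example one under /storage.
# The body runs in a subshell, so the caller's working directory is unchanged.
# Arguments: target directory, file URL.
upload_to_storage() (
  cd "$1" || exit 1     # stop before downloading if the directory is missing
  wget -c "$2"          # -c resumes a partial download
)

# Placeholder values; replace both before running:
# upload_to_storage /storage/data "https://example.com/path/to/dataset.tar"
```

In a notebook cell, the same idea is the one-line form: ! followed by a cd into the target directory, then &&, then the wget command.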
Downloading files is as simple as clicking the menu icon on a file (the three vertical dots) and clicking Download.
Note that the notebook must be in the Running state in order to download data.