As opposed to standard platform storage, disks provide a faster, short-lived, non-partitioned space for jobs. This makes disks perfect for manipulating large amounts of data.
For example, if you need to process a large dataset, it's better to create a disk, copy all of the required data from the storage to this disk, and only then perform the necessary operations. In some cases, this can reduce processing time by about 10-20%, depending on the amount of data and the operations you perform on it. After the data is processed, you can copy the results back to the storage for further use.
The neuro disk create command is used to create disks. When creating a disk, you need to specify its storage capacity as a parameter. You can also specify the disk's name, which makes it easier to refer to the disk in later commands. For example:
> neuro disk create --name test-disk 40G
This will create a disk with a storage capacity of 40 * 2^30 bytes and a name of test-disk. Note that some servers can have a predefined granularity (for example, 1 GB), so the values you provide will be rounded up in such cases.
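To make the size arithmetic concrete, here is a minimal sketch of how a requested size might be rounded up to a server's granularity. The round_up helper is hypothetical and not part of the neuro CLI, and it assumes that the granularity is 2^30 bytes and that the server rounds up to the next whole multiple; the actual server behavior may differ.

```shell
# Size in bytes that the "40G" argument denotes, per the text above: 40 * 2^30.
gib=$((1024 * 1024 * 1024))
requested=$((40 * gib))

# Hypothetical helper (not a neuro command): round a byte count ($1)
# up to the nearest multiple of a granularity ($2), both in bytes.
round_up() {
  echo $(( (($1 + $2 - 1) / $2) * $2 ))
}

round_up "$requested" "$gib"            # prints 42949672960 (already a multiple)
round_up "$((requested + 1))" "$gib"    # prints 44023414784 (41 * 2^30)
```

Under these assumptions, a request that is already a multiple of the granularity is left unchanged, while anything above it is bumped to the next multiple.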
Disks have a limited lifetime. This lifespan is calculated from the point in time the disk was last used. The default lifespan of a disk is 1 day. You can also specify a custom lifespan by providing the corresponding value in the --timeout-unused parameter when creating a disk:
> neuro disk create --name test-disk --timeout-unused 2d4h 40G
Created at a second from now
Timeout unused 2d 4h
This will set the disk's lifespan to 2 days and 4 hours after last usage.
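As an illustration of what a value such as 2d4h amounts to, the sketch below converts it to seconds. The timeout_to_seconds function is a hypothetical helper, not a neuro command, and it handles only the d and h units that appear in the example above; any other unit is rejected rather than guessed at.

```shell
# Hypothetical helper (not a neuro command): convert a lifespan such as
# "2d4h" to seconds. Only the "d" and "h" units shown above are handled.
timeout_to_seconds() {
  spec=$1
  total=0
  while [ -n "$spec" ]; do
    num=${spec%%[dh]*}          # digits before the first unit letter
    rest=${spec#"$num"}         # remainder, starting at the unit letter
    unit=${rest%"${rest#?}"}    # first character of the remainder
    spec=${rest#?}              # continue after the unit letter
    case $unit in
      d) total=$((total + num * 86400)) ;;
      h) total=$((total + num * 3600)) ;;
      *) echo "unsupported lifespan: $1" >&2; return 1 ;;
    esac
  done
  echo "$total"
}

timeout_to_seconds 2d4h   # prints 187200 (2 * 86400 + 4 * 3600)
```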
To mount a disk to a job's container, use the --volume parameter in the run command. The syntax for this is --volume disk:<disk-id-or-name>:/mnt/disk. You need to replace <disk-id-or-name> with the disk's actual ID or name. You can also adjust the /mnt/disk part, which defines the path within the running job where the disk will be mounted. The disk's ID and name are shown when the disk is created.
You cannot mount a subpath of a disk into a job (as opposed to storage: volumes), so a volume specification that points to a path inside the disk is invalid.
Here's an example of mounting a disk and writing a file to it:
> neuro run --volume disk:test-disk:/mnt/test-disk ubuntu -- bash -c 'echo "Hi" > /mnt/test-disk/echo-file'
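The mount rules above can be sketched as a small shell helper that builds the --volume value and refuses anything that looks like a subpath. The disk_volume function is hypothetical and not part of the neuro CLI; it assumes the disk is given as a bare ID or name, so it treats any slash in that argument as an attempted subpath.

```shell
# Hypothetical helper (not a neuro command): print the --volume value for
# mounting a whole disk ($1, a bare ID or name) at a path in the job ($2).
disk_volume() {
  disk=$1
  mount_path=$2
  case $disk in
    */*)
      # A slash suggests a path inside the disk, which cannot be mounted.
      echo "subpaths of a disk cannot be mounted: $disk" >&2
      return 1
      ;;
  esac
  printf 'disk:%s:%s\n' "$disk" "$mount_path"
}

disk_volume test-disk /mnt/test-disk   # prints disk:test-disk:/mnt/test-disk

# Possible usage with the run command from the example above:
#   neuro run --volume "$(disk_volume test-disk /mnt/test-disk)" ubuntu -- ls /mnt/test-disk
```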