In this tutorial, we describe the recommended way to train a simple machine learning model on the Neuro platform. As our ML engineers prefer PyTorch over other ML frameworks, we show the training and evaluation of one of the basic PyTorch examples.
We assume that you have already signed up to the platform, installed the Neuro CLI, and logged in (see Getting Started).
To simplify working with the Neuro platform and to help establish best practices in the ML environment, we provide a project template. This template consists of the recommended directories and files, and it is designed to work smoothly with our base environment.
Let’s initialize a new project from this template:
$ neuro project init
This command asks several questions about your project:
project_name [Name of the project]: Neuro Tutorial
code_directory [modules]: rnn
After the command finishes, you get the following project structure:
├── .neuro/ <- live.yaml file with commands for managing the training environment
├── data/ <- training and testing datasets (we do not keep it under source control)
├── notebooks/ <- Jupyter notebooks
├── rnn/ <- source code of models
├── .gitignore <- default .gitignore for a Python project
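As a quick sanity check after initialization, the layout above can be verified with a short Python script. This is a minimal sketch, not part of the template: the helper name missing_entries is hypothetical, and the list of entries assumes the code directory was named rnn, as in the example answers above.

```python
from pathlib import Path

# Entries taken from the tree above; "rnn" is the code_directory
# chosen in the example answers (an assumption, adjust to your own name).
EXPECTED_DIRS = (".neuro", "data", "notebooks", "rnn")
EXPECTED_FILES = (".gitignore",)


def missing_entries(project_root):
    """Return the expected directories and files absent under project_root.

    Hypothetical helper for illustration only; directories are reported
    with a trailing slash so they are easy to tell apart from files.
    """
    root = Path(project_root)
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing
```

Calling missing_entries(".") from the project root returns an empty list when every entry shown in the tree is present.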
When you start working with a project on the Neuro platform, the basic flow looks as follows: you set up the remote environment, upload data and code to your storage, run training, and evaluate the results.
To set up the remote environment, run:
$ neuro-flow build myimage
This command runs a lightweight job (via neuro run), uploads the dependency files apt.txt and requirements.txt (via neuro cp), installs the dependencies (via neuro exec), performs other preparatory steps, and then creates the base image from this job and pushes it to the platform (via neuro save, which works similarly to docker commit).
To upload data and code to your storage, run:
$ neuro-flow upload ALL
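If you repeat these setup steps often, they can be chained from a script. The following is a minimal sketch under stated assumptions: the run_steps helper is hypothetical, it only wraps the two commands shown above, and it defaults to a dry run that executes nothing.

```python
import shlex
import subprocess

# Commands copied verbatim from this tutorial, in the order they are run.
SETUP_STEPS = (
    "neuro-flow build myimage",
    "neuro-flow upload ALL",
)


def run_steps(steps=SETUP_STEPS, dry_run=True):
    """Run each command in order and stop at the first failure.

    Hypothetical helper for illustration only. With dry_run=True (the
    default) nothing is executed; the parsed argument lists are returned
    so they can be inspected first.
    """
    parsed = [shlex.split(step) for step in steps]
    if not dry_run:
        for argv in parsed:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so the upload is not attempted if the build fails.
            subprocess.run(argv, check=True)
    return parsed
```

Calling run_steps(dry_run=False) from the project root executes both commands; it assumes neuro-flow is installed and that you are logged in to the platform.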
To run training, you need to specify the training command in .neuro/live.yaml, and then run neuro-flow run train:
Open .neuro/live.yaml in an editor.
Find the following lines (make sure you're looking at the train job, not multitrain, which has a very similar section):