Training Your First Model

Introduction

In this tutorial, we describe the recommended way to train a simple machine learning model on the Neuro platform. Since our ML engineers prefer PyTorch over other ML frameworks, we will walk through training and evaluating one of the basic PyTorch examples.
We assume that you have already signed up for the platform, installed the Neuro CLI, and logged in (see Getting Started).
We base our example on the Classifying Names with a Character-Level RNN tutorial.

Initializing a new project

To simplify working with Neuro Platform and to help establish the best practices in the ML environment, we provide a project template. This template consists of the recommended directories and files. It is designed to operate smoothly with our base environment.
Let’s initialize a new project from this template:
$ neuro project init
This command asks several questions about your project:
project_name [Name of the project]: Neuro Tutorial
project_slug [neuro-tutorial]:
code_directory [modules]: rnn

Project structure

After you execute the command mentioned above, you get the following structure:
neuro-tutorial
├── .neuro/            <- live.yaml file with commands for manipulating the training environment
├── data/              <- training and testing datasets (not kept under source control)
├── notebooks/         <- Jupyter notebooks
├── rnn/               <- source code of models
├── .gitignore         <- default .gitignore for a Python project
├── README.md          <- auto-generated informational file
├── apt.txt            <- list of system packages to be installed in the training environment
├── requirements.txt   <- list of Python dependencies to be installed in the training environment
└── setup.cfg          <- linter settings (Python code quality checking)
When you run a job (for example, via neuro-flow run jupyter), the directories are mounted to the job as follows:
Mount Point           Description                 Storage URI
/project/data/        Training / testing data     storage:neuro-tutorial/data/
/project/rnn/         User's Python code          storage:neuro-tutorial/rnn/
/project/notebooks/   User's Jupyter notebooks    storage:neuro-tutorial/notebooks/
/project/results/     Logs and results            storage:neuro-tutorial/results/
This mapping is defined in the volumes section of .neuro/live.yaml and can be adjusted if needed.
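For reference, here is a sketch of how one such volume may be declared in .neuro/live.yaml. The field names follow the neuro-flow live workflow syntax; check the file generated by your template, as the exact layout may differ between template versions:

```yaml
volumes:
  data:
    # Location on the platform storage
    remote: storage:$[[ flow.project_id ]]/data
    # Mount point inside the job
    mount: /project/data
    # Matching local directory, used by upload/download commands
    local: data
```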

Filling the project

Now we need to fill the newly created project with content:
    Change working directory:
$ cd neuro-tutorial
    Download the tutorial's script into the code folder:
$ curl https://raw.githubusercontent.com/pytorch/tutorials/master/intermediate_source/char_rnn_classification_tutorial.py -o rnn/char_rnn_classification_tutorial.py
    Download the dataset, extract the ZIP's content, and put it in your data folder:
$ curl https://download.pytorch.org/tutorial/data.zip -o data/data.zip && unzip data/data.zip && rm data/data.zip

Training and evaluating the model

When you start working with a project on the Neuro platform, the basic flow looks as follows: you set up the remote environment, upload data and code to your storage, run training, and evaluate the results.
To set up the remote environment, run:
$ neuro-flow build myimage
This command will run a lightweight job (via neuro run), upload apt.txt and requirements.txt, which list your dependencies (via neuro cp), install those dependencies (via neuro exec), perform other preparatory steps, and then create the base image from this job and push it to the platform (via neuro save, which works similarly to docker commit).
To upload data and code to your storage, run:
$ neuro-flow upload ALL
To run training, you need to specify the training command in .neuro/live.yaml, and then run neuro-flow run train:
    open .neuro/live.yaml in an editor,
    find the following lines (make sure you're looking at the train job, not multitrain which has a very similar section):
bash: |
    cd $[[ volumes.project.mount ]]
    python -u $[[ volumes.code.mount ]]/train.py --data $[[ volumes.data.mount ]]
    and replace it with the following lines:
bash: |
    cd $[[ volumes.project.mount ]]
    python -u $[[ volumes.code.mount ]]/char_rnn_classification_tutorial.py
Now, you can run:
$ neuro-flow run train
and observe the output. You will see some checks performed at the beginning of the script, and then the model is trained and evaluated:
['data/names/German.txt', 'data/names/Polish.txt', 'data/names/Irish.txt', 'data/names/Vietnamese.txt',
 'data/names/French.txt', 'data/names/Japanese.txt', 'data/names/Spanish.txt', 'data/names/Chinese.txt',
 'data/names/Korean.txt', 'data/names/Czech.txt', 'data/names/Arabic.txt', 'data/names/Portuguese.txt',
 'data/names/English.txt', 'data/names/Italian.txt', 'data/names/Russian.txt', 'data/names/Dutch.txt',
 'data/names/Scottish.txt', 'data/names/Greek.txt']
Slusarski
['Abandonato', 'Abatangelo', 'Abatantuono', 'Abate', 'Abategiovanni']
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0.]])
torch.Size([5, 1, 57])
tensor([[-2.8248, -2.9118, -2.8999, -2.9170, -2.8916, -2.9699, -2.8785, -2.9273,
         -2.8397, -2.8539, -2.8764, -2.9278, -2.8638, -2.9310, -2.9546, -2.9008,
         -2.8295, -2.8441]], grad_fn=<LogSoftmaxBackward>)
('German', 0)
category = Vietnamese / line = Vu
category = Chinese / line = Che
category = Scottish / line = Fraser
category = Arabic / line = Abadi
category = Russian / line = Adabash
category = Vietnamese / line = Cao
category = Greek / line = Horiatis
category = Portuguese / line = Pinho
category = Vietnamese / line = To
category = Scottish / line = Mcintosh
5000 5% (0m 19s) 2.7360 Ho / Portuguese ✗ (Vietnamese)
10000 10% (0m 38s) 2.0606 Anderson / Russian ✗ (Scottish)
15000 15% (0m 58s) 3.5110 Marqueringh / Russian ✗ (Dutch)
20000 20% (1m 17s) 3.6223 Talambum / Arabic ✗ (Russian)
25000 25% (1m 35s) 2.9651 Jollenbeck / Dutch ✗ (German)
30000 30% (1m 54s) 0.9014 Finnegan / Irish ✓
35000 35% (2m 13s) 0.8603 Taverna / Italian ✓
40000 40% (2m 32s) 0.1065 Vysokosov / Russian ✓
45000 45% (2m 52s) 3.6136 Blanxart / French ✗ (Spanish)
50000 50% (3m 11s) 0.0969 Bellincioni / Italian ✓
55000 55% (3m 30s) 3.1383 Roosa / Spanish ✗ (Dutch)
60000 60% (3m 49s) 0.6585 O'Kane / Irish ✓
65000 65% (4m 8s) 4.7300 Satorie / French ✗ (Czech)
70000 70% (4m 27s) 0.9765 Mueller / German ✓
75000 75% (4m 46s) 0.7882 Attia / Arabic ✓
80000 80% (5m 5s) 2.1131 Till / Irish ✗ (Czech)
85000 85% (5m 25s) 0.5304 Wei / Chinese ✓
90000 90% (5m 44s) 1.6258 Newman / Polish ✗ (English)
95000 95% (6m 2s) 3.2015 Eberhardt / Irish ✗ (German)
100000 100% (6m 21s) 0.2639 Vamvakidis / Greek ✓

> Dovesky
(-0.77) Czech
(-1.11) Russian
(-2.03) English

> Jackson
(-0.92) English
(-1.65) Czech
(-1.85) Scottish

> Satoshi
(-1.32) Italian
(-1.81) Arabic
(-2.14) Japanese
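The torch.Size([5, 1, 57]) line in the output reflects how the tutorial encodes a name: one one-hot vector per character over a 57-symbol alphabet (the 52 ASCII letters plus " .,;'"). As a rough illustration, here is a minimal pure-Python sketch of that encoding (the function name line_to_one_hot is ours; the actual tutorial builds a PyTorch tensor instead of plain lists):

```python
import string

# Alphabet used by the tutorial: 52 ASCII letters plus " .,;'" -> 57 symbols.
ALL_LETTERS = string.ascii_letters + " .,;'"
N_LETTERS = len(ALL_LETTERS)  # 57

def line_to_one_hot(line):
    """Encode a name as one one-hot row per character: shape [len(line)][N_LETTERS]."""
    rows = []
    for ch in line:
        row = [0.0] * N_LETTERS
        row[ALL_LETTERS.index(ch)] = 1.0
        rows.append(row)
    return rows

encoded = line_to_one_hot("Jones")
print(len(encoded), N_LETTERS)   # 5 57 -- a 5-letter name over a 57-symbol alphabet
print(encoded[0].index(1.0))     # 35  -- position of 'J', matching the single 1. in the printed tensor
```

A 5-letter name thus becomes 5 rows of 57 values, which PyTorch stores as a tensor of shape [5, 1, 57] (the extra middle dimension is the batch size of 1).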