Accessing Object Storage in GCP

Introduction

This tutorial demonstrates how to access Google Cloud Storage from the Neuro platform. You will create a new Neuro project, a new GCP project, a service account, and a bucket that is accessible from a job running on the Neuro platform.
Make sure you have the Neu.ro CLI installed.
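To confirm that the CLI is installed and that you are logged in to the platform, you can run a quick check (a minimal sketch, assuming the standard neuro --version and neuro login commands are available in your CLI version):
# Check the installed CLI version and authenticate against the platform
neuro --version
neuro login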

Creating Neuro and GCP Projects

To create a new Neuro project, run:
neuro project init
cd <project-slug>
neuro-flow build myimage
It's good practice to limit the scope of access to a specific GCP project. To create a new GCP project, run:
PROJECT_ID=${PWD##*/} # name of the current directory
gcloud projects create $PROJECT_ID
gcloud config set project $PROJECT_ID
Make sure to set a billing account for your GCP project. See Creating and Managing Projects for details.
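If the project doesn't have billing enabled yet, you can also link a billing account from the command line. This is a sketch assuming the gcloud beta billing commands are available in your gcloud installation; the billing account ID below is a placeholder, so substitute your own:
# List the billing accounts you can use
gcloud beta billing accounts list
# Link one of them to the new project (placeholder account ID)
gcloud beta billing projects link $PROJECT_ID --billing-account 0X0X0X-0X0X0X-0X0X0X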

Creating a Service Account and Uploading an Account Key

First, create a service account for the job:
SA_NAME="neuro-job"
gcloud iam service-accounts create $SA_NAME \
  --description "Neuro Platform Job Service Account" \
  --display-name "Neuro Platform Job"
Then, download the account key:
gcloud iam service-accounts keys create ~/$SA_NAME-key.json \
  --iam-account $SA_NAME@$PROJECT_ID.iam.gserviceaccount.com
Make sure the newly created key file is located in your home directory (~/).
Create a new platform secret from this file:
neuro secret add gcp-key @~/$SA_NAME-key.json
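To make sure the secret was created, you can list the secrets stored on the platform (assuming your CLI version provides the neuro secret ls subcommand; only secret names are listed, never their values):
neuro secret ls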
Open .neuro/live.yaml, find the remote_debug section within jobs, and add the following lines at the end of remote_debug:
jobs:
  remote_debug:
    ...
    secret_files: '["secret:gcp-key:/var/secrets/gcp.json"]'
    additional_env_vars: '{"GOOGLE_APPLICATION_CREDENTIALS": "/var/secrets/gcp.json"}'

Creating a Bucket and Granting Access

Now, create a new bucket. Remember that bucket names are globally unique (see the bucket naming conventions for more information).
BUCKET_NAME="my-neuro-bucket-42"
gsutil mb gs://$BUCKET_NAME/
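To double-check that the bucket was created in the right project, you can list it (a quick sanity check; the -b flag makes gsutil ls show the bucket itself rather than its contents):
gsutil ls -b gs://$BUCKET_NAME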
Grant the service account access to the bucket:
# Permissions for gsutil:
PERM="storage.objectAdmin"
gsutil iam ch serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com:roles/$PERM gs://$BUCKET_NAME

# Permissions for client APIs:
PERM="storage.legacyBucketOwner"
gsutil iam ch serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com:roles/$PERM gs://$BUCKET_NAME
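You can verify that both role bindings are now attached to the service account by inspecting the bucket's IAM policy:
gsutil iam get gs://$BUCKET_NAME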

Testing

Create a file and upload it into your Google Cloud Storage bucket:
echo "Hello World" | gsutil cp - gs://$BUCKET_NAME/hello.txt
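Before moving on to the job, you can quickly confirm that the object landed in the bucket; this way, any failure later on points at credentials rather than at the upload:
gsutil ls gs://$BUCKET_NAME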
Change the default preset to cpu-small in .neuro/live.yaml to avoid consuming GPU resources for this test:
defaults:
  preset: cpu-small
Run a development job and connect to the job's shell:
neuro-flow run remote_debug
In your job's shell, activate the service account for the gcloud CLI:
gcloud auth activate-service-account --key-file $GOOGLE_APPLICATION_CREDENTIALS
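If activation succeeded, the service account should show up as the active credentialed account:
gcloud auth list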
Then try using gsutil to access your bucket:
gsutil cat gs://my-neuro-bucket-42/hello.txt
Please note that in the remote_debug, train, and jupyter jobs, the environment variable GOOGLE_APPLICATION_CREDENTIALS points to your key file, so you can use it to authenticate other libraries.
For instance, you can access your bucket via the Python API provided by the google-cloud-storage package:
>>> from google.cloud import storage
>>> bucket = storage.Client().get_bucket("my-neuro-bucket-42")
>>> text = bucket.get_blob("hello.txt").download_as_string()
>>> print(text)
b'Hello World\n'
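If the google-cloud-storage package is not preinstalled in your job image (this depends on the base image you build), you can install it inside the job before running the snippet above:
pip install google-cloud-storage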
To close the remote terminal session, press ^D or type exit.
Please don't forget to terminate your job when you don't need it anymore:
neuro-flow kill remote_debug