Understand the process involved in registering models
Passing environment variables to Gradient Workflows
Registering Tensorflow Models in Gradient
Training workloads in Gradient can generate machine learning models, which can be interpreted and stored in your Project's Models list. This list holds references to the model and checkpoint files generated during the training period as well as summary metrics associated with the model's performance, such as accuracy and loss.
In this tutorial, we will create a Workflow to generate a Keras model based on the Fashion MNIST dataset. We will learn techniques such as using the Git checkout action, passing environment variables to Workflows, and specifying the right container image.
The model is trained in Keras but is ultimately exported as a TensorFlow model through the tf.saved_model.simple_save method. This approach serializes the Keras session into a TensorFlow model.
The repository at https://github.com/gradient-ai/fashionmnist contains the code for training the model and running inference.
We will start by creating a project that can contain multiple Workflows we may run during the training. We'll use the CLI here but you can perform the action in the user interface.
gradient projects create --name Fashion
Now let's create our Workflow:
gradient workflows create --name fashion-mnist --projectId <id of project>
We will now start a Workflow run within the Workflow created above. Make a note of the Workflow id before proceeding further.
Download or copy the YAML training code to your computer.
defaults:
  env:
    apiKey: secret:api_key #Replace this secret with your own secret
  resources:
    instance-type: C5
jobs:
  CloneRepo:
    inputs:
      repo:
        type: volume
    uses: git-checkout@v1
    with:
      url: https://github.com/gradient-ai/fashionmnist.git
  TrainModel:
    env:
      MODEL_DIR: /my-trained-model
    needs:
      - CloneRepo
    inputs:
      repo:
        type: volume
    outputs:
      trained-model:
        type: dataset
        with:
          id: dsrvw1m30ymhiyt #Replace this id with your own dataset id
    uses: container@v1
    with:
      args:
        - bash
        - "-c"
        - >-
          cd /inputs/repo/train && python train.py && cp -R /my-trained-model /outputs/trained-model
      image: 'tensorflow/tensorflow:1.9.0'
  UploadModel:
    inputs:
      model: TrainModel.outputs.trained-model
    outputs:
      model-id:
        type: string
    needs:
      - TrainModel
    uses: create-model@v1
    with:
      name: trained-model
      type: Tensorflow
This YAML file incorporates several concepts that are important to understand:
The secret:api_key parameter masks your API key so it is not visible to others. To learn how to store an API key as a Secret, see the Gradient documentation on Secrets.
instance-type: C5 sets a default instance type in case a step does not specify an instance type.
MODEL_DIR passes an environment variable to the script. In our code, we decide the location to store the model based on the value defined in the MODEL_DIR environment variable.
image is a parameter that points the step to a Docker image used to execute the step. Note: This same training code can run on a GPU instance which would require using the following image:
TrainModel takes an outputs parameter which stores the model artifacts within a Gradient dataset. You must create a dataset before running the Workflow and add the id on this line.
UploadModel takes a type parameter that specifies the format of the model. In this case, we are passing in TensorFlow as the type. Frameworks other than TensorFlow are also supported.
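The MODEL_DIR handling described above can be sketched in a few lines of Python. This is a minimal illustration and not the repository's actual train.py: the function name `resolve_model_dir` and the fallback path `./models` are assumptions introduced here.

```python
import os

# Assumption: this fallback is illustrative only; the Workflow above sets
# MODEL_DIR explicitly to /my-trained-model.
DEFAULT_MODEL_DIR = "./models"


def resolve_model_dir(environ=os.environ):
    """Return the export directory named by MODEL_DIR, or the fallback."""
    return environ.get("MODEL_DIR", DEFAULT_MODEL_DIR)


if __name__ == "__main__":
    # A training script would pass this path to its export call.
    print("Model will be exported to", resolve_model_dir())
```

Reading the location from the environment keeps the script unchanged across runs: the Workflow decides where the model lands, and the copy step in the YAML only needs to agree with the same value.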
gradient workflows run \
  --id <workflow id> \
  --clusterId <if using a private cluster> \
  --path ./workflow.yaml
We can check if the output of the job is registered as a valid TensorFlow model with the following command.
gradient models list
| Name | ID | Model Type | Project ID |
| None | mosdnkkv1o1xuem | Tensorflow |
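If you want to check this output from a script, one option is to parse the table text. The sketch below assumes the CLI prints pipe-delimited rows like the ones shown above; the exact output format may differ between CLI versions, and the helper names `parse_models_table` and `has_tensorflow_model` are hypothetical.

```python
def parse_models_table(text):
    """Parse pipe-delimited table text into a list of dicts keyed by header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and any border rows
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited row is the header
            continue
        # zip stops at the shorter sequence, so short rows are tolerated
        rows.append(dict(zip(header, cells)))
    return rows


def has_tensorflow_model(text):
    """Return True if any row reports the Tensorflow model type."""
    return any(row.get("Model Type") == "Tensorflow"
               for row in parse_models_table(text))


# Sample taken from the output shown in this tutorial.
SAMPLE = """
| Name | ID | Model Type | Project ID |
| None | mosdnkkv1o1xuem | Tensorflow |
"""

if __name__ == "__main__":
    print(has_tensorflow_model(SAMPLE))
```

In practice you would feed the function the captured output of gradient models list instead of the pasted sample.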
You can also visit the Models section of Gradient UI to see a list of registered models.
After registering the model, we can turn it into a Deployment to perform inference.