A Gradient Workflow is composed of a series of steps that specify how to orchestrate computational tasks. Each step can communicate with other steps through what are known as inputs and outputs. There are three types of inputs and outputs: datasets, volumes, and strings. Understanding how each functions will help you craft concise and elegant Workflows.
The dataset type leverages the Gradient platform's native dataset primitive. Information stored within a dataset is not limited to any single type of data: a generic dataset can include anything from pretrained models to generated images to configuration files. Inherent to datasets is the notion of versions. Workflows can consume and produce new dataset versions, as well as tag new versions of existing datasets. Note: datasets must be defined in advance of being referenced in a Workflow. See the dataset documentation for more info.
Scenario 1: Consuming a dataset that already exists within Gradient
```yaml
inputs:
  my-dataset:
    type: dataset
    with:
      id: my-dataset-id
```
Scenario 2: Generating a new dataset version from a Workflow step
```yaml
my-job:
  with:
    args:
      - bash
      - '-c'
      - cp -R /my-trained-model /outputs/my-dataset
    image: bash:5
  outputs:
    my-dataset:
      type: dataset
      with:
        id: my-dataset-id
```
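If a later step needs the dataset version produced by `my-job`, it can reference the job's output directly. The sketch below assumes dataset outputs can be consumed with the same `job.outputs.name` reference syntax used by the volume and string examples later in this section; `my-downstream-job` is a hypothetical job name:

```yaml
my-downstream-job:
  needs:
    - my-job
  inputs:
    # Consume the dataset version produced by my-job above
    my-dataset: my-job.outputs.my-dataset
  with:
    args:
      - bash
      - '-c'
      - ls /inputs/my-dataset
    image: bash:5
```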
Unlike in, for example, GitHub Actions, multiple Gradient steps/actions are likely to execute on multiple compute nodes. To facilitate passing data between these nodes, Gradient Actions exposes the notion of volumes and volume passing.
Volumes enable actions such as the @git-checkout action. Volumes can be defined as input volumes, output volumes, or both. When a volume is an output, it is mounted at `/outputs` and is writeable. When a volume is an input, it is mounted at `/inputs` and is read-only.
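For example, the @git-checkout action clones a repository into an output volume that subsequent steps can consume. The sketch below is illustrative only: the repository URL is a placeholder, and the action's exact name, version tag, and parameters should be checked against the action's own documentation:

```yaml
CloneRepo:
  resources:
    instance-type: P4000
  # Placeholder action reference; version tag assumed by analogy with create-model@v1
  uses: git-checkout@v1
  with:
    url: https://github.com/example/example-repo
  outputs:
    # The cloned repository lands in this output volume,
    # available to downstream steps as an input volume
    repo:
      type: volume
```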
In this example, a volume is first created as an output and then used as an input in a subsequent job step:
```yaml
defaults:
  resources:
    instance-type: P4000
jobs:
  job1:
    with:
      args:
        - bash
        - -c
        - echo hello > /outputs/my-volume/testfile1; echo "wrote testfile1 to volume"
      image: bash
    outputs:
      my-volume:
        type: volume
  job2:
    needs:
      - job1
    with:
      args:
        - bash
        - -c
        - cat /inputs/my-volume/testfile1
      image: bash
    inputs:
      my-volume: job1.outputs.my-volume
```
Volumes currently cannot be used as an output beyond the job in which they were created; this limitation is planned to be removed in the future.
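In the meantime, a job that needs to pass data further down the chain can declare a new output volume of its own and copy the files forward. Here is a minimal sketch extending the example above; `job3` and `my-next-volume` are hypothetical names:

```yaml
  job3:
    needs:
      - job1
    inputs:
      # Read-only view of job1's output volume
      my-volume: job1.outputs.my-volume
    outputs:
      # A fresh volume that later steps can consume in turn
      my-next-volume:
        type: volume
    with:
      args:
        - bash
        - -c
        - cp /inputs/my-volume/testfile1 /outputs/my-next-volume/
      image: bash
```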
In some cases, you may need to pass a single value between Workflow steps. The string type makes this possible.
Scenario 1: Passing a string as a workflow-level input
```yaml
inputs:
  my-string:
    type: string
    with:
      value: "my string value"
jobs:
  job-1:
    resources:
      instance-type: P4000
    with:
      args:
        - bash
        - -c
        - cat /inputs/my-string
      image: bash:5
    inputs:
      my-string: workflow.inputs.my-string
```
Scenario 2: Passing a string between job steps
```yaml
defaults:
  resources:
    instance-type: P4000
jobs:
  job-1:
    with:
      args:
        - bash
        - -c
        - echo "string output from job-1" > /outputs/my-string; echo job-1 finished
      image: bash:5
    outputs:
      my-string:
        type: string
  job-2:
    with:
      args:
        - bash
        - -c
        - cat /inputs/my-string
      image: bash:5
    needs:
      - job-1
    inputs:
      my-string: job-1.outputs.my-string
```
Scenario 3: Creating a model from a dataset and passing the model ID as a string to a Deployment step
To run this example you will need to: a) create a dataset named `test-model` and upload valid TensorFlow model files to it, b) define a secret named `MY_API_KEY` containing your Gradient CLI API key, and c) substitute your `clusterId` in the deployment create step.
```yaml
defaults:
  resources:
    instance-type: P4000
jobs:
  UploadModel:
    uses: create-model@v1
    with:
      name: my-model
      type: Tensorflow
    inputs:
      model:
        type: dataset
        with:
          id: test-model
    outputs:
      model-id:
        type: string
  DeployModel:
    needs:
      - UploadModel
    inputs:
      model-id: UploadModel.outputs.model-id
    env:
      PAPERSPACE_API_KEY: secret:MY_API_KEY
    with:
      command: bash
      args:
        - -c
        - >-
          gradient deployments create
          --clusterId cl1234567
          --deploymentType TFServing
          --modelId $(cat inputs/model-id)
          --name "Sample Deployment"
          --machineType P4000
          --imageUrl tensorflow/serving:latest-gpu
          --instanceCount 1
      image: paperspace/gradient-sdk
```