Bases: airflow.contrib.operators.sagemaker_base_operator.SageMakerBaseOperator. Initiates a SageMaker transform job and returns the ARN of the model created in Amazon SageMaker. The config argument holds the configuration necessary to start a transform job (templated). The SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker; with the SDK, you can train and deploy models built with popular deep learning frameworks such as Apache MXNet and TensorFlow.
This notebook uses a Fashion-MNIST classification task as an example to show how to track Airflow workflow executions using SageMaker Experiments.
Overall, the notebook is organized as follows:
Download dataset and upload to Amazon S3.
Create a simple CNN model to do the classification.
Define the workflow as a DAG with two executions: a SageMaker TrainingJob to train the CNN model, followed by a SageMaker TransformJob to run batch predictions with the model.
Host and run the workflow locally, and track the workflow run as an Experiment.
Note that if you are running the notebook in SageMaker Studio, please select the
Python 3 (TensorFlow CPU Optimized) kernel; if you are running in a SageMaker Notebook instance, please select a TensorFlow kernel.
Create an S3 bucket to hold data¶
We will create a SageMaker training job to fit the model on
(x_train, y_train), and then a SageMaker transform job to perform batch inference over the large-scale (10K-sample) test set. To do the batch inference, we first need to flatten each sample image (28x28) in
x_test into a float array with 784 features, and then concatenate all flattened samples into a single input file for the transform job.
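As a sketch of this preprocessing step (assuming the data is already loaded as a NumPy array with the shape of Fashion-MNIST's x_test; the array and file name here are stand-ins, not the notebook's exact code), the flattening is a single reshape:

```python
import numpy as np

# Stand-in for the real x_test; Fashion-MNIST test images are 10000 x 28 x 28.
x_test = np.zeros((10000, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-feature float vector.
flattened = x_test.reshape((x_test.shape[0], -1)).astype(np.float32)
print(flattened.shape)  # (10000, 784)

# Concatenate the flattened samples into a single CSV payload for the transform job
# (only a subset here, to keep the illustrative file small).
np.savetxt("x_test_sample.csv", flattened[:100], delimiter=",")
```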
Upload the dataset to S3¶
Create a simple CNN¶
The CNN we use in this example contains two consecutive (Conv2D - MaxPool - Dropout) modules, followed by a feed-forward layer, and a softmax layer to normalize the output into a valid probability distribution.
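A minimal Keras sketch of such a network follows; the filter counts, kernel sizes, and dropout rates are illustrative assumptions, not the notebook's exact hyperparameters:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Two (Conv2D - MaxPool - Dropout) modules, a feed-forward layer,
# and a softmax output that normalizes logits into a probability distribution.
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 Fashion-MNIST classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```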
Create workflow configurations¶
For the purpose of demonstration, we will execute our workflow locally. Let's first create a directory under the Airflow root to store our DAGs.
We will first create an experiment named
fashion-mnist-classification-experiment to track our workflow executions.
The following cell defines our DAG, a workflow with two steps: a training job on SageMaker, followed by a transform job that performs batch inference on the Fashion-MNIST test set we created earlier.
We will write the DAG definition into the
airflow/dags directory we just created above.
Now, let's initialize the Airflow database and host Airflow locally.
Then, we start a backfill job to execute our workflow. Note that we use a backfill job simply because we don't want to wait for the Airflow scheduler to trigger the workflow.
List workflow executions¶
Each execution in the workflow is modeled by a trial; let's list our workflow executions.
Let’s take a closer look at the jobs created and executed by our workflow.
Run the following cell to clean up the sample experiment. If you are working on your own experiment, please skip this cell.
This article describes how to set up instance profiles to allow you to deploy MLflow models to AWS SageMaker. It is possible to use access keys for an AWS user with permissions similar to those of the IAM role specified here, but Databricks recommends using instance profiles to give a cluster permission to deploy to SageMaker.
Step 1: Create an AWS IAM role and attach SageMaker permission policy
In the AWS console, go to the IAM service.
Click the Roles tab in the sidebar.
Click Create role.
Under Select type of trusted entity, select AWS service.
Under Choose the service that will use this role, click the EC2 service.
Click Next: Permissions.
In the Attach permissions policies screen, select AmazonSageMakerFullAccess.
Click Next: Review.
In the Role name field, enter a role name.
Click Create role.
In the Roles list, click the role name.
Make note of your Role ARN, which is of the format arn:aws:iam::<account-id>:role/<role-name>.
Step 2: Add an inline policy for access to SageMaker deployment resources
Add a policy to the role.
Paste in the following JSON definition:
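The original policy JSON is not reproduced here; a minimal sketch consistent with the permissions described below might look like the following (the deployment bucket name is a placeholder you would scope to your own bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["iam:GetRole"],
      "Resource": ["arn:aws:iam::*:role/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<deployment-bucket>",
        "arn:aws:s3:::<deployment-bucket>/*"
      ]
    }
  ]
}
```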
These permissions are required to allow the Databricks cluster to:
- Obtain the new role’s canonical ARN.
- Upload permission-scoped objects to S3 for use by SageMaker endpoint servers.
Step 3: Update the role’s trust policy
Grant the SageMaker service principal iam:AssumeRole access to the role.
Go to Role Summary > Trust relationships > Edit trust relationship.
Paste and save the following JSON:
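The trust policy JSON is not included here; a sketch of a trust policy that keeps the EC2 trust from Step 1 and additionally grants the SageMaker service assume-role access would look like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "ec2.amazonaws.com"},
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {"Service": "sagemaker.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
```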
Step 4: Allow your Databricks workspace AWS role to pass the role
Go to your Databricks workspace AWS role.
Paste and save the following JSON definition:
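The JSON itself is elided here; a minimal pass-role policy of the shape this step describes (substitute the placeholders as explained below) would be:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::<account-id>:role/<role-name>"
    }
  ]
}
```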
account-id is the ID of the account running the AWS SageMaker service and
role-name is the role you defined in Step 1.
Step 5: Create a Databricks cluster instance profile
In your Databricks Admin Console, go to the Instance Profiles tab and click Add Instance Profile.
Paste in the instance profile ARN associated with the AWS role you created. This ARN is of the form
arn:aws:iam::<account-id>:instance-profile/<role-name> and can be found in the AWS console:
Click the Add button.
For details, see Secure access to S3 buckets using instance profiles.
Step 6: Launch a cluster with the instance profile