How to Deploy Machine Learning Models to the Cloud Quickly and Easily

Davis David
Data Scientist | AI Practitioner | Software Developer. Giving talks, teaching, writing.

Machine learning models are usually developed in a training environment (online or offline) and then deployed to be used with live data. If you're working on data science and machine learning projects, knowing how to deploy a model is one of the most important skills you'll need.

Who is this article for?

This article is for those who have created a machine learning model on a local machine and want to deploy and test it within a short time.

It's also for those who are looking for an alternative platform to deploy their machine learning models.

Let's get started! 🚀

What does it mean to deploy a Machine Learning model?

Model deployment is the process of integrating your model into an existing production environment. The model will receive input and predict an output.

Machine learning models can be deployed in different environments and can be integrated with different web or mobile applications through an API.
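To make this concrete, here is a minimal sketch of what serving a model behind an HTTP API can look like if you wire everything up yourself (Flask and the file path here are illustrative assumptions, not part of Aibro):

# a minimal, self-managed prediction API (illustrative sketch)
import joblib
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("models/sentiment_model_pipeline.pkl")  # assumed path to a saved pipeline

@app.route("/predict", methods=["POST"])
def predict():
    review = request.get_json()["data"]  # expects {"data": "some review text"}
    prediction = model.predict([review])
    return jsonify({"data": int(prediction[0])})

if __name__ == "__main__":
    app.run(port=5000)

Running, securing, and scaling a server like this yourself is exactly the overhead that deployment platforms aim to remove.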

“Only when a model is fully integrated with the business systems, we can extract real value from its predictions.” — Christopher Samiullah

There are different platforms that can help you deploy your machine learning model. But for most of these platforms, it takes a lot of time and resources to configure the environment and deploy your model.

For example, SageMaker offers popular libraries and ML frameworks, but you still depend on SageMaker for new releases. This might mean that you won't be able to deploy your model on time.

Let's say the SageMaker platform has scikit-learn v0.24 in its environment and you want to train and deploy your model with scikit-learn v1.0.1. You won't be able to do that until SageMaker upgrades to the new version of scikit-learn (1.0.1).
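A quick way to catch this kind of mismatch early is to check which version your local environment actually uses before training:

# check the locally installed scikit-learn version (illustrative)
import sklearn
print(sklearn.__version__)  # e.g., 1.0.1 locally, while the platform may still ship 0.24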

In this article, you will learn how to use Aibro to deploy your model quickly and easily.

What is Aibro?

Aibro is a serverless MLOps tool that makes Machine Learning cloud computing cheap, easy, and fast. The tool can help data scientists or machine learning engineers train and deploy machine learning models on cloud platforms within a short period of time.

It currently supports the AWS cloud platform, and the team plans to support more cloud platforms such as Google Cloud, Microsoft Azure, Alibaba Cloud, and IBM Cloud. It also supports most of the popular machine learning frameworks, such as TensorFlow, PyTorch, scikit-learn, and XGBoost.

Another advantage of using Aibro is the ability to reduce cloud costs by 85% using an exclusive cost-saving strategy built for machine learning. Now that we understand Aibro and its services, let's create a simple model and then deploy it.

Create a Simple Model

The first step is to build the model. We are going to use the IMDB movie reviews dataset to build a model that can classify whether a movie review is positive or negative. Here are the steps you should follow to do that.

Import the important packages

We need to import Python packages to load the data, clean the data, create a machine learning model, and save the model for deployment.

# import important modules
import numpy as np
import pandas as pd
# sklearn modules
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.naive_bayes import MultinomialNB # classifier 
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    plot_confusion_matrix,
)
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
# text preprocessing modules
from string import punctuation 
from nltk.tokenize import word_tokenize
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer 
import re #regular expression
# Download NLTK dependencies
for dependency in (
    "brown",
    "names",
    "wordnet",
    "averaged_perceptron_tagger",
    "universal_tagset",
    "stopwords",  # needed for stopwords.words('english') used below
):
    nltk.download(dependency)
    
import warnings
warnings.filterwarnings("ignore")
# seeding
np.random.seed(123)

Load the dataset from the data folder:

# load data
data = pd.read_csv("../data/labeledTrainData.tsv", sep='\t')

And then show a sample of the dataset:

# show top five rows of data
data.head()

Our dataset has 3 columns:

  • id — the unique id of the review
  • sentiment — either positive (1) or negative (0)
  • review — the text of the movie review

Next, let's check the shape of the dataset:

# check the shape of the data
data.shape

(25000, 3)

The dataset has 25,000 reviews.

Now we need to check if the dataset has any missing values:

# check missing values in data
data.isnull().sum()

id 0
sentiment 0
review 0
dtype: int64

The output shows that our dataset does not have any missing values.

How to Evaluate Class Distribution

We can use the value_counts() method from the Pandas package to evaluate the class distribution from our dataset.

# evaluate review sentiment distribution
data.sentiment.value_counts()

1 12500
0 12500
Name: sentiment, dtype: int64

In this dataset, we have an equal number of positive and negative reviews.

How to Process the Data

After analyzing the dataset, the next step is to preprocess the dataset into the right format before creating our machine learning model.

The reviews in this dataset contain a lot of unnecessary words and characters that we don’t need when creating a machine learning model.

We will clean the reviews by removing stopwords, numbers, and punctuation. Then we will convert each word into its base form using the lemmatization process from the NLTK package.

The text_cleaning() function will handle all necessary steps to clean our dataset.

stop_words = stopwords.words('english')

def text_cleaning(text, remove_stop_words=True, lemmatize_words=True):
    # Clean the text, with the option to remove stop words and to lemmatize words
    text = re.sub(r"[^A-Za-z0-9]", " ", text)
    text = re.sub(r"\'s", " ", text)
    text =  re.sub(r'http\S+',' link ', text)
    text = re.sub(r'\b\d+(?:\.\d+)?\s+', '', text) # remove numbers
        
    # Remove punctuation from text
    text = ''.join([c for c in text if c not in punctuation])
    
    # Optionally, remove stop words
    if remove_stop_words:
        text = text.split()
        text = [w for w in text if w not in stop_words]
        text = " ".join(text)
    
    # Optionally, shorten words to their stems
    if lemmatize_words:
        text = text.split()
        lemmatizer = WordNetLemmatizer() 
        lemmatized_words = [lemmatizer.lemmatize(word) for word in text]
        text = " ".join(lemmatized_words)
    
    # Return the cleaned text
    return text

Now we can clean our dataset by using the text_cleaning() function:

#clean the review
data["cleaned_review"] = data["review"].apply(text_cleaning)

Then split the data into features and target variables like this:

# split features and target from the data
X = data["cleaned_review"]
y = data.sentiment.values

Our feature for training is the cleaned_review variable and the target is the sentiment variable.

We then split our dataset into train and test data. The test size is 15% of the entire dataset.

# split data into train and validate
X_train, X_valid, y_train, y_valid = train_test_split(
    X,
    y,
    test_size=0.15,
    random_state=42,
    shuffle=True,
    stratify=y,
)

How to Create a Model

We will train the Multinomial Naive Bayes algorithm to classify if a review is positive or negative. This is one of the most common algorithms used for text classification.

But before training the model, we need to transform our cleaned reviews into numerical values so that the model can understand the data.

In this case, we will use the TfidfVectorizer class from scikit-learn. TfidfVectorizer converts a collection of text documents to a matrix of TF-IDF features.
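To make that concrete, here is a minimal sketch with two toy documents (not from our dataset) showing what the vectorizer produces:

# toy example: TF-IDF turns raw text into a numeric matrix
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["great movie", "terrible movie"]
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(docs)
print(features.shape)                  # (2, 3): one row per document, one column per term
print(sorted(vectorizer.vocabulary_))  # ['great', 'movie', 'terrible']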

To apply this series of steps (pre-processing and training), we will use the Pipeline class from scikit-learn, which sequentially applies a list of transforms followed by a final estimator.

# Create a classifier pipeline: TF-IDF features followed by Naive Bayes
sentiment_classifier = Pipeline(steps=[
    ('pre_processing', TfidfVectorizer(lowercase=False)),
    ('naive_bayes', MultinomialNB()),
])

Then we train our classifier like this:

# train the sentiment classifier 
sentiment_classifier.fit(X_train,y_train)

We then make predictions on the validation set:

# test model performance on valid data 
y_preds = sentiment_classifier.predict(X_valid)

The model’s performance will be evaluated using the accuracy_score metric. We use accuracy_score because the classes in the sentiment variable are balanced.

accuracy_score(y_valid,y_preds)

0.8629

The accuracy of our model is around 86.29%, which is good performance.
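Accuracy is a single number; if you want a per-class breakdown, the classification_report we imported earlier shows precision, recall, and F1 for each sentiment:

# optional: per-class precision, recall, and F1 on the validation set
print(classification_report(y_valid, y_preds))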

How to Save the Model Pipeline

We can save the model pipeline to the models directory using the joblib Python package.

# save the model pipeline
import joblib 

joblib.dump(sentiment_classifier, '../models/sentiment_model_pipeline.pkl')
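As an optional sanity check, you can reload the saved pipeline and classify a sample review before moving on to deployment:

# optional: reload the saved pipeline and classify a sample review
loaded_model = joblib.load('../models/sentiment_model_pipeline.pkl')
sample = text_cleaning("I really enjoyed this movie, the acting was wonderful")
print(loaded_model.predict([sample]))  # a clearly positive review should come back as [1]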

Now that we've built our model, let's learn how to deploy it with Aibro.

Deployment Workflow: A Step-by-Step Guide

To deploy your model with Aibro, you need to prepare your model as a properly formatted machine learning model repository.

You can take a quick look at this example repository, https://github.com/AIpaca-Inc/Aibro-examples, but we will build the same structure for the model we have created.

In the deployment flow, Aibro creates an inference API from the formatted machine learning model repository; you then receive a unique API URL and can start making predictions from your model.

All you need to do is follow these steps.

Step 1: Install the aibro Python library

To install aibro, run the following command in your terminal:

pip install aibro

Step 2: Prepare the Model Repository

The model repository will be formatted in the following structure.
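Putting the pieces together, the repository built in the rest of this section will look like this:

sentiment_model_repo/
├── model/                      # (a) the saved model
│   └── sentiment_model_pipeline.pkl
├── data/                       # (b) sample input for the inference API
│   └── data.json
├── predict.py                  # (c) defines load_model() and run()
├── requirements.txt            # (d) packages to install before deployment
└── clean.py                    # (e) other artifacts used by predict.py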

(a) model folder
This folder will contain the model you have created.

(b) data folder
The data folder will have a JSON file with an input value. In our case, the input will be a text value (the review), as follows.

{
"data": "I loved it, the kids loved it. It shows them that anything is possible but more especially when you have that one person fighting for you. That one person who believes in you without fail. I appreciated the various life lessons included in the film about being humble and thankful but commanding respect at the same time despite where or what background you come from. Success doesn’t see age, race or gender but sadly opportunity often does. Will Smith doesn’t let the lack of opportunity beat them as a family and the family is a team. The bigger picture is always knowing that there is a team involved in most successful people."
}

Note: Remember there is no restriction on how you want to format your input and output.
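For example, you could post a batch of reviews in a single request (a hypothetical format; run() would then need to loop over the list):

{
"data": [
"First review text goes here...",
"Second review text goes here..."
]
}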

(c) predict.py
This Python file should contain two functions.

load_model():
This function is responsible for loading the machine learning model from the model folder and returning it. In this tutorial, we will use the joblib package to load the model we have created.

def load_model():
   #load model
   model = joblib.load("model/sentiment_model_pipeline.pkl")
  
   return model

run():
This function will receive a model as the input and then load the data from the data folder. Finally, it will make predictions and return the result.

def run(model):
   # load the posted input from the data folder
   with open("data/data.json", "r") as fp:
       data = json.load(fp)

   # clean the review and run it through the pipeline
   review = text_cleaning(data["data"])
   result = {"data": model.predict([review])}
   return result

Therefore, predict.py will look as follows:

# import important modules
 
import json # load data
import joblib # load model
from clean import text_cleaning # function to clean the text
   
def load_model():
   #load model
   model = joblib.load("model/sentiment_model_pipeline.pkl")
  
   return model
 
def run(model):
   # load the posted input from the data folder
   with open("data/data.json", "r") as fp:
       data = json.load(fp)

   # clean the review and run it through the pipeline
   review = text_cleaning(data["data"])
   result = {"data": model.predict([review])}
   return result
 
if __name__ == "__main__":
   run(load_model())

(d) requirements.txt
Aibro first needs to install the packages required to run your model before deploying the model itself. You can either write the packages and their version numbers in requirements.txt manually or run the following command, which does the same:

pip list --format=freeze > requirements.txt

nltk==3.6.7
numpy==1.19.1
pandas==1.0.5
scikit_learn==0.23.1
joblib==1.0.0

Note: It is also recommended to use the pipreqs Python package to generate requirements.txt, because it includes only the packages your project actually imports instead of every package in your environment.

$ pipreqs /home/aibro_project
Successfully saved requirements file /home/aibro_project/requirements.txt

(e) Other Artifacts
You can also include other files or folders that will be used by the predict.py Python file. For example, in the model we have created, we will need to clean the input before making a prediction.

clean.py contains a Python function that cleans the text before making a prediction.

# import packages
import nltk
 
# Download dependency
corpora_list = ["stopwords","names","brown","wordnet"] 
 
for dependency in corpora_list:
   try:
       nltk.data.find('corpora/{}'.format(dependency))
   except LookupError:
       nltk.download(dependency)
 
taggers_list = ["averaged_perceptron_tagger","universal_tagset"]
 
for dependency in taggers_list:
   try:
       nltk.data.find('taggers/{}'.format(dependency))
   except LookupError:
       nltk.download(dependency)
 
tokenizers_list = ["punkt"]
 
for dependency in tokenizers_list:
   try:
       nltk.data.find('tokenizers/{}'.format(dependency))
   except LookupError:
       nltk.download(dependency)
  
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
import re #regular expression
from string import punctuation
 
 
stop_words = stopwords.words('english')
  
# function to clean the text
def text_cleaning(text, remove_stop_words=True, lemmatize_words=True):
   # Clean the text, with the option to remove stop words and to lemmatize words
   text = re.sub(r"[^A-Za-z0-9]", " ", text)
   text = re.sub(r"\'s", " ", text)
   text =  re.sub(r'http\S+',' link ', text)
   text = re.sub(r'\b\d+(?:\.\d+)?\s+', '', text) # remove numbers
      
   # Remove punctuation from text
   text = ''.join([c for c in text if c not in punctuation])
  
   # Optionally, remove stop words
   if remove_stop_words:
       text = text.split()
       text = [w for w in text if w not in stop_words]
       text = " ".join(text)
  
   # Optionally, shorten words to their stems
   if lemmatize_words:
       text = text.split()
       lemmatizer = WordNetLemmatizer()
       lemmatized_words = [lemmatizer.lemmatize(word) for word in text]
       text = " ".join(lemmatized_words)
  
   # Return the cleaned text
   return text

How to Test the Repo with Dryrun

Before we deploy our model, we can test the repo using Dryrun. Dryrun will locally validate the repo structure and test if the inference result can be successfully returned.

The following line of code will test the repo we have created:

from aibro import Inference
 
api_url = Inference.deploy(
   artifacts_path="./sentiment_model_repo",
   dryrun=True,
)

Note: The formatted model repository is saved at the path "./sentiment_model_repo".

The result shows that the prediction finished without errors. Now we can deploy the model.

How to Create an Inference API with One Line of Code

To deploy the model, you need to configure the following variables in the Inference.deploy() method.

(a) model_name
The model name should be unique with respect to all currently active inference jobs under your profile. In this example, the model name will be "my_movie_sentiment_classifier".

(b) machine_id_config
This is the id of the machine that will run our model. For this example, we will use "c5.large.od". You can see the entire list in the marketplace.

(c) artifacts_path
This will be the path to your formatted machine learning model repository. For this example, the path is "./sentiment_model_repo".

(d) description
You can also add a description of your model deployment.

Finally, the one-line call to create an inference API will look as follows.

from aibro import Inference
api_url = Inference.deploy(
   model_name = "my_movie_sentiment_classifier",
   machine_id_config = "c5.large.od",
   artifacts_path = "./sentiment_model_repo",
   description="my first inference job",
)

Once the deployment is finished, an API URL is returned with the syntax: http://api.aipaca.ai/v1/{username}/{client_id}/{model_name}/predict

Note: If your inference job is public, {client_id} is filled with "public". Otherwise, {client_id} should be filled with one of your clients' IDs.

In this tutorial, the API URL will be http://api.aipaca.ai/v1/DavisDavid/public/my_movie_sentiment_classifier/predict

How to Test an Aibro API

We successfully deployed the model and got the API URL. Let’s test the model and see the result. We will use a Python package called requests to send a request to the API URL and get results.

Note: The posted data will replace everything in the data folder. Therefore, your posted data should have the same format as whatever you had in the data folder initially.

import requests
import json
 
review = {"data": "A truly beautiful film that will having you crying with joy and pride. The (few) poor reviews cite a lack of authenticity regarding Richards character and a lack of screen time for the other major family members, including Serena. While I admittedly don’t know exactly the kind of person and father Richard was"}
 
prediction = requests.post(
   "http://api.aipaca.ai/v1/DavisDavid/public/my_movie_sentiment_classifier/predict",
   data=review,
)
 
result = prediction.text
 
print(result)

{'data': array([1])}

As you can see, we managed to get a prediction through the API, and the model classified the review as positive (1).

Complete the Inference Job

If you're no longer going to use your inference job, you should shut down the API to avoid unnecessary costs. You can shut it down by passing the inference job id in the Inference.complete() method.

from aibro.inference import Inference
 
id = "inf_cd712f4a-4b59-4e44-8787-9c5b5450ff6d"
Inference.complete(job_id=id)

You will receive an output confirming that the inference job was completed successfully.

Final Thoughts

In this article, you have learned one of the fastest and simplest ways to deploy a machine learning model to the cloud using Aibro. You don't need to spend a lot of time and resources configuring the environment – just install aibro and you are good to go.

There is a free community edition you can use for your small ML projects. Aipaca is also giving free credits to new users, so you don't need to worry about cloud costs. There are a lot of features from Aibro that you can use while deploying your model.

To learn more, you can visit our beautifully designed documentation pages here, and you can also join our community here to get more help.

You can download the source code used in this article here: https://github.com/Davisy/Aibro-ML-Model-Deployment

If you learned something new or enjoyed reading this article, please share it so that others can see it.
