
Serverless Python Bot using Google Cloud Platform

source link: https://levelup.gitconnected.com/serverless-python-bot-using-google-cloud-platform-ee724f4a6b1f


Step-by-Step Tutorial: From API to Database

(illustration by Chaeyun Kim)

Using Python scripts to extract, transform, and load data has become common nowadays. To properly automate such a script and run it as a “scraper/bot”, we need a way to make it execute automatically at the proper times. One easy way to do this is to run it locally, but that means your PC must stay on all the time.

Fortunately, that is not the only way. This article shows a serverless method on Google Cloud Platform, using “Google Cloud Functions” and “Google Cloud Scheduler” to automate your Python script.

Step-by-Step Tutorial

Use case: Periodically Storing Weather Data to Database

Average weather data can reveal patterns and trends and help researchers understand how our atmosphere works. The example data include temperature, wind speed, rain, humidity, and pressure.

Step 0: Understanding the Structure

In this example, open weather data is saved to a database using the serverless method: a Python script requests (HTTP GET) the data from the OpenWeatherMap API and then writes it to a PostgreSQL database.


A — Data source: OpenWeatherMap API

B — Python Process: Google Cloud Function

C — Job Scheduler: Google Cloud Scheduler

D — Database: PostgreSQL

Step 1: Prepare Python Script

We can create a simple Python script as in the example below. Follow along and adjust it for your own dataset and database credentials.

  • Line [5–9]: Load Data from API to Dataframe
  • Line [11–17]: Connect to the database
  • Line [20]: Post dataframe to database
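The embedded gist does not render here, so the following is a minimal sketch of such a script. The API key, city, table name, and connection string are hypothetical placeholders (replace them with your own); the structure follows the steps above: fetch the data into a DataFrame, connect to the database, then post the DataFrame.

```python
import requests
import pandas as pd

# Hypothetical placeholders - replace with your own values.
API_KEY = "YOUR_OPENWEATHERMAP_KEY"
CITY = "Bangkok"
DB_URL = "postgresql+psycopg2://user:password@host:5432/weatherdb"

def fetch_weather(city=CITY, api_key=API_KEY):
    """HTTP GET the current weather for one city from the OpenWeatherMap API."""
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": api_key, "units": "metric"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def to_dataframe(payload):
    """Flatten the JSON payload into a one-row DataFrame."""
    return pd.DataFrame([{
        "city": payload["name"],
        "temperature": payload["main"]["temp"],
        "humidity": payload["main"]["humidity"],
        "pressure": payload["main"]["pressure"],
        "wind_speed": payload["wind"]["speed"],
        "recorded_at": pd.Timestamp.now(tz="UTC"),
    }])

def store_weather(event=None, context=None):
    """Entry point for the Cloud Function (Pub/Sub trigger signature)."""
    from sqlalchemy import create_engine  # deferred so the rest runs without a DB
    df = to_dataframe(fetch_weather())
    engine = create_engine(DB_URL)
    df.to_sql("weather", engine, if_exists="append", index=False)
```

The `store_weather(event, context)` signature is the one a Pub/Sub-triggered Cloud Function expects, which matters for the entry-point setting in step 2.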

Tip: don't forget to test your Python script before moving to the next step. You can use Google Colab for this purpose, too.

Step 2: Upload Python Script to Google Cloud Function

Before uploading the Python script to Google Cloud, you have to register for an account if you don't already have one. Then create a project and enable the Cloud Build, Cloud Functions, and Cloud Scheduler APIs in the API dashboard. After that, head to Cloud Functions and click “Create function”. Choose a function name and region as you see fit.

Trigger setting: this determines how your function is triggered to run. Most tutorials use “HTTP” as the example, but I recommend Cloud Pub/Sub instead, because we will schedule the function to run periodically inside Google Cloud using Cloud Scheduler later. When you select Cloud Pub/Sub, you will be prompted to create or select a topic. If this is your first function, just create a new topic.

(Screenshot by Author)

You may leave the other settings at their defaults. On the next page, select the runtime (“Python 3.8” or “Python 3.9”) and copy and paste your script into main.py. Make sure that the entry point name matches your function name.
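The same console steps can also be done from the gcloud CLI. A sketch, assuming the entry point is named `store_weather` and the Pub/Sub topic `weather-topic` (both hypothetical names):

```shell
# Deploy the function from the current directory with a Pub/Sub trigger.
# Function, entry-point, and topic names are placeholders - use your own.
gcloud functions deploy store_weather \
    --runtime=python39 \
    --trigger-topic=weather-topic \
    --entry-point=store_weather \
    --source=.
```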

Deploy Python Script on the Google Cloud Function (by Author)

Input all the dependencies for your script in “requirements.txt”. In our example, these are the requirements:

requests>=2.23.0
pandas>=1.3.5
sqlalchemy>=1.4.31
psycopg2>=2.8.6

(Note: `datetime` is part of the Python standard library, so it does not belong in requirements.txt.)

After you click “Deploy”, head back to the Cloud Functions list and you will see your function show up. Wait until the green check mark appears, indicating that all dependencies are installed. Click Actions → Test and then Actions → View logs to confirm that the function runs without errors.

In addition, you may query your database to check that the new data has been inserted.

Step 3: Schedule your function

Head to Google Cloud Scheduler to trigger your function periodically with Pub/Sub.


Frequency: specify how often to run your Python script, in unix-cron format. You can use https://crontab.guru/ to help build the cron expression.

Pub/Sub topic: select the topic you created in step 2 when you set up the Cloud Function.
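If you prefer the command line, the same scheduler job can be created with gcloud. The job and topic names below are hypothetical placeholders; the cron expression runs the job at the top of every hour:

```shell
# Create a Cloud Scheduler job that publishes to the Pub/Sub topic hourly,
# which in turn triggers the Cloud Function. Names are placeholders.
gcloud scheduler jobs create pubsub weather-job \
    --schedule="0 * * * *" \
    --topic=weather-topic \
    --message-body="run"
```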

That’s it! Your Python script is now running on serverless cloud architecture in GCP.

Result

Future Improvement

This example just shows the overall process of using Google Cloud Functions, and there is room for improvement, such as:

  • You may add the database credentials as environment variables during step two.
  • After the data is loaded via the request in Python, it is recommended to clean it before storing it in the database.
  • The database port must be open for the Cloud Function to access it. You might need a static IP for the Cloud Function to improve your database security.
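On the first point, a minimal sketch of reading credentials from environment variables instead of hard-coding them; the variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_NAME`) are hypothetical and would be set in the Cloud Function's runtime environment variables section:

```python
import os

def database_url():
    """Build the PostgreSQL connection string from environment variables
    configured on the Cloud Function (variable names are hypothetical)."""
    user = os.environ["DB_USER"]
    password = os.environ["DB_PASSWORD"]
    host = os.environ["DB_HOST"]
    name = os.environ.get("DB_NAME", "weatherdb")  # optional, with a default
    return f"postgresql+psycopg2://{user}:{password}@{host}:5432/{name}"
```

This keeps secrets out of main.py and out of version control.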

What about Cost?

According to GCP, Cloud Functions provides a perpetual free tier for compute-time resources, which includes an allocation of both GB-seconds and GHz-seconds. In addition to 2 million invocations, the free tier provides 400,000 GB-seconds and 200,000 GHz-seconds of compute time, plus 5 GB of Internet egress traffic per month. Our example is a very simple process and should stay well within the free tier.
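A back-of-envelope check for an hourly job, assuming a 256 MB function that runs for about 5 seconds per invocation (these figures are illustrative assumptions, not measurements):

```python
# Rough free-tier check for an hourly job (assumptions: 256 MB memory,
# ~5 s per invocation, negligible egress - not official pricing figures).
runs_per_month = 24 * 31                        # one invocation per hour
gb_seconds = runs_per_month * (256 / 1024) * 5  # memory (GB) x duration (s)

print(runs_per_month, "invocations,", gb_seconds, "GB-seconds per month")
# Both totals are orders of magnitude below the free tier
# (2,000,000 invocations and 400,000 GB-seconds per month).
```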

Alternative?

Yes, there are several options out there, which I have written about in detail before.

I hope you enjoy this article and find it useful for your daily work or projects. Please, feel free to contact me if you have any questions.

