
SimpleHTTPOperator in Apache Airflow

 2 years ago
source link: https://dzone.com/articles/simplehttpoperator-in-apache-airflow

Introduction

The SimpleHTTPOperator (spelled SimpleHttpOperator in Airflow's code) is an Airflow Operator that sends a request to an API, such as a REST service, and receives the service's response. This article covers the important aspects of using a SimpleHTTPOperator, such as:

  • Where to configure the URL (http_conn_id)
  • How to invoke a service
  • How to parse the response back from the service
  • Other attributes

Let’s explore! 

SimpleHTTPOperator

As mentioned earlier, SimpleHTTPOperator calls an API by sending an HTTP request. The example in this article invokes an open source weather API with a GET call.

Dallas weather example

In the image, the {city} placeholder in the URL is replaced by a valid city, “Dallas.” The response is in JSON format and describes the weather in Dallas. We are going to send the same request through SimpleHTTPOperator.

The first step in defining a workflow in Airflow is to create a DAG. An Operator then defines each task that the DAG executes.

Operator task execution

The DAG named “weatherServiceCall” is created with standard attributes such as start_date and schedule_interval. The schedule_interval is a cron expression that invokes the service every 5 minutes (typically written as */5 * * * *).

Then the SimpleHTTPOperator task is created, which has the following attributes: 

  • task_id = The task ID, here “weatherapi”
  • method = The HTTP verb, here GET
  • http_conn_id = The ID of the HTTP connection that holds the host details; explained later in the article
  • endpoint = The relative part of the URL, e.g., /weather/Dallas
  • headers = The HTTP headers sent with the request
  • response_check = A lambda or user-defined function that checks the response from the REST service
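
The attributes above can be sketched in code as follows. This is a minimal sketch, not the article's original listing: it assumes Airflow 2.x with an apache-airflow-providers-http release that still ships SimpleHttpOperator. The start_date, the catchup setting, the header value, and the helper name build_dag are illustrative assumptions. The Airflow imports are deferred into build_dag only so that the argument dictionaries can be inspected without Airflow installed.

```python
from datetime import datetime

# DAG-level attributes described in the article.
DAG_ARGS = {
    "dag_id": "weatherServiceCall",
    "start_date": datetime(2021, 1, 1),   # assumed date, not from the article
    "schedule_interval": "*/5 * * * *",   # cron expression: every 5 minutes
    "catchup": False,                     # assumption: skip past intervals
}

# Task-level attributes described in the article.
TASK_ARGS = {
    "task_id": "weatherapi",
    "method": "GET",
    "http_conn_id": "weather_api",        # connection ID configured in the UI
    "endpoint": "/weather/Dallas",        # relative part of the URL
    "headers": {"Content-Type": "application/json"},  # assumed header
    # A lambda is used here; a named function works the same way.
    "response_check": lambda response: response.status_code == 200,
}


def build_dag():
    """Create the DAG and attach the HTTP task (requires Airflow 2.x)."""
    from airflow import DAG
    from airflow.providers.http.operators.http import SimpleHttpOperator

    dag = DAG(**DAG_ARGS)
    SimpleHttpOperator(dag=dag, **TASK_ARGS)
    return dag


# In a real DAG file, assign the result at module level so the scheduler
# can discover it:  dag = build_dag()
```

On older Airflow 1.10 installations the operator lives under a different import path, so the import inside build_dag would need adjusting.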

Configuring http_conn_id

The “http_conn_id” refers to an Airflow connection that stores the host details of the REST service called by the SimpleHTTPOperator. In our example, we used the ID “weather_api”, which is configured through the Airflow UI. On the Airflow UI, go to the Admin section and click on Connections.

Admin section of Airflow UI

On the "Add Connection" screen, configure the details such as Conn Id (where the name “weather_api” is configured), Conn Type (which is HTTP), and the Host details.

Add connection

Click on Save, and that’s it: http_conn_id is configured!
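
At request time, the Host stored under the connection is combined with the task's endpoint to form the full URL. The sketch below illustrates that idea in simplified form; it is not Airflow's actual code. The function name build_url and the host api.example.com are placeholders, since the article does not state the real host.

```python
def build_url(host: str, endpoint: str, schema: str = "http") -> str:
    """Join a connection host and a relative endpoint into one URL."""
    # Prefix the schema only when the host does not already carry one.
    base = host if "://" in host else f"{schema}://{host}"
    # Avoid a doubled or missing slash between the two parts.
    return base.rstrip("/") + "/" + endpoint.lstrip("/")


print(build_url("api.example.com", "/weather/Dallas"))
# prints "http://api.example.com/weather/Dallas"
```

Keeping the host in the connection means the same DAG can point at a different environment by editing the connection, without touching the task's endpoint.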

response_check

In the example, we have used the Python function “handle_response” for handling the HTTP response. The implementation of the function is:

Handle_response

The implementation is straightforward: if the status code is 200, the function returns True; otherwise, it returns False. When response_check returns False, the operator marks the task as failed.
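
A minimal sketch of such a function is shown here, since the original listing was an image. The exact text printed by the article's version is not known, so the message below is an assumption. The stand-in response object exists only to exercise the function without a live service.

```python
from types import SimpleNamespace


def handle_response(response):
    """Return True for an HTTP 200 response, otherwise False."""
    # The printed message is an assumption; it appears in the task log.
    print(f"Response status code: {response.status_code}")
    return response.status_code == 200


# Stand-in for the response object that Airflow passes to the function.
fake_ok = SimpleNamespace(status_code=200)
fake_missing = SimpleNamespace(status_code=404)

print(handle_response(fake_ok))       # prints the status line, then True
print(handle_response(fake_missing))  # prints the status line, then False
```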

DAG Details After Execution


The logs show that a successful response was received and display the output of the print statement in the handle_response function.

Log showing successful response received

Summary

This article explained how to call a REST service using SimpleHTTPOperator. We also covered how to configure http_conn_id through the Airflow UI and looked at the output of the DAG execution.

