Air Quality - Pollutant Index - India
source link: https://dev.to/adhirkirtikar/air-quality-pollutant-index-india-88h
This mini project shows the Air Quality Index (Industrial Air Pollution) from various locations in India in a Tableau Public dashboard.
The data is sourced from the data.gov.in API; a Python script cleans it and loads it into Google Sheets.
The data is refreshed daily on a schedule, or manually on demand, using GitHub Actions or AWS Lambda.
My Workflow
GitHub Action "run-python.yml"
- Google credentials are stored in GitHub Actions Environment Secrets.
- data.gov.in API Key is also stored in GitHub Actions Environment Secrets.
- A GitHub Action is created that can run manually or on a schedule [12PM UTC (8PM SGT)].
- Python 3.9 is set up using actions/setup-python, and the pip packages are cached.
- Dependencies are installed (google auth, pygsheets & pandas).
- The Environment Secrets are exported to environment variables.
- Finally, the Python script is run with the environment variables passed as command-line arguments.
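The last step passes two positional arguments to the script. The article does not show how the script reads them, so the following is only a minimal sketch of one way to do it. The function name `read_parameters` is hypothetical, and treating the Google secret as a service-account JSON document is an assumption.

```python
import json


def read_parameters(argv):
    """Return (api_key, google_credentials) from an argument list shaped like sys.argv."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} <data.gov.in API key> <Google credentials JSON>")
    api_key = argv[1]
    # Assumption: the Google secret holds a JSON document (e.g. a service-account key).
    google_credentials = json.loads(argv[2])
    return api_key, google_credentials


# Made-up values standing in for what the workflow would pass on the command line.
demo_argv = ["Air Quality Index India.py", "demo-key", '{"type": "service_account"}']
api_key, google_credentials = read_parameters(demo_argv)
print(api_key, sorted(google_credentials))
```

In the real script the list would be `sys.argv`. Quoting both variables in the workflow's `run` line matters here, because a JSON secret contains spaces and would otherwise be split into several arguments.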
Submission Category:
Wacky Wildcards
- I tried to use this workflow as a replacement for, or a complement to, the AWS Lambda function that runs the Python script at 8AM SGT.
Yaml File or Link to Code
run-python.yml
# This is a basic workflow to help you get started with Actions
name: Run Python
# Controls when the action will run.
on:
  schedule:
    # run at 12PM UTC (8PM SGT)
    - cron: '0 12 * * *'
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # use environment (named as "env") defined in the GitHub repository settings
    environment: env
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      -
        name: Checkout
        uses: actions/checkout@v2
      # Set up Python 3.9 environment and cache pip packages
      # (the exact setup-python version tag is garbled in this copy; v2 is an assumption)
      -
        name: Setup Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'
          cache: 'pip'
      # Install dependencies mentioned in the requirements.txt
      -
        name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      # Run a bash shell and store env secrets in parameters to pass to Python script
      -
        name: Get Parameters & Run Air Quality Index India Python script
        shell: bash
        env:
          GOVINAPIKEY: ${{ secrets.DATA_GOV_IN_API_KEY }}
          GDRIVEAPIKEY: ${{ secrets.GDRIVE_API_CREDENTIALS }}
        run: |
          python "Air Quality Index India.py" "$GOVINAPIKEY" "$GDRIVEAPIKEY"
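GitHub Actions evaluates schedule cron expressions in UTC, which is why `0 12 * * *` corresponds to 8PM in Singapore. As a quick check of that conversion, here is a small sketch; the date is arbitrary and the fixed UTC+8 offset for SGT is written out by hand rather than taken from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Singapore Time is a fixed UTC+8 offset with no daylight saving.
SGT = timezone(timedelta(hours=8), "SGT")

# An arbitrary day at 12:00 UTC, the time the cron expression fires.
run_utc = datetime(2021, 12, 1, 12, 0, tzinfo=timezone.utc)
run_sgt = run_utc.astimezone(SGT)

print(run_sgt.strftime("%H:%M %Z"))  # prints "20:00 SGT"
```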
Full repository is here:
Air-Quality-Index-India-GitHub-Actions
Repo for 2021 GitHub Actions Hackathon on DEV
Python script "Air Quality Index India.py"
- The script connects to the data.gov.in API using the API key passed as a parameter.
- It then pulls the latest AQI data for India and stores it in a pandas DataFrame.
- The data is cleaned and formatted, and the columns are renamed. Nulls are replaced by 0.
- The script authenticates with Google Sheets and connects to the sheet using pygsheets.
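The steps above can be sketched roughly as follows. This is not the author's script: the resource URL is a placeholder, the query parameter names and the `records` key are assumptions about the API response, the column names and sample rows are made up, and the function names are hypothetical. Only the cleaning step is exercised here; the fetch and upload functions need network access and real credentials.

```python
import json
import urllib.parse
import urllib.request

import pandas as pd

# Hypothetical placeholder: substitute the AQI dataset's resource URL on data.gov.in.
RESOURCE_URL = "https://api.data.gov.in/resource/<resource-id>"


def fetch_records(api_key, limit=1000):
    """Download the latest records as a list of dicts (not called in this sketch)."""
    # Assumption: the API takes these query parameters and returns a "records" array.
    query = urllib.parse.urlencode({"api-key": api_key, "format": "json", "limit": limit})
    with urllib.request.urlopen(f"{RESOURCE_URL}?{query}") as response:
        return json.load(response)["records"]


def clean(records):
    """Rename columns, coerce readings to numbers, and replace nulls with 0."""
    df = pd.DataFrame(records)
    # Assumed source column names; the real dataset may differ.
    df = df.rename(columns={"pollutant_id": "Pollutant", "pollutant_avg": "Average"})
    # Non-numeric markers such as "NA" become NaN first, then 0.
    df["Average"] = pd.to_numeric(df["Average"], errors="coerce").fillna(0)
    return df


def upload(df, credentials_json, sheet_title):
    """Write the frame to the first worksheet of a Google Sheet (not called here)."""
    import pygsheets  # imported lazily so the cleaning step runs without it

    client = pygsheets.authorize(service_account_json=credentials_json)
    worksheet = client.open(sheet_title).sheet1
    worksheet.set_dataframe(df, (1, 1))  # write starting at cell A1


# Made-up sample rows standing in for an API response.
sample = [
    {"city": "Delhi", "pollutant_id": "PM2.5", "pollutant_avg": "153"},
    {"city": "Pune", "pollutant_id": "NO2", "pollutant_avg": "NA"},
]
cleaned = clean(sample)
print(cleaned)
```

Replacing missing readings with 0 keeps the sheet fully numeric for Tableau, at the cost of making a missing value look like a zero reading.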
Additional Resources / Info
Tableau Public Dashboard that uses the data from the generated Google Sheets: "Air Quality - Pollutant Index - India"