
Air Quality - Pollutant Index - India

 2 years ago
source link: https://dev.to/adhirkirtikar/air-quality-pollutant-index-india-88h
Pollutant Index

This mini project displays the Air Quality Index (industrial air pollution) for various locations in India in a Tableau Public dashboard.
A Python script pulls the data from the data.gov.in API, cleans it, and loads it into Google Sheets.
The data is refreshed daily on a schedule, or manually on demand, using GitHub Actions or AWS Lambda.
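
As a rough illustration of the first step, here is a minimal sketch of requesting JSON records from the data.gov.in API. The base URL, the query parameter names, the "records" key and the RESOURCE_ID placeholder are assumptions, not values taken from the project, and the standard-library urllib stands in for whatever HTTP client the real script uses.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL and a hypothetical placeholder resource ID; look up the
# real values for the AQI dataset on data.gov.in.
BASE_URL = "https://api.data.gov.in/resource/"
RESOURCE_ID = "RESOURCE-ID-PLACEHOLDER"


def build_url(api_key, limit=1000, offset=0):
    """Build the request URL for one page of JSON records."""
    query = urllib.parse.urlencode(
        {"api-key": api_key, "format": "json", "limit": limit, "offset": offset}
    )
    return f"{BASE_URL}{RESOURCE_ID}?{query}"


def fetch_records(api_key):
    """Download one page and return its list of records (network call)."""
    with urllib.request.urlopen(build_url(api_key), timeout=30) as response:
        payload = json.load(response)
    # Assumption: the records sit under a top-level "records" key.
    return payload.get("records", [])
```

Keeping the URL construction in its own function makes it possible to check the request without a network connection or a real API key.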

My Workflow

GitHub Action "run-python.yml"

  • Google credentials are stored in GitHub Actions Environment Secrets.
  • data.gov.in API Key is also stored in GitHub Actions Environment Secrets.
  • The workflow can be run manually or on a schedule, daily at 12PM UTC (8PM SGT).
  • Python 3.9 is set up using the actions/setup-python action, and the pip packages are cached.
  • Dependencies are installed from requirements.txt (google auth, pygsheets & pandas).
  • The Environment Secrets are exported to environment variables.
  • Finally, the Python script is run with the environment variables passed as command-line arguments.
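
On the receiving side, the script has to pick the two values up from its argument list. The following is a minimal sketch of how that could look; the function name read_parameters is hypothetical, and it assumes the Google secret holds a service-account key as JSON text, which the article does not state.

```python
import json


def read_parameters(argv):
    """Return (api_key, google_credentials) from the command-line arguments."""
    if len(argv) != 3:
        raise SystemExit(
            f"usage: {argv[0]} <data.gov.in api key> <google credentials json>"
        )
    api_key, raw_credentials = argv[1], argv[2]
    # Assumption: the Google secret is a service-account key stored as JSON text.
    credentials = json.loads(raw_credentials)
    return api_key, credentials


# In the real script this would be called as read_parameters(sys.argv).
key, creds = read_parameters(["script.py", "demo-key", '{"type": "service_account"}'])
# Print only non-sensitive details; never echo secrets into workflow logs.
print(sorted(creds))
```

Quoting the variables in the workflow's run step matters here: the credentials JSON contains spaces and quotes, and must arrive as a single argument.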

Submission Category:

Wacky Wildcards

  • I tried using this workflow as a replacement for, or a complement to, the AWS Lambda function that runs the Python script at 8AM SGT.
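
The two schedules are easier to compare once they are in the same time zone. This small sketch converts between UTC and SGT using a fixed UTC+8 offset; the helper name and the calendar date are arbitrary choices made only for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Singapore Time is UTC+8 all year (no daylight saving).
SGT = timezone(timedelta(hours=8), name="SGT")


def utc_hour_to_sgt(hour_utc):
    """Convert an hour of the day in UTC to the hour of the day in SGT."""
    moment = datetime(2021, 12, 1, hour_utc, tzinfo=timezone.utc)
    return moment.astimezone(SGT).hour


print(utc_hour_to_sgt(12))  # workflow cron at 12:00 UTC -> 20 (8PM SGT)
print(utc_hour_to_sgt(0))   # 00:00 UTC -> 8 (8AM SGT, the Lambda run)
```

Run together, the Lambda function and the workflow would refresh the data roughly every 12 hours.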

YAML File or Link to Code

run-python.yml

# This is a basic workflow to help you get started with Actions

name: Run Python

# Controls when the action will run. 
on:
  schedule:
    # run at 12PM UTC (8PM SGT)
    - cron: '0 12 * * *'

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # use environment (named as "env") defined in the GitHub repository settings
    environment: env

    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      -
        name: Checkout
        uses: actions/checkout@v2

      # Set up Python 3.9 environment and cache pip packages
      - 
        name: Setup Python 3.9
        uses: actions/setup-python@v2  # assumed major tag; the exact pinned release is not recoverable
        with:
          python-version: '3.9'
          cache: 'pip'

      # Install dependencies mentioned in the requirements.txt
      - 
        name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      # Run a bash shell and store env secrets in parameters to pass to Python script
      -
        name: Get Parameters & Run Air Quality Index India Python script
        shell: bash
        env:
          GOVINAPIKEY: ${{ secrets.DATA_GOV_IN_API_KEY }}
          GDRIVEAPIKEY: ${{ secrets.GDRIVE_API_CREDENTIALS }}
        run: |
          python "Air Quality Index India.py" "$GOVINAPIKEY" "$GDRIVEAPIKEY"

Full repository is here:

Air-Quality-Index-India-GitHub-Actions

Repo for 2021 GitHub Actions Hackathon on DEV

Python script "Air Quality Index India.py"

  • The script connects to the data.gov.in API using the API key passed as a parameter.
  • It then pulls the latest AQI data for India and stores it in a pandas DataFrame.
  • The data is cleaned and formatted, the columns are renamed, and nulls are replaced by 0.
  • The script authenticates with Google and connects to the Google Sheet using pygsheets.
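
The cleaning step described above can be sketched with pandas as follows. The raw field names, the display names in COLUMN_NAMES and the sample rows are all made up for illustration; the real script's columns may differ.

```python
import pandas as pd

# Hypothetical mapping from raw API field names to display names.
COLUMN_NAMES = {
    "city": "City",
    "pollutant_id": "Pollutant",
    "pollutant_avg": "Average",
}


def clean_records(records):
    """Turn raw records into a tidy DataFrame with nulls replaced by 0."""
    frame = pd.DataFrame.from_records(records)
    # Non-numeric readings such as "NA" become NaN here, then 0 below.
    frame["pollutant_avg"] = pd.to_numeric(frame["pollutant_avg"], errors="coerce")
    frame = frame.fillna(0)
    return frame.rename(columns=COLUMN_NAMES)


# Made-up sample rows, for illustration only.
sample = [
    {"city": "Delhi", "pollutant_id": "PM2.5", "pollutant_avg": "180"},
    {"city": "Pune", "pollutant_id": "NO2", "pollutant_avg": "NA"},
]
cleaned = clean_records(sample)
print(cleaned)
```

Writing the result to the sheet would then be a pygsheets call such as Worksheet.set_dataframe, which needs real credentials and is left out of this sketch.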

Additional Resources / Info

Tableau Public Dashboard that uses the data from the generated Google Sheets: "Air Quality - Pollutant Index - India"

