
Pandas For Beginners — Combining Dataframes — Part 1

 9 months ago
source link: https://ujjwal-dalmia.medium.com/pandas-for-beginners-combining-dataframes-part-1-6140cb4aa26b

Concatenate and append dataframes.

Photo by Ricardo Gomez Angel on Unsplash

A single data table is rarely sufficient for an analytical project. To build a workable dataset, we usually gather information from multiple sources and stitch it together. Stitching can be done in several ways, such as concatenating, appending, merging, and joining. Pandas offers a set of functions for each of these. In this tutorial, we focus only on the concatenation aspect of combining dataframes.

Assumption and Recommendation

Being hands-on is the key to mastering programming. We recommend that you run the code yourself as you follow the tutorial. The sample data and the associated Jupyter notebook are available in the Scenario_14 folder of this GitHub link.

If you are new to GitHub and want to learn it, please go through this tutorial. To set up a new Python environment on your system, please go through this tutorial.

The following is the list of Python concepts and pandas functions/methods used in the tutorial:

Pandas functions

  • read_csv
  • append
  • concat
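As a rough orientation before the walkthrough, here is a minimal sketch of how read_csv and concat fit together. The CSV text and the names csv_a, csv_b, and stacked are hypothetical stand-ins, not the tutorial's actual files. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions row-wise appending is done with concat, as assumed here.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for files on disk (placeholder values).
csv_a = "Country_name,2010\nCountry_A,100\nCountry_B,200\n"
csv_b = "Country_name,2010\nCountry_C,300\n"

# read_csv accepts a path or any file-like object; StringIO keeps this self-contained.
df_a = pd.read_csv(io.StringIO(csv_a))
df_b = pd.read_csv(io.StringIO(csv_b))

# Stack the rows of both frames; ignore_index=True renumbers the result from 0.
# This covers what DataFrame.append used to do before its removal in pandas 2.0.
stacked = pd.concat([df_a, df_b], ignore_index=True)

print(stacked)
```

With real files, the StringIO objects would simply be replaced by file paths.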

Getting Started

Step 1 — Keeping the data ready

For this tutorial, we have created three dummy data files. The first file has the population of five countries for the years 2010 to 2014. The second file has the same information for five more countries over the same period. The third file covers all ten countries (those in files 1 and 2), but for the years 2015 to 2019. The data dictionary and a sample data snapshot are as follows:

  • Country_name: Name of the country
  • 2010 … 2014: Population for each year from 2010 to 2014
  • 2015 … 2019: Population for each year from 2015 to 2019

Sample Data Snapshot (Image by Author)
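To make the layout of the three files concrete, the sketch below builds stand-in dataframes with the same shape as described above and combines them. Everything here is an assumption for illustration: the country labels, the population numbers, and the helper make_frame are placeholders, and the combining strategy (rows first, then columns) is one plausible approach rather than necessarily the one the tutorial follows.

```python
import pandas as pd

# Year columns as strings, matching how read_csv would parse header names.
years_early = [str(y) for y in range(2010, 2015)]
years_late = [str(y) for y in range(2015, 2020)]


def make_frame(countries, years, start):
    """Hypothetical helper: one row per country, one column per year, placeholder values."""
    data = {"Country_name": countries}
    for year in years:
        data[year] = [start + i for i in range(len(countries))]
    return pd.DataFrame(data)


first_five = [f"Country_{i}" for i in range(1, 6)]
next_five = [f"Country_{i}" for i in range(6, 11)]

file_1 = make_frame(first_five, years_early, start=100)              # 5 countries, 2010 to 2014
file_2 = make_frame(next_five, years_early, start=200)               # next 5 countries, 2010 to 2014
file_3 = make_frame(first_five + next_five, years_late, start=300)   # all 10 countries, 2015 to 2019

# Files 1 and 2 share the same columns, so they stack row-wise.
early = pd.concat([file_1, file_2], ignore_index=True)

# File 3 has the same countries but different years, so it lines up column-wise.
# Setting Country_name as the index lets concat align rows by country.
combined = pd.concat(
    [early.set_index("Country_name"), file_3.set_index("Country_name")],
    axis=1,
)

print(combined.shape)
```

Aligning on the Country_name index rather than on row position is a deliberate choice in this sketch: it keeps each country's years together even if the files list the countries in different orders.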

Step 2 — Importing pandas…

