7 Data Science Project Ideas for Aspiring Data Scientists

source link: https://towardsdatascience.com/7-data-science-project-ideas-for-aspiring-data-scientists-7defd62e07a7?gi=405e2c33c890

A beginner-friendly list of data science projects for May 2020

May 5 · 4 min read

Photo by Max Duzij on Unsplash

Due to popular demand, I decided to put together a list of data science projects for those who are beginning their journey as data scientists. There’s a mix of visualization, exploratory data analysis, and predictive modeling projects. I hope you enjoy this article and wish you the best of luck in your endeavors!

Rainfall in India

Project type: Visualization

Link to dataset here.

Photo by Julian Yu on Unsplash

This dataset contains monthly rainfall data for 36 subdivisions of India. Here are some visualization ideas you can try for yourself:

  • You can create bar graphs or pie charts to compare the amount of rainfall by region
  • You can create a line graph to compare rainfall by region over time
  • You can create an animated choropleth map to show where it rains over time! If you want to learn how to build a choropleth visualization, check out my tutorial here.
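The first two ideas come down to summing rainfall by region and by year before you plot anything. Here is a minimal sketch of that step in plain Python. The rows, region names, and numbers below are invented for illustration, and the real dataset’s columns may look different.

```python
from collections import defaultdict

# Hypothetical sample rows: (subdivision, year, month, rainfall in mm).
# These values are made up; the real dataset's layout may differ.
rows = [
    ("Kerala", 2000, "JUN", 650.0),
    ("Kerala", 2000, "JUL", 700.0),
    ("Punjab", 2000, "JUN", 50.0),
    ("Punjab", 2000, "JUL", 150.0),
    ("Kerala", 2001, "JUN", 600.0),
    ("Punjab", 2001, "JUN", 40.0),
]


def total_by_region(rows):
    """Sum rainfall per subdivision: the bar heights of a bar graph."""
    totals = defaultdict(float)
    for region, _year, _month, mm in rows:
        totals[region] += mm
    return dict(totals)


def yearly_series(rows, region):
    """Rainfall per year for one subdivision: the points of a line graph."""
    series = defaultdict(float)
    for name, year, _month, mm in rows:
        if name == region:
            series[year] += mm
    return sorted(series.items())


totals = total_by_region(rows)
kerala = yearly_series(rows, "Kerala")
print(totals)
print(kerala)
```

Once the totals and yearly series exist, any plotting library can turn them into the bar and line graphs described above.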

Global Suicide Rates

Project type: Exploratory Data Analysis

Link to dataset here.

Photo by Sasha Freemind on Unsplash

This is a consolidated dataset with details on suicide rates, human development index (HDI) numbers, GDP, and demographics by country and year. The dataset was assembled to see whether any indicators are correlated with increased suicide rates.

Explore the data and see which countries and continents have the highest suicide rates. What trends do you notice? Are suicide rates increasing or decreasing overall? What is the ratio of male to female suicides? See if you can find any variables that are correlated with suicide rates.
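To check whether a variable is correlated with the rate, one common starting point is the Pearson correlation coefficient. Below is a small sketch that computes it by hand. The two lists are invented placeholder numbers with no real-world meaning, and the variable names are my own, not columns from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, invented purely to exercise the function.
indicator = [1.0, 2.0, 3.0, 4.0]
rate = [12.0, 10.0, 8.0, 6.0]

r = pearson(indicator, rate)
print(round(r, 3))
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests none. Keep in mind that a correlation on its own does not tell you one variable causes the other.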

Summer Olympic Medals

Project type: Exploratory Data Analysis

Link to dataset here.

Photo by Bryan Turner on Unsplash

On a less morbid note, here’s a dataset that contains all of the medal winners in the Summer Olympics from 1976 Montreal to 2008 Beijing. Explore the data and see which countries have won the most medals overall. Are there countries that are performing better over time? What about worse over time?
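Both questions are counting problems. Here is a rough sketch using a Counter. The country codes and rows are hypothetical, and I am assuming one row per medal winner, so a team event could contribute several rows for the same country.

```python
from collections import Counter

# Hypothetical rows: (year, country code, medal). Invented for illustration.
medals = [
    (1976, "AAA", "Gold"),
    (1976, "AAA", "Silver"),
    (1976, "BBB", "Bronze"),
    (2008, "AAA", "Gold"),
    (2008, "BBB", "Gold"),
    (2008, "BBB", "Silver"),
    (2008, "BBB", "Bronze"),
]

# Medals per country across all Games.
overall = Counter(country for _year, country, _medal in medals)

# Medals per (year, country), used to compare one Games against another.
per_games = Counter((year, country) for year, country, _medal in medals)


def change(country, first_year, last_year):
    """Difference in medal count between two Games (positive = improving)."""
    return per_games[(last_year, country)] - per_games[(first_year, country)]


top = overall.most_common(1)[0]
print(top)
print(change("AAA", 1976, 2008), change("BBB", 1976, 2008))
```

Comparing only the first and last Games is a crude measure. With the full dataset you would probably want to look at every Games in between as well.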

World Happiness Report

Project type: Exploratory Data Analysis

Link to dataset here.

Photo by KAL VISUALS on Unsplash

The happiness score is a quantifiable measurement of the average ‘happiness’ of a country. This is based on six factors: economic production, social support, life expectancy, freedom, absence of corruption, and generosity.

This dataset contains 155 countries and their associated happiness scores and six factors from 2015 to 2019. Are we globally becoming more or less happy each year? Which continent is the happiest? The least happy? Which of the six factors has the biggest impact on happiness? What about the least impact?
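For the first question, a simple approach is to average the happiness score across countries for each year and compare the results. A minimal sketch, using invented scores for two placeholder countries:

```python
from statistics import mean

# Hypothetical rows: (year, country, happiness score). Values are made up.
scores = [
    (2015, "A", 5.0),
    (2015, "B", 7.0),
    (2019, "A", 5.5),
    (2019, "B", 7.5),
]

# Group the scores by year.
by_year = {}
for year, _country, score in scores:
    by_year.setdefault(year, []).append(score)

# Average score per year, in chronological order.
yearly_mean = {year: mean(values) for year, values in sorted(by_year.items())}
print(yearly_mean)
```

An unweighted average treats every country equally regardless of population, which is a choice worth stating when you present the result.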

Pollution in the United States

Project type: Visualization

Link to dataset here.

Photo by Ella Ivanescu on Unsplash

This dataset contains information on the four major pollutants (Nitrogen Dioxide, Sulphur Dioxide, Carbon Monoxide, and Ozone) for every day from 2000 to 2016 in the United States.

Here are some visualization ideas:

  • Which states are the biggest polluters? The least?
  • How have US pollution levels changed over time? Are they higher or lower than they were 10 years earlier?
  • See if you can create a choropleth map to show geographically the level of pollution over time!

Nutrition Facts for McDonald’s Menu

Project type: Exploratory Data Analysis

Link to dataset here.

Photo by XUNO. on Unsplash

This dataset provides a nutrition analysis of every menu item on the US McDonald’s menu, including breakfast, beef burgers, chicken and fish sandwiches, fries, salads, soda, coffee and tea, milkshakes, and desserts.

How many calories does the average McDonald’s value meal contain? Is it really healthier to order grilled chicken instead of crispy? What is the healthiest combination of items that would meet your daily nutritional requirements?
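The first two questions can be answered with a lookup and a keyword filter. Here is a small sketch. The item names and calorie counts are placeholders I invented, not figures from the dataset, and matching on the words "grilled" and "crispy" in the item name is my own assumption about how the items are labeled.

```python
# Hypothetical menu: item name -> calories. All numbers are invented.
menu = {
    "Crispy Chicken Sandwich": 500,
    "Grilled Chicken Sandwich": 350,
    "Medium Fries": 340,
    "Medium Soda": 200,
}


def average_calories(menu, keyword):
    """Average calories of items whose name contains the keyword.

    Raises ZeroDivisionError if no item matches.
    """
    matches = [cal for name, cal in menu.items() if keyword.lower() in name.lower()]
    return sum(matches) / len(matches)


def meal_calories(menu, items):
    """Total calories for a list of item names."""
    return sum(menu[name] for name in items)


grilled = average_calories(menu, "grilled")
crispy = average_calories(menu, "crispy")
meal = meal_calories(menu, ["Crispy Chicken Sandwich", "Medium Fries", "Medium Soda"])
print(grilled, crispy, meal)
```

Calories are only one measure of "healthier". The same filter works for any other nutrient column you care about.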

Red Wine Quality

Project type: Predictive Modeling

Link to dataset here.

Photo by Terry Vlisidis on Unsplash

This dataset contains data on various wines, their composition, and their quality. This can be a regression or classification problem depending on how you frame it. See if you can predict the quality of a red wine given 11 inputs (fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulfates, and alcohol).
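To make the regression versus classification framing concrete, here is a sketch that builds both targets from the same quality scores and computes a naive baseline for each. The scores are invented, and the cutoff of 7 for a "good" wine is my own assumption, not something the dataset defines.

```python
from statistics import mean

# Hypothetical quality scores, invented for illustration.
quality = [5, 6, 7, 5, 8, 6]

# Regression framing: the target is the score itself.
# Naive baseline: always predict the mean score, scored by mean absolute error.
baseline = mean(quality)
mae = mean(abs(q - baseline) for q in quality)

# Classification framing: the target is "good" or "not good".
# The threshold is an assumption chosen for this example.
GOOD_THRESHOLD = 7
labels = [q >= GOOD_THRESHOLD for q in quality]

# Naive baseline: always predict the majority class, scored by accuracy.
majority = labels.count(True) > len(labels) / 2
accuracy = labels.count(majority) / len(labels)

print(round(mae, 3), round(accuracy, 3))
```

Any model you train on the 11 inputs should beat these baselines. If it doesn’t, that is a sign something is off in the features or the evaluation.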

Thanks for Reading!

If you like my work and want to support me, I’d greatly appreciate it if you followed me on my social media channels:

  1. The BEST way to support me is by following me on Medium here.
  2. Follow me on Twitter here.
  3. Subscribe to my new YouTube channel here.
  4. Follow me on LinkedIn here.
  5. Sign up on my email list here.
  6. Check out my website terenceshin.com.
