
7 Tools to Create A Rockstar Data Science Portfolio

 4 years ago
source link: https://mc.ai/7-tools-to-create-a-rockstar-data-science-portfolio/

GitHub Pages

GitHub is probably the most obvious tool for any technical portfolio. Each project can be showcased as a standalone repository with documentation and a README.md.

However, if you have a number of projects in separate repositories, it is better to collate them, together with some notes, on your own website. GitHub has a really nice tool for creating a simple website, known as GitHub Pages.

To create your website, go to your GitHub account and select the create new repository option, just as you would for any new project. However, the repository name must be your GitHub username followed by .github.io.

Next, clone the repository in the usual way. From the command line, type:

git clone https://github.com/myaccount/myaccount.github.io.git

Navigate to the new directory.

cd myaccount.github.io

Add an index.md file with a little content.

touch index.md

I’ve added the following content. It is plain HTML, which is allowed inside a Markdown file.

<html>
<head>
 <title>My Portfolio</title>
</head>
<body>
 <h1>Example projects</h1>
 <footer>rvickery 2020</footer>
</body>
</html>

Commit and push the changes to the GitHub repository.

git add --all
git commit -m "adding landing page"
git push

Now go to the settings in your repository, scroll down to the GitHub Pages section and select the master branch as the source.

The URL for your website now appears in the settings.

Now you can visit your site.
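
Once the site is live, collating several repositories on the landing page can be scripted. The following is a minimal sketch rather than the method used above: make_index is a hypothetical helper, and the account and repository names are placeholders. It writes Markdown, which GitHub Pages renders by default.

```shell
#!/bin/sh
# Sketch: build a Markdown landing page that links to each project repository.
# make_index is a hypothetical helper; "myaccount" and the repository names
# below are placeholders, not real projects.

# Usage: make_index ACCOUNT REPO [REPO ...]
make_index() {
  account="$1"
  shift
  printf '# My Portfolio\n\n## Example projects\n\n'
  for repo in "$@"; do
    # One bullet per repository, linking to its GitHub page.
    printf '* [%s](https://github.com/%s/%s)\n' "$repo" "$account" "$repo"
  done
}

# Write the page; commit and push it as before to publish the update.
make_index myaccount churn-model sales-forecast > index.md
```

Running the script again after adding a repository name regenerates the whole page, so the list of projects stays in one place.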

WordPress

If you want a richer-looking portfolio website, WordPress is worth a look.

WordPress is a completely code-free way to develop a website, has a number of themes that I found well suited to a portfolio and is completely free if you don’t use a custom domain. I have just started to move my portfolio here this week and I have used the Penscratch 2 theme. Below is a screenshot of my landing page.

All the themes are very flexible, and it is very simple to add new pages and change the layout to fit your style. I have found this to be a very fast way to produce a professional-looking portfolio.

Kaggle

Once you have chosen your tools for creating your portfolio you need to create some projects to add to it.

Kaggle is a popular machine learning competition website and has a number of features that can be used to create projects to showcase your skills.

The competitions themselves contain some very interesting data problems, and the data sets can be complex and challenging. There is also a leaderboard, which enables you to benchmark your skills against others.

Kaggle also has a tool known as Kaggle Kernels. These are hosted notebooks for developing and running your data science code. It is very easy to import the Kaggle data sets and work on them directly, and the code you write is public and therefore easily shared in a portfolio. The kernels also have a connection to Google Cloud services, which allows you to showcase your skills in querying data stored in relational databases and accessing cloud computing resources.

Kaggle also maintains an archive of publicly available data sets. A wide variety of data sets is available here, and they are great for creating more novel projects and showcasing exploratory data analysis.

DrivenData

If you want to showcase your skills and help solve social problems at the same time then drivendata.org is a great place to start. Similar to Kaggle, this website hosts data science competitions with data sets and leaderboards, but the difference here is that at the heart of all competitions is a social problem to solve with data science.

The data sets behind these competitions are interesting and range from beginner-friendly to the more challenging.

Jovian.ml

If you are looking for an alternative to GitHub for tracking, organising or sharing your data science code, Jovian.ml is a good option. This is an open-source project for hosting and sharing notebook-based data science projects.

You can create unlimited public projects, each of which can consist of notebooks, files (such as helper files) and environment configuration files. Jovian can also track output files such as model versions, and is therefore in some ways better suited than GitHub to storing data science projects and experiments.

The jovian.ml profile page

Google Cloud Platform

The vast majority of businesses these days are moving to or are already utilising cloud data storage and computing power. It is, therefore, a good idea to include some practical examples of applying your skills in this area. The Google Cloud Platform (GCP) has a number of tools for showcasing your work.

GCP has a free usage tier so you should be able to create a number of projects for free.

BigQuery is Google’s cloud data warehouse product. Within the product, Google hosts a wide range of public data sets which can be accessed through the BigQuery web UI. The BigQuery documentation explains how to access these data sets.
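
Beyond the web UI, the public data sets can also be queried from a terminal. The following is a minimal sketch, assuming the Google Cloud SDK's bq command-line tool is installed and authenticated. The helper names top_names_query and run_top_names are hypothetical, and the usa_names table is one of the public data sets, chosen here only for illustration.

```shell
#!/bin/sh
# Sketch: query a BigQuery public data set with the bq command-line tool.
# Assumes the Google Cloud SDK is installed and authenticated.
# top_names_query and run_top_names are hypothetical helper names.

# Print a standard SQL query for the N most common names in the table.
top_names_query() {
  printf 'SELECT name, SUM(number) AS total\n'
  printf 'FROM `bigquery-public-data.usa_names.usa_1910_2013`\n'
  printf 'GROUP BY name\n'
  printf 'ORDER BY total DESC\n'
  printf 'LIMIT %s\n' "$1"
}

# Send the query to BigQuery; --use_legacy_sql=false selects standard SQL.
run_top_names() {
  bq query --use_legacy_sql=false "$(top_names_query "$1")"
}

# Example call (needs a GCP project and credentials):
#   run_top_names 10
```

Keeping the SQL in its own function makes it easy to paste the same query into a notebook or the web UI when writing up the project.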

Once you have access to some data in BigQuery, you can create projects using the GCP AI Hub, which includes cloud-hosted notebooks, model deployment tools and machine learning APIs.

UCI machine learning repository

The UCI machine learning repository is another great source of publicly available data sets. There are nearly 500 data sets available on the website and these cover a wide range of data science project types.

They are helpfully organised into categories around the task type, data type, area and size of data set. These data sets again provide a great resource to put together data science projects for your portfolio.
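
Some of these data sets, such as Iris, are distributed as plain comma-separated text, so a first look needs very little tooling. The following is a minimal sketch, assuming a file in the Iris layout (four measurements followed by a class label). The helpers sample_rows and class_counts are hypothetical, and the four rows are only a small sample of the data set.

```shell
#!/bin/sh
# Sketch: count rows per class label in a CSV laid out like the Iris data set.
# sample_rows and class_counts are hypothetical helper names.

# A few sample rows: four measurements, then the class label.
sample_rows() {
  cat <<'EOF'
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
EOF
}

# Read CSV on stdin, skip blank lines, and tally the last field of each row.
class_counts() {
  awk -F',' 'NF { counts[$NF]++ } END { for (c in counts) print c, counts[c] }' | sort
}

sample_rows | class_counts
```

With a downloaded file, the same check becomes class_counts < iris.data, which is a quick way to confirm the class balance before starting a project.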

