
Erik Marsja: 9 Data Visualization Techniques You Should Learn in Python

 5 years ago
source link: https://www.tuicool.com/articles/Yn2uMfi

Correlogram in Python

We continue with a Python data visualization example in which we use Seaborn's heatmap method to create a correlation plot. A correlogram is a way to visualize a correlation matrix. Before we create the correlogram with Seaborn, we use the Pandas corr method to compute the correlation matrix. We then use NumPy to build a mask that hides the upper half of the matrix (including the diagonal), because the matrix is symmetric and the upper half only repeats the lower half.

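import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Correlation matrix (df is assumed to be a DataFrame with numeric columns,
# loaded beforehand)
corr = df.corr()

# Boolean mask: True hides a cell. np.bool was removed from recent NumPy
# releases, so the built-in bool is used instead.
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True

fig = plt.figure(figsize=(12, 8))
sns.heatmap(corr, mask=mask, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})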
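To see what the mask does without drawing anything, the following minimal sketch applies the same idea to a small, made-up DataFrame (the `toy` data and its column names are hypothetical and not part of the original post). It prints only the cells that would stay visible in the heatmap.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the article): three numeric columns.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.1, 5.9, 8.2, 9.8],
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],
})

corr = toy.corr()

# True marks the cells the heatmap should hide:
# the upper triangle, including the diagonal.
mask = np.triu(np.ones_like(corr, dtype=bool))

# DataFrame.mask replaces cells where the condition is True with NaN,
# leaving only the cells that would remain visible in the plot.
visible = corr.mask(mask)
print(visible.round(2))
```

Only the three cells below the diagonal keep their values; everything else is NaN, which mirrors the blank cells in the masked heatmap.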

Violin Plots in Python using Seaborn

In the next Python data visualization example we are going to learn how to create a violin plot using Seaborn. A violin plot displays the distribution of the data as an estimate of its probability density. Furthermore, we get a visualization of the median of the data (the white dot in the center of the small box plot drawn inside each violin).

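import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Load the mtcars data; the first column holds the car names
df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv', index_col=0)

# One violin per engine type (vs), showing the distribution of weight (wt)
fig = plt.figure(figsize=(12, 8))
sns.violinplot(x="vs", y="wt", data=df)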
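The outline of a violin is a kernel density estimate, and the inner box marks the median and quartiles. The sketch below computes both by hand for one group so the ingredients are visible. It is a simplified illustration, not Seaborn's implementation: the sample values, the helper `gaussian_kde_1d`, and the choice of Scott's rule for the bandwidth are all assumptions made for this example.

```python
import numpy as np


def gaussian_kde_1d(values, grid, bandwidth):
    """Average of Gaussian bumps centred on each observation."""
    values = np.asarray(values, dtype=float)
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth


# Hypothetical sample standing in for one group of car weights.
weights = np.array([2.3, 2.6, 2.8, 3.2, 3.2, 3.4, 3.5, 3.6, 3.8, 5.3])

# Scott's rule of thumb for the bandwidth (an assumption for this sketch).
bandwidth = weights.std(ddof=1) * len(weights) ** (-1 / 5)

# Evaluate the density a little beyond the observed range.
grid = np.linspace(weights.min() - 3 * bandwidth,
                   weights.max() + 3 * bandwidth, 400)
density = gaussian_kde_1d(weights, grid, bandwidth)

# Summary statistics shown by the inner box plot.
median = np.median(weights)
q1, q3 = np.percentile(weights, [25, 75])

print(f"median={median:.2f}, quartiles=({q1:.2f}, {q3:.2f})")
```

Plotting `density` against `grid`, mirrored around a vertical axis, would give the violin shape for this group.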

Raincloud Plots in Python using ptitprince

Finally, we are going to learn how to create a “Raincloud Plot” in Python. As mentioned at the beginning of the post, we need to install the package ptitprince to create this data visualization (pip install ptitprince).

Now you may wonder what a Raincloud Plot is. It is a very informative way to display your raw data (remember, bar plots may not be the best method). A Raincloud Plot combines three elements: a box plot, a half violin plot (the “cloud”), and a scatter plot of the individual observations (the “rain”).

Python Raincloud Plot Example:

import pandas as pd
import ptitprince as pt

df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv')

# Horizontal raincloud plot of sepal length for each species
ax = pt.RainCloud(x='Species', y='Sepal.Length',
                  data=df,
                  width_viol=.8,
                  width_box=.4,
                  figsize=(12, 8), orient='h',
                  move=.0)
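To make the three layers of a raincloud plot explicit, here is a rough sketch that assembles one with plain Matplotlib instead of ptitprince. It is only an approximation of what the package produces: the synthetic data, the group names, and the offsets and widths are assumptions chosen for illustration, and the plot is vertical rather than horizontal.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical synthetic data: three groups of 40 observations each.
rng = np.random.default_rng(0)
names = ["group A", "group B", "group C"]
groups = [rng.normal(loc=mu, scale=0.5, size=40) for mu in (5.0, 6.0, 6.5)]
positions = np.arange(1, len(groups) + 1)

fig, ax = plt.subplots(figsize=(12, 8))

# "Cloud": draw full violins, then clip away the left half of each one.
parts = ax.violinplot(groups, positions=positions, widths=0.8,
                      showextrema=False)
for body, pos in zip(parts["bodies"], positions):
    verts = body.get_paths()[0].vertices
    verts[:, 0] = np.clip(verts[:, 0], pos, None)

# Box plot: a narrow box on the centre line of each group.
ax.boxplot(groups, positions=positions, widths=0.1, showfliers=False)

# "Rain": the raw observations, jittered and shifted to the left.
for values, pos in zip(groups, positions):
    jitter = rng.uniform(-0.05, 0.05, size=values.size)
    ax.scatter(pos - 0.2 + jitter, values, s=10, alpha=0.6)

ax.set_xticks(positions)
ax.set_xticklabels(names)
ax.set_ylabel("value")
```

In a Jupyter notebook the figure is displayed automatically; in a script, call plt.show() or fig.savefig(...) afterwards.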


Summary

In this Python data visualization tutorial we have learned how to create 9 different plots using Python libraries such as Seaborn and ptitprince. More precisely, we have created a scatter plot, histogram, bar plot, time series plot, box plot, heat map, correlogram, violin plot, and raincloud plot. All these data visualization techniques can be useful for exploring and displaying your data before carrying on with parametric data analysis. They are also very handy for presenting data so that other researchers can get information about different aspects of it.

Leave a comment below if there are any data visualization methods that we need to cover in more detail. Here’s a link to a Jupyter notebook containing all 9 examples covered in this post.

References

Allen M, Poggiali D, Whitaker K et al. (2019) Raincloud plots: a multi-platform tool for robust data visualization [version 1; peer review: 2 approved]. Wellcome Open Res 4:63. https://doi.org/10.12688/wellcomeopenres.15191.1

Weissgerber TL, Milic NM, Winham SJ, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLOS Biology 13(4): e1002128. https://doi.org/10.1371/journal.pbio.1002128

