Erik Marsja: 9 Data Visualization Techniques You Should Learn in Python
source link: https://www.tuicool.com/articles/Yn2uMfi
Correlogram in Python
We continue with a Python data visualization example in which we use Seaborn's heatmap method to create a correlation plot. Note that a correlogram is a way to visualize a correlation matrix. Before creating the correlogram with Seaborn, we use the Pandas corr method to compute the correlation matrix. We then use NumPy to build a mask that hides the upper half of the matrix, which only mirrors the lower half.
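    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import seaborn as sns

    # df is assumed to be a DataFrame with numeric columns, loaded beforehand
    # Correlation matrix
    corr = df.corr()

    # Boolean mask that hides the upper triangle (diagonal included);
    # the plain bool type replaces np.bool, which newer NumPy versions removed
    mask = np.zeros_like(corr, dtype=bool)
    mask[np.triu_indices_from(mask)] = True

    fig = plt.figure(figsize=(12, 8))
    sns.heatmap(corr, mask=mask, vmax=.3, center=0, square=True,
                linewidths=.5, cbar_kws={"shrink": .5})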
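To make the masking step easier to follow, here is a minimal sketch on a toy DataFrame. The data and the names `toy` and `visible` are made up for illustration and are not part of the original post. It only shows which cells of the correlation matrix survive the mask; no plot is drawn.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the article): three numeric columns.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],   # rises with "a"
    "c": [4.0, 3.0, 2.0, 1.0],   # falls as "a" rises
})

corr = toy.corr()

# True marks the cells to hide: the upper triangle, diagonal included.
mask = np.triu(np.ones(corr.shape, dtype=bool))

# DataFrame.mask replaces the hidden cells with NaN, leaving only the
# cells below the diagonal, which are the ones the heatmap would draw.
visible = corr.mask(mask)
print(visible.round(2))
```

For a 3 x 3 matrix this hides six cells and keeps the three pairwise correlations below the diagonal, which is all the information a symmetric correlation matrix contains.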
Violin Plots in Python using Seaborn
In the next Python data visualization example we are going to learn how to create a violin plot using Seaborn. A violin plot displays the distribution of the data as a mirrored estimate of its probability density. In addition, Seaborn draws a miniature box plot inside the violin, so we also see the median of the data (the white dot in the center of the box plot, in the image below).
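    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    # Load the mtcars dataset; the first column holds the car names
    df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv', index_col=0)

    fig = plt.figure(figsize=(12, 8))
    # Distribution of weight (wt) for each engine type (vs)
    sns.violinplot(x="vs", y='wt', data=df)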
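As a rough illustration of what the violin encodes, the sketch below computes a Gaussian kernel density estimate and the quartiles by hand with NumPy. The sample values, the helper `gaussian_kde_1d`, and the fixed bandwidth are assumptions chosen for illustration; Seaborn picks its bandwidth automatically, so its curve will not match this one exactly.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm

# Hypothetical toy weights (not the mtcars values).
weights = np.array([2.3, 2.6, 2.8, 3.2, 3.4, 3.4, 3.6, 5.3])

# The outline of the violin: density evaluated along the value axis.
grid = np.linspace(weights.min() - 2, weights.max() + 2, 400)
density = gaussian_kde_1d(weights, grid, bandwidth=0.4)  # bandwidth set by hand

# The inner box plot: median and quartiles.
median = np.median(weights)
q1, q3 = np.percentile(weights, [25, 75])

# A density should integrate to roughly 1 over the grid.
area = density.sum() * (grid[1] - grid[0])
print(f"median={median:.2f}, q1={q1:.2f}, q3={q3:.2f}, area={area:.3f}")
```

Plotting `density` on both sides of a vertical axis gives the violin shape, and the median and quartiles give the box drawn inside it.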
Raincloud Plots in Python using ptitprince
Finally, we are going to learn how to create a “Raincloud Plot” in Python. As mentioned at the beginning of the post, we need to install the package ptitprince to create this data visualization (pip install ptitprince).
Now you may wonder what a Raincloud Plot is. It is a very informative way to display your raw data (remember, bar plots may not be the best method). A Raincloud Plot combines a box plot, a violin plot (drawn as a half violin, the “cloud”), and a scatter plot of the raw observations (the “rain”).
Python Raincloud Plot Example:
    import pandas as pd
    import ptitprince as pt

    # Load the iris dataset
    df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv')

    # Horizontal raincloud plot of sepal length for each species
    ax = pt.RainCloud(x='Species', y='Sepal.Length', data=df,
                      width_viol=.8, width_box=.4,
                      figsize=(12, 8), orient='h', move=.0)
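The three layers of a raincloud plot can also be thought of as three small computations on the same sample. The sketch below prepares the data behind each layer for a single group with NumPy only. It is not how ptitprince works internally; the sample values, offsets, jitter width, and bandwidth are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the jitter is reproducible

# Hypothetical toy sample standing in for one group (e.g. one species).
values = np.array([4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.4, 7.0])
baseline = 0.0       # position of this group on the categorical axis
jitter_width = 0.1   # assumed spread of the rain, not a ptitprince default

# "Rain": every raw observation, shifted below the baseline and jittered
# so that points with similar values do not sit on top of each other.
rain_positions = baseline - 0.2 + rng.uniform(-jitter_width, jitter_width,
                                              size=values.size)

# Box plot: the quartiles and the median of the same observations.
q1, median, q3 = np.percentile(values, [25, 50, 75])

# "Cloud": a density estimate drawn on one side of the baseline only,
# i.e. half of a violin. The bandwidth is picked by hand here.
bandwidth = 0.3
grid = np.linspace(values.min() - 1, values.max() + 1, 200)
z = (grid[:, None] - values[None, :]) / bandwidth
density = np.exp(-0.5 * z ** 2).sum(axis=1) / (
    values.size * bandwidth * np.sqrt(2 * np.pi))
cloud_edge = baseline + density  # never drops below the baseline

print(f"q1={q1:.2f}, median={median:.2f}, q3={q3:.2f}")
```

Drawing `cloud_edge` against `grid`, the box from the quartiles, and the points at `rain_positions` would stack the cloud, box, and rain in the same order as in the ptitprince figure.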
Summary
In this Python data visualization tutorial we have learned how to create 9 different plots in Python, using Seaborn and, for the raincloud plot, ptitprince. More precisely, we have created a scatter plot, histogram, bar plot, time series plot, box plot, heat map, correlogram, violin plot, and raincloud plot. All these data visualization techniques can be useful for exploring and displaying your data before carrying on with the parametric data analysis. They are also very handy for visualizing data so that other researchers can get some information about different aspects of your data.
Leave a comment below if there are any data visualization methods that we need to cover in more detail. Here’s a link to a Jupyter notebook containing all the 9 examples covered in this post.
References
Allen M, Poggiali D, Whitaker K, et al. Raincloud plots: a multi-platform tool for robust data visualization [version 1; peer review: 2 approved]. Wellcome Open Res 2019, 4:63. https://doi.org/10.12688/wellcomeopenres.15191.1
Weissgerber TL, Milic NM, Winham SJ, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLOS Biology 13(4): e1002128. https://doi.org/10.1371/journal.pbio.1002128