
DataFrame to Samples Dict

 2 years ago
source link: https://datacrayon.com/posts/plotapi/data-wrangling/dataframe-to-samples-dict/


Preamble

import pandas as pd
from plotapi import LineFight

LineFight.set_license("your username", "your license key")

Introduction

Plotapi's BarFight, PieFight, and LineFight expect a list of dict items that define the value of each node over time. Each item has three keys: "order" (the point in time), "name" (the node), and "value". The following is an example of this data structure.

samples = [
    {"order": 2000.01, "name": "Sankey", "value": 10},
    {"order": 2000.01, "name": "Terminus", "value": 10},
    {"order": 2000.01, "name": "Chord", "value": 40},
    {"order": 2000.01, "name": "Bar Fight", "value": 90},
    {"order": 2000.01, "name": "Pie Fight", "value": 70},

    {"order": 2000.02, "name": "Sankey", "value": 30},
    {"order": 2000.02, "name": "Terminus", "value": 20},
    {"order": 2000.02, "name": "Chord", "value": 40},
    {"order": 2000.02, "name": "Bar Fight", "value": 120},
    {"order": 2000.02, "name": "Pie Fight", "value": 55},

    {"order": 2000.03, "name": "Sankey", "value": 35},
    {"order": 2000.03, "name": "Terminus", "value": 45},
    {"order": 2000.03, "name": "Chord", "value": 60},
    {"order": 2000.03, "name": "Bar Fight", "value": 85},
    {"order": 2000.03, "name": "Pie Fight", "value": 100},

    {"order": 2000.04, "name": "Sankey", "value": 25},
    {"order": 2000.04, "name": "Terminus", "value": 60},
    {"order": 2000.04, "name": "Chord", "value": 90},
    {"order": 2000.04, "name": "Bar Fight", "value": 50},
    {"order": 2000.04, "name": "Pie Fight", "value": 105},

    {"order": 2000.05, "name": "Sankey", "value": 60},
    {"order": 2000.05, "name": "Terminus", "value": 80},
    {"order": 2000.05, "name": "Chord", "value": 120},
    {"order": 2000.05, "name": "Bar Fight", "value": 30},
    {"order": 2000.05, "name": "Pie Fight", "value": 95},
]
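Before handing a list like this to a diagram, it can help to confirm that every item carries all three keys. The following is a minimal sketch, assuming that "order", "name", and "value" are the only keys required; check_samples and REQUIRED_KEYS are hypothetical names used for illustration and are not part of Plotapi.

```python
# Hypothetical helper (not part of Plotapi): find items that are missing
# one of the three keys used in the example above.
REQUIRED_KEYS = {"order", "name", "value"}


def check_samples(samples):
    """Return the positions of items that lack a required key."""
    return [
        position
        for position, item in enumerate(samples)
        if not REQUIRED_KEYS.issubset(item)
    ]


demo = [
    {"order": 2000.01, "name": "Sankey", "value": 10},
    {"order": 2000.01, "name": "Chord"},  # "value" is missing
]
bad = check_samples(demo)  # positions of incomplete items
```

An empty result means every item has the expected keys.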

Dataset

Let's work backwards to the DataFrame, our starting point for this data wrangling exercise.

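df = (
    pd.DataFrame(samples)
    # one row per order, one column per name
    .pivot(index="order", columns="name", values="value")
    .reset_index()
    # drop the leftover "name" label from the column axis
    .rename_axis(None, axis=1)
)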

df
     order  Bar Fight  Chord  Pie Fight  Sankey  Terminus
0  2000.01         90     40         70      10        10
1  2000.02        120     40         55      30        20
2  2000.03         85     60        100      35        45
3  2000.04         50     90        105      25        60
4  2000.05         30    120         95      60        80
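The pivot above relies on every (order, name) pair appearing only once in the samples. If a pair is repeated, pivot raises a ValueError, and pivot_table can be used to aggregate the duplicates instead. The sketch below uses made-up records, and summing the duplicates is an assumption for illustration; the right aggregation depends on the data.

```python
import pandas as pd

# Made-up records: "Chord" appears twice for the same order.
records = [
    {"order": 2000.01, "name": "Chord", "value": 40},
    {"order": 2000.01, "name": "Chord", "value": 5},
    {"order": 2000.01, "name": "Sankey", "value": 10},
]
frame = pd.DataFrame(records)

# pivot() cannot reshape when an (index, column) pair is duplicated.
try:
    frame.pivot(index="order", columns="name", values="value")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates duplicates; "sum" is an assumption here.
wide = (
    frame.pivot_table(index="order", columns="name", values="value", aggfunc="sum")
    .reset_index()
    .rename_axis(None, axis=1)
)
```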

Great! Now let's work back to the samples dict.

Wrangling

Our journey back to the samples list of dict items will be through pandas.melt.

df_melted = pd.melt(
    df,
    id_vars="order",
    value_vars=list(df.columns[1:]),
    var_name="name",
    value_name="value",
)

df_melted.head(10)
     order       name  value
0  2000.01  Bar Fight     90
1  2000.02  Bar Fight    120
2  2000.03  Bar Fight     85
3  2000.04  Bar Fight     50
4  2000.05  Bar Fight     30
5  2000.01      Chord     40
6  2000.02      Chord     40
7  2000.03      Chord     60
8  2000.04      Chord     90
9  2000.05      Chord    120
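The melt call above spells out every argument. As a sketch on a smaller, made-up DataFrame, the same result can be reached with fewer arguments: when value_vars is omitted, melt uses every column not listed in id_vars, and value_name already defaults to "value".

```python
import pandas as pd

# A small, made-up wide DataFrame in the same shape as df.
wide = pd.DataFrame(
    {
        "order": [2000.01, 2000.02],
        "Chord": [40, 40],
        "Sankey": [10, 30],
    }
)

# Fully explicit form, as used in the article.
explicit = pd.melt(
    wide,
    id_vars="order",
    value_vars=list(wide.columns[1:]),
    var_name="name",
    value_name="value",
)

# Shorter form relying on the defaults for value_vars and value_name.
short = wide.melt(id_vars="order", var_name="name")

same = explicit.equals(short)
```

The explicit form is still the safer choice when the DataFrame has extra columns that should not become nodes.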

We're nearly there. This next step is optional: we're going to sort by order so that the rows for each point in time sit together.

df_melted = df_melted.sort_values("order")
df_melted.head(10)
      order       name  value
0   2000.01  Bar Fight     90
20  2000.01   Terminus     10
5   2000.01      Chord     40
15  2000.01     Sankey     10
10  2000.01  Pie Fight     70
1   2000.02  Bar Fight    120
21  2000.02   Terminus     20
6   2000.02      Chord     40
16  2000.02     Sankey     30
11  2000.02  Pie Fight     55
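Notice that the names within each order come out in no particular sequence. The default sort in sort_values is not stable, so rows that tie on order may be shuffled. If a predictable sequence matters, two options are sketched below on a small, made-up frame: a stable sort, which keeps tied rows in their original relative order, or a sort on both columns.

```python
import pandas as pd

# A small, made-up melted frame.
melted = pd.DataFrame(
    {
        "order": [2000.01, 2000.02, 2000.01, 2000.02],
        "name": ["Bar Fight", "Bar Fight", "Chord", "Chord"],
        "value": [90, 120, 40, 40],
    }
)

# Option 1: stable sort, so ties on "order" keep their original sequence.
by_order = melted.sort_values("order", kind="stable").reset_index(drop=True)

# Option 2: break ties explicitly by sorting on "name" as well.
by_both = melted.sort_values(["order", "name"]).reset_index(drop=True)
```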

Now for the final step - let's get our list of dict items.

samples = df_melted.to_dict(orient="records")
samples
[{'order': 2000.01, 'name': 'Bar Fight', 'value': 90},
 {'order': 2000.01, 'name': 'Terminus', 'value': 10},
 {'order': 2000.01, 'name': 'Chord', 'value': 40},
 {'order': 2000.01, 'name': 'Sankey', 'value': 10},
 {'order': 2000.01, 'name': 'Pie Fight', 'value': 70},
 {'order': 2000.02, 'name': 'Bar Fight', 'value': 120},
 {'order': 2000.02, 'name': 'Terminus', 'value': 20},
 {'order': 2000.02, 'name': 'Chord', 'value': 40},
 {'order': 2000.02, 'name': 'Sankey', 'value': 30},
 {'order': 2000.02, 'name': 'Pie Fight', 'value': 55},
 {'order': 2000.03, 'name': 'Terminus', 'value': 45},
 {'order': 2000.03, 'name': 'Sankey', 'value': 35},
 {'order': 2000.03, 'name': 'Pie Fight', 'value': 100},
 {'order': 2000.03, 'name': 'Chord', 'value': 60},
 {'order': 2000.03, 'name': 'Bar Fight', 'value': 85},
 {'order': 2000.04, 'name': 'Pie Fight', 'value': 105},
 {'order': 2000.04, 'name': 'Chord', 'value': 90},
 {'order': 2000.04, 'name': 'Sankey', 'value': 25},
 {'order': 2000.04, 'name': 'Bar Fight', 'value': 50},
 {'order': 2000.04, 'name': 'Terminus', 'value': 60},
 {'order': 2000.05, 'name': 'Pie Fight', 'value': 95},
 {'order': 2000.05, 'name': 'Chord', 'value': 120},
 {'order': 2000.05, 'name': 'Sankey', 'value': 60},
 {'order': 2000.05, 'name': 'Bar Fight', 'value': 30},
 {'order': 2000.05, 'name': 'Terminus', 'value': 80}]

Perfect! We're all done.
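The three steps (melt, sort, to_dict) can be wrapped in one function for reuse. This is a minimal sketch rather than anything provided by Plotapi: dataframe_to_samples and its order_col parameter are hypothetical names, and it assumes a wide DataFrame with one order column and one column per node.

```python
import pandas as pd


def dataframe_to_samples(df, order_col="order"):
    """Hypothetical helper: wide DataFrame -> list of order/name/value dicts."""
    melted = df.melt(id_vars=order_col, var_name="name", value_name="value")
    # The samples structure uses the key "order" whatever the column was called.
    melted = melted.rename(columns={order_col: "order"})
    # Sort on both columns so the output sequence is predictable.
    melted = melted.sort_values(["order", "name"])
    return melted.to_dict(orient="records")


# Made-up input in the same shape as df.
wide = pd.DataFrame(
    {
        "order": [2000.01, 2000.02],
        "Chord": [40, 40],
        "Sankey": [10, 30],
    }
)
result = dataframe_to_samples(wide)
```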

Visualisation

No Plotapi exercise is complete without a visualisation.

We set our license details in the preamble with LineFight.set_license().

Here we're using .show(), which outputs to a Jupyter Notebook cell. To output to an HTML file instead, we can use .to_html().

LineFight(samples, format_current_order="0.2f").show()

[Figure: Plotapi - Line Fight Diagram. Lines for Bar Fight, Terminus, Chord, Sankey, and Pie Fight; x-axis: order, 2000.01 to 2000.05; y-axis: value, 20 to 120. Produced with Plotapi.]

Here we can see the default behaviour of Plotapi LineFight.

You can do much more than what's presented in this example, and we'll cover this in later sections. If you want to see the full and growing list of features, check out the Plotapi Documentation.

