How to Create a Sankey Diagram in Plotly Python?


In this post, you will learn how to create a Sankey diagram with Plotly in Python.

Sankey Diagram –

A Sankey diagram visualizes how a quantity flows between nodes. Each link is defined by a source node, a target node, and a value that sets the flow volume, while label gives each node its name. In Plotly, source and target are integer indices into the label list.
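To make the roles of source, target, value and label concrete, here is a minimal sketch in plain Python. The node names and numbers are made up for illustration and are not taken from the dataset used in this post.

```python
# Hypothetical node names and flows, for illustration only.
label = ["Visa", "Cash", "Female", "Male"]
source = [0, 0, 1]   # index into label: where each link starts
target = [2, 3, 2]   # index into label: where each link ends
value = [5, 3, 4]    # flow volume carried by each link

# Spell out every link as "start -> end: volume"
flows = [
    f"{label[s]} -> {label[t]}: {v}"
    for s, t, v in zip(source, target, value)
]
for line in flows:
    print(line)
```

Each position in source, target and value describes one link, so the three lists must have the same length.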

To create a Sankey diagram in Plotly, we use go.Sankey from plotly.graph_objects.

Let’s read a dataset to work with.

import pandas as pd
url = 'https://raw.githubusercontent.com/bprasad26/lwd/master/data/clothing_store_sales.csv'
df = pd.read_csv(url)
df.head()

Here we have sales data from a clothing store, and we want to understand who buys clothes from it.

Our goal is a Sankey diagram that traces customers from method of payment to gender to marital status.

Let’s see how to do it.

Create a Sankey Diagram with go.Sankey –

Before we create the Sankey diagram, we need to prepare the data. We first group the data by Method of Payment and Gender and count the customers in each group. Then we rename the columns to source, target and value.

df1 = df.groupby(['Method of Payment','Gender'])['Customer'].count().reset_index()
df1.columns = ['source','target','value']
df1
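The same grouping step can be tried on a small in-memory dataframe. The sketch below uses made-up rows with the same column names, and it swaps count() for size(), which counts rows per group instead of non-missing values in Customer. The two give the same numbers when Customer has no missing values.

```python
import pandas as pd

# Hypothetical toy rows; the real dataset has more rows and columns.
toy = pd.DataFrame({
    "Customer": [1, 2, 3, 4, 5],
    "Method of Payment": ["Visa", "Visa", "Cash", "Cash", "Visa"],
    "Gender": ["Female", "Male", "Female", "Female", "Female"],
})

# Count rows per (payment method, gender) pair and name the columns for a Sankey link table
pairs = (
    toy.groupby(["Method of Payment", "Gender"])
       .size()
       .reset_index(name="value")
       .rename(columns={"Method of Payment": "source", "Gender": "target"})
)
print(pairs)
```

Only the combinations that actually occur in the data get a row, so a pair such as Cash and Male is simply absent here.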

Next, we create a second dataframe, this time grouping the data by Gender and Marital Status and counting the customers in each group.

df2 = df.groupby(['Gender','Marital Status'])['Customer'].count().reset_index()
df2.columns = ['source','target','value']
df2

Next, we stack the two dataframes on top of each other.

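# ignore_index=True gives the stacked rows a fresh 0..n-1 index instead of repeating labels
links = pd.concat([df1, df2], axis=0, ignore_index=True)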
links
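Both grouping steps follow the same pattern, so they can be folded into a loop over consecutive column pairs. The helper below, build_links, is a hypothetical name introduced for this sketch, and the toy rows are made up.

```python
import pandas as pd

def build_links(df, columns):
    """Hypothetical helper: count rows for each consecutive pair of columns
    and stack the results into one source/target/value table."""
    frames = []
    for src, tgt in zip(columns, columns[1:]):
        counts = df.groupby([src, tgt]).size().reset_index(name="value")
        counts.columns = ["source", "target", "value"]
        frames.append(counts)
    return pd.concat(frames, ignore_index=True)

# Made-up rows with the same column names as the clothing store data
toy = pd.DataFrame({
    "Method of Payment": ["Visa", "Cash", "Visa"],
    "Gender": ["Female", "Male", "Female"],
    "Marital Status": ["Married", "Single", "Single"],
})

toy_links = build_links(toy, ["Method of Payment", "Gender", "Marital Status"])
print(toy_links)
```

Adding a fourth column to the list would add one more layer of links without any other change.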

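Next, we find all the unique values across the source and target columns. These become the node labels.

# Flatten both columns and keep each label once, in order of first appearance
unique_source_target = pd.unique(links[['source', 'target']].values.ravel('K')).tolist()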

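Plotly expects source and target as integer node indices rather than names, so we need a dictionary that maps each label to its position in the list. We will use a dictionary comprehension to build it.

mapping_dict = {label: index for index, label in enumerate(unique_source_target)}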
mapping_dict
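The unique-labels and mapping steps can also be seen in isolation with plain Python. This sketch uses made-up label lists and dict.fromkeys in place of pd.unique; both keep the order of first appearance.

```python
# Hypothetical source and target names, for illustration only
sources = ["Visa", "Cash", "Female", "Male"]
targets = ["Female", "Male", "Married", "Single"]

# dict.fromkeys drops repeats while keeping first-seen order
labels = list(dict.fromkeys(sources + targets))

# Map each label to its position in the labels list
label_to_index = {name: i for i, name in enumerate(labels)}
print(label_to_index)
```

Female and Male appear as both targets and sources, yet each gets a single index, which is what lets links from two layers meet at the same node.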

Next, we use this dictionary to replace the names in the source and target columns of the links dataframe with their indices.

links['source'] = links['source'].map(mapping_dict)
links['target'] = links['target'].map(mapping_dict)
links
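One thing worth checking after this step: map returns NaN for any name that is missing from the dictionary, and the column silently becomes float. The sketch below shows one way to catch that, using a made-up table and a dictionary that deliberately lacks one label.

```python
import pandas as pd

# Hypothetical link table and an incomplete mapping ("Single" is left out on purpose)
toy_links = pd.DataFrame({
    "source": ["Visa", "Female"],
    "target": ["Female", "Single"],
    "value": [3, 2],
})
incomplete_mapping = {"Visa": 0, "Female": 1}

mapped_target = toy_links["target"].map(incomplete_mapping)

# Names that did not get an index show up as NaN after map
missing = toy_links.loc[mapped_target.isna(), "target"].tolist()
print(missing)
```

An empty list here means every name was mapped and the column can safely be passed to Plotly.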

Now we convert this dataframe into a dictionary of lists, one list per column.

links_dict = links.to_dict(orient='list')
links_dict
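To see what orient='list' produces, here is a small sketch on a made-up dataframe of already-mapped links.

```python
import pandas as pd

# Hypothetical mapped links: integer node indices plus a flow value
toy_links = pd.DataFrame({
    "source": [0, 1],
    "target": [1, 2],
    "value": [3, 2],
})

# One key per column, each holding that column's values as a plain list
toy_dict = toy_links.to_dict(orient="list")
print(toy_dict)
```

The keys match the source, target and value arguments that the link of a Sankey trace takes, so each list can be passed straight through.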

We have now completed all the steps needed to prepare the data in the form Plotly needs to create the Sankey diagram.

Now let’s create the Sankey diagram. We need to define two things: the nodes and the links.

import plotly.graph_objects as go

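fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,                               # gap between nodes, in px
        thickness=20,                         # node thickness, in px
        line=dict(color='black', width=0.5),  # node outline
        label=unique_source_target,           # node names, indexed by source/target
        color='green'
    ),
    link=dict(
        source=links_dict['source'],  # index of each link's start node
        target=links_dict['target'],  # index of each link's end node
        value=links_dict['value']     # flow volume of each link
    )
)])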
fig.update_layout(title='Clothing Store Sales')
fig.show()

