Pandas DataFrame stack() method with examples

The stack() method in pandas moves the specified column level(s) into the row index. The name comes from an analogy with books: a row of books standing side by side (the columns of the dataframe) is rearranged into a pile stacked vertically (the index of the dataframe). If the columns have a single level, the result is a Series; otherwise it is a DataFrame.

Syntax –

DataFrame.stack(level=-1, dropna=True)

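level – The integer position(s) or name(s) of the column level(s) to move into the row index. It can be an int, a str, or a list of these. By default level=-1, which stacks the innermost column level.

dropna – Whether to drop rows of the resulting DataFrame/Series in which all values are missing. By default it is True. Note that pandas 2.1 introduced a future_stack argument; with future_stack=True, stack() does not create all-NaN rows and dropna cannot be specified.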

Example –

1 . Stacking DataFrame with Single Level Columns –

Let’s create a dataframe whose columns have a single level.

import pandas as pd

df = pd.DataFrame([[60, 5], [70, 6], [50, 4]],
                  index=['Max', 'Steve', 'Dustin'],
                  columns=['Weight', 'Height'])
df

Calling stack() on this dataframe moves the column labels into the rows. The result is a Series with a two-level index: the original row label and the former column label.

df.stack()
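
To make the result concrete, the sketch below rebuilds the same dataframe and inspects what stack() returns. The names stacked and restored are ours, used only for illustration. It also calls unstack() to show that the inner index level can be moved back to the columns.

```python
import pandas as pd

df = pd.DataFrame([[60, 5], [70, 6], [50, 4]],
                  index=['Max', 'Steve', 'Dustin'],
                  columns=['Weight', 'Height'])

# Stacking single-level columns returns a Series, not a DataFrame
stacked = df.stack()

print(type(stacked).__name__)           # Series
print(stacked.index.nlevels)            # 2
print(stacked.loc[('Max', 'Weight')])   # 60

# unstack() moves the inner index level back to the columns;
# .loc restores the original label order before comparing
restored = stacked.unstack().loc[df.index, df.columns]
print(restored.equals(df))
```

Each value is now addressed by a (row label, column label) pair, which is why a 3 x 2 dataframe becomes a Series of length 6.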

2 . Stacking DataFrame with Multi-Level Columns –

Let’s create a DataFrame with multi-level columns.

multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('weight', 'pounds')])

df_multi1 = pd.DataFrame([[60, 132],[70, 154],[50, 110]],
                        index=['Max','Steve','Dustin'],
                        columns=multicol1)
df_multi1

By default level=-1, so calling stack() on this dataframe moves the innermost level [kg, pounds] into the rows.

df_multi1.stack()

Column levels are numbered from the outside in: the outermost level [weight] is level 0, the next level [kg, pounds] is level 1, and a third level, if there were one, would be level 2. So to move the outermost level [weight] into the rows, set level=0.

df_multi1.stack(level=0)
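
The level argument also accepts level names. The sketch below is a variation on the dataframe above: it assumes the column levels are given the hypothetical names 'measure' and 'unit' (the original example leaves them unnamed) and stacks by name instead of by position.

```python
import pandas as pd

# 'measure' and 'unit' are illustrative level names, not part of the original example
cols = pd.MultiIndex.from_tuples([('weight', 'kg'), ('weight', 'pounds')],
                                 names=['measure', 'unit'])
df = pd.DataFrame([[60, 132], [70, 154], [50, 110]],
                  index=['Max', 'Steve', 'Dustin'],
                  columns=cols)

# Stack the inner level by name: equivalent to level=-1 or level=1 here
by_unit = df.stack(level='unit')
# Rows are now (person, unit); the remaining column is 'weight'
print(by_unit.loc[('Max', 'pounds'), 'weight'])    # 132

# Stack the outer level by name: equivalent to level=0 here
by_measure = df.stack(level='measure')
# Rows are now (person, measure); the remaining columns are 'kg' and 'pounds'
print(by_measure.loc[('Steve', 'weight'), 'kg'])   # 70
```

Using names rather than positions keeps the call readable and means it does not change meaning if the column levels are later reordered.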

3 . Handling Missing Values –

Let’s create a dataframe that contains some missing values.

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'ft')])

df_multi2 = pd.DataFrame([[None, 5],[70, 6],[50, 4]],
                        index=['Max', 'Steve', 'Dustin'],
                        columns=multicol2)
df_multi2

By default dropna=True, so rows of the result that contain only NaN are removed.

df_multi2.stack()

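Stacking the inner level produces one row per person and unit, with height and weight as columns. Max has no weight, and height is not measured in kg, so the (Max, kg) row would hold only NaN and is dropped. To keep rows in which all values are missing, set dropna=False.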

df_multi2.stack(dropna=False)
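
The handling of the dropna parameter differs across pandas versions, so it can be useful to drop the all-NaN rows explicitly. The sketch below is our own illustration, not a step from the original example: it stacks the same data and then calls DataFrame.dropna(how='all'), which should leave the same rows whether or not stack() already removed them.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([('weight', 'kg'), ('height', 'ft')])
df = pd.DataFrame([[None, 5], [70, 6], [50, 4]],
                  index=['Max', 'Steve', 'Dustin'],
                  columns=cols)

# Stack the inner level, then drop rows where every value is NaN
stacked = df.stack().dropna(how='all')

print(len(stacked))                      # 5
print(('Max', 'kg') in stacked.index)    # False
print(('Max', 'ft') in stacked.index)    # True
```

Rows that are only partly missing, such as (Steve, kg) with a weight but no height, are kept because how='all' removes a row only when every value in it is NaN.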
