Suppose you have a pandas DataFrame and you want to shuffle its rows.
There are several ways to do this. Let's go through three of them, using the first 10 rows of a sample dataset.
import pandas as pd
import numpy as np

url = "https://raw.githubusercontent.com/bprasad26/lwd/master/data/clothing_store_sales.csv"
df = pd.read_csv(url)
df = df.head(10)
Method 1 –
The easiest way is the df.sample() method. Setting frac=1 samples 100% of the rows, and because sampling is without replacement by default, the result contains every row exactly once, in random order.
df1 = df.sample(frac=1)
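Two things often matter in practice with this approach: getting the same shuffle on every run, and what happens to the index. The sketch below illustrates both on a small made-up DataFrame (the column names and values are hypothetical, used here so the example runs without downloading the CSV).

```python
import pandas as pd

# Hypothetical stand-in data; the article's df is loaded from a CSV instead.
df = pd.DataFrame({
    "item": ["shirt", "jeans", "socks", "hat", "scarf"],
    "price": [20, 45, 5, 15, 12],
})

# random_state fixes the seed, so the same order comes back on every run.
shuffled = df.sample(frac=1, random_state=42)

# The original index labels travel with the rows. reset_index(drop=True)
# replaces them with a fresh 0..n-1 index and discards the old labels.
shuffled_reset = shuffled.reset_index(drop=True)

print(shuffled)
print(shuffled_reset)
```

Whether to reset the index depends on what you do next: keep the old labels if you need to trace rows back to the original frame, and reset them if later code relies on a clean 0..n-1 index.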
Method 2 –
You can also shuffle the rows by generating a random permutation of the row positions with np.random.permutation and then passing those positions to df.iloc to select the rows in that order.
df2 = df.iloc[np.random.permutation(len(df))]
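If you want this method to be reproducible, one option is NumPy's Generator API with a fixed seed. This is a minimal sketch on a small made-up DataFrame (hypothetical columns and values), not the only way to seed the shuffle.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the article's df is loaded from a CSV instead.
df = pd.DataFrame({
    "item": ["shirt", "jeans", "socks", "hat", "scarf"],
    "price": [20, 45, 5, 15, 12],
})

# A seeded generator returns the same permutation on every run.
rng = np.random.default_rng(0)

# permutation(n) returns the integers 0..n-1 in random order.
positions = rng.permutation(len(df))

# iloc selects rows by position, so the rows come back in the shuffled order.
shuffled = df.iloc[positions]

print(positions)
print(shuffled)
```

Because iloc works on positions rather than labels, this behaves the same whatever the DataFrame's index looks like.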
Method 3 –
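Another way to shuffle the rows of a DataFrame is the shuffle function from scikit-learn's sklearn.utils module, which returns a shuffled copy and leaves the original unchanged.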
from sklearn.utils import shuffle

df3 = shuffle(df)
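The scikit-learn function is handy when several objects have to be shuffled in the same order, for example a feature table and its labels. The sketch below assumes scikit-learn is installed and uses made-up data; the names X and y are illustrative only.

```python
import pandas as pd
from sklearn.utils import shuffle

# Hypothetical features and labels with matching row order.
X = pd.DataFrame({"price": [20, 45, 5, 15, 12], "qty": [1, 2, 6, 1, 3]})
y = pd.Series([0, 1, 0, 0, 1], name="returned")

# Passing both objects applies one shared permutation to each of them.
# random_state makes the shuffle reproducible.
X_shuffled, y_shuffled = shuffle(X, y, random_state=0)

# Rows and labels still line up after shuffling.
print(X_shuffled.index.tolist())
print(y_shuffled.index.tolist())
```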