Pandas DataFrame nsmallest() method with examples

The nsmallest() method in pandas returns the first n rows with the smallest values in the specified columns, in ascending order. Columns that are not specified are returned as well, but are not used for ordering.

Syntax –

DataFrame.nsmallest(n, columns, keep='first')

n – Number of items to retrieve.

columns – Column name or names to order by.

keep – Specifies how to handle rows that tie on the value being compared. Defaults to 'first'.

  • first : take the first occurrence.
  • last : take the last occurrence.
  • all : keep every row tied with the last selected value, even if it means selecting more than n items.

Examples –

Let’s create a dataframe to work with.

import pandas as pd

data = {'Apple': [89, 90, 90, 90, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]}

index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1 . Select the N rows with the smallest values in a column –

Let’s say we want to find the 3 rows with the smallest values in the Apple column. We can do this using the nsmallest() method in pandas.

df.nsmallest(n=3, columns='Apple')
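
To make the result concrete, here is a small sketch that rebuilds a trimmed copy of the dataframe (only the Apple and Orange columns, which is a simplification for this example) and checks which rows come back. It also compares the result with a stable sort followed by head(), which should give the same rows for this data; the variable names smallest and expected are just illustrative.

```python
import pandas as pd

# Trimmed copy of the example data: only two columns are kept for brevity.
data = {'Apple': [89, 90, 90, 90, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

# The 3 rows with the smallest Apple values, in ascending order.
smallest = df.nsmallest(n=3, columns='Apple')
print(list(smallest.index))    # ['June', 'Jan', 'Feb']

# Orange was not used for ordering, but it is still returned.
print(list(smallest.columns))  # ['Apple', 'Orange']

# A stable sort followed by head() selects the same rows here.
expected = df.sort_values('Apple', kind='mergesort').head(3)
same = smallest.equals(expected)
print(same)
```

June (84) and Jan (89) are the two smallest values, and Feb is the first of the three months tied at 90.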

2 . Handle Duplicate Data –

When using keep='last', ties are resolved in reverse order.

df.nsmallest(n=3, columns='Apple', keep='last')

Feb, Mar and Apr all have the same value (90) in the Apple column, and only one of them fits into the 3 smallest rows after June (84) and Jan (89). Since we set keep to last, pandas keeps the Apr row.

If we set keep='first' instead, pandas keeps the Feb row. By default, keep is set to first.

df.nsmallest(n=3, columns='Apple', keep='first')

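When using keep='all', every row tied for the last place is kept, so the result can contain more than n rows. Here Feb, Mar and Apr are all returned, giving 5 rows instead of 3.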

df.nsmallest(n=3, columns='Apple', keep='all')
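
The three keep options are easier to compare side by side. The sketch below uses only the Apple column from the example data (a simplification) and collects the row labels returned for each option; rows_by_keep is a hypothetical helper name used just for this illustration.

```python
import pandas as pd

# Only the Apple column is needed to see how ties are handled.
apple = [89, 90, 90, 90, 125, 84, 131, 123, 123, 140, 145, 145]
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame({'Apple': apple}, index=index)

# Collect the row labels returned for each keep option.
rows_by_keep = {
    keep: list(df.nsmallest(n=3, columns='Apple', keep=keep).index)
    for keep in ('first', 'last', 'all')
}

for keep, rows in rows_by_keep.items():
    print(keep, rows)
# first ['June', 'Jan', 'Feb']
# last ['June', 'Jan', 'Apr']
# all ['June', 'Jan', 'Feb', 'Mar', 'Apr']
```

Only the third slot differs between first and last, because the tie at 90 is the only one that falls on the cut-off.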

3 . Using multiple columns with the nsmallest method –

Let’s say that we want to order by the smallest values in the Apple column first, and then use the Mango column to break ties. We can do this by passing a list of columns to the nsmallest method.

df.nsmallest(n=3, columns=['Apple','Mango'])
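
With this data, Mango happens to pick the same tied row (Feb) that a single-column call would return, so the effect of the second column is easy to miss. As a rough illustration, the sketch below also breaks the tie with the Banana column, where Apr has the smallest value among the tied months; the names by_mango and by_banana are only for this example.

```python
import pandas as pd

# Trimmed copy of the example data: the Orange column is left out for brevity.
data = {'Apple': [89, 90, 90, 90, 125, 84, 131, 123, 123, 140, 145, 145],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

# Feb, Mar and Apr tie at Apple == 90; the second column decides between them.
by_mango = df.nsmallest(n=3, columns=['Apple', 'Mango'])
by_banana = df.nsmallest(n=3, columns=['Apple', 'Banana'])

print(list(by_mango.index))   # ['June', 'Jan', 'Feb']  (Feb has Mango 80)
print(list(by_banana.index))  # ['June', 'Jan', 'Apr']  (Apr has Banana 25)
```

The second column is only consulted for rows that tie in the first column; June and Jan are selected on their Apple values alone.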

