
The nlargest method in pandas returns the first n rows with the largest values in the specified columns, in descending order. Columns that are not specified are returned as well, but are not used for ordering.
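To build intuition for what nlargest returns, the sketch below compares it with sorting in descending order and taking the first n rows, which is the equivalence described in the pandas documentation. The scores frame and its column names are hypothetical toy data, and the values are deliberately distinct so that tie handling does not come into play.

```python
import pandas as pd

# Hypothetical toy data with distinct values (no ties), for illustration only.
scores = pd.DataFrame({'score': [10, 40, 30, 20],
                       'group': ['a', 'b', 'c', 'd']})

# Top 2 rows by 'score' using nlargest.
top_two = scores.nlargest(2, 'score')

# The same selection written as a descending sort followed by head.
sorted_top_two = scores.sort_values('score', ascending=False).head(2)

# Both approaches pick the same rows in the same order.
print(top_two.equals(sorted_top_two))
```

When the ranked column contains ties, the two approaches are not guaranteed to pick the same rows, which is what the keep parameter controls.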
Syntax –
DataFrame.nlargest(n, columns, keep='first')
n – Required, an integer, specifying the number of rows to return
columns – A String (column label), or a list of column labels, specifying the column(s) to order by
keep – Optional, default 'first', specifying what to do when several rows tie on the value being ranked.
- first: prioritize the first occurrence(s)
- last: prioritize the last occurrence(s)
- all: do not drop any duplicates, even if it means selecting more than n items.
Examples –
Let’s create a dataframe to work with.
import pandas as pd
data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]}
index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1 . Select the top N rows with the largest values in a column –
Let’s say we want to find the three rows with the highest prices in the Mango column. We can do this using the nlargest method in pandas.
df.nlargest(n=3, columns='Mango')
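If only the values of one column are needed rather than whole rows, Series.nlargest can be called on that column directly. The following is a minimal sketch that rebuilds just the Mango column from the data above; the variable names mango and top_mango are illustrative.

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# The Mango column from the example data, as a standalone Series.
mango = pd.Series([80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90],
                  index=months, name='Mango')

# Three largest values, in descending order, labelled by month.
top_mango = mango.nlargest(3)
print(top_mango)
# June    150
# Jul     140
# Aug     135
```

The third place is a three-way tie at 135 (Aug, Sep and Oct); with the default keep='first', the earliest of them, Aug, is returned.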

2 . Handle duplicate data –
When using keep='last', ties are resolved in reverse order.
df.nlargest(n=3, columns='Mango', keep='last')

Aug, Sep and Oct all have the same value, 135, in the Mango column. So when we set keep='last', pandas keeps the Oct row.
If we set it to 'first', pandas keeps the Aug row instead. By default, keep is set to 'first'.
df.nlargest(n=3, columns='Mango', keep='first')

When using keep='all', all duplicate items are maintained. Here the result has five rows instead of three, because Aug, Sep and Oct all tie at 135.
df.nlargest(n=3, columns='Mango', keep='all')
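To see the three keep options side by side, the sketch below collects the month labels each one selects. It rebuilds only the Mango column from the data above; mango_df and selected are illustrative names.

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
mango_df = pd.DataFrame(
    {'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]},
    index=months,
)

# Collect the selected month labels for each keep option.
selected = {
    keep: list(mango_df.nlargest(n=3, columns='Mango', keep=keep).index)
    for keep in ('first', 'last', 'all')
}

for keep, labels in selected.items():
    print(keep, labels)
# first ['June', 'Jul', 'Aug']
# last ['June', 'Jul', 'Oct']
# all ['June', 'Jul', 'Aug', 'Sep', 'Oct']
```

Only the row that fills the third place changes between 'first' and 'last', while 'all' returns every row tied for that place.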

3 . Using multiple columns with the nlargest method –
Let’s say we want to order by the largest values in the Mango column first, and then by the Apple column to break ties. We can do this by passing the column names in a list.
df.nlargest(n=3, columns=['Mango','Apple'])
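The sketch below shows how the second column breaks the tie. It uses only the Apple and Mango columns from the data above; fruit_df and top_rows are illustrative names.

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
fruit_df = pd.DataFrame(
    {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
     'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]},
    index=months,
)

# Rank by Mango first; rows tied on Mango are then ranked by Apple.
top_rows = fruit_df.nlargest(n=3, columns=['Mango', 'Apple'])
print(list(top_rows.index))
# ['June', 'Jul', 'Oct']
```

Aug, Sep and Oct all have a Mango value of 135, but Oct has the highest Apple value (140 against 123), so it takes the third place even though keep is left at its default of 'first'.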
