Pandas DataFrame nlargest() method with examples.

The nlargest() method in pandas returns the first n rows with the largest values in the specified column(s), in descending order. Columns that are not specified are returned as well but are not used for ordering.
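
As a minimal sketch of that behaviour, the snippet below uses a small made-up frame (the names `toy`, `price` and `stock` are illustrative assumptions, not part of the examples that follow). The pandas documentation describes nlargest as equivalent to sorting in descending order and taking the first n rows; the values here contain no ties, so both approaches should select the same rows.

```python
import pandas as pd

# Hypothetical data for illustration only; 'price' has no tied values.
toy = pd.DataFrame({'price': [10, 40, 30, 20], 'stock': [5, 6, 7, 8]},
                   index=['a', 'b', 'c', 'd'])

# Top 2 rows by 'price'; the unspecified 'stock' column is returned too.
top2 = toy.nlargest(2, 'price')

# The same selection written as an explicit sort followed by head().
top2_sorted = toy.sort_values('price', ascending=False).head(2)

print(top2)
print(top2.index.tolist())   # ['b', 'c']
```

The 'stock' column appears in the output even though it plays no part in the ordering.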

Syntax –

DataFrame.nlargest(n, columns, keep='first')

n – Required. An integer specifying the number of rows to return.

columns – Required. A string (column label) or a list of column labels specifying the column(s) to order by.

keep – Optional, default 'first'. Specifies which rows to keep when several rows have the same value (ties).

  • first: prioritize the first occurrence(s)
  • last: prioritize the last occurrence(s)
  • all: do not drop any tied rows, even if it means selecting more than n rows.

Examples –

Let’s create a dataframe to work with.

import pandas as pd

data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]}

index = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame(data, index=index)
df

1 . Select the top N rows with the largest values in a column –

Let’s say we want to find the 3 highest prices in the Mango column. We can do this using the nlargest method in pandas.

df.nlargest(n=3, columns='Mango')
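
To check the result outside a notebook, the sketch below rebuilds the same DataFrame and prints which months were selected. The variable name `top3` is an illustrative choice. The third place is a three-way tie at 135, which the default keep='first' resolves in favor of Aug.

```python
import pandas as pd

data = {'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145],
        'Orange': [46, 46, 50, 65, 63, 48, 110, 120, 60, 42, 47, 62],
        'Banana': [26, 30, 30, 25, 38, 22, 22, 36, 20, 27, 23, 34],
        'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]}
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.DataFrame(data, index=index)

top3 = df.nlargest(n=3, columns='Mango')

# Full rows are returned, sorted by Mango in descending order.
print(top3)
print(top3.index.tolist())      # ['June', 'Jul', 'Aug']
print(top3['Mango'].tolist())   # [150, 140, 135]
```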

2 . Handle duplicate data –

When using keep='last', ties are resolved in favor of the rows that appear last in the DataFrame.

df.nlargest(n=3, columns='Mango', keep='last')

Aug, Sep, and Oct all have the same value, 135, in the Mango column. So when we set keep='last', pandas keeps the Oct row.

If we set keep='first', pandas keeps the Aug row instead. By default, keep is set to 'first'.

df.nlargest(n=3, columns='Mango', keep='first')

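When using keep='all', every row tied for the last place is kept, so the result can contain more than n rows. Here Aug, Sep, and Oct are all returned, giving five rows in total.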

df.nlargest(n=3, columns='Mango', keep='all')
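
The three keep options can be compared side by side. This is a small sketch that rebuilds only the Mango column from the data above; the names `mango` and `results` are illustrative assumptions.

```python
import pandas as pd

index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
mango = pd.DataFrame(
    {'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90]},
    index=index)

# Collect the selected months for each tie-handling option.
results = {}
for keep in ('first', 'last', 'all'):
    selected = mango.nlargest(n=3, columns='Mango', keep=keep)
    results[keep] = selected.index.tolist()
    print(keep, results[keep])

# 'first' and 'last' return exactly 3 rows; 'all' returns all three
# months tied at 135, so 5 rows in total.
```

Only the third slot differs between 'first' and 'last', because June (150) and Jul (140) are not involved in any tie.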

3 . Using multiple columns with nlargest method –

Let’s say we want to order by the largest values in the Mango column first and then, where the Mango values are tied, by the Apple column. We can do this by passing the column names in a list.

df.nlargest(n=3, columns=['Mango','Apple'])
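
The effect of the second column shows up in the tie at 135. The sketch below rebuilds just the Mango and Apple columns (the names `fruit`, `by_mango` and `by_both` are illustrative assumptions) and compares single-column and two-column ordering.

```python
import pandas as pd

index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June',
         'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
fruit = pd.DataFrame(
    {'Mango': [80, 80, 90, 125, 130, 150, 140, 135, 135, 135, 80, 90],
     'Apple': [89, 89, 90, 110, 125, 84, 131, 123, 123, 140, 145, 145]},
    index=index)

# Ordering by Mango alone: the tie at 135 goes to the first occurrence, Aug.
by_mango = fruit.nlargest(n=3, columns='Mango')

# Ordering by Mango, then Apple: among the rows tied at 135,
# Oct has the largest Apple value (140 vs. 123 for Aug and Sep).
by_both = fruit.nlargest(n=3, columns=['Mango', 'Apple'])

print(by_mango.index.tolist())   # ['June', 'Jul', 'Aug']
print(by_both.index.tolist())    # ['June', 'Jul', 'Oct']
```

Apple is consulted only to break ties in Mango, so Nov and Dec are not selected even though they have the largest Apple values overall.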
