cudf.DataFrame.nlargest
- DataFrame.nlargest(n, columns, keep='first')
Get the n rows of the DataFrame with the largest values in columns, sorted in descending order.
- Parameters
- n : int
Number of rows to return.
- columns : label or list of labels
Column label(s) to order by.
- keep : {'first', 'last'}, default 'first'
Where there are duplicate values:
first : prioritize the first occurrence(s)
last : prioritize the last occurrence(s)
- Returns
- DataFrame
The first n rows ordered by the given columns in descending order.
Notes
- Difference from pandas:
Only a single column is supported in columns
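When ties in one column need to be broken by a second column, one possible workaround is to sort on both columns in descending order and take the first n rows. The sketch below is an illustration, not part of the documented `nlargest` behaviour; the pandas fallback import is an assumption added only so the snippet also runs on a machine without a GPU.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumption: CPU fallback so the sketch runs anywhere

df = xdf.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 434000, 434000],
        "GDP": [1937894, 2583560, 12011, 4520, 12128],
    },
    index=["Italy", "France", "Malta", "Maldives", "Brunei"],
)

# Order by population first, then by GDP to break ties (both descending),
# and keep the first three rows.
top = df.sort_values(["population", "GDP"], ascending=False).head(3)
print(top)
```

Here the three-way tie at a population of 434000 is resolved by GDP, so Brunei is selected rather than Malta.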
Examples
>>> import cudf
>>> df = cudf.DataFrame({'population': [59000000, 65000000, 434000,
...                                   434000, 434000, 337000, 11300,
...                                   11300, 11300],
...                    'GDP': [1937894, 2583560, 12011, 4520, 12128,
...                            17036, 182, 38, 311],
...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
...                                "IS", "NR", "TV", "AI"]},
...                   index=["Italy", "France", "Malta",
...                          "Maldives", "Brunei", "Iceland",
...                          "Nauru", "Tuvalu", "Anguilla"])
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru          11300      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI
>>> df.nlargest(3, 'population')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Malta       434000    12011      MT
>>> df.nlargest(3, 'population', keep='last')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN
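The effect of `keep` on tied values can be modelled in plain Python. The following is a minimal sketch of the documented semantics only, not cuDF's implementation; the helper name `nlargest_labels` is hypothetical and introduced purely for illustration.

```python
def nlargest_labels(values, labels, n, keep="first"):
    """Return the labels of the n largest values, breaking ties by position."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    positions = range(len(values))
    if keep == "first":
        # Among equal values, earlier rows win.
        order = sorted(positions, key=lambda i: (-values[i], i))
    else:
        # Among equal values, later rows win.
        order = sorted(positions, key=lambda i: (-values[i], -i))
    return [labels[i] for i in order[:n]]


population = [59000000, 65000000, 434000, 434000, 434000,
              337000, 11300, 11300, 11300]
countries = ["Italy", "France", "Malta", "Maldives", "Brunei",
             "Iceland", "Nauru", "Tuvalu", "Anguilla"]

first = nlargest_labels(population, countries, 3, keep="first")
last = nlargest_labels(population, countries, 3, keep="last")
print(first)  # ['France', 'Italy', 'Malta']
print(last)   # ['France', 'Italy', 'Brunei']
```

Malta, Maldives and Brunei share the same population, so `keep` alone decides which of them fills the third slot.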