cuDF User Guide
- API reference
- 10 Minutes to cuDF and Dask cuDF
- What are these Libraries?
- When to use cuDF and Dask cuDF
- Object Creation
- Viewing Data
- Selecting a Column
- Selecting Rows by Label
- Selecting Rows by Position
- Boolean Indexing
- MultiIndex
- Missing Data
- Stats
- Applymap
- Histogramming
- String Methods
- Concat
- Join
- Grouping
- Transpose
- Time Series
- Categoricals
- Converting to Pandas
- Converting to Numpy
- Converting to Arrow
- Reading/Writing CSV Files
- Reading/Writing Parquet Files
- Reading/Writing ORC Files
- Dask Performance Tips
- Comparison of cuDF and Pandas
- Supported Data Types
- Input / Output
- Working with missing data
- How to Detect missing values
- Float dtypes and missing data
- Datetimes
- Calculations with missing data
- Sum/product of Null/nans
- NA values in GroupBy
- Inserting missing data
- Filling missing values: fillna
- Filling with cudf Object
- Dropping axis labels with missing data: dropna
- Replacing generic values
- String/regular expression replacement
- Numeric replacement
- GroupBy
- Overview of User Defined Functions with cuDF
- Interoperability between cuDF and CuPy
- Options
- Performance comparisons
- Pandas Compatibility Notes
- Copy-on-write
- Memory Profiling
- Breaking changes for pandas 2 in cuDF 24.04+
- Removed DataFrame.append & Series.append, use cudf.concat instead.
- Removed various numeric Index sub-classes, use cudf.Index
- Change in bitwise operation results
- ufuncs will perform re-indexing
- DataFrame vs Series comparisons need to have matching index
- Series.rank
- Value counts sets the results name to count/proportion
- DataFrame.describe will include datetime data by default
- Converting a datetime string with Z to timezone-naive dtype is not allowed.
- Datetime & Timedelta reduction operations will preserve their time resolutions.
- get_dummies default return type is changed from int8 to bool
- reset_index will name columns as None when name=None
- Fixed an issue where duration components were being incorrectly calculated
- fillna on datetime/timedelta with a lower-resolution scalar will now type-cast the series
- Groupby.nth & Groupby.dtypes will have the grouped column in result
- Removed