cuDF User Guide
- API reference
- 10 Minutes to cuDF and Dask cuDF
- What are these Libraries?
- When to use cuDF and Dask cuDF
- Object Creation
- Viewing Data
- Selecting a Column
- Selecting Rows by Label
- Selecting Rows by Position
- Boolean Indexing
- MultiIndex
- Missing Data
- Stats
- Applymap
- Histogramming
- String Methods
- Concat
- Join
- Grouping
- Transpose
- Time Series
- Categoricals
- Converting to Pandas
- Converting to Numpy
- Converting to Arrow
- Reading/Writing CSV Files
- Reading/Writing Parquet Files
- Reading/Writing ORC Files
- Dask Performance Tips
- Comparison of cuDF and Pandas
- Supported Data Types
- Input / Output
- Working with missing data
- How to Detect missing values
- Float dtypes and missing data
- Datetimes
- Calculations with missing data
- Sum/product of Null/nans
- NA values in GroupBy
- Inserting missing data
- Filling missing values: fillna
- Filling with cudf Object
- Dropping axis labels with missing data: dropna
- Replacing generic values
- String/regular expression replacement
- Numeric replacement
- GroupBy
- Overview of User Defined Functions with cuDF
- Interoperability between cuDF and CuPy
- Options
- Performance comparisons
- Pandas Compatibility Notes
- Copy-on-write
- Memory Profiling
- Breaking changes for pandas 2 in cuDF 24.04+
- Removed DataFrame.append & Series.append, use cudf.concat instead.
- Removed various numeric Index sub-classes, use cudf.Index
- Change in bitwise operation results
- ufuncs will perform re-indexing
- DataFrame vs Series comparisons need to have matching index
- Series.rank
- Value counts sets the results name to count/proportion
- DataFrame.describe will include datetime data by default
- Converting a datetime string with Z to timezone-naive dtype is not allowed.
- Datetime & Timedelta reduction operations will preserve their time resolutions.
- get_dummies default return type is changed from int8 to bool
- reset_index will name columns as None when name=None
- Fixed an issue where duration components were being incorrectly calculated
- fillna on datetime/timedelta with a lower-resolution scalar will now type-cast the series
- Groupby.nth & Groupby.dtypes will have the grouped column in result
- Removed