cudf.read_avro
- cudf.read_avro(filepath_or_buffer, engine='cudf', columns=None, skiprows=None, num_rows=None, **kwargs)
Load an Avro dataset into a DataFrame
- Parameters
- filepath_or_buffer : str, path object, bytes, or file-like object
Either a path to a file (a str, pathlib.Path, or py._path.local.LocalPath), a URL (including http, ftp, and S3 locations), Python bytes of raw binary data, or any object with a read() method (such as a file handle returned by the builtin open() function, or a BytesIO).
- engine : ['cudf'], default 'cudf'
Parser engine to use.
- columns : list, default None
If not None, only these columns will be read.
- skiprows : int, default None
If not None, the number of rows to skip from the start of the file.
- num_rows : int, default None
If not None, the total number of rows to read.
- Returns
- DataFrame
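The optional columns, skiprows, and num_rows parameters can be combined to read only part of a file. The following is a minimal sketch rather than part of the cuDF API: read_avro_window and its reader argument are hypothetical names introduced here for illustration. The helper checks the row arguments and forwards them unchanged to cudf.read_avro, which it imports lazily so the module can still be imported on a machine without a GPU.

```python
from typing import Callable, Optional, Sequence


def read_avro_window(
    source,
    columns: Optional[Sequence[str]] = None,
    skiprows: Optional[int] = None,
    num_rows: Optional[int] = None,
    reader: Optional[Callable] = None,
):
    """Hypothetical helper: validate the row selection, then call the reader.

    `reader` is an illustrative hook (not a cuDF parameter) that defaults
    to cudf.read_avro when left as None.
    """
    if skiprows is not None and skiprows < 0:
        raise ValueError("skiprows must be non-negative")
    if num_rows is not None and num_rows < 0:
        raise ValueError("num_rows must be non-negative")

    if reader is None:
        # Lazy import: cudf is only required when the default reader is used.
        import cudf

        reader = cudf.read_avro

    # Forward the documented keyword arguments unchanged.
    return reader(
        source,
        columns=list(columns) if columns is not None else None,
        skiprows=skiprows,
        num_rows=num_rows,
    )
```

With the data.avro file from the Examples section, read_avro_window("data.avro", columns=["numbers"], skiprows=1, num_rows=2) would request only the numbers column, skipping the first row and reading at most two rows after it.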
Notes
cuDF supports local and remote data stores. See the configuration details for the available sources.
Examples
>>> import pandavro
>>> import pandas as pd
>>> import cudf
>>> pandas_df = pd.DataFrame()
>>> pandas_df['numbers'] = [10, 20, 30]
>>> pandas_df['text'] = ["hello", "rapids", "ai"]
>>> pandas_df
   numbers    text
0       10   hello
1       20  rapids
2       30      ai
>>> pandavro.to_avro("data.avro", pandas_df)
>>> cudf.read_avro("data.avro")
   numbers    text
0       10   hello
1       20  rapids
2       30      ai