cudf.io.parquet.read_parquet_metadata
- cudf.io.parquet.read_parquet_metadata(filepath_or_buffer)
Read a Parquet file’s metadata and schema
- Parameters:
- path : string or path object
Path of the file to be read
- Returns:
- Total number of rows
- Number of row groups
- List of column names
- Number of columns
- List of metadata of row groups
See also
Examples
>>> import cudf
>>> num_rows, num_row_groups, names, num_columns, row_group_metadata = cudf.io.read_parquet_metadata(filename)
>>> df = [cudf.read_parquet(filename, row_groups=[i]) for i in range(num_row_groups)]
>>> df = cudf.concat(df)
>>> df
  num1                datetime  text
0  123 2018-11-13T12:00:00.000  5451
1  456 2018-11-14T12:35:01.000  5784
2  789 2018-11-15T18:02:59.000  6117
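The example above reads one row group per call. When a file has many small row groups, it may be preferable to read several at a time. The following is a minimal sketch of that idea, not part of cuDF: `batch_row_groups` and `read_in_batches` are hypothetical helper names, and the sketch assumes that `cudf.read_parquet` accepts a list of row-group indices through its `row_groups` argument.

```python
def batch_row_groups(num_row_groups, batch_size):
    """Split row-group indices 0..num_row_groups-1 into consecutive batches.

    Hypothetical helper (not a cuDF function); pure Python, no GPU needed.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    indices = list(range(num_row_groups))
    return [indices[i:i + batch_size] for i in range(0, num_row_groups, batch_size)]


def read_in_batches(filename, batch_size):
    """Yield one cudf DataFrame per batch of row groups.

    Hypothetical helper; requires cudf and a GPU at call time.
    """
    import cudf  # imported lazily so batch_row_groups works without cudf

    # Unpack the five values listed under "Returns".
    num_rows, num_row_groups, names, num_columns, row_group_metadata = (
        cudf.io.read_parquet_metadata(filename)
    )
    for batch in batch_row_groups(num_row_groups, batch_size):
        # Assumption: row_groups accepts a list of row-group indices.
        yield cudf.read_parquet(filename, row_groups=batch)
```

With five row groups and a batch size of two, `batch_row_groups(5, 2)` produces three batches, the last of which holds the single remaining index. Batching only changes how many row groups each `read_parquet` call loads; the concatenated result should match the per-row-group loop.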