public class ORCChunkedReader extends Object implements AutoCloseable
Constructor and Description |
---|
ORCChunkedReader(long chunkReadLimit, long passReadLimit, long outputRowSizingGranularity, ORCOptions opts, HostMemoryBuffer buffer, long offset, long len) Construct a chunked ORC reader instance, similar to ORCChunkedReader(long, long, ORCOptions, HostMemoryBuffer, long, long), with an additional parameter to control the granularity of the output table. |
ORCChunkedReader(long chunkReadLimit, long passReadLimit, ORCOptions opts, HostMemoryBuffer buffer, long offset, long len) Construct the reader instance from read limits and a file already loaded in a memory buffer. |
Modifier and Type | Method and Description |
---|---|
void | close() |
boolean | hasNext() Check if the given file has anything left to read. |
Table | readChunk() Read a chunk of rows from the given ORC file such that the total size of the returned data does not exceed the given read limit. |
public ORCChunkedReader(long chunkReadLimit, long passReadLimit, ORCOptions opts, HostMemoryBuffer buffer, long offset, long len)
chunkReadLimit - Limit on total number of bytes to be returned per read, or 0 if there is no limit.
passReadLimit - Limit on the amount of memory used by the chunked reader, or 0 if there is no limit.
opts - The options for ORC reading.
buffer - Raw ORC file content.
offset - The starting offset into buffer.
len - The number of bytes to parse from the given buffer.
public ORCChunkedReader(long chunkReadLimit, long passReadLimit, long outputRowSizingGranularity, ORCOptions opts, HostMemoryBuffer buffer, long offset, long len)
Construct a chunked ORC reader instance, similar to ORCChunkedReader(long, long, ORCOptions, HostMemoryBuffer, long, long), with an additional parameter to control the granularity of the output table.
When reading a chunk table, a subset of stripes may be loaded, decompressed and decoded into a large intermediate table, subject to the given size limits. The reader then subdivides that table into smaller tables for final output, using outputRowSizingGranularity as the subdivision step. If the chunked reader is constructed without this parameter, the default value of 10k rows is used.
outputRowSizingGranularity - The change step in number of rows in the output table.
See also: ORCChunkedReader(long, long, ORCOptions, HostMemoryBuffer, long, long)
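The subdivision step described for outputRowSizingGranularity can be illustrated with a toy model. The sketch below is not the library's implementation: the class GranularitySketch, the method splitRows, and the fixed bytesPerRow are all assumptions made for illustration. It grows each output chunk one granularity step at a time and stops before the chunk would exceed chunkReadLimit, treating 0 as "no limit".

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only (hypothetical names); not the cuDF implementation. */
final class GranularitySketch {
  /**
   * Splits rowCount rows into output chunks. Each chunk grows in steps of
   * granularity rows and stops before exceeding chunkReadLimit bytes.
   * Assumes every row occupies exactly bytesPerRow bytes.
   */
  static List<Long> splitRows(long rowCount, long bytesPerRow,
                              long chunkReadLimit, long granularity) {
    if (granularity <= 0) {
      throw new IllegalArgumentException("granularity must be positive");
    }
    List<Long> chunks = new ArrayList<>();
    long remaining = rowCount;
    while (remaining > 0) {
      long rows = 0;
      while (rows < remaining) {
        // The last step of the table may be shorter than the granularity.
        long step = Math.min(granularity, remaining - rows);
        boolean fits = chunkReadLimit == 0
            || (rows + step) * bytesPerRow <= chunkReadLimit;
        // Always accept the first step so that the loop makes progress.
        if (!fits && rows > 0) {
          break;
        }
        rows += step;
      }
      chunks.add(rows);
      remaining -= rows;
    }
    return chunks;
  }
}
```

Under these assumptions, splitRows(25000, 100, 1_000_000, 4000) yields chunks of 8000, 8000 and 9000 rows, while a chunkReadLimit of 0 yields a single chunk of 25000 rows. A smaller granularity lets chunk sizes track the byte limit more closely, at the cost of more size checks.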
public boolean hasNext()
public Table readChunk()
public void close()
Specified by: close in interface AutoCloseable
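The three methods above are typically combined into a read loop: call hasNext(), then readChunk(), and close the reader when done. The following is a minimal sketch of that pattern using stand-in types so that it runs without a GPU or the native library. ChunkSource, FakeTable, ListSource and countRows are hypothetical names introduced here; only the method names hasNext(), readChunk() and close() come from this page. It assumes each returned chunk should be closed after use.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class ChunkLoopSketch {
  /** Hypothetical stand-in mirroring hasNext(), readChunk() and close(). */
  interface ChunkSource extends AutoCloseable {
    boolean hasNext();
    FakeTable readChunk();
    @Override void close();
  }

  /** Hypothetical stand-in for a returned table: a row count that must be closed. */
  static final class FakeTable implements AutoCloseable {
    final long rows;
    boolean closed;
    FakeTable(long rows) { this.rows = rows; }
    @Override public void close() { closed = true; }
  }

  /** In-memory source that hands out chunks with the given row counts. */
  static final class ListSource implements ChunkSource {
    private final Deque<Long> pending;
    boolean closed;
    ListSource(List<Long> chunkRows) { pending = new ArrayDeque<>(chunkRows); }
    @Override public boolean hasNext() { return !pending.isEmpty(); }
    @Override public FakeTable readChunk() { return new FakeTable(pending.removeFirst()); }
    @Override public void close() { closed = true; }
  }

  /** Drains the source, closing each chunk after use and the reader at the end. */
  static long countRows(ChunkSource source) {
    long total = 0;
    try (ChunkSource reader = source) {
      while (reader.hasNext()) {
        try (FakeTable chunk = reader.readChunk()) {
          total += chunk.rows;
        }
      }
    }
    return total;
  }
}
```

The nested try-with-resources blocks release each chunk before the next one is read, which keeps only one chunk alive at a time, and release the reader even if processing throws.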
Copyright © 2024. All rights reserved.