Class BatchedCompressor

java.lang.Object
ai.rapids.cudf.nvcomp.BatchedCompressor
Direct Known Subclasses:
BatchedLZ4Compressor, BatchedZstdCompressor

public abstract class BatchedCompressor extends Object
Multi-buffer compressor
  • Constructor Details

    • BatchedCompressor

      public BatchedCompressor(long chunkSize, long maxOutputChunkSize, long maxIntermediateBufferSize)
      Construct a batched compressor instance
      Parameters:
      chunkSize - maximum amount of uncompressed data to compress as a single chunk. Inputs larger than this will be compressed in multiple chunks.
      maxOutputChunkSize - maximum size in bytes of a single compressed output chunk.
      maxIntermediateBufferSize - desired maximum size of intermediate device buffers used during compression.
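
      The chunkSize description implies simple ceiling-division arithmetic: an input larger than chunkSize is split into several chunks. The sketch below only illustrates that arithmetic. The class ChunkCountSketch and its methods are hypothetical and are not part of the cudf API, and treating an empty input as zero chunks is an assumption rather than documented behavior.

```java
// Hypothetical helper illustrating how inputs map to chunks; not part of cudf.
final class ChunkCountSketch {
    // Number of chunks needed for one input of inputSize bytes.
    // Assumption: an empty input needs zero chunks.
    static long chunksForInput(long inputSize, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        if (inputSize < 0) {
            throw new IllegalArgumentException("inputSize must be non-negative");
        }
        // Ceiling division written to avoid overflow for very large sizes.
        return inputSize / chunkSize + (inputSize % chunkSize == 0 ? 0 : 1);
    }

    // Total number of chunks across a batch of inputs.
    static long totalChunks(long[] inputSizes, long chunkSize) {
        long total = 0;
        for (long size : inputSizes) {
            total = Math.addExact(total, chunksForInput(size, chunkSize));
        }
        return total;
    }

    public static void main(String[] args) {
        long chunkSize = 64 * 1024; // 64 KiB chunks, chosen only for illustration
        long[] inputSizes = {10, 65536, 65537, 200000};
        System.out.println(totalChunks(inputSizes, chunkSize)); // prints 8
    }
}
```

      With 64 KiB chunks, the four example inputs need 1, 1, 2, and 4 chunks respectively, so the batch contains 8 chunks in total.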
  • Method Details

    • compress

      public DeviceMemoryBuffer[] compress(BaseDeviceMemoryBuffer[] origInputs, Cuda.Stream stream)
      Compress a batch of buffers. The input buffers are closed by this call, so callers must not use them afterwards.
      Parameters:
      origInputs - buffers to compress
      stream - CUDA stream to use
      Returns:
      compressed buffers corresponding to the input buffers
    • batchedCompressGetTempSize

      protected abstract long batchedCompressGetTempSize(long batchSize, long maxChunkSize)
      Get the temporary workspace size required to perform compression of an entire batch.
      Parameters:
      batchSize - number of chunks in the batch
      maxChunkSize - maximum size of an uncompressed chunk in bytes
      Returns:
      The size of required temporary workspace in bytes to compress the batch.
    • batchedCompressAsync

      protected abstract void batchedCompressAsync(long devInPtrs, long devInSizes, long chunkSize, long batchSize, long tempPtr, long tempSize, long devOutPtrs, long compressedSizesOutPtr, long stream)
      Asynchronously compress a batch of buffers. Note that compressedSizesOutPtr must point to pinned memory for this operation to be asynchronous.
      Parameters:
      devInPtrs - device address of uncompressed buffer addresses vector
      devInSizes - device address of uncompressed buffer sizes vector
      chunkSize - maximum size of an uncompressed chunk in bytes
      batchSize - number of chunks in the batch
      tempPtr - device address of the temporary workspace buffer
      tempSize - size of the temporary workspace buffer in bytes
      devOutPtrs - device address of output buffer addresses vector
      compressedSizesOutPtr - device address at which to write the size of the compressed data written to each corresponding output buffer. Must point to a buffer with at least 8 bytes of memory per output buffer in the batch.
      stream - CUDA stream to use
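
      The compressedSizesOutPtr requirement of at least 8 bytes per output buffer can be checked with a little arithmetic before a subclass calls into native code. The sketch below is illustrative only: CompressedSizesSketch and its methods are hypothetical names, and the assumption that each entry is a 64-bit value in native byte order once copied to the host is not stated in the documentation above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for sizing and decoding the compressed-sizes buffer; not part of cudf.
final class CompressedSizesSketch {
    // From the parameter description: at least 8 bytes per output buffer.
    static final long BYTES_PER_SIZE_ENTRY = 8L;

    // Minimum number of bytes the sizes buffer must hold for a batch.
    static long sizesBufferBytes(long batchSize) {
        if (batchSize < 0) {
            throw new IllegalArgumentException("batchSize must be non-negative");
        }
        return Math.multiplyExact(batchSize, BYTES_PER_SIZE_ENTRY);
    }

    // Decode sizes from a host-side copy of the buffer.
    // Assumption: each entry is a 64-bit value in native byte order.
    static long[] readSizes(ByteBuffer hostCopy, int batchSize) {
        if (hostCopy.limit() < sizesBufferBytes(batchSize)) {
            throw new IllegalArgumentException("sizes buffer too small for batch");
        }
        // duplicate() leaves the caller's position and byte order untouched.
        ByteBuffer view = hostCopy.duplicate().order(ByteOrder.nativeOrder());
        long[] sizes = new long[batchSize];
        for (int i = 0; i < batchSize; i++) {
            sizes[i] = view.getLong(i * (int) BYTES_PER_SIZE_ENTRY);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(sizesBufferBytes(3)); // prints 24
    }
}
```

      Checking the buffer length up front turns an undersized sizes buffer into a Java exception rather than an out-of-bounds write by the native kernel.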