Class Cuda

java.lang.Object
ai.rapids.cudf.Cuda

public class Cuda extends Object
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static final class 
     
    static final class 
    Cuda.Stream
    A class representing a CUDA stream
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final Cuda.Stream
    DEFAULT_STREAM
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    Cuda()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static void
    asyncMemset(long dst, byte value, long count)
    Sets count bytes starting at the memory area pointed to by dst, with value.
    static void
    autoSetDevice()
    Set the device for this thread to the appropriate one.
    static void
    deviceSynchronize()
    Synchronizes the whole device using cudaDeviceSynchronize.
    static void
    freeZero()
    Calls cudaFree(0).
    static int
    getComputeCapabilityMajor()
    Gets the major CUDA compute capability of the current device.
    static int
    getComputeCapabilityMinor()
    Gets the minor CUDA compute capability of the current device.
    static CudaComputeMode
    getComputeMode()
    Gets the CUDA compute mode of the current device.
    static int
    getDevice()
    Get the id of the current device.
    static int
    getDeviceCount()
    Get the device count.
    static int
    getDriverVersion()
    Get the CUDA Driver version, which is the latest version of CUDA supported by the driver.
    static int
    getRuntimeVersion()
    Get the CUDA Runtime version of the current CUDA Runtime instance.
    static boolean
    isEnvCompatibleForTesting()
    This should only be used for tests, to enable or disable tests if the current environment is not compatible with this version of the library.
    static boolean
    isPtdsEnabled()
    Whether per-thread default stream is enabled.
    static CudaMemInfo
    memGetInfo()
    Mapping: cudaMemGetInfo(size_t *free, size_t *total)
    static void
    memset(long dst, byte value, long count)
    Sets count bytes starting at the memory area pointed to by dst, with value.
    static void
    multiBufferCopyAsync(long[] destAddrs, long[] srcAddrs, long[] copySizes, Cuda.Stream stream)
    Copy data from multiple device buffer sources to multiple device buffer destinations.
    static void
    profilerStart()
    Begins an Nsight profiling session, if a profiler is currently attached.
    static void
    profilerStop()
    Stops an active Nsight profiling session.
    static void
    setDevice(int device)
    Set the id of the current device.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • DEFAULT_STREAM

      public static final Cuda.Stream DEFAULT_STREAM
  • Constructor Details

    • Cuda

      public Cuda()
  • Method Details

    • getComputeMode

      public static CudaComputeMode getComputeMode()
      Gets the CUDA compute mode of the current device.
      Returns:
      the enum value of CudaComputeMode
    • memGetInfo

      public static CudaMemInfo memGetInfo() throws CudaException
      Mapping: cudaMemGetInfo(size_t *free, size_t *total)
      Throws:
      CudaException
    • memset

      public static void memset(long dst, byte value, long count) throws CudaException
      Sets count bytes starting at the memory area pointed to by dst, with value. The operation has completed when this returns, but it could overlap with operations occurring on other streams.
      Parameters:
      dst - Destination memory address
      value - Byte value to set dst with
      count - Size in bytes to set
      Throws:
      CudaException
    • asyncMemset

      public static void asyncMemset(long dst, byte value, long count) throws CudaException
      Sets count bytes starting at the memory area pointed to by dst, with value. The operation may not have completed when this returns, and it could overlap with operations occurring on other streams.
      Parameters:
      dst - Destination memory address
      value - Byte value to set dst with
      count - Size in bytes to set
      Throws:
      CudaException
    • getDevice

      public static int getDevice() throws CudaException
      Get the id of the current device.
      Returns:
      the id of the current device
      Throws:
      CudaException - on any error
    • getDeviceCount

      public static int getDeviceCount() throws CudaException
      Get the device count.
      Returns:
      the number of compute-capable devices
      Throws:
      CudaException - on any error
    • setDevice

      public static void setDevice(int device) throws CudaException, CudfException
      Set the id of the current device.

      Note that the id is relative to CUDA_VISIBLE_DEVICES. For example, if CUDA_VISIBLE_DEVICES=1,0 and you call setDevice(0), you will get device 1.

      Note that if RMM has been initialized and the requested device id does not match the device used to initialize RMM, this will throw an error.

      Throws:
      CudaException - on any error
      CudfException
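      The remapping described in the note above can be illustrated without a GPU. The sketch below is a hypothetical helper (VisibleDeviceMapping is not part of cudf) and assumes the variable holds only comma-separated integer indices; it shows which physical device a logical id passed to setDevice would select.

```java
import java.util.Arrays;

// Hypothetical helper, not part of ai.rapids.cudf. Assumes the visible-device
// list contains only comma-separated integer indices.
final class VisibleDeviceMapping {
    private VisibleDeviceMapping() {}

    /** Returns the physical device index that a logical device id refers to. */
    static int physicalDevice(String visibleDevices, int logicalId) {
        int[] ids = Arrays.stream(visibleDevices.split(","))
                .map(String::trim)
                .mapToInt(Integer::parseInt)
                .toArray();
        if (logicalId < 0 || logicalId >= ids.length) {
            throw new IllegalArgumentException("no visible device with id " + logicalId);
        }
        return ids[logicalId];
    }

    public static void main(String[] args) {
        // With the list "1,0", logical device 0 is physical device 1.
        System.out.println(physicalDevice("1,0", 0));  // prints 1
        System.out.println(physicalDevice("1,0", 1));  // prints 0
    }
}
```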
    • autoSetDevice

      public static void autoSetDevice() throws CudaException
      Set the device for this thread to the appropriate one. Java loves threads, but CUDA requires each thread to have the device set explicitly or it falls back to CUDA_VISIBLE_DEVICES. Most JNI calls through the cudf API will do this for you, but if you are writing your own JNI calls that extend cudf, you might want to call this before calling into your JNI APIs to ensure that the device is set correctly.
      Throws:
      CudaException - on any error
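      One way to apply the advice above is to run a per-thread initialization hook before any work on a new thread. The sketch below is a hypothetical pattern, not a cudf API: DeviceAwareThreads and its methods are invented for illustration, and the hook is a plain Runnable so the example runs without a GPU. In real use the hook would be a call to Cuda.autoSetDevice().

```java
import java.util.concurrent.ThreadFactory;

// Hypothetical helper, not part of ai.rapids.cudf.
final class DeviceAwareThreads {
    private DeviceAwareThreads() {}

    /** Builds a factory whose threads run perThreadInit before their task. */
    static ThreadFactory withInit(Runnable perThreadInit) {
        return task -> new Thread(() -> {
            perThreadInit.run();  // e.g. a call to Cuda.autoSetDevice()
            task.run();
        });
    }

    /** Runs task on a new thread from the factory and waits for it to finish. */
    static void runAndWait(ThreadFactory factory, Runnable task) {
        Thread t = factory.newThread(task);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        ThreadFactory factory = withInit(
                () -> System.out.println("init on " + Thread.currentThread().getName()));
        runAndWait(factory,
                () -> System.out.println("work on " + Thread.currentThread().getName()));
    }
}
```

      The same factory could be handed to an executor so that every pooled thread sets its device once, before the first task it runs.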
    • getDriverVersion

      public static int getDriverVersion() throws CudaException
      Get the CUDA Driver version, which is the latest version of CUDA supported by the driver. The version is returned as (1000 * major + 10 * minor). For example, CUDA 9.2 would be represented by 9020. If no driver is installed, then 0 is returned as the driver version.
      Returns:
      the CUDA driver version
      Throws:
      CudaException - on any error
    • getRuntimeVersion

      public static int getRuntimeVersion() throws CudaException
      Get the CUDA Runtime version of the current CUDA Runtime instance. The version is returned as (1000 * major + 10 * minor). For example, CUDA 9.2 would be represented by 9020.
      Returns:
      the CUDA Runtime version
      Throws:
      CudaException - on any error
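      The packed version integer returned by getDriverVersion and getRuntimeVersion can be split back into its parts with integer arithmetic. The sketch below is a hypothetical helper (CudaVersionDecoder is not part of cudf); it uses literal values so that it runs without a GPU.

```java
// Hypothetical helper, not part of ai.rapids.cudf: decodes the packed
// version integer, which is 1000 * major + 10 * minor.
final class CudaVersionDecoder {
    private CudaVersionDecoder() {}

    /** Major part of a packed CUDA version, e.g. 9020 -> 9. */
    static int major(int packed) {
        return packed / 1000;
    }

    /** Minor part of a packed CUDA version, e.g. 9020 -> 2. */
    static int minor(int packed) {
        return (packed % 1000) / 10;
    }

    /** Human-readable form, or "none" for 0 (no driver installed). */
    static String describe(int packed) {
        if (packed == 0) {
            return "none";
        }
        return major(packed) + "." + minor(packed);
    }

    public static void main(String[] args) {
        // In real use the argument would come from Cuda.getDriverVersion()
        // or Cuda.getRuntimeVersion().
        System.out.println(describe(9020));  // prints 9.2
        System.out.println(describe(0));     // prints none
    }
}
```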
    • getComputeCapabilityMajor

      public static int getComputeCapabilityMajor() throws CudaException
      Gets the major CUDA compute capability of the current device. For reference: https://developer.nvidia.com/cuda-gpus

      Hardware Generation    Compute Capability
      Ampere                 8.x
      Turing                 7.5
      Volta                  7.0, 7.2
      Pascal                 6.x
      Maxwell                5.x
      Kepler                 3.x
      Fermi                  2.x
      Returns:
      The Major compute capability version number of the current CUDA device
      Throws:
      CudaException - on any error
    • getComputeCapabilityMinor

      public static int getComputeCapabilityMinor() throws CudaException
      Gets the minor CUDA compute capability of the current device. For reference: https://developer.nvidia.com/cuda-gpus

      Hardware Generation    Compute Capability
      Ampere                 8.x
      Turing                 7.5
      Volta                  7.0, 7.2
      Pascal                 6.x
      Maxwell                5.x
      Kepler                 3.x
      Fermi                  2.x
      Returns:
      The Minor compute capability version number of the current CUDA device
      Throws:
      CudaException - on any error
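      The major and minor values can be combined to look up the hardware generation listed above. The sketch below is a hypothetical helper (ComputeCapabilityNames is not part of cudf) that encodes only the generations named in this reference, so any other combination is reported as "unknown".

```java
// Hypothetical helper, not part of ai.rapids.cudf. Covers only the
// generations listed in this reference; anything else is "unknown".
final class ComputeCapabilityNames {
    private ComputeCapabilityNames() {}

    /** Maps a (major, minor) compute capability to a generation name. */
    static String generation(int major, int minor) {
        switch (major) {
            case 8:
                return "Ampere";
            case 7:
                if (minor == 5) {
                    return "Turing";
                }
                if (minor == 0 || minor == 2) {
                    return "Volta";
                }
                return "unknown";
            case 6:
                return "Pascal";
            case 5:
                return "Maxwell";
            case 3:
                return "Kepler";
            case 2:
                return "Fermi";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        // In real use the arguments would come from
        // Cuda.getComputeCapabilityMajor() and Cuda.getComputeCapabilityMinor().
        System.out.println(generation(7, 5));  // prints Turing
        System.out.println(generation(7, 0));  // prints Volta
    }
}
```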
    • freeZero

      public static void freeZero() throws CudaException
      Calls cudaFree(0). This can be used to initialize the GPU after a call to setDevice().
      Throws:
      CudaException - on any error
    • isEnvCompatibleForTesting

      public static boolean isEnvCompatibleForTesting()
      This should only be used for tests, to enable or disable tests if the current environment is not compatible with this version of the library. Currently it only does some very basic checks, but these may be expanded in the future depending on needs.
      Returns:
      true if the environment is compatible, false otherwise.
    • isPtdsEnabled

      public static boolean isPtdsEnabled()
      Whether per-thread default stream is enabled.
    • multiBufferCopyAsync

      public static void multiBufferCopyAsync(long[] destAddrs, long[] srcAddrs, long[] copySizes, Cuda.Stream stream)
      Copy data from multiple device buffer sources to multiple device buffer destinations. For each buffer to copy there is a corresponding entry in the destination address, source address, and copy size vectors.
      Parameters:
      destAddrs - vector of device destination addresses
      srcAddrs - vector of device source addresses
      copySizes - vector of copy sizes
      stream - CUDA stream to use for the copy
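      Because the three arrays are parallel, entry i of each one describes the same copy. The sketch below shows one way to build them from a list of copy requests; MultiCopyPlan and Entry are hypothetical names, not part of cudf, and the addresses are placeholders rather than real device pointers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ai.rapids.cudf: flattens a list of
// (destination, source, size) copies into three parallel arrays.
final class MultiCopyPlan {
    /** One copy request; the addresses are assumed to be device pointers. */
    static final class Entry {
        final long dest;
        final long src;
        final long size;

        Entry(long dest, long src, long size) {
            if (size < 0) {
                throw new IllegalArgumentException("negative copy size: " + size);
            }
            this.dest = dest;
            this.src = src;
            this.size = size;
        }
    }

    final long[] destAddrs;
    final long[] srcAddrs;
    final long[] copySizes;

    MultiCopyPlan(List<Entry> entries) {
        int n = entries.size();
        destAddrs = new long[n];
        srcAddrs = new long[n];
        copySizes = new long[n];
        for (int i = 0; i < n; i++) {
            Entry e = entries.get(i);
            destAddrs[i] = e.dest;
            srcAddrs[i] = e.src;
            copySizes[i] = e.size;
        }
    }

    public static void main(String[] args) {
        // Placeholder addresses; real ones would come from device allocations.
        MultiCopyPlan plan = new MultiCopyPlan(Arrays.asList(
                new Entry(0x1000L, 0x9000L, 256L),
                new Entry(0x2000L, 0xA000L, 512L)));
        // With cudf available, the arrays would be passed as:
        // Cuda.multiBufferCopyAsync(plan.destAddrs, plan.srcAddrs, plan.copySizes, stream);
        System.out.println(Arrays.toString(plan.copySizes));  // prints [256, 512]
    }
}
```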
    • profilerStart

      public static void profilerStart()
      Begins an Nsight profiling session, if a profiler is currently attached.
    • profilerStop

      public static void profilerStop()
      Stops an active Nsight profiling session.
    • deviceSynchronize

      public static void deviceSynchronize()
      Synchronizes the whole device using cudaDeviceSynchronize.