kvikio Namespace Reference

KvikIO namespace. More...

Classes

struct  BatchOp
 IO operation used when submitting batches. More...
 
class  BatchHandle
 
class  AllocRetain
 Singleton class to retain host memory allocations. More...
 
class  CompatModeManager
 Store and manage the compatibility mode data associated with a FileHandle. More...
 
struct  DriverInitializer
 
struct  DriverProperties
 
class  defaults
 Singleton class of default values used throughout KvikIO. More...
 
struct  CUfileException
 
class  GenericSystemError
 
class  FileHandle
 Handle of an open file registered with cufile. More...
 
class  FileWrapper
 Class that provides RAII for file handling. More...
 
class  CUFileHandleWrapper
 Class that provides RAII for the cuFile handle. More...
 
class  NvtxManager
 Utility singleton class for NVTX annotation. More...
 
class  RemoteEndpoint
 Abstract base class for remote endpoints. More...
 
class  HttpEndpoint
 A remote endpoint using http. More...
 
class  S3Endpoint
 A remote endpoint using AWS's S3 protocol. More...
 
class  RemoteHandle
 Handle of remote file. More...
 
class  cudaAPI
 Shim layer of the cuda C-API. More...
 
class  cuFileAPI
 Shim layer of the cuFile C-API. More...
 
class  LibCurl
 Singleton class to initialize and cleanup the global state of libcurl. More...
 
class  CurlHandle
 Representation of a curl easy handle pointer and its operations. More...
 
class  StreamFuture
 Future of an asynchronous IO operation. More...
 
class  thread_pool_wrapper
 
class  PushAndPopContext
 Push CUDA context on creation and pop it on destruction. More...
 

Typedefs

using nvtx_color_type = int
 
using BS_thread_pool = thread_pool_wrapper< BS::thread_pool >
 

Enumerations

enum class  CompatMode : uint8_t { OFF , ON , AUTO }
 I/O compatibility mode. More...
 

Functions

void buffer_register (void const *devPtr_base, std::size_t size, int flags=0, std::vector< int > const &errors_to_ignore=std::vector< int >())
 Register existing cudaMalloc'ed memory with cuFile to pin it for GPUDirect Storage access. More...
 
void buffer_deregister (void const *devPtr_base)
 Deregister already registered device memory from cuFile. More...
 
void memory_register (void const *devPtr, int flags=0, std::vector< int > const &errors_to_ignore={})
 Register the device memory allocation that contains devPtr. Use this together with FileHandle::pread() and FileHandle::pwrite(). More...
 
void memory_deregister (void const *devPtr)
 Deregister already registered device memory from cuFile. More...
 
KVIKIO_EXPORT std::string const & config_path ()
 Get the filepath to cuFile's config file (cufile.json) or the empty string. More...
 
template<typename T >
T getenv_or (std::string_view env_var_name, T default_val)
 
template<>
bool getenv_or (std::string_view env_var_name, bool default_val)
 
template<>
CompatMode getenv_or (std::string_view env_var_name, CompatMode default_val)
 
template<>
std::vector< int > getenv_or (std::string_view env_var_name, std::vector< int > default_val)
 
int open_fd_parse_flags (std::string const &flags, bool o_direct)
 Parse open file flags given as a string and return oflags. More...
 
int open_fd (std::string const &file_path, std::string const &flags, bool o_direct, mode_t mode)
 Open file using open(2) More...
 
int open_flags (int fd)
 Get the flags of the file descriptor (see open(2)) More...
 
std::size_t get_file_size (int file_descriptor)
 Get the file size from a file descriptor using fstat(3). More...
 
template<typename F , typename T >
std::future< std::size_t > parallel_io (F op, T buf, std::size_t size, std::size_t file_offset, std::size_t task_size, std::size_t devPtr_offset, std::uint64_t call_idx=0, nvtx_color_type nvtx_color=NvtxManager::default_color())
 Apply read or write operation in parallel. More...
 
constexpr bool is_cuda_available ()
 Check if the CUDA library is available. More...
 
constexpr bool is_cufile_library_available () noexcept
 Check if the cuFile library is available. More...
 
bool is_cufile_available () noexcept
 Check if the cuFile is available and expected to work. More...
 
constexpr int cufile_version () noexcept
 Get cufile version (or zero if older than v1.8). More...
 
bool is_batch_api_available () noexcept
 Check if cuFile's batch API is available. More...
 
bool is_stream_api_available () noexcept
 Check if cuFile's stream (async) API is available. More...
 
void * load_library (std::string const &name, int mode=RTLD_LAZY|RTLD_LOCAL|RTLD_NODELETE)
 Load shared library. More...
 
void * load_library (std::vector< std::string > const &names, int mode=RTLD_LAZY|RTLD_LOCAL|RTLD_NODELETE)
 Load shared library. More...
 
template<typename T >
void get_symbol (T &handle, void *lib, std::string const &name)
 Get symbol using dlsym More...
 
bool is_running_in_wsl () noexcept
 Try to detect if running in Windows Subsystem for Linux (WSL) More...
 
bool run_udev_readable () noexcept
 Check if /run/udev is readable. More...
 
off_t convert_size2off (std::size_t x)
 
ssize_t convert_size2ssize (std::size_t x)
 
CUdeviceptr convert_void2deviceptr (void const *devPtr)
 
template<typename T , std::enable_if_t< std::is_integral_v< T >> * = nullptr>
std::int64_t convert_to_64bit (T value)
 Helper function to convert value to a 64-bit signed integer.
 
std::uint64_t convert_to_64bit (std::uint64_t value)
 Helper function to allow NVTX payload of type std::uint64_t to pass through without doing anything.
 
template<typename T , std::enable_if_t< std::is_floating_point_v< T >> * = nullptr>
double convert_to_64bit (T value)
 Helper function to convert value to a 64-bit float.
 
constexpr bool is_host_memory (void const *ptr)
 Check if ptr points to host memory (as opposed to device memory) More...
 
int get_device_ordinal_from_pointer (CUdeviceptr dev_ptr)
 Return the device owning the pointer. More...
 
KVIKIO_EXPORT CUcontext get_primary_cuda_context (int ordinal)
 Given a device ordinal, return the primary context of the device. More...
 
std::optional< CUcontext > get_context_associated_pointer (CUdeviceptr dev_ptr)
 Return the CUDA context associated with the given device pointer, if any. More...
 
bool current_context_can_access_pointer (CUdeviceptr dev_ptr)
 Check if the current CUDA context can access the given device pointer. More...
 
CUcontext get_context_from_pointer (void const *devPtr)
 Return a CUDA context that can be used with the given device pointer. More...
 
std::tuple< void *, std::size_t, std::size_t > get_alloc_info (void const *devPtr, CUcontext *ctx=nullptr)
 
template<typename T >
std::future< std::decay_t< T > > make_ready_future (T &&t)
 Create a shared state in a future object that is immediately ready. More...
 
template<typename T >
bool is_future_done (T const &future)
 Check the status of the future object. True indicates that the result is available in the future's shared state. False otherwise. More...
 

Variables

constexpr std::size_t page_size = 4096
 

Detailed Description

KvikIO namespace.

Enumeration Type Documentation

◆ CompatMode

enum kvikio::CompatMode : uint8_t
strong

I/O compatibility mode.

Enumerator
OFF 

Enforce cuFile I/O. GDS will be activated if the system requirements for cuFile are met and cuFile is properly configured. However, if the system is not suited for cuFile, I/O operations under the OFF option may error out.

ON 

Enforce POSIX I/O.

AUTO 

Try cuFile I/O first, and fall back to POSIX I/O if the system requirements for cuFile are not met.

Definition at line 28 of file compat_mode.hpp.
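
The three modes can be read as a small decision rule. The sketch below is illustrative only: it mirrors the enum locally so it compiles without KvikIO, and `use_posix_io` and its `cufile_usable` argument are hypothetical names standing in for whatever runtime check the library performs.

```cpp
#include <cstdint>

// Local mirror of the documented enum, so the sketch compiles on its own.
enum class CompatMode : std::uint8_t { OFF, ON, AUTO };

// Hypothetical helper (not part of KvikIO): should POSIX I/O be used?
// `cufile_usable` is an assumed flag saying whether cuFile can work here.
constexpr bool use_posix_io(CompatMode mode, bool cufile_usable)
{
  switch (mode) {
    case CompatMode::ON: return true;              // POSIX I/O is enforced
    case CompatMode::OFF: return false;            // cuFile is enforced and may error out
    case CompatMode::AUTO: return !cufile_usable;  // fall back only when cuFile is unusable
  }
  return true;  // not reached for valid enumerators
}
```

Note that under OFF the helper never selects POSIX I/O, which is why operations may fail on systems that do not meet the cuFile requirements.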

Function Documentation

◆ buffer_deregister()

void kvikio::buffer_deregister ( void const *  devPtr_base)

Deregister already registered device memory from cuFile.

Parameters
devPtr_base  Device pointer to deregister

◆ buffer_register()

void kvikio::buffer_register ( void const *  devPtr_base,
std::size_t  size,
int  flags = 0,
std::vector< int > const &  errors_to_ignore = std::vector< int >() 
)

Register existing cudaMalloc'ed memory with cuFile to pin it for GPUDirect Storage access.

Parameters
devPtr_base  Device pointer to the allocated memory
size  Size of the memory region starting at devPtr_base
flags  Should be zero or CU_FILE_RDMA_REGISTER (experimental)
errors_to_ignore  cuFile errors to ignore, such as CU_FILE_MEMORY_ALREADY_REGISTERED or CU_FILE_INVALID_MAPPING_SIZE
Note
This memory will be used to perform GPU direct DMA from the supported storage.
Warning
This API is intended for use cases where the memory is used as a streaming buffer that is reused across multiple cuFile IO operations.

◆ config_path()

KVIKIO_EXPORT std::string const& kvikio::config_path ( )

Get the filepath to cuFile's config file (cufile.json) or the empty string.

This lookup is cached.

Returns
The filepath to the cufile.json file or the empty string if it isn't found.

◆ cufile_version()

constexpr int kvikio::cufile_version ( )
constexpr noexcept

Get cufile version (or zero if older than v1.8).

The version is returned as (1000*major + 10*minor). E.g., cufile v1.8.0 would be represented by 1080.

Notice, this is not the version of the CUDA toolkit. cufile is part of the toolkit but follows its own version scheme.

Returns
The version (1000*major + 10*minor) or zero if older than 1080.

Definition at line 134 of file cufile.hpp.
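
To make the encoding concrete, here is a minimal sketch of packing and unpacking a version number under the 1000*major + 10*minor scheme. The helper names are hypothetical and are not part of KvikIO.

```cpp
// Hypothetical helpers illustrating the 1000*major + 10*minor encoding.
constexpr int encode_version(int major, int minor) { return 1000 * major + 10 * minor; }

// Recover the components from an encoded version.
constexpr int version_major(int encoded) { return encoded / 1000; }
constexpr int version_minor(int encoded) { return (encoded % 1000) / 10; }

// cufile v1.8 is represented by 1080, as stated in the text.
static_assert(encode_version(1, 8) == 1080, "v1.8 encodes to 1080");
```

A caller could compare the value returned by cufile_version() against `encode_version(1, 8)` instead of a bare literal.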

◆ current_context_can_access_pointer()

bool kvikio::current_context_can_access_pointer ( CUdeviceptr  dev_ptr)

Check if the current CUDA context can access the given device pointer.

Parameters
dev_ptr  Device pointer to query
Returns
The boolean answer

◆ get_context_associated_pointer()

std::optional<CUcontext> kvikio::get_context_associated_pointer ( CUdeviceptr  dev_ptr)

Return the CUDA context associated with the given device pointer, if any.

Parameters
dev_ptr  Device pointer to query
Returns
Usable CUDA context, if one was found.

◆ get_context_from_pointer()

CUcontext kvikio::get_context_from_pointer ( void const *  devPtr)

Return a CUDA context that can be used with the given device pointer.

For robustness, we look for a usable context in the following order:
1) If a context has been associated with devPtr, it is returned.
2) If the current context exists and can access devPtr, it is returned.
3) Otherwise, the primary context of the device that owns devPtr is returned.
We assume the primary context can access devPtr, which might not be true in the exceptional disjoint addressing cases mentioned in the CUDA docs[1]. In these cases, the user has to set a usable current context before reading or writing with KvikIO.

[1] https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__UNIFIED.html

Parameters
devPtr  Device pointer to query
Returns
Usable CUDA context
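
The lookup order described above can be sketched as a chain of fallbacks. This is not KvikIO's implementation: `Context` is a stand-in for CUcontext, and the three callbacks in `ContextLookups` are hypothetical placeholders for the real CUDA queries, so the example runs without a GPU.

```cpp
#include <functional>
#include <optional>

using Context = int;  // hypothetical stand-in for CUcontext

// Hypothetical hooks, one per step of the documented lookup order.
struct ContextLookups {
  std::function<std::optional<Context>()> associated;         // 1) context associated with the pointer
  std::function<std::optional<Context>()> current_if_usable;  // 2) current context, if it can access the pointer
  std::function<Context()> primary_of_owner;                  // 3) primary context of the owning device
};

// Return the first context found, trying the steps in order.
Context pick_context(ContextLookups const& lookups)
{
  if (auto ctx = lookups.associated()) { return *ctx; }
  if (auto ctx = lookups.current_if_usable()) { return *ctx; }
  return lookups.primary_of_owner();  // assumed usable, as the text notes
}
```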

◆ get_device_ordinal_from_pointer()

int kvikio::get_device_ordinal_from_pointer ( CUdeviceptr  dev_ptr)

Return the device owning the pointer.

Parameters
dev_ptr  Device pointer to query
Returns
The device ordinal

◆ get_file_size()

std::size_t kvikio::get_file_size ( int  file_descriptor)

Get the file size from a file descriptor using fstat(3).

Parameters
file_descriptor  Open file descriptor
Returns
The number of bytes
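
For readers unfamiliar with fstat, a minimal sketch of querying a file size from a descriptor follows. The function name `file_size_of` and the choice to throw std::runtime_error on failure are assumptions made for illustration, not KvikIO's actual behavior.

```cpp
#include <sys/stat.h>

#include <cstddef>
#include <stdexcept>

// Hypothetical sketch: return the size in bytes of the file behind `fd`.
std::size_t file_size_of(int fd)
{
  struct stat st{};
  // fstat returns -1 on failure, e.g. when fd is not an open descriptor.
  if (fstat(fd, &st) == -1) { throw std::runtime_error("fstat failed"); }
  return static_cast<std::size_t>(st.st_size);
}
```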

◆ get_primary_cuda_context()

KVIKIO_EXPORT CUcontext kvikio::get_primary_cuda_context ( int  ordinal)

Given a device ordinal, return the primary context of the device.

This function caches the retrieved primary contexts until program exit.

Parameters
ordinal  Device ordinal, an integer from 0 up to (but not including) the number of CUDA devices
Returns
Primary CUDA context

◆ get_symbol()

template<typename T >
void kvikio::get_symbol ( T &  handle,
void *  lib,
std::string const &  name 
)

Get symbol using dlsym

Template Parameters
T  The type of the function pointer.
Parameters
handle  The function pointer (output).
lib  The library handle returned by dlopen.
name  Name of the symbol/function to load.

Definition at line 69 of file shim/utils.hpp.

◆ is_batch_api_available()

bool kvikio::is_batch_api_available ( )
noexcept

Check if cuFile's batch API is available.

Since cuFileGetVersion() first became available in cufile v1.8 (CTK v12.3), this function returns false for versions older than v1.8 even though the batch API became available in v1.6.

Returns
The boolean answer

◆ is_cuda_available()

constexpr bool kvikio::is_cuda_available ( )
constexpr

Check if the CUDA library is available.

Notice, this doesn't check if the runtime environment supports CUDA.

Returns
The boolean answer

Definition at line 72 of file cuda.hpp.

◆ is_cufile_available()

bool kvikio::is_cufile_available ( )
noexcept

Check if the cuFile is available and expected to work.

Besides checking if the cuFile library is available, this also checks the runtime environment.

Returns
The boolean answer

◆ is_cufile_library_available()

constexpr bool kvikio::is_cufile_library_available ( )
constexpr noexcept

Check if the cuFile library is available.

Notice, this doesn't check if the runtime environment supports cuFile.

Returns
The boolean answer

Definition at line 107 of file cufile.hpp.

◆ is_future_done()

template<typename T >
bool kvikio::is_future_done ( T const &  future)

Check the status of the future object. True indicates that the result is available in the future's shared state. False otherwise.

The future shall not be created using std::async(std::launch::deferred). Otherwise, this function always returns true.

Template Parameters
T  Type of the future.
Parameters
future  Instance of the future.
Returns
Boolean answer indicating if the future is ready or not.

Definition at line 187 of file utils.hpp.
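
One common way to poll a future without blocking is a zero-length wait. The sketch below assumes that approach; the name `future_done` is hypothetical, and treating every status other than timeout as done is an assumption chosen to match the documented behavior for deferred futures.

```cpp
#include <chrono>
#include <future>

// Hypothetical sketch: non-blocking readiness check for a std::future.
template <typename T>
bool future_done(T const& future)
{
  // A zero-length wait returns immediately. Any status other than `timeout`
  // counts as done, so a deferred future (status `deferred`) reports true
  // even though its task has not run yet.
  return future.wait_for(std::chrono::seconds(0)) != std::future_status::timeout;
}
```

This shows why deferred futures are excluded: their status never becomes timeout, so the check cannot distinguish "not started" from "finished".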

◆ is_host_memory()

constexpr bool kvikio::is_host_memory ( void const *  ptr)
constexpr

Check if ptr points to host memory (as opposed to device memory)

In this context, managed memory counts as device memory.

Parameters
ptr  Memory pointer to query
Returns
The boolean answer

Definition at line 80 of file utils.hpp.

◆ is_running_in_wsl()

bool kvikio::is_running_in_wsl ( )
noexcept

Try to detect if running in Windows Subsystem for Linux (WSL)

When unable to determine environment, false is returned.

Returns
The boolean answer

◆ is_stream_api_available()

bool kvikio::is_stream_api_available ( )
noexcept

Check if cuFile's stream (async) API is available.

Since cuFileGetVersion() first became available in cufile v1.8 (CTK v12.3), this function returns false for versions older than v1.8 even though the stream API became available in v1.7.

Returns
The boolean answer

◆ load_library() [1/2]

void* kvikio::load_library ( std::string const &  name,
int  mode = RTLD_LAZY|RTLD_LOCAL|RTLD_NODELETE 
)

Load shared library.

Parameters
name  Name of the library to load.
Returns
The library handle.

◆ load_library() [2/2]

void* kvikio::load_library ( std::vector< std::string > const &  names,
int  mode = RTLD_LAZY|RTLD_LOCAL|RTLD_NODELETE 
)

Load shared library.

Parameters
names  Vector of names to try when loading the shared library.
Returns
The library handle.

◆ make_ready_future()

template<typename T >
std::future<std::decay_t<T> > kvikio::make_ready_future ( T &&  t)

Create a shared state in a future object that is immediately ready.

A partial implementation of the namesake function from the concurrency TS (https://en.cppreference.com/w/cpp/experimental/make_ready_future). The cases of std::reference_wrapper and void are not implemented.

Template Parameters
T  Type of the value provided.
Parameters
t  Object provided.
Returns
A future holding a decayed copy of the object provided.

Definition at line 167 of file utils.hpp.
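
A ready future can be built from a promise whose value is set before the future is handed out. The following is a minimal sketch of that idea; the name `ready_future` is hypothetical and, like the documented function, it does not handle std::reference_wrapper or void.

```cpp
#include <future>
#include <type_traits>
#include <utility>

// Hypothetical sketch: wrap a value in a future that is ready immediately.
template <typename T>
std::future<std::decay_t<T>> ready_future(T&& value)
{
  std::promise<std::decay_t<T>> promise;
  promise.set_value(std::forward<T>(value));  // the shared state now holds a decayed copy
  return promise.get_future();                // the shared state outlives the promise
}
```

Calling get() on the returned future does not block, because the value was stored before the future was created.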

◆ memory_deregister()

void kvikio::memory_deregister ( void const *  devPtr)

Deregister already registered device memory from cuFile.

Parameters
devPtr  Device pointer to deregister

◆ memory_register()

void kvikio::memory_register ( void const *  devPtr,
int  flags = 0,
std::vector< int > const &  errors_to_ignore = {} 
)

Register the device memory allocation that contains devPtr. Use this together with FileHandle::pread() and FileHandle::pwrite().

Parameters
devPtr  Device pointer
flags  Should be zero or CU_FILE_RDMA_REGISTER (experimental)
errors_to_ignore  cuFile errors to ignore, such as CU_FILE_MEMORY_ALREADY_REGISTERED or CU_FILE_INVALID_MAPPING_SIZE
Note
This memory will be used to perform GPU direct DMA from the supported storage.
Warning
This API is intended for use cases where the memory is used as a streaming buffer that is reused across multiple cuFile IO operations.

◆ open_fd()

int kvikio::open_fd ( std::string const &  file_path,
std::string const &  flags,
bool  o_direct,
mode_t  mode 
)

Open file using open(2)

Parameters
flags  Open flags given as a string
o_direct  Append O_DIRECT to flags
mode  Access modes
Returns
File descriptor

◆ open_fd_parse_flags()

int kvikio::open_fd_parse_flags ( std::string const &  flags,
bool  o_direct 
)

Parse open file flags given as a string and return oflags.

Parameters
flags  The flags
o_direct  Append O_DIRECT to the open flags
Returns
oflags
Exceptions
std::invalid_argument  if the specified flags are not supported.
std::invalid_argument  if o_direct is true, but O_DIRECT is not supported.
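
As an illustration of this kind of parsing, the sketch below maps a small set of fopen-style strings to open(2) flags. The accepted strings ("r", "w", optionally followed by "+") and the name `parse_flags_sketch` are assumptions made for this example; the page does not list which strings KvikIO accepts.

```cpp
#include <fcntl.h>

#include <stdexcept>
#include <string>

// Hypothetical sketch: translate fopen-style flags into open(2) oflags.
// The accepted strings are an assumption, not taken from KvikIO.
int parse_flags_sketch(std::string const& flags, bool o_direct)
{
  if (flags.empty()) { throw std::invalid_argument("empty open flags"); }
  int oflags = 0;
  switch (flags[0]) {
    case 'r': oflags = O_RDONLY; break;
    case 'w': oflags = O_WRONLY | O_CREAT | O_TRUNC; break;
    default: throw std::invalid_argument("unsupported open flags");
  }
  // A '+' upgrades the access mode to read/write and keeps the other bits.
  if (flags.find('+') != std::string::npos) { oflags = (oflags & ~O_ACCMODE) | O_RDWR; }
  if (o_direct) {
#ifdef O_DIRECT
    oflags |= O_DIRECT;
#else
    throw std::invalid_argument("O_DIRECT is not supported on this platform");
#endif
  }
  return oflags;
}
```

Both failure paths throw std::invalid_argument, mirroring the two exceptions documented above.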

◆ open_flags()

int kvikio::open_flags ( int  fd)

Get the flags of the file descriptor (see open(2))

Returns
Open flags

◆ parallel_io()

template<typename F , typename T >
std::future<std::size_t> kvikio::parallel_io ( F  op,
T  buf,
std::size_t  size,
std::size_t  file_offset,
std::size_t  task_size,
std::size_t  devPtr_offset,
std::uint64_t  call_idx = 0,
nvtx_color_type  nvtx_color = NvtxManager::default_color() 
)

Apply read or write operation in parallel.

Template Parameters
F  The type of the function applying the read or write operation.
T  The type of the memory pointer.
Parameters
op  The function applying the read or write operation.
buf  Buffer pointer to read or write to.
size  Number of bytes to read or write.
file_offset  Byte offset to the start of the file.
task_size  Size of each task in bytes.
Returns
A future to be used later to check if the operation has finished its execution.

Definition at line 141 of file parallel_operation.hpp.
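
The general pattern of splitting one transfer into fixed-size tasks can be sketched with std::async. This is not KvikIO's implementation, which is documented only by its signature here: the name `chunked_io`, the assumed callback signature `op(buf, nbytes, file_offset, buf_offset)`, and the use of std::async in place of a thread pool are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical sketch: run `op` on task_size-byte chunks concurrently and
// return a future holding the total number of bytes processed.
// Assumed callback signature: op(buf, nbytes, file_offset, buf_offset) -> bytes done.
template <typename F, typename T>
std::future<std::size_t> chunked_io(
  F op, T buf, std::size_t size, std::size_t file_offset, std::size_t task_size)
{
  if (task_size == 0) { throw std::invalid_argument("task_size must be positive"); }
  std::vector<std::future<std::size_t>> tasks;
  for (std::size_t done = 0; done < size; done += task_size) {
    std::size_t const nbytes = std::min(task_size, size - done);  // last chunk may be short
    tasks.push_back(std::async(std::launch::async, op, buf, nbytes, file_offset + done, done));
  }
  // Gather task: waits for every chunk and sums the byte counts.
  return std::async(std::launch::async, [tasks = std::move(tasks)]() mutable {
    std::size_t total = 0;
    for (auto& task : tasks) { total += task.get(); }
    return total;
  });
}
```

Each chunk receives both its file offset and its offset into the buffer, so the callback can address non-overlapping regions without any shared state.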

◆ run_udev_readable()

bool kvikio::run_udev_readable ( )
noexcept

Check if /run/udev is readable.

cuFile fails with an internal error when /run/udev isn't readable. This typically happens when running inside a Docker container not launched with --volume /run/udev:/run/udev:ro.

Returns
The boolean answer