KvikIO namespace. More...
Classes | |
| struct | BatchOp |
| IO operation used when submitting batches. More... | |
| class | BatchHandle |
| Handle of a cuFile batch using semantic. More... | |
| class | PageAlignedAllocator |
| Allocator for page-aligned host memory. More... | |
| class | CudaPinnedAllocator |
| Allocator for CUDA pinned host memory. More... | |
| class | CudaPageAlignedPinnedAllocator |
| Allocator for page-aligned AND CUDA-registered pinned host memory. More... | |
| class | BounceBufferPool |
| Thread-safe singleton pool for reusable bounce buffers. More... | |
| class | CompatModeManager |
| Store and manage the compatibility mode data associated with a FileHandle. More... | |
| struct | DriverInitializer |
| struct | DriverProperties |
| class | defaults |
| Singleton class of default values used throughout KvikIO. More... | |
| struct | libkvikio_domain |
| Tag type for libkvikio's NVTX domain. More... | |
| class | NvtxManager |
| Utility singleton class for NVTX annotation. More... | |
| struct | CUfileException |
| class | GenericSystemError |
| class | FileHandle |
| Handle of an open file registered with cufile. More... | |
| class | FileWrapper |
| Class that provides RAII for file handling. More... | |
| class | CUFileHandleWrapper |
| Class that provides RAII for the cuFile handle. More... | |
| struct | BlockDeviceInfo |
| Information about a block device. More... | |
| class | WebHdfsEndpoint |
| A remote endpoint for Apache Hadoop WebHDFS. More... | |
| class | MmapHandle |
| Handle of a memory-mapped file. More... | |
| class | RemoteEndpoint |
| Abstract base class for remote endpoints. More... | |
| class | HttpEndpoint |
| A remote endpoint for HTTP/HTTPS resources. More... | |
| class | S3Endpoint |
| A remote endpoint for AWS S3 storage requiring credentials. More... | |
| class | S3PublicEndpoint |
| A remote endpoint for publicly accessible S3 objects without authentication. More... | |
| class | S3EndpointWithPresignedUrl |
| A remote endpoint for AWS S3 storage using presigned URLs. More... | |
| class | RemoteHandle |
| Handle of a remote file. More... | |
| class | cudaAPI |
| Shim layer of the CUDA C-API. More... | |
| class | cuFileAPI |
| Shim layer of the cuFile C-API. More... | |
| class | LibCurl |
| Singleton class to initialize and cleanup the global state of libcurl. More... | |
| class | CurlHandle |
| Representation of a curl easy handle pointer and its operations. More... | |
| class | StreamFuture |
| Future of an asynchronous IO operation. More... | |
| class | PushAndPopContext |
| Push CUDA context on creation and pop it on destruction. More... | |
Typedefs | |
| using | PageAlignedBounceBufferPool = BounceBufferPool< PageAlignedAllocator > |
| Bounce buffer pool using page-aligned host memory. More... | |
| using | CudaPinnedBounceBufferPool = BounceBufferPool< CudaPinnedAllocator > |
| Bounce buffer pool using CUDA pinned memory. More... | |
| using | CudaPageAlignedPinnedBounceBufferPool = BounceBufferPool< CudaPageAlignedPinnedAllocator > |
| Bounce buffer pool using page-aligned CUDA-registered pinned memory. More... | |
| using | nvtx_scoped_range_type = nvtx3::scoped_range_in< libkvikio_domain > |
| using | nvtx_registered_string_type = nvtx3::registered_string_in< libkvikio_domain > |
| using | nvtx_color_type = nvtx3::color |
| using | ThreadPool = BS::thread_pool |
| Thread pool type used for parallel I/O operations. | |
Enumerations | |
| enum class | CompatMode : uint8_t { OFF , ON , AUTO } |
| I/O compatibility mode. More... | |
| enum class | RemoteEndpointType : uint8_t { AUTO , S3 , S3_PUBLIC , S3_PRESIGNED_URL , WEBHDFS , HTTP } |
| Types of remote file endpoints supported by KvikIO. More... | |
Functions | |
| void | buffer_register (void const *devPtr_base, std::size_t size, int flags=0, std::vector< int > const &errors_to_ignore=std::vector< int >()) |
| Register a device memory region with cuFile for GPUDirect Storage access. More... | |
| void | buffer_deregister (void const *devPtr_base) |
| Deregister a device memory region from cuFile. More... | |
| void | memory_register (void const *devPtr, int flags=0, std::vector< int > const &errors_to_ignore={}) |
| Register a device memory allocation with cuFile for GPUDirect Storage access. Use this function together with FileHandle::pread() and FileHandle::pwrite(). More... | |
| void | memory_deregister (void const *devPtr) |
| Deregister a device memory allocation from cuFile. More... | |
| KVIKIO_EXPORT std::string const & | config_path () |
| Get the filepath to cuFile's config file (cufile.json) or the empty string. More... | |
| template<typename T > | |
| T | getenv_or (std::string_view env_var_name, T default_val) |
| template<> | |
| bool | getenv_or (std::string_view env_var_name, bool default_val) |
| template<> | |
| CompatMode | getenv_or (std::string_view env_var_name, CompatMode default_val) |
| template<> | |
| std::vector< int > | getenv_or (std::string_view env_var_name, std::vector< int > default_val) |
| template<typename T > | |
| std::tuple< std::string_view, T, bool > | getenv_or (std::initializer_list< std::string_view > env_var_names, T default_val) |
| Get the environment variable value from a candidate list. More... | |
| template<typename F , typename T > | |
| std::future< std::size_t > | parallel_io (F op, T buf, std::size_t size, std::size_t file_offset, std::size_t task_size, std::size_t devPtr_offset, ThreadPool *thread_pool=&defaults::thread_pool(), std::uint64_t call_idx=0, nvtx_color_type nvtx_color=NvtxManager::default_color()) |
| Apply read or write operation in parallel. More... | |
| int | open_fd_parse_flags (std::string const &flags, bool o_direct) |
| Parse open file flags given as a string and return oflags. More... | |
| int | open_fd (std::string const &file_path, std::string const &flags, bool o_direct, mode_t mode) |
| Open file using open(2). More... | |
| int | open_flags (int fd) |
| Get the flags of the file descriptor (see open(2)). More... | |
| std::size_t | get_file_size (std::string const &file_path) |
| Get file size given the file path. More... | |
| std::size_t | get_file_size (int file_descriptor) |
| Get file size from file descriptor using fstat(3). More... | |
| std::pair< std::size_t, std::size_t > | get_page_cache_info (std::string const &file_path) |
| Obtain the page cache residency information for a given file. More... | |
| std::pair< std::size_t, std::size_t > | get_page_cache_info (int fd) |
| Obtain the page cache residency information for a given file. More... | |
| void | drop_file_page_cache (int fd, std::size_t offset=0, std::size_t length=0, bool sync_first=true) |
| Drop page cache for a specific file. More... | |
| void | drop_file_page_cache (std::string const &file_path, std::size_t offset=0, std::size_t length=0, bool sync_first=true) |
| Drop page cache for a specific file. More... | |
| bool | drop_system_page_cache (bool reclaim_dentries_and_inodes=true, bool sync_first=true) |
| Drop the system page cache. More... | |
| bool | clear_page_cache (bool reclaim_dentries_and_inodes=true, bool clear_dirty_pages=true) |
| Drop the system page cache. Deprecated. Use drop_system_page_cache instead. | |
| BlockDeviceInfo | get_block_device_info (std::string const &file_path) |
| Get information about the physical block device hosting a file. More... | |
| bool | is_cuda_available () |
| Check if the CUDA library is available. More... | |
| constexpr bool | is_cufile_library_available () noexcept |
| Check if the cuFile library is available. More... | |
| bool | is_cufile_available () noexcept |
| Check if the cuFile is available and expected to work. More... | |
| int | cufile_version () noexcept |
| Get cufile version (or zero if older than v1.8). More... | |
| void * | load_library (std::string const &name, int mode=RTLD_LAZY|RTLD_LOCAL|RTLD_NODELETE) |
| Load shared library. More... | |
| template<typename T > | |
| void | get_symbol (T &handle, void *lib, std::string const &name) |
| Get symbol using dlsym. More... | |
| bool | is_running_in_wsl () noexcept |
| Try to detect if running in Windows Subsystem for Linux (WSL) More... | |
| bool | run_udev_readable () noexcept |
| Check if /run/udev is readable. More... | |
| void | stream_register (CUstream stream, unsigned flags) |
| Registers the CUDA stream to the cuFile subsystem. More... | |
| void | stream_deregister (CUstream stream) |
| Deregisters the CUDA stream from the cuFile subsystem. More... | |
| std::size_t | get_page_size () |
| off_t | convert_size2off (std::size_t x) |
| ssize_t | convert_size2ssize (std::size_t x) |
| CUdeviceptr | convert_void2deviceptr (void const *devPtr) |
| template<typename T , std::enable_if_t< std::is_integral_v< T >> * = nullptr> | |
| std::int64_t | convert_to_64bit (T value) |
| Helper function to convert a value to a 64-bit signed integer. | |
| std::uint64_t | convert_to_64bit (std::uint64_t value) |
| Helper function to allow NVTX payload of type std::uint64_t to pass through without doing anything. | |
| template<typename T , std::enable_if_t< std::is_floating_point_v< T >> * = nullptr> | |
| double | convert_to_64bit (T value) |
| Helper function to convert a value to a 64-bit float. | |
| bool | is_host_memory (void const *ptr) |
| Check if ptr points to host memory (as opposed to device memory). More... | |
| int | get_device_ordinal_from_pointer (CUdeviceptr dev_ptr) |
| Return the device owning the pointer. More... | |
| KVIKIO_EXPORT CUcontext | get_primary_cuda_context (int ordinal) |
| Given a device ordinal, return the primary context of the device. More... | |
| std::optional< CUcontext > | get_context_associated_pointer (CUdeviceptr dev_ptr) |
| Return the CUDA context associated with the given device pointer, if any. More... | |
| bool | current_context_can_access_pointer (CUdeviceptr dev_ptr) |
| Check if the current CUDA context can access the given device pointer. More... | |
| CUcontext | get_context_from_pointer (void const *devPtr) |
| Return a CUDA context that can be used with the given device pointer. More... | |
| std::tuple< void *, std::size_t, std::size_t > | get_alloc_info (void const *devPtr, CUcontext *ctx=nullptr) |
| template<typename T > | |
| std::future< std::decay_t< T > > | make_ready_future (T &&t) |
| Create a shared state in a future object that is immediately ready. More... | |
| template<typename T > | |
| bool | is_future_done (T const &future) |
| Check the status of the future object. True indicates that the result is available in the future's shared state. False otherwise. More... | |
KvikIO namespace.
| using kvikio::CudaPageAlignedPinnedBounceBufferPool = typedef BounceBufferPool<CudaPageAlignedPinnedAllocator> |
Bounce buffer pool using page-aligned CUDA-registered pinned memory.
Use for: Device I/O operations with Direct I/O enabled. Provides both page alignment (for Direct I/O) and CUDA registration (for efficient transfers).
Definition at line 270 of file bounce_buffer.hpp.
| using kvikio::CudaPinnedBounceBufferPool = typedef BounceBufferPool<CudaPinnedAllocator> |
Bounce buffer pool using CUDA pinned memory.
Use for: Device I/O operations without Direct I/O. Note: Not page-aligned, so it cannot be used with Direct I/O.
Definition at line 262 of file bounce_buffer.hpp.
| using kvikio::PageAlignedBounceBufferPool = typedef BounceBufferPool<PageAlignedAllocator> |
Bounce buffer pool using page-aligned host memory.
Use for: Host-only Direct I/O operations (no CUDA context involvement)
Definition at line 254 of file bounce_buffer.hpp.
| enum kvikio::CompatMode : uint8_t | strong |
I/O compatibility mode.
Definition at line 15 of file compat_mode.hpp.
| enum kvikio::RemoteEndpointType : uint8_t | strong |
Types of remote file endpoints supported by KvikIO.
This enum defines the different protocols and services that can be used to access remote files. It is used to specify or detect the type of remote endpoint when opening files.
Definition at line 31 of file remote_handle.hpp.
| void kvikio::buffer_deregister | ( | void const * | devPtr_base | ) |
Deregister a device memory region from cuFile.
This is the low-level deregistration function that requires the caller to specify the exact base address that was previously registered. For a convenience wrapper that automatically discovers the allocation boundaries, see memory_deregister().
In compatibility mode (when GDS is unavailable), this function is a no-op.
| devPtr_base | Base address of the device memory region to deregister. Must match the address used in the corresponding buffer_register() call. |
| CUfileException | If cuFile deregistration fails. |
| void kvikio::buffer_register | ( | void const * | devPtr_base, |
| std::size_t | size, | ||
| int | flags = 0, |
||
| std::vector< int > const & | errors_to_ignore = std::vector< int >() |
||
| ) |
Register a device memory region with cuFile for GPUDirect Storage access.
This is the low-level registration function that requires the caller to specify the exact base address and size of the memory region to register. For a convenience wrapper that automatically discovers the allocation boundaries, see memory_register().
Registration pins the memory for GPU Direct DMA transfers, which can improve performance when the same buffer is reused across multiple cuFile I/O operations.
In compatibility mode (when GDS is unavailable), this function is a no-op.
| devPtr_base | Base address of the device memory region to register. |
| size | Size in bytes of the memory region to register. |
| flags | Registration flags. Should be 0 or CU_FILE_RDMA_REGISTER (experimental). |
| errors_to_ignore | cuFile error codes to silently ignore, such as CU_FILE_MEMORY_ALREADY_REGISTERED or CU_FILE_INVALID_MAPPING_SIZE. |
| CUfileException | If cuFile registration fails with an error not in errors_to_ignore. |
| KVIKIO_EXPORT std::string const& kvikio::config_path | ( | ) |
Get the filepath to cuFile's config file (cufile.json) or the empty string.
This lookup is cached.
| int kvikio::cufile_version | ( | ) | noexcept |
Get cufile version (or zero if older than v1.8).
The version is returned as (1000*major + 10*minor). E.g., cufile v1.8.0 would be represented by 1080.
Notice, this is not the version of the CUDA toolkit. cufile is part of the toolkit but follows its own version scheme.
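The encoding can be checked with a few lines of arithmetic. The helpers below are a minimal sketch; `encode_cufile_version`, `cufile_major`, and `cufile_minor` are hypothetical names introduced for illustration and are not part of KvikIO.

```cpp
#include <cassert>

// Hypothetical helpers (not KvikIO API) illustrating the documented
// encoding: version = 1000 * major + 10 * minor.
constexpr int encode_cufile_version(int major, int minor)
{
  return 1000 * major + 10 * minor;
}

// Recover the major component from an encoded version.
constexpr int cufile_major(int version) { return version / 1000; }

// Recover the minor component from an encoded version.
constexpr int cufile_minor(int version) { return (version % 1000) / 10; }

// cufile v1.8 is represented by 1080, as stated in the text above.
static_assert(encode_cufile_version(1, 8) == 1080, "v1.8 encodes to 1080");
```

A caller could, under these assumptions, compare the value returned by cufile_version() against `encode_cufile_version(1, 8)` to gate a feature on a minimum cuFile release.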
| bool kvikio::current_context_can_access_pointer | ( | CUdeviceptr | dev_ptr | ) |
Check if the current CUDA context can access the given device pointer.
| dev_ptr | Device pointer to query |
| void kvikio::drop_file_page_cache | ( | int | fd, |
| std::size_t | offset = 0, |
||
| std::size_t | length = 0, |
||
| bool | sync_first = true |
||
| ) |
Drop page cache for a specific file.
Advises the kernel to evict cached pages for the specified file descriptor using posix_fadvise with POSIX_FADV_DONTNEED.
| fd | Open file descriptor |
| offset | Starting byte offset (default: 0 for beginning of file) |
| length | Number of bytes to drop (default: 0, meaning entire file from offset) |
| sync_first | Whether to flush dirty pages to disk before dropping. If true, fdatasync will be called prior to dropping. This ensures dirty pages become clean and thus droppable. Can be set to false if we are certain no dirty pages exist for this file. |
See also drop_system_page_cache().
| kvikio::GenericSystemError | if the file descriptor is invalid, or the file cannot be synchronized, or the attempt to drop the page cache fails. |
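The sequence described above (an optional fdatasync followed by posix_fadvise with POSIX_FADV_DONTNEED) can be sketched with plain POSIX calls. This is an illustration under stated assumptions, not KvikIO's implementation: `drop_file_page_cache_sketch` is a hypothetical name, and std::system_error stands in for kvikio::GenericSystemError.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <system_error>

// Hypothetical stand-in for the documented behavior (Linux/POSIX only).
inline void drop_file_page_cache_sketch(int fd,
                                        std::size_t offset = 0,
                                        std::size_t length = 0,
                                        bool sync_first = true)
{
  // Flush dirty pages first so that they become clean and thus droppable.
  if (sync_first && fdatasync(fd) != 0) {
    throw std::system_error(errno, std::generic_category(), "fdatasync failed");
  }
  // A length of 0 covers everything from `offset` to the end of the file.
  // posix_fadvise returns the error number directly instead of setting errno.
  int const err = posix_fadvise(
    fd, static_cast<off_t>(offset), static_cast<off_t>(length), POSIX_FADV_DONTNEED);
  if (err != 0) {
    throw std::system_error(err, std::generic_category(), "posix_fadvise failed");
  }
}
```

POSIX_FADV_DONTNEED is advisory, so the kernel may keep some pages resident even when the call succeeds.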
| void kvikio::drop_file_page_cache | ( | std::string const & | file_path, |
| std::size_t | offset = 0, |
||
| std::size_t | length = 0, |
||
| bool | sync_first = true |
||
| ) |
Drop page cache for a specific file.
Convenience overload that opens the file, drops its page cache, and closes it.
| file_path | Path to the file |
| offset | Starting byte offset (default: 0 for beginning of file) |
| length | Number of bytes to drop (default: 0, meaning entire file from offset) |
| sync_first | Whether to flush dirty pages to disk before dropping. If true, fdatasync will be called prior to dropping. This ensures dirty pages become clean and thus droppable. Can be set to false if we are certain no dirty pages exist for this file. |
See also drop_system_page_cache(). See drop_file_page_cache(int, std::size_t, std::size_t, bool) for detailed behavior and caveats.
| kvikio::GenericSystemError | if the file cannot be opened, or the file cannot be synchronized, or the attempt to drop the page cache fails. |
| bool kvikio::drop_system_page_cache | ( | bool | reclaim_dentries_and_inodes = true, |
| bool | sync_first = true |
||
| ) |
Drop the system page cache.
| reclaim_dentries_and_inodes | Whether to free reclaimable slab objects, which include dentries and inodes. |
| sync_first | Whether to flush dirty pages to disk before dropping. If true, sync will be called prior to dropping. This ensures dirty pages become clean and thus droppable. |
See also drop_file_page_cache(int, std::size_t, std::size_t, bool).
This function creates a child process that runs the cache-dropping command in the following order:
1) Without the sudo prefix. This is for the superuser and also for specially configured systems where unprivileged users cannot execute /usr/bin/sudo but can execute /sbin/sysctl. If this step succeeds, the function returns true immediately.
2) With the sudo prefix. This is for the general case where selected unprivileged users have permission to run /sbin/sysctl with the sudo prefix.
| kvikio::GenericSystemError | if the child process could not be created. |
| BlockDeviceInfo kvikio::get_block_device_info | ( | std::string const & | file_path | ) |
Get information about the physical block device hosting a file.
Resolves the underlying block device for a given file path, handling:
| file_path | Path to the file whose block device ID is to be determined. |
| kvikio::GenericSystemError | if the file does not exist, or if the block device cannot be determined (e.g., virtual or network filesystem). |
| std::optional<CUcontext> kvikio::get_context_associated_pointer | ( | CUdeviceptr | dev_ptr | ) |
Return the CUDA context associated with the given device pointer, if any.
| dev_ptr | Device pointer to query |
| CUcontext kvikio::get_context_from_pointer | ( | void const * | devPtr | ) |
Return a CUDA context that can be used with the given device pointer.
For robustness, we look for a usable context in the following order: 1) If a context has been associated with devPtr, it is returned. 2) If the current context exists and can access devPtr, it is returned. 3) Otherwise, the primary context of the device that owns devPtr is returned. We assume the primary context can access devPtr, which might not be true in the exceptional disjoint addressing cases mentioned in the CUDA docs [1]. In these cases, the user has to set a usable current context before reading or writing using KvikIO.
[1] https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__UNIFIED.html
| devPtr | Device pointer to query |
| int kvikio::get_device_ordinal_from_pointer | ( | CUdeviceptr | dev_ptr | ) |
Return the device owning the pointer.
| dev_ptr | Device pointer to query |
| std::size_t kvikio::get_file_size | ( | int | file_descriptor | ) |
Get file size from file descriptor using fstat(3).
| file_descriptor | Open file descriptor |
| std::size_t kvikio::get_file_size | ( | std::string const & | file_path | ) |
Get file size given the file path.
| file_path | Path to a file |
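The two get_file_size overloads can be pictured as a thin wrapper over fstat plus a path variant that opens the file first. The sketch below is an assumption about how such helpers might look, not KvikIO's code: both function names are hypothetical, and std::system_error is used only for illustration.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical: query the size of an already open file descriptor via fstat.
inline std::size_t file_size_from_fd_sketch(int fd)
{
  struct stat st {};
  if (fstat(fd, &st) != 0) {
    throw std::system_error(errno, std::generic_category(), "fstat failed");
  }
  return static_cast<std::size_t>(st.st_size);
}

// Hypothetical: open the path read-only, reuse the descriptor variant, close.
inline std::size_t file_size_from_path_sketch(std::string const& file_path)
{
  int const fd = open(file_path.c_str(), O_RDONLY);
  if (fd < 0) {
    throw std::system_error(errno, std::generic_category(), "open failed");
  }
  std::size_t size = 0;
  try {
    size = file_size_from_fd_sketch(fd);
  } catch (...) {
    close(fd);  // do not leak the descriptor on failure
    throw;
  }
  close(fd);
  return size;
}
```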
| std::pair<std::size_t, std::size_t> kvikio::get_page_cache_info | ( | int | fd | ) |
Obtain the page cache residency information for a given file.
| fd | File descriptor. |
See also the get_page_cache_info(std::string const&) overload.
| std::pair<std::size_t, std::size_t> kvikio::get_page_cache_info | ( | std::string const & | file_path | ) |
Obtain the page cache residency information for a given file.
| file_path | Path to a file. |
| KVIKIO_EXPORT CUcontext kvikio::get_primary_cuda_context | ( | int | ordinal | ) |
Given a device ordinal, return the primary context of the device.
This function caches the retrieved primary contexts until program exit.
| ordinal | Device ordinal, an integer from 0 up to (but not including) the number of CUDA devices |
| void kvikio::get_symbol | ( | T & | handle, |
| void * | lib, | ||
| std::string const & | name | ||
| ) |
Get symbol using dlsym
| T | The type of the function pointer. |
| handle | The function pointer (output). |
| lib | The library handle returned by dlopen. |
| name | Name of the symbol/function to load. |
Definition at line 49 of file shim/utils.hpp.
| std::tuple<std::string_view, T, bool> kvikio::getenv_or | ( | std::initializer_list< std::string_view > | env_var_names, |
| T | default_val | ||
| ) |
Get the environment variable value from a candidate list.
| T | Type of the environment variable value |
| env_var_names | Candidate list containing the names of environment variables |
| default_val | Default value of the environment variable, if none of the candidates has been found |
Returns a tuple of (env_var_name, result, has_found), where:
If none of the candidates is set, has_found will be false, result will be default_val, and env_var_name will be empty.
If the environment variable is set by env_var_name, then has_found will be true and result will be the set value. If more than one candidate has been set with the same value, env_var_name will be assigned the last candidate.
| std::invalid_argument | if:
|
Definition at line 77 of file defaults.hpp.
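The return convention can be illustrated with a string-only sketch built on std::getenv. `getenv_or_sketch` is a hypothetical name, and the real template also converts the value to type T. Throwing std::invalid_argument when candidates hold different values is an assumption of this sketch and is not taken from the text above.

```cpp
#include <cassert>
#include <cstdlib>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

// Hypothetical string-only illustration of (env_var_name, result, has_found).
inline std::tuple<std::string_view, std::string, bool> getenv_or_sketch(
  std::initializer_list<std::string_view> env_var_names, std::string default_val)
{
  std::string_view found_name{};
  std::string result = std::move(default_val);
  bool has_found     = false;
  for (auto name : env_var_names) {
    // std::getenv needs a null-terminated string, so copy the view first.
    char const* value = std::getenv(std::string{name}.c_str());
    if (value == nullptr) { continue; }
    if (has_found && result != value) {
      // Assumption of this sketch: conflicting values are rejected.
      throw std::invalid_argument("candidates are set to different values");
    }
    found_name = name;  // the last candidate that is set wins
    result     = value;
    has_found  = true;
  }
  return std::make_tuple(found_name, std::move(result), has_found);
}
```

The returned view refers to the caller's candidate names, so those names need to outlive the tuple; string literals satisfy this.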
| bool kvikio::is_cuda_available | ( | ) |
Check if the CUDA library is available.
Notice, this doesn't check if the runtime environment supports CUDA.
| bool kvikio::is_cufile_available | ( | ) | noexcept |
Check if the cuFile is available and expected to work.
Besides checking if the cuFile library is available, this also checks the runtime environment.
| constexpr bool kvikio::is_cufile_library_available | ( | ) | constexpr noexcept |
Check if the cuFile library is available.
Notice, this doesn't check if the runtime environment supports cuFile.
Definition at line 87 of file cufile.hpp.
| bool kvikio::is_future_done | ( | T const & | future | ) |
Check the status of the future object. True indicates that the result is available in the future's shared state. False otherwise.
The future shall not be created using std::async(std::launch::deferred). Otherwise, this function always returns true.
| T | Type of the future. |
| future | Instance of the future. |
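One common way to implement such a check is a zero-timeout wait. The sketch below assumes that approach and uses the hypothetical name `is_future_done_sketch`; it is not necessarily how KvikIO implements is_future_done. It also shows why a deferred future always yields true: wait_for reports std::future_status::deferred, which is not a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical sketch: poll the shared state without blocking.
template <typename T>
bool is_future_done_sketch(T const& future)
{
  assert(future.valid());  // wait_for on an invalid future is undefined
  // A zero-length wait returns `timeout` while the result is still pending.
  // It returns `ready` once the result is set, and `deferred` for futures
  // created with std::launch::deferred, so both map to true here.
  return future.wait_for(std::chrono::seconds(0)) != std::future_status::timeout;
}
```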
| bool kvikio::is_host_memory | ( | void const * | ptr | ) |
Check if ptr points to host memory (as opposed to device memory).
In this context, managed memory counts as device memory.
| ptr | Memory pointer to query |
| bool kvikio::is_running_in_wsl | ( | ) | noexcept |
Try to detect if running in Windows Subsystem for Linux (WSL)
When unable to determine environment, false is returned.
| void* kvikio::load_library | ( | std::string const & | name, |
| int | mode = RTLD_LAZY|RTLD_LOCAL|RTLD_NODELETE |
||
| ) |
Load shared library.
| name | Name of the library to load. |
| std::future<std::decay_t<T> > kvikio::make_ready_future | ( | T && | t | ) |
Create a shared state in a future object that is immediately ready.
A partial implementation of the namesake function from the concurrency TS (https://en.cppreference.com/w/cpp/experimental/make_ready_future). The cases of std::reference_wrapper and void are not implemented.
| T | Type of the value provided. |
| t | Object provided. |
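A ready future can be produced with a std::promise whose value is set before the future is returned. The following is a minimal sketch of that idea for non-void, non-reference-wrapper values; `make_ready_future_sketch` is a hypothetical name and may differ from the actual implementation.

```cpp
#include <cassert>
#include <future>
#include <type_traits>
#include <utility>

// Hypothetical sketch: build a future whose shared state is already ready.
template <typename T>
std::future<std::decay_t<T>> make_ready_future_sketch(T&& t)
{
  std::promise<std::decay_t<T>> promise;
  auto future = promise.get_future();
  // Setting the value now makes the shared state ready immediately; the
  // state outlives the promise because the future keeps it alive.
  promise.set_value(std::forward<T>(t));
  return future;
}
```

Calling get() on the returned future does not block, which is useful when a function must return a std::future but its result is already known.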
| void kvikio::memory_deregister | ( | void const * | devPtr | ) |
Deregister a device memory allocation from cuFile.
This is a convenience wrapper around buffer_deregister() that automatically discovers the base address of the CUDA memory allocation containing devPtr. The entire underlying allocation is deregistered, regardless of which portion devPtr points to.
In compatibility mode (when GDS is unavailable), this function is a no-op.
| devPtr | Pointer anywhere within a previously registered CUDA device memory allocation. |
| CUfileException | If cuFile deregistration fails. |
| void kvikio::memory_register | ( | void const * | devPtr, |
| int | flags = 0, |
||
| std::vector< int > const & | errors_to_ignore = {} |
||
| ) |
Register a device memory allocation with cuFile for GPUDirect Storage access. Use this function together with FileHandle::pread() and FileHandle::pwrite().
This is a convenience wrapper around buffer_register() that automatically discovers the base address and size of the CUDA memory allocation containing devPtr. The entire underlying allocation is registered, regardless of which portion devPtr points to.
Registration pins the memory for GPU Direct DMA transfers, which can improve performance when the same buffer is reused across multiple cuFile I/O operations.
In compatibility mode (when GDS is unavailable), this function is a no-op.
| devPtr | Pointer anywhere within a CUDA device memory allocation. |
| flags | Registration flags. Should be 0 or CU_FILE_RDMA_REGISTER (experimental). |
| errors_to_ignore | cuFile error codes to silently ignore, such as CU_FILE_MEMORY_ALREADY_REGISTERED or CU_FILE_INVALID_MAPPING_SIZE. |
| CUfileException | If cuFile registration fails with an error not in errors_to_ignore. |
| int kvikio::open_fd | ( | std::string const & | file_path, |
| std::string const & | flags, | ||
| bool | o_direct, | ||
| mode_t | mode | ||
| ) |
Open file using open(2)
| flags | Open flags given as a string |
| o_direct | Append O_DIRECT to flags |
| mode | Access modes |
| int kvikio::open_fd_parse_flags | ( | std::string const & | flags, |
| bool | o_direct | ||
| ) |
Parse open file flags given as a string and return oflags.
| flags | The flags |
| o_direct | Append O_DIRECT to the open flags |
| std::invalid_argument | if the specified flags are not supported. |
| std::invalid_argument | if o_direct is true, but O_DIRECT is not supported. |
| int kvikio::open_flags | ( | int | fd | ) |
Get the flags of the file descriptor (see open(2))
| std::future<std::size_t> kvikio::parallel_io | ( | F | op, |
| T | buf, | ||
| std::size_t | size, | ||
| std::size_t | file_offset, | ||
| std::size_t | task_size, | ||
| std::size_t | devPtr_offset, | ||
| ThreadPool * | thread_pool = &defaults::thread_pool(), |
||
| std::uint64_t | call_idx = 0, |
||
| nvtx_color_type | nvtx_color = NvtxManager::default_color() |
||
| ) |
Apply read or write operation in parallel.
| F | The type of the function applying the read or write operation. |
| T | The type of the memory pointer. |
| op | The function applying the read or write operation. |
| buf | Buffer pointer to read or write to. |
| size | Number of bytes to read or write. |
| file_offset | Byte offset to the start of the file. |
| task_size | Size of each task in bytes. |
| devPtr_offset | Offset relative to the devPtr_base pointer. This parameter should be used only with registered buffers. |
| thread_pool | Thread pool to use for parallel execution. Defaults to the global default thread pool. |
Definition at line 137 of file parallel_operation.hpp.
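The roles of size, file_offset, task_size, and devPtr_offset can be illustrated by splitting one request into tasks. The sketch below only computes the per-task ranges; `TaskSketch` and `split_into_tasks` are hypothetical names, and the way KvikIO actually partitions work and submits it to the thread pool may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one chunk of a larger read or write.
struct TaskSketch {
  std::size_t file_offset;    // where the chunk starts in the file
  std::size_t devPtr_offset;  // where the chunk starts in the buffer
  std::size_t nbytes;         // chunk length, at most task_size
};

// Split `size` bytes into consecutive chunks of at most `task_size` bytes.
inline std::vector<TaskSketch> split_into_tasks(std::size_t size,
                                                std::size_t file_offset,
                                                std::size_t task_size,
                                                std::size_t devPtr_offset)
{
  assert(task_size > 0);  // a zero task size would never make progress
  std::vector<TaskSketch> tasks;
  while (size > 0) {
    std::size_t const n = std::min(size, task_size);
    tasks.push_back({file_offset, devPtr_offset, n});
    // Both offsets advance together so each chunk maps file bytes to the
    // matching buffer bytes.
    file_offset += n;
    devPtr_offset += n;
    size -= n;
  }
  return tasks;
}
```

In a full implementation each task would be handed to op on a pool thread, and the returned future would resolve to the sum of the bytes processed by all tasks.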
| bool kvikio::run_udev_readable | ( | ) | noexcept |
Check if /run/udev is readable.
cuFile fails with an internal error when /run/udev isn't readable. This typically happens when running inside a Docker container that was not launched with --volume /run/udev:/run/udev:ro.
| void kvikio::stream_deregister | ( | CUstream | stream | ) |
Deregisters the CUDA stream from the cuFile subsystem.
| stream | CUDA stream which queues the async I/O operations |
| void kvikio::stream_register | ( | CUstream | stream, |
| unsigned | flags | ||
| ) |
Registers the CUDA stream to the cuFile subsystem.
| stream | CUDA stream which queues the async I/O operations |
| flags | Specifies when the I/O parameters become valid (submission time or execution time) and what I/O parameters are page-aligned. For details, refer to https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html#cufilestreamregister |