rapidsmpf::StreamOrderedTiming Class Reference

Stream-ordered wall-clock timer that records its result into Statistics.

#include <stream_ordered_timing.hpp>

Public Member Functions

 StreamOrderedTiming (rmm::cuda_stream_view stream, std::shared_ptr< Statistics > statistics)
 Constructs a StreamOrderedTiming and marks the start position in the stream.
 
void stop_and_record (std::string const &name, std::optional< std::string > stream_delay_name=std::nullopt)
 Marks the stop position in the stream and schedules recording of the duration.
 

Static Public Member Functions

static void cancel_inflight_timings (Statistics const *statistics)
 Cancel all in-flight timings associated with a Statistics object.
 

Detailed Description

Stream-ordered wall-clock timer that records its result into Statistics.

Marks a start position in the CUDA stream on construction and a stop position when stop_and_record is called. The elapsed wall-clock time between those two stream positions is recorded into the supplied Statistics object once the stream reaches the stop marker — guaranteeing that the measurement covers exactly the work enqueued between the two calls, in stream order.

If statistics is Statistics::disabled(), the entire class is a no-op.

StreamOrderedTiming timing{stream, stats};
// ... enqueue GPU work on `stream` ...
timing.stop_and_record("my-operation-time");

Definition at line 32 of file stream_ordered_timing.hpp.
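
The stream-ordered behaviour described above can be illustrated with a small host-only model. This is a sketch under stated assumptions, not the rapidsmpf implementation: `SimClock`, `ToyStream`, `ToyStats` and `ToyTiming` are hypothetical stand-ins, the "stream" is a plain FIFO of host tasks, and the clock is simulated so that the result is deterministic. The point it shows is that the start and stop markers are themselves stream tasks, so work queued before the timer is excluded from the measurement.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <utility>

namespace timing_sketch {

// Simulated clock (arbitrary time units) so the example is deterministic.
struct SimClock {
    double now = 0.0;
};

// Hypothetical toy stream: tasks run in FIFO order, only when drain() is called.
class ToyStream {
  public:
    explicit ToyStream(std::shared_ptr<SimClock> clock) : clock_{std::move(clock)} {}

    void enqueue(std::function<void()> task) { tasks_.push(std::move(task)); }

    // Enqueue "work" that consumes `cost` time units once the stream reaches it.
    void enqueue_work(double cost) {
        enqueue([clock = clock_, cost] { clock->now += cost; });
    }

    void drain() {
        while (!tasks_.empty()) {
            auto task = std::move(tasks_.front());
            tasks_.pop();
            task();
        }
    }

    std::shared_ptr<SimClock> const& clock() const { return clock_; }

  private:
    std::shared_ptr<SimClock> clock_;
    std::queue<std::function<void()>> tasks_;
};

// Hypothetical statistics sink: name -> accumulated duration.
using ToyStats = std::map<std::string, double>;

class ToyTiming {
  public:
    ToyTiming(ToyStream& stream, std::shared_ptr<ToyStats> stats)
        : stream_{stream}, stats_{std::move(stats)}, start_{std::make_shared<double>(0.0)} {
        // Start marker: the timestamp is taken when the stream reaches this task.
        stream_.enqueue([clock = stream_.clock(), start = start_] { *start = clock->now; });
    }

    void stop_and_record(std::string const& name) {
        // Stop marker: the duration is written when the stream reaches this task.
        stream_.enqueue([clock = stream_.clock(), start = start_, stats = stats_, name] {
            (*stats)[name] += clock->now - *start;
        });
    }

  private:
    ToyStream& stream_;
    std::shared_ptr<ToyStats> stats_;
    std::shared_ptr<double> start_;
};

// Times 2 units of work that are queued behind 5 units of unrelated work.
inline double demo() {
    auto clock = std::make_shared<SimClock>();
    ToyStream stream{clock};
    auto stats = std::make_shared<ToyStats>();

    stream.enqueue_work(5.0);                     // earlier work, not timed
    ToyTiming timing{stream, stats};              // start marker
    stream.enqueue_work(2.0);                     // the work being timed
    timing.stop_and_record("my-operation-time");  // stop marker
    assert(stats->empty());  // nothing is recorded until the stream runs

    stream.drain();
    return stats->at("my-operation-time");
}

}  // namespace timing_sketch
```

In this model `timing_sketch::demo()` returns the cost of the timed work only, because the start timestamp is taken after the 5 units of earlier work have run.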

Constructor & Destructor Documentation

◆ StreamOrderedTiming()

rapidsmpf::StreamOrderedTiming::StreamOrderedTiming ( rmm::cuda_stream_view  stream,
std::shared_ptr< Statistics >  statistics 
)

Constructs a StreamOrderedTiming and marks the start position in the stream.

If statistics is Statistics::disabled(), this is a no-op and subsequent calls to stop_and_record will also be no-ops.

Parameters
stream  The CUDA stream to time.
statistics  The Statistics object that will receive the duration entry.

Member Function Documentation

◆ cancel_inflight_timings()

static void rapidsmpf::StreamOrderedTiming::cancel_inflight_timings ( Statistics const *  statistics)

Cancel all in-flight timings associated with a Statistics object.

Should be called when a Statistics object is about to be destroyed, to prevent dangling references from any in-flight stream callbacks. It is safe to call when no in-flight timings are present.

Note
If a stop callback has already executed before this function is called, the associated statistic may still be recorded. The guarantee is only that any still-pending callbacks are cancelled.
Parameters
statistics  The Statistics object whose in-flight timings should be cancelled.
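
One way to picture the cancellation guarantee is a registry of pending callbacks keyed by the Statistics pointer. The sketch below is an assumption about the general technique, not the library's code: `ToyStats`, `Inflight`, `Registry` and `make_stop_callback` are hypothetical names, and the model is single-threaded, whereas a real implementation would need synchronization between the cancelling thread and the stream callbacks.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace cancel_sketch {

// Hypothetical statistics sink.
struct ToyStats {
    std::map<std::string, double> values;
};

// State shared between one pending callback and the registry.
struct Inflight {
    ToyStats* stats = nullptr;  // must not be dereferenced once cancelled
    bool cancelled = false;
};

// Hypothetical registry of pending callbacks, keyed by the statistics pointer.
class Registry {
  public:
    std::shared_ptr<Inflight> add(ToyStats* stats) {
        auto entry = std::make_shared<Inflight>();
        entry->stats = stats;
        inflight_[stats].push_back(entry);
        return entry;
    }

    // Marks every registered entry for `stats` as cancelled.
    // Safe to call when nothing is registered for `stats`.
    void cancel(ToyStats const* stats) {
        auto it = inflight_.find(stats);
        if (it == inflight_.end()) return;
        for (auto& entry : it->second) entry->cancelled = true;
        inflight_.erase(it);
    }

  private:
    std::map<ToyStats const*, std::vector<std::shared_ptr<Inflight>>> inflight_;
};

// Builds a "stop callback" that returns true if it wrote to the statistics.
inline std::function<bool()> make_stop_callback(
    Registry& registry, ToyStats* stats, std::string name, double value
) {
    auto entry = registry.add(stats);
    return [entry, name = std::move(name), value] {
        if (entry->cancelled) return false;  // skip silently, never touch stats
        entry->stats->values[name] += value;
        return true;
    };
}

// Returns {early callback recorded, late callback recorded}.
inline std::pair<bool, bool> demo() {
    Registry registry;
    registry.cancel(nullptr);  // no in-flight timings: a harmless no-op

    auto stats = std::make_unique<ToyStats>();
    auto early = make_stop_callback(registry, stats.get(), "early", 1.0);
    auto late = make_stop_callback(registry, stats.get(), "late", 2.0);

    bool early_recorded = early();  // runs before cancellation: still recorded
    registry.cancel(stats.get());   // about to destroy the statistics object
    stats.reset();                  // statistics object is gone
    bool late_recorded = late();    // was pending at cancellation: skipped

    return {early_recorded, late_recorded};
}

}  // namespace cancel_sketch
```

The demo mirrors the note above: the callback that ran before cancellation has already written its value, while the one still pending is skipped and never dereferences the destroyed object.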

◆ stop_and_record()

void rapidsmpf::StreamOrderedTiming::stop_and_record ( std::string const &  name,
std::optional< std::string >  stream_delay_name = std::nullopt 
)

Marks the stop position in the stream and schedules recording of the duration.

The stream-ordered duration (time between the start and stop stream positions) is recorded under name. If stream_delay_name is set, the stream delay — the wall-clock time between object construction and when the stream actually executed the start callback — is also recorded under that name. The stream delay reveals how far ahead the CPU is running relative to the GPU stream.

Both values are written to the Statistics object in stream order — i.e. only after all work enqueued between construction and this call has been reached by the stream. If the Statistics object is destroyed before that point, the recording is silently skipped.

Behaviour is undefined if this method is called more than once per StreamOrderedTiming instance.

Parameters
name  Name of the stream-ordered duration statistic.
stream_delay_name  Name of the stream-delay statistic. If std::nullopt (the default), no stream-delay entry is written.
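
The difference between the two statistics can be made concrete with a deterministic host-only model. This is a hedged sketch, not the rapidsmpf implementation: `Sim` and `ToyTiming` are hypothetical, time is simulated in arbitrary units, and the host is assumed to enqueue everything at time 0. Only the `stop_and_record` parameter list follows the signature documented above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <optional>
#include <queue>
#include <string>
#include <utility>

namespace delay_sketch {

// Hypothetical simulation: a clock, a FIFO "stream" and a statistics map.
struct Sim {
    double now = 0.0;  // simulated clock, arbitrary units
    std::queue<std::function<void()>> stream;
    std::map<std::string, double> stats;

    // Enqueue work that consumes `cost` time units when the stream reaches it.
    void work(double cost) {
        stream.push([this, cost] { now += cost; });
    }

    void drain() {
        while (!stream.empty()) {
            auto task = std::move(stream.front());
            stream.pop();
            task();
        }
    }
};

class ToyTiming {
  public:
    explicit ToyTiming(Sim& sim)
        : sim_{sim}, constructed_{sim.now}, started_{std::make_shared<double>(0.0)} {
        // Start callback: runs only once the stream has caught up to this point.
        sim_.stream.push([&sim = sim_, started = started_] { *started = sim.now; });
    }

    void stop_and_record(
        std::string const& name,
        std::optional<std::string> stream_delay_name = std::nullopt
    ) {
        sim_.stream.push([&sim = sim_,
                          started = started_,
                          constructed = constructed_,
                          name,
                          delay_name = std::move(stream_delay_name)] {
            // Stream-ordered duration: start marker to stop marker.
            sim.stats[name] += sim.now - *started;
            // Stream delay: construction to execution of the start callback.
            if (delay_name) sim.stats[*delay_name] += *started - constructed;
        });
    }

  private:
    Sim& sim_;
    double constructed_;
    std::shared_ptr<double> started_;
};

// 5 units of earlier work are pending when the timer is constructed,
// followed by 2 units of timed work.
inline std::map<std::string, double> demo(bool with_delay) {
    Sim sim;
    sim.work(5.0);
    ToyTiming timing{sim};
    sim.work(2.0);
    if (with_delay) {
        timing.stop_and_record("op", "op-stream-delay");
    } else {
        timing.stop_and_record("op");
    }
    sim.drain();
    return sim.stats;
}

}  // namespace delay_sketch
```

In this model the duration entry covers only the timed work, while the delay entry equals the backlog that was already queued when the timer was constructed, which is what makes it a measure of how far the host runs ahead of the stream.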

The documentation for this class was generated from the following file: