byte_pair_encode
- class pylibcudf.nvtext.byte_pair_encode.BPEMergePairs
The table of merge pairs for the BPE encoder.
For details, see cudf::nvtext::bpe_merge_pairs.
- pylibcudf.nvtext.byte_pair_encode.byte_pair_encoding(Column input, BPEMergePairs merge_pairs, Scalar separator=None) -> Column
Byte pair encode the input strings.
For details, see cudf::nvtext::byte_pair_encoding.
- Parameters:
- input (Column)
Strings to encode.
- merge_pairs (BPEMergePairs)
Table of merge pairs (substrings) used to rebuild each string.
- separator (Scalar)
String used to join the pieces of the output after encoding. Defaults to a single space.
- Returns:
- Column
An encoded column of strings.
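
The following is a minimal CPU-only sketch of how a ranked table of merge pairs can drive byte pair encoding. It is plain Python for illustration and is not the pylibcudf implementation. The helper names `load_merge_ranks` and `bpe_encode` are hypothetical, and the sketch assumes each merge pair is written as two substrings separated by a single space, with earlier entries taking priority. Its handling of edge cases, such as whitespace inside the input strings, may differ from the library's.

```python
def load_merge_ranks(merge_lines):
    """Map each (left, right) pair to its rank; earlier lines merge first.

    Assumes one pair per entry, written as "left right" (hypothetical helper).
    """
    ranks = {}
    for rank, line in enumerate(merge_lines):
        left, right = line.split(" ")
        ranks[(left, right)] = rank
    return ranks


def bpe_encode(text, ranks, separator=" "):
    """Greedy BPE sketch: start from characters, merge the best-ranked pair."""
    tokens = list(text)
    while len(tokens) > 1:
        # Collect every adjacent pair that appears in the merge table.
        candidates = [
            (ranks[pair], i)
            for i, pair in enumerate(zip(tokens, tokens[1:]))
            if pair in ranks
        ]
        if not candidates:
            break  # no mergeable pair is left
        # Merge the lowest-ranked pair (leftmost on ties).
        _, i = min(candidates)
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    # Join the remaining pieces with the separator.
    return separator.join(tokens)


merge_lines = ["t h", "i s", "th is"]
ranks = load_merge_ranks(merge_lines)
encoded = [bpe_encode(s, ranks) for s in ["thisis", "this"]]
print(encoded)
```

In this sketch the separator plays the same role as the `separator` parameter described above: it is placed between the pieces that remain once no further merges apply.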