cudf.concat
- cudf.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=None)
Concatenate DataFrames, Series, or Indices row-wise (the default) or, with axis=1, column-wise.
- Parameters:
- objs : list or dictionary of DataFrame, Series, or Index
- axis : {0/'index', 1/'columns'}, default 0
The axis to concatenate along. axis=1 must be passed if a dictionary is passed.
- join : {'inner', 'outer'}, default 'outer'
How to handle indexes on the other axis (or axes).
- ignore_index : bool, default False
Set to True to ignore the index of the objs and provide a default range index instead.
- keys : sequence, default None
Construct a hierarchical index using the passed keys as the outermost level. If multiple levels are passed, keys should contain tuples. Currently not supported.
- levels : list of sequences, default None
Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they will be inferred from the keys. Currently not supported.
- names : list, default None
Names for the levels in the resulting hierarchical index. Currently not supported.
- verify_integrity : bool, default False
Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation. Currently not supported.
- sort : bool, default False
Sort the non-concatenation axis if it is not already aligned.
- Returns:
- A new object of like type with rows from each object in objs.
Examples
Combine two Series.
>>> import cudf
>>> s1 = cudf.Series(['a', 'b'])
>>> s2 = cudf.Series(['c', 'd'])
>>> s1
0    a
1    b
dtype: object
>>> s2
0    c
1    d
dtype: object
>>> cudf.concat([s1, s2])
0    a
1    b
0    c
1    d
dtype: object
Clear the existing index and reset it in the result by setting the ignore_index option to True.
>>> cudf.concat([s1, s2], ignore_index=True)
0    a
1    b
2    c
3    d
dtype: object
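The effect of ignore_index on the result's labels can be modelled in a few lines of plain Python. This is only a conceptual sketch: the helper name result_index is hypothetical, lists stand in for index objects, and it makes no claim about how cuDF builds the index internally.

```python
# Conceptual model only: plain lists stand in for the index of each input.
# `result_index` is a hypothetical helper, not part of cuDF.

def result_index(indexes, ignore_index=False):
    """Return the row labels of a row-wise concatenation."""
    # By default the input labels are kept as-is, so duplicates can appear.
    labels = [label for index in indexes for label in index]
    if ignore_index:
        # With ignore_index=True the labels are replaced by 0..n-1.
        return list(range(len(labels)))
    return labels


s1_index = [0, 1]
s2_index = [0, 1]

print(result_index([s1_index, s2_index]))                     # [0, 1, 0, 1]
print(result_index([s1_index, s2_index], ignore_index=True))  # [0, 1, 2, 3]
```

The first call mirrors the repeated 0, 1, 0, 1 labels in the Series example, and the second mirrors the fresh range index.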
Combine two DataFrame objects with identical columns.
>>> df1 = cudf.DataFrame([['a', 1], ['b', 2]],
...                    columns=['letter', 'number'])
>>> df1
  letter  number
0      a       1
1      b       2
>>> df2 = cudf.DataFrame([['c', 3], ['d', 4]],
...                    columns=['letter', 'number'])
>>> df2
  letter  number
0      c       3
1      d       4
>>> cudf.concat([df1, df2])
  letter  number
0      a       1
1      b       2
0      c       3
1      d       4
Combine DataFrame objects with overlapping columns and return everything. Columns outside the intersection will be filled with null values.
>>> df3 = cudf.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
...                    columns=['letter', 'number', 'animal'])
>>> df3
  letter  number animal
0      c       3    cat
1      d       4    dog
>>> cudf.concat([df1, df3], sort=False)
  letter  number animal
0      a       1   <NA>
1      b       2   <NA>
0      c       3    cat
1      d       4    dog
Combine DataFrame objects with overlapping columns and return only those that are shared by passing inner to the join keyword argument.
>>> cudf.concat([df1, df3], join="inner")
  letter  number
0      a       1
1      b       2
0      c       3
1      d       4
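How join decides which columns survive a row-wise concatenation can be sketched without a GPU. The following is a conceptual model only: result_columns is a hypothetical helper, lists of labels stand in for DataFrames, and the treatment of sort=True as a plain label sort is an assumption rather than a statement about cuDF's implementation.

```python
# Conceptual model only: lists of column labels stand in for DataFrames.
# `result_columns` is a hypothetical helper, not part of cuDF.

def result_columns(column_sets, join="outer", sort=False):
    """Return the column labels a row-wise concatenation would keep."""
    if join not in ("inner", "outer"):
        raise ValueError("join must be 'inner' or 'outer'")
    first, *rest = column_sets
    if join == "inner":
        # Keep only labels present in every input, in first-input order.
        kept = [c for c in first if all(c in other for other in rest)]
    else:
        # Keep every label, in order of first appearance.
        kept = []
        for cols in column_sets:
            for c in cols:
                if c not in kept:
                    kept.append(c)
    # Assumption: sort=True is modelled as sorting the resulting labels.
    return sorted(kept) if sort else kept


df1_cols = ["letter", "number"]
df3_cols = ["letter", "number", "animal"]

print(result_columns([df1_cols, df3_cols], join="outer"))  # ['letter', 'number', 'animal']
print(result_columns([df1_cols, df3_cols], join="inner"))  # ['letter', 'number']
print(result_columns([df1_cols, df3_cols], sort=True))     # ['animal', 'letter', 'number']
```

With join='outer', rows from inputs that lack a kept column are the ones filled with nulls; with join='inner', no filling is needed because every kept column exists in every input.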
Combine DataFrame objects horizontally along the x axis by passing in axis=1.
>>> df4 = cudf.DataFrame([['bird', 'polly'], ['monkey', 'george']],
...                    columns=['animal', 'name'])
>>> df4
   animal    name
0    bird   polly
1  monkey  george
>>> cudf.concat([df1, df4], axis=1)
  letter  number  animal    name
0      a       1    bird   polly
1      b       2  monkey  george
Combine a dictionary of DataFrame objects horizontally:
>>> d = {'first': df1, 'second': df2}
>>> cudf.concat(d, axis=1)
   first        second
  letter number letter number
0      a      1      c      3
1      b      2      d      4