cudf.concat

cudf.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=None)

Concatenate DataFrames, Series, or Indices row-wise by default, or column-wise when axis=1 is passed.

Parameters:
objs : list or dictionary of DataFrame, Series, or Index
axis : {0/'index', 1/'columns'}, default 0

The axis to concatenate along. axis=1 must be passed if a dictionary is passed.

join : {'inner', 'outer'}, default 'outer'

How to handle indexes on the other axis (or axes). 'outer' keeps the union of the labels; 'inner' keeps only the labels shared by all objects.

ignore_index : bool, default False

If True, ignore the indexes of the objects in objs and give the result a default range index instead.

keys : sequence, default None

Construct a hierarchical index using the passed keys as the outermost level. If multiple levels are passed, keys should contain tuples. Currently not supported.

levels : list of sequences, default None

Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they will be inferred from the keys. Currently not supported.

names : list, default None

Names for the levels in the resulting hierarchical index. Currently not supported.

verify_integrity : bool, default False

Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation. Currently not supported.

sort : bool, default False

Sort the non-concatenation axis if it is not already aligned.

Returns:
A new object of like type with rows from each object in objs.

Examples

Combine two Series.

>>> import cudf
>>> s1 = cudf.Series(['a', 'b'])
>>> s2 = cudf.Series(['c', 'd'])
>>> s1
0    a
1    b
dtype: object
>>> s2
0    c
1    d
dtype: object
>>> cudf.concat([s1, s2])
0    a
1    b
0    c
1    d
dtype: object

Clear the existing index and reset it in the result by setting the ignore_index option to True.

>>> cudf.concat([s1, s2], ignore_index=True)
0    a
1    b
2    c
3    d
dtype: object
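
Because verify_integrity is listed above as not currently supported, a duplicate check on the concatenated index has to be done by hand. The following is a minimal sketch of one way to do that, using the index's is_unique property. The xdf alias and the fallback to pandas are assumptions added here so the snippet also runs on a machine without cuDF or a GPU; they are not part of the cuDF documentation.

```python
try:
    import cudf as xdf
except Exception:  # cuDF not installed, or no usable GPU
    import pandas as xdf  # assumption: CPU stand-in with the same behaviour for this check

s1 = xdf.Series(['a', 'b'])
s2 = xdf.Series(['c', 'd'])

# Both inputs carry the index labels 0 and 1, so the plain result repeats them.
combined = xdf.concat([s1, s2])
has_duplicates = not combined.index.is_unique

# With ignore_index=True the result gets a fresh range index with no repeats.
renumbered = xdf.concat([s1, s2], ignore_index=True)
renumbered_is_unique = renumbered.index.is_unique
```

A check like this costs an extra pass over the index, so it is probably best kept for cases where duplicate labels would actually break later steps.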

Combine two DataFrame objects with identical columns.

>>> df1 = cudf.DataFrame([['a', 1], ['b', 2]],
...                    columns=['letter', 'number'])
>>> df1
  letter  number
0      a       1
1      b       2
>>> df2 = cudf.DataFrame([['c', 3], ['d', 4]],
...                    columns=['letter', 'number'])
>>> df2
  letter  number
0      c       3
1      d       4
>>> cudf.concat([df1, df2])
  letter  number
0      a       1
1      b       2
0      c       3
1      d       4

Combine DataFrame objects with overlapping columns and return everything. Columns outside the intersection will be filled with null values.

>>> df3 = cudf.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
...                    columns=['letter', 'number', 'animal'])
>>> df3
  letter  number animal
0      c       3    cat
1      d       4    dog
>>> cudf.concat([df1, df3], sort=False)
  letter  number animal
0      a       1   <NA>
1      b       2   <NA>
0      c       3    cat
1      d       4    dog
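
For comparison, here is a sketch of the same concatenation with sort=True, which according to the parameter description sorts the non-concatenation axis (here, the column labels). The frames are rebuilt from dictionaries so the snippet is self-contained. The xdf alias and the pandas fallback are assumptions made so it can run without a GPU.

```python
try:
    import cudf as xdf
except Exception:  # cuDF not installed, or no usable GPU
    import pandas as xdf  # assumption: CPU stand-in with the same concat semantics here

df1 = xdf.DataFrame({'letter': ['a', 'b'], 'number': [1, 2]})
df3 = xdf.DataFrame({'letter': ['c', 'd'], 'number': [3, 4],
                     'animal': ['cat', 'dog']})

# sort=False keeps the columns in order of first appearance.
unsorted_cols = list(xdf.concat([df1, df3], sort=False).columns)

# sort=True orders the column labels.
sorted_cols = list(xdf.concat([df1, df3], sort=True).columns)
```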

Combine DataFrame objects with overlapping columns and return only those that are shared by passing inner to the join keyword argument.

>>> cudf.concat([df1, df3], join="inner")
  letter  number
0      a       1
1      b       2
0      c       3
1      d       4

Combine DataFrame objects horizontally (column-wise) by passing axis=1.

>>> df4 = cudf.DataFrame([['bird', 'polly'], ['monkey', 'george']],
...                    columns=['animal', 'name'])
>>> df4
   animal    name
0    bird   polly
1  monkey  george
>>> cudf.concat([df1, df4], axis=1)
  letter  number  animal    name
0      a       1    bird   polly
1      b       2  monkey  george
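
The example above uses frames whose indexes already match. When they do not, the join parameter decides which row labels survive. The sketch below illustrates this with two small hypothetical frames, left and right, that share only the label 1. The xdf alias and the pandas fallback are assumptions, and the behaviour shown is what the join description implies rather than something stated elsewhere on this page.

```python
try:
    import cudf as xdf
except Exception:  # cuDF not installed, or no usable GPU
    import pandas as xdf  # assumption: CPU stand-in with the same alignment rules

# Hypothetical frames that overlap on a single index label (1).
left = xdf.DataFrame({'letter': ['a', 'b']}, index=[0, 1])
right = xdf.DataFrame({'animal': ['bird', 'monkey']}, index=[1, 2])

# Default join='outer': every row label from either frame is kept,
# and cells with no matching row are filled with nulls.
outer = xdf.concat([left, right], axis=1)

# join='inner': only row labels present in both frames are kept.
inner = xdf.concat([left, right], axis=1, join='inner')
```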

Combine a dictionary of DataFrame objects horizontally:

>>> d = {'first': df1, 'second': df2}
>>> cudf.concat(d, axis=1)
  first           second
  letter  number  letter  number
0      a       1       c       3
1      b       2       d       4
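
The keys, levels, and names parameters are listed above as not currently supported, so a row-wise concatenation cannot label each block with a hierarchical index through them. One possible workaround, sketched here under stated assumptions, is to record the origin of each row in an ordinary column before concatenating. The column name source, the xdf alias, and the pandas fallback are all illustrative choices and not part of the cuDF API description.

```python
try:
    import cudf as xdf
except Exception:  # cuDF not installed, or no usable GPU
    import pandas as xdf  # assumption: CPU stand-in for this sketch

df1 = xdf.DataFrame({'letter': ['a', 'b'], 'number': [1, 2]})
df2 = xdf.DataFrame({'letter': ['c', 'd'], 'number': [3, 4]})
frames = {'first': df1, 'second': df2}

# Tag every row with the dictionary key it came from ('source' is a made-up name).
tagged = [df.assign(source=key) for key, df in frames.items()]

# Stack the tagged frames row-wise and renumber the rows.
combined = xdf.concat(tagged, ignore_index=True)
```

The tag column can then be used for filtering or grouping in place of an outer index level.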