cudf.crosstab
- cudf.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=None, normalize=False)
Compute a simple cross tabulation of two (or more) factors. By default computes a frequency table of the factors unless an array of values and an aggregation function are passed.
- Parameters:
- index : array-like, Series, or list of arrays/Series
Values to group by in the rows.
- columns : array-like, Series, or list of arrays/Series
Values to group by in the columns.
- values : array-like, optional
Array of values to aggregate according to the factors. Requires aggfunc to be specified.
- rownames : list of str, default None
If passed, must match the number of row arrays passed.
- colnames : list of str, default None
If passed, must match the number of column arrays passed.
- aggfunc : function, optional
If specified, requires values to be specified as well.
- margins : Not supported
- margins_name : Not supported
- dropna : Not supported
- normalize : Not supported
- Returns:
- DataFrame
Cross tabulation of the data.
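The pairing of values and aggfunc described in the parameters can be sketched in plain Python. This is only an illustration of the aggregation semantics, not cudf's implementation: `aggregate_cells` is a hypothetical helper, the input data is made up, and the result is a dict keyed by (row, column) rather than a DataFrame.

```python
from collections import defaultdict


def aggregate_cells(index, columns, values, aggfunc):
    """Hypothetical helper: reduce values per (row, column) pair."""
    groups = defaultdict(list)
    # Collect every value that falls into the same (row, column) cell.
    for row, col, val in zip(index, columns, values):
        groups[(row, col)].append(val)
    # Apply the aggregation function once per cell.
    return {cell: aggfunc(vals) for cell, vals in groups.items()}


# Made-up data for illustration only.
rows = ["foo", "foo", "bar", "bar", "foo"]
cols = ["one", "two", "one", "one", "two"]
vals = [10, 20, 30, 40, 50]

totals = aggregate_cells(rows, cols, vals, aggfunc=sum)
largest = aggregate_cells(rows, cols, vals, aggfunc=max)

print(totals[("foo", "two")])   # 70
print(largest[("bar", "one")])  # 40
```

Cells with no observations, such as ("bar", "two") here, are simply absent from the dict in this sketch.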
Examples
>>> import cudf
>>> a = cudf.Series(["foo", "foo", "foo", "foo", "bar", "bar",
...                  "bar", "bar", "foo", "foo", "foo"], dtype=object)
>>> b = cudf.Series(["one", "one", "one", "two", "one", "one",
...                  "one", "two", "two", "two", "one"], dtype=object)
>>> c = cudf.Series(["dull", "dull", "shiny", "dull", "dull", "shiny",
...                  "shiny", "dull", "shiny", "shiny", "shiny"],
...                 dtype=object)
>>> cudf.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
b    one        two
c   dull shiny dull shiny
a
bar    1     2    1     0
foo    2     2    1     2
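The counts in the example above can be cross-checked without a GPU. The following is a minimal plain-Python tally of the same three factors using `collections.Counter`; it is a sketch of what a frequency table computes, not how cudf computes it, and the names `counts`, `row_labels`, `col_labels`, and `table` are introduced here for illustration.

```python
from collections import Counter

a = ["foo", "foo", "foo", "foo", "bar", "bar",
     "bar", "bar", "foo", "foo", "foo"]
b = ["one", "one", "one", "two", "one", "one",
     "one", "two", "two", "two", "one"]
c = ["dull", "dull", "shiny", "dull", "dull", "shiny",
     "shiny", "dull", "shiny", "shiny", "shiny"]

# Tally each (row label, (column level b, column level c)) combination.
counts = Counter(zip(a, zip(b, c)))

row_labels = sorted(set(a))
col_labels = sorted(set(zip(b, c)))

# Counter returns 0 for combinations that never occur,
# which matches the 0 entries of a frequency table.
table = {r: [counts[(r, col)] for col in col_labels] for r in row_labels}

print(col_labels)
for r in row_labels:
    print(r, table[r])
# bar [1, 2, 1, 0]
# foo [2, 2, 1, 2]
```

Each printed row lists the counts in the column order (one, dull), (one, shiny), (two, dull), (two, shiny), the same order as the table above.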