cudf.DataFrame.stack#

DataFrame.stack(level=-1, dropna=_NoDefault.no_default, future_stack=False)[source]#

Stack the prescribed level(s) from columns to index

Return a reshaped DataFrame or Series having a multi-level index with one or more new inner-most levels compared to the current DataFrame. The new inner-most levels are created by pivoting the columns of the current dataframe:

  • if the columns have a single level, the output is a Series;

  • if the columns have multiple levels, the new index level(s) is (are) taken from the prescribed level(s) and the output is a DataFrame.

Parameters:
levelint, str, list default -1

Level(s) to stack from the column axis onto the index axis, defined as one index or label, or a list of indices or labels.

dropnabool, default True

Whether to drop rows in the resulting Frame/Series with missing values. When multiple levels are specified, dropna==False is unsupported.

Returns:
DataFrame or Series

Stacked dataframe or series.

See also

DataFrame.unstack

Unstack prescribed level(s) from index axis onto column axis.

DataFrame.pivot

Reshape dataframe from long format to wide format.

DataFrame.pivot_table

Create a spreadsheet-style pivot table as a DataFrame.

Notes

The function is named by analogy with a collection of books being reorganized from being side by side on a horizontal position (the columns of the dataframe) to being stacked vertically on top of each other (in the index of the dataframe).

Examples

Single level columns

>>> df_single_level_cols = cudf.DataFrame([[0, 1], [2, 3]],
...                                     index=['cat', 'dog'],
...                                     columns=['weight', 'height'])

Stacking a dataframe with a single level column axis returns a Series:

>>> df_single_level_cols
     weight height
cat       0      1
dog       2      3
>>> df_single_level_cols.stack()
cat  height    1
     weight    0
dog  height    3
     weight    2
dtype: int64

Multi level columns: simple case

>>> import pandas as pd
>>> multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
...                                        ('weight', 'pounds')])
>>> df_multi_level_cols1 = cudf.DataFrame([[1, 2], [2, 4]],
...                                     index=['cat', 'dog'],
...                                     columns=multicol1)

Stacking a dataframe with a multi-level column axis:

>>> df_multi_level_cols1
     weight
         kg    pounds
cat       1        2
dog       2        4
>>> df_multi_level_cols1.stack()
            weight
cat kg           1
    pounds       2
dog kg           2
    pounds       4

Missing values

>>> multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
...                                        ('height', 'm')])
>>> df_multi_level_cols2 = cudf.DataFrame([[1.0, 2.0], [3.0, 4.0]],
...                                     index=['cat', 'dog'],
...                                     columns=multicol2)

It is common to have missing values when stacking a dataframe with multi-level columns, as the stacked dataframe typically has more values than the original dataframe. Missing values are filled with NULLs:

>>> df_multi_level_cols2
    weight height
        kg      m
cat    1.0    2.0
dog    3.0    4.0
>>> df_multi_level_cols2.stack()
       weight height
cat kg    1.0   <NA>
    m    <NA>    2.0
dog kg    3.0   <NA>
    m    <NA>    4.0

Prescribing the level(s) to be stacked

The first parameter controls which level or levels are stacked:

>>> df_multi_level_cols2.stack(0)
            kg     m
cat height  <NA>   2.0
    weight   1.0  <NA>
dog height  <NA>   4.0
    weight   3.0  <NA>
>>> df_multi_level_cols2.stack([0, 1])
cat  height  m     2.0
     weight  kg    1.0
dog  height  m     4.0
     weight  kg    3.0
dtype: float64