cudf.testing.testing.assert_frame_equal

cudf.testing.testing.assert_frame_equal(left, right, check_dtype=True, check_index_type='equiv', check_column_type='equiv', check_frame_type=True, check_names=True, by_blocks=False, check_exact=False, check_datetimelike_compat=False, check_categorical=True, check_like=False, rtol=1e-05, atol=1e-08, obj='DataFrame')

Check that the left and right DataFrames are equal.

This function compares two DataFrames and reports any differences. Additional parameters allow varying the strictness of the equality checks performed.

Parameters:
leftDataFrame

left DataFrame to compare

rightDataFrame

right DataFrame to compare

check_dtypebool, default True

Whether to check that the dtypes of the two DataFrames are identical.

check_index_typebool or {‘equiv’}, default ‘equiv’

Whether to check that the Index class, dtype and inferred_type are identical.

check_column_typebool or {‘equiv’}, default ‘equiv’

Whether to check that the column class, dtype and inferred_type are identical. Currently this check is idle (it has no effect); the parameter mirrors the pandas signature.

check_frame_typebool, default True

Whether to check the DataFrame class is identical.

check_namesbool, default True

Whether to check that the names attributes of both the index and the columns of the DataFrames are identical.

check_exactbool, default False

Whether to compare numbers exactly.

by_blocksbool, default False

Not supported.

check_datetimelike_compatbool, default False

Compare datetime-like values that are comparable, ignoring dtype.

check_categoricalbool, default True

Whether to compare internal Categorical exactly.

check_likebool, default False

If True, ignore the order of the index and columns. Note: index labels must still match their respective rows, and column labels their respective columns; the same labels must carry the same data.

rtolfloat, default 1e-5

Relative tolerance. Only used when check_exact is False.

atolfloat, default 1e-8

Absolute tolerance. Only used when check_exact is False.

objstr, default ‘DataFrame’

Name of the object being compared; used internally to build the assertion message.
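
The interaction between check_exact, rtol and atol can be sketched as follows. This is an illustration under stated assumptions, not part of the cuDF API: `frames_match` is a hypothetical helper, and the pandas fallback is an assumption made so that the sketch also runs on a machine without cuDF (pandas.testing.assert_frame_equal accepts the same three parameters).

```python
# Sketch only: `frames_match` is a hypothetical helper, not part of cuDF.
try:
    import cudf as xdf
    from cudf.testing import assert_frame_equal
except ImportError:
    # Assumption: without cuDF, fall back to pandas, whose
    # assert_frame_equal takes the same check_exact/rtol/atol arguments.
    import pandas as xdf
    from pandas.testing import assert_frame_equal


def frames_match(left, right, **kwargs):
    """Return True if assert_frame_equal passes, False if it raises."""
    try:
        assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


left = xdf.DataFrame({"a": [1, 2], "b": [1.0, 2.0]})
# Column "b" differs from `left` by 1e-7 in its second row.
right = xdf.DataFrame({"a": [1, 2], "b": [1.0, 2.0 + 1e-7]})

# Default tolerances (rtol=1e-5, atol=1e-8) absorb the 1e-7 difference.
within_default_tolerance = frames_match(left, right)

# check_exact=True disables the tolerances, so the difference is reported.
exact = frames_match(left, right, check_exact=True)

# Tolerances tighter than the difference also make the comparison fail.
tight = frames_match(left, right, rtol=0.0, atol=1e-9)
```

With these inputs, only the first comparison should succeed; the other two should raise AssertionError inside the helper.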

Examples

>>> import cudf
>>> df1 = cudf.DataFrame({"a":[1, 2], "b":[1.0, 2.0]}, index=[1, 2])
>>> df2 = cudf.DataFrame({"a":[1, 2], "b":[1.0, 2.0]}, index=[2, 3])
>>> cudf.testing.assert_frame_equal(df1, df2)
......
......
AssertionError: ColumnBase are different

values are different (100.0 %)
[left]:  [1 2]
[right]: [2 3]
>>> df2 = cudf.DataFrame({"a":[1, 2], "c":[1.0, 2.0]}, index=[1, 2])
>>> cudf.testing.assert_frame_equal(df1, df2)
......
......
AssertionError: DataFrame.columns are different

DataFrame.columns values are different (50.0 %)
[left]: Index(['a', 'b'], dtype='object')
[right]: Index(['a', 'c'], dtype='object')
>>> df2 = cudf.DataFrame({"a":[1, 2], "b":[1.0, 3.0]}, index=[1, 2])
>>> cudf.testing.assert_frame_equal(df1, df2)
......
......
AssertionError: Column name="b" are different

values are different (50.0 %)
[left]:  [1. 2.]
[right]: [1. 3.]

The following comparison passes without raising an error:

>>> df2 = cudf.DataFrame({"a":[1, 2], "b":[1.0, 2.0]}, index=[1, 2])
>>> cudf.testing.assert_frame_equal(df1, df2)
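
The effect of check_like can be sketched in the same way. Again this is a minimal illustration rather than official usage: `frames_match` is a hypothetical helper, and the pandas fallback is an assumption so that the sketch runs where cuDF is not installed (pandas.testing.assert_frame_equal also accepts check_like).

```python
# Sketch only: `frames_match` is a hypothetical helper, not part of cuDF.
try:
    import cudf as xdf
    from cudf.testing import assert_frame_equal
except ImportError:
    # Assumption: fall back to pandas, which exposes the same check_like flag.
    import pandas as xdf
    from pandas.testing import assert_frame_equal


def frames_match(left, right, **kwargs):
    """Return True if assert_frame_equal passes, False if it raises."""
    try:
        assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


left = xdf.DataFrame({"a": [1, 2], "b": [1.0, 2.0]}, index=[1, 2])

# Same data as `left`, but with rows and columns in a different order.
reordered = xdf.DataFrame({"b": [2.0, 1.0], "a": [2, 1]}, index=[2, 1])

# Same labels as `reordered`, but column "b" is attached to the wrong rows.
mismatched = xdf.DataFrame({"b": [1.0, 2.0], "a": [2, 1]}, index=[2, 1])

# By default the order of index and columns matters.
strict = frames_match(left, reordered)

# check_like=True aligns by label before comparing.
relaxed = frames_match(left, reordered, check_like=True)

# Alignment does not hide real differences: labels must carry the same data.
wrong_data = frames_match(left, mismatched, check_like=True)
```

Here only the check_like=True comparison against `reordered` should pass; the default comparison and the comparison against `mismatched` should both fail.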