# Supported Data Types

cuDF supports many of the data types supported by NumPy and Pandas, including numeric, datetime, timedelta, categorical, and string data types. We also provide special data types for working with decimal, list-like, and dictionary-like data.

All data types in cuDF are nullable.

| Kind of data | Data type(s) |
| --- | --- |
| Signed integer | `'int8'`, `'int16'`, `'int32'`, `'int64'` |
| Unsigned integer | `'uint32'`, `'uint64'` |
| Floating-point | `'float32'`, `'float64'` |
| Datetime | `'datetime64[s]'`, `'datetime64[ms]'`, `'datetime64[us]'`, `'datetime64[ns]'` |
| Timedelta (duration) | `'timedelta64[s]'`, `'timedelta64[ms]'`, `'timedelta64[us]'`, `'timedelta64[ns]'` |
| Category | `cudf.CategoricalDtype()` |
| String | `'object'` or `'string'` |
| Decimal | `cudf.Decimal32Dtype()`, `cudf.Decimal64Dtype()`, `cudf.Decimal128Dtype()` |
| List | `cudf.ListDtype()` |
| Struct | `cudf.StructDtype()` |

## NumPy data types

We use NumPy data types for integer, floating-point, datetime, timedelta, and string data. Thus, just like in NumPy, `np.dtype("float32")`, `np.float32`, and `"float32"` are all acceptable ways to specify the `float32` data type:

```python
>>> import cudf
>>> s = cudf.Series([1, 2, 3], dtype="float32")
>>> s
0    1.0
1    2.0
2    3.0
dtype: float32
```
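The equivalence of the three spellings can be checked with NumPy alone. The following is a minimal sketch that does not touch cuDF; the variable names `spellings`, `normalized`, and `all_equal` are illustrative only.

```python
import numpy as np

# Three spellings of the same data type, as listed above.
spellings = [np.dtype("float32"), np.float32, "float32"]

# np.dtype() normalizes each spelling to a dtype object.
normalized = [np.dtype(s) for s in spellings]

# Every normalized entry compares equal to float32.
all_equal = all(d == np.dtype("float32") for d in normalized)
print(all_equal)  # True
```

Any of the three spellings should therefore be usable wherever a `dtype` argument is accepted.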

## A note on `object`

The data type associated with string data in cuDF is `"object"`.

```python
>>> import cudf
>>> s = cudf.Series(["abc", "def", "ghi"])
>>> s.dtype
dtype("object")
```

This is for compatibility with Pandas, but it can be misleading. In both NumPy and Pandas, `"object"` is the data type associated with data composed of arbitrary Python objects (not just strings). However, cuDF does not support storing arbitrary Python objects.
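To see what `"object"` means outside cuDF, here is a small NumPy-only sketch. It shows an object array holding values of unrelated Python types, which is exactly the behavior cuDF does not replicate. The names `mixed` and `type_names` are illustrative.

```python
import numpy as np

# A NumPy object array stores references to arbitrary Python objects.
mixed = np.array(["abc", 1.5, {"k": 1}], dtype=object)

# Each element keeps its original Python type.
type_names = [type(v).__name__ for v in mixed]

print(mixed.dtype)   # object
print(type_names)    # ['str', 'float', 'dict']
```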

## Decimal data types

We provide special data types for working with decimal data, namely `Decimal32Dtype`, `Decimal64Dtype`, and `Decimal128Dtype`. Use these data types when you need to store values with greater precision than allowed by floating-point representation.

Decimal data types in cuDF are based on fixed-point representation. A decimal data type is composed of a precision and a scale. The precision represents the total number of digits in each value of this dtype. For example, the precision associated with the decimal value `1.023` is `4`. The scale is the total number of digits to the right of the decimal point. The scale associated with the value `1.023` is 3.
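The definitions of precision and scale can be illustrated with Python's standard `decimal` module. This is a sketch only: `precision_and_scale` is a hypothetical helper written for this example, not a cuDF function, and it handles finite values only.

```python
from decimal import Decimal

def precision_and_scale(value):
    """Return (precision, scale) for a finite Decimal as written."""
    _sign, digits, exponent = value.as_tuple()
    # Scale: number of digits to the right of the decimal point.
    scale = max(-exponent, 0)
    # Precision: total number of digits, counting trailing zeros
    # implied by a positive exponent (e.g. Decimal("1E+2")).
    precision = max(len(digits) + max(exponent, 0), scale)
    return precision, scale

print(precision_and_scale(Decimal("1.023")))  # (4, 3)
print(precision_and_scale(Decimal("0.50")))   # (2, 2)
```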

Each decimal data type is associated with a maximum precision:

```python
>>> cudf.Decimal32Dtype.MAX_PRECISION
9.0
>>> cudf.Decimal64Dtype.MAX_PRECISION
18.0
>>> cudf.Decimal128Dtype.MAX_PRECISION
38
```
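These limits are consistent with fixed-point values stored as signed integers of 32, 64, and 128 bits. That storage layout is an assumption made for this sketch rather than something stated above. Under that assumption, the maximum precision is the largest digit count for which every value fits in the integer. `max_decimal_digits` is a hypothetical helper, not part of cuDF.

```python
def max_decimal_digits(bits):
    """Largest n such that every n-digit integer fits in a signed
    integer of the given width (assumed storage, see lead-in)."""
    largest = 2 ** (bits - 1) - 1
    # The largest value has one more digit than can be fully covered,
    # because not every number with that many digits fits.
    return len(str(largest)) - 1

limits = {bits: max_decimal_digits(bits) for bits in (32, 64, 128)}
print(limits)  # {32: 9, 64: 18, 128: 38}
```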

One way to create a decimal Series is from values of type `decimal.Decimal`:

```python
>>> from decimal import Decimal
>>> s = cudf.Series([Decimal("1.01"), Decimal("4.23"), Decimal("0.5")])
>>> s
0    1.01
1    4.23
2    0.50
dtype: decimal128
>>> s.dtype
Decimal128Dtype(precision=3, scale=2)
```

Notice the data type of the result: `1.01`, `4.23`, `0.50` can all be represented with a precision of at least 3 and a scale of at least 2.

However, the value `1.234` needs a precision of at least 4, and a scale of at least 3, and cannot be fully represented using this data type:

```python
>>> s[1] = Decimal("1.234")  # raises an error
```
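The reasoning behind this error can be sketched in plain Python. The helper `fits` below is hypothetical and only mirrors the rule described in the text; it is not cuDF's actual validation logic and handles finite values only.

```python
from decimal import Decimal

def fits(value, precision, scale):
    """Check whether a finite Decimal is exactly representable with
    the given precision and scale (illustrative rule only)."""
    _sign, digits, exponent = value.as_tuple()
    # Digits needed to the right of the decimal point.
    value_scale = max(-exponent, 0)
    # Digits needed to the left of the decimal point.
    integer_digits = max(len(digits) + exponent, 0)
    return value_scale <= scale and integer_digits <= precision - scale

print(fits(Decimal("4.23"), precision=3, scale=2))   # True
print(fits(Decimal("0.5"), precision=3, scale=2))    # True
print(fits(Decimal("1.234"), precision=3, scale=2))  # False
```

The last value fails because it needs three digits after the decimal point, while the data type provides only two.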

## Nested data types (`List` and `Struct`)

`ListDtype` and `StructDtype` are special data types in cuDF for working with list-like and dictionary-like data. These are referred to as “nested” data types because they let you store a list of lists, a struct of lists, a struct of lists of lists, and so on.

You can create list and struct Series from existing Pandas Series of lists and dictionaries, respectively:

```python
>>> import pandas as pd
>>> psr = pd.Series([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}])
>>> psr
0    {'a': 1, 'b': 2}
1    {'a': 3, 'b': 4}
dtype: object
>>> gsr = cudf.from_pandas(psr)
>>> gsr
0    {'a': 1, 'b': 2}
1    {'a': 3, 'b': 4}
dtype: struct
>>> gsr.dtype
StructDtype({'a': dtype('int64'), 'b': dtype('int64')})
```

You can also create them by reading from disk, using a file format that supports nested data:

```python
>>> pdf = pd.DataFrame({"a": [[1, 2], [3, 4, 5], [6, 7, 8]]})
>>> pdf.to_parquet("lists.pq")