API Reference

DataFrame

class cuxfilter.dataframe.DataFrame(data)

A cuxfilter GPU DataFrame object

Attributes:
data
edges

Methods

dashboard(charts[, sidebar, layout, theme, ...])

Creates a cuxfilter.DashBoard object

from_arrow(dataframe_location)

read an arrow file from disk as cuxfilter.DataFrame

from_dataframe(dataframe)

create a cuxfilter.DataFrame from cudf.DataFrame/dask_cudf.DataFrame (zero-copy reference)

load_graph(graph)

create a cuxfilter.DataFrame (zero-copy reference) from a graph object made of cudf.DataFrame/dask_cudf.DataFrame nodes and edges

preprocess_data

validate_dask_index

dashboard(charts: list, sidebar: list = [], layout=<class 'cuxfilter.layouts.layouts.Layout0'>, theme=<class 'cuxfilter.themes.light.LightTheme'>, title='Dashboard', data_size_widget=True, warnings=False, layout_array=None)

Creates a cuxfilter.DashBoard object

Parameters:
charts: list

list of cuxfilter.charts

layout: cuxfilter.layouts
title: str

title of the dashboard, default “Dashboard”

data_size_widget: boolean

flag to determine whether to display the number of datapoints currently selected in the dashboard, default True

warnings: boolean

flag to disable or enable runtime warnings related to layouts, default False

Returns:
cuxfilter.DashBoard object

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> # create a dashboard object
>>> d = cux_df.dashboard([line_chart_1])
classmethod from_arrow(dataframe_location)

read an arrow file from disk as cuxfilter.DataFrame

Parameters:
dataframe_location: str or arrow in-memory table
Returns:
cuxfilter.DataFrame object

Examples

Read dataframe as an arrow file from disk

>>> import cuxfilter
>>> import pyarrow as pa
>>> # create a temporary arrow table
>>> arrowTable = pa.Table.from_arrays([['foo', 'bar']], names=['name'])
>>> # read arrow table, can also read .arrow file paths directly
>>> cux_df = cuxfilter.DataFrame.from_arrow(arrowTable)
classmethod from_dataframe(dataframe)

create a cuxfilter.DataFrame from cudf.DataFrame/dask_cudf.DataFrame (zero-copy reference)

Parameters:
dataframe: cudf.DataFrame or dask_cudf.DataFrame
Returns:
cuxfilter.DataFrame object

Examples

Read dataframe from a cudf.DataFrame/dask_cudf.DataFrame

>>> import cuxfilter
>>> import cudf
>>> cudf_df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(cudf_df)
classmethod load_graph(graph)

create a cuxfilter.DataFrame (zero-copy reference) from a graph object made of cudf.DataFrame/dask_cudf.DataFrame nodes and edges

Parameters:
graph: tuple object (nodes, edges) where nodes and edges are cudf DataFrames
Returns:
cuxfilter.DataFrame object

Examples

load graph from cugraph object

>>> import cuxfilter
>>> import cudf, cugraph
>>> edges = cudf.DataFrame(
>>>     {
>>>         'source': [0, 1, 2, 3, 4],
>>>         'target':[0,1,2,3,4],
>>>         'weight':[4,4,2,6,7],
>>>     }
>>> )
>>> G = cugraph.Graph()
>>> G.from_cudf_edgelist(edges, destination='target')
>>> cux_df = cuxfilter.DataFrame.load_graph((G.nodes(), G.edges()))

load graph from (nodes, edges)

>>> import cuxfilter
>>> import cudf
>>> nodes = cudf.DataFrame(
>>>     {
>>>         'vertex': [0, 1, 2, 3, 4],
>>>         'x':[0,1,2,3,4],
>>>         'y':[4,4,2,6,7],
>>>         'attr': [0,1,1,1,1]
>>>     }
>>> )
>>> edges = cudf.DataFrame(
>>>     {
>>>         'source': [0, 1, 2, 3, 4],
>>>         'target':[0,1,2,3,4],
>>>         'weight':[4,4,2,6,7],
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.load_graph((nodes,edges))
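The (nodes, edges) pair above can be sanity-checked before it is loaded. The following is a minimal sketch in plain Python, with dictionaries standing in for the cudf DataFrames. It assumes that the 'source' and 'target' values of the edges refer to entries of the 'vertex' column of the nodes, as the example data suggests. The helper `dangling_endpoints` is hypothetical and not part of cuxfilter, and whether load_graph performs any such check is not stated here.

```python
def dangling_endpoints(nodes, edges):
    """Return sorted edge endpoints that have no matching entry in nodes['vertex'].

    Hypothetical helper for illustration; not part of cuxfilter.
    """
    known = set(nodes["vertex"])
    endpoints = set(edges["source"]) | set(edges["target"])
    return sorted(endpoints - known)


# Plain-Python stand-ins for the nodes and edges DataFrames from the example
nodes = {
    "vertex": [0, 1, 2, 3, 4],
    "x": [0, 1, 2, 3, 4],
    "y": [4, 4, 2, 6, 7],
    "attr": [0, 1, 1, 1, 1],
}
edges = {
    "source": [0, 1, 2, 3, 4],
    "target": [0, 1, 2, 3, 4],
    "weight": [4, 4, 2, 6, 7],
}

# An empty list means every edge endpoint is a known vertex
missing = dangling_endpoints(nodes, edges)
```

With the example data every endpoint is a known vertex, so `missing` is empty.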

DashBoard

class cuxfilter.dashboard.DashBoard(charts=[], sidebar=[], dataframe=None, layout=<class 'cuxfilter.layouts.layouts.Layout0'>, theme=<class 'cuxfilter.themes.light.LightTheme'>, title='Dashboard', data_size_widget=True, show_warnings=False, layout_array=None)

A cuxfilter GPU DashBoard object.

Examples

Create a dashboard

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh, panel_widgets
>>> df = cudf.DataFrame(
>>>     {'key': [0, 1, 2, 3, 4], 'val':[float(i + 10) for i in range(5)]}
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> sidebar_widget = panel_widgets.card("test")
>>> d = cux_df.dashboard(charts=[line_chart_1, line_chart_2],
>>> sidebar=[sidebar_widget])
>>> d
cuxfilter DashBoard
[title] Markdown(str)
[chart0] Markdown(str, sizing_mode='stretch_both'), ['nav'])
[chart1] Column(sizing_mode='scale_both', width=1600)
    [0] Bokeh(Figure)
[chart2] Column(sizing_mode='scale_both', width=1600)
    [0] Bokeh(Figure)
>>> # d.app() for serving within notebook cell,
>>> # d.show() for serving as a separate web-app
>>> d.app() #or d.show()
displays interactive dashboard

do some visual querying/ crossfiltering

Attributes:
charts

Charts in the dashboard as a dictionary.

queried_indices

Read-only property that returns a merged index of all queried index columns present in self._query_str_dict, as a cudf.Series.

server

Methods

add_charts([charts, sidebar])

Add more charts to the dashboard after it has been initialized.

app([sidebar_width])

Run the dashboard with a bokeh backend server within the notebook.

export()

Export the cudf.DataFrame based on the current filtered state of the dashboard.

preview()

Preview (async) all the charts in a Jupyter cell, non-interactive (no backend server).

show([notebook_url, port, threaded, ...])

Run the dashboard with a bokeh backend server as a separate web-app.

stop()

stop the bokeh server

add_charts(charts=[], sidebar=[])

Add more charts to the dashboard after it has been initialized.

Parameters:
charts: list

list of cuxfilter.charts objects

sidebar: list

list of cuxfilter.charts.panel_widget objects

Notes

After adding the charts, refresh the dashboard app tab to see the updated charts.

Charts of type widget cannot be added to sidebar but widgets can be added to charts(main layout)

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh, panel_widgets
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard([line_chart_1])
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> d.add_charts(charts=[line_chart_2])
>>> # or
>>> d.add_charts(charts=[], sidebar=[panel_widgets.card("test")])
app(sidebar_width=280)

Run the dashboard with a bokeh backend server within the notebook.

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard([line_chart_1])
>>> d.app()
property charts

Charts in the dashboard as a dictionary.

export()

Export the cudf.DataFrame based on the current filtered state of the dashboard.

Also prints the query string of the current state of the dashboard.

Returns:
cudf.DataFrame

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard(
>>>    [line_chart_1, line_chart_2],
>>>    layout=cuxfilter.layouts.double_feature
>>> )
>>> # d.app() for serving within notebook cell,
>>> # d.show() for serving as a separate web-app
>>> d.app() #or d.show()
displays interactive dashboard
>>> queried_df = d.export()
final query 2<=key<=4
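To make the printed query string concrete, here is a minimal sketch of what a range query such as 2<=key<=4 selects from the example data. It uses plain Python lists in place of a cudf.DataFrame so that it runs without a GPU. The helper `apply_range_query` is hypothetical and only illustrates the filtering semantics; it is not how cuxfilter evaluates queries internally.

```python
# Plain-Python stand-in for the example data (same keys and values as above)
rows = [{"key": k, "val": float(k + 10)} for k in range(5)]


def apply_range_query(rows, column, low, high):
    """Keep the rows whose `column` value lies in the closed range [low, high].

    Hypothetical helper for illustration; not part of cuxfilter.
    """
    return [row for row in rows if low <= row[column] <= high]


# Equivalent of the query string "2<=key<=4"
queried_rows = apply_range_query(rows, "key", 2, 4)
```

Here `queried_rows` holds the three rows with key 2, 3 and 4, which is the subset an export would be expected to contain for that query.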
async preview()

Preview (async) all the charts in a Jupyter cell, non-interactive (no backend server). Mostly intended to save notebook state for blogs and documentation while still rendering the dashboard.

Notes

  • Png format

  • Bokeh and Datashader based charts also have a save tool on the side toolbar, which can download and save the individual chart when interacting with the dashboard.

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard(
>>>    [line_chart_1, line_chart_2],
>>>    layout=cuxfilter.layouts.double_feature
>>> )
>>> await d.preview()
displays charts in the dashboard
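Because preview() is a coroutine, the `await d.preview()` call above relies on a context that allows top-level await, such as a Jupyter cell. The sketch below shows the same await pattern driven from a regular script with asyncio.run. `StandInDashboard` is a hypothetical class that only mimics an async preview() method so that the example runs without a GPU; it is not part of cuxfilter.

```python
import asyncio


class StandInDashboard:
    """Hypothetical stand-in that only mimics an async preview() method."""

    async def preview(self):
        # Yield control once, as a real coroutine doing I/O would
        await asyncio.sleep(0)
        return "preview rendered"


async def main():
    d = StandInDashboard()
    # Same call shape as `await d.preview()` in a notebook cell
    return await d.preview()


# asyncio.run creates an event loop, runs main() to completion, and closes the loop
result = asyncio.run(main())
```

Inside a notebook the event loop is already running, so there the plain `await` form from the example is the one to use.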
property queried_indices

Read-only property that returns a merged index of all queried index columns present in self._query_str_dict, as a cudf.Series.

Returns None if no index columns are present.

show(notebook_url='http://localhost:8888', port=0, threaded=False, service_proxy=None, **kwargs)

Run the dashboard with a bokeh backend server as a separate web-app.

Parameters:
notebook_url: str, optional, default http://localhost:8888

URL where you want to run the dashboard as a web-app, including the port number. Can use localhost instead of ip if running locally.

port: int, optional

Has to be an open port.

service_proxy: str, optional, default None

available options: jupyterhub

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard([line_chart_1])
>>> d.show(notebook_url='localhost:8889')
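The port argument has to be an open port. One way to find a candidate is sketched below using only the standard library. The helper name `find_open_port` is an assumption made for illustration and is not part of cuxfilter.

```python
import socket


def find_open_port(host="127.0.0.1"):
    """Ask the operating system for a currently unused TCP port on `host`.

    Hypothetical helper for illustration; not part of cuxfilter.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 lets the OS choose a free port
        s.bind((host, 0))
        return s.getsockname()[1]


port = find_open_port()
# The value could then be passed on, for example: d.show(port=port)
```

The socket is closed before the port is reused, so another process could claim the port in between; treat the result as a good candidate rather than a guarantee.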
stop()

stop the bokeh server