API Reference#

The two main components of cuxfilter are DataFrame, which connects the dashboard to a cuDF-backed dataframe, and DashBoard, which sets the dashboard options.

DataFrame#

class cuxfilter.dataframe.DataFrame(data)#

A cuxfilter GPU DataFrame object

Attributes:
data
edges

Methods

dashboard(charts[, sidebar, layout, theme, ...])

Creates a cuxfilter.DashBoard object

from_arrow(dataframe_location)

read an arrow file from disk as cuxfilter.DataFrame

from_dataframe(dataframe)

create a cuxfilter.DataFrame from cudf.DataFrame/dask_cudf.DataFrame (zero-copy reference)

load_graph(graph)

create a cuxfilter.DataFrame (zero-copy reference) from a graph object whose nodes and edges are cudf.DataFrame/dask_cudf.DataFrame

preprocess_data

validate_dask_index

dashboard(charts: list, sidebar: list = [], layout=<class 'cuxfilter.layouts.layouts.Layout0'>, theme=<class 'cuxfilter.themes.default.LightTheme'>, title='Dashboard', data_size_widget=True, warnings=False, layout_array=None)#

Creates a cuxfilter.DashBoard object

Parameters:
charts: list

list of cuxfilter.charts

layout: cuxfilter.layouts
theme: cuxfilter.themes, default cuxfilter.themes.default.LightTheme
title: str

title of the dashboard, default “Dashboard”

data_size_widget: boolean

flag to determine whether to display the number of datapoints currently selected in the dashboard, default True

warnings: boolean

flag to disable or enable runtime warnings related to layouts, default False

Returns:
cuxfilter.DashBoard object

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> # create a dashboard object
>>> d = cux_df.dashboard([line_chart_1])

classmethod from_arrow(dataframe_location)#

read an arrow file from disk as cuxfilter.DataFrame

Parameters:
dataframe_location: str or arrow in-memory table
Returns:
cuxfilter.DataFrame object

Examples

Read a dataframe from an in-memory arrow table

>>> import cuxfilter
>>> import pyarrow as pa
>>> # create a temporary arrow table
>>> arrowTable = pa.Table.from_arrays([['foo', 'bar']], names=['name'])
>>> # read the arrow table; .arrow file paths can also be read directly
>>> cux_df = cuxfilter.DataFrame.from_arrow(arrowTable)

classmethod from_dataframe(dataframe)#

create a cuxfilter.DataFrame from cudf.DataFrame/dask_cudf.DataFrame (zero-copy reference)

Parameters:
dataframe: cudf.DataFrame or dask_cudf.DataFrame
Returns:
cuxfilter.DataFrame object

Examples

Read dataframe from a cudf.DataFrame/dask_cudf.DataFrame

>>> import cuxfilter
>>> import cudf
>>> cudf_df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(cudf_df)

classmethod load_graph(graph)#

create a cuxfilter.DataFrame (zero-copy reference) from a graph object whose nodes and edges are cudf.DataFrame/dask_cudf.DataFrame

Parameters:
graph: tuple object (nodes, edges) where nodes and edges are cudf DataFrames
Returns:
cuxfilter.DataFrame object

Examples

load graph from cugraph object

>>> import cuxfilter
>>> import cudf, cugraph
>>> edges = cudf.DataFrame(
>>>     {
>>>         'source': [0, 1, 2, 3, 4],
>>>         'target':[0,1,2,3,4],
>>>         'weight':[4,4,2,6,7],
>>>     }
>>> )
>>> G = cugraph.Graph()
>>> G.from_cudf_edgelist(edges, destination='target')
>>> cux_df = cuxfilter.DataFrame.load_graph((G.nodes(), G.edges()))

load graph from (nodes, edges)

>>> import cuxfilter
>>> import cudf
>>> nodes = cudf.DataFrame(
>>>     {
>>>         'vertex': [0, 1, 2, 3, 4],
>>>         'x':[0,1,2,3,4],
>>>         'y':[4,4,2,6,7],
>>>         'attr': [0,1,1,1,1]
>>>     }
>>> )
>>> edges = cudf.DataFrame(
>>>     {
>>>         'source': [0, 1, 2, 3, 4],
>>>         'target':[0,1,2,3,4],
>>>         'weight':[4,4,2,6,7],
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.load_graph((nodes,edges))
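
Because load_graph takes nodes and edges as two separate tables, it can help to check that every edge endpoint refers to a known vertex before building a dashboard. The following is a minimal sketch in plain Python, with dicts of lists standing in for the cudf DataFrames. The helper name find_unknown_vertices is hypothetical, the column names 'vertex', 'source' and 'target' are taken from the example above, and nothing here is part of the cuxfilter API.

```python
def find_unknown_vertices(nodes, edges,
                          vertex_col="vertex",
                          source_col="source",
                          target_col="target"):
    """Return the sorted edge endpoints that do not appear in the node table.

    Hypothetical helper: `nodes` and `edges` are plain dicts of lists that
    mimic the column layout of the cudf DataFrames in the example above.
    """
    known = set(nodes[vertex_col])
    endpoints = set(edges[source_col]) | set(edges[target_col])
    return sorted(endpoints - known)


nodes = {"vertex": [0, 1, 2, 3, 4], "x": [0, 1, 2, 3, 4], "y": [4, 4, 2, 6, 7]}
edges = {"source": [0, 1, 2, 3, 4], "target": [0, 1, 2, 3, 4]}

# Every endpoint is a known vertex, so nothing is reported.
print(find_unknown_vertices(nodes, edges))  # []

# An edge that points at vertex 7, which is absent from the node table.
bad_edges = {"source": [0, 1], "target": [1, 7]}
print(find_unknown_vertices(nodes, bad_edges))  # [7]
```

An empty result suggests the two tables are consistent; whether cuxfilter performs a comparable check internally is not stated in this reference.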

DashBoard#

class cuxfilter.dashboard.DashBoard(charts=[], sidebar=[], dataframe=None, layout=<class 'cuxfilter.layouts.layouts.Layout0'>, theme=<class 'cuxfilter.themes.default.LightTheme'>, title='Dashboard', data_size_widget=True, show_warnings=False, layout_array=None)#

A cuxfilter GPU DashBoard object.

Examples

Create a dashboard

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh, panel_widgets
>>> df = cudf.DataFrame(
>>>     {'key': [0, 1, 2, 3, 4], 'val':[float(i + 10) for i in range(5)]}
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> sidebar_widget = panel_widgets.card("test")
>>> d = cux_df.dashboard(
>>>     charts=[line_chart_1, line_chart_2],
>>>     sidebar=[sidebar_widget]
>>> )
>>> d
cuxfilter DashBoard
[title] Markdown(str)
[chart0] Markdown(str, sizing_mode='stretch_both'), ['nav'])
[chart1] Column(sizing_mode='scale_both', width=1600)
    [0] Bokeh(Figure)
[chart2] Column(sizing_mode='scale_both', width=1600)
    [0] Bokeh(Figure)
>>> # d.app() for serving within notebook cell,
>>> # d.show() for serving as a separate web-app
>>> d.app() #or d.show()
displays interactive dashboard

do some visual querying/ crossfiltering

Attributes:
charts

Charts in the dashboard as a dictionary.

queried_indices

Read-only property queried_indices returns a merged index of all queried index columns present in self._query_str_dict as a cudf.Series or dask_cudf.Series.

server

Methods

add_charts([charts, sidebar])

Adding more charts to the dashboard, after it has been initialized.

app([sidebar_width, width, height])

Run the dashboard with a bokeh backend server within the notebook.

export()

Export the cudf.DataFrame based on the current filtered state of the dashboard.

show([notebook_url, port, threaded, ...])

Run the dashboard as a separate web-app with a bokeh backend server.

stop()

stop the bokeh server

add_charts(charts=[], sidebar=[])#

Adding more charts to the dashboard, after it has been initialized.

Parameters:
charts: list

list of cuxfilter.charts objects

sidebar: list

list of cuxfilter.charts.panel_widgets objects

Notes

After adding the charts, refresh the dashboard app tab to see the updated charts. Charts of type widget cannot be added to sidebar, but widgets can be added to charts (main layout).

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh, panel_widgets
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard([line_chart_1])
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> d.add_charts(charts=[line_chart_2])
>>> # or
>>> d.add_charts(charts=[], sidebar=[panel_widgets.card("test")])

app(sidebar_width=280, width=1200, height=800)#

Run the dashboard with a bokeh backend server within the notebook.

Parameters:
sidebar_width: int, optional, default 280

width of the sidebar in pixels

width: int, optional, default 1200

width of the dashboard in pixels

height: int, optional, default 800

height of the dashboard in pixels

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard([line_chart_1])
>>> d.app(sidebar_width=200, width=1000, height=450)

property charts#

Charts in the dashboard as a dictionary.

export()#

Export the cudf.DataFrame based on the current filtered state of the dashboard.

Also prints the query string of the current state of the dashboard.

Returns:
cudf.DataFrame based on the current filtered state of the dashboard.

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> line_chart_2 = bokeh.bar(
>>>     'val', 'key', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard(
>>>    [line_chart_1, line_chart_2],
>>>    layout=cuxfilter.layouts.double_feature
>>> )
>>> # d.app() for serving within notebook cell,
>>> # d.show() for serving as a separate web-app
>>> d.app() #or d.show()
displays interactive dashboard
>>> queried_df = d.export()
final query 2<=key<=4
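
The query string printed by export() describes the range selections that are active in the dashboard. As a rough illustration of that idea only, the sketch below applies range selections to plain Python rows and formats a matching query string. The names build_query and apply_selections are hypothetical, joining several selections with " and " is an assumption (the example above shows a single range), and cuxfilter's own implementation runs on cudf and may differ.

```python
def build_query(selections):
    """Format {column: (low, high)} selections as one query string."""
    return " and ".join(
        f"{low}<={column}<={high}" for column, (low, high) in selections.items()
    )


def apply_selections(rows, selections):
    """Keep only the rows that fall inside every selected range."""
    return [
        row
        for row in rows
        if all(low <= row[column] <= high
               for column, (low, high) in selections.items())
    ]


# Same toy data as the example above, as a list of row dicts.
rows = [{"key": k, "val": float(k + 10)} for k in range(5)]
selections = {"key": (2, 4)}

print(build_query(selections))  # 2<=key<=4
filtered = apply_selections(rows, selections)
print([row["key"] for row in filtered])  # [2, 3, 4]
```
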

show(notebook_url='http://localhost:8888', port=0, threaded=False, service_proxy=None, sidebar_width=280, height=800, **kwargs)#

Run the dashboard as a separate web-app with a bokeh backend server.

Parameters:
notebook_url: str, optional, default 'http://localhost:8888'

URL where you want to run the dashboard as a web-app, including the port number. Can use localhost instead of ip if running locally.

port: int, optional, default 0

Has to be an open port

service_proxy: str, optional, default None

available options: jupyterhub

threaded: boolean, optional, default False

whether to run the server in threaded mode

sidebar_width: int, optional, default 280

width of the sidebar in pixels

height: int, optional, default 800

height of the dashboard in pixels

**kwargs: dict, optional

additional keyword arguments to pass to the server
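
Since notebook_url is expected to include the port number, a small check before calling show() can catch a URL that omits it. This is a minimal sketch using only the standard library; split_host_port is a hypothetical helper, not a cuxfilter function.

```python
from urllib.parse import urlsplit


def split_host_port(url):
    """Return (host, port) from a URL, raising ValueError if either is missing."""
    # Prefix scheme-less input such as "localhost:8889" with "//" so that
    # urlsplit parses it as a network location rather than a scheme or path.
    parts = urlsplit(url if "://" in url else "//" + url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected host and port in {url!r}")
    return parts.hostname, parts.port


print(split_host_port("http://localhost:8888"))  # ('localhost', 8888)
print(split_host_port("localhost:8889"))         # ('localhost', 8889)
```

A value such as "localhost" with no port raises ValueError, which flags the problem before the server is started.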

Examples

>>> import cudf
>>> import cuxfilter
>>> from cuxfilter.charts import bokeh
>>> df = cudf.DataFrame(
>>>     {
>>>         'key': [0, 1, 2, 3, 4],
>>>         'val':[float(i + 10) for i in range(5)]
>>>     }
>>> )
>>> cux_df = cuxfilter.DataFrame.from_dataframe(df)
>>> line_chart_1 = bokeh.line(
>>>     'key', 'val', data_points=5, add_interaction=False
>>> )
>>> d = cux_df.dashboard([line_chart_1])
>>> d.show(notebook_url='http://localhost:8889')
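
The port passed to show() has to be open. One common way to obtain a free port, sketched here with the standard library, is to bind a socket to port 0 and let the operating system choose. find_open_port is a hypothetical helper, and whether cuxfilter performs a similar lookup itself is not stated in this reference.

```python
import socket


def find_open_port(host="127.0.0.1"):
    """Ask the operating system for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 makes the OS assign a free ephemeral port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_open_port()
print(port)  # varies from run to run
```

The returned value could then be passed on, for example as d.show(port=port). The port is released when the helper returns, so another process could claim it before the server starts; treat this as a convenience, not a guarantee.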

stop()#

stop the bokeh server