cuSpatial API Reference¶
GIS¶

cuspatial.
directed_hausdorff_distance
(xs, ys, space_offsets)¶ Compute the directed Hausdorff distances between all pairs of spaces.
 Parameters
 xs
column of xcoordinates
 ys
column of ycoordinates
 space_offsets
beginning index of each space, plus the last space’s end offset.
 Returns
 resultcudf.DataFrame
The pairwise directed distance matrix with one row and one column per input space; the value at row i, column j represents the hausdorff distance from space i to space j.
Examples
The directed Hausdorff distance from one space to another is the greatest of all the distances between any point in the first space to the closest point in the second.
Consider a pair of lines on a grid:
: x xyy : :
x_{0} = (0, 0), x_{1} = (0, 1)
y_{0} = (1, 0), y_{1} = (2, 0)
x_{0} is the closest point in
x
toy
. The distance from x_{0} to the farthest point iny
is 2.y_{0} is the closest point in
y
tox
. The distance from y_{0} to the farthest point inx
is 1.414.Compute the directed hausdorff distances between a set of spaces
>>> result = cuspatial.directed_hausdorff_distance( [0, 1, 0, 0], # xs [0, 0, 1, 2], # ys [0, 2, 4], # space_offsets ) >>> print(result) 0 1 0 0.0 1.414214 1 2.0 0.000000

cuspatial.
haversine_distance
(p1_lon, p1_lat, p2_lon, p2_lat)¶ Compute the haversine distances between an arbitrary list of lon/lat pairs
 Parameters
 p1_lon
longitude of first set of coords
 p1_lat
latitude of first set of coords
 p2_lon
longitude of second set of coords
 p2_lat
latitude of second set of coords
 Returns
 resultcudf.Series
The distance between all pairs of lon/lat coordinates

cuspatial.
lonlat_to_cartesian
(origin_lon, origin_lat, input_lon, input_lat)¶ Convert lon/lat to
x,y
coordinates with respect to an origin lon/lat point. Results are scaled relative to the size of the Earth in kilometers. Parameters
 origin_lon
number
longitude offset (this is subtracted from each input before converting to x,y)
 origin_lat
number
latitude offset (this is subtracted from each input before converting to x,y)
 input_lon
Series
orlist
longitude coordinates to convert to x
 input_lat
Series
orlist
latitude coordinates to convert to y
 origin_lon
 Returns
 resultcudf.DataFrame
 xcudf.Series
xcoordinate of the input relative to the size of the Earth in kilometers.
 ycudf.Series
ycoordinate of the input relative to the size of the Earth in kilometers.

cuspatial.
point_in_polygon
(test_points_x, test_points_y, poly_offsets, poly_ring_offsets, poly_points_x, poly_points_y)¶ Compute from a set of points and a set of polygons which points fall within which polygons. Note that polygons_(x,y) must be specified as closed polygons: the first and last coordinate of each polygon must be the same.
 Parameters
 test_points_x
xcoordinate of test points
 test_points_y
ycoordinate of test points
 poly_offsets
beginning index of the first ring in each polygon
 poly_ring_offsets
beginning index of the first point in each ring
 poly_points_x
x closedcoordinate of polygon points
 poly_points_y
y closedcoordinate of polygon points
 Returns
 resultcudf.DataFrame
A DataFrame of boolean values indicating whether each point falls within each polygon.
Examples
Test whether 3 points fall within either of two polygons
>>> result = cuspatial.point_in_polygon( [0, 8, 6.0], # test_points_x [0, 8, 6.0], # test_points_y cudf.Series([0, 1], index=['nyc', 'hudson river']), # poly_offsets [0, 3], # ring_offsets [10, 5, 5, 10, 0, 10, 10, 0], # poly_points_x [10, 10, 5, 5, 0, 0, 10, 10], # poly_points_y ) # The result of point_in_polygon is a DataFrame of Boolean # values indicating whether each point (rows) falls within # each polygon (columns). >>> print(result) nyc hudson river 0 True True 1 True False 2 False True # Point 0: (0, 0) falls in both polygons # Point 1: (8, 8) falls in the first polygon # Point 2: (6.0, 6.0) falls in the second polygon
note input Series x and y will not be index aligned, but computed as sequential arrays.

cuspatial.
polygon_bounding_boxes
(poly_offsets, ring_offsets, xs, ys)¶ Compute the minimum boundingboxes for a set of polygons.
 Parameters
 poly_offsets
Begin indices of the first ring in each polygon (i.e. prefixsum)
 ring_offsets
Begin indices of the first point in each ring (i.e. prefixsum)
 xs
Polygon point xcoordinates
 ys
Polygon point ycoordinates
 Returns
 resultcudf.DataFrame
minimum bounding boxes for each polygon
 x_mincudf.Series
the minimum xcoordinate of each bounding box
 y_mincudf.Series
the minimum ycoordinate of each bounding box
 x_maxcudf.Series
the maximum xcoordinate of each bounding box
 y_maxcudf.Series
the maximum ycoordinate of each bounding box

cuspatial.
polyline_bounding_boxes
(poly_offsets, xs, ys, expansion_radius)¶ Compute the minimum boundingboxes for a set of polylines.
 Parameters
 poly_offsets
Begin indices of the first ring in each polyline (i.e. prefixsum)
 xs
Polyline point xcoordinates
 ys
Polyline point ycoordinates
 expansion_radius
radius of each polyline point
 Returns
 resultcudf.DataFrame
minimum bounding boxes for each polyline
 x_mincudf.Series
the minimum xcoordinate of each bounding box
 y_mincudf.Series
the minimum ycoordinate of each bounding box
 x_maxcudf.Series
the maximum xcoordinate of each bounding box
 y_maxcudf.Series
the maximum ycoordinate of each bounding box
Indexing¶

cuspatial.
quadtree_on_points
(xs, ys, x_min, x_max, y_min, y_max, scale, max_depth, min_size)¶  Construct a quadtree from a set of points for a given areaofinterest
bounding box.
 Parameters
 xs
Column of xcoordinates for each point
 ys
Column of ycoordinates for each point
 x_min
The lowerleft xcoordinate of the area of interest bounding box
 x_max
The upperright xcoordinate of the area of interest bounding box
 y_min
The lowerleft ycoordinate of the area of interest bounding box
 y_max
The upperright ycoordinate of the area of interest bounding box
 scale
Scale to apply to each point’s distance from
(x_min, y_min)
 max_depth
Maximum quadtree depth
 min_size
Minimum number of points for a nonleaf quadtree node
 Returns
 resulttuple (cudf.Series, cudf.DataFrame)
 keys_to_pointscudf.Series(dtype=np.int32)
A column of sorted keys to original point indices
 quadtreecudf.DataFrame
A complete quadtree for the set of input points
 keycudf.Series(dtype=np.int32)
An int32 column of quadrant keys
 levelcudf.Series(dtype=np.int8)
An int8 column of quadtree levels
 is_quadcudf.Series(dtype=np.bool_)
A boolean column indicating whether the node is a quad or leaf
 lengthcudf.Series(dtype=np.int32)
If this is a nonleaf quadrant (i.e.
is_quad
isTrue
), this column’s value is the number of children in the nonleaf quadrant.Otherwise this column’s value is the number of points contained in the leaf quadrant.
 offsetcudf.Series(dtype=np.int32)
If this is a nonleaf quadrant (i.e.
is_quad
isTrue
), this column’s value is the position of the nonleaf quadrant’s first child.Otherwise this column’s value is the position of the leaf quadrant’s first point.
Notes
Swaps
min_x
andmax_x
ifmin_x > max_x
Swaps
min_y
andmax_y
ifmin_y > max_y
Examples
An example of selecting the
min_size
andscale
based on input:>>> np.random.seed(0) >>> points = cudf.DataFrame({ "x": cudf.Series(np.random.normal(size=120)) * 500, "y": cudf.Series(np.random.normal(size=120)) * 500, }) >>> max_depth = 3 >>> min_size = 50 >>> min_x, min_y, max_x, max_y = (points["x"].min(), points["y"].min(), points["x"].max(), points["y"].max()) >>> scale = max(max_x  min_x, max_y  min_y) // (1 << max_depth) >>> print( "min_size: " + str(min_size) + "\n" "num_points: " + str(len(points)) + "\n" "min_x: " + str(min_x) + "\n" "max_x: " + str(max_x) + "\n" "min_y: " + str(min_y) + "\n" "max_y: " + str(max_y) + "\n" "scale: " + str(scale) + "\n" ) min_size: 50 num_points: 120 min_x: 1577.4949079170394 max_x: 1435.877311993804 min_y: 1412.7015761122134 max_y: 1492.572387431971 scale: 301.0 >>> key_to_point, quadtree = cuspatial.quadtree_on_points( points["x"], points["y"], min_x, max_x, min_y, max_y, scale, max_depth, min_size ) >>> print(quadtree) key level is_quad length offset 0 0 0 False 15 0 1 1 0 False 27 15 2 2 0 False 12 42 3 3 0 True 4 8 4 4 0 False 5 106 5 6 0 False 6 111 6 9 0 False 2 117 7 12 0 False 1 119 8 12 1 False 22 54 9 13 1 False 18 76 10 14 1 False 9 94 11 15 1 False 3 103 >>> print(key_to_point) 0 63 1 20 2 33 3 66 4 19 ... 115 113 116 3 117 78 118 98 119 24 Length: 120, dtype: int32
Interpolation¶

class
cuspatial.
CubicSpline
(t, y, ids=None, size=None, prefixes=None)¶ Fits each column of the input Series y to a hermetic cubic spline.
cuspatial.CubicSpline
supports two usage patterns: The first is identical to scipy.interpolate.CubicSpline:curve = cuspatial.CubicSpline(t, y) new_points = curve(np.linspace(t.min, t.max, 50))
This allows API parity with scipy. This isn’t recommended, as scipy host based interpolation performance is likely to exceed GPU performance for a single curve.
However, cuSpatial massively outperforms scipy when many splines are fit simultaneously. Data must be arranged in a SoA format, and the exclusive prefix_sum of the separate curves must also be passed to the function.:
t = cudf.Series(np.repeat(cp.arange(100), 1000)).astype('float32') y = cudf.Series(np.random.random(100*1000)).astype('float32') prefix_sum = cudf.Series(cp.arange(1000)*100).astype('int32') new_samples = cudf.Series( np.repeat(np.linspace(0, 100, 1000), 1000) ).astype('float32') curve = cuspatial.CubicSpline(t, y, prefixes=prefix_sum) new_points = curve(new_samples, prefix_sum*10)
Methods
__call__
(coordinates[, groups])Interpolates new input values coordinates using the .c DataFrame or map of DataFrames.

CubicSpline.
__init__
(t, y, ids=None, size=None, prefixes=None)¶ Computes various error preconditions on the input data, then uses CUDA to compute cubic splines for each set of input coordinates on the GPU in parallel.
 Parameters
 tcudf.Series
time sample values. Must be monotonically increasing.
 ycudf.Series
columns to have curves fit to according to x
 ids (Optional)cudf.Series
ids of each spline
 size (Optional)cudf.Series
fixed size of each spline
 prefixes (Optional)cudf.Series
alternative to size, allows splines of varying length. Not yet fully supported.
 Returns
 CubicSplinecallable o
o.c
contains the coefficients that can be used to compute new points along the spline fitting the originalt
data.o(n)
interpolates the spline coordinates along new input valuesn
.

CubicSpline.
__call__
(coordinates, groups=None)¶ Interpolates new input values coordinates using the .c DataFrame or map of DataFrames.
Spatial Querying¶

cuspatial.
points_in_spatial_window
(min_x, max_x, min_y, max_y, xs, ys)¶ Return only the subset of coordinates that fall within a rectangular window.
A point (x, y) is inside the query window if and only if
min_x < x < max_x AND min_y < y < max_y
The window is specified by minimum and maximum x and y coordinates.
 Parameters
 min_x
lower xcoordinate of the query window
 max_x
upper xcoordinate of the query window
 min_y
lower ycoordinate of the query window
 max_y
upper ycoordinate of the query window
 xs
column of xcoordinates that may fall within the window
 ys
column of ycoordinates that may fall within the window
 Returns
 resultcudf.DataFrame
subset of (x, y) pairs above that fall within the window
Notes
Swaps
min_x
andmax_x
ifmin_x > max_x
Swaps
min_y
andmax_y
ifmin_y > max_y
Spatial Joining¶

cuspatial.
join_quadtree_and_bounding_boxes
(quadtree, poly_bounding_boxes, x_min, x_max, y_min, y_max, scale, max_depth)¶ Search a quadtree for polygon or polyline bounding box intersections.
 Parameters
 quadtreecudf.DataFrame
A complete quadtree for a given areaofinterest bounding box.
 poly_bounding_boxescudf.DataFrame
Minimum bounding boxes for a set of polygons or polylines
 x_min
The lowerleft xcoordinate of the area of interest bounding box
 x_max
The upperright xcoordinate of the area of interest bounding box
 min_y
The lowerleft ycoordinate of the area of interest bounding box
 max_y
The upperright ycoordinate of the area of interest bounding box
 scale
Scale to apply to each point’s distance from
(x_min, y_min)
 max_depth
Maximum quadtree depth at which to stop testing for intersections
 Returns
 resultcudf.DataFrame
Indices for each intersecting bounding box and leaf quadrant.
 poly_offsetcudf.Series
Indices for each poly bbox that intersects with the quadtree.
 quad_offsetcudf.Series
Indices for each leaf quadrant intersecting with a poly bbox.
Notes
Swaps
min_x
andmax_x
ifmin_x > max_x
Swaps
min_y
andmax_y
ifmin_y > max_y

cuspatial.
quadtree_point_in_polygon
(poly_quad_pairs, quadtree, point_indices, points_x, points_y, poly_offsets, ring_offsets, poly_points_x, poly_points_y)¶ Test whether the specified points are inside any of the specified polygons.
Uses the table of (polygon, quadrant) pairs returned by
cuspatial.join_quadtree_and_bounding_boxes
to ensure only the points in the same quadrant as each polygon are tested for intersection.This prefiltering can dramatically reduce number of points tested per polygon, enabling faster intersectiontesting at the expense of extra memory allocated to store the quadtree and sorted point_indices.
 Parameters
 poly_quad_pairs: cudf.DataFrame
Table of (polygon, quadrant) index pairs returned by
cuspatial.join_quadtree_and_bounding_boxes
. quadtreecudf.DataFrame
A complete quadtree for a given areaofinterest bounding box.
 point_indicescudf.Series
Sorted point indices returned by
cuspatial.quadtree_on_points
 points_xcudf.Series
xcoordinates of points used to construct the quadtree.
 points_ycudf.Series
ycoordinates of points used to construct the quadtree.
 poly_offsetscudf.Series
Begin index of the first ring in each polygon.
 ring_offsetscudf.Series
Begin index of the first point in each ring.
 poly_points_xcudf.Series
Polygon point xcoodinates.
 poly_points_ycudf.Series
Polygon point ycoodinates.
 Returns
 resultcudf.DataFrame
Indices for each intersecting point and polygon pair.
 polygon_indexcudf.Series
Indices of each polygon with which a point intersected.
 point_indexcudf.Series
Indices of each point that intersects with a polygon.

cuspatial.
quadtree_point_to_nearest_polyline
(poly_quad_pairs, quadtree, point_indices, points_x, points_y, poly_offsets, poly_points_x, poly_points_y)¶ Finds the nearest polyline to each point in a quadrant, and computes the distances between each point and polyline.
Uses the table of (polyline, quadrant) pairs returned by
cuspatial.join_quadtree_and_bounding_boxes
to ensure distances are computed only for the points in the same quadrant as each polyline. Parameters
 poly_quad_pairs: cudf.DataFrame
Table of (polyline, quadrant) index pairs returned by
cuspatial.join_quadtree_and_bounding_boxes
. quadtreecudf.DataFrame
A complete quadtree for a given areaofinterest bounding box.
 point_indicescudf.Series
Sorted point indices returned by
cuspatial.quadtree_on_points
 points_xcudf.Series
xcoordinates of points used to construct the quadtree.
 points_ycudf.Series
ycoordinates of points used to construct the quadtree.
 poly_offsetscudf.Series
Begin index of the first point in each polyline.
 poly_points_xcudf.Series
Polyline point xcoodinates.
 poly_points_ycudf.Series
Polyline point ycoodinates.
 Returns
 resultcudf.DataFrame
Indices for each point and its nearest polyline, and the distance between the two.
 point_indexcudf.Series
Indices of each point that intersects with a polyline.
 polyline_indexcudf.Series
Indices of each polyline with which a point intersected.
 distancecudf.Series
Distances between each point and its nearest polyline.
Trajectory¶

cuspatial.
derive_trajectories
(object_ids, xs, ys, timestamps)¶ Derive trajectories from object ids, points, and timestamps.
 Parameters
 object_ids
column of object (e.g., vehicle) ids
 xs
column of xcoordinates (in kilometers)
 ys
column of ycoordinates (in kilometers)
 timestamps
column of timestamps in any resolution
 Returns
 resulttuple (objects, traj_offsets)
 objectscudf.DataFrame
object_ids, xs, ys, and timestamps sorted by
(object_id, timestamp)
, used bytrajectory_bounding_boxes
andtrajectory_distances_and_speeds
 traj_offsetscudf.Series
offsets of discovered trajectories
Examples
Compute sorted objects and discovered trajectories
>>> objects, traj_offsets = cuspatial.derive_trajectories( [0, 1, 2, 3], # object_id [0, 0, 1, 1], # x [0, 0, 1, 1], # y [0, 10, 0, 10] # timestamp ) >>> print(traj_offsets) 0 0 1 2 >>> print(objects) object_id x y timestamp 0 0 1 0 0 1 0 0 0 10 2 1 3 1 0 3 1 2 1 10

cuspatial.
trajectory_bounding_boxes
(num_trajectories, object_ids, xs, ys)¶ Compute the bounding boxes of sets of trajectories.
 Parameters
 num_trajectories
number of trajectories (unique object ids)
 object_ids
column of object (e.g., vehicle) ids
 xs
column of xcoordinates (in kilometers)
 ys
column of ycoordinates (in kilometers)
 Returns
 resultcudf.DataFrame
minimum bounding boxes (in kilometers) for each trajectory
 x_mincudf.Series
the minimum xcoordinate of each bounding box
 y_mincudf.Series
the minimum ycoordinate of each bounding box
 x_maxcudf.Series
the maximum xcoordinate of each bounding box
 y_maxcudf.Series
the maximum ycoordinate of each bounding box
Examples
Compute the minimum bounding boxes of derived trajectories
>>> objects, traj_offsets = trajectory.derive_trajectories( [0, 0, 1, 1], # object_id [0, 1, 2, 3], # x [0, 0, 1, 1], # y [0, 10, 0, 10] # timestamp ) >>> traj_bounding_boxes = cuspatial.trajectory_bounding_boxes( len(traj_offsets), objects['object_id'], objects['x'], objects['y'] ) >>> print(traj_bounding_boxes) x_min y_min x_max y_max 0 0.0 0.0 2.0 2.0 1 1.0 1.0 3.0 3.0

cuspatial.
trajectory_distances_and_speeds
(num_trajectories, object_ids, xs, ys, timestamps)¶ Compute the distance traveled and speed of sets of trajectories
 Parameters
 num_trajectories
number of trajectories (unique object ids)
 object_ids
column of object (e.g., vehicle) ids
 xs
column of xcoordinates (in kilometers)
 ys
column of ycoordinates (in kilometers)
 timestamps
column of timestamps in any resolution
 Returns
 resultcudf.DataFrame
 meterscudf.Series
trajectory distance (in kilometers)
 speedcudf.Series
trajectory speed (in meters/second)
Examples
Compute the distances and speeds of derived trajectories
>>> objects, traj_offsets = cuspatial.derive_trajectories(...) >>> dists_and_speeds = cuspatial.trajectory_distances_and_speeds( len(traj_offsets) objects['object_id'], objects['x'], objects['y'], objects['timestamp'] ) >>> print(dists_and_speeds) distance speed trajectory_id 0 1000.0 100000.000000 1 1000.0 111111.109375
IO¶

cuspatial.
read_polygon_shapefile
(filename)¶ Reads polygon geometry from an ESRI shapefile into GPU memory.
 Parameters
 filenamestr, pathlike
ESRI Shapefile file path (usually ends in
.shp
)
 Returns
 resulttuple (cudf.Series, cudf.Series, cudf.DataFrame)
 poly_offsetscudf.Series(dtype=np.int32)
Offsets of the first ring in each polygon
 ring_offsetscudf.Series(dtype=np.int32)
Offsets of the first point in each ring
 pointscudf.DataFrame
 DataFrame of all points in the shapefile
 xcudf.Series(dtype=np.float64)
xcomponents of each polygon’s points
 ycudf.Series(dtype=np.float64)
ycomponents of each polygon’s points