# cuSpatial API Reference¶

## GIS¶

`gis.``directed_hausdorff_distance`(ys, points_per_space)

Compute the directed Hausdorff distances between all pairs of spaces.

Parameters
xs

column of x-coordinates

ys

column of y-coordinates

points_per_space

number of points in each space

Returns
resultcudf.DataFrame

The pairwise directed distance matrix with one row and one column per input space; the value at row i, column j represents the hausdorff distance from space i to space j.

Examples

The directed Hausdorff distance from one space to another is the greatest of all the distances between any point in the first space to the closest point in the second.

Wikipedia

Consider a pair of lines on a grid:

```     :
x
-----xyy---
:
:
```

x0 = (0, 0), x1 = (0, 1)

y0 = (1, 0), y1 = (2, 0)

x0 is the closest point in `x` to `y`. The distance from x0 to the farthest point in `y` is 2.

y0 is the closest point in `y` to `x`. The distance from y0 to the farthest point in `x` is 1.414.

Compute the directed hausdorff distances between a set of spaces

```>>> result = cuspatial.directed_hausdorff_distance(
[0, 1, 0, 0], # xs
[0, 0, 1, 2], # ys
[2,    2],    # points_per_space
)
>>> print(result)
0         1
0  0.0  1.414214
1  2.0  0.000000
```
`gis.``haversine_distance`(p1_lat, p2_lon, p2_lat)

Compute the haversine distances between an arbitrary list of lon/lat pairs

Parameters
p1_lon

longitude of first set of coords

p1_lat

latitude of first set of coords

p2_lon

longitude of second set of coords

p2_lat

latitude of second set of coords

Returns
resultcudf.Series

The distance between all pairs of lon/lat coordinates

`gis.``lonlat_to_cartesian`(origin_lat, input_lon, input_lat)

Convert lon/lat to `x,y` coordinates with respect to an origin lon/lat point. Results are scaled relative to the size of the Earth in kilometers.

Parameters
origin_lon`number`

longitude offset (this is subtracted from each input before converting to x,y)

origin_lat`number`

latitude offset (this is subtracted from each input before converting to x,y)

input_lon`Series` or `list`

longitude coordinates to convert to x

input_lat`Series` or `list`

latitude coordinates to convert to y

Returns
resultcudf.DataFrame
xcudf.Series

x-coordinate of the input relative to the size of the Earth in kilometers.

ycudf.Series

y-coordinate of the input relative to the size of the Earth in kilometers.

`gis.``point_in_polygon`(test_points_y, poly_offsets, poly_ring_offsets, poly_points_x, poly_points_y)

Compute from a set of points and a set of polygons which points fall within which polygons. Note that polygons_(x,y) must be specified as closed polygons: the first and last coordinate of each polygon must be the same.

Parameters
test_points_x

x-coordinate of test points

test_points_y

y-coordinate of test points

poly_offsets

beginning index of the first ring in each polygon

poly_ring_offsets

beginning index of the first point in each ring

poly_points_x

x closed-coordinate of polygon points

poly_points_y

y closed-coordinate of polygon points

Returns
resultcudf.DataFrame

A DataFrame of boolean values indicating whether each point falls within each polygon.

Examples

Test whether 3 points fall within either of two polygons

```>>> result = cuspatial.point_in_polygon(
[0, -8, 6.0],                             # test_points_x
[0, -8, 6.0],                             # test_points_y
cudf.Series([0, 1], index=['nyc', 'hudson river']), # poly_offsets
[0, 3],                                   # ring_offsets
[-10, 5, 5, -10, 0, 10, 10, 0],           # poly_points_x
[-10, -10, 5, 5, 0, 0, 10, 10],           # poly_points_y
)
# The result of point_in_polygon is a DataFrame of Boolean
# values indicating whether each point (rows) falls within
# each polygon (columns).
>>> print(result)
nyc            hudson river
0          True          True
1          True         False
2         False          True
# Point 0: (0, 0) falls in both polygons
# Point 1: (-8, -8) falls in the first polygon
# Point 2: (6.0, 6.0) falls in the second polygon
```

note input Series x and y will not be index aligned, but computed as sequential arrays.

`gis.``polygon_bounding_boxes`(ring_offsets, xs, ys)

Compute the minimum bounding-boxes for a set of polygons.

Parameters
poly_offsets

Begin indices of the first ring in each polygon (i.e. prefix-sum)

ring_offsets

Begin indices of the first point in each ring (i.e. prefix-sum)

xs

Polygon point x-coordinates

ys

Polygon point y-coordinates

Returns
resultcudf.DataFrame

minimum bounding boxes for each polygon

x_mincudf.Series

the minimum x-coordinate of each bounding box

y_mincudf.Series

the minimum y-coordinate of each bounding box

x_maxcudf.Series

the maximum x-coordinate of each bounding box

y_maxcudf.Series

the maximum y-coordinate of each bounding box

`gis.``polyline_bounding_boxes`(xs, ys, expansion_radius)

Compute the minimum bounding-boxes for a set of polylines.

Parameters
poly_offsets

Begin indices of the first ring in each polyline (i.e. prefix-sum)

xs

Polyline point x-coordinates

ys

Polyline point y-coordinates

Returns
resultcudf.DataFrame

minimum bounding boxes for each polyline

x_mincudf.Series

the minimum x-coordinate of each bounding box

y_mincudf.Series

the minimum y-coordinate of each bounding box

x_maxcudf.Series

the maximum x-coordinate of each bounding box

y_maxcudf.Series

the maximum y-coordinate of each bounding box

## Indexing¶

`indexing.``quadtree_on_points`(ys, x_min, x_max, y_min, y_max, scale, max_depth, min_size)
Construct a quadtree from a set of points for a given area-of-interest

bounding box.

Parameters
xs

Column of x-coordinates for each point

ys

Column of y-coordinates for each point

x_min

The lower-left x-coordinate of the area of interest bounding box

x_max

The upper-right x-coordinate of the area of interest bounding box

y_min

The lower-left y-coordinate of the area of interest bounding box

y_max

The upper-right y-coordinate of the area of interest bounding box

scale

Scale to apply to each point’s distance from `(x_min, y_min)`

max_depth

min_size

Minimum number of points for a non-leaf quadtree node

Returns
resulttuple (cudf.Series, cudf.DataFrame)
keys_to_pointscudf.Series(dtype=np.int32)

A column of sorted keys to original point indices

A complete quadtree for the set of input points

keycudf.Series(dtype=np.int32)

An int32 column of quadrant keys

levelcudf.Series(dtype=np.int8)

An int8 column of quadtree levels

A boolean column indicating whether the node is a quad or leaf

lengthcudf.Series(dtype=np.int32)

If this is a non-leaf quadrant (i.e. `is_quad` is `True`), this column’s value is the number of children in the non-leaf quadrant.

Otherwise this column’s value is the number of points contained in the leaf quadrant.

offsetcudf.Series(dtype=np.int32)

If this is a non-leaf quadrant (i.e. `is_quad` is `True`), this column’s value is the position of the non-leaf quadrant’s first child.

Otherwise this column’s value is the position of the leaf quadrant’s first point.

Notes

• Swaps `min_x` and `max_x` if `min_x > max_x`

• Swaps `min_y` and `max_y` if `min_y > max_y`

Examples

An example of selecting the `min_size` and `scale` based on input:

```>>> np.random.seed(0)
>>> points = cudf.DataFrame({
"x": cudf.Series(np.random.normal(size=120)) * 500,
"y": cudf.Series(np.random.normal(size=120)) * 500,
})

>>> max_depth = 3
>>> min_size = 50
>>> min_x, min_y, max_x, max_y = (points["x"].min(),
points["y"].min(),
points["x"].max(),
points["y"].max())
>>> scale = max(max_x - min_x, max_y - min_y) // (1 << max_depth)
>>> print(
"min_size:   " + str(min_size) + "\n"
"num_points: " + str(len(points)) + "\n"
"min_x:      " + str(min_x) + "\n"
"max_x:      " + str(max_x) + "\n"
"min_y:      " + str(min_y) + "\n"
"max_y:      " + str(max_y) + "\n"
"scale:      " + str(scale) + "\n"
)
min_size:   50
num_points: 120
min_x:      -1577.4949079170394
max_x:      1435.877311993804
min_y:      -1412.7015761122134
max_y:      1492.572387431971
scale:      301.0

points["x"],
points["y"],
min_x,
max_x,
min_y,
max_y,
scale, max_depth, min_size
)

0     0      0    False      15       0
1     1      0    False      27      15
2     2      0    False      12      42
3     3      0     True       4       8
4     4      0    False       5     106
5     6      0    False       6     111
6     9      0    False       2     117
7    12      0    False       1     119
8    12      1    False      22      54
9    13      1    False      18      76
10   14      1    False       9      94
11   15      1    False       3     103

>>> print(key_to_point)
0       63
1       20
2       33
3       66
4       19
...
115    113
116      3
117     78
118     98
119     24
Length: 120, dtype: int32
```

## Interpolation¶

class `cuspatial.core.interpolate.``CubicSpline`(t, y, ids=None, size=None, prefixes=None)

Fits each column of the input Series y to a hermetic cubic spline.

`cuspatial.CubicSpline` supports two usage patterns: The first is identical to scipy.interpolate.CubicSpline:

```curve = cuspatial.CubicSpline(t, y)
new_points = curve(np.linspace(t.min, t.max, 50))
```

This allows API parity with scipy. This isn’t recommended, as scipy host based interpolation performance is likely to exceed GPU performance for a single curve.

However, cuSpatial massively outperforms scipy when many splines are fit simultaneously. Data must be arranged in a SoA format, and the exclusive prefix_sum of the separate curves must also be passed to the function.:

```t = cudf.Series(np.repeat(cp.arange(100), 1000)).astype('float32')
y = cudf.Series(np.random.random(100*1000)).astype('float32')
prefix_sum = cudf.Series(cp.arange(1000)*100).astype('int32')
new_samples = cudf.Series(
np.repeat(np.linspace(0, 100, 1000), 1000)
).astype('float32')

curve = cuspatial.CubicSpline(t, y, prefixes=prefix_sum)
new_points = curve(new_samples, prefix_sum*10)
```

Methods

 `__call__`(coordinates[, groups]) Interpolates new input values coordinates using the .c DataFrame or map of DataFrames.
`CubicSpline.``__init__`(t, y, ids=None, size=None, prefixes=None)

Computes various error preconditions on the input data, then uses CUDA to compute cubic splines for each set of input coordinates on the GPU in parallel.

Parameters
tcudf.Series

time sample values. Must be monotonically increasing.

ycudf.Series

columns to have curves fit to according to x

ids (Optional)cudf.Series

ids of each spline

size (Optional)cudf.Series

fixed size of each spline

prefixes (Optional)cudf.Series

alternative to size, allows splines of varying length. Not yet fully supported.

Returns
CubicSplinecallable o

`o.c` contains the coefficients that can be used to compute new points along the spline fitting the original `t` data. `o(n)` interpolates the spline coordinates along new input values `n`.

`CubicSpline.``__call__`(coordinates, groups=None)

Interpolates new input values coordinates using the .c DataFrame or map of DataFrames.

## Spatial Querying¶

`spatial_window.``points_in_spatial_window`(max_x, min_y, max_y, xs, ys)

Return only the subset of coordinates that fall within a rectangular window.

A point (x, y) is inside the query window if and only if `min_x < x < max_x AND min_y < y < max_y`

The window is specified by minimum and maximum x and y coordinates.

Parameters
min_x

lower x-coordinate of the query window

max_x

upper x-coordinate of the query window

min_y

lower y-coordinate of the query window

max_y

upper y-coordinate of the query window

xs

column of x-coordinates that may fall within the window

ys

column of y-coordinates that may fall within the window

Returns
resultcudf.DataFrame

subset of (x, y) pairs above that fall within the window

Notes

• Swaps `min_x` and `max_x` if `min_x > max_x`

• Swaps `min_y` and `max_y` if `min_y > max_y`

## Spatial Joining¶

`spatial_join.``join_quadtree_and_bounding_boxes`(poly_bounding_boxes, x_min, x_max, y_min, y_max, scale, max_depth)

Search a quadtree for polygon or polyline bounding box intersections.

Parameters

A complete quadtree for a given area-of-interest bounding box.

poly_bounding_boxescudf.DataFrame

Minimum bounding boxes for a set of polygons or polylines

x_min

The lower-left x-coordinate of the area of interest bounding box

x_max

The upper-right x-coordinate of the area of interest bounding box

min_y

The lower-left y-coordinate of the area of interest bounding box

max_y

The upper-right y-coordinate of the area of interest bounding box

scale

Scale to apply to each point’s distance from `(x_min, y_min)`

max_depth

Maximum quadtree depth at which to stop testing for intersections

Returns
resultcudf.DataFrame

Indices for each intersecting bounding box and leaf quadrant.

poly_offsetcudf.Series

Indices for each poly bbox that intersects with the quadtree.

Indices for each leaf quadrant intersecting with a poly bbox.

Notes

• Swaps `min_x` and `max_x` if `min_x > max_x`

• Swaps `min_y` and `max_y` if `min_y > max_y`

`spatial_join.``quadtree_point_in_polygon`(quadtree, point_indices, points_x, points_y, poly_offsets, ring_offsets, poly_points_x, poly_points_y)

Test whether the specified points are inside any of the specified polygons.

Uses the table of (polygon, quadrant) pairs returned by `cuspatial.join_quadtree_and_bounding_boxes` to ensure only the points in the same quadrant as each polygon are tested for intersection.

This pre-filtering can dramatically reduce number of points tested per polygon, enabling faster intersection-testing at the expense of extra memory allocated to store the quadtree and sorted point_indices.

Parameters

Table of (polygon, quadrant) index pairs returned by `cuspatial.join_quadtree_and_bounding_boxes`.

A complete quadtree for a given area-of-interest bounding box.

point_indicescudf.Series

Sorted point indices returned by `cuspatial.quadtree_on_points`

points_xcudf.Series

x-coordinates of points used to construct the quadtree.

points_ycudf.Series

y-coordinates of points used to construct the quadtree.

poly_offsetscudf.Series

Begin index of the first ring in each polygon.

ring_offsetscudf.Series

Begin index of the first point in each ring.

poly_points_xcudf.Series

Polygon point x-coodinates.

poly_points_ycudf.Series

Polygon point y-coodinates.

Returns
resultcudf.DataFrame

Indices for each intersecting point and polygon pair.

point_offsetcudf.Series

Indices of each point that intersects with a polygon.

polygon_offsetcudf.Series

Indices of each polygon with which a point intersected.

`spatial_join.``quadtree_point_to_nearest_polyline`(quadtree, point_indices, points_x, points_y, poly_offsets, poly_points_x, poly_points_y)

Finds the nearest polyline to each point in a quadrant, and computes the distances between each point and polyline.

Uses the table of (polyline, quadrant) pairs returned by `cuspatial.join_quadtree_and_bounding_boxes` to ensure distances are computed only for the points in the same quadrant as each polyline.

Parameters

Table of (polyline, quadrant) index pairs returned by `cuspatial.join_quadtree_and_bounding_boxes`.

A complete quadtree for a given area-of-interest bounding box.

point_indicescudf.Series

Sorted point indices returned by `cuspatial.quadtree_on_points`

points_xcudf.Series

x-coordinates of points used to construct the quadtree.

points_ycudf.Series

y-coordinates of points used to construct the quadtree.

poly_offsetscudf.Series

Begin index of the first point in each polyline.

poly_points_xcudf.Series

Polyline point x-coodinates.

poly_points_ycudf.Series

Polyline point y-coodinates.

Returns
resultcudf.DataFrame

Indices for each point and its nearest polyline, and the distance between the two.

point_offsetcudf.Series

Indices of each point that intersects with a polyline.

polyline_offsetcudf.Series

Indices of each polyline with which a point intersected.

distancecudf.Series

Distances between each point and its nearest polyline.

## Trajectory¶

`trajectory.``derive_trajectories`(xs, ys, timestamps)

Derive trajectories from object ids, points, and timestamps.

Parameters
object_ids

column of object (e.g., vehicle) ids

xs

column of x-coordinates (in kilometers)

ys

column of y-coordinates (in kilometers)

timestamps

column of timestamps in any resolution

Returns
resulttuple (objects, traj_offsets)
objectscudf.DataFrame

object_ids, xs, ys, and timestamps sorted by `(object_id, timestamp)`, used by `trajectory_bounding_boxes` and `trajectory_distances_and_speeds`

traj_offsetscudf.Series

offsets of discovered trajectories

Examples

Compute sorted objects and discovered trajectories

```>>> objects, traj_offsets = cuspatial.derive_trajectories(
[0, 1, 2, 3],  # object_id
[0, 0, 1, 1],  # x
[0, 0, 1, 1],  # y
[0, 10, 0, 10] # timestamp
)
>>> print(traj_offsets)
0  0
1  2
>>> print(objects)
object_id       x       y  timestamp
0          0       1       0          0
1          0       0       0         10
2          1       3       1          0
3          1       2       1         10
```
`trajectory.``trajectory_bounding_boxes`(object_ids, xs, ys)

Compute the bounding boxes of sets of trajectories.

Parameters
num_trajectories

number of trajectories (unique object ids)

object_ids

column of object (e.g., vehicle) ids

xs

column of x-coordinates (in kilometers)

ys

column of y-coordinates (in kilometers)

Returns
resultcudf.DataFrame

minimum bounding boxes (in kilometers) for each trajectory

x_mincudf.Series

the minimum x-coordinate of each bounding box

y_mincudf.Series

the minimum y-coordinate of each bounding box

x_maxcudf.Series

the maximum x-coordinate of each bounding box

y_maxcudf.Series

the maximum y-coordinate of each bounding box

Examples

Compute the minimum bounding boxes of derived trajectories

```>>> objects, traj_offsets = trajectory.derive_trajectories(
[0, 0, 1, 1],  # object_id
[0, 1, 2, 3],  # x
[0, 0, 1, 1],  # y
[0, 10, 0, 10] # timestamp
)
>>> traj_bounding_boxes = cuspatial.trajectory_bounding_boxes(
len(traj_offsets),
objects['object_id'],
objects['x'],
objects['y']
)
>>> print(traj_bounding_boxes)
x_min   y_min   x_max   y_max
0     0.0     0.0     2.0     2.0
1     1.0     1.0     3.0     3.0
```
`trajectory.``trajectory_distances_and_speeds`(object_ids, xs, ys, timestamps)

Compute the distance traveled and speed of sets of trajectories

Parameters
num_trajectories

number of trajectories (unique object ids)

object_ids

column of object (e.g., vehicle) ids

xs

column of x-coordinates (in kilometers)

ys

column of y-coordinates (in kilometers)

timestamps

column of timestamps in any resolution

Returns
resultcudf.DataFrame
meterscudf.Series

trajectory distance (in kilometers)

speedcudf.Series

trajectory speed (in meters/second)

Examples

Compute the distances and speeds of derived trajectories

```>>> objects, traj_offsets = cuspatial.derive_trajectories(...)
>>> dists_and_speeds = cuspatial.trajectory_distances_and_speeds(
len(traj_offsets)
objects['object_id'],
objects['x'],
objects['y'],
objects['timestamp']
)
>>> print(dists_and_speeds)
distance          speed
trajectory_id
0                1000.0  100000.000000
1                1000.0  111111.109375
```

## IO¶

`shapefile.``read_polygon_shapefile`()

Reads polygon geometry from an ESRI shapefile into GPU memory.

Parameters
filenamestr, pathlike

ESRI Shapefile file path (usually ends in `.shp`)

Returns
resulttuple (cudf.Series, cudf.Series, cudf.DataFrame)
poly_offsetscudf.Series(dtype=np.int32)

Offsets of the first ring in each polygon

ring_offsetscudf.Series(dtype=np.int32)

Offsets of the first point in each ring

pointscudf.DataFrame
DataFrame of all points in the shapefile
xcudf.Series(dtype=np.float64)

x-components of each polygon’s points

ycudf.Series(dtype=np.float64)

y-components of each polygon’s points