Spatial Indexing#

Spatial indexing functions provide GPU-accelerated point-in-polygon and spatial join operations.

cuspatial.quadtree_point_in_polygon(poly_quad_pairs, quadtree, point_indices, points_x, points_y, poly_offsets, ring_offsets, poly_points_x, poly_points_y)#

Test whether the specified points are inside any of the specified polygons.

Uses the table of (polygon, quadrant) pairs returned by cuspatial.join_quadtree_and_bounding_boxes to ensure only the points in the same quadrant as each polygon are tested for intersection.

This pre-filtering can dramatically reduce number of points tested per polygon, enabling faster intersection-testing at the expense of extra memory allocated to store the quadtree and sorted point_indices.

Parameters
poly_quad_pairs: cudf.DataFrame

Table of (polygon, quadrant) index pairs returned by cuspatial.join_quadtree_and_bounding_boxes.

quadtreecudf.DataFrame

A complete quadtree for a given area-of-interest bounding box.

point_indicescudf.Series

Sorted point indices returned by cuspatial.quadtree_on_points

points_xcudf.Series

x-coordinates of points used to construct the quadtree.

points_ycudf.Series

y-coordinates of points used to construct the quadtree.

poly_offsetscudf.Series

Begin index of the first ring in each polygon.

ring_offsetscudf.Series

Begin index of the first point in each ring.

poly_points_xcudf.Series

Polygon point x-coodinates.

poly_points_ycudf.Series

Polygon point y-coodinates.

Returns
resultcudf.DataFrame

Indices for each intersecting point and polygon pair.

polygon_indexcudf.Series

Indices of each polygon with which a point intersected.

point_indexcudf.Series

Indices of each point that intersects with a polygon.

cuspatial.quadtree_point_to_nearest_polyline(poly_quad_pairs, quadtree, point_indices, points_x, points_y, poly_offsets, poly_points_x, poly_points_y)#

Finds the nearest polyline to each point in a quadrant, and computes the distances between each point and polyline.

Uses the table of (polyline, quadrant) pairs returned by cuspatial.join_quadtree_and_bounding_boxes to ensure distances are computed only for the points in the same quadrant as each polyline.

Parameters
poly_quad_pairs: cudf.DataFrame

Table of (polyline, quadrant) index pairs returned by cuspatial.join_quadtree_and_bounding_boxes.

quadtreecudf.DataFrame

A complete quadtree for a given area-of-interest bounding box.

point_indicescudf.Series

Sorted point indices returned by cuspatial.quadtree_on_points

points_xcudf.Series

x-coordinates of points used to construct the quadtree.

points_ycudf.Series

y-coordinates of points used to construct the quadtree.

poly_offsetscudf.Series

Begin index of the first point in each polyline.

poly_points_xcudf.Series

Polyline point x-coodinates.

poly_points_ycudf.Series

Polyline point y-coodinates.

Returns
resultcudf.DataFrame

Indices for each point and its nearest polyline, and the distance between the two.

point_indexcudf.Series

Indices of each point that intersects with a polyline.

polyline_indexcudf.Series

Indices of each polyline with which a point intersected.

distancecudf.Series

Distances between each point and its nearest polyline.

cuspatial.point_in_polygon(test_points_x, test_points_y, poly_offsets, poly_ring_offsets, poly_points_x, poly_points_y)#

Compute from a set of points and a set of polygons which points fall within which polygons. Note that polygons_(x,y) must be specified as closed polygons: the first and last coordinate of each polygon must be the same.

Parameters
test_points_x

x-coordinate of test points

test_points_y

y-coordinate of test points

poly_offsets

beginning index of the first ring in each polygon

poly_ring_offsets

beginning index of the first point in each ring

poly_points_x

x closed-coordinate of polygon points

poly_points_y

y closed-coordinate of polygon points

Returns
resultcudf.DataFrame

A DataFrame of boolean values indicating whether each point falls within each polygon.

Examples

Test whether 3 points fall within either of two polygons

>>> result = cuspatial.point_in_polygon(
    [0, -8, 6.0],                             # test_points_x
    [0, -8, 6.0],                             # test_points_y
    cudf.Series([0, 1], index=['nyc', 'hudson river']), # poly_offsets
    [0, 3],                                   # ring_offsets
    [-10, 5, 5, -10, 0, 10, 10, 0],           # poly_points_x
    [-10, -10, 5, 5, 0, 0, 10, 10],           # poly_points_y
)
# The result of point_in_polygon is a DataFrame of Boolean
# values indicating whether each point (rows) falls within
# each polygon (columns).
>>> print(result)
            nyc            hudson river
0          True          True
1          True         False
2         False          True
# Point 0: (0, 0) falls in both polygons
# Point 1: (-8, -8) falls in the first polygon
# Point 2: (6.0, 6.0) falls in the second polygon

note input Series x and y will not be index aligned, but computed as sequential arrays.

note poly_ring_offsets must contain only the rings that make up the polygons indexed by poly_offsets. If there are rings in poly_ring_offsets that are not part of the polygons in poly_offsets, results are likely to be incorrect and behavior is undefined.

cuspatial.polygon_bounding_boxes(poly_offsets, ring_offsets, xs, ys)#

Compute the minimum bounding-boxes for a set of polygons.

Parameters
poly_offsets

Begin indices of the first ring in each polygon (i.e. prefix-sum)

ring_offsets

Begin indices of the first point in each ring (i.e. prefix-sum)

xs

Polygon point x-coordinates

ys

Polygon point y-coordinates

Returns
resultcudf.DataFrame

minimum bounding boxes for each polygon

x_mincudf.Series

the minimum x-coordinate of each bounding box

y_mincudf.Series

the minimum y-coordinate of each bounding box

x_maxcudf.Series

the maximum x-coordinate of each bounding box

y_maxcudf.Series

the maximum y-coordinate of each bounding box

cuspatial.polyline_bounding_boxes(poly_offsets, xs, ys, expansion_radius)#

Compute the minimum bounding-boxes for a set of polylines.

Parameters
poly_offsets

Begin indices of the first ring in each polyline (i.e. prefix-sum)

xs

Polyline point x-coordinates

ys

Polyline point y-coordinates

expansion_radius

radius of each polyline point

Returns
resultcudf.DataFrame

minimum bounding boxes for each polyline

x_mincudf.Series

the minimum x-coordinate of each bounding box

y_mincudf.Series

the minimum y-coordinate of each bounding box

x_maxcudf.Series

the maximum x-coordinate of each bounding box

y_maxcudf.Series

the maximum y-coordinate of each bounding box

cuspatial.quadtree_on_points(xs, ys, x_min, x_max, y_min, y_max, scale, max_depth, min_size)#
Construct a quadtree from a set of points for a given area-of-interest

bounding box.

Parameters
xs

Column of x-coordinates for each point

ys

Column of y-coordinates for each point

x_min

The lower-left x-coordinate of the area of interest bounding box

x_max

The upper-right x-coordinate of the area of interest bounding box

y_min

The lower-left y-coordinate of the area of interest bounding box

y_max

The upper-right y-coordinate of the area of interest bounding box

scale

Scale to apply to each point’s distance from (x_min, y_min)

max_depth

Maximum quadtree depth

min_size

Minimum number of points for a non-leaf quadtree node

Returns
resulttuple (cudf.Series, cudf.DataFrame)
keys_to_pointscudf.Series(dtype=np.int32)

A column of sorted keys to original point indices

quadtreecudf.DataFrame

A complete quadtree for the set of input points

keycudf.Series(dtype=np.int32)

An int32 column of quadrant keys

levelcudf.Series(dtype=np.int8)

An int8 column of quadtree levels

is_quadcudf.Series(dtype=np.bool_)

A boolean column indicating whether the node is a quad or leaf

lengthcudf.Series(dtype=np.int32)

If this is a non-leaf quadrant (i.e. is_quad is True), this column’s value is the number of children in the non-leaf quadrant.

Otherwise this column’s value is the number of points contained in the leaf quadrant.

offsetcudf.Series(dtype=np.int32)

If this is a non-leaf quadrant (i.e. is_quad is True), this column’s value is the position of the non-leaf quadrant’s first child.

Otherwise this column’s value is the position of the leaf quadrant’s first point.

Notes

  • Swaps min_x and max_x if min_x > max_x

  • Swaps min_y and max_y if min_y > max_y

Examples

An example of selecting the min_size and scale based on input:

>>> np.random.seed(0)
>>> points = cudf.DataFrame({
        "x": cudf.Series(np.random.normal(size=120)) * 500,
        "y": cudf.Series(np.random.normal(size=120)) * 500,
    })

>>> max_depth = 3
>>> min_size = 50
>>> min_x, min_y, max_x, max_y = (points["x"].min(),
                                  points["y"].min(),
                                  points["x"].max(),
                                  points["y"].max())
>>> scale = max(max_x - min_x, max_y - min_y) // (1 << max_depth)
>>> print(
        "min_size:   " + str(min_size) + "\n"
        "num_points: " + str(len(points)) + "\n"
        "min_x:      " + str(min_x) + "\n"
        "max_x:      " + str(max_x) + "\n"
        "min_y:      " + str(min_y) + "\n"
        "max_y:      " + str(max_y) + "\n"
        "scale:      " + str(scale) + "\n"
    )
min_size:   50
num_points: 120
min_x:      -1577.4949079170394
max_x:      1435.877311993804
min_y:      -1412.7015761122134
max_y:      1492.572387431971
scale:      301.0

>>> key_to_point, quadtree = cuspatial.quadtree_on_points(
        points["x"],
        points["y"],
        min_x,
        max_x,
        min_y,
        max_y,
        scale, max_depth, min_size
    )

>>> print(quadtree)
    key  level  is_quad  length  offset
0     0      0    False      15       0
1     1      0    False      27      15
2     2      0    False      12      42
3     3      0     True       4       8
4     4      0    False       5     106
5     6      0    False       6     111
6     9      0    False       2     117
7    12      0    False       1     119
8    12      1    False      22      54
9    13      1    False      18      76
10   14      1    False       9      94
11   15      1    False       3     103

>>> print(key_to_point)
0       63
1       20
2       33
3       66
4       19
    ...
115    113
116      3
117     78
118     98
119     24
Length: 120, dtype: int32
cuspatial.join_quadtree_and_bounding_boxes(quadtree, poly_bounding_boxes, x_min, x_max, y_min, y_max, scale, max_depth)#

Search a quadtree for polygon or polyline bounding box intersections.

Parameters
quadtreecudf.DataFrame

A complete quadtree for a given area-of-interest bounding box.

poly_bounding_boxescudf.DataFrame

Minimum bounding boxes for a set of polygons or polylines

x_min

The lower-left x-coordinate of the area of interest bounding box

x_max

The upper-right x-coordinate of the area of interest bounding box

min_y

The lower-left y-coordinate of the area of interest bounding box

max_y

The upper-right y-coordinate of the area of interest bounding box

scale

Scale to apply to each point’s distance from (x_min, y_min)

max_depth

Maximum quadtree depth at which to stop testing for intersections

Returns
resultcudf.DataFrame

Indices for each intersecting bounding box and leaf quadrant.

poly_offsetcudf.Series

Indices for each poly bbox that intersects with the quadtree.

quad_offsetcudf.Series

Indices for each leaf quadrant intersecting with a poly bbox.

Notes

  • Swaps min_x and max_x if min_x > max_x

  • Swaps min_y and max_y if min_y > max_y

cuspatial.points_in_spatial_window(min_x, max_x, min_y, max_y, xs, ys)#

Return only the subset of coordinates that fall within a rectangular window.

A point (x, y) is inside the query window if and only if min_x < x < max_x AND min_y < y < max_y

The window is specified by minimum and maximum x and y coordinates.

Parameters
min_x

lower x-coordinate of the query window

max_x

upper x-coordinate of the query window

min_y

lower y-coordinate of the query window

max_y

upper y-coordinate of the query window

xs

column of x-coordinates that may fall within the window

ys

column of y-coordinates that may fall within the window

Returns
resultcudf.DataFrame

subset of (x, y) pairs above that fall within the window

Notes

  • Swaps min_x and max_x if min_x > max_x

  • Swaps min_y and max_y if min_y > max_y