GeoPandas, The difference a Spatial Index Makes!

Table of Contents

Introduction

I was recreating the ArcPy Feature Vertices to Points tool using GeoPandas and Shapely and I was unimpressed with the length of time it took to perform against its ArcPy counterpart when it came to processing dangles. However, that was until I applied a spatial index to the workflow, which rapidly sped up matters. Rapidly being an understatement! Generating a spatial index with Geopandas sindex uses the rtree method. Apparently quad-tree is faster but I’m very impressed with the results here.

Dangles

Dangles in GIS

Without the Spatial Index

This took 6 minutes and 8 seconds to complete on a dataset that contained 97 records with mulitpart geometries that broke out to 1555 parts.

				
					def check_if_dangle(point, gdf):
    """
    Return a count of the linear features a point intersects, if it is more than
    one then it is not a dangle, exactly one and it is a dangle
 
    args:
        point   the point to assess if its a dangle or not
        gdf     the entire GeoDataFrame to assess the point against
 
    return
        count (int) the number of linear features a point intersects
 
    """
 
    ## Create a buffered bounding box around the point with the specified tolerance (set here as 0.01)
    buffered_bbox = point.buffer(0.01).bounds
 
    ## Convert the buffered bounding box to a Polygon
    buffered_bbox_polygon = box(*buffered_bbox)
 
    ## initiate count, which is how many lines the buffered_bbox_polygon intersects
    ## more than one and it is not a dangle
    count = 0
 
    ## for each row (linear feature in the dataset)
    for index, row, in gdf.iterrows():
        ## get the geometry of the line
        line = row["geometry"]
        ## if the buffered_bbox_polygon intersects the line, increase the count by 1
        if buffered_bbox_polygon.intersects(line):
            count += 1
 
    return count
				
			

With the Spatial Index

This only took 1.9 seconds!! to return the same output.

				
					## spatial index to rapidly speed up processing of larger datasets
spatial_index = gdf.sindex
 
def check_if_dangle(point, gdf):
    """
    Return the count of the linear features a point intersects, if it is more than
    one then it is not a dangle, exactly one and it is a dangle
 
    args:
        point   the point to assess if its a dangle or not
        gdf     the entire GeoDataFrame to assess the point against
 
    return
        count (int) the number of linear features a point intersects
 
    """
 
    ## Create a buffered bounding box around the point with the specified tolerance
    buffered_bbox = point.buffer(0.01).bounds
 
    ## Convert the buffered bounding box to a Polygon
    buffered_bbox_polygon = box(*buffered_bbox)
 
    ## Use the spatial index to efficiently find intersecting line features within the buffered bounding box
    possible_matches_index = list(spatial_index.intersection(buffered_bbox))
 
    ## initiate count, which is how many lines the buffered_bbox_polygon intersects
    ## more than one and it is not a dangle
    count = 0
 
    ## for possible match from the spatial index (for each index)
    for index in possible_matches_index:
        line = gdf.geometry.iloc[index]
        if buffered_bbox_polygon.intersects(line):
            count += 1
 
    return count
				
			

Our YouTube channel is packed with extra content for ArcPy & ArcGIS Pro, and the ArcGIS API for Python & ArcGIS Online. There are also free courses. Make sure to subscribe to be notified about newly released material and added courses. Struggling with something related to Python & ArcGIS? Let us know and we can make a tutorial video. We’re here for your learning!

Leave a Comment