GeoPandas, The difference a Spatial Index Makes!

Table of Contents

Introduction

I was recreating the ArcPy Feature Vertices to Points tool using GeoPandas and Shapely and I was unimpressed with the length of time it took to perform against its ArcPy counterpart when it came to processing dangles. However, that was until I applied a spatial index to the workflow, which rapidly sped up matters. Rapidly being an understatement! Generating a spatial index with Geopandas sindex uses the rtree method. Apparently quad-tree is faster but I’m very impressed with the results here.

Dangles

Dangles in GIS

Without the Spatial Index

This took 6 minutes and 8 seconds to complete on a dataset that contained 97 records with mulitpart geometries that broke out to 1555 parts.

				
					def check_if_dangle(point, gdf):
    """
    Return a count of the linear features a point intersects, if it is more than
    one then it is not a dangle, exactly one and it is a dangle
 
    args:
        point   the point to assess if its a dangle or not
        gdf     the entire GeoDataFrame to assess the point against
 
    return
        count (int) the number of linear features a point intersects
 
    """
 
    ## Create a buffered bounding box around the point with the specified tolerance (set here as 0.01)
    buffered_bbox = point.buffer(0.01).bounds
 
    ## Convert the buffered bounding box to a Polygon
    buffered_bbox_polygon = box(*buffered_bbox)
 
    ## initiate count, which is how many lines the buffered_bbox_polygon intersects
    ## more than one and it is not a dangle
    count = 0
 
    ## for each row (linear feature in the dataset)
    for index, row, in gdf.iterrows():
        ## get the geometry of the line
        line = row["geometry"]
        ## if the buffered_bbox_polygon intersects the line, increase the count by 1
        if buffered_bbox_polygon.intersects(line):
            count += 1
 
    return count
				
			

With the Spatial Index

This only took 1.9 seconds!! to return the same output.

				
					## spatial index to rapidly speed up processing of larger datasets
spatial_index = gdf.sindex
 
def check_if_dangle(point, gdf):
    """
    Return the count of the linear features a point intersects, if it is more than
    one then it is not a dangle, exactly one and it is a dangle
 
    args:
        point   the point to assess if its a dangle or not
        gdf     the entire GeoDataFrame to assess the point against
 
    return
        count (int) the number of linear features a point intersects
 
    """
 
    ## Create a buffered bounding box around the point with the specified tolerance
    buffered_bbox = point.buffer(0.01).bounds
 
    ## Convert the buffered bounding box to a Polygon
    buffered_bbox_polygon = box(*buffered_bbox)
 
    ## Use the spatial index to efficiently find intersecting line features within the buffered bounding box
    possible_matches_index = list(spatial_index.intersection(buffered_bbox))
 
    ## initiate count, which is how many lines the buffered_bbox_polygon intersects
    ## more than one and it is not a dangle
    count = 0
 
    ## for possible match from the spatial index (for each index)
    for index in possible_matches_index:
        line = gdf.geometry.iloc[index]
        if buffered_bbox_polygon.intersects(line):
            count += 1
 
    return count
				
			

At Final Draft Mapping we provide comprehensive courses for automating tasks within ArcGIS Pro and ArcGIS Online with ArcPy and the ArcGIS API for Python. Courses range from beginner to advanced workflows and all paid courses provide extra support where you can ask questions. Automation within ArcGIS is a highly sought after skill, by adding these skills to your arsenal you are placing yourself at the forefront of that demand. 

We appreciate our blog readers, you can get 25% off any (non-sale) course at any time with the code FDMBLOG25

Leave a Comment

Your email address will not be published. Required fields are marked *