Measuring Geographic Distributions with GeoPandas: Central Feature

Introduction

The Central Feature is the feature with the smallest total distance to all other features in the dataset, which makes it the most centrally located feature. It can be used to find the most accessible feature, for example the most accessible school at which to hold a training day for teachers from schools in a given area.

Sources:
The Esri Guide to GIS Analysis, Volume 2: Spatial Measurements and Statistics.
An Introduction to Statistical Problem Solving in Geography

The Formula

For each feature calculate the total distance to all other features. The feature that has the shortest total distance is the Central Feature.
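
As a rough illustration of the rule above, here is a minimal sketch using plain coordinate pairs and only the Python standard library. The feature names and coordinates are made up for the example, and straight-line (Euclidean) distance in projected units is assumed.

```python
from math import dist

# Hypothetical sample coordinates (x, y) in projected units, for illustration only.
points = {
    "A": (0.0, 0.0),
    "B": (4.0, 0.0),
    "C": (2.0, 1.0),
    "D": (2.0, 5.0),
}

# Total straight-line distance from each feature to every other feature.
# The distance from a feature to itself is 0, so including it does not change the sum.
total_distance = {
    name: sum(dist(xy, other) for other in points.values())
    for name, xy in points.items()
}

# The Central Feature is the one with the smallest total distance.
central = min(total_distance, key=total_distance.get)
print(central, round(total_distance[central], 3))  # prints: C 8.472
```

Feature C wins because it sits between the other three, even though it is not at the average of their coordinates.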

For Point features, the X and Y coordinates of each feature are used. For Polygon features, the centroid of each feature supplies the X and Y coordinates, and for Linear features the mid-point of each line is used.
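
To make the mid-point of a line concrete, the sketch below finds the point halfway along a line's length, which is what `interpolate(0.5, normalized=True)` returns in the script further down. The helper name `line_midpoint` and the sample vertices are hypothetical, and this is only a standard-library illustration of the idea, not how Shapely implements it.

```python
from math import dist

def line_midpoint(coords):
    """Return the (x, y) point halfway along a polyline's length."""
    # Length of each segment between consecutive vertices.
    segments = list(zip(coords, coords[1:]))
    seg_lengths = [dist(a, b) for a, b in segments]

    # Walk along the line until half of the total length is used up.
    remaining = sum(seg_lengths) / 2.0
    for (a, b), length in zip(segments, seg_lengths):
        if length > 0 and remaining <= length:
            t = remaining / length  # fraction of the way along this segment
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        remaining -= length

    # Fallback for a single vertex or floating-point rounding at the very end.
    return coords[-1]

# An L-shaped line: 8 units east, then 4 units north (total length 12).
line = [(0.0, 0.0), (8.0, 0.0), (8.0, 4.0)]
print(line_midpoint(line))  # prints: (6.0, 0.0)
```

Note that the result lies on the line itself, 6 units along it, whereas simply averaging the two end vertices would give (4.0, 2.0), which is not on the line.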

Using GeoPandas to Calculate the Central Feature

The code below uses GeoPandas and Shapely to find the central feature for a dataset and create an output file. In our example we will use a Shapefile, but you can use any input and output filetypes that you have available with your GeoPandas setup. 

The code is heavily commented to make the workflow easy to follow. For Point and Polygon features we use the centroid. For a Point Shapefile you could use the geometry itself, since the centroid of a Point is that same Point, but storing a representative Point in a column called “point” for every geometry type keeps the “total_distance” calculation to one line of code. For Polyline features we use the midpoint of each line.

We calculate the total distance from each point to all other points and then find the point with the smallest total distance, which represents the central feature. Several features could share the same smallest total distance, so we account for this. Distances are measured in the units of the layer's coordinate reference system, so the input should be in a projected coordinate system.
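
The script below selects ties with an exact `==` comparison. Because total distances are floating-point sums, two features that tie mathematically can differ in the last decimal places. The following is a hedged sketch of one optional alternative, comparing with a small tolerance via `math.isclose`. The function name `central_features` and the `rel_tol` default are assumptions for illustration, not part of the original script.

```python
from math import dist, isclose

def central_features(points, rel_tol=1e-9):
    """Return the indices of every point whose total distance ties for the minimum."""
    # Total distance from each point to all points (self-distance is 0).
    totals = [sum(dist(p, q) for q in points) for p in points]
    smallest = min(totals)
    # Keep every point whose total matches the minimum within the tolerance.
    return [i for i, t in enumerate(totals) if isclose(t, smallest, rel_tol=rel_tol)]

# Four corners of a square: by symmetry every corner has the same total distance.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(central_features(square))  # prints: [0, 1, 2, 3]
```

For most real datasets exact ties are rare, so the exact comparison and the tolerant one will normally select the same single feature.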

Lastly, we export the central feature(s) from the original input Shapefile to a new Shapefile.

				
import geopandas as gpd

## input shapefile path
in_shp = r"path\to\input\shapefile\input.shp"

## the output shapefile path for the central feature point
out_shp = r"path\to\output\shapefile\output.shp"

## read in the shapefile to a GeoDataFrame
gdf = gpd.read_file(in_shp)

## get the geometry type from the first record
## (assumes every feature in the file shares one geometry type)
geom_type = gdf.geom_type.iloc[0]

## for Point and Polygon geometry get the centroid
if geom_type in ("Point", "Polygon"):
    ## get the centroid of each feature as a Point geometry
    gdf["point"] = gdf.geometry.centroid

## for LineString geometry get the midpoint
elif geom_type == "LineString":
    ## get the midpoint of each line as a Point geometry
    gdf["point"] = gdf.geometry.interpolate(0.5, normalized=True)

## stop early for any geometry type this script does not handle
else:
    raise ValueError(f"Unsupported geometry type: {geom_type}")

## calculate the total distance for each point to all other points
## https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoSeries.distance.html
gdf['total_distance'] = gdf["point"].apply(lambda geom: gdf["point"].distance(geom).sum())

## get the value of the minimum distance
min_distance = gdf["total_distance"].min()

## there could be multiple central features with the same smallest
## cumulative distance
central_feature = gdf[gdf["total_distance"] == min_distance]

## sanitize the central feature(s) gdf and make ready for output
central_feature = central_feature.drop(columns=["point", "total_distance"])

## write the central feature(s) to the output shapefile
central_feature.to_file(out_shp, driver="ESRI Shapefile")
				
			

Central Feature in Action

Primary school location data was downloaded from the Department of Education (Ireland) and processed to contain only the primary schools in County Kildare, in a projected coordinate system: Irish Transverse Mercator (EPSG:2157). You can download the Shapefile containing the data used below here.

Primary Schools Kildare from Department of Education

Running the script produces a Shapefile that contains the Central Feature from the original Primary Schools Shapefile.

Primary School Central Feature

Below is a comparison between our GeoPandas tool and the Central Feature tool output from ArcGIS Pro. Spot on!

Central Feature Geopandas
Central Feature ArcGIS Pro
