Most shapefiles acquired from various sources are ready to use with little or no editing, but some uses require changes to the shapefile itself. One such case is adding a column whose values depend on some condition. Often the conditions are complicated enough that the task cannot be done efficiently in desktop GIS software without scripting. This article outlines how to add a conditionally populated column to a shapefile using a Python script and GeoPandas.
The goal of this article is to add a new column called ACCTYPE to a roads shapefile (sourced from OpenStreetMap), based on the following conditional assignments:
- primary or primary_link: assign 0
- secondary or secondary_link: assign 1
- tertiary or tertiary_link: assign 2
- trunk or trunk_link: assign 3
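As a quick illustration, the assignment rules above can be written as a plain lookup table. This is a minimal sketch and not part of the original script: the names ACCTYPE_BY_HIGHWAY and acctype_for are hypothetical, and returning None for highway types that are not listed is an assumption made here.

```python
# Hypothetical lookup table mirroring the assignment rules listed above.
ACCTYPE_BY_HIGHWAY = {
    "primary": 0, "primary_link": 0,
    "secondary": 1, "secondary_link": 1,
    "tertiary": 2, "tertiary_link": 2,
    "trunk": 3, "trunk_link": 3,
}


def acctype_for(highway):
    """Return the ACCTYPE code for a highway value, or None if it is not listed."""
    return ACCTYPE_BY_HIGHWAY.get(highway)


print(acctype_for("secondary_link"))
print(acctype_for("residential"))
```

Writing the rules down this way makes it easy to see that each pair of highway values shares one code, which is the grouping the script reproduces with boolean conditions.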
Steps to follow:
Import required packages
import os
import geopandas as gpd
import numpy as np
from shapely import geometry
Set the working directory
os.chdir(r'D:\Codebase')
Read the shapefile with GeoPandas to view it as a DataFrame, then find the unique values in the targeted column (in this case, the highway column).
df = gpd.read_file(r'.\data\roads\roads.shp')
print(df.head())  # print the first five rows of the dataframe
highway_values = df.highway.unique()
print(sorted(highway_values))
Next, we create two lists in matching order: one holding the conditions to test, and one holding the value to assign when the corresponding condition is met.
conditions = [
    (df['highway'] == 'primary') | (df['highway'] == 'primary_link'),
    (df['highway'] == 'secondary') | (df['highway'] == 'secondary_link'),
    (df['highway'] == 'tertiary') | (df['highway'] == 'tertiary_link'),
    (df['highway'] == 'trunk') | (df['highway'] == 'trunk_link')
]
values = [0, 1, 2, 3]
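The paired equality tests above can also be expressed as membership tests, which may be easier to extend when a group grows beyond two values. The sketch below is an alternative and not the article's code: it uses a small made-up NumPy array in place of df['highway'], and the groups list is a hypothetical helper. On a pandas Series, the equivalent call would be df['highway'].isin(group).

```python
import numpy as np

# Toy stand-in for df['highway']; these values are made up for illustration.
highway = np.array(
    ["primary", "trunk_link", "residential", "secondary", "tertiary_link"]
)

# One group of highway values per ACCTYPE code, in the same order as `values`.
groups = [
    ["primary", "primary_link"],
    ["secondary", "secondary_link"],
    ["tertiary", "tertiary_link"],
    ["trunk", "trunk_link"],
]

# Each condition is a boolean array marking the rows that belong to a group.
conditions = [np.isin(highway, group) for group in groups]
values = [0, 1, 2, 3]

for group, mask in zip(groups, conditions):
    print(group, mask.tolist())
```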
Use NumPy to test the conditions and assign the matching values to a new column.
df['ACCTYPE'] = np.select(conditions, values)
print(df.head())
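One detail worth checking at this step is what happens to rows that match none of the conditions. np.select fills them with its default argument, which is 0 unless stated otherwise, so an unlisted highway type would receive the same code as primary roads. The sketch below shows the difference on a made-up three-element array; the sentinel value -1 is an assumption chosen for illustration, not something the original script uses.

```python
import numpy as np

# Toy stand-in for df['highway']; 'residential' matches none of the conditions.
highway = np.array(["primary", "residential", "trunk"])

conditions = [
    np.isin(highway, ["primary", "primary_link"]),
    np.isin(highway, ["secondary", "secondary_link"]),
    np.isin(highway, ["tertiary", "tertiary_link"]),
    np.isin(highway, ["trunk", "trunk_link"]),
]
values = [0, 1, 2, 3]

# Default behaviour: unmatched rows receive 0, the same code as 'primary'.
with_default_zero = np.select(conditions, values)

# Hypothetical sentinel: unmatched rows receive -1 and stay distinguishable.
with_sentinel = np.select(conditions, values, default=-1)

print(with_default_zero.tolist())  # [0, 0, 3]
print(with_sentinel.tolist())      # [0, -1, 3]
```

Whether a sentinel is appropriate depends on how ACCTYPE is used downstream; if the roads layer has already been filtered to the eight listed highway values, the default makes no difference.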
Make sure the data is a GeoDataFrame (gpd.read_file already returns one, so the conversion below is only a safeguard) and save it to a new .shp file.
df = gpd.GeoDataFrame(df, geometry='geometry')
df.to_file(r'.\data\roads\roads_edit.shp', driver='ESRI Shapefile')
If you inspect the new shapefile (and you should), you will find a new column called ACCTYPE holding integer values from 0 to 3. With a few tweaks, the same process can be reused to meet your specific needs.
Link to the complete code.
https://github.com/osekojoe/GISTools/blob/master/addColumnToShapefile.py
Shapefile conditionals in GeoPandas & Numpy