Describing transit systems using GTFS data in Python

After learning R during the spring semester (and tinkering with this site), I was eager to expand my programming knowledge, which led me to GEOG 6180 Geoprocessing with Python. The course had three main objectives:

  • Learn fundamental principles of programming in Python
  • Use Python to manipulate and analyze geographic data
  • Write scripts for geoprocessing in Python and ArcGIS

I learned the basics of Python and arcpy during the first few weeks of the semester before moving onto “classic” spatial analyses, such as calculating the optimal location for a site based on the minimum weighted travel distance or determining slope and aspect from a digital elevation model, or DEM. These assignments focused on the theory and structure behind each type of analysis and used simple inputs to illustrate the basic process.

_config.yml

For my semester project, I decided to write a script to process General Transit Feed Specification data. GTFS is a standard text-based data format used by thousands of transit agencies worldwide to share operational data, including routes, schedules, live tracking, and more. A few months ago, I had seen two intriguing fellowships with the USDOT Bureau of Transportation Statistics that involved handling GTFS data, and I thought this would be the perfect opportunity to gain some firsthand experience.

I used Jupyter to write separate notebok scripts to describe a single bus system and compare two bus systems, which used the gtfs_functions package to handle the initial data processing, geopandas and pandas for data analysis, and keplergl for map visualizations. I won’t go into detail on the process since the notebooks include some commentary, but it was a great learning experience and now I have a script that can theoretically take any GTFS feed and spit out some basic statistics and maps.

I have a much stronger grasp of the basic fundamentals of programming after taking this course. Even though I had written plenty of R code during the spring, I was primarily focused on cleaning and analyzing specific datasets in a linear, step-by-step process. This course introduced functions, for loops, if/elif/else statements, and other statements or methods that will help me write more advanced scripts. Plus, knowing the basics of arcpy will be a gamechanger as I work more in ArcGIS Pro. I’m excited to keep expanding my expertise in Python, especially as I begin thinking about what my early career could look like.

Written on December 9, 2023