NYU Geospatial Work

I just wrapped up a 2-year stint with New York University’s Digital Library department, mostly helping them with their geospatial data but also working on some ancillary projects.

GeoBlacklight

I was initially brought on to help with their GeoBlacklight instance, the NYU Spatial Data Repository. GBL is a Ruby on Rails project that sits in front of Solr and makes geospatial data fast to search and easy to discover through a map interface. The combination of Ruby/Rails and geospatial data was right up my alley.

NYU's Spatial Data Repository

I upgraded their instance of GBL to version 4, made functional improvements, fixed bugs, addressed CVEs, integrated New Relic, and updated the Capistrano deployment. I also did some exploratory work with Anubis, along with additional work to deter bots and crawlers, and worked on an Ansible playbook to automate setting up GeoServer on Rocky Linux.

OpenGeoMetadata, GeoServer & PostGIS

The source of their data for the Spatial Data Repository is a collection of JSON files using the OGM Aardvark Schema, found in the edu.nyu OGM Repository.

To help manage this data, I created a Python command-line tool called sdr-data-loader that automates loading Shapefiles into PostGIS and publishing them to GeoServer.
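As a rough illustration of that load-and-publish flow (not the actual sdr-data-loader code), the two steps can be sketched as an ogr2ogr invocation followed by a request to GeoServer's REST API. The function names, the `example_layer` table, the `sdr` workspace and the `postgis` datastore below are all hypothetical, and the sketch only builds the command and request rather than running them.

```python
def build_ogr2ogr_command(shapefile: str, table: str, pg_dsn: str) -> list[str]:
    """Build the ogr2ogr argument list that loads one Shapefile into PostGIS."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",          # output driver
        f"PG:{pg_dsn}",              # destination database connection string
        shapefile,                   # source Shapefile
        "-nln", table,               # name of the table to create
        "-nlt", "PROMOTE_TO_MULTI",  # avoid mixed single/multi geometry errors
        "-lco", "GEOMETRY_NAME=geom",
        "-overwrite",                # replace the table if it already exists
    ]


def build_publish_request(base_url: str, workspace: str, datastore: str, table: str):
    """Build the URL and JSON body that publish a PostGIS table as a GeoServer layer."""
    url = (
        f"{base_url.rstrip('/')}/rest/workspaces/{workspace}"
        f"/datastores/{datastore}/featuretypes"
    )
    payload = {"featureType": {"name": table, "nativeName": table}}
    return url, payload


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(build_ogr2ogr_command("example.shp", "example_layer", "host=localhost dbname=sdr"))
    print(build_publish_request("http://localhost:8080/geoserver", "sdr", "postgis", "example_layer"))
```

In a real tool the command would be handed to something like `subprocess.run(cmd, check=True)` and the payload POSTed to GeoServer with credentials; keeping the builders separate from the side effects makes them easy to test.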

I took this opportunity to generate an “audit report” to discover and diagnose records with issues such as missing database records, unpublished layers, or invalid IDs. In the end, that report helped me fix about 1600 records.
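The idea behind an audit like this boils down to comparing three lists: the record IDs in the metadata, the tables in PostGIS, and the layers published in GeoServer. Here is a minimal sketch of that comparison. It assumes a record's ID doubles as its table and layer name, and the ID pattern is a made-up stand-in, so treat both as assumptions rather than how the real report worked.

```python
import re

# Hypothetical ID rule: lowercase letters and digits separated by single hyphens.
ID_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def audit(record_ids, db_tables, published_layers):
    """Return record IDs grouped by the kind of problem found."""
    records = set(record_ids)
    invalid = {r for r in records if not ID_PATTERN.match(r)}
    valid = records - invalid  # only check well-formed IDs against the backends
    return {
        "invalid_ids": sorted(invalid),
        "missing_tables": sorted(valid - set(db_tables)),
        "unpublished_layers": sorted(valid - set(published_layers)),
    }


if __name__ == "__main__":
    report = audit(
        record_ids=["layer-a", "layer-b", "Bad ID"],
        db_tables=["layer-a"],
        published_layers=["layer-a", "layer-b"],
    )
    for problem, ids in report.items():
        print(problem, ids)
```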

PMTiles & COG

I did some exploratory work to determine the feasibility of converting some of their geospatial data to PMTiles or Cloud Optimized GeoTIFFs to eventually remove the need for GeoServer. Due to the volume and heterogeneous nature of their data, as well as somewhat immature support at the time for those technologies, it was decided not to pursue those formats for now.

Invenio RDM

The latter half of my time with NYU was spent working on their Invenio RDM instance, UltraViolet.

GeoSpatial Support

My main goal was to add geospatial functionality to Invenio RDM: the ability to add records with geospatial data that could be previewed using GeoServer both while editing and viewing records. I created a React component that talks to a Python backend to validate record data against GeoServer. This lets depositors get immediate feedback on whether the data they have entered is correct.

UltraViolet GeoSpatial Integration
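One way a backend check like this could work (purely an assumption, not necessarily what the UltraViolet integration does) is to ask GeoServer for its WMS 1.3.0 capabilities document and confirm that the layer name a depositor typed actually appears in it. The `layer_is_published` helper and the sample layer names below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by WMS 1.3.0 capabilities documents.
WMS_NS = {"wms": "http://www.opengis.net/wms"}


def layer_is_published(capabilities_xml: str, layer_name: str) -> bool:
    """Return True if layer_name is listed in a WMS 1.3.0 capabilities document."""
    root = ET.fromstring(capabilities_xml)
    names = {el.text for el in root.iterfind(".//wms:Layer/wms:Name", WMS_NS)}
    return layer_name in names


if __name__ == "__main__":
    # A trimmed-down capabilities document, for illustration only.
    sample = (
        '<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">'
        "<Capability><Layer><Title>root</Title>"
        "<Layer><Name>sdr:example_layer</Name></Layer>"
        "</Layer></Capability></WMS_Capabilities>"
    )
    print(layer_is_published(sample, "sdr:example_layer"))
    print(layer_is_published(sample, "sdr:missing_layer"))
```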

I created a data loading CLI (also in Python) to help automate ingesting their geospatial data into UltraViolet. This included converting data from OGM’s Aardvark format to a format compatible with Invenio RDM.
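To give a feel for that conversion, here is a minimal sketch that maps a handful of Aardvark fields onto Invenio RDM's metadata structure. The field selection is an assumption for illustration, every creator is treated as an organization for simplicity, and a real record needs more than this to pass Invenio RDM's validation.

```python
import re

# Aardvark stores bounding boxes as ENVELOPE(west, east, north, south).
ENVELOPE = re.compile(
    r"ENVELOPE\(\s*([-\d.]+)\s*,\s*([-\d.]+)\s*,\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\)"
)


def envelope_to_polygon(bbox: str) -> dict:
    """Convert an ENVELOPE string into a GeoJSON Polygon geometry."""
    match = ENVELOPE.match(bbox)
    if not match:
        raise ValueError(f"Not an ENVELOPE string: {bbox!r}")
    west, east, north, south = map(float, match.groups())
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


def aardvark_to_rdm(record: dict) -> dict:
    """Map a few Aardvark fields onto an Invenio RDM style metadata dict."""
    metadata = {
        "resource_type": {"id": "dataset"},
        "title": record["dct_title_s"],
        "creators": [
            {"person_or_org": {"type": "organizational", "name": name}}
            for name in record.get("dct_creator_sm", [])
        ],
    }
    if "dct_description_sm" in record:
        metadata["description"] = "\n\n".join(record["dct_description_sm"])
    if "dct_issued_s" in record:
        metadata["publication_date"] = record["dct_issued_s"]
    if "dcat_bbox" in record:
        geometry = envelope_to_polygon(record["dcat_bbox"])
        metadata["locations"] = {"features": [{"geometry": geometry}]}
    return {"metadata": metadata}


if __name__ == "__main__":
    # A made-up Aardvark record, for illustration only.
    example = {
        "dct_title_s": "Example Layer",
        "dct_creator_sm": ["Example Organization"],
        "dcat_bbox": "ENVELOPE(-74.3,-73.7,40.9,40.5)",
    }
    print(aardvark_to_rdm(example))
```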

Upgrades and Entra ID Support

During the course of working with Invenio RDM I updated their instance from version 12 to 13 and added OAuth support for Microsoft Entra ID.

GeoJSON Previewer

As an extracurricular project I created a file previewer for GeoJSON after working with Invenio RDM’s previewers for so long. That should be available in the next release of the project.

Invenio RDM GeoJSON Previewer

Fun Work

I had a great time working with the folks in NYU’s Digital Libraries department and hope to engage with them again soon. It was also nice to be able to contribute to open source projects like Invenio RDM and GeoBlacklight.

I’m excited to take some of the geospatial learnings from this engagement and bring them over to the Lost Mapper, particularly how to set up an instance of GeoBlacklight and ingest your organization’s data.

My gis-dockerized project came in handy many times while testing out scripts and working with data. A miniature version of it even made its way into the sdr-data-loader!