Building an Unstructured Data Import
Sometimes the hardest thing to figure out is not “how” best to apply a new technology, but “when not” to apply it. Nowhere has this been more true than with AI. With the rush to stick the AI brand on every project, software engineers are being pressured to “just throw more AI” at the problem and hope it works.

As part of my hourly contracting work for Neighbor Solutions, I’ve been asked to build a data import pipeline for community resources. For context, one core function of the Neighbor Solutions app is to help users who have a heart for our unhoused neighbors guide those neighbors toward helpful resources in their community: food banks, shelters, warming stations, and many more. The major technical difficulty is getting accurate data into the system in an automated way. Lists of these resources often arrive as poorly formatted PDFs or screenshots of webpages. There is very little consistency between sources, so some amount of natural language processing is required. ...