GeoMatch

Project Info

Team Banico

Team Members


Blaise Ulric Sy Banico, Jason Banico

Project Description


GeoMatch is a utility that helps link location-based data from different sources.

It has the following components:

Geo Data Index. This index contains geolocated records that link to external data sources, such as the vast address-based open data repositories. Data sources that have addresses but no geolocation will use address-to-geolocation translation services, such as those based on the Geocoded National Address File, or the Google Geocoding API (which helps with challenging "corner of" addresses).
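As a minimal sketch of how an entry might reach the Geo Data Index, the Python below keeps existing coordinates when a source provides them and falls back to an address-to-geolocation service otherwise. The names `GeoRecord`, `build_record`, and the `geocode` callable are hypothetical and used only for illustration. The geocoder is passed in as a plain function so that any service (for example one backed by the Geocoded National Address File or the Google Geocoding API) could be plugged in; no real service is called here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical geocoder signature: address in, (lat, lon) out, or None on failure.
Geocoder = Callable[[str], Optional[Tuple[float, float]]]


@dataclass
class GeoRecord:
    """One hypothetical Geo Data Index entry pointing back to its source."""
    source: str                 # name of the external dataset
    source_id: str              # key of the record within that dataset
    lat: float
    lon: float
    address: Optional[str] = None


def build_record(
    source: str,
    source_id: str,
    address: Optional[str] = None,
    lat: Optional[float] = None,
    lon: Optional[float] = None,
    geocode: Optional[Geocoder] = None,
) -> Optional[GeoRecord]:
    """Return an index entry, geocoding the address only when coordinates are missing."""
    if lat is None or lon is None:
        if address is None or geocode is None:
            return None  # nothing to locate the record with
        result = geocode(address)
        if result is None:
            return None  # the service could not resolve the address
        lat, lon = result
    return GeoRecord(source, source_id, lat, lon, address)


# Stand-in geocoder for demonstration only; the coordinates are illustrative.
def stub_geocode(address: str) -> Optional[Tuple[float, float]]:
    known = {"1 Example St, Melbourne VIC": (-37.8136, 144.9631)}
    return known.get(address)


record = build_record(
    "School Locations 2022", "row-1",
    address="1 Example St, Melbourne VIC", geocode=stub_geocode,
)
```

Records that cannot be located return `None` here; a fuller design would probably queue them for a different service or for human review.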

Match Database. This contains matches made. Matches fall under the following categories:
- exact match. Entries in the geo data index that are declared to be in the same location by meeting certain standards (i.e., within X distance) are linked.
- assisted match. The system identifies candidates for matching, but requires a human to confirm that they should be linked.
- user matched. Data not linked by any of the prior processes can be user matched. Using visual tools, users can tag entries that appear to be in the same location.
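The categories above can be sketched as a distance check between two geolocated entries. This is only an illustration under stated assumptions: the "X distance" is not specified in the project description, so the thresholds `exact_m` and `assisted_m` below are placeholder values, and the function names are hypothetical. Distance is computed with the haversine formula, a common choice for short great-circle distances.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def classify_pair(a, b, exact_m: float = 10.0, assisted_m: float = 100.0) -> str:
    """Classify two (lat, lon) points; both thresholds are assumed placeholders."""
    d = haversine_m(a[0], a[1], b[0], b[1])
    if d <= exact_m:
        return "exact"      # linked automatically
    if d <= assisted_m:
        return "assisted"   # candidate link, needs human confirmation
    return "unmatched"      # left for users to match visually


here = (-37.8136, 144.9631)
nearby = (-37.8131, 144.9631)   # roughly 55 m north of `here`
far = (-37.8036, 144.9631)      # roughly 1.1 km north of `here`
labels = [classify_pair(here, p) for p in (here, nearby, far)]
```

In practice the thresholds would likely depend on the dataset, since a school site and a traffic light have very different footprints.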

Matching App. This provides the user interface for crowdsourcing the effort to link data together. With data presented on maps, users can link entries together. Contributions will be credited to the user, and the hope is that, just like Wikipedia, there will be a community of volunteers who will help link together data that software cannot. (Eventually, it is hoped that ML/AI can do this too.) It can also be gamified so that users are rewarded for their efforts.



#geocoding #geolocation #indexing

Data Story


Using software to match records via addresses (and regex) works, but has limitations. Our proposal with GeoMatch is to use geolocation to bring records together, with the aid of the Geocoded National Address File or the Google Geocoding API. Matching may be done automatically, or may require human input. For the latter, crowdsourcing through volunteers may be a cost-effective way to achieve this.


Evidence of Work

Video

Homepage

Project Image

Team DataSets

Geocoding Addresses Best Practices

Description of Use: Tips on how to use Google's Geocoding API are listed here. Services like the Geocoding API are essential for GeoMatch.

Data Set

Victorian Government School Zones 2023

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

Crash Stats - Data Extract

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

School Locations 2022

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

Traffic Count Locations

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

Victorian liquor licences by location

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

Traffic Lights

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

FOI - Point - Vicmap Features of Interest

Description of Use: Data will be used to feed into the Geo Data Index. Entries that do not have geolocation will need address-to-geolocation services (such as Google Maps API) to get geo attributes.

Data Set

Challenge Entries

Better use of data - Connecting addresses in datasets

Addresses in datasets are complicated. For example, they may have typos and abbreviations, they might rely on the context of the dataset (is this Melbourne, Australia, or Melbourne, Florida?), and they may be more or less specific. Given an address or a dataset containing addresses, how can we discover useful connections with other datasets, given a large corpus of potential information?

Eligibility: Participants must use one or more datasets from Data.Vic to be eligible.

5 teams have entered this challenge.