Street Address Geocoding

Street address geocoding is the process of determining an estimated latitude and longitude position for the location of a street address.  A very common task when working with spatial data is to take a table where each row has a street address and to add latitude and longitude fields for the estimated location of that address.

 

For example, suppose we have a table of ten records where each record gives the restaurant number and the street address of an In-n-Out fast food restaurant in California, Utah or Texas.  This is the same table used in the Example: Street Address Geocoding topic.

 

eg_street_geocode01_01.png

 

The classic GIS way to geocode that table would be to add latitude and longitude fields giving the estimated location of each restaurant.

 

eg_street_geocode01_13.png

 

In Manifold we normally jump straight to creating a geom field that contains a point at the right location for each record.

 

eg_street_geocode01_04.png

 

We can then see the location of each restaurant in a drawing.

 

eg_street_geocode01_05.png

 

The map above shows the geocoded restaurants drawing as a layer above a Google Street Map layer.

Reverse Geocoding

Reverse geocoding is the process of finding street addresses near a given latitude/longitude location.   

The Geocoding Process in Manifold

Manifold geocodes a table of addresses using geocoding data sources, which are normally web-based geocoding servers like the Google geocoding server.   In a query we call a Manifold SQL geocoding function such as GeocodeAddress to geocode records in the table, either with UPDATE to modify the table in place or with SELECT ... INTO to create a new table.

 

The GeocodeAddress function takes as arguments a data source in the project and the name of an address field that contains an address as a string such as a varchar data type.   The function returns the longitude/latitude location of that address as given by the data source, providing it in Latitude / Longitude projection (WGS 84 base) as decimal coordinates in a float64x2 value.
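
The two approaches just described might look like the following minimal sketch.  It assumes a hypothetical table called Restaurants with a text field called Address and an existing geom field called Geom, plus a geocoding data source named google in the project.  All of those names are illustrative, and the sketch assumes GeomMakePoint will accept the float64x2 value returned by GeocodeAddress.

```sql
-- Hypothetical names: [Restaurants], [Address], [Geom] and the
-- data source [google] are assumptions for illustration only.

-- Option 1: update the existing table in place, storing a point
-- built from the longitude/latitude pair returned by the geocoder.
UPDATE [Restaurants]
  SET [Geom] = GeomMakePoint(GeocodeAddress([google], [Address]));

-- Option 2: leave the original table untouched and write the
-- geocoded records into a new table.
SELECT [Address],
       GeomMakePoint(GeocodeAddress([google], [Address])) AS [Geom]
INTO [Restaurants Geocoded]
FROM [Restaurants];
```

Either query sends one request to the geocoding server per record, so any per-day or per-second limits imposed by the server apply.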

 

At the present writing Manifold has nine web-based geocoding data sources built in, from Bing to Yandex.   Geocoding servers vary wildly in their characteristics: some are free to use, some require an API key, and some allow a limited number of free geocodes per day or per second, with additional geocodes requiring payment.

 

Geocoding servers also vary in their ability to digest what we might think is a reasonable street address.  Because most were designed to handle geocoding for web sites where users tend to specify addresses in a wildly inconsistent manner, most are reasonably good at being able to parse a wide range of addresses.   For example, both the Bing and the Google geocoders can parse an address that consists of the name of a big city.  

 

Create a data source from the Google Geocoder called google in a project and then enter the expression

 

VALUES (GeocodeAddress([google], 'Chicago'));

 

into a Command window and press the ! button, and Google will be pleased to return the result

 

-87.6297982,41.8781136

 

...meaning that Google thinks the approximate center of the city of Chicago in Illinois is at latitude 41.878 and longitude -87.629.    Bing agrees.    
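
Note the order of the two numbers: longitude (X) comes first and latitude (Y) second.  As a small sketch of pulling the two components apart, the query below builds the same pair as a literal and reads each component by index.  It assumes the Manifold SQL functions VectorMakeX2 and VectorValue with zero-based indexing, and it uses a literal value so that no geocoding server is contacted.

```sql
-- The literal pair is the result quoted above for 'Chicago'.
VALUES (
  VectorValue(VectorMakeX2(-87.6297982, 41.8781136), 0),  -- X: longitude
  VectorValue(VectorMakeX2(-87.6297982, 41.8781136), 1)   -- Y: latitude
);
```

The same VectorValue calls could be applied to the result of GeocodeAddress to fill separate longitude and latitude fields in a table.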

 

Why do both Google and Bing think that the string 'Chicago' fed to them means the big city of Chicago in Illinois and not the locations also named Chicago in South Africa, Zimbabwe, Guatemala or Mexico?  Most likely because web searches for Chicago seek the big city in Illinois over the others by a factor of a billion to one.

 

If we provide a more detailed street address most geocoding servers can do even better.  Google is happy to recognize an address such as

 

'1600 Pennsylvania Avenue NW, Washington, DC 20500'

 

and will return the result of

 

-77.0364823, 38.897675799

 

...meaning that Google thinks the White House (at that address) is located at latitude 38.897675799 and longitude -77.0364823.     

Geocoding Functions

The following Manifold SQL functions assume a geocoding server as a data source:

 

GeocodeAddress

Given a data source and an address, returns the longitude/latitude coordinates as a float64x2 value.
 
Example: using a Google geocoding data source called GoogleG, return the longitude/latitude coordinates for 'Chicago'.
 

VALUES (GeocodeAddress([GoogleG], 'Chicago'));

 

GeocodeAddressMatches

Given a data source and an address, returns a table of matches. Each match is a string with the format of the string depending on the geocoding server in use.   Most geocoding servers return JSON.

GeocodeLocationMatches

Reverse geocoding.  Given a data source and a longitude/latitude location as a float64x2 value, returns a table of matches around that location.

GeocodeAddressSupported

Given a data source, returns true if the data source supports GeocodeAddress and GeocodeAddressMatches functions.

GeocodeLocationSupported

Given a data source, returns true if the data source supports the GeocodeLocationMatches function.
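
As a hedged sketch of how the remaining functions might be called, the queries below assume a geocoding data source named GoogleG as in the example above.  The coordinates are the White House result quoted earlier in this topic.  The use of SELECT ... FROM CALL for the functions that return tables and the use of VectorMakeX2 to build the float64x2 location are assumptions about typical Manifold SQL usage, and the content of each returned match depends on the geocoding server.

```sql
-- Ask the data source which kinds of geocoding it supports.
VALUES (GeocodeAddressSupported([GoogleG]),
        GeocodeLocationSupported([GoogleG]));

-- List all candidate matches for an address.
SELECT * FROM CALL GeocodeAddressMatches([GoogleG], 'Chicago');

-- Reverse geocoding: list matches around a longitude/latitude location.
SELECT * FROM CALL GeocodeLocationMatches([GoogleG],
  VectorMakeX2(-77.0364823, 38.897675799));
```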

 

See the Example: Street Address Geocoding topic for examples using the GeocodeAddress function.

Geocoders Provide Approximate Locations

As anyone knows who has followed a web-based mapping application to a location provided for a given street address, such locations are usually only approximate and can be wildly inaccurate.   Geocoders usually do not know where a specific address is located; instead, they maintain a database of street segments with a range of addresses for each particular street segment.   If a Main Street segment along a particular city block is stored with a range of addresses from 20 to 40, an address at 30 Main Street will usually be interpolated as halfway down the block without knowing exactly where the address is located.   In sparsely inhabited regions, such as rural areas, geocoders can be wildly inaccurate and may interpolate address locations that are kilometers away from the actual location.
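
The interpolation described above can be written out as a simple worked formula.  This is a sketch that assumes plain linear interpolation along a straight segment; the symbols are introduced here for illustration only, and real geocoders may also account for which side of the street an address is on or for curved segments.

```latex
% Hypothetical symbols: a is the house number, [a_lo, a_hi] is the
% address range stored for the segment, and P_start, P_end are the
% coordinates of the two ends of the segment.
t = \frac{a - a_{\text{lo}}}{a_{\text{hi}} - a_{\text{lo}}}, \qquad
P(a) = P_{\text{start}} + t \, (P_{\text{end}} - P_{\text{start}})

% For 30 Main Street on a segment with address range 20 to 40:
t = \frac{30 - 20}{40 - 20} = 0.5
```

With t = 0.5 the estimated point falls at the midpoint of the segment, which is the "halfway down the block" result.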

 

As the largest web providers become better at data mining their users, at triangulating travel using the locations of surveilled cell phones, and at blending information from multiple data sources, more refined geocoding strategies have become possible.   Geocoders will often no longer do interpolations within well-known urban areas but instead will utilize databases of structures maintained by cities or private providers to correlate addresses to specific locations.   Geocoders are also getting better in rural locations as sweeping efforts to digitize the exact locations of specific addresses (down to an access road or gate in the case of ranch and farm properties) in service of better emergency response have resulted in the accumulation of databases of exact address coordinates in many rural areas.

 

But even in such cases exactly where to place a dot to denote the location of a particular address is often a matter of choice.   Should a dot be placed at the centroid of the real estate parcel associated with a given street address?  For large parcels, such as the one at 1600 Pennsylvania Avenue in Washington, a dot placed at the centroid would be far from any access road or entry gate.   Should the location dot be placed at the main entry gate?   At the postal box for the parcel?  At the center of the main building?   At the center of the facade of the main building that fronts the road used for the address?  Should the point be on the boundary of the parcel or along the edge of any sidewalk?

 

Such factors may seem like minor details, but they can have a big effect on how locations derived from street addresses are used for emergency service response, parcel delivery, computations involving real estate parcels or even simply finding a restaurant.  In the example above several of the In-n-Out restaurants are located in shopping centers, with the address of the restaurant being the address of the shopping center.   Finding a dot at the centroid of a shopping center may be easy enough, but then locating a restaurant that could be hundreds of meters away might not be so easy if it is not immediately in sight.

 

See the Example: Street Address Geocoding topic for an example of how two different geocoders may place the same address at two significantly different locations.

Notes

NULLs - When a geocoding server returns a NULL for an address, that usually means one of several things:

 

 

 

Manifold Geocoding Database - A data source could also be the Manifold dataport for connecting to the legacy Manifold Geocoding Database (GCDB) used with Release 8 and built on the US Census Bureau's TIGER database.  That data source should be considered deprecated given the greater accuracy of more contemporary data sources.

 

See the Example: Street Address Geocoding topic for an example using GCDB as a geocoding data source.

 

See Also

Tables

 

Queries

 

Data Types

 

Web Servers and Image Servers

 

Command Window

 

Example: Add a Spatial Index to a Table - A typical use of an index is to provide a spatial index on a geom field in a table, so the geom data can be visualized in a drawing. This topic provides the step by step procedure for adding a spatial index.

 

Example: Create a Geocoded Table from a Drawing - A partner example to this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table for a drawing of points where the geom field in the table contains geometry information for each point.   We extract the Y and X locations for each point  from the geom field to create latitude and longitude fields in the table for each record.

 

Example: Street Address Geocoding -  Geocode a table of street addresses using the Google Geocoder.

 

Example: Create a Drawing from a Geocoded Table - A partner example to this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table containing a list of cities with a latitude and longitude field for the location of each city.   We create a geom from the latitude and longitude fields using a template in the Transform panel and then we create a drawing that shows the cities as points.  This example shows all the infrastructure steps involved.