Example: Street Address Geocoding

In this example we street address geocode a table of street addresses using the Google geocoder.   This example is also shown, using the Google geocoder, in the Street Address Geocoding using SQL YouTube video  in the Manifold Sales YouTube channel.

 

Google's web-based geocoding service allows geocoding a limited number of addresses per day for free.  A paid version of the service allows geocoding as many as desired.  In this example we will only geocode ten addresses so we can easily use the free service.    

 

We begin by launching File - Create - New Data Source to create a new data source in our project.

 

eg_street_geocode01_00.png

 

We would like to use a shorter name for this data source so we enter google_gcode as the Name.   In the Type box we choose Web Server: geocodeserver and then as a subtype we choose the Google Geocoder.    Press Create Data Source.

 

 

eg_street_geocode01_00a.png

 

The google_gcode data source is created in the project pane.   While we are at it we also create a data source for the Google Streets imageserver and then we create a map and drop the Google streets layer into it.  We will use this map later on to display the locations of our geocoded records.

 

The project also contains a restaurants table that we will geocode.  The restaurants table contains the addresses for a selection of In-N-Out restaurants located in California, Utah and Texas.  

eg_street_geocode01_01.png

 

Clicking open the restaurants table we see that it contains an mfd_id index field, a number field that provides the official number of the restaurant in In-N-Out's database and an address field.

 

To geocode the restaurant table we create a query which we call Geocode Restaurants.  

 

eg_street_geocode01_02.png

The query text is

 

SELECT [mfd_id], [number], [address],

GeomMakePoint(GeocodeAddress([google_gcode],[address])) AS [Geom]

   INTO [geocoded restaurants]

   FROM [restaurants];

 

To deconstruct the above query, we are creating a new table by using SELECT ... INTO to populate the new table using fields from the restaurants table.  The most important line is

 

GeomMakePoint(GeocodeAddress([google_gcode],[address])) AS [Geom]

 

where we use the GeocodeAddress function to geocode the address field  using the google_gcode geocoding data source.   The float64x2 value returned by that function is converted by the GeomMakePoint function into a Manifold geom data type which is saved into the new table using the name Geom.

 

To run the query we press the ! button in the main Manifold toolbar.

 

eg_street_geocode01_03.png

 

The Results tab reports 10 results.    

 

eg_street_geocode01_04.png

 

We can double-click open the new geocoded restaurants table the query created and see that it does indeed contain a list of restaurants with the number, address and a Geom field that contains a point geom for each.  tech_ravi_sm.png

 

Tech Tip: When using free web-based geocoders such as Google, some of the results may be returned as NULL on some runs and not on others.  Free web based geocoders may have limits on how may addresses we can geocode in a certain time period.   Such limits may be a consequence of slow or limited capacity in servers, or, in the case of unlimited capacity servers such as Google, may be a consequence of asking to geocode too many addresses too quickly in too short a period of time.  When a fast application like Manifold very rapidly asks Google to geocode addresses, Google may return some of them as NULLs.   On a subsequent run, Google may geocode those same addresses which were previously returned as NULL.   See the user forum for queries and discussion on how to re-run address geocoding queries using only those records that were returned as NULL.

Create a Drawing

In the Project pane we Right-click on the geocoded restaurants table and in the context menu we choose Create - New Drawing.

 

eg_street_geocode01_04a.png

 

The New Drawing dialog suggests a default Name, which we use.   It knows there is only one Geometry field in the table, called Geom, and it knows there is no spatial index, as yet, in the table so it offers to create one for us.    The red text reporting the coordinate system tells us we must assign an initial coordinate system using the coordinate picker button.

 

eg_street_geocode01_04b.png

 

We know that the Google geocoder, like seemingly all web geocoders, returns latitude and longitude coordinates so we assign the Latitude / Longitude coordinate system.   We press Create Drawing.  

 

We can then drop the new drawing into the map to see where the geocoded points are located.  We also Alt-click the name tab of the map to undock the map window.

eg_street_geocode01_05.png

In the illustration above we have used Style to color the points green for better visibility.   As expected, the points are located in California, Utah and Texas.

 

We zoom into the points in Texas for a closer look.

eg_street_geocode01_06.png

 

The point for the In-n-Out restaurant on Stacy Road in Allen, Texas, appears above, the third record in our geocoded table.

 

Variations in Geocoded Data Type

The above is a typical procedure for geocoding a table in Manifold, where we create Manifold geom fields so the geometry is immediately usable in Manifold.   If desired, we can also create different data types for the result of geocoding, either by later converting the Manifold geom field using the Transform panel or by writing a different query initially.

 

eg_street_geocode01_07.png

 

SELECT [mfd_id], [number], [address],

GeocodeAddress([google_gcode],[address]) AS [longlat]

   INTO [longlat restaurants]

   FROM [restaurants];

 

 

For example, the above simply stores the float64x2 vector value created by the GeocodeAddress function without converting it into a geom.   We run the query by pressing the ! button.

 

eg_street_geocode01_08.png

As before, we get ten results.

 

We double-click open the resulting longlat restaurants table.

 

eg_street_geocode01_09.png

 

The longlat field contains a float64x2 vector, which we can see by hovering the mouse over the column header to see the data type reported in a tooltip.

 

eg_street_geocode01_10.png

We named the field longlat to remind us that coordinates are reported in X,Y order, that is longitude first and then latitude.  

 

Suppose we would like to geocode a table to create separate longitude and latitude fields for use by some other application?  That is easy to do:

 

eg_street_geocode01_11.png

 

SELECT [mfd_id], [number], [address],

VectorValue(GeocodeAddress([google_gcode],[address]), 1) AS [lat],

VectorValue(GeocodeAddress([google_gcode],[address]), 0) AS [lon]

   INTO [latlon restaurants]

   FROM [restaurants];

 

Using procedures similar to those in the Example: Create a Geocoded Table from a Drawing topic, The above query uses the VectorValue function to extract the latitude and the longitude values from the results of GeocodeAddress.    This isn't the most efficient way to use a geocoder since we end up calling the geocoder twice for each address, but it does make what is going on completely obvious.

 

We run the query by pressing the ! button.

eg_street_geocode01_12.png

 

Again, there are 10 results and a new table is created.

 

eg_street_geocode01_13.png

 

We double-click the table to open it and to see the above, a table with a latitude and a longitude for each record.   tech_ravi_sm.png

 

Caution:  The query above that runs the GeocodAddress function twice, once to then extract a latitude and once to then extract a longitude is not efficient since in both cases we are getting a latitude and longitude.  It is presented as written to show the idea in a very simple way.  It would be smarter to simply save the latitude and longitude values received from running it once and then extracting from there.  

 

Running the GeocodAddress function twice in short succession can also anger Google, which might return a NULL for the latitude or the longitude to show its displeasure at asking the same thing twice.  We can get around all that by creating a geom and then using the procedure in the Example: Create a Geocoded Table from a Drawing topic to generate separate Latitude and Longitude fields from the geom.  

 

In general, if we re-run this example we will likely find that the Google geocoder returns NULLs for many of the records, sometimes seemingly at random.   The Microsoft Bing geocoder, using a free API key that can be obtained from Microsoft, will often be more reliable.    For commercial use when geocoding many records, we may have to sign up to a paid plan with Microsoft or Google to use their web-based geocoders.   Alternatives to paid geocoding services are open services such as the OpenCage Geocoder which is based on OpenStreetMap and is built into Manifold as a choice in the geocodeserver list of data sources.

 

Using the Manifold Geocoding Database

The Manifold Geocoding Database (GCDB) is a legacy database originally created from US Census Bureau data years ago to provide a free US address geocoding data source for Manifold GIS users.  It is a free download on the Manifold web site.   The data is now old and out of date so it does not make sense to use the GCDB if we can use a free web geocoding server.  However, the GCDB is totally free, it is local and requires no Internet link, and it can be useful if we simply want to run some addresses through a geocoder to generate a group of geocoded addresses for testing, for statistical purposes, or for other such purposes without needing to worry about geocoding each and every address.

 

The GCDB consists of individual .dat data files for each US state plus a states.dat file that covers all states.   Manifold has a geocoding dataport that allows Manifold to use the GCDB as a geocoding data source.

 

To create a data source for GDCB we launch the New Data Source dialog.

 

eg_street_geocode01_14.png

 

We enter a short name GCDB and choose File: datgdcb as the Type.  Since we will geocode only a few addresses and we do not want to grow the size of the .map file by saving cached data between sessions, we uncheck the lower two cache option boxes.   We click the [...] Browse button to specify the location of the GDCB data files we will use.

 

 

eg_street_geocode01_15.png

 

We browse to the location where the GCDB files are stored on our machine and click on the states.dat file and then press Open.    Important: to geocode using the GCDB we must link the full GDCB data set using the states.dat file.    

 

eg_street_geocode01_16.png

 

We press Create Data Source and that creates a new data source called GCDB in the project.

 

Next we open the Command Window and write a query that uses that data source to geocode our table of In-n-Out restaurant addresses.

 

eg_street_geocode01_17.png

 

SELECT [mfd_id], [number], [address],

GeomMakePoint(GeocodeAddress([GCDB],[address])) AS [Geom]

   INTO [gcdb restaurants]

   FROM [restaurants];

 

The query is similar to the ones used earlier in this topic.   We use the GCDB data source as the geocoding data source.

 

eg_street_geocode01_17a.png

 

Running the query by pressing the ! button shows 10 results.

 

It produces a new gcdb restaurants table.  We create a new drawing from that table, using the coordinate system picker button  to assign Latitude / Longitude as the initial coordinate system.

 

We then drag and drop that new drawing into the map to see how the points geocoded by GCDB line up with those geocoded by Google.

 

eg_street_geocode01_18.png

 

In the illustration above the GCDB points have been styled as yellow squares.     Zoomed into the region near the In-n-Out restaurant in Allen, Texas, we see a difference between the location geocoded for the same address, a typical situation when using different geocoders.

 

 

i_in_n_out.png

 

The above is a photo of the actual restaurant, from the Spatial Data topic.

 

Notes

NULLs - When a web-based geocoding server returns a NULL for an address that usually means one of several  things:

 

 

If we encounter too many NULLs when geocoding a small number of records, we can try a different geococding server.   For example, Bing, using a free API key that can be obtained from Microsoft is often less erratic than Google.   Use a good search engine to find pages that provide instructions for getting a free Bing API key.

 

See Also

Tables

 

Data Types

 

Web Servers and Image Servers

 

Street Address Geocoding

 

Adding an Index to a Table - A basic topic on adding an index using the built-in capabilities of the mfd_id field.

 

File - Create - New Data Source

 

Command Window

 

Assign Initial Coordinate System

 

Style: Drawings

 

Example: Add a Spatial Index to a Table - A typical use of an index is to provide a spatial index on a geom field in a table, so the geom data can be visualized in a drawing. This topic provides the step by step procedure for adding a spatial index.

 

Example: Create a Geocoded Table from a Drawing - A partner example to this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table for a drawing of points where the geom field in the table contains geometry information for each point.   We extract the Y and X locations for each point  from the geom field to create latitude and longitude fields in the table for each record.

 

Example: Create a Drawing from a Geocoded Table - A partner example this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table containing a list of cities with a latitude and longitude field for the location of each city.   We create a geom from the latitude and longitude fields using a template in the Transform panel and then we create a drawing that shows the cities as points.  This example shows all the infrastructure steps involved.

 

Example: Create a Drawing from Read-Only Geocoded CSV File - A detailed example using a mix of dialogs and SQL to create a drawing that shows data from an external, read-only geocoded CSV file that is linked into the project.