Example: Street Address Geocoding

In this example we street address geocode a table of street addresses using the Bing geocoder.   This example is also shown, using the Google geocoder, in the Street Address Geocoding using SQL YouTube video  in the Manifold Sales YouTube channel.

 

This example topic uses the Bing geocoder because at the present writing it is easier for many people to get an API key to use the Bing geocoder for free for less than industrial levels of geocoding.   In this example we will only geocode ten addresses so we can easily use the free service.    

 

We begin by launching File - Create - New Data Source.   The menu provides a list of favorites to choose from as well as a More... option.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We would like to use a short name for this data source so we enter bing as the Name.   In the Type box we choose Web Server: geocodeserver and then as a subtype we choose the Bing Geocoder.    

 

In the API key box we enter the API key we obtained from the Microsoft developer website.

 

Press Create Data Source.

 

 

A bing data source is created in the project pane.   While we are at it we create a new map using a OpenStreetMaps base layer.   We will use this map later on to display the locations of our geocoded records.  The OSM Base data source was created automatically when we created the map using a Bing streets base layer.

 

The project also contains a restaurants table that we will geocode.  The restaurants table contains the addresses for a selection of In-N-Out restaurants located in California, Utah and Texas.  

 

 

Clicking open the restaurants table we see that it contains an mfd_id index field, a number field that provides the official number of the restaurant in In-N-Out's database and an address field.  

 

It is just a coincidence that the mfd_id index field has the same numbers as the number field giving the official number of the restaurant: the example table was created by copying the first ten records from In-N-Out's database and pasting them into a new table, so the mfd_id numbers in this case turned out the same as the number numbers.

 

To geocode the restaurant table we create a query which we call Geocode Restaurants.  

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeomMakePoint(GeocodeAddress([bing],[address])) AS [Geom]

   INTO [geocoded restaurants]

   FROM [restaurants];

 

To deconstruct the above query, we are creating a new table by using SELECT ... INTO to populate the new table using fields from the restaurants table.  The most important line is

 

GeomMakePoint(GeocodeAddress([bing],[address])) AS [Geom]

 

where we use the GeocodeAddress function to geocode the address field  using the bing geocoding data source.   The float64x2 value returned by that function is converted by the GeomMakePoint function into a Manifold geom data type which is saved into the new table using the name Geom.

 

To run the query we press the ! button in the main Manifold toolbar.

 

 

The Results tab reports 10 results.  

 

 

We can double-click open the new geocoded restaurants table the query created and see that it does indeed contain a list of restaurants with the number, address and a Geom field that contains a point geom for each.  

 

Tech Tip: When using free web-based geocoders, some of the results may be returned as NULL on some runs and not on others.  Free web based geocoders may have limits on how may addresses we can geocode in a certain time period.   Such limits may be a consequence of slow or limited capacity in servers.

Create a Drawing

In the Project pane we right-click on the geocoded restaurants table and in the context menu we choose Create - New Drawing.

 

 

The New Drawing dialog suggests a default Name, which we use.   It knows there is only one Geometry field in the table, called Geom, and it knows there is no spatial index, as yet, in the table so it offers to create one for us.    The red text reporting the coordinate system tells us we must assign an initial coordinate system using the coordinate picker button.

 

 

We know that the Bing geocoder, like seemingly all web geocoders, returns latitude and longitude coordinates so we assign the Latitude / Longitude coordinate system.   We press Create Drawing.  

 

We can then drop the new drawing into the map to see where the geocoded points are located.  We also Shift-click the name tab of the map to undock the map window.

 

 

In the illustration above we have used Style to color the points green for better visibility.   As expected, the points are located in California, Utah and Texas.

 

We zoom into the points in Texas for a closer look.

 

 

The point for the In-N-Out restaurant on Stacy Road in Allen, Texas, appears above, the third record in our geocoded table.    We can see the Bing geocoder placed the location for 190 E. Stacy Road in Allen, Texas, on the wrong side of Andrews Parkway.   The green dot appears on top of the Chase bank building, not on top of the In-N-Out Burger location on the other side of the parkway.

Creating Different Data Types

The above is a typical procedure for geocoding a table in Manifold, where we create Manifold geom fields so the geometry is immediately usable in Manifold.   If desired, we can also save the results of geocoding into different data types, either by later converting the Manifold geom field using the Transform pane or by writing a different query initially.

 

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeocodeAddress([bing],[address]) AS [longlat]

   INTO [longlat restaurants]

   FROM [restaurants];

 

 

For example, the above simply stores the float64x2 vector value created by the GeocodeAddress function without converting it into a geom.   We run the query by pressing the ! button.

 

 

As before, we get ten results.

 

We double-click open the resulting longlat restaurants table.

 

 

The longlat field contains a float64x2 vector, which we can see by hovering the mouse over the column header to see the data type reported in a tooltip.

 

 

We named the field longlat to remind us that coordinates are reported in X,Y order, that is longitude first and then latitude.  

 

Suppose we would like to geocode a table to create separate longitude and latitude fields for use by some other application?  That is easy to do:

 

 

The query text:

 

SELECT [mfd_id], [number], [address],

  VectorValue(GeocodeAddress([bing],[address]), 1) AS [lat],

  VectorValue(GeocodeAddress([bing],[address]), 0) AS [lon]

   INTO [latlon restaurants]

   FROM [restaurants];

 

Using procedures similar to those in the Example: Create a Geocoded Table from a Drawing topic, The above query uses the VectorValue function to extract the latitude and the longitude values from the results of GeocodeAddress.    This isn't the most efficient way to use a geocoder since we end up calling the geocoder twice for each address, but it does make what is going on completely obvious.

 

We run the query by pressing the ! button.

 

 

Again, there are 10 results and a new table is created.

 

 

We double-click the table to open it and to see the above, a table with a latitude and a longitude for each record.   

 

Caution:  The query above that runs the GeocodAddress function twice, once to then extract a latitude and once to then extract a longitude is not efficient since in both cases we are getting a latitude and longitude.  It is presented as written to show the idea in a very simple way.  It would be smarter to simply save the latitude and longitude values received from running it once and then extracting from there.  

 

Running the GeocodAddress function twice in short succession can also anger some geocoders, even if you use an API key, which might return a NULL for the latitude or the longitude to show its displeasure at asking the same thing twice.  Bing is cool about such things but other geocoders might not be.  We can get around all that by creating a geom and then using the procedure in the Example: Create a Geocoded Table from a Drawing topic to generate separate Latitude and Longitude fields from the geom.  

 

In general, if we re-run this example with other geocoders we will likely find that some geocoders returns NULLs for many of the records, sometimes seemingly at random.   The Microsoft Bing geocoder, using a free API key that can be obtained from Microsoft, is usually very reliable when geocoding a few hundred or a few thousand records.    For commercial use when geocoding many records, we may have to sign up to a paid plan with Microsoft or Google to use their web-based geocoders.   Alternatives to paid geocoding services are open services such as the OpenCage Geocoder which is based on OpenStreetMap and is built into Manifold as a choice in the geocodeserver list of data sources.  Free geocoders, however, often have significant and undocumented limits on what their operators consider "too much" use.  

Using the Manifold Geocoding Database

The Manifold Geocoding Database (GCDB) is a legacy database originally created from US Census Bureau data to provide a free US address geocoding data source for Manifold GIS users.  It is a free download on the Manifold web site.   

 

The data is now very old and out of date so it does not make sense in production applications to use the GCDB if we can use a free web geocoding server.  It is limited to addresses in the United States.  However, the GCDB is free, it is totally local and requires no Internet link, and it can be useful if we simply want to run some addresses through a geocoder to generate a group of geocoded addresses for testing, for developing other parts of our code, for statistical purposes, or for other such limited purposes without needing to worry about how many addresses we are geocoding or how often we are geocoding the same thing.

 

The GCDB consists of individual .dat data files for each US state plus a states.dat file that covers all states.   Manifold has a geocoding dataport that allows Manifold to use the GCDB as a geocoding data source.

 

To create a data source for GDCB we again choose File - Create - New Data Source.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter a short name GCDB and choose File: datgdcb as the Type.  There is little point to caching data in the project that already exists in a file on our systems, so we uncheck the lower two cache option boxes.   

 

We click the [...] Browse button to specify the location of the GDCB data files we will use.

 

 

We browse to the location where the GCDB files are stored on our machine and click on the states.dat file and then press Open.    

 

Important: to geocode using the GCDB we must link the full GDCB data set using the states.dat file.    

 

 

We enter GCDB as the Name, and we choose the File: datgdcb as the type.  We can name the data source however we like, but it makes sense to use short names that remind us of what the data source is.  

 

We press Create Data Source and that creates a new data source called GCDB in the project.   

 

Next we open the Command Window and write a query that uses that new data source to geocode our table of restaurant addresses.

 

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeomMakePoint(GeocodeAddress([GCDB],[address])) AS [Geom]

   INTO [gcdb restaurants]

   FROM [restaurants];

 

The query is similar to the ones used earlier in this topic.   We use the GCDB data source as the geocoding data source.

 

 

Running the query by pressing the ! button shows 10 results.

 

The query produces a new gcdb restaurants table.  We create a new drawing from that table, using the coordinate system picker button  to assign Latitude / Longitude as the initial coordinate system.

 

We then drag and drop that new drawing into the map to see how the points geocoded by GCDB line up with those geocoded by Bing.

 

 

Zoomed out, the GCDB geocoder seems to have produced locations, styled as larger, bright blue dots, at the same locations as Bing.

 

 

Zoomed into the region near the In-N-Out restaurant in Allen, Texas, we see that the Census Bureau data used in the GCDB geocoder placed the blue dot for the restaurant even farther away from actual location than Bing, all the way on the left side of the window in the middle of a major highway interchange.   

Using the MapBox Geocoder

The MapBox geocoder is a service of MapBox, a US company founded in Washington, DC, in 2013, and since then acquiring $226 million in venture funding from 2013 to 2017.

 

To create a data source for MapBox, we  choose File - Create - New Data Source.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter mapbox as the Name, and we choose the MapBox geocoder.  We can name the data source however we like, but it makes sense to use short names that remind us of what the data source is.

 

Enter our API key (the illustration shows a fake key, not our real one), which we learned how to obtain by searching the Internet using terms like "mapbox geocoder api key."

 

We press Create Data Source and that creates a new data source called mapbox in the project.   

 

Next we open the Command Window and write a query that uses that new data source to geocode our table of restaurant addresses.

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeomMakePoint(GeocodeAddress([mapbox],[address])) AS [Geom]

   INTO [mapbox restaurants]

   FROM [restaurants];

 

The query uses the mapbox geocoding data source to geocode records from the restaurants table and to save the geocoded records into a new table.

 

we right-click the new table and choose Create - New Drawing, creating a new drawing using Latitude / Longitude coordinate system.   We then drag and drop the new drawing into the map.

 

 

We use the Style pane to style the mapbox points as inverted magenta triangles.  The zoomed-in view above shows that the MapBox geocoder placed the restaurant almost as far away as the GCDB geocoder did, in the lower left part of the window, with dramatically lower accuracy than Bing.

 

Such a big difference between the location geocoded for the same address is a typical result when using different geocoders.

Using the OpenCage Geocoder

The OpenCage geocoder is a service of OpenCage GmbH, a German company that provides worldwide geocoding using OpenStreetMap data.

 

To create a data source for OpenCage we choose File - Create - New Data Source.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter opencage as the Name, and we choose the OpenCage geocoder.  We can name the data source however we like, but it makes sense to use short names that remind us of what the data source is.

 

Enter our API key (the illustration shows a fake key, not our real one), which we learned how to obtain by searching the Internet using terms like "opencage geocoder api key."

 

We press Create Data Source and that creates a new data source called opencage in the project.   

 

Next we open the Command Window and write a query that uses that new data source to geocode our table of restaurant addresses.

 

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeomMakePoint(GeocodeAddress([opencage],[address])) AS [Geom]

   INTO [opencage restaurants]

   FROM [restaurants];

 

The query uses the opencage geocoding data source to geocode records from the restaurants table and to save the geocoded records into a new table.

 

we right-click the new table and choose Create - New Drawing, creating a new drawing using Latitude / Longitude coordinate system.   We then drag and drop the new drawing into the map.

 

 

We use the Style pane to style the opencage points as yellow diamonds.  The zoomed-in view above shows that the OpenCage geocoder placed the restaurant exactly in the correct position, at the far right part of the window, with dramatically better accuracy than MapBox and substantially better accuracy than Bing.

 

Using the Here Geocoder

The Here geocoder is a service of HERE Global B.V., a Netherlands company with over 9,000 employees in 56 countries.

 

To create a data source for Here we choose File - Create - New Data Source.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter here as the Name, and we choose the Here geocoder.  We can name the data source however we like, but it makes sense to use short names that remind us of what the data source is.

 

Enter our API key (the illustration shows a fake key, not our real one), which we learned how to obtain by searching the Internet using terms like "here geocoder api key."

 

We press Create Data Source and that creates a new data source called here in the project.   

 

Next we open the Command Window and write a query that uses that new data source to geocode our table of restaurant addresses.

 

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeomMakePoint(GeocodeAddress([here],[address])) AS [Geom]

   INTO [here restaurants]

   FROM [restaurants];

 

The query uses the here geocoding data source to geocode records from the restaurants table and to save the geocoded records into a new table.

 

we right-click the new table and choose Create - New Drawing, creating a new drawing using Latitude / Longitude coordinate system.   We then drag and drop the new drawing into the map.

 

 

We use the Style pane to style the here points as square orange boxes with a small, inner, black square.   The zoomed-in view above shows that the Here geocoder placed the restaurant in almost the same location as Bing, with dramatically better accuracy than MapBox but substantially lower accuracy than OpenCage, which for this particular address was exactly correct.

 

Using the MapQuest Geocoder

The MapQuest geocoder is a service of MapQuest, Inc, a US company that was one of the original online maps websites, formerly part of Verizon and AOL and now part of System1, a US company based in Venice, California.

 

To create a data source for MapBox, we  choose File - Create - New Data Source.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter mapquest as the Name, and we choose the MapQuest geocoder.  We can name the data source however we like, but it makes sense to use short names that remind us of what the data source is.

 

Enter our API key (the illustration shows a fake key, not our real one), which we learned how to obtain by searching the Internet using terms like "mapquest geocoder api key."

 

We press Create Data Source and that creates a new data source called mapquest in the project.   

 

Next we open the Command Window and write a query that uses that new data source to geocode our table of restaurant addresses.

 

 

The query text:

 

SELECT [mfd_id], [number], [address],

  GeomMakePoint(GeocodeAddress([mapquest],[address])) AS [Geom]

   INTO [mapquest restaurants]

   FROM [restaurants];

 

The query uses the mapquest geocoding data source to geocode records from the restaurants table and to save the geocoded records into a new table.

 

we right-click the new table and choose Create - New Drawing, creating a new drawing using Latitude / Longitude coordinate system.   We then drag and drop the new drawing into the map.

 

 

We use the Style pane to style the mapquest points as green, slightly rotated boxes with an outer black box.   The zoomed-in view above shows that the MapQuest geocoder placed the restaurant far away from the true location, in the middle of East Stacy Road approximately half-way between the GCDB point and the Bing and Here points.   Accuracy is better than MapBox, but very poor compared to Bing and Here.   Only OpenCage geocoded the address to the correct location.

Using Other Geocoders

Manifold also supports Google, and also the Yandex geocoder, which is popular in Eastern Europe and former countries of the Soviet Union.   Procedures for Yandex and Google are similar to the above.   Yandex is not illustrated since it is specialized for users in countries that normally do not use English.   Google is not illustrated since, as of the present writing, Google requires provision of a credit card to obtain an API key, even if we intend to stay within the free tier of use.

 

Notes

NULLs - When a web-based geocoding server returns a NULL for an address that usually means one of several  things:

 

 

If we encounter too many NULLs when geocoding a small number of records, we can try a different geococding server.   For example, Bing, using a free API key that can be obtained from Microsoft is often less erratic than other geocoders without using an API key.   Use a good search engine to find pages that provide instructions for getting a free Bing API key.

 

The Real Thing - Illustrated below is a photo of the actual restaurant in Allen, Texas, from the Spatial Data topic.

 

 

 

See Also

Tables

 

Data Types

 

Web Servers and Image Servers

 

Street Address Geocoding

 

Adding an Index to a Table - A basic topic on adding an index using the built-in capabilities of the mfd_id field.

 

File - Create - New Data Source

 

Command Window

 

Assign Initial Coordinate System

 

Style: Drawings

 

Example: Add a Spatial Index to a Table - A typical use of an index is to provide a spatial index on a geom field in a table, so the geom data can be visualized in a drawing. This topic provides the step by step procedure for adding a spatial index.

 

Example: Create a Geocoded Table from a Drawing - A partner example to this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table for a drawing of points where the geom field in the table contains geometry information for each point.   We extract the Y and X locations for each point  from the geom field to create latitude and longitude fields in the table for each record.

 

Example: Create a Drawing from a Geocoded Table - A partner example this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table containing a list of cities with a latitude and longitude field for the location of each city.   We create a geom from the latitude and longitude fields using a template in the Transform pane and then we create a drawing that shows the cities as points.  This example shows all the infrastructure steps involved.

 

Example: Create a Drawing from Read-Only Geocoded CSV File - A detailed example using a mix of dialogs and SQL to create a drawing that shows data from an external, read-only geocoded CSV file that is linked into the project.