CSV Servers

Manifold can read Comma Separated Values (CSV)  and similar formats that represent tables within text files, where each line is a record and the values for fields are separated by a character such as a comma (CSV). Each CSV file contains one table.   A similar format uses a | "pipe" character and is often known as Pipe Separated Values, or PSV file format.    This topic shows how to acquire CSV data from web servers providing data in CSV format.  A related topic, CSV, ASC, PSV, TAB, shows how to acquire CSV data from local files.   

 

ico_nb_arrow_blue.png Please read the CSV, ASC, PSV, TAB topic for important information about CSV format and how Manifold handles different options which occur within CSV format.

 

We can bring data from CSV servers into Manifold using two, related techniques:

 

 

 

 

Connecting to a CSV server:

 

  1. Choose File - Create - New Data Source in the main menu, or right-click in the Project pane and choose New Data Source.

  2. Choose Web Server: csvserver in the Type box.

  3. Enter the URL for the CSV data in the Source box.

  4. Default settings for other options will usually work for most CSV data.   Press Create Data Source.

  5. A new data source appears in the project.  Open it to see the table that is linked in from the CSV file.

 

Creating a local copy of the CSV table:

 

  1. Click on the table linked in from the CSV to highlight it.

  2. Press Ctrl-C or click the Copy button in the Project pane toolbar.

  3. Click anywhere in the main, Manifold .map part of the project outside of the new data source's hierarchy.

  4. Press Ctrl-V or click the Paste button in the Project pane toolbar.

 

Tables that are linked from CSV data are neither selectable nor editable since they do not have any indexes.  We cannot add an index to a linked CSV table since CSV format does not support indexes.   We can, however, add a key field and index to a local table that has been copied and pasted from a linked CSV table.  Such local tables are fully capable, native Manifold tables.   We can easily add a key field and index to them to enable editing and selection, using the procedure given in the Add an Index to a Table topic.

 

Adding an index to a table:

 

  1. Double-click on the table to open it.

  2. Choose Edit - Schema.

  3. In the Schema dialog click the Add Identity button.

  4. Press the Save Changes button.

 

dlg_web_servers_csv01_01.png

 

Name

Name for the new data source, "Data Source" by default.  Specify a more  memorable name as desired.  If we forget the origin of a data source we can hover the mouse over the data source and a tool tip will provide connection information.

Type

Choose Web Server: csvdeserver in the Type box to connect to a CSV format source over the web.

Source

A connection string to the web server CSV data.  The connection string can also be entered using the Web Login dialog launched by the [...] browse button.

btn_browse.png  Browse button

Click to launch the Web Login dialog, to allow use of a login and password plus use of a proxy server if desired.   The Web Login dialog is also handy for providing a Test button that can be used to test the connection.

Open as read-only

Open the data source read-only.  Has no effect with CSV servers since they are read-only in any event.

Cache data

Cache data downloaded from the server while the project is open.  Cached data is saved within the .map project itself in a Cache sub-folder in the System Data hierarchy. Also provides greater flexibility with read-only data sources.

Save cached data between sessions

Save the cached data for the next time this project is opened,  within the .map project itself in a Cache sub-folder in the System Data hierarchy.   Caution: checking this box can result in very large .map files when the results of browsing very large data from web servers are all saved.  However, having such data cached in the .map is handy for offline browsing of the project.

First line contains field names

Checked by default.   The first line in most CSV data gives the names of fields.  Some CSV data sources do not provide the names of fields in the first line.

Read all fields as text

The system will try to guess what fields contain text and what fields contain integers or floating point numbers.  In some cases, such as Zip codes in US postal addresses, what appears to be a number should be treated as text.  Checking this box imports all fields as text even if they are not marked as text and even if they appear to be numeric fields.

Allow multiline text values

When checked, ignore linefeed and other characters normally denoting the end of a line which occur within text, that is, within text qualifier characters.  Check this box to deal with "csv" data that include multiple paragraphs within a single text field, as often occurs in CSV data originally harvested from web pages, KML or other web-oriented formats.

List delimiter

The character used within the CSV data to separate fields within a record. The default (auto) setting usually guesses correctly.   Enter any character or choose it from the list of standard characters: comma, colon, semicolon, pipe, space or tab.

Decimal separator

The character used within the CSV data to denote the decimal fractional part of a number, almost always either a period or a comma.    The default (auto) setting usually guesses correctly.   Enter any character or choose it from the list of two standard characters: period or comma.  A period is used in North America and many former UK colonies.  A comma is used in Europe, South America and many African countries.

Text qualifier

The character used within the CSV data to wrap values intended to be text.  The default (auto) setting usually guesses correctly.   Enter any character or choose it from the list of two standard characters:  a double " quote character or a single ' quote character.

Create Data Source

Create the new data source in the project pane and close the dialog.

Edit Query

Launch the Command Window loaded with a query that creates the data source using the given settings.  A great way to learn how to use SQL to create data sources.

Cancel

 Exit the dialog without doing anything.

 

 

btn_browse.png  Pressing the browse button next to the Source box launches the Web Login dialog, to specify a server and connection characteristics.

 

dlg_web_servers_csv01_02.png

 

Server

The connection string for the server.  This may be a simple URL or a very lengthy URL/connection string that embeds parameters such as keys that grant access or other parameters

Use login and password

Check this box for servers that require logging in with a login name and a password, providing the required name and password in the Login and Password boxes.

Use proxy server

Check this box when connecting through a proxy server.   The Proxy, Login, and Password boxes allow specifying the connection string to the proxy server as well as the login name and password required to use the proxy server.

User agent

Identifies what application (Manifold) is asking for a connection.  Some web servers want to know what client software is connecting, for compatibility or for business reasons.  The default string optimizes compatibility (Mozilla is very generic) while also identifying Manifold Release 9 as the client.  Users can adjust the string as necessary to comply with any special server requirements.

API key

Provide a key that authorizes use of an API when connecting to a proprietary data source that requires such a key.  Not used with CSV servers.

Application key

A secondary application key or authentication code for those servers, such as here.com (also known as wego.here.com) that may require it.   Not used with CSV servers.

Timeout

Specify a time in milliseconds to wait for connecting to the specified server.  Use 0 for the default timeout or specify whatever is the desired time to wait before giving up on the server.

Test

Press the Test button to try the connection using the specify parameters.   If successful, a Connection established information dialog will pop open.

 

Example: Connect to the World Bank CSV Server

Visiting the World Bank Climate Data API page we see that the World Bank publishes historical data for the average yearly temperatures by country in CSV format, accessible through their web server given the right URL.  Reading the API documentation carefully we see that the correct URL to get the "historical" (it is estimated data in many cases) data for a country in CSV format is:

 

http://climatedataapi.worldbank.org/climateweb/rest/v1/country/cru/tas/year/iso3.csv

 

...where iso3 should be replaced by the three letter ISO code for a country, for example, CAN for Canada.   To get a temperature of average annual temperatures for Canada in CSV format, we would use the connection string:

 

http://climatedataapi.worldbank.org/climateweb/rest/v1/country/cru/tas/year/CAN.csv

 

We launch Manifold and choose File - Create New Data Source.

 

dlg_web_servers_csv01_03.png

 

We enter Average Annual Temps Canada as the Name.  We choose Web Server: csvserver as the Type.   In the Source box we enter the URL connection string:

 

http://climatedataapi.worldbank.org/climateweb/rest/v1/country/cru/tas/year/CAN.csv

 

Leaving the other options at their default settings, we press Create Data Source.

 

dlg_web_servers_csv01_04.png

 

A new data source appears in the project pane.  We expand it to see it contains a table called CAN.  

 

Manifold web server dataports such as the CSV server or ECWP server and similar which produce a single image, drawing or table will compose names for the components they create by taking the URL object name and cutting out extensions, so that CAN.csv becomes simply CAN, and parameters are cut, so that data? becomes simply data.

 

dlg_web_servers_csv01_05.png

 

We open the table and see it contains read-only data (from the gray background of fields) giving the average annual temperature for Canada by year.   The API page tells us the temperatures are in degrees Centigrade.    The table is read-only because it is generated by the World Bank server.    

 

If there are any changes in the data on the World Bank server, when we refresh the table, for example, by pressing View - Refresh or when we open the project again after closing it, the latest data will appear.  It is not likely the World Bank will change its mind about whatever it estimates the average temperature in Canada was in 1902, but if we had connected to a CSV server that provided more dynamic data, such as the latitude and longitude locations of ships, we would be happy to have dynamically updated data.

Create a Read/Write Table

If we want a read/write table imported into the .map portion of our project, we can easily do that.   Importing data, of course, means that we create a local, static copy, and not dynamically updated data served from a server.  See the Importing and Linking topic for discussion of the difference.

 

We can Copy the table from the data source and Paste it into our project to create a read/write table.

 

dlg_web_servers_csv01_06.png

 

Components in Manifold must have a name that is unique through the project, but the CAN table that we paste into the project does not pose a naming conflict because the name within the project is simply CAN, while the name of the CAN table within the data source is CAN:Average Annual Temps Canada.   We double click our new, local table to open it.

 

dlg_web_servers_csv01_07.png

 

Except for the shorter name, it looks just like the table we copied and pasted from the CSV server data source, including the read-only gray background.   To make it read-write, we launch the Schema dialog, which we can do by pressing Edit - Schema or by using the keyboard shortcut Ctrl-E.

 

dlg_web_servers_csv01_08.png

 

btn_schema_add_identity.png  We press the Add Identity button.  

 

dlg_web_servers_csv01_09.png

 

Manifold instantly adds a key, identity field and an index on that field, using the traditional mfd_id and mfd_id_x names.   We press Save Changes.

 

dlg_web_servers_csv01_10.png

 

The table's background for fields changes to white to indicate it is now read/write.  It is now fully editable and selectable.   The new mfd_id field appears as well.  We can hide that if we like using the Layers pane.

Example: Connect to a CSV File through the Web

It is not clear if the World Bank serves CSV data that is generated on the fly, or if the URL we used connects to a file maintained on their server.   A very common use of CSV servers is to access data published as CSV files and placed on web server for each access by anyone.    

 

For example, if we have a website called www.manifold.net and we want to make a CSV file available to Manifold users through the CSV server dataport, all we need do is to place that file somewhere on our site where it can be accessed with a URL.   Suppose we have a file called cathedrals.csv that provides a list of French gothic cathedrals.   We could put it into a folder in our site called /files and the URL to connect to it would be:

 

http://www.manifold.net/files/cathedrals.csv

 

We can try that now (the link above is a live link).  We choose File - Create - New Data Source:

 

dlg_web_servers_csv01_11.png

 

We enter French Gothic Cathedrals as the Name.  We choose Web Server: csvserver as the Type.   In the Source box we enter the URL connection string:

 

http://www.manifold.net/files/cathedrals.csv

 

Leaving the other options at their default settings, we press Create Data Source.

 

dlg_web_servers_csv01_12.png

 

A new data source appears in the project pane.  We expand it to see it contains a table called cathedrals.  

 

dlg_web_servers_csv01_13.png

 

We open the table and see it contains read-only data (from the gray background of fields) giving a list of cathedrals, the height of the nave in meters for each, some comments, and the latitude and longitude location of each.    The table is read-only because web servers publish data one-way, and we do not have rights to change data within the manifold.net server.

 

If there are any changes in the CSV file placed on the manifold.net website, for example, if we add additional cathedrals to the list, the cathedrals table will be updated when we refresh it.   

Using CSV Server Data in a Query

To use the linked cathedrals table within our project, we could copy and paste it as a local copy, as in the prior example, or we can use it from a query.  

 

We right-click into the .map portion of the project, that is, outside of any of the data sources, and we choose Create - New Query.  

 

dlg_web_servers_csv01_14.png

 

We name the new query cathedrals.  It appears in the Project pane as seen above.   We double-click the query to open it and we enter the following SQL:

 

 

TABLE CALL TableCacheIndexGeoms((

  SELECT [Name], [Nave_meters], [Comments], [Latitude], [Longitude],

    GeomMakePoint(VectorMakeX2([Longitude], [Latitude])) AS [Geom]

  FROM [French Gothic Cathedrals]::[cathedrals]

), TRUE);

 

This is an adaptation of the query used in the Example: Create a Drawing Dynamically from a Geocoded Table topic, copied and pasted into the Command Window we use to edit the query (not illustrated), with the names of the table and fields adjusted to the CSV server table we are using.

 

dlg_web_servers_csv01_15.png

 

We can now right-click onto the cathedrals query in the Project pane and choose Create - New Drawing.

 

dlg_web_servers_csv01_16.png

 

We choose cathedrals Drawing as the name for our new drawing.  

 

btn_coord_sys_picker.png  We click the coordinate picker button to set the coordinate system to Latitude / Longitude, and then we press Create Drawing.

 

dlg_web_servers_csv01_17.png

 

A new drawing appears in our project.

 

dlg_web_servers_csv01_18.png

 

We can create a new map, add a few web server image server sources for background, a Bing satellite layer and a Yandex "skeletal" layer (contributed by a Manifold user in a post on the georeference.org forum) showing labels and major roads with transparency in between, allowing the satellite layer to show through.  

 

We drag and drop our cathedrals Drawing into the map, using Style to show each cathedral location as a star glyph, rotated slightly for better visual energy, with a drop shadow in black.  We color the fill color for the points using a thematic format to color them by nave height, using the Color Brewer Spectral palette, with blue being the lowest nave and red the highest nave.

 

In the above display, if the CSV file on the Manifold web site changes, when the data source refreshes, the CSV data pulled by the CSV server dataport will change and the results generated by the query will change, to add, delete, or move points in the drawing based on how the data in the CSV file on the Manifold server changes.

 

That much is very useful, but for fuller access to the data, such as clicking on one of the points and seeing the attributes, such as the comments field, we need to do more than such a simple query.   The usual approach with CSV data to which we want fuller access is to use a query which copies the data from a CSV server table into a local table in the .map project, with the query adding a key field and index.   That makes the local table fully read/write and selectable, so all the usual features like selection, zoom to selection, alt-clicking a point to launch it in the Record pane so we can see attributes, and so on, all will work.

Example: Link to New Zealand Statistics Data

Many organizations publish data as CSV files.   

 

In search of thrilling statistics, we visit the ever-popular stats.gov.nz site to obtain statistics in CSV file format, at

 

https://www.stats.govt.nz/large-datasets/csv-files-for-download/

 

dlg_web_servers_csv01_19.png

 

The site lists numerous data sets that are available for free download as CSV files.   

 

We are using the Opera browser, so we can take advantage of automatic VPN anonymity (just say No! to geo-tracking) and automatic ad-blocking within the browser itself.  

 

dlg_web_servers_csv01_20.png

 

We skip the boring economic statistics and scroll down to a more interesting subject, statistics on alcohol available for consumption.  We right-click on the link and choose Copy link address.  

 

dlg_web_servers_csv01_21.png

 

In Manifold, we choose File - Create - New Data Source.  We provide a sensible Name and then we Paste the URL we copied into the Source box.  We press Create Data Source.

 

dlg_web_servers_csv01_22.png

 

A new data source appears in our project.  We expand it, and, all the way from New Zealand, we see a table appear.

 

dlg_web_servers_csv01_23.png

 

We double-click the table to open it, and we see it contains statistics on alcohol available for consumption.   A world of data at our fingertips!

Notes

Dates - Different languages have different ways for expression dates.  When recognizing date values as dates, Manifold is guided by whatever languages are specified in the Help - About dialog in the Lang entry, the presumption being that the date formats in use are those employed by whatever language is in use.    Reading a CSV file allows date fields to use multiple different languages.  For example, one field can use US English while another field can use German.

See Also

Tables

 

Data Types

 

File - Create - New Data Source

 

CSV, ASC, PSV, TAB

 

Example: Import CSV and Create a Drawing - Import a CSV file by creating a New Data Source, copying and pasting the table conveyed by the CSV, prepping the table for use as a geocoded table and then creating a drawing.  

 

Example: Create a Drawing Dynamically from a Geocoded Table - A more general, cooler way to create a drawing from a geocoded table, without adding any fields to the table.   We create a small query that generates geometry on the fly, and then we create a drawing from the query.   This is how more advanced users often do it, using a technique that is perfect for creating drawings from tables in remote DBMS packages or read-only files.