About Coordinate Systems

The term projection is often used as a synonym for coordinate system.   To assign or to change a projection / coordinate system, please see the Projections topic.   This topic provides background information on coordinate systems.

 

When using spatial data we need to know where that data is located.  A coordinate system is an organized way to use numbers to describe a location.   All spatial data we use will have some numbers attached to that data which specify where the data is located based on the coordinate system that data uses.  In Manifold, all spatial data is stored in a table and the coordinate system utilized to make sense of the spatial data in a table is specified in the table's properties.

 

While there are many ways of using numbers to describe a location, one simple way we learned in school  is to use two numbers, an X number and a Y number, for each location.    

 

If we plot a curved line on a piece of graph paper using X numbers to specify left or right distances and Y numbers to specify up or down distances we are using an X,Y coordinate system.  The X and the Y values are called coordinates.

 

i_xy_coords.png

 

We can add a third number, Z, to specify a height above or below the X,Y plane, making an X,Y,Z coordinate system.

 

i_xyz_coords.png

 

If we want to do more than draw on a sheet of paper or in abstract three dimensional space we will need a way to assign more meaning to what the X, Y and possibly Z numbers are supposed to mean in a particular case, for example, how they might be used to specify a location on the Earth's surface.   

 

For example, a specific coordinate system such as the Mercator coordinate system uses X and Y coordinates but it does so with the understanding that a specific set of equations and parameters specify how those X and Y coordinates relate in a 1 to 1 way to Longitude and Latitude locations on Earth.
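
As a rough illustration of the kind of equations involved, the sketch below uses the simple spherical form of the Mercator equations to convert longitude and latitude into X and Y and back again. The function names and the sphere radius are assumptions made for this example only; it is not how Manifold or any particular package computes Mercator.

```python
import math

# Assumed sphere radius in meters, chosen only for illustration.
RADIUS = 6378137.0


def mercator_forward(lon_deg, lat_deg, radius=RADIUS):
    """Longitude/latitude in degrees -> Mercator X, Y in meters (spherical form)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def mercator_inverse(x, y, radius=RADIUS):
    """Mercator X, Y in meters -> longitude/latitude in degrees (spherical form)."""
    lon = math.degrees(x / radius)
    lat = math.degrees(2 * math.atan(math.exp(y / radius)) - math.pi / 2)
    return lon, lat


# Round trip for a sample location: each X,Y pair corresponds to one lon,lat pair.
x, y = mercator_forward(-83.0, 40.0)
lon, lat = mercator_inverse(x, y)
print(x, y)
print(lon, lat)
```

Running a location forward and then back returns the starting longitude and latitude to within floating point error, which is the 1 to 1 relationship described above.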

 

When we use the term coordinate system in Manifold we mean the use of coordinates together with the definition of how that coordinate system relates to locations on the Earth.    That definition is usually based on equations and parameters, but it sometimes may be based on tables.   

 

A synonym for coordinate system when used to describe locations on Earth is projection.   The term projection is the classical name used by geographers and cartographers.   The term coordinate system as applied to spatial data tends to be the name used by programmers.

 

Manifold uses both the terms coordinate system as well as projection as interchangeable synonyms but tends to use projection in the context of discussing cartographic matters and tends to use coordinate system when discussing programming or the technology of manipulating spatial data using tools such as SQL.   See the discussion in the Projections Tutorial topic to learn more about what projections are and why they are used.

 

Defining Coordinate Systems / Projections

 

Before computers the usual way to define a new coordinate system was to publish a paper that set forth the equations and parameters or other means which defined how locations on the Earth's surface were represented by the X, Y and Z numbers in the coordinate system and to give a name to that coordinate system, such as Lambert Conformal Conic Projection.  

 

The use of text names to identify projections has caused endless troubles when spatial data is stored and exchanged because plain text names do not by themselves convey all of the different options and parameters that must be specified for even a simple coordinate system such as the Lambert Conformal Conic.   In addition to the name of the coordinate system all options and parameters must be exactly and accurately specified as well.

 

If we have two different data sets, say, one showing roads in Ohio and the other showing rivers and lakes in Ohio, simply saying that both data sets use "the Lambert Conformal Conic coordinate system" is not enough to assure that whatever computer software we are using can interpret the data accurately within the intended coordinate system so that, say, the roads and the rivers line up and we do not see a highway interchange in the middle of a lake.  We must also specify which datum or base was used, what standard parallels were used and other options and parameters.
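
The effect of unstated parameters can be sketched with the spherical form of the Lambert Conformal Conic equations. The example below projects the same longitude and latitude twice, changing only the standard parallels. The function name, the sphere radius, the sample location and both sets of parallels are illustrative assumptions, not values taken from any real data set.

```python
import math


def lcc_forward(lon_deg, lat_deg, lat1_deg, lat2_deg, lat0_deg, lon0_deg,
                radius=6371000.0):
    """Spherical Lambert Conformal Conic with two distinct standard parallels.

    lat1/lat2 are the standard parallels, lat0/lon0 the projection origin.
    Returns X, Y in the same units as radius.
    """
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    phi0, lam0 = math.radians(lat0_deg), math.radians(lon0_deg)

    def t(p):
        return math.tan(math.pi / 4 + p / 2)

    # Cone constant and scaling factor derived from the standard parallels.
    n = math.log(math.cos(phi1) / math.cos(phi2)) / math.log(t(phi2) / t(phi1))
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = radius * f / t(phi) ** n
    rho0 = radius * f / t(phi0) ** n
    theta = n * (lam - lam0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


# Same location, same projection name, same origin: only the parallels differ.
point = (-83.0, 40.0)
a = lcc_forward(*point, lat1_deg=38.7, lat2_deg=40.0, lat0_deg=38.0, lon0_deg=-82.5)
b = lcc_forward(*point, lat1_deg=33.0, lat2_deg=45.0, lat0_deg=38.0, lon0_deg=-82.5)
print(a)
print(b)
print(math.dist(a, b))  # offset in meters caused only by the different parallels
```

Both results are legitimately "Lambert Conformal Conic" coordinates for the same place, yet they do not match, which is why the name alone cannot make two data sets line up.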

 

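There are three main tasks involved in ensuring that spatial data sets can be published and exchanged between different users and different software without fatal confusion over coordinate systems:

Standard names - Coordinate systems must be named in a clear, unambiguous way that captures all options and parameters.

Smart file formats - The formats used to store and exchange spatial data must convey which coordinate system the data uses.

Accurate calculations - Software must compute coordinate systems, and the transformations between them, accurately.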

None of the above tasks can be taken for granted.   Much of the world's spatial data uses ambiguous names, the world's most widely used format standard for storing spatial data does not include any way to tag the file with the coordinate system intended and computer software that claims to work with spatial data varies wildly in its ability to implement accurate calculations for coordinate systems.  

 

The result is that much of the effort spent in working with spatial data goes into figuring out which coordinate system was used, trying to determine missing parameters and other housekeeping matters that would, if all three of the above tasks were fulfilled, take no time at all.   Most high end systems for working with spatial data have facilities for dealing with such problems, as does Manifold.  One of Manifold's strategies to make life easier for spatial data users is to recognize virtually all of the various systems used to name coordinate systems or to convey which coordinate system is used for a given data set.

 

Standard Names for Coordinate Systems

 

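Standard names for coordinate systems usually are defined within some standard collection of coordinate systems or projections:

Definitions published by national cartographic organizations, such as USGS in the US or IGN in France.

Definitions published by EPSG, which identifies each coordinate system with a numeric code.

Definitions published by other groups, such as OGC.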

Definitions published by National Cartographic Organizations

 

Text names for coordinate systems, such as Lambert Conformal Conic, tend to be defined in a more or less standard way in each country by the national cartographic bureaucracy for that country, for example, USGS in the US or IGN in France.  Most often these are government organizations but sometimes they are non-government organizations.  

 

Given that GIS (Geographic Information Systems) software was first developed in the US and that the US software industry has had great influence in the world, the text names employed by USGS and set forth in a series of technical papers by USGS authors such as John Parr Snyder have had great influence throughout the world on establishing names for coordinate systems in English.  

 

For text names of well-known coordinate systems Manifold uses the names and associated definitions as set forth by USGS publications.   When coordinate systems are used within specific countries and names for those coordinate systems have been published by the national cartographic bureau, Manifold uses those names as defined by the relevant national cartographic bureau.   These text names appear in the long list of coordinate system names in the Standard tab of the Coordinate System dialog.

 

Definitions published by EPSG

 

An EPSG code is a numeric code, such as EPSG:3857, which precisely specifies a single coordinate system and any related transformations or algorithms from among the thousands in the EPSG database.   The EPSG database also includes a text name for each code.

 

The European Petroleum Survey Group or EPSG was a scientific organization of specialists in the European petroleum industry that compiled a standard database of coordinate systems and related information to facilitate technical work in oil and gas exploration.  EPSG was absorbed into the International Association of Oil & Gas Producers or IOGP in 2005 but the database continues to be maintained and known as the EPSG database.

 

EPSG codes are by far the most comprehensive and least ambiguous way of specifying a given coordinate system.  The technical accuracy and quality in EPSG is also far and away the best of any of the standard means of specifying coordinate systems and the precise transformations between them to be used.  The quality of technical work in EPSG is simply brilliant, easily the best overall throughout the world even considering the high standards set by the world's best national cartographic agencies such as the US's USGS and France's IGN.

 

Including the many thousands of EPSG codes, and thus coordinate systems, in the EPSG database is a daunting task for any software.   Even so, because EPSG is so precise and comprehensive it has emerged as the gold standard for defining coordinate systems.    The EPSG database may be downloaded at no charge.

 

The EPSG tab in the Coordinate System dialog in Manifold allows us to specify a coordinate system using an EPSG code.  Manifold supports all EPSG codes, even those which have been deprecated or replaced by other codes.  For example, EPSG:26747 has been replaced by EPSG:26799.
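
The idea of keeping deprecated codes usable while recording what replaced them can be sketched as a small lookup. The miniature registry below is hypothetical and holds only a handful of entries; the function names are invented for this example, and the real EPSG database is vastly larger and richer.

```python
# Hypothetical miniature registry, for illustration only.
NAMES = {
    4326: "WGS 84",
    3857: "WGS 84 / Pseudo-Mercator",
}
# Deprecated code -> the code that replaced it (the example given in the text).
REPLACED_BY = {26747: 26799}
KNOWN = set(NAMES) | set(REPLACED_BY) | set(REPLACED_BY.values())


def parse_epsg(text):
    """Turn a string such as 'EPSG:3857' into the integer 3857."""
    authority, _, code = text.strip().partition(":")
    if authority.upper() != "EPSG" or not code.isdigit():
        raise ValueError(f"not an EPSG code: {text!r}")
    return int(code)


def describe(text):
    """Look up a code, keeping deprecated codes usable but flagged."""
    code = parse_epsg(text)
    if code not in KNOWN:
        raise KeyError(f"EPSG:{code} is not in this miniature registry")
    return {
        "code": code,
        "name": NAMES.get(code),  # None where this sketch carries no name
        "deprecated": code in REPLACED_BY,
        "replaced_by": REPLACED_BY.get(code),
    }


print(describe("EPSG:3857"))
print(describe("EPSG:26747"))
```

The deprecated code still resolves, so older data that cites it remains readable, while the record makes the replacement visible to the user.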

 

Unfortunately, not all software uses EPSG codes correctly.   Some software claims to use EPSG codes but then ignores key parts of the standard.   See the That YX Thing essay for the canonical rant on the most common misuse of EPSG.

 

Definitions published by other groups

 

OGC or the Open Geospatial Consortium (formerly known as the Open GIS Consortium), is a self-appointed "standards organization" that sells copies of its coordinate system and other specifications through ISO for a fee.  Most OGC coordinate system names are simply OGC versions of an EPSG code, but some are legacy coordinate systems implemented by OGC before they acknowledged reality and began using EPSG like everyone else.

 

OGC identifies a coordinate system with a Spatial Reference System Identifier or SRID.  The OGC SRID system utilizes "well known text" or WKT strings written in OGC's lengthy syntax.  A WKT string either describes one of the coordinate reference systems, or CRSs, which OGC implemented before adopting EPSG or, as is the case for the majority of SRIDs, defines the SRID by citing an EPSG code.

 

OGC's SRID 2029, for example, is simply another name for EPSG:2029 written in bureaucratic OGC text that names EPSG:2029 together with a recital of the EPSG components used in EPSG:2029 but phrased in the peculiarly inefficient form OGC prefers, taking 785 characters to convey the key information the EPSG code conveys in four characters.    Therefore, for a precise understanding of what the SRID 2029 is supposed to be, any software which uses it must know exactly what EPSG:2029 is supposed to be.  Manifold, of course, knows exactly what this and all other EPSG codes are supposed to be.

 

When connecting to an OGC-influenced resource such as a web server providing data using OGC specifications, software must parse the OGC text shoved at it to extract either the legacy OGC coordinate system or the more modern EPSG code that defines the coordinate system intended.   Manifold understands all such OGC codes either in the various legacy OGC forms or as EPSG.

 

Manifold also provides SQL compatibility with OGC WKT specifications of coordinate systems by providing functions and SQL that parse OGC WKT to extract the coordinate system intended.  In particular Manifold can parse OGC style strings that specify coordinate systems such as AUTO:xxx,..., AUTO2:xxx,.. and CRS:xxx  as well as urn:ogc:def:crs:... OGC strings with variants for EPSG, CRS and AUTOxxx.  This allows easy extraction of OGC coordinate systems for conversion into efficient forms.
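
To give a feel for what parsing such strings involves, here is a minimal sketch that splits the short forms and the urn:ogc:def:crs form into an authority, a code and any trailing parameters. It is not Manifold's implementation and does not use Manifold's SQL functions; the function name, the regular expressions and the sample parameter values are assumptions made for this example.

```python
import re

# urn:ogc:def:crs:<authority>:<version, often empty>:<code>
URN_RE = re.compile(
    r"^urn:ogc:def:crs:(?P<auth>[^:]+):(?P<ver>[^:]*):(?P<code>[^:]+)$",
    re.IGNORECASE,
)
# Short forms such as EPSG:3857, CRS:84 or AUTO2:<code>,<parameters...>
SHORT_RE = re.compile(
    r"^(?P<auth>EPSG|CRS|AUTO2?):(?P<code>\w+)(?:,(?P<params>.*))?$",
    re.IGNORECASE,
)


def parse_crs_string(text):
    """Return (authority, code, parameters) for an OGC style CRS string."""
    s = text.strip()
    m = URN_RE.match(s)
    if m:
        return m.group("auth").upper(), m.group("code"), []
    m = SHORT_RE.match(s)
    if m:
        params = m.group("params")
        return (m.group("auth").upper(), m.group("code"),
                params.split(",") if params else [])
    raise ValueError(f"unrecognized CRS string: {text!r}")


for sample in ("EPSG:3857",
               "urn:ogc:def:crs:EPSG::4326",
               "CRS:84",
               "AUTO2:42001,1,-100,45"):  # parameter values are made up
    print(sample, "->", parse_crs_string(sample))
```

Once the authority and code are extracted, an EPSG code can be looked up directly, while the AUTO forms still need their parameters interpreted.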

 

Coordinate System Bases or Datums

 

When we use a coordinate system to specify a location on the Earth with XY or XYZ numbers we are using a projection that represents the not-quite-spherical surface of the Earth as a flat sheet.   See the Projections Tutorial for how and why that is done.

 

If the Earth were a perfect, unchanging sphere we could precisely specify any location on Earth using the longitude and latitude numbers for that location.   We could calculate the corresponding X, Y numbers for that location in a projection by just taking the longitude and latitude numbers and grinding them through the algebraic equations that define the projection.   Simple!

 

But the Earth is not a perfect sphere.  At best, the Earth is a somewhat lumpy, irregular ellipsoid that changes shape over time as a result of plate tectonics and other geologic processes.   For example, the surface in parts of North America is rising as the land continues to rebound upwards in recovery from the last Ice Age, no longer pressed down by the weight of massive glaciers.

 

As illustrated in the Earth as an Ellipsoid topic, neither a sphere nor even a simple ellipsoid will match the Earth's surface in all locations.   We can approximate the surface using a sphere or an ellipsoid that's approximately the right size for most of the Earth, and then we can nudge that approximate surface so that it fits more closely in the region we would like to map, at the cost of a worse fit elsewhere.   But exactly which longitude and latitude numbers specify some real location on Earth depends on what sphere or ellipsoid we have used and how we have placed it relative to the uneven surface of the real Earth.   

 

Depending on what shape we have used as reference to read off longitude and latitude numbers the very same numbers can specify very different locations on Earth, possibly miles or kilometers away from each other.   Even very similar spheres can result in differences of hundreds of feet or meters for the same longitude or latitude coordinates.    It is as if we punched in a longitude and latitude location into our car's GPS navigation system and depending on how the unit was set the exact same numbers could put us at the location desired or in the middle of a nearby lake.

 

Therefore, for numbers in any coordinate system, including the apparently simple and unprojected longitude and latitude coordinate system,  to precisely specify a location we have to know what shape of the Earth was used in the definition of that coordinate system.   
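
A small calculation can make the size of the effect concrete. The sketch below takes one pair of longitude and latitude numbers, interprets it on two different reference ellipsoids and measures how far apart the resulting points are in space. It assumes both ellipsoids share the same center, which real datums generally do not, so it shows only the part of the discrepancy that comes from the ellipsoid's shape. The function name and the sample location are assumptions made for this example.

```python
import math

# Semi-major axis in meters and flattening for two well-known ellipsoids.
ELLIPSOIDS = {
    "WGS 84": (6378137.0, 1 / 298.257223563),
    "Clarke 1866": (6378206.4, 1 / 294.9786982),
}


def geodetic_to_ecef(lon_deg, lat_deg, a, f, h=0.0):
    """Longitude/latitude on an ellipsoid -> Earth-centered X, Y, Z in meters."""
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    e2 = f * (2 - f)  # first eccentricity squared
    n = a / math.sqrt(1 - e2 * math.sin(phi) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(phi) * math.cos(lam)
    y = (n + h) * math.cos(phi) * math.sin(lam)
    z = (n * (1 - e2) + h) * math.sin(phi)
    return x, y, z


# The very same numbers, read against two different shapes of the Earth.
lon, lat = -83.0, 40.0
p_wgs = geodetic_to_ecef(lon, lat, *ELLIPSOIDS["WGS 84"])
p_clarke = geodetic_to_ecef(lon, lat, *ELLIPSOIDS["Clarke 1866"])
separation = math.dist(p_wgs, p_clarke)
print(f"separation: {separation:.1f} m")
```

For this sample location the two points end up a few hundred meters apart, even before accounting for how each datum positions its ellipsoid relative to the Earth.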

 

Simplifying somewhat, the exact shape of the Earth used is called the datum or the coordinate system base.   Manifold refers to it in dialogs as the base and in documentation uses both terms, datum and base.    Datum is the classic cartographic term, so when the documentation discusses projections using materials derived from classic texts on geography and cartography the word datum tends to be used.   Base tends to be used in dialogs and in programming since that is the more modern computational style.

 

Text-based ways of naming coordinate systems usually name the projection, as in Lambert Conformal Conic, and then also give a list of optional parameters, one of which is the name of the datum used, also cited with a text name such as WGS84.  That is fine in theory but in practice users would often remember to cite the name of the projection but forget to cite the critically important parameters such as which datum was used.  If we do not know whether the WGS84 datum was used or the NAD27 datum we do not know exactly where the data is located give or take a few hundred feet or meters.

 

Such errors are avoided by using EPSG codes to specify coordinate systems.  The definition of each EPSG code includes the base that is used, so specifying a coordinate system by EPSG code automatically specifies the correct base.  

 

Smart File Formats

 

One of the biggest problems encountered in real life with spatial data is that most of the world's spatial data has been published in formats that do not completely and unambiguously specify the coordinate system used for that data.   Instead, much of the world's data has been published in formats which rely upon some external means, such as oral gossip or accompanying text files, to tell users what coordinate system was used for the data.

 

If we get data in a file format that does not convey all details of what coordinate system is to be used for that data we will have to search around for whatever source of external information might tell us.  That might be some accessory page on the Internet site from which we downloaded the data or it might be some accompanying file, often called a metadata file that describes the coordinate system which is supposed to be used.  

 

If we cannot find such accompanying data we might have to utilize expert methods to guess what coordinate system was used or to cast the data using expert technical means from whatever unknown coordinate system was used into a known coordinate system. Sometimes that is possible and sometimes it is not.  None of that is for the faint of heart or for the inexpert user, and at times even heroic efforts cannot make sense of a mystery data set.  Chalk that up to the slacker idiocy of publishing spatial data in dumb formats without bothering to convey what coordinate system was used.

 

The way to avoid such problems is to publish and to utilize spatial data in smart file formats that as part of the format completely and unambiguously specify the coordinate system used for the data in all necessary details.    When we utilize spatial data in such formats Manifold knows from the format itself exactly which coordinate system was used and the data can be imported or linked with precise accuracy.

 

One problem we might encounter is spatial data provided to us by inexpert users who think that the format they are using is a smart format when in fact the format fails to convey full information on the coordinate system used.   The classic example is the use of shapefiles to exchange data, which despite the great antiquity and profound limitations of the format have become one of the world's most popular formats for sharing spatial data.   

 

See the Shapefiles Strangely Out of Shape section of the Essays section for the canonical rant on shapefiles. For the purposes of this coordinate systems topic, the main problem with shapefiles is that the written shapefile standard makes no provision at all for specifying the coordinate system used.   Even in the case of shapefiles containing "unprojected" data as longitude and latitude numbers, there is no specification of which datum was used.   

 

To get around this limitation a variety of hacks have been suggested by shapefile users over the many decades since shapefiles were first introduced.  Most depend upon use of an accessory file with a .prj extension that conveys the name of the coordinate system used.   The problem with those is that there is no single standard for .prj files, for example, no single standard that provides a canonical list of coordinate system names that may be used together with all relevant parameters for each.    If the same GIS package is used to save and to read shapefiles using such hacks a .prj can work well because whatever undocumented quirks are used by the package will at least be consistent (usually) within that package's own use. But other software won't know those undocumented quirks.

 

Experts therefore usually consider .prj files reliable only when saving data from a particular GIS software package for later use by that same package and not as a reliable way of specifying coordinate systems for exchanging spatial data between different users and different software packages. Slackers, of course, put blind faith into .prj files as solving all problems with coordinate system stuff they haven't gotten around to learning.

 

If we get spatial data from a smart format Manifold will read every bit of information on coordinate systems to automatically correctly place the spatial data.    If we read spatial data from a dumb format Manifold will allow us to manually specify needed information on coordinate systems so we can use the spatial data.

 

If we read spatial data from a dumb format like shapefiles which have been made quasi-smart with an accompanying .prj file, "world" file or other hack, Manifold will apply the usual conventions a street-smart user would, for example, assuming that the most frequently-occurring ESRI-style .prj was used, to extract automatically as much information as possible on the intended coordinate system.  That usually will be a good start that can eliminate or reduce the need for manually entering information.

 

Whenever Manifold exports spatial data it will write coordinate system information out to the fullest capacity of whatever format is being used.   If the format to which the data is exported is a smart format that fully captures coordinate system information, Manifold will ensure that perfect and complete coordinate system information is correctly written along with the data.  

 

In addition, for all formats Manifold will also, just in case, write out an accompanying .MAPMETA file that contains coordinate system information using JSON (JavaScript Object Notation) format.   If we have no choice but to write spatial data to a dumb file format at least we will have an accompanying file we can send along with that data to tell the next user with expert precision what coordinate system was used.

 

Accurate Calculations and Accurate Use

 

The third task required of any software working with spatial data is to make accurate calculations when transforming that data.    That cannot be taken for granted as many coordinate systems can be computed to various levels of accuracy, some transformations can be computed in optional ways and not all software will use the most accurate algorithms or methods.    The use of different levels of accuracy in transformations between coordinate systems can cause misalignments of data that were transformed by different software systems.  Another cause of misalignments is when software systems do not utilize the transformations that a coordinate system standard, such as EPSG, specifies for a given coordinate system.   If one system uses the correct transformations and another does not the results may not be the same.
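
As one hedged illustration of how the choice of formula matters, the sketch below computes the Mercator Y value for the same latitude twice, once with the spherical equation and once with the ellipsoidal equation. The constants are the WGS 84 semi-major axis and flattening; the function names are assumptions made for this example, and the sketch makes no claim about which formula any particular software package uses.

```python
import math

A = 6378137.0              # WGS 84 semi-major axis in meters
F = 1 / 298.257223563      # WGS 84 flattening


def mercator_y_sphere(lat_deg, radius=A):
    """Mercator Y treating the Earth as a sphere of the given radius."""
    phi = math.radians(lat_deg)
    return radius * math.log(math.tan(math.pi / 4 + phi / 2))


def mercator_y_ellipsoid(lat_deg, a=A, f=F):
    """Mercator Y using the ellipsoidal form of the equation."""
    phi = math.radians(lat_deg)
    e = math.sqrt(f * (2 - f))  # first eccentricity
    s = e * math.sin(phi)
    return a * math.log(
        math.tan(math.pi / 4 + phi / 2) * ((1 - s) / (1 + s)) ** (e / 2)
    )


lat = 45.0
y_sphere = mercator_y_sphere(lat)
y_ellipsoid = mercator_y_ellipsoid(lat)
print(f"spherical:   {y_sphere:.1f} m")
print(f"ellipsoidal: {y_ellipsoid:.1f} m")
print(f"difference:  {y_sphere - y_ellipsoid:.1f} m")
```

At this latitude the two results differ by tens of kilometers, so data converted with one formula will not align with data converted with the other, even though both conversions could be described as "Mercator".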

 

A third cause of misalignments is the use of highly specific, regional models for the Earth's surface in given locations, usually called grids, that not all software may use.   Grids are used in some provinces, states, regions or even cities of certain countries and are not always easy to obtain.  Many grids are licensed intellectual property of the regions or cities which require their use for official cartography and thus can be difficult to redistribute in an "open" manner.

 

Manifold knows and will use the highest accuracy algorithms in all cases and will also use high accuracy grids in various parts of the world where they are used.   That may result in slight misalignments of data as compared to data transformed by software which does not use high accuracy algorithms or which fails to use regional grids.

 

Notes

Coordinate systems are about manifolds - While there are many formal ways of defining the term coordinate system (Wikipedia's current entry thoughtfully uses the term manifold in its definition) the simplest way to understand the notion is as a way of using numbers to specify a location.

 

Verbose synonyms - Some programs refer to coordinate systems as coordinate reference systems or CRS or even as a spatial reference system or SRS.    Those are simply more verbose ways of referring to a coordinate system.    

 

Missing Grids - Manifold includes virtually all known enhanced accuracy, local grids.   Of the 100+ such grids used around the world as of this writing only two are not included in Manifold, the grid files for Bavaria and for Brandenburg, both in Germany and both available only by license from the local cartographic bureaus.  The Bavarian file is NTv2-Ba.gsb used to convert between the DHDN90 and ETRS89 bases, and the Brandenburg file is NTv2-BB.gsb used to convert between S42/83 and  ETRS89 bases.  

 

Users who are willing to agree to the licenses can obtain these grid files from the relevant bureau (the Landesamt für Digitalisierung, Breitband und Vermessung in Bavaria or the Landesvermessung und Geobasisinformation Brandenburg for Brandenburg) and utilize them within Manifold.

 

Manifold also provides grid files for Canadian provinces which in the past were published for free download and unrestricted use without a license required but as of this writing are provided for some provinces only under license.  The grid files provided by Manifold are identical to the current versions but were obtained prior to the licensing restriction and thus are included within Manifold.

 

Wrong datum / base - The slacker tendency to just pick a datum without knowing or bothering to care which datum was used is one reason why so many spatial data sets do not quite align when used together.   The data may be perfectly accurate but using the wrong base might put a highway from one data set into the middle of a river from a different data set instead of running next to the river.

 

The related slacker tendency to think "Hey, if I don't know it is wrong it must be good..." helps propagate such errors: if our slacker friend is working with a roads layer and a waterways layer, both of which were originally created using an NAD27 base but our friend cluelessly uses WGS84 as the base for both layers, then the two layers will align perfectly for our friend, since they are both wrong in the same way.   If our slacker friend then publishes the layers for download on the Internet earnestly citing the base used as WGS84, our friend will contribute to the world's supply of bogus Internet data: anyone else who uses that data together with accurately specified data layers will see that our friend's layers are wrong.

 

World files - Yet another hack to provide coordinate system information for spatial data in dumb file formats is the use of so-called world files.   A world file accompanies a format such as JPG or other dumb format and purports to give coordinate system information.  Like the use of .prj files with shapefiles, there is no single, unambiguous standard for world files nor do they always provide all information required.   Manifold will read any accompanying world files and will extract all information possible utilizing the most frequently occurring conventions on how to make sense of world files.
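
For readers who have not seen one, a world file in its most common form is just six numbers, one per line, that define an affine transformation from pixel positions to X,Y coordinates. The sketch below reads that common six-line layout and applies it; the function names and the sample values are assumptions made for this example, and other conventions exist.

```python
def parse_world_file(text):
    """Read the common six-line world file layout.

    The lines are, in order: A, D, B, E, C, F where
        x = A * column + B * row + C
        y = D * column + E * row + F
    """
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("expected exactly six numbers")
    a, d, b, e, c, f = values
    return a, b, c, d, e, f


def pixel_to_world(column, row, params):
    """Apply the affine transformation to a pixel position."""
    a, b, c, d, e, f = params
    return a * column + b * row + c, d * column + e * row + f


# Made-up sample: 10 unit pixels, no rotation, Y decreasing down the image.
sample = "10.0\n0.0\n0.0\n-10.0\n500000.0\n4500000.0\n"
params = parse_world_file(sample)
print(pixel_to_world(0, 0, params))
print(pixel_to_world(2, 3, params))
```

Nothing in those six numbers names a projection, a datum or even the units, which is why a world file alone cannot say which coordinate system the resulting X,Y values belong to.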

 

 

img_descartes.png

 

René Descartes (1596 - 1650)

 

The use of X, Y and Z coordinates to plot locations was invented by René Descartes and is known as the Cartesian coordinate system in his honor.   By developing a numeric way of representing locations Descartes invented the link between geometry and algebra and revolutionized mathematics with the introduction of analytic geometry.   Virtually all geometric and display work in computer science utilizes the Cartesian coordinate system.

 

Born in 1596 in the house which still stands at 29 rue Descartes in the town now called Descartes, located on the river Creuse to the south of Tours, in France, Descartes was originally educated as a lawyer.   He left France for the Dutch Republic to pursue his desire to become a professional military officer, pursuant to which he studied military engineering and expanded his knowledge of mathematics.

 

As famous for his work in philosophy ("I think therefore I am.") as for his mathematical and scientific work, Descartes lived most of his adult life in the Netherlands.   He left for Sweden the year before he died in 1650 at age 53, apparently of pneumonia.   A Catholic who often lived among Protestants and a controversial philosopher to boot,  Descartes may have been assassinated with poison.   

 

Descartes had become internationally famous during his lifetime but the controversies surrounding his philosophy and religion assured he had many enemies.   After his death his works were prohibited by the Pope and later lectures on his philosophy were banned by the French king.   Even the portrait above was vandalized by someone who slashed it.

 

The most accurate portrait we have of Descartes is the small study shown above.  It was most likely done from life by Frans Hals and is probably a preparatory sketch in oil for a larger work, now lost, that subsequently was much copied by other artists after the death of Descartes.  The small portrait hangs in the National Gallery of Denmark (Statens Museum for Kunst) in Copenhagen.  The famous portrait of Descartes in the Louvre, once attributed to Hals, is now thought by most experts to have been executed by some other artist who copied the lost, larger portrait by Hals.  The Louvre portrait lacks the vitality seen above that is so characteristic of works by Hals.

 

After his death in Sweden the remains of Descartes were returned to France in 1666.  Missing some bones taken as souvenirs by admirers of Descartes, the remains were reburied in Paris in the church of Saint-Étienne-du-Mont within a copper casket.

 

During the French Revolution what were said to be his remains were moved again to save them from revolutionaries bent on despoiling the church.  In the rush of relocation a decayed wooden coffin was removed from Saint-Étienne-du-Mont and hidden within an Egyptian sarcophagus in a museum, even though the wooden coffin did not match either the location or the description of the copper coffin that was known to have been used to inter the remains of Descartes within Saint-Étienne-du-Mont.   It is possible that in the chaos of the revolution the wrong remains were taken from the church.

 

In 1819 the sarcophagus was opened and what few, powdery bone fragments were left, missing a skull and barely recognizable as bone, were taken from the sarcophagus and reburied in the ancient abbey of Saint-Germain-des-Prés in the Left Bank quarter of Paris.  

 

At the present writing there are five skulls surviving in various collections and museums that have been claimed to be the skull of Descartes.  The skull at the Musée de l'Homme in Paris appears to have the best odds of being the genuine skull.  The convoluted history of that skull is supported by multiple, consistent sources of documentation going back to the theft of the skull by a Swedish soldier hired to guard the transfer of remains from Sweden to France in 1666.   Forensic reconstruction shows that the Musée de l'Homme skull is a good match to facial features and dimensions shown in portraits of Descartes.   The Musée de l'Homme keeps the skull in storage and displays a cast.

 

See Also

Coordinates

 

Projections

 

Change Projection

 

Projections Tutorial

 

The Earth as an Ellipsoid

 

Shapefiles Strangely Out of Shape

 

That YX Thing

 

Accessory Files Created