GeoNames

Worldwide density of GeoNames entries in 2006

GeoNames (or GeoNames.org) is a user-editable geographical database available through various web services under a Creative Commons attribution license. The project was founded in late 2005.[1]

The GeoNames dataset is distinct from the US Government's similarly named GEOnet Names Server, although it includes data from that source.[2]

Database and web services

The GeoNames database contains over 25,000,000 geographical names corresponding to over 11,800,000 unique features.[3] All features are categorized into one of nine feature classes and further subcategorized into one of 645 feature codes. Beyond names of places in various languages, data stored include latitude, longitude, elevation, population, administrative subdivision and postal codes. All coordinates use the World Geodetic System 1984 (WGS84).

These data are accessible free of charge through a number of web services and a daily database export.[4]
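As an illustration of the fields listed above, the following minimal sketch parses one tab-separated row of the kind found in the daily export. The column order is an assumption based on the readme distributed with the export, the `Feature` type and `parse_dump` helper are hypothetical names introduced here, and the sample row is made up for illustration rather than copied from the database.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple, Optional

# Column order assumed from the readme shipped with the export (19 columns).
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


class Feature(NamedTuple):
    """Hypothetical record type holding a subset of the export columns."""
    geonameid: int
    name: str
    latitude: float   # WGS84 decimal degrees
    longitude: float  # WGS84 decimal degrees
    feature_class: str
    feature_code: str
    country_code: str
    population: int
    elevation: Optional[int]  # metres; often empty in the export


def parse_dump(lines: Iterable[str]) -> Iterator[Feature]:
    """Yield one Feature per tab-separated line."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        rec = dict(zip(COLUMNS, row))
        yield Feature(
            geonameid=int(rec["geonameid"]),
            name=rec["name"],
            latitude=float(rec["latitude"]),
            longitude=float(rec["longitude"]),
            feature_class=rec["feature_class"],
            feature_code=rec["feature_code"],
            country_code=rec["country_code"],
            population=int(rec["population"] or 0),
            elevation=int(rec["elevation"]) if rec["elevation"] else None,
        )


# Illustrative row only; the values are not guaranteed to match the database.
SAMPLE = (
    "2643743\tLondon\tLondon\tLondres,Londen\t51.50853\t-0.12574\tP\tPPLC\t"
    "GB\t\tENG\tGLA\t\t\t8961989\t\t25\tEurope/London\t2023-01-01\n"
)

features = list(parse_dump(io.StringIO(SAMPLE)))
print(features[0])
```

Keeping the empty elevation as `None` rather than 0 avoids confusing a missing value with a feature at sea level.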

Wiki interface

The core of the GeoNames database is provided by official public sources, the quality of which may vary. Through a wiki interface, users are invited to manually edit and improve the database by adding or correcting names, moving existing features, adding new features, etc.[5]

Semantic Web integration

Each GeoNames feature is represented as a web resource identified by a stable URI. This URI provides access, through content negotiation, either to the HTML wiki page or to an RDF description of the feature, using elements of the GeoNames ontology.[6] This ontology describes the properties of GeoNames features using the Web Ontology Language, with the feature classes and codes described in the SKOS language. Through the Wikipedia article URLs linked in the RDF descriptions, GeoNames data are linked to DBpedia data and other RDF Linked Data.
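Content negotiation means that the client states the representation it wants in the HTTP `Accept` header. The sketch below only builds such a request with the Python standard library and does not send it. The URI pattern `https://sws.geonames.org/<id>/`, the media type `application/rdf+xml`, and the helper name `rdf_request` are assumptions made for illustration and should be checked against the GeoNames ontology documentation.

```python
from urllib.request import Request


def rdf_request(geoname_id: int) -> Request:
    """Build (but do not send) a request asking for an RDF description."""
    # Assumed URI pattern for a feature; not confirmed by the text above.
    uri = f"https://sws.geonames.org/{geoname_id}/"
    # The Accept header tells the server which representation is preferred.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request(2643743)
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch; a browser requesting the same URI with `Accept: text/html` would be expected to receive the HTML page instead.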

Accuracy and improvements

As in other crowdsourcing schemes, the GeoNames edit interface allows anyone to sign in and edit the database. False information can therefore be entered, and it can remain undetected, especially for places that are not accessed frequently. Ahlers (2013) studies these inaccuracies and classifies them into loss in the granularity of coordinates (e.g., due to truncation and low-resolution geocoding in some cases), wrong feature codes, near-identical places, and the placement of places outside their designated countries. Manually correcting these inaccuracies is both tedious and error-prone (due to the database size) and may require experts.
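One of these categories, loss in the granularity of coordinates, can be approximated with a simple heuristic: count the decimal places with which a coordinate is stored. The sketch below is not the procedure used by Ahlers (2013); the helper names and the threshold of three decimal places (roughly 100 m of latitude) are assumptions chosen for illustration.

```python
from decimal import Decimal


def decimal_places(value: str) -> int:
    """Number of digits after the decimal point in a coordinate string."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -exponent)


def is_coarse(lat: str, lng: str, min_places: int = 3) -> bool:
    """Flag a coordinate pair whose best-resolved axis has few decimals.

    Heuristic only: a precisely surveyed point can also end in zeros that
    were stripped on export, so a flag is a hint for review, not a verdict.
    """
    return max(decimal_places(lat), decimal_places(lng)) < min_places


print(is_coarse("51.5", "-0.1"))          # one decimal place on each axis
print(is_coarse("51.50853", "-0.12574"))  # five decimal places on each axis
```

The coordinates are kept as strings because converting them to floats first would discard the information about how many digits were originally recorded.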

Few published works address the automatic resolution of such inaccuracies. Singh & Rafiei (2018) study the problem of automatically detecting the scope of locations in a geographical database and its applications in identifying inconsistencies and improving the quality of the database. Computing the boundary information can help detect inconsistencies such as near-identical places and the placement of locations such as cities under wrong parents such as provinces or countries. Singh & Rafiei show that the boundary information derived in their work can move more than 20% of locations in GeoNames to better positions in the spatial hierarchy, and that the accuracy of those moves is over 90%.
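The idea of using boundary information to check the spatial hierarchy can be illustrated with a deliberately simple stand-in: approximate a parent's extent by the bounding box of locations known to belong to it, then flag a candidate child that falls outside that box. This is not the method of Singh & Rafiei (2018); the `BBox` type, the helper `bbox_of`, and all coordinates are hypothetical, and the sketch ignores regions that cross the antimeridian.

```python
from typing import Iterable, NamedTuple, Tuple


class BBox(NamedTuple):
    """Axis-aligned bounding box in WGS84 decimal degrees."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lng: float) -> bool:
        return (self.south <= lat <= self.north
                and self.west <= lng <= self.east)


def bbox_of(points: Iterable[Tuple[float, float]]) -> BBox:
    """Smallest box enclosing the given (lat, lng) points."""
    pts = list(points)
    lats = [lat for lat, _ in pts]
    lngs = [lng for _, lng in pts]
    return BBox(min(lats), min(lngs), max(lats), max(lngs))


# Made-up locations already trusted to lie inside a hypothetical province.
trusted = [(10.0, 20.0), (12.0, 23.0), (11.0, 21.5)]
province_box = bbox_of(trusted)

# Made-up candidates recorded as children of that province.
candidates = {"Town A": (11.5, 22.0), "Town B": (48.0, 2.0)}
suspicious = [name for name, (lat, lng) in candidates.items()
              if not province_box.contains(lat, lng)]
print(suspicious)
```

A bounding box overestimates irregular regions, so it can only catch gross misplacements; finer boundaries would be needed to separate neighbouring provinces.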

References

  1. ^ "Marc Wick: Geek of the Week". Simple Talk. 2009-05-06. Retrieved 2020-07-01.
  2. ^ "Datasources used by GeoNames in the GeoNames Gazetteer". Retrieved 2020-08-20.
  3. ^ "GeoNames web site". Geonames.org. Retrieved 2018-09-08.
  4. ^ "GeoNames API". ProgrammableWeb. Archived from the original on 2018-11-26. Retrieved 2018-09-08.
  5. ^ "How can I help ?". GeoNames Forum. GeoNames. Retrieved 2018-08-11.
  6. ^ "GeoNames ontology". Geonames.org. Retrieved 2013-12-15.

Further reading

  • Ahlers, Dirk (2013), "Assessment of the accuracy of GeoNames gazetteer data", Proceedings of the GIR Workshop, pp. 74–81, CiteSeerX 10.1.1.722.8740
  • Singh, Sanket Kumar; Rafiei, Davood (2018), "Strategies for Geographical Scoping and Improving a Gazetteer", Proceedings of the Web Conference (PDF), pp. 1663–1672