KML in Google Maps and the challenge of standards

The Google Maps API is generally a pleasure to work with. After more or less inventing the whole Web 2.0 thing, the people behind Google Maps have continued to innovate, recently adding a geocoder to their API that translates address information into latitude and longitude coordinates. Naturally, having this information considerably expands the number of things you can do with Google Maps. Even better, it doesn’t just return coordinates for an address; it “normalizes” the query and returns the address in a structured way.
There’s no doubt that creating a global XML standard for addresses must have been a challenging thing to attempt. Some countries use street names in addresses, others do not. Some nations use island names, others have no islands. Some (like America) have county names but do not use them in addresses. So kudos to OASIS for creating a standard called xAL, and kudos to Google for trying to use it in their Google Maps API. Unfortunately, the devil is always in the details.

First, OASIS makes reading the standard harder than it should be by making you download a 10-megabyte zip file containing a whole bunch of standards, not all of which are currently relevant to me. Law #1 in the web age: Google is the source of all* traffic. The only reason I even found the standards is that Google helpfully links to them.

While it’s clearly better than nothing, the following quote from the spec tells much of the story: “xAL can be used to define addresses in simple terms or in complex terms. It is up to the user to decide how they want to implement xAL.”

I’d be very surprised if anyone anywhere has implemented software that can read and do something useful with the full range of XML that technically conforms to the spec. There’s just too much of it.
Unfortunately (and as with all things this complex) there appear to be outright bugs in the Google implementation. Take two address examples:

Example #1: 91 N. MAIN ST., WARSAW NY 14569. This address works beautifully and returns results as you might expect. There’s a hierarchy, and each element in the hierarchy maps to a US-centric address element:

  • Country (Country)
    • Administrative Area (State)
      • Sub Administrative Area (County)
        • Locality (City)
          • Thoroughfare (Street)
          • PostalCode (Zip)

Beautiful. Google has mapped US addresses nicely into the standard and it’s easy to write code that does useful things. Unfortunately, in my small sample, Google only returns results of this form about 87% of the time.
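As a rough illustration of how little code the Example #1 shape needs, here is a minimal Python sketch. The XML fragment is hand-written to mirror the hierarchy listed above; it is not a captured Google response, and the element names, the namespace URI, and the `read_fixed_shape` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed xAL 2.0 namespace; check it against a real response before relying on it.
NS = {"x": "urn:oasis:names:tc:ciq:xsdschema:xAL:2.0"}

# Hand-written sample mirroring the Example #1 hierarchy (illustrative values only).
SAMPLE = """
<AddressDetails xmlns="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0">
  <Country>
    <CountryNameCode>US</CountryNameCode>
    <AdministrativeArea>
      <AdministrativeAreaName>NY</AdministrativeAreaName>
      <SubAdministrativeArea>
        <SubAdministrativeAreaName>Wyoming</SubAdministrativeAreaName>
        <Locality>
          <LocalityName>Warsaw</LocalityName>
          <Thoroughfare>
            <ThoroughfareName>91 N Main St</ThoroughfareName>
          </Thoroughfare>
          <PostalCode>
            <PostalCodeNumber>14569</PostalCodeNumber>
          </PostalCode>
        </Locality>
      </SubAdministrativeArea>
    </AdministrativeArea>
  </Country>
</AddressDetails>
"""


def read_fixed_shape(xml_text):
    """Read address fields by walking one fixed path per field."""
    root = ET.fromstring(xml_text)
    area = "x:Country/x:AdministrativeArea"
    sub = area + "/x:SubAdministrativeArea"
    loc = sub + "/x:Locality"
    paths = {
        "country": "x:Country/x:CountryNameCode",
        "state": area + "/x:AdministrativeAreaName",
        "county": sub + "/x:SubAdministrativeAreaName",
        "city": loc + "/x:LocalityName",
        "street": loc + "/x:Thoroughfare/x:ThoroughfareName",
        "zip": loc + "/x:PostalCode/x:PostalCodeNumber",
    }
    # findtext returns None when a path does not match.
    return {field: root.findtext(path, namespaces=NS) for field, path in paths.items()}


if __name__ == "__main__":
    print(read_fixed_shape(SAMPLE))
```

Because every field hangs off a hard-coded path, this approach only holds up while the response keeps exactly this nesting.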

Example #2: 760 HOOSICK RD., TROY NY 12180. This address also works well but returns a different hierarchy:

  • Country (Country)
    • Administrative Area (State)
      • Sub Administrative Area (City)
        • Thoroughfare (Street)
        • PostalCode (Zip)

Notice that County information is missing. This is not a fatal problem and is to be expected with a dataset as large and noisy as addresses. However, rather than simply leaving Sub Administrative Area unnamed (as the spec seems to suggest with its examples), the API remaps the City up to the Sub Administrative Area and omits the Locality element. Notice also that Thoroughfare and PostalCode are no longer children of Locality but of Sub Administrative Area.
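One way to cope with both shapes is to stop depending on exact nesting and search for each element wherever it appears. The Python sketch below does that for a hand-written fragment mirroring Example #2. The XML, the namespace URI, and the `read_any_shape` helper are illustrative assumptions, and the fallback rule (no Locality means the Sub Administrative Area is really the city) is a guess based only on the two examples above.

```python
import xml.etree.ElementTree as ET

# Assumed xAL 2.0 namespace; check it against a real response before relying on it.
NS = {"x": "urn:oasis:names:tc:ciq:xsdschema:xAL:2.0"}

# Hand-written sample mirroring the Example #2 hierarchy (no Locality element).
SAMPLE_NO_LOCALITY = """
<AddressDetails xmlns="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0">
  <Country>
    <CountryNameCode>US</CountryNameCode>
    <AdministrativeArea>
      <AdministrativeAreaName>NY</AdministrativeAreaName>
      <SubAdministrativeArea>
        <SubAdministrativeAreaName>Troy</SubAdministrativeAreaName>
        <Thoroughfare>
          <ThoroughfareName>760 Hoosick Rd</ThoroughfareName>
        </Thoroughfare>
        <PostalCode>
          <PostalCodeNumber>12180</PostalCodeNumber>
        </PostalCode>
      </SubAdministrativeArea>
    </AdministrativeArea>
  </Country>
</AddressDetails>
"""


def read_any_shape(xml_text):
    """Read address fields by descendant search, tolerating a missing Locality."""
    root = ET.fromstring(xml_text)

    def first(tag):
        # ".//" matches the element at any depth below the root.
        return root.findtext(".//x:" + tag, namespaces=NS)

    county = first("SubAdministrativeAreaName")
    city = first("LocalityName")
    if city is None:
        # Heuristic: with no Locality, assume the city was remapped upward
        # and the county is simply unknown.
        city, county = county, None

    return {
        "country": first("CountryNameCode"),
        "state": first("AdministrativeAreaName"),
        "county": county,
        "city": city,
        "street": first("ThoroughfareName"),
        "zip": first("PostalCodeNumber"),
    }


if __name__ == "__main__":
    print(read_any_shape(SAMPLE_NO_LOCALITY))
```

The trade-off is that a descendant search would silently pick the first match if a response ever contained the same element at two levels, so it is a workaround rather than a faithful reading of the spec.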

I’ve sent a link to this post to the folks at Google Maps. I’ll be interested to see whether they agree that it’s a bug or think they’re adhering to the letter of the spec. That’s the challenge with standards. They either say too much or too little.

Note: The examples above link to cached versions of what Google returns when you connect with a valid Maps key. I cannot easily figure out what a key for your host might be, so you’ll need to get your own key and make your own queries to see the results for yourself.

Comments

  1. I disagree regarding xAL. There are data quality products that have implemented the complete specifications of xAL. 10 MB is not the size of a single standard, namely xAL. OASIS CIQ has 5 standards: xAL for international address representation, xNL to represent party names, xNAL, which is a combination of name and address, xCIL to represent party-centric information, and xCRL to represent party relationships. The 10 MB consists of schemas, several examples and many different documents. Complaining about a standard without understanding it in detail is not fair. Because xAL is industry and application neutral, it gives end users the flexibility to decide how they want to implement it, and this is why the statement, “xAL can be used to define addresses in simple terms or in complex terms. It is up to the user to decide how they want to implement xAL.” was introduced. It should not be taken out of context.

    If Google Earth decided to implement xAL, then it is clear that xAL provides the flexibility to represent addresses on Earth compared to other standards.

    Cheers