Oct 9, 2012
When Apple Maps came out, there were a lot of anecdotes about its problems, but not a lot of systematic comparisons with Google Maps. The only one I know of is of town names in Ontario, Canada (Apple Maps, Google Maps), and that wasn't particularly relevant to me. First, I live in the United States, and there's a good possibility that Apple's data is better here because they're a US company (sorry, Canada). Second, and more importantly, I hardly ever search for town names; I almost always search for addresses or business names. The quality of Apple's address data is much more important to me than whether I can find Beeton, Ontario.
So, here's a comparison of both place names in New York State and addresses in New York City.
Disclosure: I'm a Google employee, but I don't work on Maps or anything related to it, and no Google resources or people were used for (or informed of) this project, aside from public APIs. This post does not represent the views of my employer, etc.
There's a big difference between what the Google Maps web API provides and what CLGeocoder under either iOS 5 or iOS 6 provides, and the Google Maps app on iOS 5 appears to match the web API.
The Google Maps web API is consistently great.
CLGeocoder's results under iOS 5 and iOS 6 differ significantly, with iOS 5 mostly being better, except in Queens.
There's more going on here than we think.
(Or just skip to the results.)
Originally, I ran all the tests using the iOS simulator and the CLGeocoder class, as the other comparisons have done. I also ran one of the tests on an iOS 5.1 device to ensure the simulator gave the same results as the device (and it did), so if you want to replicate these tests you don't need to pay the $100 to get an App Store developer account.
I then spot-checked some of the results on actual iOS 5 and iOS 6 devices using the Maps app. The iOS 6 results matched up with the Maps app (except that the Maps app gives you a "Did you mean?" in a lot of cases), but the iOS 5 CLGeocoder results didn't match up with the Maps app on iOS 5.
Suspecting that iOS 5's CLGeocoder implementation wasn't actually being used by the Maps app on iOS 5, I then ran one of the tests using the Google Maps web API. Indeed, the web API gave an entirely different set of results, one that matched closely with the Maps app on iOS 5, so I reran all the address tests with that as a third set of results (though I only got through about 90% of the New York State place names in the web API before I hit the 15,000 requests per day limit).
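For anyone who wants to replicate the web API runs, here is a minimal sketch of building a request and reading the first result. It is not the code I used for this post. It assumes the public Geocoding API JSON endpoint and its usual response shape (a status field plus a results list with geometry.location), and the helper names build_geocode_url and first_location are made up for illustration. Current versions of the API also expect a key parameter, which is optional in this sketch.

```python
import json
from urllib.parse import urlencode

# Public Google Geocoding API JSON endpoint (assumed to be the "web API" meant here).
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key=None):
    """Return the request URL for one address string (hypothetical helper)."""
    params = {"address": address}
    if api_key is not None:
        params["key"] = api_key
    return GEOCODE_URL + "?" + urlencode(params)


def first_location(response_text):
    """Return (lat, lng) of the first result, or None when nothing was found."""
    body = json.loads(response_text)
    if body.get("status") != "OK" or not body.get("results"):
        return None
    location = body["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]
```

Keeping the URL building and the response parsing separate from the actual network call makes it easy to cache raw responses, which matters when there is a daily request limit.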
New York State places
I used this data set from the USGS that shows all the populated places in the United States along with their coordinates. I eliminated all the ones that didn't have a state code of NY, didn't have coordinate information, or whose name ended with "(historical)". I then geocoded strings of the form "$TOWNNAME, NY" and compared them to the expected coordinates. You could do this for any state, of course, but even after eliminating historical locations there were more than 180,000 populated places in the whole US, so I cut it down to just New York.
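The filtering step can be sketched roughly as follows. This is an illustration rather than my original script: it assumes a pipe-delimited file with a header row, the column names below are guesses that should be checked against the real file, and treating blank or zero coordinates as "missing" is also an assumption.

```python
import csv
import io

# Assumed column names; check them against the header of the actual file.
NAME, STATE, LAT, LON = "FEATURE_NAME", "STATE_ALPHA", "PRIM_LAT_DEC", "PRIM_LONG_DEC"


def has_coordinates(row):
    """Treat blank or zero coordinates as missing (an assumption about the file)."""
    try:
        return float(row[LAT]) != 0.0 and float(row[LON]) != 0.0
    except (KeyError, TypeError, ValueError):
        return False


def keep_place(row, state="NY"):
    """Apply the three filters: right state, has coordinates, not historical."""
    name = row.get(NAME) or ""
    return (
        row.get(STATE) == state
        and has_coordinates(row)
        and not name.endswith("(historical)")
    )


def load_places(lines, state="NY"):
    """Yield (query string, expected lat, expected lon) for each kept row."""
    for row in csv.DictReader(lines, delimiter="|"):
        if keep_place(row, state):
            yield f"{row[NAME]}, {state}", float(row[LAT]), float(row[LON])
```

Passing a different state code to load_places is all it takes to repeat the experiment for another state.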
I chose 1km as the boundary for a correct location because most towns aren't much bigger than 2km across and the coordinates for a town are generally close to the center. To give an idea of how much margin that is, the correct location to geocode New York City to is City Hall, and 1km reaches down below Wall Street and up to SoHo, so almost anywhere in the Financial District would count, but Battery Park would not. I picked the quite generous 100km for the "close" metric because that would at least mean you're in the right part of the state. For reference, New York City to Buffalo is about 470km, and New York City to Albany is about 210km.
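Scoring a result against those thresholds comes down to a great-circle distance and two cutoffs. Here's a minimal sketch using the haversine formula on a spherical Earth, which is plenty accurate at these scales. The function names and the bucket labels are my own for illustration, and whether a result exactly on a boundary counts as inside it is an arbitrary choice here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def classify(expected, actual, correct_km=1.0, close_km=100.0):
    """Bucket one geocoder answer; `actual` is None when nothing came back."""
    if actual is None:
        return "no result"
    distance = haversine_km(expected[0], expected[1], actual[0], actual[1])
    if distance <= correct_km:
        return "correct"
    if distance <= close_km:
        return "close"
    return "incorrect"
```

The thresholds are parameters, so the same function works for any other pair of cutoffs by passing different correct_km and close_km values.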
New York City addresses
I used this data set of all the "public facilities" in New York City. I figured this would give a good sample of the various address forms used in NYC, as well as a selection of data from all five boroughs. Since it includes a lot of government buildings, it probably oversamples the downtowns of each borough and undersamples the residential neighborhoods, but it does include schools, nursing facilities, soup kitchens, and other such things that are spread relatively evenly through the city. The only items I excluded from the list were facility types 1511-1541, which are all the different kinds of parks, because they include such addresses as "East River and Harlem River" (for Randall's Island Park) and "Greenwich Ave @ 7 Ave, NW side of intersection" (for one of the Greenstreets locations). I then geocoded strings of the form "$ADDRESS, $TOWNNAME, NY", where for $TOWNNAME I used "New York" for Manhattan and the borough name for the other boroughs. This is technically incorrect for Queens, because addresses in Queens use the city name they had before Queens became a part of New York City (e.g., "Flushing, NY" or "Astoria, NY"), but I assumed that "Queens, NY" would still work okay.
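As a rough sketch of that step (not my original code), the park filter and the query strings look something like this. The borough spellings are assumptions about how the data set writes them, and the function names are hypothetical.

```python
# Town name appended for each borough: "New York" for Manhattan, the borough
# name otherwise. The exact borough spellings in the data set are assumed.
TOWN_FOR_BOROUGH = {
    "Manhattan": "New York",
    "Bronx": "Bronx",
    "Brooklyn": "Brooklyn",
    "Queens": "Queens",
    "Staten Island": "Staten Island",
}


def is_park(facility_type):
    """Facility types 1511-1541 are the park categories that were excluded."""
    return 1511 <= int(facility_type) <= 1541


def address_query(address, borough):
    """Build the "$ADDRESS, $TOWNNAME, NY" string that gets geocoded."""
    return f"{address.strip()}, {TOWN_FOR_BOROUGH[borough]}, NY"
```

For example, address_query("1 Centre Street", "Manhattan") gives "1 Centre Street, New York, NY".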
The biggest wrinkle in dealing with this data set was that the location coordinates were expressed in the New York-Long Island State Plane coordinate system, a coordinate system used by surveyors that treats a region as being flat (i.e., without the Earth's curvature). You can find additional information about the State Plane coordinate system on Wikipedia, and instructions on how to convert between it and latitude and longitude in this document from NOAA. Finding the instructions for how to do the conversion and then implementing it properly took a surprisingly long time.
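To save anyone else the trouble, here is a sketch of the conversion using the standard inverse Lambert conformal conic formulas. It is a reconstruction, not the code I ran. It assumes the coordinates are in the NAD83 version of the zone (EPSG:2263: GRS80 ellipsoid, standard parallels 40°40'N and 41°02'N, origin at 40°10'N 74°W, false easting 300,000 m) and in US survey feet. If your data uses NAD27 or meters, the constants need to change.

```python
import math

# Assumed zone constants: NAD83 New York Long Island (EPSG:2263), GRS80 ellipsoid.
A = 6378137.0                      # semi-major axis, metres
F_INV = 298.257222101              # inverse flattening
E2 = (2 - 1 / F_INV) / F_INV       # eccentricity squared, 2f - f^2
E = math.sqrt(E2)
PHI1 = math.radians(40 + 40 / 60)  # standard parallel 40°40'N
PHI2 = math.radians(41 + 2 / 60)   # standard parallel 41°02'N
PHI0 = math.radians(40 + 10 / 60)  # latitude of origin 40°10'N
LAM0 = math.radians(-74.0)         # central meridian 74°W
FALSE_EASTING_M = 300000.0
FALSE_NORTHING_M = 0.0
US_SURVEY_FOOT_M = 1200.0 / 3937.0


def _m(phi):
    return math.cos(phi) / math.sqrt(1 - E2 * math.sin(phi) ** 2)


def _t(phi):
    es = E * math.sin(phi)
    return math.tan(math.pi / 4 - phi / 2) / ((1 - es) / (1 + es)) ** (E / 2)


# Projection constants derived from the zone definition.
N = (math.log(_m(PHI1)) - math.log(_m(PHI2))) / (math.log(_t(PHI1)) - math.log(_t(PHI2)))
BIG_F = _m(PHI1) / (N * _t(PHI1) ** N)
RHO0 = A * BIG_F * _t(PHI0) ** N


def state_plane_to_lat_lon(easting_ft, northing_ft):
    """Convert State Plane feet to (latitude, longitude) in degrees."""
    x = easting_ft * US_SURVEY_FOOT_M - FALSE_EASTING_M
    y = northing_ft * US_SURVEY_FOOT_M - FALSE_NORTHING_M
    rho = math.hypot(x, RHO0 - y)
    theta = math.atan2(x, RHO0 - y)
    t = (rho / (A * BIG_F)) ** (1 / N)
    lon = LAM0 + theta / N
    # Latitude has no closed form on the ellipsoid; iterate from a spherical guess.
    phi = math.pi / 2 - 2 * math.atan(t)
    for _ in range(10):
        es = E * math.sin(phi)
        phi = math.pi / 2 - 2 * math.atan(t * ((1 - es) / (1 + es)) ** (E / 2))
    return math.degrees(phi), math.degrees(lon)
```

A quick sanity check is that the false origin (easting 984,250 ft, northing 0) should come back as 40°10'N, 74°W. If a maintained projection library is available, using it instead of hand-rolled math is the safer choice.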
I chose 200m as the boundary for a correct location because a central Manhattan avenue block is approximately 400m long, so that meant that if you gave up and placed a location in the middle of its block, it would count as correct. I chose 2km as the boundary for the "close" metric because that meant that it ended up approximately in the right neighborhood.
New York State places
The bad results may have more to do with the fact that place names in New York State are messy than with the geocoders' mapping abilities, though. For instance, there's a village in New York near Lake Ontario named Hilton, situated within the larger town of Parma. Searching for "Hilton, NY" in iOS 5 gives you the Hilton hotel in Midtown Manhattan, which is almost certainly more likely to be helpful. iOS 6 produces a pointer to the Hilton neighborhood of Maplewood, New Jersey, which isn't very good, but is still probably a more likely destination than the village of Hilton.
Even more, the data set includes lots of items that are extremely difficult, if not downright impossible. For instance, there are entries for Adams (in Jefferson County), Adams Basin (Monroe County), Adams Corners (Putnam County), and Adams Cove (Jefferson again). There are two places named Ashland (one in Greene County, one in Cayuga), two named Bellevue, two named Bethel, etc. There are two Brooklyns in the data set, neither of which is the borough of New York City. This puts a pretty low ceiling on just how good any geocoder could be at answering this kind of query (and is yet another reason why I think this isn't a good basis for comparison).
The web API, though, does a reasonable job. As mentioned above, it only included about 90% of the place names, but even if you assume all 10% remaining were bad it would have gotten 68% right, much better than either iOS option.
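The worst-case arithmetic there is simple enough to spell out. This is just a sketch with made-up function names: if a geocoder only ran on a fraction of the queries, counting every missing query as wrong gives a lower bound on its overall accuracy, and dividing back out gives the accuracy on the queries it did run.

```python
def worst_case_overall(accuracy_on_covered, coverage):
    """Overall accuracy if every query that was never run counts as wrong."""
    return accuracy_on_covered * coverage


def accuracy_on_covered(worst_case, coverage):
    """Invert the bound: accuracy on just the queries that were actually run."""
    return worst_case / coverage
```

With the rounded figures above (90% coverage, 68% worst case), accuracy_on_covered(0.68, 0.90) comes out to roughly 0.76 for the place names the web API actually saw.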
New York City addresses
Addresses should be much easier to get right, and the consequences for getting them wrong are usually much higher. I divided the addresses by borough and ran them separately.
iOS 6 does way better than iOS 5 here, correctly mapping more than 50% of addresses compared with only 15% for iOS 5. They have similar numbers of totally incorrect addresses, though. This is also the first borough where iOS 6 has a negligible number of "no result" responses.
The web API is worse here than it is in the other boroughs, but it still puts out an impressive showing.
None of the requests were made with any context information. In practice, when using a mapping app, it knows where the viewport is currently located and often knows where the person is located, and that information is used to improve results. All of the geocoders would probably improve with that information.
The use of "Queens, NY" might have caused trouble for the iOS geocoders. Then again, it might just be because they can't handle Queens.
I feel safe in making the following conclusions:
- Lists of place names aren't a good way to compare geocoders. Addresses are a much better choice.
- Google Maps' web API consistently gives great results.
- iOS 6's results are extremely varied. Sometimes it's excellent (like in Staten Island), sometimes it's awful (like in Manhattan). I expect this accounts for the variety in people's experiences with it.
- iOS 5's CLGeocoder class definitely isn't backed by the same geocoder as Google Maps on the web, but I have no idea what it is backed by. A third-party geocoder? Maybe Bing Maps? Apple was reported to have been in talks with them a couple of years ago. (The requests go to an Apple server and are presumably forwarded on from there.)
- Nobody can find anything in Queens.
- Being able to get freely available geographical data in machine-readable formats is awesome. The USGS and NYC do a great public service by providing it.
Please contact me if you have questions or think I've made a mistake somewhere, either by e-mail or @flooey.