Development Seed Blog
Find Your Bearings: Geocoding Experience Unearths Helpful Hints
As I fired off requests to Google's geocoder API to plot several thousand points on a map for a recent project, a quick calculation showed that only about a third of the data returned valid latitude and longitude information. The data set I was dealing with was international - mapping members from nearly every country - so I knew I would have some issues, but only a third successful? I thought, come on, Google! After a few hours I developed a short list of helpful tips to follow to get a much higher success rate when geocoding data using Google or Yahoo.
I had a complete address for nearly every member I needed to geocode, but short of picking through country data sets, there's only so much latitudinal and longitudinal data out there for exact addresses. Google and Yahoo have the United States pretty well covered, and an open source government data set makes it easy. However, try to geocode a house on a road that only locals know the name of and you're pretty much out of luck, unless you get that guy to send in the coordinates off his Bushnell Onix handheld GPS. Fortunately for this case, I only needed city and country coordinates.
When I saw the one-third success rate, I knew something wasn't right. To make the actual requests on Google, I used a function in the location module (called location_geocode_us_google) that takes a location as an array and converts it into a single line address to send to Google. When Google didn't know the address, I guessed that it was choking on the entire address and was unable to fall back to the city/country pair after failing with the full address. Go to this Google example and toss it a full international address for somewhere in the developing world - chances are it will not be able to geocode it. Now try just the city and country - it should work.
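To make the "full address first, then just city and country" idea concrete, here is a minimal sketch in Python. It is not the location module's code: the field names (street, city, province, postal_code, country) and both function names are assumptions made up for illustration, and the sample address is invented.

```python
def single_line_address(location, fields=("street", "city", "province",
                                          "postal_code", "country")):
    """Join the non-empty parts of a location dict into one line.

    The field names are hypothetical; adapt them to your own data.
    """
    parts = [str(location.get(field, "")).strip() for field in fields]
    return ", ".join(part for part in parts if part)


def candidate_queries(location):
    """Return the queries to try, most specific first, without duplicates."""
    full = single_line_address(location)
    coarse = single_line_address(location, fields=("city", "country"))
    queries = []
    for query in (full, coarse):
        if query and query not in queries:
            queries.append(query)
    return queries


# Invented sample data, for illustration only.
member = {"street": "12 Example Road", "city": "Kampala", "country": "Uganda"}
print(candidate_queries(member))
```

If the full address fails, the second query gives the geocoder a much easier question to answer.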
This was an easy guess, but before I got much further I decided to modify the Google geocoder function and record the actual response code for every request. There are six possible response codes. The default behavior of the function was to check whether the code was 200 - which means the request was successful - and to do nothing otherwise. The five other possible response codes are listed here in Google's documentation.
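Recording every response code, rather than only checking for 200, could look roughly like the sketch below. The `geocode` argument is a hypothetical callable standing in for whatever actually makes the request; it is assumed to return a (status_code, lat, lon) tuple. Only the meaning of 200 comes from the description above, and every other code is simply counted as-is.

```python
from collections import Counter

SUCCESS = 200  # per the text above, 200 marks a successful request


def geocode_all(queries, geocode):
    """Run each query through `geocode` and tally the status codes seen.

    `geocode` is a hypothetical callable returning (status_code, lat, lon).
    Coordinates are kept only when the status is SUCCESS.
    """
    results = {}
    tally = Counter()
    for query in queries:
        status, lat, lon = geocode(query)
        tally[status] += 1
        if status == SUCCESS:
            results[query] = (lat, lon)
    return results, tally


def success_rate(tally):
    """Fraction of requests that came back with SUCCESS (0.0 if none ran)."""
    total = sum(tally.values())
    return tally[SUCCESS] / total if total else 0.0
```

Printing the tally after a batch shows at a glance which failure code dominates, which is how a pattern like a pile of one particular code becomes visible.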
I tried using just city and country in the location array, and it significantly raised the success rate. However, I noticed a lot of code 603 responses, which as you can see on the above link, means the "geocode for the given address cannot be returned due to legal or contractual reasons." It turns out that Google cannot return latitude and longitude data for China or the United Kingdom, and possibly a few other countries. I didn't investigate the reasons for this, but switched over to Yahoo, which worked fine.
This experience may help make a smarter geocoding module that's trained to handle bad input data from the user and use other geocoders as fallbacks whenever a successful response is not returned. In a few weeks I should have another tale from the geocoding trail about how Yahoo's geocoder can help with split congressional districts, where the five digit zip code cannot distinguish your correct representative. Any guesses?
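As a rough sketch of the fallback idea, the function below tries each candidate query against each geocoder in turn and stops at the first success. Everything here is an assumption for illustration: the geocoders are hypothetical wrapper callables returning (status_code, lat, lon), and each wrapper is assumed to translate its service's own notion of success into 200.

```python
def geocode_with_fallbacks(queries, geocoders, success=200):
    """Try every query against every geocoder, in order, until one succeeds.

    `queries` is a list of address strings, most specific first.
    `geocoders` is a list of (name, callable) pairs; each callable is a
    hypothetical wrapper returning (status_code, lat, lon).

    Returns (hit, attempts): `hit` is (name, query, lat, lon) or None, and
    `attempts` lists the (name, query, status_code) of each failed try.
    """
    attempts = []
    for query in queries:
        for name, geocode in geocoders:
            status, lat, lon = geocode(query)
            if status == success:
                return (name, query, lat, lon), attempts
            attempts.append((name, query, status))
    return None, attempts
```

Keeping the list of failed attempts makes it possible to see afterwards which service gave up on which input, and with what code.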

Comments
Using geonames
There is some work underway to use Geonames as a geocoding solution as part of the location module, which should help this situation as well, as Mikel points out. I have to admit though, even with Geonames, there were a lot of places it didn't know about (yet) in the area I was testing (the Isle of Man).
Hi Dan,
I just came across http://drupal.org/node/75459 from the links below. I've got to take a closer look at this. Thanks for posting about Geonames. I heard about another location module being worked on as well, but I have no link yet. Also, I just came across geo.module, which sounds like it may grow.
Ian
I have been meaning to get back to that issue and help get it into the module but have been distracted by other things of late. Hopefully soon :)
I've heard of the other, much simpler, location module being worked on as well. I'm hoping it will be released soon, and once we start shifting focus from 4.7 to 5.x here, I hope I can help out with that one too. The geo.module is new to me though, I'll have to check that one out...
Thanks Dan. You were missed at OSCMS. I hope you will be pressing on mapping out at DrupalCon in Barcelona. Your write up on "Google Maps vs OpenStreetMap"
Barcelona drupalcon
I wish I could have traveled out to OSCMS, but I am likely to get to the Barcelona drupalcon and will be hoping to move the geo stuff further forward. A lot of the dev work will depend on other commitments here at work though, so time will tell how much actually gets done :)
Had the same problems
Hi, I have the same issues on my new Drupal site...
http://www.pawmap.com
Basically we use geocoding to locate the dogs in the respective locations.
It leaves users who are putting their "dogs" on the map puzzled, because sometimes it works and sometimes it doesn't.
I had problems in the UK and in Canada.
Can you confirm that Canada shares the same laws or does it work for you there?
Lior
Hi Lior,
I went to the Google geocoder example here and was able to send it locations in Canada without errors. It is not the best interface for testing, though, because it does not show you the response code. I also just tried my sandbox, which I have enabled for geocoding: you click on the map and it plots the latitude and longitude of the spot you clicked. From there I can click on China and it gives me a latitude and longitude. I have not looked into the actual reason why Google does not support geocoding in some countries.
By the way, I was reading about a collar that tracks your dog via GPS the other day and plots him on a map. Pretty soon it will be able to guide him home too :)
Ian
GeoNames
Wonder if you'd have more luck with the Geonames gazetteer?
http://www.geonames.org/export/geonames-search.html
Hi Mikel, does Worldkit's geocoder use Geonames or is it using another source? In my tests I tried a few requests on Worldkit as well. Seeing requests work on Worldkit while they were failing on Google is what prompted me to look more closely at the response codes coming back from Google. The reverse geocoding on Geonames looks handy. I hope you're well.
Ian