Tinkering with Mozilla Location Services

First there was OpenStreetMap, a crowdsourced map. Now Mozilla has started the Mozilla Location Service:

“The Mozilla Location Service is a research project to investigate crowdsourced mapping of wireless networks (WiFi access points, cell towers, etc.) around the world.”

Mozilla wants to create a database with the positions and IDs of cell towers and wifi access points. The nice thing about such a database is that, knowing the IDs of some nearby wifi access points or cell towers, you can determine your location on earth pretty well, WITHOUT a GPS.

Google already has such a database: when they drive around taking pictures of your street and houses, they also pick up all wifi IDs (and more...), and even your Android phone or iPhone will send info back to Google or Apple. They make that data available to their browsers and paid services too, but the difference is that Mozilla tries to respect your privacy more and tries to do this in a more open way. See for example their FAQ: https://wiki.mozilla.org/CloudServices/Location/FAQ

To do some accuracy research I tried to enable the Google service, but got stuck when they insisted on having my credit card details, even though I only wanted to use it for testing. Apparently this service IS valuable according to Google: you can only use it for free for 100(!) hits a day...

Collecting wifi locations

Anyway, the way it works: you download MozStumbler from https://github.com/mozilla/MozStumbler/releases (it is an Android application). After starting, it asks you to enable your GPS and then starts scanning for GSM cells and wifi access points.

mozstumbler scanning screen

From within the application you can also test the service. It was pretty accurate in my case (but that may be because I do a lot of wifi scanning around here :-) ).

mozstumbler map

Mozilla has a map on which they show more or less the locations sent to them: https://location.services.mozilla.com/map#11/52.3672/4.6232

Using the location api

Mozilla is running a location server, which collects the submitted locations and also serves the API: https://mozilla-ichnaea.readthedocs.org/en/latest/

The API Mozilla uses is nowadays the same as Google's API, see https://developers.google.com/maps/documentation/business/geolocation/

In short, you POST the wifi/cell information as a JSON structure, and you get back a JSON structure with location and accuracy data.

POST the following to: https://location.services.mozilla.com/v1/geolocate?key=test

{
    "wifiAccessPoints": [
        {
            "macAddress": "24:65:11:A1:80:17",
            "signalStrength": -46
        },
        {
            "macAddress": "5C:A3:9D:B3:CD:78",
            "signalStrength": -51
        },
        {
            "macAddress": "84:9C:A6:59:0C:8A",
            "signalStrength": -72
        }
    ]
}

And you will get something back like:

{"location": {"lat": 52.3972655, "lng": 4.6480852}, "accuracy": 100.0}

Want to see where this is? Start your browser and create an OSM URL with the lat and lng values in it, like this:

http://www.openstreetmap.org/#map=19/52.3972655/4.6480852

Nice huh?
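
If you want to script that last step, here is a minimal sketch that turns the JSON answer into such a URL. The function name osm_url and the default zoom level of 19 are just picked for illustration; only the response layout and the URL pattern come from the examples above.

```python
import json


def osm_url(response_text, zoom=19):
    """Build an OpenStreetMap URL from a geolocate JSON answer."""
    answer = json.loads(response_text)
    location = answer["location"]
    # the service calls the longitude "lng"; the OSM URL is zoom/lat/lon
    return "http://www.openstreetmap.org/#map=%d/%s/%s" % (
        zoom, location["lat"], location["lng"])


# the example answer shown above
example = '{"location": {"lat": 52.3972655, "lng": 4.6480852}, "accuracy": 100.0}'
print(osm_url(example))
```

Keep in mind that the accuracy value is a radius in meters, so a zoom level of 19 may suggest more precision than the answer really has.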

But where do you get those macAddresses (actually BSSIDs) from?

On Linux there is the iwlist command which, if run as root, will scan for all available networks and report them with an awful lot of information. Try running this as root or using sudo (given that you are on wifi and wlan0 is your wifi device):

sudo iwlist wlan0 scan

Oops, that IS a lot of info, isn't it?

So let's use some Python code to clean that up and morph it into some JSON. You can pipe the iwlist output into the following code; it will show you the JSON AND post it to the given URL. As API key you can use 'test' for the time being. If you are developing a real application, get a key yourself:

import fileinput
import json
import requests

postdata = {}
ifaces = []
postdata['wifiAccessPoints'] = ifaces

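for line in fileinput.input():
    # each access point's section in the iwlist output begins with an Address line
    if "Address" in line:
        iface = {}
        key = line.split('Address: ')[1].strip()
        iface['macAddress'] = key
        ifaces.append(iface)
    # signal level in dBm, a negative number like -46
    if "dBm" in line:
        signal = line.split('level=')[1].strip().replace(' dBm', '')
        iface['signalStrength'] = int(signal)
    if "ESSID" in line:
        # split on the first colon only, a network name may contain colons itself
        essid = line.split(':', 1)[1].strip().replace('"', '')
        #iface['essid'] = essid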

# sort the wifi points on signalStrength, strongest first
# (values are negative dBm, so closer to 0 = better)
postdata['wifiAccessPoints'].sort(
    key=lambda x: x['signalStrength'], reverse=True)

# only send the 3 strongest access points
del postdata['wifiAccessPoints'][3:]

# url = 'https://www.googleapis.com/geolocation/v1/geolocate?key=%GOOGLE_API_KEY%'
# pfff, it is just too difficult to register for that ^ key....
url = 'https://location.services.mozilla.com/v1/geolocate?key=test'
# print() with a single argument runs on both Python 2 and Python 3
print("POSTING to %s" % url)
print(json.dumps(postdata, sort_keys=True, indent=4, separators=(',', ': ')))
r = requests.post(url, data=json.dumps(postdata))

print("ANSWER:")
print(r.text)

Save the above as wifi.py, and make sure you have the requests module installed (fileinput and json come with Python).

Now this command in a terminal:

sudo iwlist wlan0 scan | python wifi.py

Will give you your location.

My plan was to try out some different situations, like varying the number of APs or fiddling with the signalStrength values, and maybe compare the results to Google's using the exact same values. But as said earlier, Google makes it just too difficult for me. And second: I tried, but I think the localisation algorithm is not at full speed yet, as for example the signal strength does not seem to be taken into account...
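
For anyone who wants to pick up that experiment, here is a rough sketch of how the test requests could be generated. It only builds the JSON payloads and does not post anything. The helper names (top_n, shift_signal, variants) and the chosen counts and offsets are made up for illustration; the sample data is the example request from earlier in this post.

```python
import copy
import json

# the three access points from the example request earlier in this post
sample = [
    {"macAddress": "24:65:11:A1:80:17", "signalStrength": -46},
    {"macAddress": "5C:A3:9D:B3:CD:78", "signalStrength": -51},
    {"macAddress": "84:9C:A6:59:0C:8A", "signalStrength": -72},
]


def top_n(access_points, n):
    """Return the n strongest access points (dBm closest to 0 first)."""
    ranked = sorted(access_points,
                    key=lambda ap: ap["signalStrength"], reverse=True)
    return ranked[:n]


def shift_signal(access_points, offset):
    """Return a copy with every signalStrength shifted by offset dBm."""
    shifted = copy.deepcopy(access_points)
    for ap in shifted:
        ap["signalStrength"] += offset
    return shifted


def variants(access_points, counts=(1, 2, 3), offsets=(0, -20)):
    """Yield (label, payload) for every combination of count and offset."""
    for n in counts:
        for offset in offsets:
            aps = shift_signal(top_n(access_points, n), offset)
            yield "top%d_offset%+d" % (n, offset), {"wifiAccessPoints": aps}


for label, payload in variants(sample):
    print(label)
    print(json.dumps(payload, sort_keys=True))
```

Each payload can then be posted the same way wifi.py does it, and the returned lat, lng and accuracy compared per label. If shifting all signal strengths never changes the answer, that would support the impression that signal strength is not used yet.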

Anybody very much into this and having time to invest in this project? It would be great if this becomes a success just like OSM! Go and download MozStumbler or FxStumbler if you have a FirefoxOS phone...

Disclaimer

Note that from Mozilla's FAQ it is not 100% clear IF they are going to make the data available as a download. But I think it would be silly NOT to do this.

Mozilla states that there is privacy-sensitive data in their raw data. But surely that can easily be removed? A dump containing only BSSIDs and their approximate locations does indeed give people more information about a requester than before, but as this info (plus more) is already available via commercial providers I think it is OK to do this (you can append '_nomap' to your ESSID to let the scanners know you do NOT want your access point's location to be used).
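
If you write your own scanner script, you could honour that opt-out yourself before sending anything. The sketch below assumes each access point dict carries an 'essid' key (the line that is commented out in wifi.py); the function name drop_nomap and the example network names are hypothetical.

```python
def drop_nomap(access_points):
    """Drop opted-out networks and strip the essid key before posting."""
    kept = []
    for ap in access_points:
        # networks whose name ends in _nomap asked not to be mapped
        if ap.get("essid", "").endswith("_nomap"):
            continue
        # the request only needs macAddress and signalStrength
        kept.append({k: v for k, v in ap.items() if k != "essid"})
    return kept


# made-up network names, MAC addresses taken from the example request
scanned = [
    {"macAddress": "24:65:11:A1:80:17", "signalStrength": -46,
     "essid": "home_nomap"},
    {"macAddress": "5C:A3:9D:B3:CD:78", "signalStrength": -51,
     "essid": "cafe"},
    {"macAddress": "84:9C:A6:59:0C:8A", "signalStrength": -72},
]
print(drop_nomap(scanned))
```

Whether the server applies such a filter on its side as well is not something this sketch relies on; it only cleans up what your own script sends.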