As a run-up to the Do-Din event in Hyderabad, geohackers.in is co-hosting DataLore, an event about putting data to good use and about how statistics and visualisations sometimes twist data to tell lies. It takes place this Wednesday, November 20th, at 7 High Street Cooke Town, Bangalore.
People who want to make the world a better place look towards data in an effort to make that change. This very data then needs to be channeled into maps, statistics, and visualizations before it can be useful — and people are doing this everywhere. Stories of politics, corruption, oppression, and war are being told around the world using such tools. Unfortunately, a lot of what is being made fails at its task. Maps that miss the point, visualizations that fail to engage, and statistics that mislead, all undermine action. On Wednesday evening, as a run-up to Do-Din, DataLore will attack this problem on two fronts:
When all you have is a hammer, everything starts to look like a nail. There are maps being made for every reason, but some of them miss the point, misrepresent information, lie, or fail to engage the audience. We would like to discuss how people come up with these maps, what disasters they cause, and how, as storytellers, we can improve the situation.
Nothing is what it seems — especially not statistics
As they say, there’s lies, damned lies, and then there’s statistics. It’s easy to mislead or be misled by statistics and visualizations. Preconceptions and agendas can leak into them, and colour them with bias. Sometimes, a lack of knowledge about statistics leads to false conclusions, which is rather disastrous. We’ll use some examples to show you how this can happen, and how to both interpret and represent data properly.
I’ve re-entered the academic world as a student at the University of Cambridge in the United Kingdom, and one of the benefits I’m enjoying the most is near-unlimited access to one of the world’s largest repositories of recorded information: the Cambridge University Library. Commonly known as the UL, this is a copyright library, which means that under British rules on legal deposit, the library has the right to request a copy of any work published in the UK free of charge. Currently, the UL has over 8 million items, including books, periodicals, magazines and, of course, maps.
The Map Room in the UL is a fascinating place; it functions as the reading room for the Map Department, which holds over a million maps (as the librarian told me; Wikipedia claims it has 1.5 million). It’s not a very large room, as reading rooms go, but it is a beautiful space and is very well managed. Everything is catalogued very efficiently with a filing card system, with one card per map giving its name, date of publication and classmark (UID/coordinates). Visitors are not allowed to simply browse through the map collections; to refer to a map, one must fill out a request form with the appropriate details and submit it to the library assistants, who will then pull out the required map folio from its storage location. The title of this post comes from the fact that map holdings with classmarks beginning with ‘S696’, ‘Maps’ or ‘Atlases’ are held in the Map Room, in various drawers and cabinets.
The Map Room is a pen-free zone; if you’re writing something, use a pencil. Smartphones and hand-held cameras are allowed, but under UL policy photos cannot be taken of the building itself. With prior permission, however, it is possible to take images of material in the UL, which I did. The first series is from a map on display in the UL; titled “A map containing the towns villages gentlemen’s houses roads river and other remarks for 20 miles around London”, it was printed for a William Knight in 1710 and is a wonderful piece of cartography. The second series is from a map I requested using the card-index system; this map dates back to 1949 and beautifully illustrates tea-growing regions in the Indian subcontinent.
If there’s a map in the UL you want an image of (for non-commercial or private-study purposes only!), I’d be happy to do what I can to help; I would actually be very grateful for an excuse to spend an afternoon looking at maps.
We often find ourselves choosing between various data formats while dealing with spatial data. Consider this (not-so) hypothetical example: your data collection department passed on a bunch of KML files but your analysts insist on SHP files and your web team is very particular about their GeoJSON. If this sounds familiar, you’re reading the right post; we will quickly run through some of the popular vector and raster data formats you should care about and discuss some of the ways to convert data between these formats.
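To make the conversion problem concrete, here is a minimal sketch of turning the Point placemarks in a KML document into a GeoJSON FeatureCollection, using only the Python standard library. The function name `kml_points_to_geojson` and the sample placemark (name and coordinates) are made up for illustration, and the sketch deliberately ignores lines, polygons, styles and extended data.

```python
import json
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Illustrative input only: the name and coordinates are invented.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Sample place</name>
      <Point><coordinates>77.62,13.0,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def kml_points_to_geojson(kml_text):
    """Convert the Point placemarks in a KML string to a GeoJSON FeatureCollection."""
    root = ET.fromstring(kml_text)
    features = []
    for placemark in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # not a Point placemark; this sketch skips it
        # KML writes "lon,lat[,alt]"; GeoJSON also puts longitude first.
        lon, lat = (float(v) for v in coords.strip().split(",")[:2])
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    print(json.dumps(kml_points_to_geojson(SAMPLE_KML), indent=2))
```

For real datasets a dedicated tool is usually the better choice; GDAL's `ogr2ogr`, for example, converts between KML, shapefiles and GeoJSON without hand-written parsing.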
The shapefile is perhaps the most popular spatial data format, introduced by Esri.
Although Esri still has the right to change the format when and if they choose to do so, it is otherwise open and highly interoperable. Shapefiles can store all the commonly used spatial geometries (points, lines, polygons) along with the attributes to describe these features. Unlike other vector formats, a shapefile comes as a set of three or more files: the mandatory .shp, .shx and .dbf files, and the optional .prj file. The .shp file holds the actual geometries, the .shx is an index which allows you to ‘seek’ the features in the shapefile, the .dbf file stores the attributes and the .prj file specifies the projection the geometries are stored in.
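Because a shapefile is really a bundle of files, a common failure is receiving a .shp without its companions. The sketch below checks which parts are present next to a given .shp path. The helper name `shapefile_parts` is hypothetical, and the check assumes lowercase extensions that share the .shp file's base name.

```python
from pathlib import Path

# Components named above: three mandatory files plus the optional projection file.
MANDATORY = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj",)


def shapefile_parts(shp_path):
    """Return (found, missing): which component files exist beside shp_path,
    and which mandatory ones are absent."""
    base = Path(shp_path)
    found = {ext: base.with_suffix(ext).exists() for ext in MANDATORY + OPTIONAL}
    missing = [ext for ext in MANDATORY if not found[ext]]
    return found, missing
```

Calling `shapefile_parts("roads.shp")` (an illustrative file name) returns a dictionary of the extensions found and a list of the mandatory ones that are missing; an empty list means the set is complete enough to open, though without a .prj the projection still has to be known from elsewhere.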
Denis Wood: Maps are just nude pictures of reality, so they don’t look like arguments. They look like “Oh my god, that’s the real world.” That’s one of the places where they get their kick-ass authority. Because we’re all raised in this culture of: if you want to know what a word means, go to the dictionary; if you want to know what the longest river in the world is, look it up in an encyclopedia; if you want to know where some place is, go to an atlas. These are all reference works and they speak “the truth.” When you realize in the end that they’re all arguments, you realize this is the way culture gets reproduced. Little kids go to these things and learn these things and take them on, and they take them on as “this is the way the world is.”
The fabulous neogeographers at the Oxford Internet Institute used Alexa data to identify the most visited website in each country, and mapped the results as an old colonial-style choropleth map of ‘Internet empires’. Do not miss another map on the same page, which uses a hexagonal cartogram to weight the most-visited website in each country by that country’s population of Internet users.