Our tutorials so far have focused on several aspects of cartography, from data structures to their analysis and representation. Not surprisingly, most of them are geared towards web technologies, where browsers expect applications to consume relatively few resources. Spatial data have comparatively large memory footprints owing to their structure and the amount of information they hold. For instance, the taluk-level boundary data for India is 46.3 MB in GeoJSON format. A file of this size is impractical to use directly in a web project; it needs to be optimised first.

Optimising spatial data essentially translates to simplifying the geometries in the file. Since, in a web context, we are not using the data for analysis, a slight difference in area or in the shape of corners will not matter much. Users may not even realise that the shapes are simplified, if it is done in just the right way.

To give you an idea of the process, have a look at the following maps of Florida. The first set of images shows the original data from the Florida Geographic Data Library, converted to GeoJSON (8.2 MB). The second set shows the same data after the geometry has been simplified (note the sharp edges); the GeoJSON is now 427 KB. This hasn’t really changed the way the map looks on the whole, which is exactly what we need for web representation.

[Figures: Florida from the original data (florida_combined) and from the simplified data (florida_optimised_combined)]

In this article I will quickly look at a few easy methods to simplify geometries.

TopoJSON

TopoJSON, developed by Mike Bostock, is an extension of GeoJSON that encodes topology.

Rather than representing geometries discretely, geometries in TopoJSON files are stitched together from shared line segments called arcs.

This simplifies the structure of the data by identifying the relationships and storing them in the same file, thus eliminating redundancy. TopoJSON works seamlessly with D3.js and can be integrated with pretty much any other web application.
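To make the idea of shared arcs concrete, here is a toy sketch in Python. It is not the TopoJSON encoder or file format (real files also quantise and delta-encode coordinates); the square coordinates and the helper name `stitch` are made up for illustration. Two adjacent squares share one border. That border is stored once as an arc and each polygon refers to arcs by index, with `~i` meaning "arc i, reversed", the convention TopoJSON uses for reversed arcs.

```python
# Two adjacent squares, GeoJSON style: each ring repeats the shared border x = 2.
left_ring = [(2, 0), (2, 1), (2, 2), (0, 2), (0, 0), (2, 0)]
right_ring = [(2, 0), (4, 0), (4, 2), (2, 2), (2, 1), (2, 0)]

# Topology style: the shared border is stored once, as its own arc.
arcs = [
    [(2, 0), (2, 1), (2, 2)],          # arc 0: border shared by both squares
    [(2, 2), (0, 2), (0, 0), (2, 0)],  # arc 1: rest of the left square
    [(2, 0), (4, 0), (4, 2), (2, 2)],  # arc 2: rest of the right square
]

# Each polygon is a list of arc references; ~0 (i.e. -1) means arc 0 reversed.
polygons = {"left": [0, 1], "right": [2, ~0]}


def stitch(arc_refs, arcs):
    """Rebuild a closed ring from a list of arc references."""
    ring = []
    for ref in arc_refs:
        points = arcs[ref] if ref >= 0 else arcs[~ref][::-1]
        # After the first arc, skip the first point: it repeats the
        # last point of the previous arc.
        ring.extend(points if not ring else points[1:])
    return ring


print(stitch(polygons["left"], arcs))
print(stitch(polygons["right"], arcs))
```

Because the border exists only once, any simplification applied to arc 0 changes both squares identically, so neighbouring shapes cannot drift apart or overlap.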

Simplify using QGIS

The QGIS vector processing suite comes with a tool for simplifying geometries. It employs the popular Ramer–Douglas–Peucker algorithm, which reduces the number of points in a curve. You select the layer that you want to simplify and pick a tolerance. The higher the tolerance, the fewer the points and the smaller the file.

[Figure: the QGIS simplify geometries tool (qgis_simplify)]
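For readers curious about what the tolerance actually controls, the following is a minimal teaching sketch of the Ramer–Douglas–Peucker idea in plain Python. It is not the code QGIS runs, and the function names are my own. The algorithm keeps the two end points of a line, finds the intermediate point farthest from the straight segment between them, and either drops all intermediate points (if that distance is within the tolerance) or splits at that point and repeats on both halves.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the infinite line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # start and end coincide: fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def simplify(points, tolerance):
    """Return a simplified copy of a list of (x, y) points."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    # Find the intermediate point farthest from the start-end segment.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], start, end)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Every intermediate point is close enough: keep only the ends.
        return [start, end]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # left[-1] and right[0] are the same point


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
print(simplify(line, 0.5))  # small wobble at (1, 0.1) is dropped
print(simplify(line, 3))    # large tolerance leaves only the end points
```

The tolerance is a distance in the same units as the coordinates, which is why a value that works for data in degrees will be far too small for data in metres. The recursion here is fine for short lines; a production implementation would typically avoid deep recursion on very long ones.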

PostGIS ST_Simplify

If you are serving spatial data from a PostgreSQL database to the client through an API, PostGIS implements the previously mentioned Ramer–Douglas–Peucker algorithm in the function ST_Simplify. For example, to apply ST_Simplify to a geometry column called ‘state’ for the row with id 1 in a table called ‘country’, with a tolerance of 0.002 (in the units of the geometry’s coordinate system), the query would be:

SELECT ST_Simplify(state, 0.002) FROM country WHERE id = 1;

These techniques are essential when you deal with large amounts of spatial data that need to be rendered in the browser. If you have more ideas or questions, let us know in the comments!

 

 

It has been a while since we started writing at a consistent pace. But somehow, I see that happening now. Today, we will see how to organize and align your data so that you can make a map or two out of it.

We often deal with data in CSV formats, which potentially can be visualized as a map. Let’s start with a sample file.

code  district  boys_appeared  girls_appeared  total_appeared  boys_passed  girls_passed  total_passed  pass_%  rank
GA    UDUPI     8013           8058            16071           6852         7537          14389         89.53   1
PA    SIRSI     4582           4633            9215            3955         4183          8138          88.31   2
LL    HASSAN    11783          11968           23751           9722         10685         20407         85.92   3
DD    TUMKUR    12312          11085           23397           10305        9780          20085         85.84   4

The table above shows the first few rows from a CSV file containing SSLC results in Karnataka for the year 2012. You can download the complete file here. The contents of the file and what each column means are evident from the column headers.

The column of interest for you right now should be ‘district’. We will now use this column to make a map from this data. The process of converting an address or part of an address to a geographic coordinate is called geocoding. We will geocode this data to find the latitude and longitude of the districts.

There are several ways of geocoding data – from free and easy APIs to comprehensive as well as expensive ones. Two of our favourites are: Batch Geocode and the MapBox Google Docs Geo plugin. We will use the second one for this exercise.
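Whichever service you pick, the underlying step is the same: look up each value in the ‘district’ column and attach a latitude and longitude to the row. The sketch below shows that step in plain Python; it does not call Batch Geocode or the MapBox plugin. The function name `geocode_rows`, the comma-separated sample and the stand-in lookup table are all assumptions for illustration, and the coordinates in the lookup are rough values, not authoritative ones.

```python
import csv
import io


def geocode_rows(csv_text, geocode, column="district"):
    """Return the CSV rows as dicts with 'latitude' and 'longitude' added.

    `geocode` is any callable that takes a place name and returns
    (latitude, longitude), or None when the place is not found.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        location = geocode(row[column])
        row["latitude"], row["longitude"] = location if location else (None, None)
        rows.append(row)
    return rows


# Stand-in geocoder: approximate coordinates, for illustration only.
APPROX = {"UDUPI": (13.34, 74.74), "HASSAN": (13.00, 76.10)}

# A cut-down version of the sample file, assumed to be comma-separated.
sample = (
    "code,district,pass_%,rank\n"
    "GA,UDUPI,89.53,1\n"
    "PA,SIRSI,88.31,2\n"
    "LL,HASSAN,85.92,3\n"
)

rows = geocode_rows(sample, APPROX.get)
for row in rows:
    print(row["district"], row["latitude"], row["longitude"])
```

Rows that the geocoder cannot resolve (SIRSI in this stand-in lookup) come back with empty coordinates rather than being dropped, so they are easy to spot and fix by hand.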

 


I was employed as a spatial data and cartographic consultant on a project to analyse specific agricultural commodities and Agricultural Produce Marketing Committees (APMCs) in the Indian states of Karnataka and Madhya Pradesh. The final product was a set of maps for various publications, as well as the clean datasets themselves.

Agricultural market datasets for the states of Karnataka and Madhya Pradesh were obtained for the purposes of spatial visualisation; these contained information on wheat procurement in Madhya Pradesh (2008 – 2012), tuar production in Karnataka (2007 – 2009) and the locations and categories of APMCs in both these states. Some of the data was linked to district names, while the rest was geocoded using a free online geocoding service. I used Quantum GIS (QGIS), TextEdit and Microsoft Excel extensively for this project; Excel and TextEdit are invaluable when processing CSV files, and QGIS is where the actual mapping takes place.

The process itself involved lots of data-cleaning and a little bit of mapping. First, for the geocoding, I ran the column containing the village names through the geocoder thrice; at each repetition, I tweaked the names a little more to get more accurate coordinate results. I then had to similarly tweak the district names to get them to match up with my source shapefiles; fixing bad spellings can be a LOT of work. In its entirety, this was a tedious process that involved organising, cleaning and validating four distinct datasets with both automated and manual operations. However, the final products were datasets that were clean, had accurate spatial locations and could easily be used to produce analytically valuable maps.
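Part of the spelling clean-up can be automated. The sketch below is not the workflow used on this project (that was done in Excel, TextEdit and QGIS); it is one possible approach using Python’s standard `difflib` module to suggest the closest reference name for each raw name. The function name `match_names`, the cutoff of 0.8 and the sample names are assumptions for illustration.

```python
import difflib


def match_names(raw_names, reference_names, cutoff=0.8):
    """Map each raw name to its closest reference name, if one is close enough.

    Returns (matched, unmatched): a dict of raw name -> reference name,
    and a list of raw names that need manual attention.
    """
    # Compare in lower case, but return the reference spelling as given.
    lookup = {name.lower(): name for name in reference_names}
    matched, unmatched = {}, []
    for raw in raw_names:
        candidates = difflib.get_close_matches(
            raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
        )
        if candidates:
            matched[raw] = lookup[candidates[0]]
        else:
            unmatched.append(raw)
    return matched, unmatched


# Hypothetical names: spellings from the shapefile vs. spellings in the data.
shapefile_names = ["Gulbarga", "Bidar", "Bijapur"]
data_names = ["Gulburga", "BIDAR ", "Xyz"]

matched, unmatched = match_names(data_names, shapefile_names)
print(matched)
print(unmatched)
```

Suggestions like these still need a human check, since two genuinely different places can have similar names; the list of unmatched names is where the manual tweaking starts.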

[Figure: CASI, five years of wheat procurement in Madhya Pradesh (animated)]

 
