For the attention of last week

archive of tokumine.com

Month: September, 2010

Web GIS data payload benchmarks

Hardware Accelerated web GIS

Vector rendering speeds in browsers are set to go through the roof in 2010 thanks to 2D/3D hardware acceleration. Though the demos shown so far mainly target 3D games, this next step of browser evolution is a huge deal for the web-based GIS/mapping community, which to date has been limited by poor browser performance with vector graphics. Yup, expect to see the words “revolution”, “lightning fast” et al. in a lot of GIS marketing materials (will make a nice change from “geowiki”).

What will these apps render?

Data, and lots of it. You are going to be able to handle vastly larger amounts of data on the client than the current generation of browser-based web apps can.

It’s a pretty safe bet that data transport efficiency will become increasingly important to the performance of your future web GIS application.

What’s the best way to send GIS data to the browser for render?

I took a look at the relative payload sizes of some of the most commonly used simple, lossless web GIS data transport formats, and at the effect that Gzipping (a common feature of most modern webservers) and MessagePack object serialization had on them:

  • GeoJSON
  • KML
  • GML
  • SVG
  • EWKT
  • EWKB

For test data, I used the country boundary data of all 200+ countries found in the UN country boundary dataset we use at UNEP-WCMC, recording the byte size of each final payload.
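The kind of measurement described above can be sketched in a few lines of Python. This is only an illustration, not the benchmark code from the post: the sample ring and the function names (`to_geojson`, `to_wkt`, `to_wkb`, `payload_sizes`) are made up for the example, and it covers plain WKB rather than EWKB.

```python
import gzip
import json
import struct

# Hypothetical sample ring (lon, lat); real country boundaries are far larger.
RING = [(-0.1357, 51.5094), (-0.1201, 51.5150), (-0.1102, 51.5033),
        (-0.1290, 51.4988), (-0.1357, 51.5094)]


def to_geojson(ring):
    """Serialise one polygon ring as a compact GeoJSON geometry."""
    geom = {"type": "Polygon", "coordinates": [[list(p) for p in ring]]}
    return json.dumps(geom, separators=(",", ":")).encode("utf-8")


def to_wkt(ring):
    """Serialise the ring as a WKT POLYGON string."""
    points = ",".join(f"{x!r} {y!r}" for x, y in ring)
    return f"POLYGON(({points}))".encode("ascii")


def to_wkb(ring):
    """Serialise the ring as little-endian WKB (geometry type 3 = Polygon)."""
    out = struct.pack("<BII", 1, 3, 1)       # byte order, type, ring count
    out += struct.pack("<I", len(ring))      # point count for the ring
    for x, y in ring:
        out += struct.pack("<dd", x, y)      # two 8-byte doubles per point
    return out


def payload_sizes(ring):
    """Return {format: (raw_bytes, gzipped_bytes)} for the ring."""
    encoders = {"GeoJSON": to_geojson, "WKT": to_wkt, "WKB": to_wkb}
    sizes = {}
    for name, encode in encoders.items():
        raw = encode(ring)
        sizes[name] = (len(raw), len(gzip.compress(raw)))
    return sizes


if __name__ == "__main__":
    for name, (raw, zipped) in payload_sizes(RING).items():
        print(f"{name:8s} raw={raw:5d}  gzip={zipped:5d}")
```

On a geometry this small the fixed Gzip overhead dominates, so the numbers only become meaningful on realistic boundary data with thousands of points.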

GIS data payload sizes

Average GIS data payload sizes in bytes (smaller is better)

Key points

1) There are big differences in the size of uncompressed data. WKB is by far the smallest.
2) MessagePack brings GeoJSON-style hash data structures to near WKB sizes if Gzip is not possible.
3) Gzipping levels the playing field between the formats, and makes a huge difference.
4) When Gzipped, WKT, not WKB, has the smallest payload by about 20%.
5) GeoJSON is the most bulky of all the formats when Gzipped (even over K/GML!)
6) The effect of MessagePack on Gzipped payloads is minimal.
7) MessagePack is only effective for native data structures, not string compression.

GeoJSON isn’t the smallest, no clear winner

Though (somewhat shockingly) the most bulky of the formats tested when Gzipped, GeoJSON probably offers the best current development experience thanks to the abundance of parsers, human readability and toolchain support, in exchange for a very small penalty in payload size. The only caveat is the dependence on Gzip, which may not be possible depending on traffic.

If you wanted to eke out the smallest payload possible, Gzipped WKT offers the best choice (a shame that any time saved would probably be lost in the parsers). A future benchmark could also add serialisation/deserialisation times to the mix.

Without Gzip, MessagePacking JSON data structures appears to be a very interesting alternative, providing similar payload sizes to plain WKB, whilst offering simple and fast serialisation/deserialisation.
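To see why packing native data structures gets close to WKB, here is a hand-rolled encoder for a small subset of MessagePack (maps, arrays, strings, floats and small non-negative ints). It is a sketch for counting bytes, not a replacement for a real MessagePack library; `pack`, `RING` and `GEOM` are names invented for the example.

```python
import json
import struct


def pack(obj):
    """Encode a small subset of MessagePack into bytes."""
    if isinstance(obj, dict):
        n = len(obj)
        # fixmap for up to 15 entries, otherwise map16
        head = bytes([0x80 | n]) if n < 16 else b"\xde" + struct.pack(">H", n)
        return head + b"".join(pack(k) + pack(v) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        n = len(obj)
        if n < 16:
            head = bytes([0x90 | n])                  # fixarray
        elif n < 65536:
            head = b"\xdc" + struct.pack(">H", n)     # array16
        else:
            head = b"\xdd" + struct.pack(">I", n)     # array32
        return head + b"".join(pack(item) for item in obj)
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        n = len(data)
        if n < 32:
            head = bytes([0xA0 | n])                  # fixstr
        elif n < 256:
            head = b"\xd9" + bytes([n])               # str8
        elif n < 65536:
            head = b"\xda" + struct.pack(">H", n)     # str16
        else:
            raise ValueError("string too long for this sketch")
        return head + data
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)       # float64: 1 + 8 bytes
    if isinstance(obj, int) and not isinstance(obj, bool) and 0 <= obj < 128:
        return bytes([obj])                           # positive fixint
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Hypothetical sample geometry, GeoJSON-style.
RING = [(-0.1357, 51.5094), (-0.1201, 51.5150), (-0.1102, 51.5033),
        (-0.1290, 51.4988), (-0.1357, 51.5094)]
GEOM = {"type": "Polygon", "coordinates": [[[x, y] for x, y in RING]]}

if __name__ == "__main__":
    as_json = json.dumps(GEOM, separators=(",", ":")).encode("utf-8")
    print("JSON bytes:       ", len(as_json))
    print("MessagePack bytes:", len(pack(GEOM)))
    # A string payload (e.g. WKT) gains nothing: it is stored as-is plus a header.
    wkt = "POLYGON((" + ",".join(f"{x} {y}" for x, y in RING) + "))"
    print("WKT bytes:        ", len(wkt), "packed:", len(pack(wkt)))
```

Each coordinate pair costs 19 bytes here (a 1-byte array header plus two 9-byte floats) against 16 bytes in WKB, which is why the two end up close on large geometries. The last line illustrates key point 7: a packed string is just the original bytes with a small header in front.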

Keep an eye on SVG

Although looking at SVG makes me want to run screaming into the hills, I’d be hard pushed not to back SVG as a key lightweight GIS data transfer format of the future. Consider:

  • PostGIS can already output your geometry as SVG.
  • SVG payload sizes are smaller than GeoJSON, Gzipped or not.
  • SVG is natively renderable by all the latest browsers.
  • SVG will have an impressive toolchain developed around it.
  • SVG supports metadata.
  • The other formats will all need to be converted into SVG for HW accel rendering, a step which can be skipped.
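For a feel of how little stands between a coordinate ring and something the browser can draw, here is a rough Python sketch that turns a ring into SVG path data. The function names and the sample ring are hypothetical, and the output is not claimed to match what PostGIS produces byte for byte.

```python
def ring_to_svg_path(ring, decimals=4):
    """Build SVG path data for one ring, negating y so that north points up."""
    # Drop the repeated closing point; the trailing "Z" closes the path.
    points = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else ring
    parts = []
    for i, (x, y) in enumerate(points):
        cmd = "M" if i == 0 else "L"
        # Adding 0.0 turns a rounded -0.0 into 0.0 for tidier output.
        sx = round(x, decimals) + 0.0
        sy = round(-y, decimals) + 0.0
        parts.append(f"{cmd}{sx} {sy}")
    return " ".join(parts) + " Z"


def svg_document(ring):
    """Wrap the path in a minimal standalone SVG document."""
    xs = [x for x, _ in ring]
    ys = [-y for _, y in ring]
    view_box = f"{min(xs)} {min(ys)} {max(xs) - min(xs)} {max(ys) - min(ys)}"
    return (f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="{view_box}">'
            f'<path d="{ring_to_svg_path(ring)}"/></svg>')


if __name__ == "__main__":
    # Hypothetical sample ring (x, y).
    ring = [(1.5, 2.5), (3.5, 2.5), (3.5, 4.5), (1.5, 2.5)]
    print(ring_to_svg_path(ring))
    print(svg_document(ring))
```

The y values are negated because SVG's y axis points down the screen while latitude increases northwards.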

Last notes on lossy optimisations

I just thought I’d add that if the geometry is not being roundtripped back to the server, overall payload sizes can be reduced further by simplifying the geometry and snapping it to a grid, i.e. reducing the precision of the coordinates used.
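A minimal sketch of the grid-snapping half of that idea, in Python: round every coordinate to a fixed number of decimal places and drop points that collapse onto their predecessor. The names (`snap_ring`, `geojson_bytes`) and sample data are made up for the example, and proper simplification (Douglas-Peucker and friends) is not attempted here.

```python
import gzip
import json


def snap_ring(ring, decimals=3):
    """Round each coordinate to `decimals` places and drop repeated points."""
    snapped = []
    for x, y in ring:
        point = (round(x, decimals), round(y, decimals))
        # Consecutive points that land on the same grid cell are merged.
        if not snapped or snapped[-1] != point:
            snapped.append(point)
    return snapped


def geojson_bytes(ring):
    """Compact GeoJSON encoding of a single-ring polygon."""
    geom = {"type": "Polygon", "coordinates": [[list(p) for p in ring]]}
    return json.dumps(geom, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical high-precision ring (lon, lat).
    ring = [(-0.135869979, 51.509383501), (-0.135869981, 51.509383499),
            (-0.120112345, 51.515012345), (-0.110254321, 51.503398765),
            (-0.135869979, 51.509383501)]
    for label, r in (("original", ring), ("snapped", snap_ring(ring))):
        raw = geojson_bytes(r)
        print(f"{label:9s} points={len(r)} raw={len(raw)} gzip={len(gzip.compress(raw))}")
```

Because the first and last points are rounded identically, a closed ring stays closed. Very small rings can collapse to fewer than four points at coarse precision, so a real pipeline would want to check for that.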

Data used in the analysis (no. of bytes per payload) is available as a Fusion Table; the code is on GitHub.

Handy textmate snippet to convert hash syntax

This is one of those things that I’ve known existed for years, but I’ve always avoided because of a previous irrational fear of regexes.

PostGIS manuals for when refractions forget to renew their domain names…

PDF
HTML

iPhone4 HD video test, huge robots & APIs

I was passing Trafalgar Square last night, and happened across this robotics display as part of the “OUTRACE” art installation. The perfect excuse to give the HD video on the iPhone4 a try.

Aside from being pretty, on further investigation this is actually a huge, ultra expensive web mashup.

Mashup you say?

Mashup indeed. I shall illustrate the point with the use of the Google Charts Graphviz API I found yesterday.

outrace.org flow

Boris Bikes on Fusion Tables

http://tables.googlelabs.com/embedviz?viz=MAP&q=select+col0%2Ccol1%2Ccol2%2Ccol3%2Ccol4%2Ccol5%2Ccol6%2Ccol7%2Ccol8%2Ccol9%2Ccol10%2Ccol11%2Ccol12%2Ccol13+from+245192+&h=false&lat=51.509383501611595&lng=-0.13586997985839844&z=13&t=4&l=col11

To road test the release of the new geometry styling controls in Google Fusion Tables, I threw together a live visualisation of London Cycle Hire Rank availability. Using the Ruby Fusion Tables gem, it sips data from the Boris Bikes API, and is basically a total rip off of, er, homage to Oliver O’Brien’s (excellent) original visualisation.

Aside from the speed of development, the best thing about this version is: you don’t need a server.

Resources
Fusion table
Source code