The Early History of Spatial Databases and PostGIS

For PostGIS Day this year I researched a little into one of my favourite topics, the history of relational databases. I feel like in general we do not pay a lot of attention to history in software development. To quote Yoda, “All his life has he looked away… to the future, to the horizon. Never his mind on where he was. Hmm? What he was doing.”

Anyways, this year I took on the topic of the early history of spatial databases in particular. There was a lot going on in the ’90s in the field, and in many ways PostGIS was a late entrant, even though it gobbled up a lot of the user base eventually.

WKB EMPTY

I have been watching the codification of spatial data types into GeoParquet and now GeoIceberg with some interest, since the work is near and dear to my heart.

Writing a disk serialization for PostGIS is basically an act of format standardization – albeit a standard with only one consumer – and many of the same issues that the Parquet and Iceberg implementations are thinking about are ones I dealt with too.

Here is an easy one: if you are going to use well-known binary for your serialization (as GeoPackage and GeoParquet do) you have to wrestle with the fact that the ISO/OGC standard for WKB does not describe a standard way to represent empty geometries.

Empty

Empty geometries come up frequently in the OGC/ISO standards, and they are simple to generate in real operations – just subtract a big thing from a small thing.

SELECT ST_AsText(ST_Difference(
	'POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))',
	'POLYGON((-1 -1, 3 -1, 3 3, -1 3, -1 -1))'
	))

If you have a data set and are running operations on it, eventually you will generate some empties.

Which means your software needs to know how to store and transmit them.

Which means you need to know how to encode them in WKB.

And the standard is no help.

But I am!

WKB Commonalities

All WKB geometries start with a 1-byte “byte order flag” followed by a 4-byte “geometry type”.

enum wkbByteOrder  {
    wkbXDR = 0, // Big Endian
    wkbNDR = 1  // Little Endian
};

The byte order flag signals which “byte order” all the other numbers will be encoded with. Most modern hardware uses “least significant byte first” (aka “little endian”) ordering, so usually the value will be “1”, but readers must expect to occasionally get “big endian” encoded data.

enum wkbGeometryType {
    wkbPoint = 1,
    wkbLineString = 2,
    wkbPolygon = 3,
    wkbMultiPoint = 4,
    wkbMultiLineString = 5,
    wkbMultiPolygon = 6,
    wkbGeometryCollection = 7
};

The type number is an integer from 1 to 7, in the indicated byte order.
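As a rough sketch of what a reader has to do with those first five bytes, here is a little Python using only the standard `struct` module. The names `read_wkb_header` and `GEOMETRY_TYPES` are mine, for illustration only, and the sketch assumes just the seven base type numbers listed above.

```python
import struct

# Base geometry type numbers from the enum above.
GEOMETRY_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}


def read_wkb_header(wkb: bytes):
    """Return (struct byte-order prefix, geometry type number) for a WKB blob."""
    if len(wkb) < 5:
        raise ValueError("WKB needs at least 5 bytes: flag plus type number")
    flag = wkb[0]
    if flag == 1:
        order = "<"  # wkbNDR, little endian
    elif flag == 0:
        order = ">"  # wkbXDR, big endian
    else:
        raise ValueError(f"unknown byte order flag: {flag}")
    # The 4-byte type number is encoded in the byte order the flag announced.
    (type_code,) = struct.unpack_from(order + "I", wkb, 1)
    return order, type_code


order, type_code = read_wkb_header(bytes.fromhex("010600000000000000"))
print(order, type_code, GEOMETRY_TYPES.get(type_code, "unknown"))
```

Returning the `struct` prefix along with the type number means the rest of a parser can reuse it for every count and coordinate that follows.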

Collections

Collections are easy! GeometryCollection, MultiPolygon, MultiLineString and MultiPoint all have a WKB structure like this:

wkbCollection {
    byte    byteOrder;
    uint32  wkbType;
    uint32  numWkbSubGeometries;
    WKBGeometry wkbSubGeometries[numWkbSubGeometries];
}

The way to signal an empty collection is to set its numWkbSubGeometries value to zero.

So for example, a MULTIPOLYGON EMPTY would look like this (all examples in little endian, spaces added between elements for legibility, using hex encoding).

01 06000000 00000000

The elements are:

  • The byte order flag
  • The geometry type (6 == MultiPolygon)
  • The number of sub-geometries (zero)
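Writing one of these is about as simple as serialization gets. A minimal sketch in Python, assuming only the four collection type numbers from the enum above; `empty_collection_wkb` is a made-up helper name, not a function from any library.

```python
import struct

# MultiPoint, MultiLineString, MultiPolygon, GeometryCollection
COLLECTION_TYPES = (4, 5, 6, 7)


def empty_collection_wkb(type_code: int, little_endian: bool = True) -> bytes:
    """Build the 9-byte WKB for an empty collection of the given type."""
    if type_code not in COLLECTION_TYPES:
        raise ValueError(f"not a collection type number: {type_code}")
    order = "<" if little_endian else ">"
    flag = 1 if little_endian else 0
    # Byte order flag, type number, then a sub-geometry count of zero.
    return struct.pack(order + "BII", flag, type_code, 0)


print(empty_collection_wkb(6).hex())
```

The explicit `<` or `>` prefix matters here: it tells `struct` to use standard sizes with no padding, so the result is exactly 1 + 4 + 4 bytes.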

Polygons and LineStrings

The Polygon and LineString types are also very easy, because after their type number they both have a count of sub-objects (rings in the case of Polygon, points in the case of LineString) which can be set to zero to indicate an empty geometry.

For a LineString:

01 02000000 00000000

For a Polygon:

01 03000000 00000000

It is possible to create a Polygon made up of a non-zero number of empty linear rings. Is this construction empty? Probably. Should you make one of them? Probably not, since POLYGON EMPTY describes the case much more simply.
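To make the “is this construction empty?” question concrete, here is a hedged sketch of a checker that treats a Polygon as empty when it has zero rings, or only rings with zero points. That second rule is my assumption, matching the “probably” above, not something the standard spells out. `polygon_is_empty` is a hypothetical helper and only handles the plain 2D type number 3.

```python
import struct


def polygon_is_empty(wkb: bytes) -> bool:
    """True if a 2D WKB Polygon has no rings, or only zero-point rings."""
    flag = wkb[0]
    if flag not in (0, 1):
        raise ValueError(f"unknown byte order flag: {flag}")
    order = "<" if flag == 1 else ">"
    type_code, num_rings = struct.unpack_from(order + "II", wkb, 1)
    if type_code != 3:
        raise ValueError("not a 2D WKB Polygon")
    offset = 9  # 1 flag byte + 4 type bytes + 4 ring-count bytes
    for _ in range(num_rings):
        (num_points,) = struct.unpack_from(order + "I", wkb, offset)
        if num_points > 0:
            return False  # a ring with coordinates, so not empty
        offset += 4  # an empty ring is just its zero point count
    return True


# POLYGON EMPTY, then a Polygon made of two empty rings.
print(polygon_is_empty(bytes.fromhex("010300000000000000")))
print(polygon_is_empty(bytes.fromhex("0103000000" + "02000000" + "00000000" * 2)))
```

A writer that wants to stay out of trouble would normalize the second form to the first before serializing.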

Points

Saving the best for last!

One of the strange blind spots of the ISO/OGC standards is the WKB Point. There is a standard text representation for an empty point, POINT EMPTY. But nowhere in the standard is there a description of a WKB empty point, and the WKB structure of a point doesn’t really leave any place to hide one.

WKBPoint {
    byte    byteOrder;
    uint32  wkbType; // 1
    double x;
    double y;
};

After the standard byte order flag and type number, the serialization goes directly into the coordinates. There’s no place to put in a zero.

In PostGIS we established our own add-on to the WKB standard, so we could successfully round-trip a POINT EMPTY through WKB – empty points are to be represented as a point with all coordinates set to the IEEE NaN value.

Here is a little-endian empty point.

01 01000000 000000000000F87F 000000000000F87F

And a big-endian one.

00 00000001 7FF8000000000000 7FF8000000000000
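Here is a minimal Python sketch of that convention, not the PostGIS implementation. It writes the quiet NaN bit pattern `7FF8000000000000` directly as a 64-bit integer, so the bytes match the hex above regardless of how the platform spells NaN, and on reading it accepts any NaN value. The helper names `point_empty_wkb` and `is_point_empty` are hypothetical, and only 2D points are handled.

```python
import math
import struct

# IEEE 754 double quiet NaN, written as raw bits to get exact bytes.
NAN_BITS = 0x7FF8000000000000


def point_empty_wkb(little_endian: bool = True) -> bytes:
    """Encode POINT EMPTY as a 2D WKB point with NaN coordinates."""
    order = "<" if little_endian else ">"
    flag = 1 if little_endian else 0
    return struct.pack(order + "BIQQ", flag, 1, NAN_BITS, NAN_BITS)


def is_point_empty(wkb: bytes) -> bool:
    """True if the blob is a 2D WKB point whose coordinates are all NaN."""
    flag = wkb[0]
    if flag not in (0, 1):
        raise ValueError(f"unknown byte order flag: {flag}")
    order = "<" if flag == 1 else ">"
    type_code, x, y = struct.unpack_from(order + "Idd", wkb, 1)
    return type_code == 1 and math.isnan(x) and math.isnan(y)


print(point_empty_wkb().hex().upper())
print(is_point_empty(point_empty_wkb(little_endian=False)))
```

Checking with `math.isnan` rather than comparing bytes is deliberate: NaN never compares equal to itself, and other writers may emit a NaN with a different bit pattern.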

Most open source implementations of WKB have converged on this standardization of POINT EMPTY. The most common alternate behaviour is to convert POINT EMPTY objects, which are not representable, into MULTIPOINT EMPTY objects, which are. This might be confusing (an empty point would round-trip back to something with a completely different type number).

In general, empty geometries create a lot of “angels dancing on the head of a pin” cases for functions that otherwise have very deterministic results.

  • “What is the distance in meters between a point and an empty polygon?” Zero? Infinity? NULL? NaN?
  • “What geometry type is the intersection of an empty polygon and empty line?” Do I care? I do if I am writing a database system and have to provide an answer.

Over time the PostGIS project collated our intuitions and implementations in this wiki page of empty geometry handling rules.

The trouble with empty handling is that there are simultaneously a million different combinations of possibilities, and extremely low numbers of people actually exercising that code line. So it’s a massive time suck. We have basically been handling them on an “as needed” basis, as people open tickets on them.

Other Databases

  • SQL Server changes POINT EMPTY to MULTIPOINT EMPTY when generating WKB.
    SELECT Geometry::STGeomFromText('POINT EMPTY',4326).STAsBinary()
    
    0x010400000000000000
    
  • MariaDB and Snowflake return NULL for a POINT EMPTY WKB.
    SELECT ST_AsBinary(ST_GeomFromText('POINT EMPTY'))
    
    NULL
    

Readings of 2024

I did a lot of reading last year, a lot, perhaps because I had a lot of down time. I tend to read before going to sleep, and recovery from surgery and other things means I go to bed early and then fill the time between bed and sleep with books. Books, books, and more books.

To be totally precise, I read books on a Kindle, which allows me to read in the middle of the night in the dark with the back light. Also to read from any position, since all books are the same light weight when consumed via an e-reader. I am a full e-reader convert.

Anyway, I’ve had means, motive and opportunity, and I read a tonne. Some of it was bad, some of it was good, some of it was memorable, some not. Of the 50 or so books I read last year, here are ten that made me go “yes, that was good and memorable”.

Demon Copperhead, Barbara Kingsolver

I used to read Booker Prize winners, but I found the match to my taste was hit-and-miss. The Pulitzer Prize nominees list, on the other hand, has given me piles of great reads. I am still mining it for recommendations, older and older entries.

Anyways, this modern day re-telling of Dickens’s David Copperfield is set in Appalachia, at the height of the opioid crisis. The book is tightly written, has some lovely turns of phrase, and a nice tight narrative push, thanks to the borrowed plot structure. I re-read the Dickens after, because it was so much fun to mark out the character borrowings and plot beats.

Master Slave, Husband Wife, Ilyon Woo

This non-fiction re-telling of an original slavery escape narrative is occasionally verbose, but an excellent entrant into a whole category of writing I did not know existed, the contemporaneous slavery escape narrative. For obvious reasons, abolitionists before the Civil War were keen to promote stories that humanized the people trapped in the south, who might otherwise be theoretical to Northern audiences.

The book re-tells the escape of Ellen and William Craft, and wraps that story in a lot of historical context about the milieu they were escaping from (Georgian slavery) and to (abolitionist circles in the North). The actual text of their story is liberally quoted from, but this is a re-telling. Frederick Douglass appears in their story, which gave me the excuse I had long been waiting for to read the next book in this list.

Narrative of the Life of Frederick Douglass

It took me way too long to finally pick up this book, given that Douglass has shown up as such an important figure in the other historical books I have read: Team of Rivals, Memoirs of Ulysses S. Grant, And There Was Light.

One goes into books from the 1800s wondering just how punishing the language is going to be. Clauses upon subclauses upon subclauses? None of that here. Douglass writes wonderfully clean prose the modern mind can handle, and tells his story with economy but still enough context to make it powerful. Probably because as a master storyteller, he was pitching for an audience much like the modern one – made up of people with little knowledge of the particulars of the slave system, just a broad and overly simple sense of the injustice. After 150 years, still devastating and accessible.

How Much of These Hills Is Gold, C Pam Zhang

The Goodreads crew does not seem to think this book is as good as I do, but what strikes me about it and what makes me slot it into my “year’s best” is that I remember it so clearly. This is a historical novel of the California gold rush, from the eyes of children born to Chinese immigrants in the gold fields. It’s both an intense family drama, and a meditation on the power of place. It left me with a strongly remembered sense of the land, and the characters. Even though it covers a big swathe of years, the cast of characters remains small and their interactions meaningful. It’s memorable!

(Also, and this is no small thing, I read In the Distance by Hernan Diaz this year too, which is set in the same time period and has some of the same beats… so maybe these books are a pairing.)

Julia, Sandra Newman

It’s a great time to be reading about authoritarianism! In the same spirit as pairing up Demon Copperhead with David Copperfield, I also paired up a reading of George Orwell’s 1984 with this retelling of the same story from the point of view of Julia, the love interest in Orwell’s book.

Newman takes the opportunity to flesh out Julia as a character and also the world of 1984 a little more, which makes the re-read of the original really fun. I do not think I noticed before just how much Winston Smith is a self-absorbed schmuck, but once you’ve seen it, you cannot unsee it.

The Bee Sting, Paul Murray

A tragedy told from the interleaved viewpoints of four members of a family falling apart. Each chapter comes from a different character, and each builds up its point-of-view narrator while also illuminating the others. Mostly the reveal is who these people are, bit by bit, but the plot also slowly clicks together like a puzzle until that last piece slides in, and oh boy.

An easy engaging read that gets more and more intense, but you cannot look away.

Yellowface, R F Kuang

Written by an Asian-American author, about a white author appropriating the story of an Asian-American author, the story is gripping, snarky, and unblinking in its takedown of the publishing industry. Come for the plot, stay for the commentary on modern meme-making and self-promotion, the intersection between who we are and who we present ourselves as. On the internet, nobody knows you are a dog. Or everybody knows you are a dog and hates you for it.

The Librarianist, Patrick deWitt

I don’t think this book made many or any “best of” lists, so it is not clear to me what caused me to read it, but it was a treat. Just a very quiet story about an introverted retired librarian, finding his way as he transitions into retirement, and builds some new connections with his community. Sounds really boring, I know, but I hoovered it up and it still sticks with me. A good read if you need some optimism and calm in your life.

Say Nothing, Patrick Radden Keefe

A history of the Troubles in Ireland, wrapped around the story of a particular murder, long unsolved, that slowly reveals itself over the decades, as the perpetrators come to terms with their part in that violent chapter of history. The Goodreaders really like this one and I agree. I knew the bare minimum of this chapter of world history (what I gleaned from CNN at the time, and from Derry Girls more recently) and this telling makes an easy introduction, covering a wide sweep of time and context.

Notes from the Burning Age, Claire North

Claire North remains a lesser-known science fiction author, despite her low-key hit The First Fifteen Lives of Harry August (read it!), but I’m a convert, and this novel reminded me why. The world is a post-climate crisis culture that has achieved some spiritual and technological balance with the ecology, but is wrestling with the return of what we would describe as “business as usual” – the subjugation of the natural world to the needs of humans.

Following an ecological monk, turned spy, from inside the capital of the new humanists, through the other realms of this world is easy because the journey is wrapped in a high-stakes espionage story. Of all the climate stories I have read lately, this one taken from such a long distance in the future speaks to me most. I want to think we will build something new and better, and while I know our human nature can be malign, I also know it can be beautiful.

Trust, Hernan Diaz

Best for last. Told in multiple sections from multiple perspectives in multiple styles, every narrator is unreliable, each in their own way, but the idea that there is a kernel of truth lying beneath it all never goes away (and yet, is never truly revealed). Perhaps a perfect book club novel for that reason. (Not where I got it, it’s another Pulitzer winner.)

Some facts everyone agrees on. There is a very rich and powerful financier. He has a relationship with a woman, whom he marries, who is very important to him. But in what way? Unclear. And the man is malign, but in what ways? The usual mercenary ones you might expect of a Wall Street lion? Worse and additional ways? Unclear. The whole thing is a puzzle box, the language, the characters, the events. Read it. Read it again. Read it a third time.

Cancer 12

I was glancing at the New York Times and saw that Catherine, the Princess of Wales, had released an update on her treatment. And I thought, “wow, I hope she’s doing well”. And then I thought, “wow, I bet she gets a lot of positive affirmation and support from all kinds of people”.

I mean, she’s a princess.

Even us non-princesses, we need support too, and I have to say that I have been blown away by how kind the people around me in my life have been. And also how kind the other folks who I have never really talked with before have been.

I try to thank my wife as often as I can. It is hard not to feel like a burden when I am, objectively, a burden, no matter how much she avers I am not. I am still not fully well (for reasons), and I really want to be the person she married, a helpful full partner. It is frustrating to still be taking more than I’m giving.

From writing about my experience here, I have heard from other cancer survivors, and other folks who have travelled the particular path of colorectal cancer treatment. Some of them I knew from meetings and events, some from their own footprint on the internet, some of them were new to me. But they were all kind and supportive and it really helped, in the dark and down times.

From my work on the University of Victoria Board of Governors, I have come to know a lot of people in the community there, and they were so kind to me when I shared my diagnosis. My fellow board members stepped in and took on the tasks I have not been able to do the past few months, and the members of the executive and their teams were so generous in sending their well-wishes.

And finally, my employers at Crunchy Data were the best. Like above and beyond. When I told them the news they just said “take as much time as you need and get better”. And they held to that. My family doctor asked “do you need me to write you a letter for your employer” and I said “no, they’re good”, and he said, “wow! don’t see that very often”. You don’t. I’m so glad Crunchy Data is still small enough that it can be run ethically by ethical people. Not having to worry about employment on top of all the other worries that a cancer diagnosis brings, that was a huge gift, and not one I will soon forget.

I think people (and Canadians to a fault, but probably people in general) worry about imposing, that communicating their good thoughts and prayers could be just another thing for the cancer patient to deal with, and my personal experience was: no, it wasn’t. Saying “thanks, I appreciate it” takes almost no energy, and the boost of hearing from someone is real. I think as long as the patient doesn’t sweat it, as long as they recognize that “acknowledged! thanks!” is a sufficient response, it’s all great.

Fortunately, I am not a princess, so the volume was not insuperable. Anyways, thank you to everyone who reached out over the past 6 months, and also to all those who just read and nodded, and maybe shared with a friend, maybe got someone to take a trip to the gastroenterologist for a colonoscopy.

Talk to you all again soon, inshallah.

Cancer 11

What happened there, I didn’t write for three months! Two words: “complications”, and “recovery”.

In a terrifying medical specialty like cancer treatment, one of the painful ironies is that patients spend a lot of time suffering from complications and side effects of the treatments, rather than from the cancer. In my case and many others, the existence of the cancer isn’t even noticeable without fancy diagnostic machines. The treatments on the other hand… those are very noticeable!

A lot of this comes with the territory of major surgery and dangerous chemicals. My surgery carried specific possible complications including, but not limited to: incontinence, sexual dysfunction, urinary dysfunction, and sepsis.

Fortunately, I avoided all the complications specific to my surgery.

What I did not avoid was a surprisingly common complication of spending some time in a hospital while taking broad spectrum antibiotics–I contracted the “superbug” Clostridioides difficile, aka c.diff.

Let me tell you, finding you have a “superbug” is a real bummer, and c.diff lives up to its reputation. Like cancer, it is hard to kill, it does quite a bit of damage while it’s in you, and the things that kill it also do a lot of damage to your body.

Killing my c.diff required a couple of courses of specialized antibiotics (vancomycin), that in addition to killing the c.diff also killed all the other beneficial bacteria in my lower intestine.

So, two months after surgery, I was recovering from:

  • having my lower intestine handled and sliced in a major surgery
  • having that same intestine populated with c.diff and covered in c.diff toxins
  • having the microbiotic population living in my intestine nuked with a modern antibiotic developed to kill resistant superbugs

Not surprisingly, having all those things at once makes for a much longer recovery, and a pretty up-and-down one. My slowly recovering microbiota is in constant flux, which results in some really surprising symptoms.

  • highly variable stomach discomfort (ok)
  • highly variable appetite (makes sense)
  • random days of fatigue (really?)
  • random days of anxiety (what?!?)

I had not really understood the implications of gut/brain connection, until this journey showed me just how tightly bound my mental state was to the current condition of my guts. The anxiety I have experienced as a result of my c.diff exposure has been worse, amazingly, than what I felt after my initial cancer diagnosis. One was in my head, but the other was in my gut.

I have also developed a much more acute sympathy for people suffering from long Covid and other chronic diseases. The actual symptoms are bad enough, but the psychological effect of the symptom variability is really hard to deal with. Bad days follow good days, with no warning. I have mostly stopped voicing any optimism about my condition, because who knows what tomorrow will bring.

When people ask me how I’m doing, I shrug.

One thing I have got going for me, that chronic disease sufferers do not, is a sense that I am in fact improving. I started journaling my symptoms early in the recovery process, and I can look back and see definitively that while things are unpredictable day to day, or even week to week, the long term trajectory is one of improvement.

Without that, I think I’d go loopy.

Anyways, I am now roughly three months out from my last course of antibiotics, and I expect it will be at least another three months before I’m firing on all cylinders again, thanks mostly to the surgical complication of acquiring c.diff. If I was just recovering from the surgery, I imagine I would be much closer to full recovery.
