<p>Paul Ramsey — <a href="http://blog.cleverelephant.ca">blog.cleverelephant.ca</a> — pramsey@cleverelephant.ca</p>
<h1><a href="http://blog.cleverelephant.ca/2024/02/pgconf-program">Building the PgConf.Dev Programme</a> (2024-02-06)</h1>
<p><strong>Update: The programme is now <a href="https://www.pgevents.ca/events/pgconfdev2024/sessions/">public</a>.</strong></p>
<p>The programme for <a href="https://pgconf.dev">pgconf.dev</a> in Vancouver (May 28-31) has been selected, the speakers have been notified, and the whole thing should be posted on the web site relatively soon.</p>
<p><img src="/images//2024/vancouver2.jpg" alt="Vancouver, Canada" /></p>
<p>I have been on programme committees a number of times, but for regional and international <a href="https://foss4g.org/">FOSS4G</a> events, never for a PostgreSQL event, and the parameters were notably different.</p>
<p>The parameter that was <strong>most</strong> important in selecting a programme this year was the over <strong>180 submissions</strong> for the <strong>33 available speaking slots</strong>. For FOSS4G conferences, it has been normal to have between two and three times as many submissions as slots. To have almost six times as many made the process <strong>very difficult indeed</strong>.</p>
<p>Why only 33 speaking slots? Well, that’s a result of two things:</p>
<ul>
<li>Assuming no more than modest growth over the last iteration of PgCon puts <strong>attendance at around 200</strong>, which is the size of our plenary room. 200 attendees implies no more than <strong>3 tracks of content</strong>.</li>
<li>Historically, PostgreSQL events use talks of about 50 minutes in length, within a <strong>one-hour slot</strong>. Over three tracks and two days, that gives us around 33 talks (with slight variations depending on how much time goes to plenary, keynotes, or lightning talks).</li>
</ul>
<p>The content of those 33 talks falls out from being the successor to <a href="https://pgcon.org">PgCon</a>. PgCon has historically been the event attended by all major contributors. There is an invitation-only contributors round-table on the pre-event day, specifically for the valuable face-to-face synch-up.</p>
<p><img src="/images//2024/room.jpg" alt="Seminary Room" /></p>
<p>Given only 33 slots, and a unique audience that contains so many contributors, the question of what <a href="https://pgconf.dev">pgconf.dev</a> should “be” ends up focussed around making the best use of that audience. <a href="https://pgconf.dev">pgconf.dev</a> should be a place where users, developers, and community organizers come together to focus on Postgres development and community growth.</p>
<p>That’s why in addition to talks about future development directions there are talks about PostgreSQL coding concepts, and patch review, and extensions. High throughput memory algorithms are good, but so is the best way to write a technical blog entry.</p>
<p>Getting from 180+ submissions to 33 selections (plus some stand-by talks in case of cancellations) was a process that consumed three calls of over two hours each, plus several hours of reading every submitted abstract.</p>
<p>The process, shepherded by the inimitable <a href="https://jkatz05.com/">Jonathan Katz</a>, ran in phases:</p>
<ul>
<li>A first phase of just coding talks as either “acceptable” or “not relevant”. Any talks that all the committee members agreed were “not relevant” were dropped from contention.</li>
<li>A second phase where each member picked 40 talks from the remaining set into a kind of “personal program”. The talks with just one program member selecting them were then reviewed one at a time, and that member would make the case for them being retained, or let them drop.</li>
<li>A winnow looking for duplicate topic talks and selecting the strongest, or encouraging speakers to collaborate.</li>
<li>A third “personal program” phase, but this time narrowing the list to 33 talks each.</li>
<li>A winnow of the most highly ranked talks, to make sure they really fit the goal of the programme and weren’t just a topic we all happened to find “cool”.</li>
<li>A talk-by-talk review of all the remaining talks, ensuring we were comfortable with all choices, and with the aggregate makeup of the programme.</li>
</ul>
<p>The <a href="https://2024.pgconf.dev/cfp/">programme committee</a> was great to work with, willing to speak up about their opinions, disagree amicably, and come to a consensus.</p>
<p><img src="/images//2024/SFU.jpg" alt="SFU" /></p>
<p>Since we had to leave 150 talks behind, there are no doubt lots of speakers who are sad they weren’t selected, and lots of talks we would have taken if we had more slots.</p>
<p><strong>If you read all the way to here, you must be serious about coming, so you need to register and <a href="https://2024.pgconf.dev/venue/">book your hotel</a> right away. Spaces are, really, no kidding, very limited.</strong></p>
<h1><a href="http://blog.cleverelephant.ca/2024/01/pgconf-dev">PgConf.Dev @ Vancouver, May 28-31</a> (2024-01-07)</h1>
<p>This year, the global gathering of PostgreSQL developers has a new name, and a new location (but more-or-less the same dates) … <a href="https://www.pgcon.org/2024/">pgcon.org</a> is now <a href="https://2024.pgconf.dev/">pgconf.dev</a>!</p>
<p>Some important points right up front:</p>
<ul>
<li>The <a href="https://2024.pgconf.dev/cfp/">call for papers</a> is closing in one week! If you are planning to submit, now is the time!</li>
<li>The hotel scene in Vancouver is <strong>competitive</strong>, so if you put off booking accommodations… don’t do that! Book a <a href="https://2024.pgconf.dev/venue/">room</a> right away.</li>
<li>The venue capacity is <strong>200</strong>. That’s it, so once we have 200 registrants, we are full for this year. <a href="https://2024.pgconf.dev/registration/">Register now</a>.</li>
<li>There are also limited <a href="https://2024.pgconf.dev/sponsor-levels/">sponsorship</a> slots. Is PostgreSQL important to your business? Sponsor!</li>
</ul>
<p><img src="/images//2024/vancouver.jpg" alt="Vancouver, Canada" /></p>
<p>I first attended <a href="https://www.pgcon.org/2011/">pgcon.org</a> in 2011, when I was <a href="https://blog.cleverelephant.ca/2011/05/keynote-may.html">invited to keynote</a> on the topic of <a href="https://postgis.net">PostGIS</a>. Speaking in front of an audience of PostgreSQL luminaries was really intimidating, but also gratifying and empowering. Notwithstanding my imposter syndrome, all those super clever developers thought our little geospatial extension was… kind of clever.</p>
<p>I kept going to PgCon as regularly as I was able over the years, and was never disappointed. The annual gathering of the core developers of PostgreSQL necessarily includes content and insights that you simply cannot come across elsewhere, all compactly in one smallish conference, and the hallway track is amazing.</p>
<p>PostgreSQL may be a global development community, but the power of personal connection is not to be denied. Getting to meet and talk with core developers helped me understand where the project was going, and gave me the confidence to push ahead with my (very tiny) contributions.</p>
<p>This year, the event is in Vancouver! Still in Canada, but a little more directly connected to <a href="https://www.flightsfrom.com/YVR">international air hubs</a> than Ottawa was.</p>
<p>Also, this year I am honored to get a chance to serve on the <a href="https://2024.pgconf.dev/cfp/">program committee</a>! We are looking for technical talks from across the PostgreSQL ecosystem, as well as about happenings in core. PostgreSQL is so much larger than just the core, and spreading the word about how you are building on PostgreSQL is important (and I am not just saying that as an extension author).</p>
<p>I hope to see you all there!</p>
<h1><a href="http://blog.cleverelephant.ca/2023/12/duck">Data Science is Getting Ducky</a> (2023-12-19)</h1>
<p>For a long time, a big constituency of users of <a href="https://postgis.net">PostGIS</a> has been people with large data analytics problems that crush their desktop GIS systems. Or people who similarly find that their geospatial problems are too large to run in R. Or Python.</p>
<p>These are data scientists or adjacent people. And when they ran into those problems, the first course of action would be to move the data and parts of the workload to a “real database server”.</p>
<p>This all made sense to me.</p>
<p>But recently, something transformative happened – <strong>Crunchy Data upgraded my work laptop to a MacBook Pro</strong>.</p>
<p><img src="/images//2023/m2.jpeg" alt="" /></p>
<p>Suddenly a <a href="https://libgeos.org">GEOS</a> compile that previously took 20 minutes, took 45 seconds.</p>
<p>I now have processing power on my local laptop that previously was only available on a server. The MacBook Pro may be a leading indicator of this amount of power, but the trend is clear.</p>
<p>What does that mean for default architectures and tooling?</p>
<p>Well, for data science, it means that a program like <a href="https://duckdb.org/">DuckDB</a> goes from being a bit of a curiosity, to being the default tool for handling large data processing workloads.</p>
<p>What is DuckDB? According to the web site, it is “an in-process SQL OLAP database management system”. That doesn’t sound like a revolution in data science (it sounds really confusing).</p>
<p>But consider what DuckDB rolls together:</p>
<ul>
<li>A column-oriented processing engine that makes the most efficient possible use of modern processors: parallelism ensures every CPU is put to work, and low-level optimizations ensure each tick of those processors pushes as much data through the pipe as possible.</li>
<li>Wide ranging support for different data formats, so that integration can take place on-the-fly without requiring translation or sometimes even data download steps.</li>
</ul>
<p>Having those things together makes it a data science power tool, and removes a lot of the prior incentive that data scientists had to move their data into “real” databases.</p>
<p><img src="/images//2023/duck.jpg" alt="" /></p>
<p>When they run into the limits of in-memory analysis in R or Python, they will instead serialize their data to local disk and use DuckDB to slam through the joins and filters that were blowing out their RAM before.</p>
<p>They will also take advantage of DuckDB’s ability to stream remote data from data lake object stores.</p>
<p>What, stream multi-gigabyte JSON files? Well, yes, that’s possible, but it’s not where the action is.</p>
<p>The CPU is not the only laptop component that has been getting ridiculously powerful over the past few years. The network pipe that <strong>connects that laptop to the internet</strong> has also been getting both wider and lower latency with every passing year.</p>
<p>As the prospect of streaming data for analysis has come into view, the formats for remote data have also evolved. Instead of JSON, which is relatively fluffy and hard to efficiently filter, the Parquet format is becoming a new standard for data lakes.</p>
<p><img src="/images//2023/parquet.jpg" alt="" /></p>
<p>Parquet is a binary format that <a href="https://www.crunchydata.com/blog/parquet-and-postgres-in-the-data-lake#wait-parquet">organizes the data into blocks</a> for efficient subsetting and processing. A DuckDB query to a properly organized Parquet time series file might easily pull only records for 2 of 20 columns, and 1 day of 365, reducing a multi-gigabyte download to a handful of megabytes.</p>
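<p>That kind of selective read can be sketched in DuckDB SQL. This is a hypothetical example — the file URL, column names, and date filter are made up for illustration:</p>

```sql
-- Hypothetical: pull just two columns and one day from a remote
-- Parquet file. DuckDB reads the file's footer metadata, then fetches
-- only the row groups and column chunks it needs via HTTP range
-- requests, rather than downloading the whole multi-gigabyte file.
SELECT station_id, temperature
FROM read_parquet('https://example.com/weather-2023.parquet')
WHERE obs_date = DATE '2023-06-01';
```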
<p>The huge rise in available local computation, and network connectivity is going to spawn some new standard architectures.</p>
<p>Imagine a “two tier” architecture where tier one is an HTTP object store and tier two is a Javascript single page app? The <a href="https://geotiffjs.github.io/cog-explorer/#long=16.370&lat=48.210&zoom=5&scene=&bands=&pipeline=">COG Explorer</a> has already been around for a few years, and it’s just such a two tier application.</p>
<p class="note">(For fun, recognize that an architecture where the data are stored in an access-optimized format, and access is via primitive file-system requests, while all the smarts are in the client-side visualization software is… the old workstation GIS model. Everything old is new again.)</p>
<p>The technology is fresh, but the trendline is pretty clear. See <a href="https://www.youtube.com/watch?v=PFWjMHXdRdY">Kyle Barron’s talk</a> about GeoParquet and DeckGL for a taste of where we are going.</p>
<p><img src="/images//2023/duckelephant.jpg" alt="" /></p>
<p>Meanwhile, I expect that a lot of the growth in PostGIS / PostgreSQL we have seen in the data science field will level out for a while, as the convenience of DuckDB takes over a lot of workloads.</p>
<p>The limitations of Parquet (efficient remote access limited to a handful of filter variables being the primary one, along with conjoint spatial/non-spatial filters and joins) will still leave use cases that require a “real” database, but a lot of people who used to reach for PostGIS will be reaching for Duck, and that is going to <strong>change a lot of architectures</strong>, some for the better, and some for the worse.</p>
<h1><a href="http://blog.cleverelephant.ca/2023/12/ceo">How to Become a CEO</a> (2023-12-14)</h1>
<p>As a young man, I had a lot of ambition to climb the greasy pole, to get to the “top” of this heap we call a “career”, and as time went on I started doing <strong>little explorations of the career histories</strong> of people who made it to that apex corporate title, the “CEO”.</p>
<p><img src="/images//2023/ceo0d.jpg" alt="" /></p>
<p>It is worth doing this because, by and large, our society is run by people who either have been CEOs or who have come very close. Pull lists of boards of both private and public institutions and you will see a lot of people who have <strong>ascended to the top of large institutions</strong> before moving into governance. These are the people who determine the direction of our society, by and large.</p>
<p>And how have they gotten there? Through a surprisingly small number of routes that are all highly <a href="https://en.wikipedia.org/wiki/Path_dependence">path dependent</a>.</p>
<p>If you spend some time exploring the employment histories of corporate leaders, you’ll find really just a couple archetypes.</p>
<h2 id="the-long-term-corporate-climber">The Long-term Corporate Climber</h2>
<p>By far the most common pattern is for a future CEO to find an entry- or mid-level position in a large organization, and then <strong>work at that one organization for 15 to 25 years</strong>, ascending the ranks.</p>
<p><img src="/images//2023/ceo1d.jpg" alt="" /></p>
<p>Once they get to just below the CEO, they either leap to a CEO position at another firm, or finally ascend to the CEO position of their originating organization.</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Darren_Woods">Darren Woods</a>, CEO of ExxonMobil, spent 25 years working his way to the top of Exxon.</li>
<li><a href="https://www.linkedin.com/in/ginnirometty/details/experience/">Ginni Rometty</a>, former CEO of IBM, spent 25 years working her way to the top of IBM.</li>
<li><a href="https://www.freep.com/story/money/cars/ford/2019/06/19/ford-executive-jim-farley-toyota/1299871001/">Jim Farley</a>, CEO of Ford, actually did his important climbing (from entry level to upper management) over 17 years at Lexus, then spent another 13 years at Ford completing the climb to CEO through a sequence of high level regional jobs.</li>
<li><a href="https://www.linkedin.com/in/david-hutchens-9a618210/details/experience/">David Hutchens</a>, the CEO of our local gas utility, spent 26 years climbing the rungs of Tucson Electric.</li>
</ul>
<p>I was spurred to write about this topic today when I learned that EDB has a new CEO (what?!), <a href="https://www.linkedin.com/in/kedallas/">Kevin Dallas</a>, who (wait for it), spent 24 years climbing the greasy pole at Microsoft, before being tapped for his first CEO gig in 2020.</p>
<p>Speaking of Microsoft, even corporate leadership savant <a href="https://www.business-standard.com/about/who-is-satya-nadella">Satya Nadella</a> started as an entry level engineer in Microsoft, taking the CEO slot in 2014 after 22 years of slogging upwards.</p>
<p>In the main, the <strong>way to become a CEO</strong> (of a large organization) is to <strong>get yourself a job in a large organization early in your career</strong>, so you can accumulate the experience and contacts necessary to be considered a viable candidate later in your career.</p>
<p>The path dependence is kind of obvious. If you spend your early career on something else, by the time you get into a large organization you will be starting too far down the hierarchy to reach the top before your career tapers off.</p>
<p>To many, the surprising thing about these career profiles is <strong>how rarely there are mid-career jumps</strong> between corporations. Probably this is because people under-estimate the <strong>power of social networks</strong>.</p>
<p>Your reputation for “getting things done”, the density of people who find you charming, the employees and hangers on who benefit from your rise in the organization, they are all highest in one place: the place you already work. Moving laterally in mid-career to a new organization instantly <strong>resets your accumulated social capital to zero</strong>.</p>
<h2 id="the-founder-or-early-hire">The Founder or Early Hire</h2>
<p>One exception to the rule is the <strong>founder</strong> of a company that grows to a scale sufficient to be considered comparable to existing institutions.</p>
<p><img src="/images//2023/ceo2d.jpg" alt="" /></p>
<p><strong>This is, as you can imagine, quite rare.</strong></p>
<p>In the “wow that’s insane” founder category: Bill Gates, Steve Jobs, Mark Zuckerberg, Sara Blakely.</p>
<p>Or the “locally known but still huge” founder category: Ryan Holmes (Hootsuite), Chip Wilson (Lululemon), Stewart Butterfield (Slack, Flickr), James Pattison (Pattison Group), Dennis Washington (Seaspan).</p>
<p>In the tech space, there’s also a lot of early hires, who necessarily progressed quite quickly through the “ranks” as the company they had lucked into exploded in size.</p>
<ul>
<li>Eric Schmidt, who rode the Sun Microsystems rocket to senior management before finding CEO roles at Novell and Google.</li>
<li>Steve Ballmer, who… do I need to even say?</li>
<li>Sundar Pichai, who joined Google in 2004 and held on to become CEO as the founders burned out.</li>
</ul>
<h2 id="the-well-connected">The Well Connected</h2>
<p>This is an interesting third category, which is difficult to join, but is very much real – <strong>knowing people</strong> who will elevate you early on.</p>
<ul>
<li>
<p>Like, former Treasury Secretary <a href="https://www.cbsnews.com/news/a-closer-look-at-treasury-sec-geithner/">Tim Geithner</a> had, on the one hand, a kind of conventional “grind it out” career working his way up the ranks of the senior federal civil service. But on the other hand, the roles he was in, right from the start, were <strong>quite high level</strong>. How did he manage that? He was recommended for his <a href="https://archive.ph/KOOFI">first job out of college</a> at <strong>Kissinger Associates</strong> by the Dean of his faculty at Johns Hopkins. From there he met lots of powerful people who would vouch for his brilliance, and away he went. Now, it surely helped that he was brilliant! But the connections were necessary too.</p>
</li>
<li>
<p>I checked out the career history of <a href="https://en.wikipedia.org/wiki/Jamie_Dimon">Jamie Dimon</a>, CEO of JP Morgan, expecting to find a long slog at a major financial institution, but it turns out Dimon got an <a href="https://en.wikipedia.org/wiki/Jamie_Dimon#Early_life_and_education">early boost</a> into leadership, through his connection to <a href="https://en.wikipedia.org/wiki/Sanford_I._Weill">Sandy Weill</a>, who recruited him to American Express. And how did Weill know Dimon? Dimon’s <strong>mother</strong> knew Weill and got Dimon hired for a summer job with him. Again, it surely helped that Dimon was sharp as a tack! But, without his mom…</p>
</li>
</ul>
<p><img src="/images//2023/ceo3d.jpg" alt="" /></p>
<p>In lots of cases, this category is fully subsumed in the first. Anyone who grinds up a corporate hierarchy will find boosters and mentors who will in turn help them get ahead. Often a senior leader gets a lot of help from a talented junior and they ascend the hierarchy in parallel. Being the “assistant to the President” might make you officially lower on the totem pole than the CFO, but unofficially and in terms of career advancement… that can be another story altogether.</p>
<h2 id="advice">Advice?</h2>
<p>Despite my long-time desire to climb the greasy pole, I have never worked for an institution large enough to have any serious opportunities to climb, and have finally achieved a zen calm about career. By and large my career has been something that happened <strong>to</strong> me, not something I <strong>planned</strong>, and that colors my perceptions a lot.</p>
<p>First jobs lead to first connections, and first connections determine what paths open up as you move on to second and third jobs. <strong>Path dependence in career progression is huge</strong>. Probably the most important moment is early career, getting into an institution or industry that is poised for growth and change.</p>
<p><img src="/images//2023/ceo4d.jpg" alt="" /></p>
<p>It’s possible to rise in an older, established institution, but my impression is that it’s more of a knife fight. I don’t think the alternate-universe Steve Ballmer who started in sales at IBM would have risen to be a CEO.</p>
<p>Far and away the most important thing you can amass, at any career stage, is connections. Take every opportunity to meet new people, and find people and topics that stimulate your curiosity. If what you are doing is boring or unpleasant, it’s never going to matter what your title is, or how high up the pole you are.</p>
<h1><a href="http://blog.cleverelephant.ca/2023/12/foss4g-keynote-2023">Keynote @ FOSS4G NA 2023</a> (2023-12-12)</h1>
<p>Preparing the keynote for <a href="https://foss4gna.org/">FOSS4G North America</a> this year felt particularly difficult. I certainly sweated over it.</p>
<ul>
<li>Audience was a problem. I wanted to talk about my usual thing, business models and economics, but the audience was going to be a mash of people new to the topic and people who had seen my <em>spiel</em> multiple times.</li>
<li>Length was a problem. Out of an excess of faith in my abilities, the organizers gave me a full hour-long slot! That is a very long time to keep people’s attention and try to provide something interesting.</li>
</ul>
<p>The way it all ended up was:</p>
<ul>
<li>Cadging some older content from keynotes about business models, to bring new folks up to speed.</li>
<li>Mixing in some only slightly older content about cloud models.</li>
<li>Adding in some new thoughts about the way everyone can work together to make open source more sustainable (or at least less extractive) over the long term.</li>
</ul>
<p>Here’s this year’s iteration.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/1OfunxBysmg?start=190" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>
<p>The production of this kind of content is involved. The goal is to remain interesting over a relatively long period of time.</p>
<p>I have become increasingly opinionated about how to do that.</p>
<ul>
<li><strong>No freestyling.</strong> Blathering over bullet points is unfair to your audience. The aggregate time of an audience of 400 is very large. 5 minutes of your “um” and “ah” translates into 33 hours of dead audience time.</li>
<li><strong>Get right to it.</strong> No mini-resume, no talking about your employer (unless you are really sneaky about it, like me 😉), this is about delivering ideas and facts that are relevant to the audience. Your introducer can handle your bona fides.</li>
<li><strong>Have good content.</strong> The hardest part! (?) Do you have something thematic you can bookend the start and end with? Are there some interesting facts that much of the audience does not know yet? Are there some unappreciated implications? This is, presumably, why you were asked to keynote, so hopefully not too, too hard. This is the part that I worry over the most, because I really have <em>no faith</em> that what I have to say is actually going to be interesting to an audience, no matter how much I gussy it up.</li>
<li><strong>Work from a text.</strong> The way to avoid blather is to know exactly what you are going to say. At 140 words-per-minute speaking pace, a 55 minute talk is 7700 words, which coincidentally (not) is exactly how long my keynote text is.</li>
<li><strong>Write a speech, not an article.</strong> You will have to say all those words! Avoid complicated sentence constructions. Keep sentences short. Take advantage of parallel constructions to make a point, drive a narrative, force a conclusion. (see?) Repeat yourself. Repeat yourself.</li>
<li><strong>Perform, don’t read.</strong> Practice reading out loud. Get used to leaving longer gaps and get comfortable with silence. Practice modulating your voice. Louder, softer. Faster, slower. Drop. The. Hammer. Watch a gifted speaker like Barack Obama <a href="https://youtube.com/watch?v=pWe7wTVbLUU">deliver a text</a>. He isn’t ad libbing, he’s performing a prepared text. See what he does to make that sound spontaneous and interesting.</li>
<li><strong>Visuals as complements, not copies.</strong> Your <a href="https://docs.google.com/presentation/d/1v0S_ExDBR7AcDOqH9C8TpdZ04iuXTFdotiQXGivhbpY/edit">slides</a> should complement <em>and amplify</em> your content, not recapitulate it. In the limit, you could do all-text slides, which just give the three-word summary of your current main point. (This <a href="https://www.youtube.com/watch?v=hxKlmCh0tGA&ab_channel=lessig">classic Lessig talk</a> is my favourite example.)</li>
<li><strong>Visuals as extra channel.</strong> Keep changing up the visual. Use the slide notes space to get a feel for how long each slide should be up. (Hint, about 50 words on average.) Keeping slide duration low also helps in terms of using the per-slide speaker notes as low-end teleprompter (increase notes font size! reduce slide preview size!) from which you <em>deliver your performance</em>.</li>
</ul>
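<p>The arithmetic behind two of the numbers in the list above (the 7,700-word script and the 33 hours of dead audience time) is easy to check:</p>

```python
# Back-of-envelope arithmetic behind the keynote advice above.

def script_words(minutes, words_per_minute=140):
    """Words needed to fully script a talk at a steady speaking pace."""
    return minutes * words_per_minute

def dead_audience_hours(audience_size, wasted_minutes):
    """Aggregate audience-hours burned by filler ("um", "ah")."""
    return audience_size * wasted_minutes / 60

print(script_words(55))             # 7700 words for a 55-minute keynote
print(dead_audience_hours(400, 5))  # ~33.3 hours of dead audience time
```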
<p>I originally started scripting talks because it allowed me to smooth out the quality of my talks. With a script, it wasn’t a crapshoot whether I had a good ad lib delivery or a bad one, I had a nice consistent level. From there, leveraging up to take advantage of the format to increase the talk quality was a natural step. Speakers like Lessig and <a href="https://www.youtube.com/watch?v=JB87qJGSvuk">Damian Conway</a> remain my guide posts.</p>
<p>If you liked the keynote video and want to use the materials, the slides are <a href="https://docs.google.com/presentation/d/1v0S_ExDBR7AcDOqH9C8TpdZ04iuXTFdotiQXGivhbpY/edit">available here under CC BY</a>.</p>
<h1><a href="http://blog.cleverelephant.ca/2023/07/mapscaping-pgeventserv">MapScaping Podcast: Pg_EventServ</a> (2023-07-08)</h1>
<p>Last month I got to record a couple podcast episodes with the <a href="https://mapscaping.com/podcast/rasters-in-a-database/">MapScaping Podcast</a>’s <a href="https://mapscaping.com/about-us/">Daniel O’Donohue</a>. One of them was on the benefits and many pitfalls of <a href="https://mapscaping.com/podcast/rasters-in-a-database/">putting rasters into a relational database</a>, and the other was about real-time events and pushing data change information out to web clients!</p>
<ul>
<li><a href="https://mapscaping.com/podcast/postgresql-listen-and-notify-clients-in-real-time/">PostgreSQL – Listen and Notify Clients In Real Time</a></li>
</ul>
<p>TL;DR: geospatial data tends to be more “visible” to end user clients, so communicating change to multiple clients in real time can be useful for “common operating” situations.</p>
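<p>The underlying mechanism is PostgreSQL’s LISTEN/NOTIFY, which pg_eventserv bridges out to web sockets. A minimal sketch (the channel name and JSON payload here are made up for illustration):</p>

```sql
-- One session subscribes to a named channel...
LISTEN object_changes;

-- ...and any other session (often a table trigger) pushes events to it.
NOTIFY object_changes, '{"table":"roads","op":"UPDATE","id":42}';

-- Equivalent function form, handy inside trigger bodies:
SELECT pg_notify('object_changes', '{"table":"roads","op":"UPDATE","id":42}');
```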
<p>I also recorded a presentation about <a href="https://github.com/crunchydata/pg_eventserv">pg_eventserv</a> for <a href="https://www.youtube.com/playlist?list=PLesw5jpZchudJTmRukWO1eP5-6zPpIm5x">PostGIS Day 2022</a>.</p>
<ul>
<li><a href="https://www.youtube.com/watch?v=Z_nOzHmpY8M">Web Sockets and Real-Time Updates for Postgres with pg_eventserv</a></li>
</ul>
<h1><a href="http://blog.cleverelephant.ca/2023/05/cugos">Keynote @ CUGOS Spring Fling</a> (2023-05-24)</h1>
<p>Last month I was invited to give a keynote talk at the <a href="https://cugos.org/2023-spring-fling/">CUGOS Spring Fling</a>, a delightful gathering of “Cascadia Users of Open Source GIS” in Seattle. I have been speaking about open source economics at FOSS4G conferences more-or-less every two years, since 2009, and took this opportunity to somewhat revisit the topics of my <a href="/2019/05/foss4g-keynote-2019.html">2019 FOSS4GNA keynote</a>.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/RN7SSj5LB6k" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>
<p>If you liked the video and want to use the materials, the slides are <a href="https://docs.google.com/presentation/d/1X8gua9b1Qh1U5ob8uPLdPYrICBK7goBWy7JYViCGObk/edit#slide=id.g22e4c0e5445_0_0">available here under CC BY</a>.</p>
<h1><a href="http://blog.cleverelephant.ca/2023/05/mapscaping-pgraster">MapScaping Podcast: Rasters and PostGIS</a> (2023-05-17)</h1>
<p>Last month I got to record a couple podcast episodes with the <a href="https://mapscaping.com/podcast/rasters-in-a-database/">MapScaping Podcast</a>’s <a href="https://mapscaping.com/about-us/">Daniel O’Donohue</a>. One of them was on the benefits and many pitfalls of putting rasters into a relational database, and it is online now!</p>
<ul>
<li><a href="https://mapscaping.com/podcast/rasters-in-a-database/">Rasters In A Database?</a></li>
</ul>
<p>TL;DR: most people think “put it in a database” is a magic recipe for: faster performance, infinite scalability, and easy management.</p>
<p>Where the database is replacing a pile of CSV files, this is probably true.</p>
<p>Where the database is replacing a collection of GeoTIFF imagery files, it is probably false. Raster in the database will be slower, will take up more space, and be very annoying to manage.</p>
<p>So why do it? Start with a default, “don’t!”, and then evaluate from there.</p>
<p>For some non-visual raster data, and use cases that involve enriching vectors from raster sources, having the raster co-located with the vectors in the database can make working with it more convenient. It will still be slower than direct access, and it will still be painful to manage, but it allows use of SQL as a query language, which can give you a lot more flexibility to explore the solution space than a purpose built data access script might.</p>
<p>There’s some other interesting tweaks around storing the actual raster data outside the database and querying it from within, that I think are the future of “raster in (not really in) the database”, <a href="https://mapscaping.com/podcast/rasters-in-a-database/">listen to the episode</a> to learn more!</p>
<h1><a href="http://blog.cleverelephant.ca/2023/05/ai-spam">LLM Use Case</a> (2023-05-09)</h1>
<p>I can only imagine how much AI large language model generated junk there is on the internet already, but I have now <strong>personally</strong> found one in my blog comments. It’s worth pointing out, since comment link spam is a long-time scourge of web publishing, and the new technology makes it just that little extra bit invisible.</p>
<p>The target blog post is <a href="/2008/05/feel-burn.html">this one</a> from the late 2000s oil price spike. A brief post about how transportation costs tie into real estate desirability. (Possible modern day tie in: will the rise of EVs and decoupling of daily transport costs from oil prices result in a suburban renaissance? God I hope not.)</p>
<p>The LLM spam by “Liam Hawkins” is elegant in its simplicity.</p>
<p><img src="/images//2023/blog-spam.png" alt="Blog Spam" /></p>
<p>I imagine the prompt is nothing more complex than “download this page and generate 20 words that are a reasonable comment on it”. The link goes to a Brisbane bathroom renovation company, that I am sure does sterling work and maybe should concentrate on word of mouth rather than SEO.</p>
<p>I need to check my comment settings, since the simplest solution is surely to just disallow links in comments. An unfortunate degradation in the whole point of the “web”, but apparently necessary in these troubled times.</p>
My Subscriptions2023-04-30T08:00:00+00:00http://blog.cleverelephant.ca/2023/04/subscriptions<p>It is the age of the unbundled subscription, and I am wondering: how long will it last? And what do our subscriptions say about us?</p>
<p>Here are mine in approximate order of acquisition:</p>
<ul>
<li><a href="https://www.newyorker.com/">New Yorker Magazine</a>, I have been a New Yorker subscriber for a very long time, and for a period in my life it was almost the only thing I read. I would read one cover-to-cover and by the time I had finished, the next would be in the mail box, and the cycle would repeat.</li>
<li><a href="https://amazon.ca">Amazon Prime</a>, I was 50/50 on this one until the video was added, and then I was fully hooked. It’s pricey, and intermittently has things I want to watch, so I often flirt with cancelling, but not so far.</li>
<li><a href="https://netflix.com/">Netflix</a>, for a while this was too cheap to not get, the kids liked some of it, I liked some, there were movies I enjoyed. However, the quality is going down and the price is going up, so it might be my first streamer cancellation.</li>
<li><a href="https://washingtonpost.com">Washington Post</a>, I got lucky and snagged a <strong>huge deal</strong> for international subscribers which has since disappeared, but got me a $2 / month subscription I couldn’t say “no” to, because I do read a lot of WP content.</li>
<li><a href="https://talkingpointsmemo.com">Talking Points Memo</a>, the best independent political journalism site, which was pivoting to subscription years before it became cool. My first political read of every day.</li>
<li><a href="https://nytimes.com">The New York Times</a>, a very pricey pleasure, but I found myself consuming a lot of NYT content, and finally felt I just had to buck up.</li>
<li><a href="https://disneyplus.com">Disney+</a>, for my son who was dying to see all the Star Wars and Marvel content. Now that he’s watched it all, we are discovering some of their other offerings, they own a quality catalogue.</li>
<li><a href="https://open.spotify.com">Spotify</a>, once the kids were old enough to have smart phones, the demand for Spotify was pretty immediate. I’ve enjoyed having access to this huge pile of music too (<a href="https://open.spotify.com/artist/0dEvJpkqhrcn64d3oI8v79">BNL</a> forever!).</li>
<li><a href="https://www.slowboring.com/">Slow Boring / Matt Yglesias</a>, my first sub-stack subscription. You can tell a lot about my political valence from this.</li>
<li><a href="https://www.volts.wtf/">Volts / David Roberts</a>, highly highly recommended if you are a climate policy nerd, as he covers climate and energy transition from every angle. Never easy, never simplistic, always worth the time.</li>
</ul>
<p>In the pre-internet days I was also a subscriber to <a href="https://harpers.org/">Harper’s</a> and <a href="https://www.theatlantic.com/">The Atlantic</a>, but dropped both subscriptions some time ago. The articles in Harper’s weren’t grabbing me.</p>
<p>The real tragedy was The Atlantic, which would publish something I really wanted to read less than once a month, so I would end up … reading it on the internet for free. The incentive structure for internet content is pretty relentless in terms of requiring new material <strong>very very frequently</strong>, and a monthly publication like The Atlantic fits that model quite poorly.</p>
<p>Except for <a href="https://www.volts.wtf/">Volts</a>, this list of paid subscriptions is curiously devoid of a huge category of my media consumption: podcasts. I listen to <a href="https://www.nytimes.com/column/ezra-klein-podcast">Ezra Klein</a>, <a href="https://www.msnbc.com/msnbc-podcast/why-is-this-happening-chris-hayes">Chris Hayes</a>, <a href="https://crooked.com/podcast-series/strict-scrutiny/">Strict Scrutiny</a>, <a href="https://podcasts.apple.com/ca/podcast/revolutions/id703889772">Mike Duncan</a>, and <a href="https://www.bloomberg.com/oddlots">Odd Lots</a> for hours a week, for free. This feels… off-kilter.</p>
<p>Although I guess some of these podcasts are brand ambassadors for larger organizations (NYT, NBC, Bloomberg), it seems hard to believe advertising is really the best model, particularly for someone like Mike Duncan who has established a pretty big following.</p>
<p>(If Mike Duncan committed to another multi-year history project, I’d sign up!)</p>
<p>One thing I haven’t done yet is tot up all these pieces and see how they compare to my pre-internet subscription / media consumption bill. A weekend newspaper or two every week. Cable television. The three current affairs magazines. The weekly video rental. Even taken <em>à la carte</em>, I bet the old fashioned way of buying did add up over the course of a year.</p>
<p>I’m looking forward to a little more consolidation, particularly in the individual creator category. Someone will crack the “flexible bundle” problem to create the “virtual news magazine” eventually, and I’m looking forward to that.</p>