Letters from Somnolescent February 6, 2024

SomnolCCSO and Reviving an Old, Dead Database Lookup Protocol

by mariteaux

On a whim about two weeks ago, I decided to finally start redoing the Somnolescent Gopher server. Gopher is such a throwback, nostalgic thing for me–it was one of the first things we got set up for Somnol right when we first got hosting all the way back in December 2018. Alas, the Gopher had not been touched since 2021, outdated and rather embarrassing for me, so I ripped it all out and got it reassembled. Still working on it, but I think it’s coming out absolutely killer. You can visit it at gopher://gopher.somnolescent.net if you have a capable client, or you can use this HTTP proxy link if you’re just looking at it in your browser.

While Gopher is highly neat, among the culty hipster retro tech geeks, it’s a known quantity. There’s new Gopher clients every year, and Gemini clients oftentimes double as Gopher clients thanks to the similarities of their protocols. Not so with the true subject of today’s post. Today’s topic has no modern server software support (before us, anyway), and accessing it is even tougher, practically requiring Windows 3.1 or a *nix box with Docker and the whole setup around that. I’ve spent the last week doing a deep, deep dive into a protocol so obscure, there are fewer than ten servers for it still in existence. And we’re one of them now.

CCSO lookup in NCSA Mosaic
You’ll want to click on all the screenshots for full-size, since the text can get a little hard to read.

Say hello to SomnolCCSO, my friends. I’ll tell you how we made it happen and how you can try it out for yourself.

CCSO, ph, qi, chi, and balancing your chakras

The CSO nameserver provides an efficient database for the storage and retrieval of a variety of information over the Internet. Its primary use is for telephone and email directories, but it may be used to store any type of information.

Noel Hunter, FAQ for ph (CSO nameserver)

Take a step into the Wayback Machine with me. The year is 1988. The Soviet Union and the Iron Curtain are starting to disintegrate, the best-selling album of the year is George Michael’s Faith, and the Morris worm, written by one of the guys who later brought you “Hacker” “News”, has just been unleashed on a still rather isolated Internet. Tim Berners-Lee would start work on that whole World Wide Web thing, but there’d be no finished software for it until 1990. The Internet is still largely a computer geek and university thing.

The Computing and Communications Services Office at the University of Illinois needed a way for students to look up campuswide information (y’know, office hours and phone numbers, people’s names, email addresses, all that), but was unhappy with the choice of systems for doing information lookups. At the time, there were three potential candidates: the CSnet Nameserver, White Pages from Andrew University, and NetDB from Stanford. None met all of the CCSO’s requirements, and Steve Dorner, who’s better known for writing the Eudora email client, was tasked with amending CSnet, which was the closest to what they wanted.

Three systems that preceded CCSO in usage

CCSO was the result–well, CCSO is technically a misnomer. Officially, the client software is called ph (for phone book), and the server software is called qi (for query interpreter). But let’s be honest here, ph and qi are pretty ambiguous names for software in 2024, and referring to them separately just confuses matters. I call the whole enchilada CCSO, and so does everyone else in my experience. Regardless of nomenclature, CCSO was a simple-to-use (for the time), high-performance, lightweight information lookup protocol, one you could theoretically use for any data, though most often used to look up people.

Better yet, the fledgling Gopher protocol was officially integrated with CCSO to some degree. Many old-school browsers and Gopher clients (including pre-Gecko Netscape and NCSA Mosaic) also feature CCSO support, and there’s an official itemtype (itemtype 2) in Gopher menus for links to CCSO lookups. There’s even a writeup from the December 1992 edition of the University of Illinois’ quarterly UIUCnet publication talking about all the things Gopher is able to do–one of which is to be your Internet phonebook.
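For a sense of what that itemtype looks like in practice, here’s a minimal Python sketch that builds one Gopher menu line pointing at a CCSO server. A menu line is the itemtype character plus the display text, then the selector, host, and port, all separated by tabs. The function name and the label are made up for illustration; the host and port are the ones our own server ended up on.

```python
def ccso_menu_line(label, host, port, selector=""):
    """Build one Gopher menu line that links to a CCSO server (itemtype 2)."""
    # Gopher menu lines are tab-separated and end in CRLF:
    # <type><display text> TAB <selector> TAB <host> TAB <port>
    return f"2{label}\t{selector}\t{host}\t{port}\r\n"


# Hypothetical label; host and port taken from the server address in this post.
line = ccso_menu_line("Somnolescent who's-who", "gopher.somnolescent.net", 105)
```

A client that understands itemtype 2 should offer a search box when that menu entry is opened; one that doesn’t will likely just skip or grey out the line.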

"Gopher as Electronic Phone Book"

CCSO, never all that popular anyway, fell by the wayside by the mid-90s. Authentication and data security were a big reason: qi featured no built-in encryption, only support for MIT’s Kerberos protocol (which is still somehow being maintained to this day) for authenticating users. In English, while CCSO supports encrypted passwords, it doesn’t mandate their usage (cleartext passwords are perfectly kosher to qi), and any commands sent to the server are sent in the clear without encryption, meaning it’s pretty trivial to sniff, log, and track a user’s session history. LDAP, CCSO’s functional replacement, has support for TLS built-in. (And by the time the Web came around, this all became moot anyway. We just look people up on Google now.)

In case you’re curious, while Michael Lazar of mozz.us (whose efforts to wrangle up old CCSO software and information have been supremely useful in this journey) lists a total of nine CCSO servers still active (not including ours) on his master list at gopher://mozz.us:70/1/ccso/servers, I was only able to get five of them to reply. Whether some are down at the moment or down forever, I’m not sure. That assumes there aren’t more, undiscovered ones out there, of course, but somehow, I doubt it.

Establishing our own server

I’ve known I wanted a CCSO server for Somnolescent since 2020 or so. It’s partially a thrilling hipster novelty, partially exciting bringing such old software back to life, and partially because I had a really good idea for how to use one: listing not just the folks involved in the group, but also their characters. I thought it’d be a lot of fun to have what is effectively a master database of the who’s-who of Somnolescent, not just for folks outside of the group, but for us as well (it’s hard remembering who everyone in Wisp is sometimes…).

The only problem: CCSO is a right pain in the ass to get going these days.

Gopher servers are a dime a dozen. You’ve got Bucktooth, Gophernicus, Flask-Gopher, Pituophis of course, all very modern server packages that can be installed anywhere you got xinetd or a Python interpreter. Gopher clients are pretty easy to come by as well: Gophie is a nice standalone client, Lagrange is a Gemini client with Gopher support that I find pretty excellent, there’s Gopher clients for iOS and Android, and even legacy browsers like Netscape and RetroZilla make for good Gopher clients.

While there are third-party CCSO clients, many of them were written for Unix–not Linux, full fat, old-school Unix distributions–or DOS or, worse yet, Windows 3.1. This means they’re virtually impossible to get running today unless you have access to a 32-bit copy of Windows or experience with winevdm. Not exactly the plug-and-play solution that makes people want to give your obscure database software a try. I know of one other third-party CCSO server, according to Wikipedia–it’s called phd, it was written in Perl, and uses some form of relational database. That’s all I know. There was no download link. I don’t know anything about running Perl programs.

That leaves the original qi and ph programs. One other useful contribution to CCSOspace from Michael is his packaging both into a Docker container that you can just install and run. I tried this back in 2020 on the Raspberry Pi that I use for Somnol’s Gopher and Internet radio and could not make it work. Worse yet, he couldn’t replicate my errors. (Knowing what I know now about Docker, I doubt it would’ve run very well anyway.) I thanked him for his time and teetered off, defeated.

That’s when dcb took matters into his own hands. He wrote out a little basic CCSO script in Python using JSON for the database back in 2019 and finished it up for me a year later. Now–and I’ve told the lad how appreciative I am of that original script, because I would’ve never been able to get started with it on my own–I didn’t like it at the time. It didn’t really work like a CCSO server should. For one thing, the original version of the script let you return the entire database with any malformed command, and it was pretty user-unfriendly with basically no error messages to speak of.

I was too much of a gigantic baby to realize that I could just woodshed the script myself, but as time passed, as we all grew up a bit, and as I took on this new iteration of the Somnolescent Gopher server, I decided to flex my atrophied coding muscles and fix it up myself. I christened that little script SomnolCCSO, we set up a repo for it, and I got to work. (Skipped from this retelling is all the time spent writing entries for characters to add to the database–thank you very kindly to Savannah for doing all her lads for me and giving me a bunch of data to test with very early on!)

Researching the extinct in the wild

Of course, with no real extant CCSO servers to test on and model my own after, it became a matter of trying the script with various clients, seeing what they didn’t like about SomnolCCSO’s output, and fixing it. The eMachines Box has been invaluable for this; because it runs XP, I can easily install any DOS or 16-bit Windows CCSO client and try it out. In the end, I used Netscape 4.8, NCSA Mosaic 3.0, Lynx, and a freeware C++ Windows 3.1 app called phwin, which was nice to have in the mix since it lets you send any command to the server, not just queries. (No, I didn’t test it with ph during development because I wasn’t keen on trying to get it running again–hold that thought, there is a happy ending to that story.)

I had the script uploaded and running on the Pi, and I’d edit that remotely through Filezilla and Notepad++ on my main machine, restarting it from an SSH session every time I wanted to try out a fix. I spent the next week fixating, poring through Python documentation and best practices, fixing bugs, adding features, rewriting the search logic to be able to return different errors for malformed queries, no matches found, or no desired fields found in the entries requested, reading the status and siteinfo data from files instead of having them hardcoded into the script, logging to both a file and stdout, and reformatting and commenting what dcb had already written. It was an intense process.

Looking at raw CCSO data in telnet
Telnet can be used to look at raw CCSO data and send commands in a pinch. Lots of debugging done in telnet.
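If you’d rather not squint at telnet, the same kind of poking can be done from a few lines of Python. This is only a rough sketch and not part of SomnolCCSO: the function names are made up, and it assumes the server answers with lines shaped like code:index:field:text, treats a negative code as “more lines coming”, and closes the connection after a quit command.

```python
import socket


def parse_line(line):
    """Split one raw CCSO response line into its parts."""
    raw = line.rstrip("\r\n")
    parts = raw.split(":", 3)
    code = int(parts[0])
    parsed = {"code": abs(code), "more": code < 0,
              "index": None, "field": None, "text": ""}
    if len(parts) == 4 and parts[1].strip().isdigit():
        # Entry line: result code, entry index, field name, field value.
        parsed["index"] = int(parts[1])
        parsed["field"] = parts[2].strip()
        parsed["text"] = parts[3].strip()
    else:
        # Status line such as "200:Ok." with no entry index.
        parsed["text"] = raw.partition(":")[2].strip()
    return parsed


def send_command(host, port, command, timeout=10):
    """Send one command plus quit, then return every parsed response line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((command + "\r\nquit\r\n").encode("ascii"))
        with sock.makefile("r", encoding="latin-1") as reader:
            return [parse_line(line) for line in reader if line.strip()]
```

Something like send_command("gopher.somnolescent.net", 105, 'query universe="pennyverse"') would then hand back one dictionary per response line, which is a lot easier to diff than a telnet scrollback.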

Sometimes, the bugs drove me nuts, as happens when you write programs. On the topic of logging, I wanted to be able to check errors while the script was running headless through my rc.local (as opposed to in the SSH session live on my PC), and after hours of trying to format tracebacks and write them to a text file manually with no success, I learned of Python’s logging module, which lets you dump exceptions to a file through one single function. Turns out, if it already exists, you should probably not rewrite it.
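Here’s roughly what that looks like, as a minimal sketch rather than the actual SomnolCCSO code. The logger name, the helper names, and the log format are all made up for illustration; the only real parts are Python’s logging module and its exception() method, which writes the full traceback of the exception currently being handled.

```python
import logging
import sys


def configure_logging(path):
    """Send log records to both a file and stdout."""
    logger = logging.getLogger("ccso_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (logging.FileHandler(path), logging.StreamHandler(sys.stdout)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


def handle_safely(logger, func, *args):
    """Run func and log any exception, traceback included, instead of crashing."""
    try:
        return func(*args)
    except Exception:
        # logger.exception() must be called from inside an except block;
        # it logs at ERROR level and appends the traceback for us.
        logger.exception("Handler failed for arguments %r", args)
        return None
```

Wrapping each client request in something like handle_safely means a bad query leaves a traceback in the log file while the headless server keeps running.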

One of the most frustrating bugs came right at the very end of development. I’d sent some of the less technically-inclined Somnolians a copy of Netscape 4 so they could finally see everything I’d written for their characters, and Caby noticed that some entries were actually being fused together! Normally, Netscape formats the first field in the entry (oftentimes the name field, as is the case for our server) as a heading and the rest of the entry as a preformatted text block–and instead, the name field was getting mistaken for more of the previous entry, but only sometimes.

CCSO entries getting fused together
Note that Casper’s entry is getting glued onto the end of Arthur’s entry.

I pored over the raw response data in telnet on the eMachines Box, trying my best to see a single character out of place, but no such luck. I had no idea why entries were getting fused together, and stranger still, phwin was occasionally just outright losing parts of entries! For a single field, part of the data would just get truncated, and again, I didn’t know why. I figured these might be due to field length, perhaps old clients with hardcoded field lengths getting confused–but no, that wound up not being the case. I spotted this in the RFC, emphasis mine:

If a particular command can apply to more than one entry, then the multilined response must be so organized that all information pertaining to each entry is returned on consecutive lines, and that each of those lines must have one and the same entry index directly following the resultcode. The first entry index should be 1 and incremented each time a new entry is referred to.

RFC 2378 – The CCSO Nameserver (ph) Architecture

SomnolCCSO was returning the number of the entry in the database as that entry’s index (seen in the telnet screenshot as the second number of each line), when really, it should just count up by one. In other words, if you searched for all raccoon characters and it returned Berry, Savannah’s raccoon sona, and Colton from Pennyverse, Berry might be the 37th entry in the database and Colton might be the 47th, and those numbers would be used for the entry index–when really, Berry’s index in the results should be 1 and Colton’s should be 2.

This was both what was confusing Netscape and Mosaic about the starts of certain entries and causing some data to get cut off in phwin; making the indices increment by 1 every time solved both issues.
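In code, the fix boils down to numbering the matches as they’re written out instead of reusing their position in the database. The sketch below isn’t lifted from SomnolCCSO; the function name, the field padding, and the closing status line are assumptions for illustration, and it leaves out any other status lines a real server might send.

```python
def format_matches(matches, wanted_fields=None):
    """Turn matching entries into CCSO response lines."""
    lines = []
    # enumerate(..., start=1) gives each result its own index: 1, 2, 3...
    # regardless of where the entry sits in the database.
    for index, entry in enumerate(matches, start=1):
        for field, value in entry.items():
            if wanted_fields is None or field in wanted_fields:
                lines.append(f"-200:{index}:{field:>12}: {value}")
    lines.append("200:Ok.")
    return lines


# Berry and Colton could be entries 37 and 47 in the database,
# but in the response they are simply results 1 and 2.
results = format_matches([
    {"name": "Berry", "species": "raccoon"},
    {"name": "Colton", "species": "raccoon"},
])
```

Every line belonging to Berry carries index 1 and every line belonging to Colton carries index 2, which is what clients seem to rely on to tell where one entry stops and the next starts.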

Read your RFCs carefully, kids.

How to actually use SomnolCCSO

With the lack of software support, accessing SomnolCCSO is a little difficult. The most plug-and-play way I can think of for the less technically-inclined on modern machines would be to grab Netscape 4.8. Here’s my mirror of it, extracted out of its 16-bit installer. It’ll run fine on Windows 10 and 11 out of the box, but there will be some benign errors about not being able to update the registry. You can run the executable as admin and in compatibility mode for Windows 98/ME to silence those errors.

You can access the server at gopher://gopher.somnolescent.net:105/2/. For most clients, you’ll get a text box you can do searches in. Searches are in the form of [field]="[thing to search for]". Fields are points of data, like species, creator, universe, or affiliation with a larger group. (Most clients can only send query commands, so they don’t require you to specify that it’s a query. If you were using ph proper, you’d have to do query [field]="[thing to search for]".) If you do universe="pennyverse", you’d get this:

A CCSO search for Pennyverse characters

You can also append return [list of fields] to the end of your search, which means you’ll only see certain fields. (Internally, if you don’t specify this, we assume you just want to see all fields.) If you’re only interested in seeing species, summaries, and creators, you could do universe="pennyverse" return species summary creator and get this instead:

A CCSO search using the return parameter
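To make the query syntax a bit more concrete, here’s one way a server could pull a query line apart. It’s a sketch under my own assumptions, not SomnolCCSO’s actual parser: it leans on Python’s shlex module to deal with the quoted values, accepts an optional leading query keyword, and treats everything after return as the list of fields to show.

```python
import shlex


def parse_query(text):
    """Split a query line into search criteria and the fields to return."""
    try:
        tokens = shlex.split(text)  # 'a="b c"' becomes the single token 'a=b c'
    except ValueError as exc:       # e.g. a quote that never closes
        raise ValueError(f"malformed query: {exc}") from exc
    if tokens and tokens[0].lower() == "query":
        tokens = tokens[1:]
    wanted = None                   # None means "show every field"
    if "return" in tokens:
        split_at = tokens.index("return")
        tokens, wanted = tokens[:split_at], tokens[split_at + 1:] or None
    criteria = {}
    for token in tokens:
        field, separator, value = token.partition("=")
        if not (field and separator and value):
            raise ValueError(f"malformed query term: {token!r}")
        criteria[field.lower()] = value
    if not criteria:
        raise ValueError("malformed query: nothing to search for")
    return criteria, wanted
```

With that, universe="pennyverse" return species summary creator comes out as one criterion (universe is pennyverse) plus three fields to show, and anything that doesn’t fit the pattern raises an error the server can turn into a proper “malformed query” response.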

This is the part where CCSO gets powerful. You can actually search as many fields as you want at once, and SomnolCCSO will only return entries that match them all. So! If you wanted to see only raccoon characters that have appeared in Pinede, you could type universe="pinede" species="raccoon" type="character" and only get fantasy raccoons that aren’t sonas. You can pair this with the return parameter if you wanna get really specific:

A very fancy CCSO search
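The “match them all” behavior is simple to express. This is a sketch with placeholder entries, not real database contents, and it assumes matching is an exact, case-insensitive comparison on each field, which may not be precisely how SomnolCCSO compares values.

```python
def find_matches(entries, criteria):
    """Return only the entries that satisfy every field=value pair."""
    return [
        entry for entry in entries
        # all() is what makes this an AND search: one mismatch rules the entry out.
        if all(str(entry.get(field, "")).lower() == value.lower()
               for field, value in criteria.items())
    ]


# Placeholder entries, made up purely for illustration.
entries = [
    {"name": "Example A", "universe": "Pinede", "species": "raccoon", "type": "character"},
    {"name": "Example B", "universe": "Pinede", "species": "fox", "type": "character"},
    {"name": "Example C", "universe": "Pennyverse", "species": "raccoon", "type": "character"},
    {"name": "Example D", "universe": "Pinede", "species": "raccoon", "type": "sona"},
]
found = find_matches(entries, {"universe": "pinede", "species": "raccoon", "type": "character"})
```

Each extra field in the search can only shrink the result list, so stacking three of them narrows four entries down to the one that fits all three.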

I dream of a CCSO renaissance

I am unbelievably proud of SomnolCCSO. I just think it’s neat! This is the first modern CCSO server ever written, at least to my knowledge, and it works great with absolutely every client I’ve tried it with. And yes! I did manage to get that Docker with qi and ph working finally, this time on WSL instead of on the Pi. It runs slow as balls because Docker is so memory-hungry, but yes, you can indeed send queries to our server from the original ph client, and another thanks to Michael for submitting the patches needed to make it all work. The saga is finally closed.

Using SomnolCCSO on the official ph client

SomnolCCSO isn’t a full implementation of the CCSO standard. There’s no authentication, no wildcard searches, and no live database editing. You have to edit the JSON file to add entries to it and then send a reload command to the server to get it to read the database anew. This is a much better way of working it, though, I think. For one thing, it’s very unlikely that new CCSO servers will need updates from many different people, and no authentication = no expectation of privacy = no security holes to find. SomnolCCSO keeps winning. (That said, here’s the repo link again–if you find any glitches, I’d be happy to accept pull requests.)
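The edit-then-reload workflow is easy to picture in code. The class below is a hypothetical sketch, not the real implementation: the class name is made up, and it assumes the JSON file holds a plain list of entry objects, which may not match SomnolCCSO’s actual file layout.

```python
import json


class JsonDatabase:
    """Hold the entries in memory and re-read them from disk on request."""

    def __init__(self, path):
        self.path = path
        self.entries = []
        self.reload()

    def reload(self):
        """Re-read the JSON file; keep the old entries if the file is broken."""
        with open(self.path, encoding="utf-8") as handle:
            fresh = json.load(handle)  # raises ValueError on invalid JSON
        if not isinstance(fresh, list):
            raise ValueError("expected a JSON list of entries")
        # Only swap once the whole file has parsed cleanly.
        self.entries = fresh
        return len(self.entries)
```

Parsing the whole file before replacing the in-memory list means a typo in a hand-edited entry makes the reload fail loudly while the server keeps answering from the last good copy.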

I do hope that having this new server out there will spark some interest in CCSO again from the retro tech crowd, because I do think there’s a lot of life left in the protocol, at least as much as there is in Gopher. For one thing, there’s plenty of organizations that operate over Gopher, tildes and the SDF being great examples. They’d absolutely benefit from having a CCSO lookup for users, especially if the database had fields for things like interests or what revival social media/chat applications users had. You want to find someone who uses Escargot to chat with? Do a lookup for all escargot fields and find some new friends.

CCSO is also not necessarily tied to searching for people! It’s a database, and can hold basically anything you’d want in a database. Imagine using CCSO to look up information on items in a game, information on games, movies, media, songs, song lyrics, quick facts on guns or sandwiches–you get the idea. And again, with the Gopher integration, CCSO fits tidily into any menu you write, so the other half of Gopher’s promise, a structured and orderly way to retrieve information, is plenty fulfilled.

The biggest hurdle for CCSO continues to be software support. As said, there’s no particularly good way to access CCSO at the moment. In my brief chat with Michael, he expressed interest in adding CCSO support to his Gopher-HTTP proxy, and dcb’s said he wants to add it to Gopherlens as well when it finds a new home (perhaps I should buy him a VPS already). That’d be ideal, because that would mean CCSO lookups through Web browsers, making it accessible to literally everybody. Beyond that, I’d love to see RetroZilla get CCSO support, I’d love to see some more standalone Gopher and Gemini clients get CCSO support, and hey, some more server packages for different languages and use cases wouldn’t be bad either.

CCSO on the eMachines Box (and Snowman)

It’s really up to the retro tech community to revive CCSO. Somnolescent did our part, and maybe someone out there will find setting up their own server with our script to be a fun little weekend project. I’d love to see it. It’s a novelty, an extremely obscure, nerdy novelty–but that’s not a bad thing whatsoever.

About mariteaux

Somnolescent's webmaster with way too much to write about and a stack of CDs he'll never finish.

