Matthew Thomas’ post about the ideal weblogging system last week (justifiably) generated a fair amount of commentary. One of the points that seems to have generated the most dissent is the idea that the “ideal system” has to be database driven. As you might expect from a Blosxom proponent like myself, I don’t agree. He argues in a followup post today that this is a matter of data preservation:
Sure, it makes the system more brittle (and more difficult to install) in the short term, but it also makes it much less brittle in the long term. In the long term, you must be able to generate the presentation format dynamically.
But, as Sven-S. Porst writes:
I still don’t see the difference in functionality of abusing files as the database really. If in doubt, it will always be easier to recover your data from a bunch of text files than from a database.
About a month ago, I fell victim to a nasty bit of disk corruption where a number of files were being physically allocated to the same disk blocks. A number of binary files were completely trashed. It took me a while to figure out this was happening. The symptoms were that recently installed binaries would crash instantaneously on launch, or segfault, etc. The clue that finally tipped me off was when I opened a few text files and found their content intermixed, which is the sort of thing that becomes quickly obvious when comparing text files. I was up and running completely within a few hours of diagnosing the problem because I had no database to restore, no indexes to rebuild, and I could quickly tell, with an unaided eye, whether my weblog data was intact or not.
I can now back up my entire weblog to a CD-R in about 90 seconds, and, assuming I pick my media stock wisely, I’m pretty sure that I could read this data back 25 years from now and have something I can integrate into whatever personal publishing systems are then current, probably with a trivial bit of scripting and a search-and-replace to whack deprecated tags. The biggest problem will likely be linkrot.
Matthew asks: “if you’re going to make the files static, what format are you going to use?” He then mentions various flavors of HTML, XHTML, SGML, and even Troff as formats of the past and perhaps future. The beauty of text is that these formats, though semantically different, are all, at their base level of existence, the same darned thing: text files with tags. A bold new world of formats is never more than grep and a pipe away.
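To make the "trivial bit of scripting" concrete, here is a minimal sketch in Python of the kind of search-and-replace I have in mind. The tag mapping and the function name modernize are my own illustrative assumptions, not part of Blosxom or any other tool, and a regular expression like this only copes with simple, well-behaved markup.

```python
import re

# Hypothetical mapping, for illustration only: a deprecated tag name maps to
# its replacement, or to None when the tag should be dropped and only its
# contents kept.
REPLACEMENTS = {"b": "strong", "i": "em", "font": None, "center": None}

# Matches an opening or closing tag: optional slash, tag name, then anything
# up to the closing angle bracket (attributes, if any).
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")

def modernize(text):
    """Rename or strip deprecated tags, leaving all other text untouched."""
    def swap(match):
        slash, name, attrs = match.groups()
        key = name.lower()
        if key not in REPLACEMENTS:
            return match.group(0)          # not a tag we care about
        new_name = REPLACEMENTS[key]
        if new_name is None:
            return ""                      # drop the tag, keep its contents
        return f"<{slash}{new_name}{attrs}>"
    return TAG.sub(swap, text)

if __name__ == "__main__":
    sample = '<p>A <b>bold</b> <FONT size="2">old</FONT> idea</p>'
    print(modernize(sample))
```

Pointing something like this at every entry in a directory of text files is a short loop, and the result can be checked by eye, which is rather the point.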
DBMSs are transient, tagged text is forever.
:: Dave Walker 17:35 (EST/EDT) [+] ::
:: [/opinion/technology]
:: tags: technology
Here are the latest updates to the static playlist. Next live show TBA.
:: Dave Walker 12:57 (EST/EDT) [+] ::
:: [/station/playlists]
:: tags: playlists
If you stew apples like cranberries, they taste more like prunes than rhubarb does. -- Groucho Marx