Groff – a #retrotech project

I’ve been mucking about with groff. With what, you ask? Groff! Otherwise known as GNU Troff, and if that doesn’t clue you in (it probably doesn’t), then let me explain.

Troff is an oooooold Unix tool for typesetting. It was born way, way back in the early 1970s as part of the original Unix documentation toolkit. In the old days, this was the tool you used to print formatted docs. The first version was written to drive a phototypesetter (a GSI C/A/T, in case you were wondering)—this is back in the days when phototypesetting was nouveau. Later in the 1970s, it was modified to be usable with any typesetter. Around that time, our friends at Coach House Press (now you begin to understand where I’m coming from) got hold of it and developed a version of it that was up to the standards of professional printing/publishing: better hyphenation & justification, kerning, and so on. The result was SQTroff, where SQ stood for SoftQuad, the Coach House’s software startup.

Coach House did their typesetting via Troff all through the 1980s. By then Troff could produce PostScript output, so it was used to drive laser printers, imagesetters, and other modern output devices. Here’s an interesting note: although Coach House eventually went to Mac-based DTP, the Troff way of working is still in use at publishers like Wilfrid Laurier U Press and even the Porcupine’s Quill. If you pay attention to Canadian literary presses, you probably already know that Porcupine’s Quill is very proud of their fine printing on ‘heritage’ equipment; you may not have known that Unix is part of that.

So this brings us to Groff. When Unix largely went open source in the late 80s and early 90s (via GNU and Linux), the vast majority of the legacy tool suite was rewritten under a free license. Hence Troff became Groff. And Groff developer James Clark (a very big name in the XML world) was well enough connected to SoftQuad that the free-licensed tool included a good portion of the typographic improvements made there (although not some of the software-maintenance facilities in SQTroff).

The most important thing about Groff is that it is still around, and still maintained. Because it’s part of the free software toolkit which underlies almost everything these days, Groff is a ubiquitous (though mostly ignored) component in modern operating systems. If you run Mac OS X, you already have it on your hard drive! If you run Linux… natch. Windows may be another story, but it’s still easily installed. We may take for granted that nobody has thought about doing typesetting and page layout this way since the days of PageMaker and QuarkXPress, but that doesn’t mean Groff is obsolete. In fact, it has been actively maintained and used ever since.

Got a Mac? Here’s a thing to try out:

Write a text file. Give it some nice long paragraphs, and separate the paragraphs with blank lines (this will serve as our cheapest-possible typesetting code :-). Save it as example.txt.

Open up the Terminal… navigate to the folder where you saved example.txt. Type this in the terminal:

groff -mom example.txt > example.ps; open example.ps

You should see Preview open up with a typeset version of your text; note the H&J, and if you typed “fi” anywhere, note the ligature. This is a demo of Groff’s bare, default behaviour, with no formatting instructions. Beyond this, Groff is almost infinitely malleable and directable. Have a look at the docs for the “mom” macro set at http://www.schaffter.ca/mom/momdoc/toc.html

So what’s the point? At the base level, high-quality typesetting exists—for free—outside the normal Adobe-controlled world. Yes, it’s a batch-processing model, uninformed by the WYSIWYG world we’ve been living in for a generation. But…

But, I ask you to consider this: when we write and design for the Web, when we separate content from formatting by writing clean markup and designing elegant CSS stylesheets, we are doing something similar: we’re preparing the templating and general instructions for how things will look, and then applying that to well prepared (marked-up) content.

Groff is no different. And so I suggest that an opportunity exists to leverage this venerable technology in Web-first workflows, to produce beautifully formatted print out of marked-up content online. We did this already by piping HTML content into InDesign with ickmull. Hugh McGuire’s Pressbooks project incorporates that, plus a second TeX-based templated output option as well. We could go further…

To the best of my knowledge, nobody has written a general-purpose HTML-to-Groff transformation. Such a thing would combine well-formed HTML (such as you might have already in WordPress or Drupal) with a set of templates and stylesheets, and produce perfect PostScript/PDF output. What’s more, it would do it using free software. And it would be simple: Groff is a tiny little piece of software compared to anything remotely comparable, output-wise.

Nobody has written that transformation piece. But I am going to.
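To make the idea a bit more concrete, here is a minimal sketch in Python of what the core of such a transformation might look like. It is an illustration only, not the real thing: the names MomWriter, protect, and html_to_mom are made up for this example, it only knows about p, h1, em/i, and strong/b, it does not handle nested font changes, and it assumes a recent version of the mom macros (where headings are written as .HEADING 1 "…").

```python
from html.parser import HTMLParser


def protect(line: str) -> str:
    """Stop groff from reading a text line as a request (leading . or ')."""
    return r"\&" + line if line.startswith((".", "'")) else line


class MomWriter(HTMLParser):
    """Collects groff/mom source lines for a tiny subset of HTML (hypothetical)."""

    # Inline tags mapped to groff font escapes: \fI italic, \fB bold.
    INLINE = {"em": r"\fI", "i": r"\fI", "strong": r"\fB", "b": r"\fB"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        # Assumed minimal mom preamble: typeset output, then start the document.
        self.lines = [".PRINTSTYLE TYPESET", ".START"]
        self.block = None  # block-level tag currently open: "p" or "h1"
        self.buf = []      # text fragments collected for that block

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "h1"):
            self.block, self.buf = tag, []
        elif tag in self.INLINE and self.block:
            self.buf.append(self.INLINE[tag])  # switch font

    def handle_endtag(self, tag):
        if tag in self.INLINE and self.block:
            self.buf.append(r"\fP")  # back to the previous font
        elif tag == self.block:
            text = " ".join("".join(self.buf).split())  # collapse whitespace
            if tag == "h1":
                # \(dq is groff's double-quote glyph; a raw " would end the argument.
                self.lines.append('.HEADING 1 "%s"' % text.replace('"', r"\(dq"))
            else:
                self.lines += [".PP", protect(text)]
            self.block = None

    def handle_data(self, data):
        if self.block:
            self.buf.append(data.replace("\\", r"\e"))  # literal backslash


def html_to_mom(html: str) -> str:
    """Return groff/mom source for the supported subset of HTML."""
    writer = MomWriter()
    writer.feed(html)
    writer.close()
    return "\n".join(writer.lines) + "\n"


if __name__ == "__main__":
    print(html_to_mom("<h1>Groff</h1><p>It is <em>still</em> maintained.</p>"), end="")
```

If you save the output as, say, example.mom, something like groff -mom -Tps example.mom > example.ps should typeset it. A real version would need templates, stylesheets, and a great deal more of HTML than this.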


#retrotech and Early Digital Innovation at Coach House

In early June this year I went to Ottawa and Toronto as a major research piece for my “early tech history at Coach House Press” project. SSHRC came up with a bit of money to dredge through the archives of the Canada Council for the Arts. The Coach House Press has been at the bleeding edge of nearly every significant advance in digital publishing technology: computer-driven phototypesetting, UNIX, SGML, and relational DBMS in the 1970s and 1980s, and then on to the web with their frontlist in the late 1990s (yes, well over a decade ago; how’s your frontlist?).

I wrote a longish report on the trip over at http://tkbr.ccsp.sfu.ca/tkbr/retrotech-trip-2011 and you can watch our ever-expanding collection of notes and queries in the project wiki: http://thinkubator.ccsp.sfu.ca/wikis/chb


Speaking at Editors Association of Canada, May 29

I’ll be giving a talk at the EAC 2011 conference in Vancouver this weekend, titled “Re-imagining Publishing as if the Web Mattered.” Here are notes/slides toward that.

The World Wide Web has established itself as the dominant publishing platform of our time, and of the future. So why do so many book and magazine publishers seem to pretend that it doesn’t exist? This session explores this question and the apparent crisis of imagination lurking behind it.


PubWest 2010

I’ll be at the Publishers Association of the West conference in Santa Fe NM, later this week—and I’m stoked about that. November in Vancouver is dark and wet and gloomy, and New Mexico sounds like just the right antidote to that!

The conference theme this year is “Unleashing Your Publishing Potential” and I’ll be talking on the Thursday (and again on the closing panel Saturday) about unleashing the publishing potential of the Web. Which is a silly thing to say, really, because the publishing potential of the Web has been unleashed already, to staggering effect. Google indexes something like fifty trillion web pages, and they don’t claim that’s anywhere near the whole thing… So really, what I’ll be talking about at PubWest is “recognizing” the publishing potential of the Web, especially for trade book publishers.

A few recent items feed into this. First, of course, is our work here at SFU, developing tools to help integrate web and print workflows, both editorial and production-oriented. This approach is reflected on a larger scale in a movement that Hugh McGuire at BookOven is spearheading, developing a set of Web-based tools to support book publishing. And it’s an idea that quite recently found voice at the Internet Archive’s Books in Browsers conference in October this year.

My talk on Thursday is part of the pre-conference tutorial “intensives.” I’ll be sharing the day with Anne-Marie Concepcion of Seneca Design and InDesign Secrets fame. Between us, we’ll cover going from print to e-books and from e-books to print (though I’m obviously gunning for the latter option).

If you’re following from home, the hashtag for PubWest is (unsurprisingly) #pubwest — and you can check out my PubWest slides (still in revision, even after I deliver them, probably).


Traversing the Book of MPub: A New Article

Kathleen Fraser (MPub 09) and I have written a new article for the Journal of Electronic Publishing, forthcoming late this year. It’s called “Traversing the Book of MPub: An Agile, Web-first Publishing Model.” It describes the process that led to The Book of MPub, our web-first, multi-platform publishing project from last spring, and elaborates on a more conceptual level the rationale(s) for doing things this way.

Here’s the abstract:

In the 21st Century, content normally lives on the web. But what would a web-based book publishing environment look like? In spring 2010, graduate students at Simon Fraser University created The Book of MPub, an end-to-end, web-first book publishing project. The re-visioning of the book as a web-born entity presents enormous opportunities for publishers to push the operational, expressive, and social horizons of their businesses. We have identified four key concepts which shape a modern book publishing approach: the concept of an agile publishing methodology; the centrality of online content management systems; leveraging the web’s HTML markup as a way of achieving an XML-based workflow; and the radical reconfiguration of promotion and marketing.


Migrating to WordPress

I’ve begun the process of moving our ongoing research material from its old home at http://thinkubator.ccsp.sfu.ca/wikis/ to the new WordPress world. This process is going to take a while, basically because it’s going to happen on an as-needed basis, as new stuff comes onstream. The first pieces to come across have to do with my chapter in Coombe & Wershler-Henry’s forthcoming book on Fair Dealing. But lots more will migrate here in the weeks and months to come.
