GutenMark Home Page
Attractively formatting Project Gutenberg texts
Contents
Introduction
The Problem
What's the Solution?
How does GutenMark fit in?
How do I get GutenMark, and what does it Cost?
Introduction
Project Gutenberg (or PG for short) is a marvelous project for freely providing online books.
Thousands of such "etexts" have been made available. Many are familiar
classics, and many others are completely unfamiliar books you're unlikely
to find anywhere else. I've provided a dozen or so of the etexts
myself, so I'm on safe ground asserting that I am quite fond of PG.
GutenMark is a program I'm making freely available to the community,
in the hopes of making it more enjoyable to read PG etexts.
The Problem
A problem with PG etexts (dare I say it!) is that they are not very pretty,
in comparison to typical hand-held books. PG etexts have traditionally
been provided in a format that preserves most of the content of the books
(i.e., what was in the author's mind) but discards the attractive
formatting (provided by the publisher). There have been sound reasons
for doing so, and I can't quibble with them.
Alas! it turns out, for many of us, that we really do prefer the more
attractive printed version over the plain-Jane PG version. We'd rather
buy the book. I fear that this effect has limited PG readership somewhat.
The situation has improved in recent years, in several ways.
Special software for reading the online books can make the books appear
more attractive on the computer screen. There is a project of the HTML
Writers Guild to provide HTML versions of PG etexts. Even PG
itself is now willing to accept formatted versions of etexts, as long as
the plain-Jane version exists also. I applaud all of these efforts,
and I hope it does not denigrate them to add my own efforts to theirs.
What's the Solution?
I think that printing on demand would go a long way towards making
PG better for the readers. It would be great to be able to take any
random PG etext and automatically format it so that it is as attractive
as a printed book, for either online reading or for printing.
How does GutenMark fit in?
GutenMark is a free command-line utility for Win32 or Linux (or
BSD or UNIX or Mac OS X ...). It accepts a Project Gutenberg etext,
applies what I hope are intelligent heuristics to it, and produces attractively-formatted
HTML. My definition of "attractively formatted" is that it should
look like a book when you print it.
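To give a flavor of what a text-to-HTML heuristic might look like, here is a minimal Python sketch. It is not GutenMark's actual code or algorithm, only an illustration under stated assumptions: the function name etext_to_html is hypothetical, paragraphs are assumed to be separated by blank lines, and _word_ is assumed to mark italics in the plain text.

```python
import html
import re


def etext_to_html(text: str) -> str:
    """Convert a hard-wrapped plain-text etext into simple HTML paragraphs.

    Illustrative sketch only; not how GutenMark itself works.
    """
    # Assumption: blank lines (possibly containing whitespace) separate paragraphs.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Rejoin the hard-wrapped lines of one paragraph into a single line.
        joined = " ".join(line.strip() for line in block.splitlines())
        # Escape &, < and > so the text is safe to embed in HTML.
        escaped = html.escape(joined, quote=False)
        # Assumption: _word_ marks italics in the plain text.
        escaped = re.sub(r"_([^_]+)_", r"<i>\1</i>", escaped)
        paragraphs.append(f"<p>{escaped}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "It was a _dark_\nand stormy night.\n\nThe rain fell & fell."
    print(etext_to_html(sample))
```

A real tool has to cope with much more than this (headings, verse, block quotes, the PG header, and so on), which is where the "intelligent" part of the heuristics comes in.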
How well does GutenMark succeed? It depends on the particular
etext; in my view, it works pretty well. To give you some idea, here
is a sample:
In looking at the HTML or PDF file, be sure to skip past the PG header
at the top of the text, since this is just taken as-is, without reformatting.
You might need to download Adobe's free
Acrobat Reader program to view or print the PDF file. I created the
PDF file just for fun, using a freely available HTML-to-PDF conversion
utility, so that you could see what a book-like printout might look like.
How do I get GutenMark, and what does it Cost?
GutenMark is free. To download it, or to get more detailed
information about it, click here.
Last updated 11/02/01 by RSB. Contact me.