GutenMark Home Page
Attractively formatting Project Gutenberg texts

Contents

Introduction
The Problem
What's the Solution?
How does GutenMark fit in?
How do I get GutenMark, and what does it Cost?

Introduction

Project Gutenberg -- or PG for short -- is a marvelous project for freely providing online books.  Thousands of such "etexts" have been made available.  Many are familiar classics, and many others are completely unfamiliar books you're unlikely to find anywhere else.  I've provided a dozen or so of the etexts myself, so I'm on safe ground asserting that I am quite fond of PG.

GutenMark is a program I'm making freely available to the community, in the hopes of making it more enjoyable to read PG etexts.


The Problem

A problem with PG etexts (dare I say it!) is that they are not very pretty in comparison to typical hand-held books.  PG etexts have traditionally been provided in a format that preserves most of the content of the books (i.e., what was in the author's mind) but discards the attractive formatting (provided by the publisher).  There have been sound reasons for doing so, and I can't quibble with them.

Alas! it turns out, for many of us, that we really do prefer the more attractive printed version over the plain-Jane PG version.  We'd rather buy the book.  I fear that this effect has limited PG readership somewhat.

The situation has improved somewhat in recent years, in several ways.  Special software for reading the online books can make the books appear more attractive on the computer screen.  There is a project of the HTML Writers Guild to provide HTML versions of PG etexts.  Even PG itself is now willing to accept formatted versions of etexts, as long as the plain-Jane version exists also.  I applaud all of these efforts, and I hope it does not denigrate them to add my own efforts to theirs.


What's the Solution?

I think that printing on demand would go a long way towards making PG better for the readers.  It would be great to be able to take any random PG etext and automatically format it so that it is as attractive as a printed book, for either online reading or for printing.


How does GutenMark fit in?

GutenMark is a free command-line utility for Win32 or Linux (or BSD or UNIX or Mac OS X ...).  It accepts a Project Gutenberg etext, applies what I hope are intelligent heuristics to it, and produces attractively formatted HTML.  My definition of "attractively formatted" is that it should look like a book when you print it.
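
To give a flavor of what "heuristics" can mean for a plain-text etext, here is a minimal Python sketch.  It is not GutenMark's code or its actual algorithm; the function name etext_to_html and the four rules it applies (blank lines separate paragraphs, hard-wrapped lines are rejoined, _underscores_ mark italics, short all-caps lines are headings) are assumptions chosen purely for illustration.

```python
import html
import re


def etext_to_html(text: str) -> str:
    """Convert a plain-text etext into simple HTML (illustrative only)."""
    out = []
    # Assumed rule 1: one or more blank lines separate paragraphs.
    for block in re.split(r"\n\s*\n", text.strip()):
        # Assumed rule 2: rejoin hard-wrapped lines into one flowing paragraph.
        raw = " ".join(line.strip() for line in block.splitlines())
        if not raw:
            continue
        # Assumed rule 3: a short line in all capitals is probably a heading.
        is_heading = len(raw) < 60 and raw.isupper()
        # Escape &, < and > so the text is safe to embed in HTML.
        body = html.escape(raw, quote=False)
        # Assumed rule 4: _underscored_ phrases were italic in the printed book.
        body = re.sub(r"_([^_]+)_", r"<i>\1</i>", body)
        tag = "h2" if is_heading else "p"
        out.append(f"<{tag}>{body}</{tag}>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "CHAPTER I\n\nIt was a _very_ dark\nand stormy night.\n\nTom & Jerry."
    print(etext_to_html(sample))
```

Even a toy like this shows why the results depend on the particular etext: each rule is a guess about what the person who typed in the text intended, and any guess can be wrong for a given book.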

How well does GutenMark succeed?  It depends on the particular etext; in my view, it works pretty well.  To give you some idea, here is a sample:

When looking at the HTML or PDF file, be sure to skip past the PG header at the top of the text, since this is just taken as-is, without reformatting.  You might need to download Adobe's free Acrobat Reader program to view or print the PDF file.  I created the PDF file just for fun, using a freely available HTML-to-PDF conversion utility, so that you could see what a book-like printout might look like.


How do I get GutenMark, and what does it Cost?

GutenMark is free.  To download it, or to get more detailed information about it, click here.



Last updated 11/02/01 by RSB.  Contact me.