
How to index Project Gutenberg files

Project Gutenberg offers over 40,000 free ebooks: choose among free EPUB books and free Kindle books, and download them or read them online. Recently, they added support for Dropbox, so you can download ebooks directly to your Dropbox account. It creates a folder ‘Apps/gutenberg’ and stores all ebooks in that folder.

After a while, this Dropbox folder will hold a long list of files, all with names like pg1234.epub and pg5678-images.epub. The names have some meaning, but which file contains which title? Of course, you can rename each ebook after downloading, but this is extra work, and I want a smart solution.


So I wrote a script that builds an index of all the ebooks, or at least of those in EPUB format. The result is a single index.html file, which opens in any browser. It lists each file name together with the creator, title and language. Depending on how you view the index.html file, the links may be clickable. In any case, you can quickly see which file is which book.
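
To give an idea of what such a script involves, here is a minimal sketch in Python. It is not the script described in this post, just one possible approach. It assumes the usual EPUB layout: a ZIP archive whose META-INF/container.xml points to an OPF package file that holds Dublin Core elements for creator, title and language. The function names read_metadata and build_index are made up for this example.

```python
import html
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespaces used by the EPUB container file and by Dublin Core metadata.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC = "{http://purl.org/dc/elements/1.1/}"


def read_metadata(epub_path):
    """Return (creator, title, language) from an EPUB's OPF package file."""
    with zipfile.ZipFile(epub_path) as epub:
        # container.xml tells us where the OPF package file lives in the archive.
        container = ET.fromstring(epub.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf = ET.fromstring(epub.read(rootfile.get("full-path")))

    def first(tag):
        # Take the first matching Dublin Core element, or "" if it is missing.
        element = opf.find(f".//{DC}{tag}")
        return (element.text or "").strip() if element is not None else ""

    return first("creator"), first("title"), first("language")


def build_index(folder):
    """Write index.html into `folder`, with one table row per EPUB file."""
    folder = Path(folder)
    rows = []
    for path in sorted(folder.glob("*.epub")):
        creator, title, language = read_metadata(path)
        name = html.escape(path.name)
        rows.append(
            f'<tr><td><a href="{name}">{name}</a></td>'
            f"<td>{html.escape(creator)}</td><td>{html.escape(title)}</td>"
            f"<td>{html.escape(language)}</td></tr>"
        )
    page = (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>Ebook index</title></head><body>\n<table>\n"
        "<tr><th>File</th><th>Creator</th><th>Title</th><th>Language</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>\n</body></html>\n"
    )
    (folder / "index.html").write_text(page, encoding="utf-8")
```

Calling build_index with the path of the local ‘Apps/gutenberg’ folder (wherever Dropbox syncs it on your machine) would then write index.html next to the ebooks. The sketch only reads the first creator of each book and does no error handling for damaged files.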


The script works on any set of EPUB files, but files from other sources usually have more meaningful names, so there is less need for a script like this.
