How does Project Gutenberg build their EPUBs?



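I have started exploring some of the texts from Project Gutenberg, and the first one happened to be the Douay-Rheims Bible. Looking at the "Text" section, I only see document chunks that are not divided along any semantic order; it looks as if the document was simply split at linear intervals into uniform chunks. Does anyone have insight into their document creation method? Is there a performance reason for splitting it this way? Each text file appears to be roughly 1,400 lines of HTML. I am thinking about re-editing the files so that they follow the actual order of the books of the Bible, but I wonder whether I would unknowingly break or violate something.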

editing Douay-Rheims bible from Project Gutenberg


1 answer:

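It does look like it was created with some automated tool. I went through the files in the Miscellaneous folder and found that content.opf shows a number of traces of this split operation in its comments, along with a relatively uniform chunk size.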

In addition, I emailed them about an error in the table of contents and got a reply.

David Widger via RT
7:05 AM (3 hours ago)
Hi jxramos,

I find that this was one of the early PG productions which were only in the
ASCII format and had no accompanying html file made by the producer of the text
file. The html file listed with this ebook was one autogenerated and these are
often quite unsatisfactory.

A much better PG edition is:

http://www.gutenberg.org/files/8300/8300-h/8300-h.htm

The html file was manually produced and the mobile viewer files appear
satisfactory.

I would refer you to PG #8300

Regards,

Project Gutenberg

content.opf enumerating chunk size
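
For anyone who wants to check the chunk sizes without opening an editor, here is a minimal Python sketch. It assumes a standard EPUB layout: a zip archive whose META-INF/container.xml points at the package document (content.opf here), and manifest hrefs that are plain relative paths with no percent-encoding. The function names are my own and are not part of any Project Gutenberg tooling.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container file and the OPF package document.
OPF_NS = "{http://www.idpf.org/2007/opf}"
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"


def find_opf_path(epub: zipfile.ZipFile) -> str:
    """Return the package document path named in META-INF/container.xml."""
    container = ET.fromstring(epub.read("META-INF/container.xml"))
    rootfile = container.find(f"{CONTAINER_NS}rootfiles/{CONTAINER_NS}rootfile")
    if rootfile is None:
        raise ValueError("container.xml lists no rootfile")
    return rootfile.attrib["full-path"]


def manifest_chunk_sizes(epub_source):
    """Return (href, uncompressed size in bytes) for each XHTML manifest item.

    epub_source may be a file path or a binary file-like object.
    Assumption: hrefs are simple relative paths (no percent-encoding).
    """
    with zipfile.ZipFile(epub_source) as epub:
        opf_path = find_opf_path(epub)
        opf_dir = posixpath.dirname(opf_path)
        package = ET.fromstring(epub.read(opf_path))
        sizes = []
        for item in package.iter(f"{OPF_NS}item"):
            if item.get("media-type") != "application/xhtml+xml":
                continue  # skip CSS, images, the NCX, and so on
            href = item.get("href")
            # Manifest hrefs are relative to the directory holding the OPF file.
            member = posixpath.normpath(posixpath.join(opf_dir, href))
            sizes.append((href, epub.getinfo(member).file_size))
        return sizes
```

Calling `manifest_chunk_sizes("some-book.epub")` (a placeholder file name) and printing the pairs should make an automated, size-based split easy to spot: the sizes will cluster around one value instead of varying with chapter or book length.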