Brown Library - French Collection
The French literature collection of the John Carter Brown Library is of particular value to those interested in political, social, and intellectual history and in the literature of travel. In addition to accounts by writers such as Cartier, Champlain, Thevet, Hennepin, and Tonti, the collection is strong in translations of French works into other languages. Semi-legendary and imaginary voyages hold particular interest. The library also holds a substantial collection of laws, decrees, and proclamations relating to France’s American possessions. This collection will include books printed in France and then expand to include books written in French and books written by French authors. In addition, the JCB’s digitization of the French books in its collection supports its 2016 celebration of the 250th anniversary of the circumnavigation of Louis-Antoine, Comte de Bougainville. To date, 2,155 documents in this collection have been released on the Internet Archive.
As part of preliminary work on the development of a larger collection of French and English documents related to slavery in the 18th and early 19th centuries, we are developing automatic processes to run new OCR applications on existing page images found in collections on various platforms. For this experiment, we developed an automated script that downloaded the existing page images and metadata from the Internet Archive, ran the Kraken OCR system on them, and loaded the results into the latest version of PhiloLogic (4.7). For this work, we selected 1,697 of the 2,155 available documents, eliminating more than 150 works not in French, along with works from earlier date ranges, which were excluded because of print quality and poor original OCR. This collection has minimal overlap with the Brown Haiti Collection, the selected Newberry FRC collection, and the Maryland French Collection.
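The first two steps of such a pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used for this collection: it assumes the page images are taken from each item's zipped JP2 derivative as listed by the Internet Archive metadata endpoint, and that Kraken is called through its command-line interface with a recognition model supplied by the user. The function names and the item identifier in the usage note are hypothetical, and the PhiloLogic loading step is not shown.

```python
from urllib.parse import quote

IA_METADATA_URL = "https://archive.org/metadata/{identifier}"
IA_DOWNLOAD_URL = "https://archive.org/download/{identifier}/{filename}"

# Assumption: page images come from the item's zipped JP2 derivative.
PAGE_IMAGE_FORMAT = "Single Page Processed JP2 ZIP"


def metadata_url(identifier):
    """URL of the JSON metadata record for one Internet Archive item."""
    return IA_METADATA_URL.format(identifier=quote(identifier))


def page_image_archives(item_metadata, image_format=PAGE_IMAGE_FORMAT):
    """Return download URLs for the page-image archives listed in a record."""
    identifier = item_metadata["metadata"]["identifier"]
    return [
        IA_DOWNLOAD_URL.format(
            identifier=quote(identifier), filename=quote(entry["name"])
        )
        for entry in item_metadata.get("files", [])
        if entry.get("format") == image_format
    ]


def kraken_command(image_path, output_path, model_path):
    """Build a Kraken CLI call: binarize, segment, then recognize the page."""
    return [
        "kraken", "-i", image_path, output_path,
        "binarize", "segment", "ocr", "-m", model_path,
    ]
```

In use, the JSON fetched from `metadata_url("exampleitem00")` would be passed to `page_image_archives`, each archive unpacked into single page images, and `kraken_command(...)` run on every page (for example with `subprocess.run`) before the text is handed to the PhiloLogic loader.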
This database should be considered a form of standoff index that links back to the Internet Archive site, where the page images can be browsed. Preliminary estimates suggest that the OCR text is significantly better than what is currently available on the IA, providing better search and retrieval capabilities as well as improved legibility for readers. There are, of course, still many errors, and the user is advised always to use the page images as the definitive source for citation and other scholarly uses. In some cases, the alignment of the OCR'd text to the page images on the Internet Archive may be off by one page.
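As an illustration of what linking back with a possible one-page misalignment involves, the sketch below builds Internet Archive book reader links for an indexed page and its immediate neighbours. It is a hedged example, not the database's own linking code: the function names are hypothetical, and it assumes pages are addressed by a zero-based leaf index in the reader's `/page/n<index>` URL form.

```python
def ia_page_url(identifier, leaf_index):
    """Link to one page image in the Internet Archive book reader."""
    return f"https://archive.org/details/{identifier}/page/n{leaf_index}/mode/1up"


def candidate_page_urls(identifier, leaf_index, tolerance=1):
    """Return the indexed page first, then neighbours within `tolerance` pages."""
    offsets = [0]
    for distance in range(1, tolerance + 1):
        offsets.extend([-distance, distance])
    return [
        ia_page_url(identifier, leaf_index + offset)
        for offset in offsets
        if leaf_index + offset >= 0  # skip positions before the first leaf
    ]
```

A reader checking a citation could open the first link and, if the text does not match the image, try the preceding or following page from the same list.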