As most of our users know, the Project for American and French Research on the Treasury of the French Language (ARTFL) is a collaborative project with the French Centre national de la recherche scientifique (CNRS) laboratory Analyse et Traitement Informatique de la Langue française. With over 270 subscribing institutions around the world, the ARTFL project is now one of the oldest and most successful online full-text services serving the scholarly community of research and higher learning.
With success have come new challenges and responsibilities. Over the last several years, the ARTFL project has been working to enhance both our collections and our software. This has been a collective effort involving discussions with our users as well as internal research and development efforts. On behalf of the whole ARTFL team, I am happy to announce this new release of both our search and retrieval engine, PhiloLogic, and of our main database, ARTFL-FRANTEXT.
Growing the Collection: ARTFL-FRANTEXT and other databases
We have been steadily augmenting our holdings and have now consolidated them. Over time we have integrated many more works into our main database, bringing the number of works to over 3,500 and the total number of words to over 200 million. We have been growing our collections in several ways, the most important of which have been:
I. Collaborative projects
Collaboration has consistently been one of our priorities. While we are engaged in many such projects, a few stand out, and I mention them both for their importance and as a way of inviting other researchers and institutions to work with us.
- Our partnership with Le groupe Θ at the Centre Jean Pépin UMR 8230 and ENS/PSL has led to the ongoing digitization of the Encyclopédie Méthodique, with over 34 volumes of text and plates now available, including key sections such as Philosophie ancienne et moderne, Théologie, and Assemblée nationale constituante.
- The collaboration with Anne Simonin, Pierre Serna, and Yann Arzel Durelle-Marc has resulted in La Loi de la Révolution Française 1789-1799, uniting for the first time the "Baudouin" and "Louvre" collections, the first two official compilations of French legislation.
- Our longstanding partnership with the Voltaire Foundation at Oxford University has produced several major resources:
  - The TOUT Voltaire database of Voltaire's complete works
  - The TOUT d'Holbach collection
  - The Commonplace Cultures project exploring intertextual relationships
- Our collaboration with the Sorbonne began with the Labex OBVIL on the Practices and Legacies of 18th Century Culture project, and continues now with the Observatoire des textes, des idées et des corpus (ObTIC). These partnerships have resulted in the first PhiloLogic implementation of the Très Grande Bibliothèque (TGB), which provides access to over 112,000 texts from Gallica's digital collections and allows us to explore the broader impact of Enlightenment thought on 19th century French culture.
II. In-house data entry projects
Our in-house data processing efforts have focused on expanding the diversity and coverage of our collections. We have carefully digitized and edited several major works, including Robespierre's Oeuvres complètes, the Décade philosophique, and the Procès verbaux du Comité d'Instruction publique. Recently, we have processed additional materials using modern OCR technologies optimized for early-modern texts, such as the John Carter Brown Library's Haiti Collection and Brown French Collection. We have also begun experimenting with LLM-assisted OCR correction for specific projects, as demonstrated in our digital edition of Savary des Brûlons' Dictionnaire universel de commerce.
III. Freely available online editions of suitable quality
We have incorporated a number of freely available collections, including the Archives Parlementaires, a chronologically ordered collection covering the first five years of the French Revolution; Chambers' Cyclopaedia (4th edition, 1741); and the Correspondance littéraire, Grimm and Meister's manuscript journal covering literary and cultural news from Paris between 1753 and 1793. We welcome information about additional freely available texts of suitable quality.
While texts in our main databases conform to high standards of correction, we occasionally make available preliminary versions of texts that are still being processed or corrected. We believe that providing early access to these resources, even in an imperfect state, serves the research community's interests. We welcome collaboration on improving these texts.
Our Search and Retrieval Engine: PhiloLogic
PhiloLogic, developed under the leadership of Clovis Gladstone with contributions from Mark Olsen, Charles Cooney, and Richard Whaling, is our flagship search and retrieval engine, now in its fifth major version. It combines powerful text analysis capabilities with an intuitive interface designed for both traditional scholarly research and computational methods.
Key features include faceted browsing, advanced visualization tools, customizable search parameters, and comprehensive metadata integration. For a complete overview of PhiloLogic's capabilities and documentation, please visit our dedicated PhiloLogic page.
Current Research Initiatives
The ARTFL Project continues to develop innovative tools for large-scale textual analysis. Our TextPAIR (Pairwise Alignment for Intertextual Relations) software has enabled groundbreaking research in text reuse detection and the study of intertextual relationships across large corpora. This work has been particularly valuable in understanding the transmission of ideas across time periods and linguistic traditions.
Building on these capabilities, we have created the Intertextual Hub, a digital humanities reading environment that helps scholars discover and analyze relationships between texts through shared passages, topics, and linguistic patterns. This platform combines close and distant reading approaches, allowing researchers to move seamlessly between individual texts and broader patterns of cultural transmission.
Our Topologic project explores new approaches to topic modeling and thematic analysis, with particular attention to the challenges of working with diachronic corpora. By combining traditional scholarly methods with advanced machine learning techniques, we are developing tools that help researchers identify and track the evolution of ideas across large text collections.
Looking ahead, we are investigating applications of recent advances in natural language processing to enhance our text analysis capabilities while maintaining our commitment to rigorous scholarly standards. These developments will complement our existing tools while opening new possibilities for humanities research at scale.
Conclusion
In closing, I would like to emphasize that most of the development efforts we have undertaken, whether in the area of collections or software, would have been impossible without the help and collaboration of the rather extraordinary set of graduate students from a whole range of disciplines -- humanities, social sciences, computer science, mathematics -- who have worked with us over the years. I would like to express my deep gratitude for their contributions as well as for the support of my colleagues in the Department of Romance Languages and Literatures at the University of Chicago.
- Robert Morrissey