History of the ARTFL Project
In 1957 the French government initiated the creation of a new dictionary of the French language, the Trésor de la Langue Française. In order to provide access to a large body of word samples, it was decided to transcribe an extensive selection of French texts for use with a computer. Twenty years later, a corpus totaling some 150 million words had been created, representing a broad range of written French -- from novels and poetry to biology and mathematics -- stretching from the seventeenth to the twentieth centuries.
It soon became apparent that this corpus of French texts was an important resource not only for lexicographers, but also for many other humanists and social scientists engaged in French studies on both sides of the Atlantic. The result of this realization was American and French Research on the Treasury of the French Language (ARTFL) -- a cooperative project established in 1981 by the Centre National de la Recherche Scientifique and the University of Chicago.
The ARTFL project has focused on three objectives over its long history:
- Text corpus development: offering a variety of text collections
- Research and development: improving the navigational tools used to explore our collections
- Inquiry: participating in Digital Humanities scholarship and research
The Databases
At present, ARTFL's main corpus, ARTFL-FRANTEXT, consists of more than 3,500 texts, ranging from classic works of French literature to various kinds of non-fiction prose and technical writing. The eighteenth, nineteenth, and twentieth centuries are about equally represented, with a smaller selection of seventeenth-century texts as well as some medieval and Renaissance texts. Genres include novels, verse, theater, journalism, essays, correspondence, and treatises. Subjects include literary criticism, biology, history, economics, and philosophy. In most cases standard scholarly editions were used in converting the text into machine-readable form, and the data contain page references to these editions. The FRANTEXT corpus is updated as new high-quality digital texts become available.
In addition to FRANTEXT, ARTFL has built over 50 databases for researchers and students working in specialized disciplines and languages other than French. Please see our Databases, Resources, and Collaborations pages for links to these projects.
New Opportunities for Research
ARTFL's stable of databases is one of the largest of its kind in the world. The number, variety and historical range of its texts allow researchers to go well beyond the usual narrow focus on single works or single authors. The databases permit both rapid exploration of single texts and inter-textual research of a kind virtually impossible without the aid of a computer. For a description of the latest research developments underway at ARTFL, please visit our Research Blog.
PhiloLogic
PhiloLogic is a tool for ARTFL text research that provides a menu-driven system with a sophisticated help program that can be accessed at any time. PhiloLogic does not require any specialized knowledge of computers -- in fact, this system provides an excellent opportunity to become acquainted with the possibilities of computer-assisted research and teaching. The ARTFL Project has written full documentation for PhiloLogic, available here.
Queries
ARTFL's PhiloLogic system supports a number of searching options. A user may search for a single word, a word root, a prefix, a suffix, or a list of words created by the user. For example, one might search for the word liberté in the texts published between 1789 and 1794, or for all of the words associated with "artist" -- artiste, artistes, écrivain, écrivains, poète, poètes, etc. -- in the works of Zola. In many cases a researcher will be interested not merely in the occurrences of single words or lists of words, but also in where those words occur in relation to one another. PhiloLogic allows the user to search for logical combinations of words and word lists. One might, for example, search for all the occurrences of words associated with "artist" where words beginning with "fem" -- femme, femmes, féministe, etc. -- are found in the same sentence in the works of Zola.
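The same-sentence combination described above can be illustrated with a short Python sketch. This is not PhiloLogic's implementation or query syntax; the function names, the naive sentence splitter, and the sample sentences (invented here, not taken from Zola) are all assumptions made for illustration.

```python
import re

# Word list for "artist", as in the example in the text.
ARTIST_WORDS = {"artiste", "artistes", "écrivain", "écrivains", "poète", "poètes"}


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; \w matches accented letters in Python 3 strings.
    return re.findall(r"\w+", sentence.lower())


def cooccurrences(text, word_list, prefix):
    """Return sentences containing a word from word_list AND a word starting with prefix."""
    hits = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        has_list_word = any(t in word_list for t in tokens)
        has_prefix_word = any(t.startswith(prefix) for t in tokens)
        if has_list_word and has_prefix_word:
            hits.append(sentence)
    return hits


# Invented sample text, for illustration only.
sample = (
    "L'artiste regarda la femme en silence. "
    "Le poète écrivait seul. "
    "Les femmes riaient dans la rue."
)

for sentence in cooccurrences(sample, ARTIST_WORDS, "fem"):
    print(sentence)
```

Only the first sample sentence satisfies both conditions; the second has an "artist" word but no "fem" word, and the third has the reverse.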
PhiloLogic also offers advanced features such as collocation analysis, word frequency distributions, and faceted browsing of search results. Users can explore word patterns across time periods, analyze changing vocabulary usage between authors or genres, and generate visualizations of search results. The system supports complex proximity searches, allowing researchers to find words or phrases that appear within a specified distance of each other, as well as regular expression searches for more sophisticated pattern matching.
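To make the idea of a proximity search concrete, here is a minimal Python sketch that finds pairs of words occurring within a given number of tokens of each other. The function name `proximity_hits`, its parameters, and the sample phrase are hypothetical; PhiloLogic's own proximity search may work quite differently.

```python
import re


def proximity_hits(text, first, second, max_distance):
    """Return (i, j) token positions where `first` and `second`
    occur within max_distance tokens of each other."""
    tokens = re.findall(r"\w+", text.lower())
    first_pos = [i for i, t in enumerate(tokens) if t == first]
    second_pos = [j for j, t in enumerate(tokens) if t == second]
    return [
        (i, j)
        for i in first_pos
        for j in second_pos
        if 0 < abs(i - j) <= max_distance
    ]


# Invented sample phrase, for illustration only.
phrase = "La liberté guide le peuple vers la liberté et la loi"

print(proximity_hits(phrase, "liberté", "peuple", 3))
print(proximity_hits(phrase, "liberté", "peuple", 2))
```

With a distance of 3, both occurrences of liberté pair with peuple; tightening the distance to 2 returns no matches.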
Display
PhiloLogic provides researchers with five complementary analytical lenses through its main search reports: concordance views for examining words in context (with expandable context windows), KWIC (KeyWord In Context) display for rapid pattern recognition, aggregation reports for statistical analysis, collocation analysis for discovering word associations, and time-series visualizations for tracking changes across periods. Complementing these reports, faceted browsing allows users to filter and explore results using metadata categories such as author, date, genre, and other bibliographic fields. Each of these views can be further enriched with comprehensive bibliographic information.
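For readers unfamiliar with the KWIC format mentioned above, the following Python sketch shows the basic idea: each hit is displayed with a fixed number of words of context on either side. The function `kwic` and the sample phrase are assumptions for illustration and do not reflect how PhiloLogic generates its reports.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Invented sample phrase, for illustration only.
phrase = "La liberté guide le peuple vers la liberté et la loi"

for left, word, right in kwic(phrase, "liberté", width=2):
    # Right-align the left context so the keywords line up in one column.
    print(f"{left:>10}  [{word}]  {right}")
```

Aligning the keyword in a single column is what makes recurring patterns to its left and right easy to scan.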
The concordance interface serves as the primary entry point, offering highlighted search terms with dynamically expandable contexts. Users can seamlessly transition to reading full texts, with navigation enhanced by direct access to page images from the original editions when available. This side-by-side display of text and facsimile enables immediate verification of transcriptions against source materials.
To facilitate further analysis and collaboration, all search results can be exported in multiple formats (CSV, XML, JSON). The system's responsive design works across devices, and researchers can save and share specific views and search configurations, making it easy to return to or share particular research states with colleagues.
Access to the ARTFL Database
Access to the databases is organized through a consortium of user institutions, in most cases universities and colleges, each of which pays an annual subscription fee. This fee is $500 (US) for PhD-granting institutions and $250 (US) for other universities and colleges. All scholars and students at affiliated institutions have access to the database. Our Subscription Information page contains more on database access.
Future Development
The ARTFL Project continues to expand and improve its resources through multiple initiatives: growing our text collections, enhancing existing databases through corrections and updates, and developing new analytical tools and access methods. We actively seek contributions of high-quality texts and welcome proposals for new collaborative projects. Our development direction is shaped by ongoing dialogue with our user community, and we encourage institutions to join us in expanding digital humanities research capabilities. For the latest updates on our collections and tools, please visit our news page.
More Information
The ARTFL Project is supported by a full-time staff at the University of Chicago. We encourage you to contact us with any questions you may have about the project, such as the availability of texts, operation of the system, or the costs of using the database.