Digging into Image Data to Answer Authorship Related Questions
The primary datasets that we propose to use during this research include:
Nine complete Froissart manuscripts from the 15th century that have been digitized to similar standards and quality. These are: Toulouse, Bibliothèque d'Etude et du Patrimoine MS 511; Besançon, Bibliothèque d'Etude et de Conservation MS 864 & MS 865; Stonyhurst College MS 1; Brussels, Bibliothèque Royale MS II 88, MS IV 251 tomes 1 & 2; and Paris, BnF MSS français 2663 and 2664. We are currently seeking funding to add two further complete manuscripts to this dataset: Pierpont Morgan Library MS M.804 and British Library MS Royal 15 E.VI. The current collection of 15th-century manuscripts consists of over 6,100 images, mainly at 500 DPI, hosted on a federated Storage Resource Broker (SRB) facility shared between UoS and UIUC and accessed through a web front end developed collaboratively by the two sites (see http://cbers.shef.ac.uk). The images can also be retrieved from the SRB system via an API, which provides direct access to the image dataset from within a programming environment.
17th- and 18th-century map collections: the University of Illinois Library holds a 1664 Blaeu Atlas and over twenty of the atlases published by Herman Moll in the early 18th century, as well as digital scans of the maps for this project. These atlases include hundreds of additional maps, and the algorithms developed by this project can be applied to the thousands of pre-1800 maps that are gradually being digitized by libraries around the world.
19th- and 20th-century quilt images: the Quilt Index (a partnership of Michigan State University and the Alliance for American Quilts) contains images and detailed information on nearly 25,000 quilts, a number that will grow to 50,000 by the end of the grant period. The quilts, dating from the 1700s to the present day, are mostly American in origin, though the Index will expand to include international collections in the future. Access images (550-pixel-wide JPEG files at 72-150 ppi resolution) have been contributed by museums, libraries and documentation projects for education and research use. The set is hosted in MATRIX's open-source digital repository, KORA, and is available at www.quiltindex.org. Many thousands of styles and quilt makers are represented in this dataset, and image quality varies with the original photography.