Stylometry approaching Parnassus

Abstract: The article briefly outlines past, fruitless attempts to identify the author(s) of the Parnassus Plays, an anonymous trilogy performed between 1597 and 1601 at St. John’s College, Cambridge, as part of the Christmas festivities. It then employs the R Stylo features Rolling Delta and Rolling Classify, which in various approaches confirm John Marston and Thomas Nashe as the authors.

Topic modelling characterization of Mudejar art based on document titles

Abstract: Text mining techniques were applied to a corpus consisting of the titles of 2,454 documents on Mudejar art, a style unique to Spanish art history. Probabilistic topic modelling was used to analyse the semantic structure underlying the documents studied. Two classifications were obtained: an initial, generic division into five topics, followed by a second, more refined division into ten. These were compared to the preliminary subject categories found for the corpus with the guidance of an area specialist.

Correcting real-word spelling errors: A new hybrid approach

Abstract: Spelling correction is one of the main tasks in the field of Natural Language Processing. Unlike common spelling errors, real-word errors cannot be detected by conventional spelling correction methods. The real-word correction model proposed by Mays, Damerau, and Mercer performed well in different evaluations. In this research, however, a new hybrid approach is proposed which relies on statistical and syntactic knowledge to detect and correct real-word errors.
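To illustrate why real-word errors escape conventional checking, the following is a minimal sketch, not the Mays, Damerau, and Mercer model and not the hybrid approach proposed in the article. It assumes a toy corpus, a hand-made confusion set, and raw bigram counts as the only statistical knowledge; all names (`CORPUS`, `CONFUSION`, `dictionary_check`, `correct_real_words`) are hypothetical and chosen for illustration only.

```python
from collections import Counter

# Toy training text (assumption): a real system would use a large corpus.
CORPUS = ("i want a piece of cake . we hope for peace on earth . "
          "a piece of paper . peace and quiet").split()

VOCAB = set(CORPUS)
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))

# Hypothetical confusion sets: words commonly mistaken for one another.
CONFUSION = {"piece": {"peace"}, "peace": {"piece"}}


def dictionary_check(tokens):
    """Conventional check: flag only tokens absent from the vocabulary."""
    return [t for t in tokens if t not in VOCAB]


def context_score(tokens, i, word):
    """Count the bigrams that `word` forms with its neighbours at position i."""
    left = BIGRAMS[(tokens[i - 1], word)] if i > 0 else 0
    right = BIGRAMS[(word, tokens[i + 1])] if i + 1 < len(tokens) else 0
    return left + right


def correct_real_words(tokens):
    """Swap in a confusable word only when it fits the context strictly better."""
    result = list(tokens)
    for i, tok in enumerate(tokens):
        candidates = {tok} | CONFUSION.get(tok, set())
        # Ties are broken in favour of the word the writer actually typed.
        result[i] = max(
            candidates,
            key=lambda w: (context_score(tokens, i, w), w == tok),
        )
    return result


sentence = "a peace of cake".split()
print(dictionary_check(sentence))    # every token is a valid word
print(correct_real_words(sentence))  # context suggests a different word
```

Because "peace" is itself a valid word, the dictionary lookup flags nothing; only the surrounding context reveals the error. A syntactic component, as the abstract describes, would add a further source of evidence that this sketch does not model.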

Evaluating multi-criteria connection mechanisms: A new algorithm for browsing digital archives

Abstract: Searching for articles of interest in a digital archive need not be through a free-form text search. In fact, many authors have suggested that the best way to find relevant items in an archive is to browse its contents rather than to search for specific keywords.

At the crossroads of digital humanities and historical lexicography: The Middle Dutch ‘seemly play (abel spel) of Winter and Summer’ as a research case

Abstract: Digitization has changed the concept of dictionaries from merely alphabetically ordered reference works into lexical databases providing flexible search systems with interconnected lemmas. This article investigates the opportunities this opens up, and useful design options for digitized historical dictionaries as research tools for the study of texts. It appears that we have arrived at an interesting intersection of digital humanities and historical lexicography. The 14th-century ‘seemly play of Winter and Summer’ serves as a research case.

EVI-LINHD, a virtual research environment for the Spanish-speaking community

Abstract: The Laboratorio de Innovación en Humanidades Digitales (UNED) has developed the Entorno Virtual de Investigación del Laboratorio de Innovación en Humanidades Digitales (EVI-LINHD), the first virtual research environment devoted mainly to Spanish speakers interested in digital scholarly editing. EVI-LINHD combines several open-source tools for developing a complete digital project: (1) a Web-based markup tool, TEIscribe, combined with an eXistdb solution and a TEIPublisher platform; (2) Omeka for digital libraries; and (3) WordPress for simple Web pages.

Visualizing Mouvance: Toward a visual analysis of variant medieval text traditions

Abstract: Medieval literary traditions provide a particularly challenging test case for textual alignment and the visualization of variance. Whereas the editors of medieval traditions working with the printed page struggle to illustrate the complex phenomena of textual instability, research in screen-based visualization has made significant progress, allowing for complex textual situations to be captured at the micro- and the macro-level.

The rationale of the born-digital dossier génétique: Digital forensics and the writing process: With examples from the Thomas Kling Archive

Abstract: The article outlines the rationale of the born-digital dossier génétique from a digital forensic perspective in the light of the recent discussion about digital materiality. In its first part, the study addresses theoretical, conceptual, and methodological questions that arise from the specific materiality of the born-digital avant-texte, namely, the dualism of ‘forensic materiality’ and ‘formal materiality’ (M. Kirschenbaum) and the role of distributed materiality (J.-F. Blanchette).

Authorship attribution, constructed languages, and the psycholinguistics of individual variation

Abstract: ‘Authorship attribution’, the problem of determining the author (or the author’s attributes, such as gender, age, native language, or other characteristics) by examining the writing style of an unknown work, is an important problem in applied linguistics. The theory of authorship attribution is relatively straightforward: language is an underspecified system, and people can pick and choose among several different ways to describe the same thing.