How statistics and text mining can be applied to literary studies?

Abstract: Statistics and data mining techniques provide exciting approaches for extracting knowledge from data. Recently, researchers in many fields have sought to exploit them. This study demonstrates how statistics can be applied to literary studies. First, all the lines in Khaghani’s divan are classified and coded into three categories (mystical, non-mystical, and borderline). Then a set of chi-square goodness-of-fit tests is used to investigate and compare the frequencies of the line categories, separately for all lines and for all odes.

Function words in statistical machine-translated Chinese and original Chinese: A study into the translationese of machine translation systems

Abstract: Statistical approaches have become the mainstream in machine translation (MT) because of their potential to produce less rigid and more natural translations than rule-based approaches. However, on closer examination, function words are used differently in statistical machine-translated Chinese and in original Chinese, and such differences may be associated with translationese as discussed in translation studies. This article examines the distribution of Chinese function words in a comparable corpus consisting of MTs and original Chinese texts extracted from Wikipedia.

An introduction to the functioning process of embedded paratext of digital literature: Technoeikon of digital poetry

Abstract: Recent born-digital electronic literature has heterogeneous components such as kinetic texts, kinetic images, graphical designs, sounds, and videos. These digital components are embedded with the main text, much as print and digital works carry paratext such as the preface, author’s name, illustrations, and title. However, a comparative study of paratext and the embedded paratext of electronic literature shows that these entities follow different strategic patterns and serve different functions.

Remembrance of contemporary events: On setting up the Sunflower Movement Archive

Abstract: In the late evening of 18 March 2014, students and activists stormed into and occupied the main chamber of Taiwan's Legislature. The event set off the Sunflower Movement, signifying a turning point in Taiwan's recent history. Researchers at Academia Sinica arranged to acquire all the supporting artifacts and documentary materials in the chamber before the protest came to a peaceful end. In this article, we discuss the issues in archiving and making available to the public a large collection of artifacts created by thousands of participants during a contemporary event.

SageBook: Toward a cross-generational social network for the Jewish sages’ prosopography

Abstract: In this research we devised and implemented a semi-automatic approach for building SageBook, a cross-generational social network of the Jewish sages from the Rabbinic literature. The proposed methodology is based on a shallow argumentation analysis that leads to the detection of lexical-syntactic patterns representing different relationships between the sages in the text. The method was successfully applied and evaluated on the corpus of the Mishna, the first written work of the Rabbinic literature, which provides the foundation for the development of Jewish law.

Representing stories as interdependent dynamics of character activities and plots: A two-mode network relational event model

Abstract: Recent advances in data science and machine learning have enhanced our ability to analyze and understand the structure of social interactions in fictional stories by using formal and quantitative approaches. However, an objective assessment of these aspects of fictional stories remains a relatively new and technically difficult field. In this brief report, we introduce our study in which we modeled story dynamics from a novel perspective.

The problem of microattribution

Abstract: Microattribution is the name of a method that has recently started to be used in the attribution of parts of early modern plays. The method seeks to make authorship attributions by using samples of writing consisting of fewer than two hundred words. This article argues that the method should not be used, fundamentally because it flouts the well-founded scientific insistence on the sufficiency of sample sizes. The article considers two recent applications of the method, showing that they overlooked large amounts of evidence that would have invalidated the conclusions drawn.

Knowledge Organization and Cultural Heritage in the Semantic Web – A Review of a Conference and a Special Journal Issue of JLIS

Review of the Knowledge Organization and Cultural Heritage: Perspectives of the Semantic Web conference held at the Academia Sinica Center for Digital Cultures in Taipei on June 2, 2016 and a special journal issue of academic papers related to the


