Topic modelling characterization of Mudejar art based on document titles

Abstract

Text mining techniques were applied to a corpus consisting of the titles of 2,454 documents on Mudejar art, a style unique to Spanish art history. Probabilistic topic modelling was used to analyse the semantic structure underlying the documents. Two classifications were obtained: an initial, generic division into five topics, followed by a second, more refined division into ten. These were compared with the preliminary subject categories established for the corpus under the guidance of an area specialist. The classifications delivered by the automatic and manual procedures were found to be compatible. The conclusion drawn was that the use of digitized data affords the opportunity to conduct humanities studies from new perspectives.