Lasse Mårtensson

New Eyes on Sweden's Medieval Scribes. Scribal Attribution using Digital Palaeography in the Medieval Gothic Script

This project proposes the development and evaluation of digital methods for scribal attribution in the medieval gothic script, the most widely used script type in medieval Europe. The groundbreaking aspect of the project lies in the methodology, as it is built on a combination of world-leading competence in both traditional philology and digital image analysis/machine learning. These disciplines make a common effort to map the script's characteristics. The project is thus clearly multidisciplinary. The digital methods focus on the minute features of the script signs, the so-called micro-palaeographical features. These have unanimously been regarded as the most significant for distinguishing individual scribes, but they are difficult, often impossible, to measure and quantify using manual methods. The material consists of the complete corpus of medieval original charters in Old Swedish (approximately 14,000 documents). The medieval charters are very suitable for studies in scribal attribution, as time and place of origin are stated directly in the text. This makes them safe points of departure for future studies of chronological and geographical variation in script, language etc. The results from the project will revolutionize the research on the material in question, and the methods will be central for practically all research on handwritten text. Most importantly, the project shows the possibilities that lie in the combination of humanities research and digital technology.
Final report
The aim and development of the project

The overarching aim of the project has been to study the writing process of medieval Swedish scribes, partly using digital methods from image analysis/machine learning and computational linguistics. A particular area of focus has been digital palaeography, i.e. the study of the physical shape of the script using digital tools. The composition of the project group has reflected this emphasis, with several senior researchers within image analysis/machine learning and two postdoc positions (divided over three individuals). Within digital palaeography we have investigated script features indicating time of production (for dating purposes) and individuality (for attribution purposes).

In order to broaden the view of the work of the medieval scribes we have also included an aim focusing on language and text from a scribe’s perspective, here also with components from digital methodology. Questions of interest in this area have been, for example, what factors determined the scribes’ choice of linguistic forms in the medieval documents. These questions are particularly relevant for Swedish scribes active during the Late Middle Ages, a time of far-reaching changes in the Swedish language. The linguistic and textual perspective has been in focus during the later part of the project, and this work has inspired forthcoming research tasks (see below under New research questions).


Briefly on the implementation of the project

The project has proceeded according to plan. Initially the University of Gävle was the grant administrator; the project was then moved to Stockholm University, and finally to Uppsala University. However, during the entire project period the project meetings have been held in Uppsala. The current project followed directly after a VR project with nearly the same participants (PI Anders Brun, Dept. for IT, Uppsala University), and to a large extent the working process in the present project was adapted from that VR project. The project group has thus had a long continuity.

During the project period, three postdocs have been hired, all with emphasis on image analysis/machine learning. One of these, Fredrik Wahlberg, has been placed at the Department for linguistics and philology. Sukalpa Chanda held the other postdoc position, placed at the Department for IT. After having been recruited to a tenured position at Østfold University College (Department of Computer Science and Communication) he left the postdoc position, and was then replaced by Ekta Vats, also placed at the Department for IT. Through Ekta Vats, the project has built an affiliation with the newly established Center for Digital Humanities at Uppsala University, as Ekta Vats got a position as a research engineer at that center.


The three most important results of the project, and a discussion of the conclusions of the project

In the present section, we will focus on three areas of the results, displaying the cross-disciplinary nature of the project. The first concerns the results from a palaeographical perspective, the second falls within the track focusing on text and language, and the third deals with the working process. The tracks dealing with palaeography and text/language comprise different types of methods and questions. Within the project they are still related, as in both cases it is a matter of following the working process of the medieval scribes closely, and of mapping what results their choices have for the form of the written documents. In many investigations within e.g. historical linguistics it is often forgotten that the preserved manuscripts and charters are the result of the efforts of individual scribes. In the current project, the perspective has been the opposite: the scribes themselves have been in focus.

The first central result concerns, as stated, the form of the script, and this will be illustrated with an article published towards the end of the project (2021 in Arkiv för nordisk filologi). In this investigation, the focus was on the proportions of the script signs, i.e. the proportions between high, middle and low script components. How high are the ascenders in e.g. ‘b’, ‘h’ and ‘l’ in relation to the minims in ‘i’, ‘m’ and ‘n’, and in turn in relation to the low components in ‘g’, ‘p’ and ‘q’? In previous research, this has been pointed out as a possible individual trait among the scribes, but without the aid of digital tools this has been difficult to measure with any degree of precision. The current investigation was the first measurement of script proportions carried out on Nordic medieval script. The investigation showed that the scribes were relatively consistent in the proportions of the script, but that they could still vary them if, for instance, the space demanded a more compressed script. The picture was similar in several palaeographical investigations; the variation within one individual scribe was considerable regarding many script features. This means that the picture of a certain scribe’s script appears through a combination of many different measurements, and not through one single feature.

The second central result concerns linguistic forms in Late Medieval Swedish textual witnesses. During the Late Middle Ages, the four-case system of Classical Old Swedish was replaced by the present-day two-case system, and a large number of Low German loan words entered the language. This means that the scribes had to deal with considerable linguistic variation. The linguistic norm of the scribes may have consisted of a changed morphology, with the present-day two-case system, whereas the exemplar that they copied could have had a language form in the old four-case system. An investigation of a number of textual witnesses containing a certain text, all produced during or directly after the time of the mentioned case reduction, shows that the scribes tended to ignore their own linguistic norm and instead fully reproduce the linguistic form of the exemplar. This means that the scribes followed their exemplar very closely, down to a very detailed level, without making any effort to insert forms according to their own linguistic norm. This is important information for e.g. research in the history of Swedish during the Middle Ages, as the language form of a certain textual witness could have been handed down through many stages in a copying chain, and could have its origin far back in time as compared to the preserved manuscript.

The last aspect that we want to call attention to in the project is the cross-disciplinary composition of the project group. It has been very stimulating to work on a common research question in a group drawn from very different academic disciplines. The meeting between Nordic philology and image analysis/machine learning has led to very interesting insights, and at the same time we have also grown aware of the challenges such work involves. It is necessary to work during a long period of time, with competence from both philology and computer science in close collaboration, in order to adapt the methods, often originating from other fields, to the present purpose. During the time of the project, milieus within digital humanities have come into existence at several seats of learning, both in Sweden and internationally. This type of cross-disciplinarity has been the very foundation of our project, and we have also been active close to such milieus, both in Sweden and abroad. Our project has to a large extent contributed to developing the computational research on handwritten material generally in Sweden. We have laid the foundation for future projects within this area, for ourselves and others.


New research questions

The project has been going on for a long time, with large resources, and many new questions have emerged during the work. The track within digital palaeography contains many opportunities for future work even though the project has reached its end. The project has also generated important research questions in the mentioned track dealing with linguistics and text production, which we will return to.

To study the medieval copying process in detail, and to follow the scribes’ work with their exemplars, is a very important task for both historical linguistics (with a focus on investigating the development of the linguistic system) and philology (with a focus on text transmission). There are many possible ways to approach this area, among other things by investigating preserved exemplars and direct copies. We want to explore methods for analyzing large amounts of text with digital tools in order to distinguish different chronological layers in a certain manuscript. As mentioned, the medieval textual witnesses are, with very few exceptions, copies, often involving several steps; each step in the copying process leaves traces. However, this type of investigation demands a large amount of text in a format suitable for the purpose. Such a project needs to be preceded by a larger transcription effort, parallel with the testing of new methodology.


Dissemination of the research and how collaboration has been carried out

We have striven to participate in the conferences that we have judged to be the most suitable from a purely scientific perspective. As an example one could mention that we have participated in the conference series Digital Humanities in the Nordic and the Baltic countries, from the first conference (2016) until the most recent (the present year). We have also prioritized the conferences that focus on Nordic philology, e.g. those arranged by Sällskap för östnordisk filologi, in order to spread the message to the most important receivers of the research. Just as we have put focus on philology, we have also participated in specialized conferences within image analysis, such as the International Conference on Document Analysis and Recognition (ICDAR).

With one exception, the project publications have been published open access, in the channels that we have judged to be the most important ones in our fields of research. Regarding Nordic philology, these include Arkiv för nordisk filologi (not open access) and the series for monographs published by Svenska fornskriftsällskapet. We have also published articles in journals within digital humanities, e.g. Nordic Digital Humanities Journal (Vol. 14).
Grant administrator
Uppsala University
Reference number
NHS14-2068:1
Amount
SEK 13,429,000
Funding
New prospects for humanities and social sciences
Subject
Specific Languages
Year
2015