Some reflections on the flow of knowledge in digital history – Doctoral Training Unit “Digital History & Hermeneutics”

Digital humanities as a diverse concept

In this blog post, I will present some reflections on the epistemic potential and limits of digital history and, more generally, digital humanities (DH). I will argue for two theses related to interdisciplinary research in DH. Clearly, space will not allow for a full defence of these two theses, but at least some support can be provided that should vindicate them as prima facie tenable.

The first thesis is about the positive potential of DH. It claims that proper use of digital tools in DH can lead to new knowledge that cannot be acquired without digital tools. In other words, DH holds great promise as a means of significantly promoting our scientific understanding in the humanities.
The second thesis is about a corresponding restriction or limitation of DH: the digital tools in DH cannot provide such new knowledge unless a large body of background knowledge is present and properly put to work.

Taken together, these two claims delineate the epistemic prospects of DH. Knowledge acquisition in DH via digital tools is promising yet not autonomous.

Digital humanities are diverse. There is no single paradigm, but rather a cluster of paradigms. Some digital tools are designed and used to extract particular pieces of information from large bodies of sources (text mining, social network analysis, etc.).

Other digital techniques are employed to acquire explanatory information about the behaviour of certain entities (computational modelling, such as agent-based modelling). And still others are exploited for the purpose of making information visible or accessible in a way that is relevant and aligned to human cognisers and their characteristic perceptual abilities and biases (primarily visualisations).

Looking at the practice in our DHH research group and at the literature, one cannot but conclude that in this area the research activities and pertinent epistemic aims are varied and legion. At the same time, there is also a significant degree of unity. The use of digital tools is essential (in some sense) to the digital humanities. This is not just a superficial unity, and it brings with it interesting and significant features and questions, since the evidence is generated by digital tools and thus takes on a digital format. Digital tools that are suitable for digital humanities research have general characteristics that partly determine the epistemological profile of research in DH.¹ Indeed, these characteristics are at the heart of the potential and limitations of digital humanities research. In the following, I will present some reflections on these characteristics and the corresponding epistemological profile in order to support the two claims above.

Data

The core features of the use of digital tools in DH can be stated very briefly: huge quantities of “data” can be generated and processed very rapidly and in a way that is strictly in accordance with a deliberately designed algorithmic procedure. This is the sheer computational power of digital devices put to use on “data” that can be gathered in various different ways. While each single step could presumably be performed by a human being, no human being could perform all the steps that computers can do in a few minutes.² As is well known, of course, Thomas Schelling originally used coloured coins on a chess board and performed the procedures of his segregation model by hand. However, with large numbers of individuals and steps, one will quickly surpass the possibilities of such a manual implementation. Yet it is true that computation need not be done by digital computers. We may speculate as to whether (so-called) artificial intelligence might allow us to go beyond human abilities, but this would lead us far astray since it introduces additional complicated difficulties. The cognitive limitations of memory and speed that are characteristic of human beings can thus be overcome, in principle at least.
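To make the point about manual versus machine execution concrete, here is a minimal sketch of a Schelling-style segregation procedure in Python. It is a simplified variant written for illustration, not a reconstruction of Schelling's original rules: the board size, the share of empty cells, the tolerance threshold and all function names are assumptions of this sketch.

```python
import random


def neighbours(grid, r, c):
    """Return the occupants of the (up to eight) cells around (r, c)."""
    size = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < size and 0 <= cc < size:
                if grid[rr][cc] is not None:
                    found.append(grid[rr][cc])
    return found


def is_unhappy(grid, r, c, threshold):
    """An agent is unhappy if too few of its neighbours share its colour."""
    agent = grid[r][c]
    near = neighbours(grid, r, c)
    if agent is None or not near:
        return False
    same = sum(1 for other in near if other == agent)
    return same / len(near) < threshold


def step(grid, threshold, rng):
    """Move each currently unhappy agent to a random empty cell.

    Returns the number of agents that moved.
    """
    size = len(grid)
    unhappy = [(r, c) for r in range(size) for c in range(size)
               if is_unhappy(grid, r, c, threshold)]
    rng.shuffle(unhappy)
    moved = 0
    for r, c in unhappy:
        empty = [(i, j) for i in range(size) for j in range(size)
                 if grid[i][j] is None]
        if not empty:
            break
        i, j = rng.choice(empty)
        grid[i][j], grid[r][c] = grid[r][c], None
        moved += 1
    return moved


def make_grid(size, empty_share, rng):
    """Fill a size x size board with two colours ("A", "B") and empty cells."""
    cells = []
    for _ in range(size * size):
        if rng.random() < empty_share:
            cells.append(None)
        else:
            cells.append(rng.choice(["A", "B"]))
    return [cells[i * size:(i + 1) * size] for i in range(size)]


def run(size=8, empty_share=0.2, threshold=0.5, max_steps=50, seed=0):
    """Repeat the step until nobody moves or max_steps is reached."""
    rng = random.Random(seed)
    grid = make_grid(size, empty_share, rng)
    for _ in range(max_steps):
        if step(grid, threshold, rng) == 0:
            break
    return grid
```

Calling `run()` plays the procedure on a chess-board-sized grid, which is roughly what can still be done with coins by hand. Each individual step is a simple count and a move; it is only the number of repetitions on a larger board (say `run(size=200)`) that puts the procedure beyond manual execution.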

Content

But the “data” and outputs of digital tools are not self-interpreting. Meaning has to be bestowed on them by a human interpreter. The reason is simple: states called “data” as well as output states of computers are simply physical/computational states that do not have any genuine, non-derivative representational content of their own.³

These physical states play a certain causal/functional role in the computational system and can therefore be seen as realising a computational state. A human being must then interpret these computational states in suitable ways in order to bestow representational content on them.⁴ This act of understanding or interpreting, performed by the researcher, turns the digital tool into a description – a model – of the target system under investigation (a “model” in the broad sense, in which any set of states that are interpreted as “saying” this or that about a target system according to a particular modelling interpretation is considered as a model).⁵

Knowledge

A human being, a scholar in the humanities – a humanist – uses digital tools in an attempt to gain new knowledge. And a lot of knowledge – suitably called “background knowledge” – is needed in order to make this possible. Most importantly, some knowledge and understanding of how digital tools work is needed. (In other words, the epistemic opacity of (big) data processing must be reduced to some extent.) So the historian has to acquire some computer science knowledge. And the “data” have to be known to contain some relevant information if the purpose is information extraction. If the “data” consist of digitised texts, for example, this knowledge can often be secured through familiarity with the analogue sources, their origins and the process of digitisation. Indeed, some background knowledge is often used in order to select (suitable) texts for further digitisation and informational exploitation. One need not know all the relevant information contained in the data but rather that some information (of a certain kind) is contained in it. This knowledge can reasonably be extrapolated from previous research that relied on non-digital methods (directly or indirectly).

Historians have developed means of acquiring knowledge about historical facts that use texts, maps and pictures. Thus, it can be reasonably presumed (by relying on the ubiquitous inference to the best explanation) that, for example, co-occurrences of words in a suitably chosen set of texts from a historically relevant context can reveal social relations between the authors of these texts and the flow of information between them.⁶ Of course, this is not the only way in which background knowledge is relied on. It is also used, for example, to select texts and other objects, to understand the meanings of signs and words, and so on.

Against this backdrop of background knowledge, historians can now let the machine search for correlations. Given the (known) reliable performance of the algorithm, the output can then be interpreted accordingly by historians. The sheer volume of data means that historians are unable to do this on their own and without digital technology. But if it is known that digital tools reliably and stably execute a procedure that computes the function as specified, historians can acquire new knowledge from suitably interpreting the outputs produced by the digital tool.
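The kind of mechanical search mentioned above can be illustrated with a minimal Python sketch that counts how often pairs of terms occur in the same document. The function name, the crude tokenisation and the toy "letters" with invented names are all assumptions made for illustration; they are not taken from any actual corpus or from the research cited here.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(documents, terms):
    """Count, for each pair of terms, the number of documents that mention both."""
    wanted = {t.lower() for t in terms}
    counts = Counter()
    for text in documents:
        # Crude tokenisation: split on whitespace and strip common punctuation.
        tokens = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
        present = sorted(wanted & tokens)
        counts.update(combinations(present, 2))
    return counts


# Invented toy corpus; the names are placeholders, not historical persons.
letters = [
    "Anna wrote to Bert.",
    "Bert met Clara; Anna objected.",
    "Clara replied.",
]
pairs = cooccurrences(letters, ["Anna", "Bert", "Clara"])
```

The resulting counts are just numbers attached to pairs of strings. Whether a high count indicates correspondence, rivalry or mere coincidence is not something the procedure can settle; that reading depends on what the historian already knows about the sources and about how the counting was done.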

Reflection on evidence and its use

In general, this ability to knowledgeably interpret the workings of digital technology includes source and algorithm criticism, and it lies at the heart of “digital hermeneutics”.⁷ Using this approach, for example, historians can gain knowledge of social networks in certain historical contexts. The lesson is clear enough: considerable background knowledge and reflective skill are required when digital tools are used in this way to gain new knowledge.

From an epistemological perspective, the use of digital tools is no different from the use of instruments and measurement techniques. Background knowledge of their workings is needed in order to arrive at new knowledge by using these instruments. The system under investigation is linked in known ways to the – digital or non-digital – tool and so the tool’s behaviour (“output”) can be used as evidence that enables the acquisition of knowledge about the system under investigation.

In sum, we can see that relying on a large body of background knowledge (about target objects and digital technology) and the ability to interpret and reflect on it can enable us to produce and properly interpret digitally generated outputs as new evidence that leads us to further knowledge about the target objects.

What lessons are to be drawn? Is a new revolution with a new paradigm emerging in the humanities?

Perhaps. There is a lot of new terminology and new notions – but that in itself is not enough for a scientific revolution and a new paradigm. I am rather sceptical about the term “revolution”. More likely, there will be continuation and an “update”⁸ rather than a break or paradigm shift between non-digital and digital humanities.

One job for the future, however, is relatively clear. We should work hard to establish a common pool of general notions that can be employed in many if not all research activities in DH. Some notions developed in epistemology and philosophy of science – like the notions of evidence, knowledge and inference that I have used above – are suitably general and tailor-made for these interdisciplinary purposes.

For me, being part of the DTU DHH has confirmed the impression that flourishing DH research that fulfils the demands of reflection and awareness characteristic of genuine science requires (more) general and abstract notions that bridge the gaps between disciplines and allow us to identify possible pitfalls. That being said, it is clear that the field of digital humanities is promising – even if it is not autonomous.⁹

English Review: Sarah Cooper; Editor: Juliane Tatarinov

Notes

¹ See Thomas Durlacher, “Philosophical perspectives on computational research methods in digital history. The cases of topic modelling and network analysis.” In Digital History and Hermeneutics – Between Theory and Practice, edited by Juliane Tatarinov and Andreas Fickers (Berlin: De Gruyter, forthcoming in 2021), for further thoughts about the methodological unity of DH, including a critique of Kuhnian repudiations of methods.
² Jörg Wettlaufer, “Neue Erkenntnisse durch digitalisierte Geschichtswissenschaft(en)? Zur hermeneutischen Reichweite aktueller digitaler Methoden in informationszentrierten Fächern,” Zeitschrift für digitale Geisteswissenschaften 1, 2016.
³ For the distinction between non-derived – or “original” – representational content (or intentionality) and derived content – i.e. content that is merely possessed by virtue of an act of interpretation from some other intentional system – see e.g. Jerry Fodor, A Theory of Content and Other Essays (Cambridge/Mass.: MIT Press, 1992); Fred Dretske, Naturalizing the Mind (Cambridge/Mass.: MIT Press, 1995).
⁴ Zenon Pylyshyn, Computation and Cognition (Cambridge/Mass.: MIT Press, 1984).
⁵ This general understanding of models has been convincingly argued by Roman Frigg and James Nguyen, “Scientific representation is representation as.” In Philosophy of Science in Practice, edited by Hsiang-Ke Chao and Julian Reiss (Berlin: Springer Verlag, 2017), 149-179.
⁶ See, for example, the work of Eva Anderson in her PhD thesis “Psychiatric knowledge circulation in Europe during the mid-19th and mid-20th century” in the DTU DHH.
⁷ See Andreas Fickers, “Update für die Hermeneutik.” Zeithistorische Forschungen/Studies in Contemporary History 17:1 (2020): 157-168.
⁸ Ibid.
⁹ Many thanks to Thomas Durlacher and Juliane Tatarinov for their very helpful comments and constructive criticism.