Another paper on Upper Palaeolithic marking systems has been getting a lot of attention, and various colleagues have been asking my opinion on it. I am not any kind of expert on this period but I do have a few thoughts and I figure that it’s time to get them into shape.
Christian Bentz and Ewa Dutkiewicz argue that sequences of signs on a corpus of 260 mobile artefacts from the Swabian Aurignacian bear a statistical signature of information-density that is consistent with sequences of signs on a set of non-linguistic Mesopotamian clay tablets from the Uruk V layer, ie, the proto-cuneiform texts that predated the emergence of writing. While Aurignacian objects retain a roughly stable information density over 10,000 years, the Uruk material is different. Clay documents from the later period, known to represent the Sumerian language, have a far greater information-density.
Assuming that the coding of the primary data is reliable and its analysis is robust, this finding is definitely interesting. It amounts to a demonstration that two sequential-marking practices, separated by time, geography and cultural context, have similar statistical patterns. And if the marks on the Aurignacian artefacts are symbolic, as opposed to decorative, this patterning suggests that they may have functional affordances that line up with the earlier Uruk material. All this confirms the observation that human societies have been using graphic codes for a long time, that they innovate these codes independently of one another, and that some of these codes may have convergent properties. They can also be adapted to carry different information loads under different conditions.
This has turned out to be a long essay rather than the short blog post I first imagined, but the tl;dr version is this: Bentz and Dutkiewicz have produced a thought-provoking paper that is theoretically well motivated. Its implications are modest yet important, and the methodology should be taken seriously, despite genuine challenges and limitations that I have tried to identify below.
Media interpretations
The authors dangle much more exciting implications within the paper itself without fully committing to them. Parenthetically they suggest that the findings shed light on the emergence and evolution of writing itself, as well as the incremental advancement of human cognitive potential. This discussion was clearly intended to provoke the curiosity of readers rather than to stake any defensible claim. But it is these remarks, rather than the findings, that have captured the imagination of journalists who prefer to tell a simpler story: that scientists have discovered a 40,000-year-old ancestor to writing.

This is a seductive trope. Since the first years of the 20th century, European scholars have speculated that marked objects and cave paintings from the Upper Palaeolithic represented primordial forms of writing. I’ve written about this intellectual history here, and also here. The authors of the paper touch on some of this history too, drawing my attention to a curious paper from 1977 whose authors link certain Palaeolithic icons to Sumerian, Egyptian and Chinese signs, as if they stood in a direct line of inheritance. Narratives of this kind resonate with popular understandings of evolutionary processes in which simplicity is seen to give way to complexity, and ‘primitive’ cultures advance in steady, unbroken steps towards ‘civilisation’.
Anthropologists rejected that model of stepwise goal-directed evolutionism over a century ago, but not before it sank deep into public consciousness. It has proved very hard to compete with such a compelling story. No one is going to bother clicking on the headline “Scientists discover that human societies across history are wonderfully diverse and generate a range of responses to different circumstances, sometimes independently hitting on the same solution, sometimes by repurposing an older solution to a different problem.” Much better to run with “Ape creates bone tool that becomes space ship”. When it comes to ancient marking practices, readers can visualise early Europeans doodling on various surfaces in an effort to discover writing, much like a frustrated novelist scrunching up endless drafts and trying again, each draft getting closer to the desired effect.
Yet there is a general consensus among palaeographers that writing did not in fact emerge in a slow incremental fashion over thousands of years. Writing was independently invented quite recently in Mesopotamia, Egypt, China and Mesoamerica, and perhaps elsewhere. Only in the first two sites do we have reasonably good archaeological evidence that helps us understand the approximate circumstances of its origins. In Mesopotamia and Egypt it appears that specialist scribes repurposed and extended existing notational systems into a code that mapped signs onto linguistic structures. For the first time, a spoken word or phrase could be modelled in a visual way, allowing it to be reconstructed by any person literate in that system. What’s especially interesting is that the invention of writing appears to be a quantum leap. As soon as the language-mapping principle was discovered it was very quickly generalised into a fully functional writing system in a process that necessarily involved intentional coordination and trial-and-error experimentation. You can see that quantum leap in Mesopotamia in the more recent Uruk layers when the code was rapidly extended to represent the Sumerian language. In China, likewise, there are no surviving ‘intermediate’ texts that are only partially linguistic. It’s all or nothing.
But the various notational systems that preceded and influenced this innovation were never locked onto any kind of inevitable and continuous pathway towards writing. They were simply part of the symbolic raw material that made writing possible. Nor did these ‘precursor’ systems become obsolete with the advent of writing. Some remained in concurrent use, while new non-linguistic codes were innovated alongside writing. The idea of writing as a kind of intellectual end goal is further undermined by the fact that this technology has come and gone with the vicissitudes of history. As it happens, of the four known lineages of writing only the Egyptian and Chinese lineages have survived to this day, even if the now-defunct Mesopotamian line remains the longest surviving example. Writing was famously lost and rediscovered in Greece, and may have been invented and lost again multiple times in various locations without leaving enough evidence behind to make it visible. In Mesoamerica, where the archaeological context is admittedly thin, we even see cases of fully linguistic writing systems replaced over time by non-linguistic notations, as if the Uruk sequence were reversed.
In much more recent times writing has been reinvented within small-scale and traditionally non-literate communities in Southeast Asia, West Africa, and the Pacific. As in the ancient examples, modern non-literate inventors have often drawn on existing systems of iconography as sources of inspiration. It doesn’t seem to matter what those prior symbols were originally for as long as they were conventional and somewhat systematic. Similarly, these traditional iconographic repertoires and systems continued in parallel with the writing systems they influenced.
When writing is invented and is sustained over successive generations, it tends to leave a trace. It literally makes ‘history’ in the earlier sense of that word. This is what is infuriating about media reports that imply that we just don’t know anything about where writing may have come from and that it’s all up for grabs. It’s all a bit like imagining that aliens built the pyramids, despite the fact that we know the names of some of the work gangs and even the kinds of songs they sang while hauling stone blocks. Such is the allure of the mystery narrative. Speculation by non-specialists can be elevated to equal status with the work of those who did the homework, like true crime podcast fans solving cases from their internet browsers.
At the same time, I’m not immune to the pull of an ancient mystery story. After all, there really is so much we don’t know about the context and atmosphere of creative activities in the past. What did cave artists feel about their own productions? What stories did audiences tell about them, and how were they embedded in everyday experiences? The material record, even when it is spectacular, is a pale shadow of a much richer and irretrievable world of networked populations, storytelling in lost languages, celebrations of seasonal events, religious activity, music, love, war, trade and cross-cultural encounters.
A higher-order question, implicit in the work of scholars such as Bentz and Dutkiewicz, is this: “How did ancient mark-makers generate meaning, and how was this meaning received by intended audiences?”
Anthropologists have always championed the importance of ‘being there’ as an active participant-observer in the communities they want to understand rather than relying on somebody else’s description. No matter how much preparatory reading you do, or how deeply you analyse a dataset, the experience of fieldwork can be overwhelming in its intensity, like seeing in colour for the first time. Previous blind spots or misconceptions become embarrassingly obvious, while at the same time you might get momentary glimpses of other dimensions of knowledge that lie just out of reach. Of course, when it comes to reconstructing a deep past beyond human memory it is never possible to ‘be there’ as a participant-observer, but it is well worth cultivating an ethnographer’s instinct for imagining what might be missing from a scene, and a sense of humility in the face of one’s own ignorance.
When guided by a culturally relative and ethnographic perspective it becomes easier to appreciate that our distant ancestors were just as capable in all respects as we are. Marking systems from 40,000 years ago ought to be appreciated as creative achievements in their own right with uses that were already fit for purpose at the time they were developed. At the same time, and with the benefit of a historical perspective, we can also hold a sense of their multiple and as-yet-unrealised potentialities.
Why writing is different
One rhetorical way of emphasising continuities between early marked objects and modern inscriptions is to categorise any systematic graphic code as ‘writing’ regardless of whether or not it models language. This so-named ‘broad’ definition of writing contrasts with the narrow view in which a code must model linguistic structure if it is to be called ‘writing’. Some advocates see the broad view as a corrective to ethnocentrism, since a vast number of contemporary societies have innovated their own non-linguistic graphic codes that fail to get the same recognition. Attributing unique importance to language-based writing, they argue, is a relic of colonialism, its obsession with cultural hierarchies and its arbitrary reification of a divide between ‘literate’ and ‘illiterate’ communities. Surely all graphic systems deserve the status of ‘writing’? And yet this attitude comes with its own ethnocentric consequences. There are some societies that don’t use any graphic codes at all but are just as capable of rich communication and dynamic cultural transmission. As such, the broad definition of writing merely redraws a new category of ‘illiterate savages’ to serve as a foil for the ‘civilised’, a point I have made in more detail here.
I maintain that the distinction between linguistic and non-linguistic codes is, in fact, analytically meaningful. Language-dependent writing systems really do have properties that make them distinct from non-linguistic codes. Human language, whether it is spoken or signed, is an information system that is of an order of magnitude more complex than, say, a classificatory system used for organising and counting goods. As such, visual codes that model language are going to be more complex than those that don’t. Complexity of the signal, however, should not be treated as synonymous with ‘more advanced’. An efficient code is one that models its target system with minimum redundancy and maximum effectiveness for the task at hand.
Information entropy
This brings me to what I think is probably the most important and valid aspect of Bentz and Dutkiewicz’s paper: the application of information entropy as a technique for measuring the information-potential of graphic codes, especially in the absence of a decipherment.
To oversimplify: information entropy is a way of measuring the ‘surprise factor’ in any sequence of signs. Or to put it even more simply, it’s a measure of relative repetition. If when presented with a page of apparently random capital letters you notice that the letter A is always immediately followed by the letter B, then you would say that the presence of A predicts a B every single time. Another way of putting this is to say the sequence ‘AB’ should really be counted as one sign instead of two signs because there is no contrast between them. But in another text, A might be followed by B about 50 percent of the time, or 10 percent. Other longer sequences of repetitions might be present or absent at a larger scale. Counterintuitively, the more random the sequences, the more ‘informative’ they are. By contrast, predictable sequences are ‘uninformative’. B is not telling us anything new if it always follows A.
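For readers who like to see the arithmetic, here is a minimal sketch in Python of the A-then-B example. It assumes the simplest textbook form of Shannon entropy over sign frequencies; the function names and the toy strings are invented for illustration, and this is not necessarily the measure that Bentz and Dutkiewicz apply in their paper.

```python
import math
from collections import Counter

def shannon_entropy(signs):
    """Entropy in bits per sign, based only on how often each sign occurs."""
    counts = Counter(signs)
    total = len(signs)
    # Each sign contributes p * log2(1/p), where p is its relative frequency.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def next_sign_entropy(signs, sign):
    """Uncertainty (in bits) about which sign comes immediately after `sign`."""
    followers = [b for a, b in zip(signs, signs[1:]) if a == sign]
    return shannon_entropy(followers)

# Toy text in which A is always followed by B: B carries no surprise.
always_b = "ABXABYABZABX"
# Toy text in which A is followed by B half the time and by C half the time.
half_b = "ABACABAC"

print(next_sign_entropy(always_b, "A"))  # no uncertainty about what follows A
print(next_sign_entropy(half_b, "A"))    # one bit: a coin flip between B and C
```

In the first string, knowing that you have just seen an A removes all uncertainty about the next sign, which is the sense in which ‘AB’ behaves like a single sign. In the second, each A leaves a coin flip between two possibilities.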
Consider the sentences you are reading right now in terms of a string of written signs. There is novelty at the level of individual alphabetic letters, as well as spatially separated words and their syntax. But the text is not maximally random because it exhibits structure. The different layers of combinatoriality hit a kind of sweet spot between predictability and novelty, rendering it intelligible.
This structure can be measured with numbers. Richard Sproat has elsewhere demonstrated that information entropy can be applied to graphic codes of different kinds. In theory, codes that represent language will have a kind of ‘fingerprint’ of repetition that contrasts, for example, with accounting notations. In order to discriminate between the two, it’s necessary to figure out a baseline for linguistic codes and a corresponding baseline for non-linguistic codes. An advantage of this measurement, pointed out by Bentz and Dutkiewicz, is that it doesn’t matter whether the text you are measuring is in fact a ‘text’ at all. A string of marks that were intended by their maker to be decorative can also be measured, as could the pattern of droplets along a window sill. In a sense, it’s not really measuring information content so much as information-bearing potential.
The central analysis performed by the authors was to measure the entropy of a set of cuneiform texts representing the Sumerian language and compare this to earlier proto-cuneiform accounting notations from the same site that do not have linguistic content. They found that the Sumerian texts have higher information entropy. This result is predicted by the theory. In a separate analysis they measured the entropy of marks on the Aurignacian objects and found that their lower entropy corresponds to the lower entropy of the earliest non-linguistic texts from Uruk.
Writing is relatively easy to analyse in this way, since it is almost self-coding. You can feed text—in Chinese or French or Sumerian—into a machine and analyse the entropy. Non-linguistic systems, however, have to be patiently hand-coded, and they don’t always resolve into neat linear packages that are readable by a machine. Proto-cuneiform texts are not strictly linear so manual coders have to shoehorn them into a single defined sequence. Even if inter-coder reliability is high, linearisation bleaches out the non-linear structure at the level of tablet panels, as well as potential diagrammatic relationships between elements. The situation is just as tricky for the Aurignacian marks, some of which are already linear because the natural shape of a bone or an antler constrains the mark-maker in this direction, but others are more ambiguous. (I am, incidentally, very appreciative of their public SignBase database of marked Palaeolithic objects – it is so rare for excellent datasets like these to be shared.)
Relationships between somewhat-abstract and somewhat-iconic signs are also difficult to reproduce statistically. When looking at the coding decisions for the Adorant object, for example, the iconic figure is reduced to a single sign with just as much ‘informative’ power as any one of the repeated notches that surround it. Having said that, I’m not sure how else this could be done, but this limitation in itself is instructive and raises a more fundamental question: what kind of information is being communicated that resists capture in a linear sequence?
Simultaneous modalities and the context of communication
This brings me back to the question of what is missing from a scene that cannot be accessed by ‘being there’. First-hand observation of sequential marking in contemporary or near-contemporary societies reveals that the marked object in and of itself is often merely one node in a multifaceted communication. The example of Australian message sticks is suggested by the authors, and I believe this comparison has validity. I would expect that if a message stick were excavated after 40,000 years and analysed purely on its sequential markings, its information-density would line up with Uruk V and the Aurignacian objects.
However, real-world message stick communication is highly multimodal, involving spoken language, gesture, body paint, and contextual common ground including the expectations of social institutions like classificatory kinship. The interactions themselves are constrained by semi-standardised routines with structure and predictability. A messenger must signal his or her approach, camp at a respectful distance, maintain a certain demeanour, be invited to approach within speaking range, accept the invitation, and so on. Even from a distance, the messenger might already be identifiable, meaning that their known relationship to the recipient will constrain the kinds of messages that they can be expected to convey, as does their visible body adornment, or whether they are travelling alone or in a pair or a larger group. As such, the addressee of a message stick can already have a great deal of information before the message stick itself is delivered. The most information-dense or ‘unpredictable’ part of the communicative routine is the spoken interaction that happens when the object changes hands. This speech is declarative and intended to precipitate social action: eg, a request, a summons, a challenge, or a response to a previous demand. The spoken performance is anchored to the signs on the object that may be emphasised with pointing gestures and often involves more than one language within and between participants. As such, message sticks themselves can even be thought of as almost supplementary props within a multi-channel communicative routine. The objects do not, on the whole, extend the memory capacities of the person delivering the message. More accurately, message sticks are social warranting devices in which the marks verify and reinforce elements of a short verbal statement, just as the verbal statement reinforces the meaning of the marks.
Marks on certain message sticks sometimes encode information that can be expressed in a linear sequence such as quantities and dates. But just as often these marks can be diagrammatic or cartographic in their arrangements. Even apparently linear sequences may be misleading, for example on objects with undifferentiated ‘tally’ marks that appear numerical but which actually stand for stages in a journey, with each place distinguished through the oral explanation of the sender and messenger. In the absence of orality, the entropy measurement of this sequence would be very ‘uninformative’ and repetitive. But paired with speech, each apparently identical mark would be revealed to be maximally contrastive since no place or landmark would be understood as interchangeable with another.
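The tally-mark point can be illustrated with a small sketch in Python. Everything in it is hypothetical: the six identical marks, the place labels standing in for the oral explanation, and the use of simple sign-frequency entropy as the measure. It is meant only to show the direction of the effect, not to model any real message stick.

```python
import math
from collections import Counter

def shannon_entropy(signs):
    """Entropy in bits per sign, based only on how often each sign occurs."""
    counts = Counter(signs)
    total = len(signs)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical object: six undifferentiated notches, as a coder would record them.
marks = ["|"] * 6

# Hypothetical oral explanation: each notch stands for a different stage in a journey.
places = ["camp", "river crossing", "ridge", "waterhole", "grove", "destination"]

# The communicative event pairs every mark with what was said about it.
marks_with_speech = list(zip(marks, places))

entropy_marks_alone = shannon_entropy(marks)
entropy_with_speech = shannon_entropy(marks_with_speech)
ceiling = math.log2(len(marks))  # highest possible value for six signs

print(entropy_marks_alone)           # identical marks: nothing to distinguish
print(entropy_with_speech, ceiling)  # every mark distinct: entropy reaches the ceiling
```

Measured on the marks alone, the sequence has no entropy at all. Measured on the marks together with their spoken glosses, it reaches the maximum that six signs allow, because no two stages are interchangeable.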
What’s fascinating to me is that some of the first settlers to make records of message stick communication, from the 1870s onwards, explicitly compared them to marked objects like the Aurignacian material that were coming to light in European deposits at the same time. In both cases, they assumed that the objects instantiated a rudimentary form of writing and that the marks were intended as memory prompts. It was not until later that participant-observers were able to glimpse a much bigger picture after consulting directly with message-stick makers and messengers, and sometimes even participating in communications themselves.
One lesson I take from this historical mischaracterisation of message sticks is that popular terms of art like ‘Artificial Memory Systems’, favoured by Bentz and Dutkiewicz, may be misleading and set the analysis off on the wrong foot. Front-loading terminology like this already primes us to analyse the markings as a genre of writing system or at least a recording system, without evidence that this is how they were used. If the Aurignacian artefacts were anything like message sticks, then memory-reinforcement may have had little to do with it.
It strikes me that a more representative unit of measurement, or comparison, would be the entire communicative event and not simply the marked object in isolation. It’s easy for me to imagine that proto-cuneiform accounting at Uruk V involved complex oral interactions in which the tablet itself was an anchor point but was never intended as a self-sufficient container of information. (This would make them similar to Andean khipus that required orality and other common ground to be functional.) From this perspective, the later linguistic texts from Uruk would have a higher information-density primarily because much more of the information was carried in the written channel than in the oral channel. Since this foundational orality cannot be retrieved there is a survivor bias in our analysis of the marked objects, making it look as if the interactions involving written linguistic texts were more information-dense.
In other words, the lost interactions that accompanied Uruk V objects may well have involved more-or-less the same amount of information as the Uruk III objects when structured orality and other context are taken into consideration. Writing in the narrow sense is a game-changer only to the extent that it allows more information to be embedded in the visual channel than was previously possible, but it doesn’t supersede orality. As such, I am cautious about any evolutionary narrative that traces a unilinear movement from less-informative sequences to more-informative sequences. Instead, rather than an overall increase in information, what we might be seeing is a pendulum swinging between oral and visual channels, with each modality communicating different types and degrees of information.
Disclaimer
Prior to writing this post I accepted an invitation from one of the authors to speak at an event in Berlin that is funded by the Fritz Thyssen Foundation.