For understandable reasons, anxieties about the disinformation potential of generative AI tend to be focussed on current events. That’s something I leave to others to worry about, since my focus is (as usual) on history. All of us have seen the potential of generative AI to generate convincing-looking historical images (so far, it seems to be mostly images, rather than purported historical texts), from photographs to ‘illuminated manuscripts’, and it is already noticeable that search engines are beginning to fill up with AI-generated fakes. This is bad. But it should be borne in mind that historical disinformation has always been popular on social media and was pushed for years before the advent of AI by unscrupulous clickbait accounts aggregating historical ‘facts’. One notorious example is the inspirational quotes from historical figures that regularly circulate on social media, which in almost all cases aren’t genuine (or are paraphrased beyond reasonable boundaries of meaning). So while I deplore the fact that Facebook is filling up with AI-generated ahistorical nonsense, it’s really just a supercharged version of a longstanding trend rather than a wholly new phenomenon. But how worried should we be about the long-term effects of generative AI on the discipline of history itself? This is the question that keeps me awake at night.
First of all, it would only be right to acknowledge that there are some genuinely groundbreaking applications for AI in history and archaeology. This is a useful as well as a harmful technology, and the use of AI to virtually unscramble and ‘read’ the carbonised scrolls from the Villa of the Papyri in Herculaneum, which cannot physically be unrolled, is perhaps the best-publicised example of such applications. AI could be tremendously useful for ‘reading’ through vast archives and identifying salient items for human study and translation – an oft-cited example is the vast archives of cuneiform tablets held by some museums, which are probably beyond the capacity of any humans to read in a reasonable timeframe, not least because there aren’t enough Assyriologists. Victorian newspapers and periodicals are another example. AI has real and useful applications in the digital humanities, so I would not want anyone to think that I am denigrating it wholesale. But it is undeniable that the potential of generative AI to deceive and mislead presents particular problems.
Several historians have made the point that, at this stage, trained historians are in the best position to fight off the challenge of AI-generated historical disinformation – not so much because we can recognise that an image is fake from internal clues (that’s becoming more and more difficult, although historians of costume and technology are still ahead of the game on this, as historical accuracy and temporal continuity aren’t gen AI’s strong suit), but because we insist on provenance. There is a lot of truth in this argument; experts in a field have a very good idea of what images and sources already exist, and so they are the hardest to fool with AI-generated fakes. This is especially true in fields where the source base is limited – although it becomes more complicated as we approach the 20th century, when available source materials become too vast for any one human being to assimilate. Take photographs of the Second World War, for example – people often discover (or claim to discover) lost caches of historically significant photographs in private family collections, because the idea of such discoveries is plausible. But such claims are now very easy to hoax using gen AI, and much harder for historians to detect and debunk than, say, someone hoaxing a medieval manuscript.
The challenge that gen AI poses to history is not disinformation per se, because people have always been willing to believe what they want to believe about history, and have always lapped up disinformation to support their preconceptions. The real challenge is the prospect that, over time, the barrage of AI will poison the well of genuine historical information because trained historians will slip and make mistakes, with potentially devastating consequences further down the line. History is full of cautionary tales of eminent historians taken in by hoaxes, from William Stukeley duped by Charles Bertram’s De situ Britanniae to Hugh Trevor-Roper duped by the Hitler Diaries. These are notorious examples, but the reality is that all of us historians rely, to a large extent, on the reliability of our secondary sources, and most of us slip up in small (and usually unimportant) details from time to time. In an ideal world we would all check and double-check everything against the archive, of course; but that isn’t the reality. Things need to get published, discussion needs to continue; and we are all reliant on the purity of the chain of information supply to allow us to do our job. This uncomfortable fact about the business of doing history – our reliance on secondary sources – is the reason why peer review matters, the reason why some journals and publishers are regarded as more reputable than others, the reason why we are more likely to trust publications with a more extensive critical apparatus, and so on. Scholarship is a trust-based community, and a pretty fragile one at that.
Most of us have heard tales of Wikipedia pranks that resulted in some nonsense made up by a teenager in Nebraska being solemnly repeated in peer-reviewed academic tomes, thereby amusingly exposing the laziness and credulity of academics. But the reality is that the reliability of scholarship is only as strong as its weakest link; I don’t really blame scholars for repeating Wikipedia-derived nonsense if they genuinely trusted the reputation of the scholar from whom they took it, and had no good reason to doubt its veracity. Because the reality is that we don’t check everything – much is taken on trust, especially if it’s something tangential to our own research that lies outside our specific area of expertise, but nevertheless merits a mention. But imagine that the integrity of scholarship is imperilled, not by a few teenagers pratting around on Wikipedia to catch the unwary, but by a never-ending barrage of AI-generated ahistorical slop. The probability that at least one historian will slip up and inadvertently poison the well, introducing a fake source into the chain of trust that is historical scholarship, is close to a certainty.
This is the real threat posed by gen AI to history – the possibility (indeed, perhaps the certainty) that the well will be poisoned. This problem will become ever more acute over time. Convincing generative AI is only a few years old, so every historian working today (even PhD students) received their training in a pre-AI world. But what about the next generation, who will grow up exposed to AI-generated disinformation? And what about the generation after that, for whom the errors of those who have slipped up and contaminated the supply chain of information are part of the background to their ‘knowledge’ of history? There is a dystopian scenario here, where the discipline of history itself breaks down and humanity’s history becomes wholly unknowable, because the AI slop has breached the dams of expert verification and everything has become mixed together. It would be a situation somehow worse than the suppression or denial of historical knowledge under Stalin or imagined in Orwell’s Nineteen Eighty-Four. When history is suppressed, it can be recovered; but we might be denied a history forever by the sheer abundance of pseudo-information and the impossibility of distinguishing genuine sources from fabrications in an AI-saturated society.
I don’t think we are necessarily heading this way. Historians are wising up, and wising up fast – although the low priority accorded to history and the humanities by universities and governments is a source of despair. There is an alternative future in which historians become more important than ever, as the arbiters of real historical information and the guardians of the purity of our flow of information. But in addition to the financial and political pressure on the survival of history as a discipline, other nefarious factors militate against such a future. One is the mystical power that some impute to AI actually to increase or improve our historical knowledge. There were early signs of this in the colourisation mania that swept social media about a decade ago, with many seemingly convinced that the capacity of sophisticated programs to ‘clean up’ and colourise photographs somehow recovered a ‘reality’ previously hidden by the defects of early photography. Peter Jackson’s film They Shall Not Grow Old (2018), which ironed out the staccato movement of old film and colourised it, is a case in point. I have no objection to Jackson’s film, but many responses to it mistook the illusion of veridicality for the actual recovery of lost reality. Yes, computers can make historical images seem more real. But they don’t actually add any more historical information; and if people are incapable of seeing this distinction, we’re in a bad place.
The big AI scandal in history doesn’t seem to have happened yet. Maybe that is because such scandals are unlikely to hit the headlines now – compared with the potential of AI deepfakes to swing elections and corrupt democracy, a scholar being hoodwinked by a fake source hardly moves the dial. Perhaps it is because historians are genuinely on their guard against this sort of thing happening. But it is clear to me that the current trust-based approach to secondary sources in scholarship is ticking down to catastrophe. The only solution would seem to be a radical ressourcement in historiography: a return to primary sources, and an intense focus on issues of provenance and verification. But this runs counter to the equally important imperatives of accessibility, digitisation and public communication. We live in an era of brilliant public history and of communicating high-quality historical research to the general public, but at the same time we may be standing at the precipice of the breakdown of the historical knowledge economy as we know it.
I really appreciate that you're writing about this and trying to convey the gravity of the situation. My field is language and literature, but in some ways we're in an even worse predicament—most of us still require substantial, and often extensive, historical context to teach our classes and develop our analyses, yet we have even less time to engage with (non-literary) primary sources and less expertise with archival research, and therefore we rely even more on secondary sources.
"Scholarship is a trust-based community, and a pretty fragile one at that"—it is indeed, and I think we need to be forthright about the status of AI in humanities scholarship: Whatever productive applications we occasionally find for it are insignificant compared to its potential for bringing about the dystopian scenario that you described.
While I agree that computers do not add information per se, I would argue that the improved clarity they can help create – in certain circumstances – can nonetheless convey more information to an audience. A fine distinction, I’ll admit, but still an important one.
Like Photoshop before it, Generative AI (as opposed to General AI, which is still very much fictional) can only create digital images and data – not real artifacts – so I am confident that archaeologists and historical researchers like yourself will still carry the day.