AI and Handwriting Recognition.

As a card-carrying AI hater, I feel it my duty to point out when it’s actually useful, and Dan Cohen presents such a case:

“All goes in the usual monotonous way.” That is the depressed sigh of George Boole in a letter to his sister Maryann in 1850. It was the spark for my book Equations from God: Pure Mathematics and Victorian Faith. Boole, the English mathematician who gave us the logic at the heart of the digital device you are reading this on, was teaching in Cork, Ireland at the time. On a cold December day, he wrote to Maryann about his feelings of profound loneliness. In a city that was on edge from religious strife and famine, he played piano at home to an empty room, and took long walks by himself. At the end of the day, he retreated to his equations, which seemed to transcend the petty differences of humanity.

But before developing my thesis about the fervent emotions behind Boole’s seemingly cold mathematical logic, I first had to read his damn handwriting. Talk about monotony! There were hundreds of letters and notebooks in his drifty scrawl. In retrospect, Boole’s handwriting is actually not that bad; I’ve encountered far worse since reading his in Cork. And it helped that I had taken a brief course on paleography, the art of deciphering handwritten historical documents. But it would have saved me a lot of time getting to the interesting interpretive phase of my research if a computer could have converted his handwriting into machine-readable text, as it already could for typeset text through a process called optical character recognition (OCR).

Since I wrote that book, university and industry labs have been trying to solve the incredibly difficult problem of handwritten text recognition (HTR). OCR quickly approached 99% accuracy for digitized books, whereas even the best HTR systems struggled to reach 80% — two incorrect words out of every ten. The issue is obvious: unlike the rigorous composition of books, handwriting is highly variable by author, and words are often indeterminate and irregularly arranged on a page.

He uses George’s letter to Maryann as a test, which most approaches fail; then he hits the jackpot:

They have gotten incrementally better over the past three years, but I was frankly stunned when I put the letter into Gemini 3 Pro this week and asked it to have a go at the transcription […] Gemini transcribed the letter perfectly: it figured out that the right side is the beginning of the letter, not the left (the letter actually continues on the other side of the paper, which accounts for the discontinuity between the two sides we are viewing); it left off the periods where Boole also (oddly) omitted this punctuation; and it includes a self-reflective analysis of where it might be wrong and provides alternative readings.

Even wilder, when you click on a “show thinking” tab, Gemini provides a long discourse on its approach and minute details about word choices […] This thinking goes on for almost 2,000 words, and what’s remarkable is that it is essentially a verbalization of what you’re taught to do in a paleography class: assess the overall document first, determine key features, study letter shapes and strokes across the letter to refine your understanding of the particular script, consider context and word/phrase possibilities, think about the coherence of content, grammar, and usage, identify any contractions, proper names, and other oddities, etc.

He tests Gemini on increasingly harder materials, and the results are impressive. By all means click through and read the whole thing, with the examples; as he says:

At this point, AI tools like Gemini should be able to make most digitized handwritten documents searchable and readable in transcription. This is, simply put, a major advance that we’ve been trying to achieve for a very long time, and a great aid to scholarship. It allows human beings to focus their time on the important, profound work of understanding another human being, rather than staring at a curlicue to grasp if it’s an L or an I. Could we also ask Gemini to formulate this broader understanding? Sure we could, but that’s the line that we, and our students, should resist crossing. The richness of life lies in the communion with other humans through speech, the written word, sounds, and images.

Comments

  1. Jen in Edinburgh says

    I have nothing particular to say about AI (although wouldn’t transcriptions of the letters already exist in the material it’s scraped?), but I don’t think it says ‘try’ (or ‘by’) at all – I think he just ran out of room and put the ‘ly’ of ‘occasionally’ on the next line.

  2. Since I have done a fair bit of manuscript transcribing, I gave it a try, with some non-English handwritten texts. It did almost perfectly on them, though it is quite slow (about a minute a page).

  3. David Marjanović says

    Image recognition is something AI has “long” been good at. I’m still a bit surprised to find this in an LLM specifically, but not that much…

  4. I should add that the few errors it did make betray its AI-ness, like making up a crossed out word where only one letter was present.
    The texts were unpublished btw, so no chance of having cribbed the texts, and were arguably less clear than Boole’s letter.

  5. Some more reviews and tests from Mark Humphries, a humanities-AI enthusiast, on his Generative History Substack: here and here.

    It doesn’t bother me too much, personally, to be able to take a shortcut to transcription. It’s satisfying but hard work, and takes a long time. The generated illustrations are inexcusable though.

  6. I have a counterexample to the brilliance of AI in language-related matters. I get frequent emails from Crate and Barrel telling me about their latest products, sales, special offers and whatnot. Lately, gmail adds a little note at the top of these emails that says “it looks like this message is in Greek.” I have scoured a few of the emails looking for hellenic cues, but have come up empty. I tried clicking the ‘translate’ button and, as far as I can tell, nothing changes. So I can’t figure out what the AI is latching onto.

  7. AI is certainly fundamentally as stupid as any computer, but specifically, the model used to add useless comments to webpages is a quick-and-dirty one, not the big, slow, expensive one used for cutting-edge transcriptions.

  8. Seong of Baekje says

    Amazing. Makes you wonder how much mileage we are still going to get out of LLMs and their cousins before we hit a wall.

  9. Nat Shockley says

    This is very good to see. I’ve been expecting this from AI ever since GPT made its appearance. Transcribing manuscripts correctly is obviously something that this technology should be very good at doing.

    Now they just have to learn to do it with medieval manuscripts in Latin… Maybe they already can. I haven’t yet had time to play with them to test that.

  10. AI handwritten text recognition has been around for some time; an example is Transkribus. (AFAIK it has been applied to 17th-century manuscripts.) Perhaps Gemini has integrated an existing AI model?

  11. The article goes into all that; if the subject interests you, I suggest you actually read it. Another excerpt:

    Later, the tranScriptorium Project and Transkribus used neural networks to improve accuracy, but required a significant training corpus.

    Even with considerable prep work, error rates remained stubbornly high. Here’s Transkribus’s best guess at George’s letter to Maryann, above:

    (There follows a bad transcription.)

  12. Thanks for reminding me to read first. Several years ago I was involved as a volunteer in a crowdsourced project to review Transkribus results, and IIRC they were already rather good, but I have no idea of the amount of AI model pre-training that went into it.

  13. I’ve tried to use Transkribus a few times in my research. It’s never returned anything usable, and quite often, it’s returned no results at all.

    All-purpose LLMs tend to produce junk transcriptions in my experience. I have given up offering help on Reddit with genealogical transcriptions because the people looking for help can’t tell the difference between a human expert’s work and AI hallucinations—and I’m tired of arguing with people.

    FamilySearch now offers full-text search*, and both it and Ancestry offer AI-driven transcription of documents. These are usually better than what the generic LLMs produce. FamilySearch also has a project of human volunteers reviewing their AI captures of names which probably has greatly improved its results.

    * I’ve read that Ancestry is in the middle of a slow rollout of its own full-text search, but my account doesn’t have that feature yet.

  14. Yes, Transkribus in the EU is a well-known specialized project to use machine learning to decipher handwriting which goes back before the cult at OpenAI started their Great Work. I think the project focuses on the 16th and 17th century so no surprise that it has trouble with nineteenth-century handwriting.

    Be wary of the possibility that the Boole to Maryann letter is already transcribed somewhere that the slop machines scrape up. They have also paid specialists like paleographers to feed them samples of what a competent paleographer sounds like.

  15. David Eddyshaw says

    My late father-in-law (a university professor) had handwriting only comprehensible by first-degree relatives and by his (long-term) secretary. (I’m really not exaggerating.)

    My own father’s – entirely cursive, and rather beautiful – handwriting is almost the most legible I have ever encountered. He’s never been tolerant of anything that might impair the clear communication of his opinions.

  16. Are jokes about doctors’ and pharmacists’ handwriting still current in this day and age?

    Sean: the links I gave above, at Generative History, address this. He got similarly good results on manuscript material that was never published (which is why he was working on it), and in fact had been miscatalogued.
    Transkribus has a number of models available, including ones for 19th century English. The model I tried had been trained on 16th through 19th century materials in several languages, IIRC.
    Google is certainly capable of producing good software, at least until the marketing/managerial people get their grubby hands on it. I believe that PyLaia, the engine underlying Transkribus, is the result of one person’s PhD project, and has been sporadically maintained. Not that that’s bad, but a trillion-dollar company working hard to develop a competing product is likely to be a serious contender, at least.

  17. David Marjanović says

    Are jokes about doctors’ and pharmacists’ handwriting still current in this day and age?

    I’ve only seen cartoons*, not jokes, but doctors’ signatures absolutely live up to the stereotype very often.

    * Doctors protesting in the street, holding signs with illegible scrawls on them.

  18. When I have the chance, I’ll try it on Horace Greeley’s handwriting. Mark Twain once built a whole ridiculous plot around Greeley’s awful scrawl.

  19. “Are jokes about doctors’ and pharmacists’ handwriting still current in this day and age?”

    Several years ago the State of New York prohibited the use of handwritten prescriptions. Now, they have to be typewritten and sent to pharmacies, etc. by email.

  20. Caught me just at the tail of a weekend when I was trying to figure out some postcards sent amongst my ancestors in the early 20th century in Yiddish. I sicced three AI models on them, and they all failed miserably. I shall consult a speaker instead.

  21. Yuval, was Gemini 3 one of them? That is the one they were raving about.
    It wouldn’t surprise me, though, if it were only that good with Latin-script materials. The custom handwriting recognition programs, like Transkribus and Kraken, have been used with Arabic, Hebrew, and many more minor scripts.

  22. Gemini was, and I just tried with 3, it’s somehow worse: much more eager to hallucinate in order to give *some* answer, whereas the others were very apprehensive. But one of its fibs got me in an interesting direction: that’s a עיה”ק, all right, but it ain’t Jerusalem, although maybe Safed?

  23. I translated a hand-written diary and some letters in various hands from Dano-Norwegian of the period 1870-1943 a year or two ago. It was a favour for a part-Norwegian friend who wanted to understand her family archive. At first I found the handwriting very hard indeed, partly because my Norwegian and a fortiori my Dano-Norwegian wasn’t good enough, and partly because of the great variations in spelling. People were moving towards modern Norwegian from Danish spelling conventions, and some of them seemed not to care very much about standards and dictionaries, which is fair enough, but it made it almost impossible for me to look up words I didn’t know.

    At that time AI was no help with the handwriting, but produced a very good translation of a typewritten document from the 19th century containing a number of trade and technical terms.

    The work gave me great satisfaction because, although necessarily imperfect, it was found acceptable by the Norwegian members of my friend’s family.

    I used Larsen’s ‘Dictionary of the Dano-Norwegian and English Languages / Dansk-Norsk–Engelsk Ordbog’ (Copenhagen, 1888), which is excellent apart from having all the headwords in Fraktur, which takes some getting used to: for example, capital N is very similar to capital R in the typeface used in this book.

  24. Athel Cornish-Bowden says

    Are jokes about doctors’ and pharmacists’ handwriting still current in this day and age?

    I don’t know about jokes, but it’s a simple observation that my doctor’s handwriting is almost impossible to read, though the pharmacist seems to understand it. Most French signatures are completely illegible, but apart from that the usual standard of handwriting isn’t bad at all.

  25. Handwritten prescriptions? In Germany, all prescriptions are supposed to be digital. No handwriting involved at all. And before that, they were nicely printed out, the only thing handwritten being the doctor’s signature — and signatures are supposed to be illegible, the more legible a signature, the easier it is to fake it.

  26. David Eddyshaw says

    Doctors are often called upon to write their signatures many, many times a day. Degeneration tends to follow.

    Question I was once asked: “Is it your signature that just looks like a big capital R?” Me: “Yes. Yes, it is.”

  27. that my doctor’s handwriting is almost impossible to read, though the pharmacist seems to understand it

    That’s the way I heard it: only pharmacists can read doctors’ handwriting, and no one can read pharmacists’ handwriting.

  28. Doctors are often called upon to write their signatures many, many times a day.

    I’m curious — what are you signing? Internal paperwork? I haven’t seen a doctor’s signature on anything for years.

  29. David Eddyshaw says

    Internal paperwork?

    Yes. Mounds of it. Hospital notes, letters, acknowledgments that you’ve seen other people’s letters, investigation request forms …

    (Not so much prescriptions, as management decided years ago that it was better to make the patient go to their own GP rather than pick up what you’ve just prescribed for them in the pharmacy of the hospital they’re already at.)

    Though part of this has changed over the past few years as things like hospital notes have been computerised (making the whole process much slower.*) Perhaps, in the fullness of time, doctors’ signatures will revert to the mean.

    * The plus side of computerisation is that the wise and beneficent Peter Thiel now has all our medical data.

  30. I’ve heard of doctors here complaining that digitization has become a huge burden. Although I don’t know how doctors wrote up their notes in olden times — quill pens, pots of ink, and parchments?

    Speaking of AI, I saw my hematologist* a couple of weeks ago, and she asked if I would be willing to try a new system in which our conversation would be recorded and AI would summarize it. It was supposed to save time, so I said fine. So now everyone at Google or wherever knows my medical secrets.

    *Yes, I have my very own hematologist. Because I’m special.

  31. David Eddyshaw says

    The digitisation does, in all seriousness, have upsides: the infrastructure needed to take care of paper medical records and get them where they’re needed is considerable, and also seriously fallible. So the arguments in favour are not stupid, and frontline staff tend to be oblivious of such externalities.

    The problem is that the politicians and management who drive the process don’t have to live with the downsides. I’ve never encountered a digital notes system that isn’t significantly slower and significantly less flexible than the Old Ways. So you have less time to actually see and talk to patients, and you have to fit your notes into a template which, no matter how much work has gone into it, artificially constrains what you can record, sometimes seriously. These problems are not all due to the immaturity of the software and hardware: to a large extent they are intrinsic to the digitalisation methodology itself, which encapsulates fundamentally erroneous technocratic misconceptions about the nature of medical consultation itself.

    (You actually end up physically turned away from the patient much of the time, too, though that aspect has potential technological solutions if management can be convinced it’s worth the money. Good luck with that.)

    All this is exacerbated by what one can only call the gullibility of politicians and management in the face of the often unscrupulous providers of the technology. Fanciful claims of potential magic-bullet “savings” are accepted at face value: their inevitable failure to be realised is readily seen by these people as the fault of the people actually trying to use the systems – undertrained at best, and probably plain Luddite saboteurs, whose feedback regarding problems should be discounted as being offered in bad faith anyway.

    One is seeing the same sort of dynamic with “AI”: an “AI” vendor does not need to have a product that can do your job adequately (indeed, it would be a waste of money even to attempt to build one): the salesman only needs to convince your employer (who can’t do your job and doesn’t fully understand what you do) that the “AI” can do your job.

  32. Stu Clayton says
  33. David Eddyshaw says

    There was a university lecturers’ strike in the UK during which the protesting strikers supposedly chanted (as I was told by one who was involved at the time):

    “What do we want?”
    “Rectify the anomaly!”

    “When do we want it?”
    “Rectify the anomaly now!”

    EDIT: it’s actually true:

    https://en.wikipedia.org/wiki/Talk:Association_of_University_Teachers

  34. During a Hollywood screenwriters’ strike some years ago, protesters chanted:

    “What do we want?”
    “Residuals!”

    “When do we want them?”
    “Later!”

  35. Doctors are often called upon to write their signatures many, many times a day.

    I’m curious — what are you signing? Internal paperwork? I haven’t seen a doctor’s signature on anything for years.

    In NZ (which presumably closely follows the UK), Doctors are required to put a physical signature on a prescription — even if the patient is taking it to the Pharmacy only a few metres away. (GP practices are these days organised as ‘health hubs’ with all sorts of allied trades in the same building.)

    But also the prescription is electronically filed and that’s what the pharmacist relies on, so often the patient has to stand at the pharmacy counter waiting for it to ‘come through the system’.

    And I can get a repeat prescription just by phoning up the practice and requesting they send it electronically to my local pharmacy. In that case, the GP must authorise the repeat, presumably electronically. So I’m not seeing what benefit the signature gives.

  36. David Marjanović says

    DM: Doctors protesting in the street, holding signs with illegible scrawls on them

    I think that’s exactly the one! Direct link to the picture because Pinterest is bloatware.

  37. Handwritten prescriptions? In Germany, all prescriptions are supposed to be digital. No handwriting involved at all. And before that, they were nicely printed out,

    I can still remember handwritten prescriptions in Germany, into the 90s, before computers and printers became ubiquitous in practices.
