AI Handwriting Generation


Date: Sep 12, 2024

Category: AI

Reading time: 3 min

Source article: https://arstechnica.com/information-technology/2024/09/my-dead-father-is-writing-me-notes-again/

My dead father is “writing” me notes again.

A recent AI discovery resurrected my late father's handwriting—and I want anyone to use it.

Growing up, if I wanted to experiment with something technical, my dad made it happen. We shared dozens of tech adventures together, but those adventures were cut short when he died of cancer in 2013. Thanks to a new AI image generator, it turns out that my dad and I still have one more adventure to go. Recently, an anonymous AI hobbyist discovered that an image synthesis model called Flux can reproduce someone's handwriting very accurately if specially trained to do so. I decided to experiment with the technique using written journals my dad left behind. The results astounded me and raised deep questions about ethics, the authenticity of media artifacts, and the personal meaning behind handwriting itself.

Beyond that, I'm also happy that I get to see my dad's handwriting again. Captured by a neural network, part of him will live on in a dynamic way that was impossible a decade ago. It's been a while since he died, and I am no longer grieving. From my perspective, this is a celebration of something great about my dad—reviving the distinct way he wrote and what that conveys about who he was.

I admit that copying someone’s handwriting so convincingly could bring dangers. I've been warning for years about an upcoming era where digital media creation and mimicry are completely and effortlessly fluid, but it's still wild to see something that feels like magic work for the first time. It's tempting to say we're stepping into a new world where no form of media can be trusted, but in fact, we're being given further proof of what was always the case: Recorded media has no intrinsic truthfulness, and we've always judged the credibility of information by the reputation of the messenger.

This fluidity in media creation is perfectly exemplified by Flux's approach to handwriting synthesis. One of the most interesting things about the Flux solution is that the resulting handwriting is dynamic. For the most part, no two letters are rendered in exactly the same way. A neural network like the one that drives Flux is a huge web of probabilities and approximations, so the imperfect flow of handwriting is an ideal match. Also, unlike a font in a word processor, you can natively insert the handwriting into AI-generated scenes, such as signs, cartoons, billboards, chalkboards, TV images, and much more.

It's worth noting that neither I nor the person who recently discovered that Flux can reproduce penmanship was the first to use neural networks to clone handwriting—research into that extends back years—but it has recently become almost trivially inexpensive to do so using either a cloud service or consumer-level hardware if you have the writing samples on hand.

Here's how I brought a piece of my dad back to life.

The discovery

As a daily tech news writer, I keep an eye on the latest innovations in AI image generation. Late last month while browsing Reddit, I noticed a post from an AI imagery hobbyist who goes by the name "fofr"—pronounced "Foffer," he told me, so let's call him that for convenience. Foffer announced that he had replicated J.R.R. Tolkien's handwriting using scans found in archives online.

Foffer initially made the Tolkien model available for others to use, but he voluntarily took it down two days later when he began to worry about people misusing it to create handwriting in Tolkien's style. But the handwriting-cloning technique he discovered was now public knowledge.

Foffer's breakthrough was realizing that Flux can be customized using a special technique called "LoRA" (short for "low-rank adaptation") to imitate someone's handwriting in a very realistic way. LoRA is a modular method of fine-tuning Flux to teach it new concepts that weren't in its original training dataset—the initial set of pictures and illustrations its creator used to teach it how to synthesize images.

LoRAs are modular because you can mix and match the models (often just called "Loras") with the Flux base model. For example, you could combine a LoRA model for a certain type of handwriting with a LoRA trained on detailed photos of paper notebooks at the same time to achieve different results.
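To make that concrete, here is a toy sketch in Python/NumPy of the low-rank idea behind LoRA, not Flux's actual code: the frozen base weight stays untouched while each LoRA contributes a small update (B @ A) that can be scaled and stacked with others. The matrix sizes, rank, and scale factors below are illustrative assumptions.

import numpy as np

d_out, d_in, rank = 512, 512, 8            # rank is tiny compared to the layer size

W = np.random.randn(d_out, d_in) * 0.02    # frozen base-model weight (never retrained)

# Two hypothetical LoRAs: one for a handwriting style, one for paper-notebook texture
A_hand, B_hand = np.random.randn(rank, d_in) * 0.02, np.random.randn(d_out, rank) * 0.02
A_paper, B_paper = np.random.randn(rank, d_in) * 0.02, np.random.randn(d_out, rank) * 0.02

def apply_loras(W, loras, scales):
    """Return the base weight plus each scaled low-rank update (B @ A)."""
    W_adapted = W.copy()
    for (A, B), scale in zip(loras, scales):
        W_adapted += scale * (B @ A)
    return W_adapted

# Mixing and matching: both LoRAs applied to the same base model at once
W_mixed = apply_loras(W, [(A_hand, B_hand), (A_paper, B_paper)], scales=[1.0, 0.8])
print(W_mixed.shape)   # same shape as W; only the behavior changes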

"I don’t want to encourage people to copy other’s handwriting, especially signatures," Foffer told me in an interview the day he took the Tolkien model down. But said he would help me attempt to apply his technique to a less famous individual for an article, telling me how I could inexpensively train my own image synthesis model on a cloud AI hosting site called Replicate. "I think you should try it. I think you'll be surprised how fun and easy it is," he said.

Why my dad’s writing was an ideal choice

Today's generative AI models are masters of imitation and translation. You can feed them a body of data, and they can create plausible replicas of it or translate concepts (such as visual styles) into novel scenarios. I decided to give training a custom LoRA a shot. But first, I would need samples—a training dataset of actual handwriting that I could feed into a custom model.

My handwriting is awful. I got literal zeros on many projects between grade school and college for my torturous, illegible scrawl. But I did think of someone I admired who had wonderful handwriting.

My dad was an electronics engineer, and he had a distinctive way of writing in all-caps that was instantly recognizable to me throughout his life. Many older engineers tend to write in all caps, and one theory is that it's because they learned to draft technical schematics on paper where all-caps is the labeling convention. Architects do something similar because uppercase is easy to read.

"Do you know why your dad wrote like that?" my mom said when I told her about my plans. "It's because he hated his own handwriting and copied his boss at work. His boss was also his mentor—another engineer—and he learned from him because he didn't have a lot of formal training."

My father also loved to journal and make written notes. I have a shelf full of his personal and engineering notebooks behind me as I write this, rescued from a house that I maintained for my mom while she lived there as a widow. When I first told my mom, currently 76, about the concept of cloning handwriting with AI in general, she blurted out, "Great for criminals!" But then I told her I was experimenting with dad's handwriting because I love it and wanted to let others use it. "Go for it," she said.

I believe that if my dad were alive, he would volunteer anyway. Dad loved technology. He introduced me to computers, encouraged me to learn about them, and even let me run a BBS at age 11, which led to me writing this now. I think he would likely appreciate the tribute and being a novel part of AI history. He also had a good sense of humor, so he would probably find anything someone might write in his handwriting amusing.

And since my dad had passed away, I did not fear that someone could use his writing style to imitate him in a deceptive or fraudulent way, so he became a natural candidate for handwriting cloning with Flux. I began the task of assembling a "dad's uppercase" dataset.

A brief history of AI and handwriting

It's worth taking a slight detour here to note that automated handwriting inventions stretch back at least to the 1700s, the age of mechanical automatons. In the 1770s, Swiss clockmaker Pierre Jaquet-Droz created a doll mechanism called "the Writer" that could dip its quill into an inkwell and write text automatically with "superb" handwriting, according to the Chicago Tribune in 1985.

And using neural networks to model handwriting isn't new. In January 2023, we covered a web app called Calligrapher.ai that can simulate dynamic handwriting styles (based on 2013 research from Alex Graves). A blog post from 2016 written by machine learning scientist Sam Greydanus details another method of creating AI-generated handwriting, and there's a company called Handwrytten that sells robots that write actual letters, with pen on paper, using simulated human handwriting for marketing purposes.

What's new in this instance is that we're using Flux, a free open-weights AI model anyone can download or fine-tune, to absorb and reproduce handwriting styles. It's an unexpected side effect of the model's ability to stylistically imitate almost any type of visual media. Again, it's not an entirely new concept: Previously, researchers have explored using diffusion models (similar to the core technology of Flux) to produce or mimic handwriting in 2020, 2023, and 2024, among others, and one group wrote a paper about detecting diffusion-generated handwriting in 2023. So even if Foffer had not shown how to do this with Flux in particular, the general concept has already been out there in the academic world for some time.

The training process

The process of teaching Flux to reproduce handwriting was surprisingly accessible. The technique is similar to a recently discovered method that allows people to insert custom typefaces into AI-generated images. To train Flux, I used ostris' "flux-dev-lora-trainer" hosted on Replicate.com. It's a cloud process that costs around $2 to $4 per training run; in other words, it can cost as little as $2 to clone a person's handwriting in the cloud. The training can also take place locally on a PC with a consumer-level RTX 3090 GPU over several hours.

To train a LoRA, you have to prepare samples as images. I found good examples of my dad's uppercase handwriting in his notebooks and scanned about 30 samples, then wrote 30 captions transcribing exactly what was written in each image. I was careful to clean up visual handwriting errors (using a cloning brush in Photoshop) that might look bad if reproduced. Some glitches add authenticity, but you probably don't want the AI model botching the same letter over and over because it learned a mistake from the training data.
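For readers who want to try something similar, here is a small sketch of one way to package such a dataset, assuming the common convention of pairing each scan with a same-named .txt caption file and zipping everything for upload. The folder and file names are hypothetical; check your trainer's documentation for the exact format it expects.

from pathlib import Path
import zipfile

dataset_dir = Path("dad_uppercase_dataset")      # hypothetical folder of ~30 scans + captions
archive_path = Path("dad_uppercase_dataset.zip")

with zipfile.ZipFile(archive_path, "w") as zf:
    scans = sorted(dataset_dir.glob("*.png"))
    for scan in scans:
        caption = scan.with_suffix(".txt")       # exact transcription of the handwriting in the scan
        if not caption.exists():
            raise FileNotFoundError(f"Missing caption for {scan.name}")
        zf.write(scan, arcname=scan.name)
        zf.write(caption, arcname=caption.name)

print(f"Packed {len(scans)} image/caption pairs into {archive_path}")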

After preparing the data, I uploaded it to Replicate and hit "train," then waited about 30 minutes. When it was done, I had a LoRA I could download and run locally, or I could run it in the cloud on Replicate's servers (this costs about 2–3 cents per generated image). Slightly nervous, I typed in an example of what I'd like to see this fake version of my dad write. This was the first result.
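For the curious, the cloud workflow looks roughly like the sketch below, using the Replicate Python client. The version hash, destination model name, and the trainer's input field names (input_images, trigger_word, steps) are my assumptions about the current API; check the flux-dev-lora-trainer model page on Replicate for the exact values before running it.

import replicate

# Kick off training: point the trainer at the zipped dataset.
training = replicate.trainings.create(
    version="ostris/flux-dev-lora-trainer:VERSION_HASH_FROM_MODEL_PAGE",
    input={
        "input_images": "https://example.com/dad_uppercase_dataset.zip",  # hosted copy of the dataset
        "trigger_word": "d4dupp3r",   # keyword that will activate the style in prompts
        "steps": 1000,                # see the step-count comparison below
    },
    destination="your-username/dads-uppercase",  # the Replicate model that receives the LoRA
)
print(training.status)

# Once training has finished, generate an image with the new model.
output = replicate.run(
    "your-username/dads-uppercase:TRAINED_VERSION_HASH",
    input={"prompt": "a handwritten note on ruled paper in d4dupp3r style that says HELLO WORLD"},
)
print(output)   # typically a URL (or list of URLs) pointing to the generated image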

I also downloaded the model and ran it on an RTX 3060 and later a 3090 using a quantized (simplified and size-reduced) version of Flux.1 dev (the full technical name of the model). It produced similar results, but they weren't quite as detailed due to the reduced complexity. As a result, I generated almost all of the images you see here on Replicate using the full Flux.1 dev AI model.
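I won't detail my local setup beyond that, but one plausible way to run a trained LoRA locally is with Hugging Face diffusers, sketched below. The repo name, LoRA filename, and memory-saving choices (bfloat16 plus CPU offload rather than a fully quantized checkpoint) are assumptions, and the gated FLUX.1 dev weights require accepting the license on Hugging Face first.

import torch
from diffusers import FluxPipeline

# Load the base model (gated; requires a Hugging Face token with the license accepted).
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("dads_uppercase.safetensors")  # hypothetical local LoRA file
pipe.enable_model_cpu_offload()   # keeps VRAM use within reach of a 3060/3090

image = pipe(
    prompt="a handwritten note on ruled paper in d4dupp3r style that says HELLO WORLD",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("dads_handwriting.png")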

Ultimately, I trained three LoRAs to compare results: one with 1,000 training steps, one with 2,500 steps, and one with 4,000 steps. What I learned is that the LoRA captured the details of my dad's handwriting noticeably more accurately at 2,500 steps than at 1,000 steps. Also, even though the dataset did not include any lowercase text (though I did experiment with that separately in a fourth LoRA), the 2,500-step model could infer what my dad's lowercase handwriting looked like far more convincingly than the 1,000-step model could.

Also, the quality of the results with about 30 writing samples did not improve notably at 4,000 steps of training. Generally, if you provide more samples, you want to give the training process more steps to make sense of them. But too many steps may produce what is called "overfitting," which means that a concept has been overlearned in a way that approaches memorization instead of a style transfer. Overfitting can reduce the model's ability to tailor the handwriting to novel situations, such as applying it to a chalkboard instead of a ruled piece of paper found in the training data images.

Seeing his “written voice” again

I felt joy to see newly synthesized samples of Dad's handwriting again. They read to me like his written voice, and I can feel the warmth just seeing the letters. I know it's not real and he didn't really write it, so I personally find it fun.

After training the AI model, I enjoyed creating silly messages from my dad as if he were actually writing to me from beyond the grave. They're funny to me because he and I shared the same sense of humor. I discovered that Flux can render his handwriting in many different forms of media, including neon signs, tattoos, and even clouds in the sky.

I also found a potential practical use for this new handwriting generator, suggested to me by Ars Technica colleague Ashley Belanger: I could use my dad's writing to produce new adhesive labels for files or storage boxes that I could print out, just to keep a little reminder of my dad around me every day—and because his handwriting was much better than mine. I haven't done that yet, but here are some simulated examples.

Another practical use may be for someone who has developed a disability that no longer allows them to write. This person could potentially train a LoRA on samples of their previous handwriting and use it to synthesize new examples of writing in their "written voice."

Examples of failed generations

Even with the coherent generations shown throughout this article, the results are not always perfect. Sometimes Flux repeats or garbles words, and sometimes new words are confabulated into place. Long passages of text are particularly challenging. Here are a few examples of what failures looked like:

Sometimes it can take a few generations to get the results right, but on the whole, it's most impressive with short passages of text. And in general, compared to earlier image synthesis models that didn't even render generic text correctly, Flux produces uncannily accurate results most of the time.

Ethical considerations and potential misuse

In the wider world, deceptive AI-powered voice cloning and appearance cloning are already causing legal trouble. So while the ability to re-create the handwriting of my beloved father presents interesting technical possibilities, the technique also raises important ethical questions. The technology's potential for misuse can't be ignored, particularly when it comes to replicating the handwriting of individuals without their consent.

It's possible that someone with harmful motives could gather many samples of someone's handwriting and train an AI handwriting model to potentially fool others or commit fraud. But it may not always be practical unless the person has an archive of handwriting found online, like Tolkien. As someone on Reddit wrote in response to the Tolkien handwriting clone, "You'd have to get a hold of multiple examples of their signature or handwriting. In which case it's almost always guaranteed that the culprit is someone close to you."

It's worth emphasizing that in most legal jurisdictions, creating false documents or signatures for illegal gain (called forgery) is a crime. People have been forging handwriting or signatures for literal millennia, long before the age of generative AI. According to a book on the history of forgeries, the Roman Empire developed laws against document forgery as early as 80 BC. Like recent examples in the tech industry where people end up reinventing things that already exist, certain people may also be tempted to treat potential crimes of AI-fueled deception as entirely new.

Most of those issues apply to the living, but it should also be obvious that dead people cannot offer consent to have their handwriting cloned. As far as I can tell, the best form of consent we have in this scenario is from legal heirs, which was obtained in this situation. I also knew my dad well and believe he would not mind having his handwriting shared with the world this way, but this may be a unique scenario.

Ultimately, the genie is already out of the bottle due to the aforementioned years of research on AI-generated handwriting. Other cloning techniques will no doubt follow, but new regulation may not be necessary: forgery is already illegal, AI or not. Still, we can raise awareness that this technique exists.

There are also cultural implications. We already live in a world where truth and lies intermingle due to AI synthesis. I have called it the "cultural singularity"—the point at which fact and fiction in media become indistinguishable. But while I love to act like I've discovered something new, maybe we already hit that point the first time someone forged a cuneiform tablet in ancient Babylon.

Part of my father will live on

Philosophical questions also come to mind from this experience. What does it mean if a machine can write just like my dad? Somewhere in the neural network's weights, there is an approximation of my dad's writing habits. Does that mean there is a piece of him in there somehow? If it looks like a duck and writes uppercase letters like a duck, is it a duck?

No doubt there are mechanisms at play wholly unlike my father's mind, but the rules that once let his brain guide his hand to write have somehow been studied and replicated by a machine. That's kind of wild.

I welcome you to download and use the "Dad's Uppercase" model yourself on either Replicate or Civitai with no restrictions of any kind, aside from those found in the Flux.1 dev license, which I do not control. To activate the style with the best results, you'll need to use the keyword "d4dupp3r" in your prompts. You can even build off of it if you like. I feel that even though it's not scientifically true, it's philosophically true that as long as that model is around, part of my dad is still around as well.

And finally, permit me the indulgence of writing a note in my dad's hand that I know he'd write if he were still alive today. I've kept this image on my computer desktop for the past few weeks, and it still makes me smile when I see it.

What you see above is not a true document, but it represents a true idea. I believe we'll need to keep that in mind as more people begin to use AI synthesis to compose different kinds of media artifacts. Whether electronic bits of an email or ink on animal parchment, the "authenticity" of the medium of transmission may be secondary; it's the context, the motives of the messenger, and the veracity of the idea communicated that primarily matters.

My motive in creating the note was not to deceive anyone but to remind myself that my dad loved me. Thanks, Dad, for everything. For encouraging my curiosity in tech, for having better handwriting than me—and now, for sharing that handwriting with the world.
