
Google's Watermark Tech to Detect AI-Generated Text is Now Open Source

It's time we all get to detect AI-generated text easily, or at least most of it.

Just as AI has risen to great heights, so has the occurrence of misinformation and plagiarism of people's work. Some of the biggest names in the space have been taken to court over this, and things are only getting worse.

To remedy such problems with AI models, Google introduced a tool called SynthID back in 2023, which can be used to embed digital watermarks directly into AI-generated images, audio, text, and videos.

Now, Google has open-sourced its text watermarking component, SynthID Text.

SynthID Text Goes Open: What To Expect?

Screenshot: an excerpt from the "Introducing SynthID Text" announcement blog on Hugging Face.

Meant as a solution for identifying AI-generated text, SynthID Text can run alongside an LLM without affecting its performance, accuracy, or generation quality. It makes use of a pseudorandom function called the g-function that runs in the background, adding a watermark that can't be recognized by a human reader.
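To make the idea of a keyed pseudorandom function more concrete, here is a toy Python sketch. It is not Google's implementation: the names `g_value`, `generate`, and `detection_score`, the SHA-256 hash, the context length, the key, and the uniform random "model" are all assumptions made for illustration. It picks between two candidate tokens at each step, favoring the one the keyed function scores higher, and then detects the resulting bias by averaging those scores.

```python
import hashlib
import random

CONTEXT_LEN = 4  # how many previous tokens feed the g-function (assumed value)


def g_value(key: int, context: tuple, token: int) -> int:
    """Toy keyed pseudorandom function: returns 0 or 1 for a candidate token."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return digest[0] & 1


def generate(length: int, key=None, vocab_size: int = 1000, seed: int = 0) -> list:
    """Emit token ids; if a key is given, bias choices toward g_value == 1."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(length):
        context = tuple(tokens[max(0, len(tokens) - CONTEXT_LEN):])
        first = rng.randrange(vocab_size)  # stand-in for sampling from an LLM
        if key is None:
            tokens.append(first)
            continue
        second = rng.randrange(vocab_size)
        # Two-candidate tournament: keep whichever candidate scores higher.
        if g_value(key, context, second) > g_value(key, context, first):
            first = second
        tokens.append(first)
    return tokens


def detection_score(tokens: list, key: int) -> float:
    """Mean g-value over the text: near 0.5 without the watermark, higher with it."""
    bits = [
        g_value(key, tuple(tokens[max(0, i - CONTEXT_LEN):i]), token)
        for i, token in enumerate(tokens)
    ]
    return sum(bits) / len(bits)


SECRET_KEY = 2024  # hypothetical watermarking key
watermarked = generate(1000, key=SECRET_KEY)
plain = generate(1000, key=None, seed=1)
print("watermarked:", detection_score(watermarked, SECRET_KEY))
print("plain:      ", detection_score(plain, SECRET_KEY))
```

In this sketch, every individual token still looks like an ordinary sample, so a reader has nothing to spot. Only someone holding the key can recompute the scores and notice that they lean high across the whole text.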

Currently, it is being used by Gemini and Google's various other enterprise-focused online chatbots, and now, it can be implemented on other large language models (LLMs) too.

In a conversation with MIT Technology Review, Pushmeet Kohli, VP of Research at Google DeepMind, added that:

"Now, other [generative] AI developers will be able to use this technology to help them detect whether text outputs have come from their own [large language models], making it easier for more developers to build AI responsibly."

Before you go on to implement SynthID Text in your AI model, keep in mind that it has a few limitations.

For starters, even though it can safeguard against cropped text and modified words, the watermarking is less effective with factual responses. Similarly, it is ineffective when the AI-generated text is completely rewritten or translated into a different language.

Want To Check It Out?

If you want to dive into the technical details of SynthID Text, the technical paper is worth a read. On the other hand, if you just want an overview, then the official documentation and the announcement blog on Hugging Face are what you are looking for.

Google has also made a reference implementation and a demo of SynthID Text available.

