Google Just Open-Sourced its AI-Powered Tool 'Magika' to Help Step Up the Cybersecurity Ecosystem

This AI tool is open-sourced by Google and available on GitHub.

Google is a name that most of us are familiar with. Even though they are known for making the headlines, they do support an impressive lineup of open-source projects that have shaped how we experience the internet today.

Now, with the launch of their AI Cyber Defense Initiative, they have open-sourced Magika, their AI-powered file-type identification tool, in a bid to help others take advantage of its capabilities and build upon it.

Magika: What Is It?

a screenshot of the magika website banner

Magika is a tool that can be used to detect the most commonly used file types such as PNG, JPG, PDF, APK, and quite a few others by using the power of artificial intelligence.

Google claims that it outperforms traditional file-identification tools and methods, with an average precision of over 99%. The most obvious use case is in cybersecurity, but more on that later.
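To get a feel for what "traditional methods" usually means here, below is a minimal sketch of signature-based ("magic bytes") detection. This is not how Magika works, nor a faithful copy of any existing tool; the `SIGNATURES` table and the `guess_by_signature` function are hypothetical names made up for this illustration.

```python
# Simplified illustration of signature-based file identification.
# SIGNATURES and guess_by_signature are hypothetical, for illustration only.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),  # PNG files start with this 8-byte signature
    (b"\xff\xd8\xff", "jpeg"),      # JPEG files start with these 3 bytes
    (b"%PDF-", "pdf"),              # PDF header
    (b"PK\x03\x04", "zip"),         # ZIP local file header (APK files are ZIP archives)
]


def guess_by_signature(data: bytes) -> str:
    """Return a label if the data starts with a known signature, else 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    print(guess_by_signature(b"%PDF-1.7\n"))      # matches the PDF header
    print(guess_by_signature(b"PK\x03\x04..."))   # an APK would also land here
    print(guess_by_signature(b"print('hello')"))  # source code has no signature
```

A lookup like this is fast, but it cannot tell an APK from any other ZIP archive, and it has nothing to go on for plain-text formats such as source code. A model trained on file contents is meant to close that kind of gap.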

Magika didn't just appear out of thin air. Google had already been using it internally with Gmail, Drive, and Safe Browsing to route files to the relevant security and content policy scanners.

All of that is possible thanks to a custom, highly optimized deep-learning model that was trained using Keras and weighs only about 1 MB.

Inference is also quite fast thanks to ONNX, taking just a matter of milliseconds, similar to non-AI tools, even when running on a CPU.

a bar graph that shows the average file types f1 score of magika compared to others
Source: Google

They also shared some helpful benchmarks that compared Magika against other tools. On a benchmark of 1M files with over 100 file types, its average F1 score was about 20% higher than that of the other tools.

Helping the Cybersecurity Game

A tool like Magika can be a very potent thing to have by your side, as file scanning at such speeds was previously unheard of. Open sourcing this has opened the door for many security-focused services and products to use this as a reliable component in providing better security to their customers.

Google itself has already begun work on integrating Magika into VirusTotal, the online service it acquired in 2012 that helps analyze suspicious files and URLs.

And, with Magika AI integration, they plan to further bolster its existing Code Insight functionality.

The official announcement blog has more details if you are up for it, and stick around a bit longer to learn how to try Magika.

How Can You Try It?

a screenshot of the magika demo in action

The most straightforward way to try out Magika is the demo hosted on the official website. As you can see above, it can easily distinguish file types for multiple uploaded files.

📋
The screenshot only shows the first file's result; the rest were below it.

If you want to run it locally, or on a server, then you can install it as a Python package:

pip install magika

Then, start it using the following command:

magika
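Since it is a Python package, it can also be called from a script. The snippet below is a minimal sketch, assuming the `Magika` class and its `identify_bytes` method as shown in the project's README at the time of writing; the `describe` helper is a hypothetical name used only here. The fields inside the result's `output` have changed between releases, so the sketch prints the whole object instead of picking individual fields.

```python
# Minimal sketch of using Magika from Python.
# Assumes the Magika class and identify_bytes() from the project's README;
# describe() is a hypothetical helper for this example.
try:
    from magika import Magika
except ImportError:  # the package is optional for this sketch
    Magika = None


def describe(data: bytes) -> str:
    """Return Magika's prediction for the given bytes as text."""
    if Magika is None:
        return "magika is not installed (try: pip install magika)"
    result = Magika().identify_bytes(data)
    # Field names inside `output` differ across versions, so keep it generic.
    return str(result.output)


if __name__ == "__main__":
    print(describe(b"# Example\nThis is an example of markdown!"))
```

For anything beyond a quick experiment, check the repo's documentation for the version you have installed.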

For command examples or official documentation, I highly suggest you give Magika's GitHub repo a visit. Though, at the time of writing, it had a weird disclaimer at the bottom that said: 😐

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.

My best bet is that either this is a mistake, or I am missing something here. Anyway, only time will tell which one is the correct assumption.

Another interesting bit 💡

When asked during a discussion over at Hacker News why they released a Node module for Magika, one of the co-authors, Elie Bursztein, said:

We did release the npm package because indeed we create a web demo and thought people might want to also use it. We know it is not as fast as the python version or a C++ version – which why we did mark it as experimental.
The release include the python package and the cli which are quite fast and is the main way we did expect people to use – sorry if that hasn't be clear in the post.

There also seem to be plans for a .deb and similar packages, which I spotted in one of the newly created issues on the Magika repo. It is nice to see that they intend to support Linux in more ways than one.

💬 What do you think of this move by Google? Was open-sourcing such a tool the right call?

