google/magika
Google's AI tool that identifies a file's real type from its content, quickly and precisely, so disguised files are caught.
Fast and accurate AI-powered detection of file content types
AI Summary
What This Project Does
Simply put, Magika is a tool that looks past file extensions. It uses a deep learning model to analyze a file's content and tells you whether the file is truly a PDF, an image, an executable, or something else.
What Problems It Solves
Reduces the security risk posed by disguised files. An attacker might give malware a .txt extension, but Magika classifies the file by its content rather than its name. It replaces older methods that only check filenames.
Who It's For
Perfect for security researchers, sysadmins, or developers handling bulk downloads. Ordinary users concerned about download safety can try it too.
Typical Use Cases
1. Automatically scan download folders to identify potential malware.
2. Quickly categorize and filter content in email or cloud drive systems.
3. Block dangerous formats when building file upload features.
4. Organize messy computer files by their real type.
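The first use case can be sketched as a folder scan that flags files whose extension disagrees with the detected content type. This is a minimal sketch, not Magika's own code: `find_mismatches`, `detect_by_magic`, and the `EXPECTED` table are hypothetical names introduced for illustration, and the stand-in detector checks only a few magic-byte prefixes so the example runs without any dependency. In practice the `detect` callable would wrap Magika, which classifies content with a trained model rather than a prefix lookup.

```python
from pathlib import Path
from typing import Callable, Iterator

# Hypothetical stand-in detector: a handful of well-known magic-byte prefixes.
# Magika itself uses a deep learning model, not a lookup like this.
MAGIC_PREFIXES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"MZ": "pebin",  # label chosen for illustration (Windows executable)
}

# Illustrative mapping from extension to the label we expect to see.
EXPECTED = {".pdf": "pdf", ".png": "png", ".exe": "pebin", ".txt": "txt"}


def detect_by_magic(path: Path) -> str:
    """Return a coarse content label based on the first bytes of the file."""
    with path.open("rb") as fh:
        head = fh.read(16)
    for prefix, label in MAGIC_PREFIXES.items():
        if head.startswith(prefix):
            return label
    return "unknown"


def find_mismatches(
    folder: Path, detect: Callable[[Path], str]
) -> Iterator[tuple[Path, str, str]]:
    """Yield (path, expected_label, detected_label) for suspicious files.

    A file is reported only when its extension has an expected label and the
    detector positively identified a different type.
    """
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        expected = EXPECTED.get(path.suffix.lower())
        if expected is None:
            continue
        detected = detect(path)
        if detected not in (expected, "unknown"):
            yield path, expected, detected


if __name__ == "__main__":
    for path, expected, detected in find_mismatches(Path("."), detect_by_magic):
        print(f"{path}: extension suggests {expected}, content looks like {detected}")
```

Passing the detector in as a parameter keeps the scanning logic independent of any one detection library, so the stand-in can be swapped for a Magika-backed function without touching the loop.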
Key Strengths & Highlights
Fast (about 5 ms per file), accurate (about 99%), and built on a small model. Google has tested it on Gmail and virus scanning workloads, which lends it credibility.
Getting Started Requirements
Requires a Python environment or an installed CLI tool. No API key is needed because it runs locally. A web demo is available for an instant trial.
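For the Python route, a minimal sketch of local use might look like the following. It assumes the `magika` package is installed (for example with `pip install magika`) and that `Magika().identify_path` returns a result whose `output` carries the label. The attribute name is an assumption that appears to differ between releases (`label` in newer ones, `ct_label` in older ones), so the hypothetical helper `label_of` accepts either. Check the documentation of the installed version before relying on it.

```python
from pathlib import Path
from typing import Any


def label_of(result: Any) -> str:
    """Extract the content-type label from a Magika-style result object.

    Assumption: the label lives at result.output.label (newer releases)
    or result.output.ct_label (older releases).
    """
    output = result.output
    label = getattr(output, "label", None)
    if label is None:
        label = getattr(output, "ct_label")
    # The label may be a plain string or an enum member; use .value if present.
    return str(getattr(label, "value", label))


def identify_file(path: Path) -> str:
    """Identify one file locally; no API key or network call is involved."""
    # Imported lazily so this module can be loaded without magika installed.
    from magika import Magika

    return label_of(Magika().identify_path(path))
```

Calling `identify_file(Path("download.bin"))` would then return a short label string for that file. When scanning many files, reuse a single `Magika()` instance instead of creating one per call as this sketch does.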
Purpose
Suitable for security scanning or file organization that requires precise content identification. Unnecessary if checking extensions is enough.
Project Info
- Primary Language: Python
- Default Branch: main
- License: Apache-2.0
- Created: Aug 22, 2023
- Last Commit: 1 month ago
- Last Push: 1 month ago
- Indexed: Apr 18, 2026