pypdf
py-pdf/pypdf
A pure-Python PDF library for merging, splitting, and encrypting files and for extracting text, well suited to anyone who wants to automate office tasks with code.
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
AI Summary
What This Project Does
Simply put, it is a PDF manipulation library for Python programs: it lets code read, edit, and save PDF files, much as a person would in a PDF editor.
What Problems It Solves
It removes the tedium of opening hundreds of PDF files by hand, for example when merging a pile of contracts into one file or extracting text from hundreds of reports for analysis.
Who It's For
Mainly Python developers, automation engineers, or anyone who wants to batch process documents using scripts.
Typical Use Cases
1. Automatically merge scattered monthly reports into an annual summary.
2. Automatically extract amounts and dates from a large number of PDF invoices.
3. Batch add access passwords to protect sensitive documents.
4. Extract images from PDFs and save them separately.
Key Strengths & Highlights
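Written in pure Python and easy to install, with no required third-party dependencies for core features (some features, such as AES encryption and image extraction, rely on optional packages). It is open source and free, and its active community makes solutions easy to find.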
Getting Started Requirements
You need to know how to write Python code: this is not a point-and-click application but a library that must be called from a program.
Purpose
Ideal for developers needing batch PDF processing or document handling features. If you don't code and just want manual tools, this isn't for you.
Project Info
- Primary Language: Python
- Default Branch: main
- License: NOASSERTION
- Created: Jan 6, 2012
- Last Commit: yesterday
- Last Push: yesterday
- Indexed: Apr 21, 2026