Up and running in seconds.
One install. Three lines of code.
$ cargo add pdf_oxide

use pdf_oxide::PdfDocument;
let mut doc = PdfDocument::open("paper.pdf")?;
let text = doc.extract_text(0)?;
let images = doc.extract_images(0)?;

One Library.
Create|Edit|Extract
Every PDF operation in a single dependency.
No wrappers, no subprocess calls, no C/C++/Java runtimes.
Create
Build PDFs from any source format.
- Markdown: Convert Markdown to pixel-perfect PDFs with headings, lists, tables, and code blocks.
- HTML: Turn HTML markup into structured PDF documents with full CSS layout support.
- Images: Single or multi-page PDFs from PNG, JPEG, and TIFF with automatic sizing.
- QR & Barcodes: Code128, EAN-13, UPC-A, and QR codes with configurable error correction.
- Builder API: Fluent PdfBuilder chain for page size, margins, fonts, metadata, and headers.
- Forms: Text fields, checkboxes, radio buttons, dropdowns, stamps, and watermarks.
Edit
Modify any part of an existing PDF.
- DOM Editing: Find text, replace content, and restyle, navigating the PDF like a web page.
- Pages: Rotate, crop, merge documents, extract page ranges, and reorder.
- Forms: Get and set field values, add or remove fields, flatten to static content.
- Annotations: Add highlights, notes, and links. Modify or flatten selectively.
- Images: Reposition, resize, and replace embedded images with exact bounds.
- Security: AES-256 encryption, passwords, and fine-grained permission flags.
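
To make "fine-grained permission flags" concrete, here is a minimal sketch of how the permission bits in a PDF encryption dictionary's `/P` entry can be decoded, following the bit positions in the PDF standard security handler. The `Permissions` type and its constants are hypothetical names used for illustration only; they are not PDF Oxide's API, which this page does not show.

```rust
/// Hypothetical wrapper around the signed 32-bit `/P` value of a PDF
/// encryption dictionary. Not part of PDF Oxide; illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Permissions(i32);

impl Permissions {
    // Bit positions follow the PDF specification, which numbers bits from 1.
    const PRINT: u32 = 1 << 2; // bit 3: print the document
    const MODIFY: u32 = 1 << 3; // bit 4: modify contents
    const COPY: u32 = 1 << 4; // bit 5: copy or extract text and graphics
    const ANNOTATE: u32 = 1 << 5; // bit 6: add or modify annotations
    const FILL_FORMS: u32 = 1 << 8; // bit 9: fill in existing form fields
    const ASSEMBLE: u32 = 1 << 10; // bit 11: insert, rotate, or delete pages
    const PRINT_HIGH_QUALITY: u32 = 1 << 11; // bit 12: high-fidelity printing

    /// True if the given permission bit is set in `/P`.
    fn allows(self, flag: u32) -> bool {
        // Reinterpret the signed value as raw bits before masking.
        ((self.0 as u32) & flag) != 0
    }
}
```

Under this sketch, a `/P` value of `-44` allows printing and copying but not modification or annotation, while `-3904` clears all of the permission bits listed above.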
Extract
Get everything out of any PDF.
- Text: Full-page text, styled spans with font metadata, or per-character positions.
- Images: Content streams, nested Form XObjects, and inline images with color spaces.
- Markdown: Clean Markdown or HTML with heading detection and table preservation.
- Forms: All field values and types. Export to FDF or XFDF. XFA analysis.
- Metadata: XMP, Dublin Core, page labels, catalog, and trailer dictionaries.
- Search: Full-text search with regex, case-insensitive, and whole-word modes.
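
As a rough illustration of what the whole-word and case-insensitive search modes mean, the sketch below scans a string of already extracted text using only the Rust standard library. The function `find_whole_word` is a hypothetical helper, not PDF Oxide's search API, and it treats any run of alphanumeric characters as a word, which is an assumption.

```rust
/// Hypothetical helper: byte offsets of every whole-word occurrence of
/// `needle` in `text`. A "word" is a maximal run of alphanumeric characters.
fn find_whole_word(text: &str, needle: &str, case_insensitive: bool) -> Vec<usize> {
    // Compare one candidate word against the needle in the requested mode.
    let matches = |word: &str| {
        if case_insensitive {
            word.to_lowercase() == needle.to_lowercase()
        } else {
            word == needle
        }
    };

    let mut hits = Vec::new();
    let mut start: Option<usize> = None; // byte offset where the current word began

    for (i, ch) in text.char_indices() {
        if ch.is_alphanumeric() {
            if start.is_none() {
                start = Some(i);
            }
        } else if let Some(s) = start.take() {
            // A non-word character closes the current word.
            if matches(&text[s..i]) {
                hits.push(s);
            }
        }
    }
    // Handle a word that runs to the end of the text.
    if let Some(s) = start {
        if matches(&text[s..]) {
            hits.push(s);
        }
    }
    hits
}
```

For the text `"PDF parsing: a pdf parser parses PDFs."` and the needle `"pdf"`, the case-insensitive mode returns offsets 0 and 15, and the case-sensitive mode returns only 15. `PDFs` is skipped in both because it is a different word.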
5× faster than every alternative.
Benchmarked on 3,830 real-world PDFs from 3 public test suites.
Node.js, Go, and C# share the same Rust core — expect matching numbers.
| Library | Language | Mean | p99 | Pass Rate | License |
|---|---|---|---|---|---|
| PDF Oxide | 7 languages | 0.8ms | 9ms | 100% | MIT |
| PyMuPDF | Python | 4.6ms | 28ms | 99.3% | AGPL-3.0 |
| oxidize_pdf | Rust | 13.5ms | 11ms | 99.1% | MIT |
| pypdfium2 | Python | 4.1ms | 42ms | 99.2% | Apache-2.0 |
| pdfminer | Python | 16.8ms | 124ms | 98.8% | MIT |
| pdfplumber | Python | 23.2ms | 189ms | 98.8% | MIT |
| pypdf | Python | 12.1ms | 97ms | 98.4% | BSD-3 |
| unpdf | Rust | 2.8ms | 10ms | 95.1% | MIT |
| pdf_extract | Rust | 4.08ms | 37ms | 91.5% | Apache-2.0 |
| lopdf | Rust | 0.3ms | 2ms | 80.2% | MIT |
Measured on 3,830 PDFs (veraPDF, Mozilla pdf.js, DARPA SafeDocs).
Single-thread, no warm-up, 60s timeout.
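
For readers comparing the Mean and p99 columns, the sketch below shows one common way to compute both from a list of per-file timings. It assumes a nearest-rank percentile; the page does not say which percentile method the benchmark used, and `mean_and_p99` is a hypothetical helper name.

```rust
use std::time::Duration;

/// Hypothetical helper: arithmetic mean and nearest-rank 99th percentile
/// of per-file timings. Returns `None` for an empty slice.
fn mean_and_p99(timings: &[Duration]) -> Option<(Duration, Duration)> {
    if timings.is_empty() {
        return None;
    }
    let mut sorted = timings.to_vec();
    sorted.sort();

    let total: Duration = sorted.iter().sum();
    let mean = total / sorted.len() as u32;

    // Nearest rank: the ceil(0.99 * n)-th smallest value, counting from 1.
    let rank = (sorted.len() * 99 + 99) / 100;
    let p99 = sorted[rank - 1];

    Some((mean, p99))
}
```

One property of these two statistics is that the mean can exceed p99 when fewer than 1% of files are extreme outliers. With 199 files at 1 ms and one file at 2 s, this function gives a mean of about 11 ms and a p99 of 1 ms.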
100% reliable. Zero surprises.
Tested on 3,830 PDFs from three independent public test suites.
Zero panics. Zero timeouts. Zero crashes.
without a single failure
vs PyMuPDF & pypdfium2
correctly rejected
The corpus covers every PDF version (1.0–2.0), encrypted files, malformed documents, CJK encodings, and DARPA SafeDocs security edge cases designed to crash vulnerable parsers.