Transforms complex documents like PDFs into LLM-ready markdown/JSON for your agentic workflows.
Updated Mar 25, 2026 - Python
Get your documents ready for gen AI
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
borb is a library for reading, creating and manipulating PDF files in Python.
Open source Python library for converting PDF to DOCX.
A library for converting HTML into PDFs using ReportLab
A high-quality PDF to Markdown tool based on large language model visual recognition.
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
Document (Novel, Thesis, Subtitle) Translation Tool (Supports pdf/word/excel/json/epub/srt...)
📚 Process PDFs, Word documents and more with spaCy
Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc.) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.
Simple yet powerful automation stuff.
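Tianshu (天枢) - Enterprise-grade all-in-one AI data preprocessing platform | PDF/Office to Markdown | MCP protocol support for AI assistant integration | Vue3+FastAPI full-stack solution | Document parsing | Multimodal information extraction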
pdfCropMargins -- a program to crop the margins of PDF files
Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG
A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk
Interact with the Deep Search platform for new knowledge explorations and discoveries
Convert PDF files to nicely structured Markdown and EPUB format with intelligent layout detection using AI.
Batch convert Word, Excel, PPT to PDF
Self hosted file converter for images, video, audio, json, excel and more. Supports over 2,000 conversions!