PDF tools

From Simson Garfinkel

Revision as of 16:31, 7 April 2023

PDF page manipulation

  • pdftk - combines, removes, and rotates pages in PDFs
  • pdfjam - resizes pages (by running through LaTeX)
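As a rough sketch of how these two tools might be driven from a script, the helpers below build command lines for pdftk (`cat` ... `output`) and pdfjam (`--paper`, `--outfile`). The function names and file names are made up for illustration, and both tools are assumed to be installed and on PATH.

```python
import shutil
import subprocess


def pdftk_merge(inputs, output):
    """Build a pdftk command that concatenates whole input files in order."""
    return ["pdftk", *inputs, "cat", "output", output]


def pdftk_keep_pages(source, page_ranges, output):
    """Build a pdftk command that keeps only the listed page ranges.

    Leaving a page out of page_ranges removes it, e.g. ["1-4", "6-end"]
    drops page 5.
    """
    return ["pdftk", source, "cat", *page_ranges, "output", output]


def pdfjam_resize(source, output, paper="a4paper"):
    """Build a pdfjam command that re-typesets pages onto the given paper size."""
    return ["pdfjam", "--paper", paper, "--outfile", output, source]


def run(cmd):
    """Run a command, failing clearly if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not on PATH")
    subprocess.run(cmd, check=True)
```

For example, `run(pdftk_merge(["a.pdf", "b.pdf"], "merged.pdf"))` would combine two files. Building the argument list separately from running it makes the commands easy to inspect before anything touches a file.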

PDF OCR

  • ocrmypdf - creates PDF/A files and runs tesseract
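A minimal sketch of calling ocrmypdf from Python through its command line, assuming ocrmypdf and tesseract are installed. The helper names and the default language are illustrative choices, not anything this page prescribes.

```python
import subprocess


def ocrmypdf_command(source, output, language="eng"):
    """Build an ocrmypdf command that OCRs source and writes a PDF/A file."""
    return [
        "ocrmypdf",
        "--language", language,   # tesseract language code
        "--output-type", "pdfa",  # ask for PDF/A output explicitly
        source,
        output,
    ]


def ocr_pdf(source, output, language="eng"):
    """Run ocrmypdf; raises CalledProcessError if the tool reports failure."""
    subprocess.run(ocrmypdf_command(source, output, language), check=True)
```

Something like `ocr_pdf("scan.pdf", "scan-ocr.pdf")` would then add a text layer to a scanned file.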

HTML to PDF

Other sources:

  • https://stackoverflow.com/questions/391005/convert-html-css-to-pdf-with-php

Extract text from PDF

  • pymupdf (python module)
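A short sketch of text extraction with pymupdf, assuming the module is installed (it is imported as `fitz` here; recent releases also accept `import pymupdf`). The helper names and the form-feed page separator are illustrative choices.

```python
def join_page_text(pages, separator="\f"):
    """Join the plain text of each page, separated by a form feed by default."""
    return separator.join(page.get_text() for page in pages)


def extract_text(path):
    """Return the text of every page in the PDF at path."""
    import fitz  # PyMuPDF, imported lazily so the helper above works without it

    # Iterating over an open document yields its pages in order.
    with fitz.open(path) as doc:
        return join_page_text(doc)
```

Calling `extract_text("paper.pdf")` returns one string for the whole document. Scanned PDFs with no text layer will come back empty unless they have been through OCR first.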