

Over the years, I’ve been using a variety of open-source software tools for solving all sorts of issues with PDF documents. This post is an attempt to (finally) bring together my go-to PDF analysis and processing tools and commands for a variety of common tasks in one single place. It is largely based on a multitude of scattered lists, cheat-sheets and working notes that I made earlier. Starting with a brief overview of some general-purpose PDF toolkits, I then move on to a discussion of the following specific tasks:

- Document information and metadata extraction
- Inspection of embedded image information

This selection was guided to a great degree by the PDF-related issues I’ve encountered myself in my day-to-day work. Some of these tasks could be done using other tools (including ones that are not mentioned here), and in some cases these other tools may well be better choices. So there’s probably a fair amount of selection bias here, and I don’t want to make any claims of presenting the “best” way to do any of these tasks. Also, many of the example commands in this post can be further refined to particular needs (e.g. using additional options or alternative output formats), and they are probably best seen as (hopefully useful) starting points for the reader’s own explorations.

All of the tools presented here are published as open-source, and most of them have a command-line interface. They all work under Linux (which is the main OS I’m using these days), but most of them are available for other platforms (including Windows) as well.

The tools in Xpdf are largely identical to those in Poppler, but don’t include pdfseparate, pdfsig, pdftocairo, and pdfunite. Also, Xpdf has a separate pdftopng tool for converting PDF to PNG images (this functionality is covered by pdftoppm in the Poppler version). On Debian-based systems the Poppler tools are part of the package poppler-utils.

Pdfcpu is a PDF processor that is written in the Go language. The documentation explicitly mentions that its main focus is strong support for batch processing and scripting via a rich command line. It supports all PDF versions up to PDF 1.7 (ISO 32000).

Apache PDFBox is an open source Java library for working with PDF documents. It includes a set of command-line tools for various PDF processing tasks. Binary distributions (as JAR packages) are available from the Apache PDFBox download page (you’ll need the “standalone” JARs).

QPDF is “a command-line program that does structural, content-preserving transformations on PDF files”.
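
As a quick way to see which of the command-line toolkits mentioned in this post are installed on a given machine, something like the following Python sketch might be a useful starting point. The executable names in `TOOLS` are assumptions about a typical installation (they are not prescribed anywhere in this post), and the helper name `find_tools` is made up for illustration. Apache PDFBox is left out, because it is distributed as a JAR rather than as an executable on the search path.

```python
import shutil

# Executable names are assumptions about a typical installation:
# pdfinfo, pdftoppm and pdfunite are Poppler tools; qpdf and pdfcpu
# are the usual names of the QPDF and pdfcpu binaries.
TOOLS = ["pdfinfo", "pdftoppm", "pdfunite", "qpdf", "pdfcpu"]


def find_tools(names=TOOLS):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per tool, showing where it was found (if at all).
    for name, path in find_tools().items():
        print(f"{name:10s} {path or 'not found'}")
```

A tool that shows up as "not found" may simply be installed under a different name or outside the search path, so the result is best read as a hint rather than as a definitive inventory.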
