John Heidemann / Software / pdfselect

Pdfselect is a simple command-line tool to extract pages from a PDF file. It follows the psselect options.

Pdfselect exists because earlier tools no longer work well, for various reasons. The otherwise good pdftk has not been packaged for Fedora since 2014 due to licensing issues. Alternatives like pdfseparate and pdfunite (part of poppler-utils in poppler) are too low level. Pdf-stapler is another alternative, but with a challenging API.

Download and installation

Sorry, currently (2023-09-22) there is no proper installation support.

Just download the one file and make it executable. It requires the pypdf library (Fedora package python3-pypdf; I use version 4.2.0). (My understanding is that lowercase pypdf supersedes PyPDF2. It has similar APIs to PyPDF2 3.x.)

Download pdfselect-1.2.py (released 2025-03-31).

Older versions

The Pdfselect Manual

usage:

pdfselect [-h] [--pages PAGES] [--debug] [--verbose] [--even] [--odd]
             [--reverse] [--version]
             input_path [output_path]

select PDF pages from a document

positional arguments:

input_path
output_path

Select pages from a PDF file (first argument), writing to an output file (second argument), like psselect, but for PDF. (For people who are sad they can no longer install pdftk.)

Defaults to reading from stdin and writing to stdout if the arguments are omitted.

Pages can be given in several forms: as numeric values; as _N, which counts from the end of the document; as “begin” or “end”; as NAME, which matches a bookmark with that name; or as NAME_1, which matches the page before the bookmark with that name.
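To make the numeric forms concrete, here is a minimal Python sketch of how such a page list could be resolved into page numbers. It is not the parser pdfselect uses. The names resolve_page and resolve_pages are hypothetical, it assumes comma-separated items and dash ranges as in psselect, it assumes “begin” and “end” mean the first and last page, and it leaves out bookmark names entirely.

```python
def resolve_page(token: str, total: int) -> int:
    """Map one page token to a 1-based page number (hypothetical helper)."""
    if token == "begin":  # assumption: "begin" is the first page
        return 1
    if token == "end":  # assumption: "end" is the last page
        return total
    if token.startswith("_"):
        # _1 is the last page, _2 the one before it, and so on.
        page = total - int(token[1:]) + 1
    else:
        page = int(token)
    if not 1 <= page <= total:
        raise ValueError(f"page {token!r} is outside 1-{total}")
    return page


def resolve_pages(spec: str, total: int) -> list[int]:
    """Expand a spec such as "2-3,_1" into a list of 1-based page numbers."""
    pages = []
    for part in spec.split(","):
        first, dash, last = part.partition("-")
        start = resolve_page(first, total)
        stop = resolve_page(last, total) if dash else start
        # Assumption: a descending range such as 3-1 lists pages in reverse.
        step = 1 if stop >= start else -1
        pages.extend(range(start, stop + step, step))
    return pages
```

With a ten-page document, resolve_pages("2-3", 10) gives pages 2 and 3, and resolve_pages("_2-end", 10) gives pages 9 and 10.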

OPTIONS

(Unlike psselect, we never print status output, so there is no -q / --quiet option to suppress it.)

EXAMPLE

pdfselect -p 2-3 source.pdf dest.pdf

THANKS

Thanks to the pypdf library that does the actual work, and to psselect for a reasonable UI.

Copyright © 2023 by John Heidemann