PDFBox Splitter for Creating Multiple PDFs from One PDF Document

To split a PDF document into multiple PDF files in Java, use the org.apache.pdfbox.multipdf.Splitter class from Apache PDFBox. The split() method accepts a loaded PDDocument and returns a List<PDDocument>, where each item is one output PDF.

In this tutorial, we shall learn how to split a PDF into separate page files, split a PDF at a fixed page interval, and split only a selected page range. The examples use PDFBox 2.x style loading with PDDocument.load(file). A PDFBox 3.x loading note is included after the main examples.

Useful PDFBox references for this topic are the PDFBox Splitter Javadocs, the PDFBox 2.0 command-line tools, and the PDFBox 3.0 migration guide.

PDFBox Dependency and Output Folder Setup for Splitting PDFs

If you are using Maven, add the PDFBox dependency to your project. The examples below are written for the PDFBox 2.x APIs.

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.36</version>
</dependency>

The sample Java programs save the split PDFs under /home/tk/pdfs/. Create this folder before running the examples, or change the output path to a folder available on your system.

mkdir -p /home/tk/pdfs
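
If you prefer to create the folder from Java rather than from the shell, the following is a minimal sketch using the standard java.nio.file API. The class name OutputFolderSetup and the helper ensureOutputFolder are hypothetical names chosen for illustration, and the sketch assumes the output folder lives under the current user's home directory (for a Linux user named tk, that resolves to /home/tk/pdfs).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputFolderSetup {

    // Hypothetical helper: creates the folder and any missing parent folders.
    // If the folder already exists, it is returned unchanged.
    static Path ensureOutputFolder(Path folder) throws IOException {
        return Files.createDirectories(folder);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the output folder is "pdfs" under the user's home directory.
        Path folder = Paths.get(System.getProperty("user.home"), "pdfs");
        Path created = ensureOutputFolder(folder);
        System.out.println("Output folder ready: " + created);
    }
}
```

Calling this helper once before the save loop avoids a failed save() caused by a missing destination folder.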

How PDFBox Splitter Decides the PDF Page Groups

The Splitter class gives you three important controls for common PDF split operations.

  • splitter.split(document) splits the loaded source PDF and returns the generated PDF documents.
  • splitter.setSplitAtPage(n) sets the number of pages in each output PDF. The default value is 1, so every page becomes a separate PDF.
  • splitter.setStartPage(n) and splitter.setEndPage(n) limit the split operation to a 1-based page range.

For example, if the input PDF has 5 pages and you call setSplitAtPage(2), PDFBox creates 3 output files: pages 1-2, pages 3-4, and page 5.
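
The grouping arithmetic can be sketched in plain Java without PDFBox. This is only an illustration of the page groups described above, not PDFBox's internal code; the class SplitGroupsSketch and the method pageGroups are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitGroupsSketch {

    // Returns the 1-based page ranges that each output PDF would contain
    // when a document of totalPages pages is split every pagesPerFile pages.
    static List<String> pageGroups(int totalPages, int pagesPerFile) {
        if (pagesPerFile <= 0) {
            throw new IllegalArgumentException("pagesPerFile must be greater than zero");
        }
        List<String> groups = new ArrayList<>();
        for (int first = 1; first <= totalPages; first += pagesPerFile) {
            // The last group may be shorter than pagesPerFile.
            int last = Math.min(first + pagesPerFile - 1, totalPages);
            groups.add(first == last ? String.valueOf(first) : first + "-" + last);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(pageGroups(5, 2)); // prints [1-2, 3-4, 5]
    }
}
```

The number of entries in the returned list is the number of output files you should expect from the split.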

PDFBox Example 1 – Split Every PDF Page into a Separate PDF File

In this example, we will take a PDF with multiple pages and split it into multiple PDFs, where each resulting PDF document contains only one page from the source document.

SplitPDFExample.java

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class SplitPDFExample {

    public static void main(String[] args) throws IOException {
        File file = new File("/home/tk/sample_pdf.pdf");

        // load the source PDF; try-with-resources closes it at the end
        try (PDDocument document = PDDocument.load(file)) {

            // instantiate Splitter; by default every page becomes one output PDF
            Splitter splitter = new Splitter();

            // split the pages of the PDF document
            List<PDDocument> pages = splitter.split(document);

            // save each split as a PDF, then close it
            int i = 0;
            for (PDDocument pd : pages) {
                // destination path of this split PDF
                String outputPath = "/home/tk/pdfs/sample_part_" + ++i + ".pdf";
                pd.save(outputPath);
                pd.close();
                System.out.println("Saved " + outputPath);
            }
            System.out.println("Provided PDF has been split into multiple.");
        }
    }

}

Output

Saved /home/tk/pdfs/sample_part_1.pdf
Saved /home/tk/pdfs/sample_part_2.pdf
Saved /home/tk/pdfs/sample_part_3.pdf
Saved /home/tk/pdfs/sample_part_4.pdf
Saved /home/tk/pdfs/sample_part_5.pdf
Saved /home/tk/pdfs/sample_part_6.pdf
Provided PDF has been split into multiple.

The file names are generated with a counter: sample_part_1.pdf, sample_part_2.pdf, and so on. In production code, close every split PDDocument after saving it so that file handles and memory are released.

PDFBox Example 2 – Split PDF Pages in Groups with setSplitAtPage()

The following example splits a PDF document into multiple PDF documents at a fixed page interval.

In the following program, splitter.setSplitAtPage(2) creates output PDFs with two pages each, except the last output PDF when the source document has an odd number of pages.

SplitPDFAtPageExample.java

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class SplitPDFAtPageExample {

    public static void main(String[] args) throws IOException {
        File file = new File("/home/tk/sample_pdf.pdf");

        // load the source PDF; try-with-resources closes it at the end
        try (PDDocument document = PDDocument.load(file)) {

            // instantiate Splitter
            Splitter splitter = new Splitter();

            // put two pages into each output PDF
            splitter.setSplitAtPage(2);

            // split the pages of the PDF document
            List<PDDocument> parts = splitter.split(document);

            // save each split as a PDF, then close it
            int i = 0;
            for (PDDocument pd : parts) {
                String outputPath = "/home/tk/pdfs/sample_part_" + ++i + ".pdf";
                pd.save(outputPath);
                pd.close();
                System.out.println("Saved " + outputPath);
            }
        }
    }

}

Output

Saved /home/tk/pdfs/sample_part_1.pdf
Saved /home/tk/pdfs/sample_part_2.pdf
Saved /home/tk/pdfs/sample_part_3.pdf

By default, splitAtPage is set to 1. Pass a value greater than zero. Passing 0 or a negative value is an invalid split size, and setSplitAtPage() rejects it with an IllegalArgumentException.

PDFBox Example 3 – Split Only a Selected PDF Page Range

Use setStartPage() and setEndPage() when you do not want to split the entire source PDF. The start and end page values are 1-based. The following example takes pages 3 to 8 from the source PDF and creates output PDFs with two pages each.

SplitPDFPageRangeExample.java

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class SplitPDFPageRangeExample {

    public static void main(String[] args) throws IOException {
        File file = new File("/home/tk/sample_pdf.pdf");

        try (PDDocument document = PDDocument.load(file)) {
            Splitter splitter = new Splitter();

            splitter.setStartPage(3);
            splitter.setEndPage(8);
            splitter.setSplitAtPage(2);

            List<PDDocument> parts = splitter.split(document);

            int partNumber = 1;
            for (PDDocument part : parts) {
                try (PDDocument output = part) {
                    String outputPath = "/home/tk/pdfs/range_part_" + partNumber + ".pdf";
                    output.save(outputPath);
                    System.out.println("Saved " + outputPath);
                    partNumber++;
                }
            }
        }
    }
}

Output

Saved /home/tk/pdfs/range_part_1.pdf
Saved /home/tk/pdfs/range_part_2.pdf
Saved /home/tk/pdfs/range_part_3.pdf

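This is useful when you need only a chapter, invoice range, report section, or selected pages from a larger PDF. The output above assumes that the source PDF has at least 8 pages. Always make sure that the end page is not greater than the number of pages in the source PDF, for example by comparing it with document.getNumberOfPages(); otherwise you get fewer pages and output files than you asked for.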
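
To check a page range before splitting, you can estimate how many output files to expect. The following is a minimal sketch in plain Java. It assumes that groups are counted from the start page and that an end page beyond the last page is clamped to the page count. The class RangeSplitCheck and the method expectedParts are hypothetical names, and in a real program the pageCount argument would come from document.getNumberOfPages().

```java
public class RangeSplitCheck {

    // Estimates the number of output PDFs for a 1-based page range
    // that is divided into groups of pagesPerFile pages.
    static int expectedParts(int pageCount, int startPage, int endPage, int pagesPerFile) {
        if (startPage <= 0 || endPage <= 0 || pagesPerFile <= 0) {
            throw new IllegalArgumentException("Page values and split size must be greater than zero");
        }
        // Assumption: an end page past the last page is treated as the last page.
        int lastPage = Math.min(endPage, pageCount);
        if (startPage > lastPage) {
            return 0; // the range selects no pages
        }
        int pagesInRange = lastPage - startPage + 1;
        // Integer ceiling division: a shorter final group still needs its own file.
        return (pagesInRange + pagesPerFile - 1) / pagesPerFile;
    }

    public static void main(String[] args) {
        System.out.println(expectedParts(10, 3, 8, 2)); // prints 3
        System.out.println(expectedParts(6, 3, 8, 2));  // prints 2
    }
}
```

If the estimate differs from the number of files written, recheck the values passed to setStartPage(), setEndPage(), and setSplitAtPage().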

PDFBox 3.x Loading Change for Split PDF Java Examples

In PDFBox 3.x, PDF loading moved from PDDocument.load(...) to the org.apache.pdfbox.Loader class. If you are using PDFBox 3.x, replace the loading line in the examples with Loader.loadPDF(file) and import org.apache.pdfbox.Loader.

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;

File file = new File("/home/tk/sample_pdf.pdf");

try (PDDocument document = Loader.loadPDF(file)) {
    // Use Splitter here
}

The Splitter usage remains the same for the examples shown here; the main difference is how the source PDF is loaded.

Split PDFs from Command Line with PDFBox PDFSplit

If you only need to split a PDF file and do not need custom Java logic, PDFBox also provides the PDFSplit command-line tool through the standalone PDFBox app JAR.

java -jar pdfbox-app-2.0.36.jar PDFSplit -split 2 -outputPrefix /home/tk/pdfs/sample_part /home/tk/sample_pdf.pdf

The -split 2 option means each generated PDF should contain two pages. You can also use -startPage and -endPage with PDFSplit when you want to split only part of the source document.
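
As a sketch of the range options, the following command splits only pages 3 to 8 into two-page PDFs. It assumes the PDFBox 2.x app JAR is in the current folder; the range_part output prefix is an illustrative name.

```shell
# Split pages 3 to 8 of the source PDF into output PDFs of two pages each.
java -jar pdfbox-app-2.0.36.jar PDFSplit -startPage 3 -endPage 8 -split 2 -outputPrefix /home/tk/pdfs/range_part /home/tk/sample_pdf.pdf
```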

PDFBox Split PDF Troubleshooting Notes

  • Output folder missing: create the destination folder before calling save(); otherwise save() fails with an IOException because the output file cannot be created.
  • Encrypted PDF: load the PDF with the correct password first. Splitter works on a successfully loaded PDDocument.
  • Large PDF files: close the source document and each split document after saving. For very large PDFs, test memory usage with realistic input files.
  • Unexpected number of output files: check setSplitAtPage(), setStartPage(), and setEndPage(). Start and end page values are 1-based.
  • PDFBox 3.x compile error for PDDocument.load: use Loader.loadPDF(file) instead.

PDFBox Split PDF FAQs

How do I split a PDF into multiple PDFs using PDFBox?

Load the source file into a PDDocument, create a Splitter, call splitter.split(document), and save each returned PDDocument as a separate PDF file.

How can I split every page of a PDF into a separate file in Java?

Do not set a custom split interval. The default splitAtPage value is 1, so PDFBox creates one output document for each page in the source PDF.

How do I split a PDF every 2 pages with Apache PDFBox?

Call splitter.setSplitAtPage(2) before calling splitter.split(document). Each output PDF will contain two pages, except the last one if the remaining page count is less than two.

Can PDFBox split only pages 5 to 10 of a PDF?

Yes. Use splitter.setStartPage(5) and splitter.setEndPage(10). You may also set setSplitAtPage() if you want the selected range divided into smaller output PDFs.

Why does PDDocument.load(file) not compile in PDFBox 3.x?

PDFBox 3.x removed the old PDDocument.load(...) loading methods. Use Loader.loadPDF(file) from org.apache.pdfbox.Loader when working with PDFBox 3.x.

PDFBox Split PDF Tutorial Summary

In this PDFBox Tutorial, we have learnt to split a PDF document into multiple PDFs using the Splitter class. We covered splitting every page into a separate file, splitting by a fixed number of pages, splitting a selected page range, using the PDFBox 3.x loader, and running a PDF split from the command line.