Merge Multiple PDFs to Single PDF using Apache PDFBox

Apache PDFBox can merge two or more PDF files into one PDF document using the PDFMergerUtility class. This is useful when a Java application has to combine invoices, reports, scanned pages, generated PDF documents, or attachments into a single downloadable file.

In this tutorial, we will learn the steps required to merge multiple PDF documents into a single PDF.

To merge multiple PDFs to a single PDF, create an instance of PDFMergerUtility, set the destination file name, add each source PDF in the required order, and call mergeDocuments(). You may merge as many PDF files as required, subject to memory, file size, and application limits.

PDFBox PDF Merge Requirements

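The examples in this tutorial use Java and Apache PDFBox, and were written against the 2.0.x release line shown below. If you are using Maven, add the PDFBox dependency to your project, choosing the version that matches your project standards. PDFBox 3.x changed parts of the memory-handling API, so the mergeDocuments() calls shown here may need adjusting on that release line.

<dependency>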
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.30</version>
</dependency>

The main PDFBox class used here is org.apache.pdfbox.multipdf.PDFMergerUtility. For method-level details, refer to the Apache PDFBox API documentation for PDFMergerUtility.

Steps – Merge Multiple PDF Files

The following is a step-by-step guide to merging multiple PDF files.

Step 1: Load PDF Files

Create a File object for each source PDF you wish to merge.

File file1 = new File("/home/tk/sample_1.pdf");
File file2 = new File("/home/tk/sample_2.pdf");
File file3 = new File("/home/tk/sample_3.pdf");

Each source file should point to an existing PDF document. The order in which you add these files later decides the page order in the merged PDF.
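Because a missing source file only surfaces as an exception once it is added to the merger, it can help to check the inputs up front. The following is a minimal sketch using only the Java standard library; the SourceFileCheck class and its findProblems() method are hypothetical names introduced for illustration and are not part of PDFBox.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of PDFBox): reports source files that
// cannot be used as merge inputs before any merging starts.
class SourceFileCheck {

    // Returns one message per problem. An empty list means every entry
    // is an existing regular file that this process can read.
    static List<String> findProblems(List<File> sources) {
        List<String> problems = new ArrayList<>();
        for (File source : sources) {
            if (!source.isFile()) {
                problems.add("Not an existing file: " + source.getPath());
            } else if (!source.canRead()) {
                problems.add("Not readable: " + source.getPath());
            }
        }
        return problems;
    }
}
```

If SourceFileCheck.findProblems(Arrays.asList(file1, file2, file3)) returns an empty list, the files can be added to the merger. The check only confirms that the files exist and are readable; it does not verify that they are valid PDF documents.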

Step 2: Instantiate PDFMergerUtility

The PDFMergerUtility class contains the routines used to merge PDFs.

PDFMergerUtility pdfMerger = new PDFMergerUtility();

Step 3: Set Destination

Set the path of the destination file using the PDFMergerUtility.setDestinationFileName(String fileName) method.

pdfMerger.setDestinationFileName("/home/tk/sample_pdf.pdf");

The destination path is the final PDF file that PDFBox creates after merging all source documents. Make sure the application has write permission for the destination folder.
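One way to avoid a failed merge caused by a missing output folder is to prepare the destination before handing it to the merger. The sketch below uses only java.nio.file; DestinationCheck and prepare() are hypothetical names, and creating missing folders automatically is an assumption that may not suit every application.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of PDFBox): makes sure the folder that
// will hold the merged PDF exists and is writable.
class DestinationCheck {

    // Creates any missing parent folders and returns the absolute
    // destination path as a String. The output file itself is not created.
    static String prepare(String destinationFileName) {
        Path destination = Paths.get(destinationFileName).toAbsolutePath();
        Path parent = destination.getParent();
        if (parent != null) {
            try {
                Files.createDirectories(parent);
            } catch (IOException e) {
                throw new UncheckedIOException("Cannot create folder " + parent, e);
            }
            if (!Files.isWritable(parent)) {
                throw new IllegalStateException("Folder is not writable: " + parent);
            }
        }
        return destination.toString();
    }
}
```

The returned value can be passed straight through, for example pdfMerger.setDestinationFileName(DestinationCheck.prepare("/home/tk/sample_pdf.pdf")).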

Step 4: Add all PDFs

Add each source PDF file to the merger using the PDFMergerUtility.addSource() method.

pdfMerger.addSource(file1);
pdfMerger.addSource(file2);
pdfMerger.addSource(file3);

Add the source PDF files one by one, in the sequence in which they should appear in the final merged PDF file.

Step 5: Merge Documents

Finally, call the PDFMergerUtility.mergeDocuments() method to merge all the documents.

pdfMerger.mergeDocuments(null);

Passing null lets PDFBox buffer the merge in unrestricted main memory. For larger PDF files, prefer passing a MemoryUsageSetting instead. This allows PDFBox to use temporary storage when needed and can reduce memory pressure in long-running applications.

Complete Java Program

MergePDFsExample.java

import org.apache.pdfbox.multipdf.PDFMergerUtility;

import java.io.File;
import java.io.IOException;

public class MergePDFsExample {

    public static void main(String[] args) throws IOException {
        // reference the PDF files to be merged
        File file1 = new File("/home/tk/sample_1.pdf");
        File file2 = new File("/home/tk/sample_2.pdf");
        File file3 = new File("/home/tk/sample_3.pdf");

        // instantiate the PDFMergerUtility class
        PDFMergerUtility pdfMerger = new PDFMergerUtility();

        // set the destination file path
        pdfMerger.setDestinationFileName("/home/tk/sample_pdf.pdf");

        // add the source files to pdfMerger in the required page order
        pdfMerger.addSource(file1);
        pdfMerger.addSource(file2);
        pdfMerger.addSource(file3);

        // merge the documents (null = buffer in main memory)
        pdfMerger.mergeDocuments(null);

        System.out.println("PDF Documents merged to a single file");
    }
}

Output

PDF Documents merged to a single file

Merge PDFs with PDFBox and MemoryUsageSetting

When merging several files or large PDF documents, use MemoryUsageSetting. The following Java program shows the same PDF merge operation with temporary-file based memory handling.

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

import java.io.File;
import java.io.IOException;

public class MergePDFsWithMemorySetting {

    public static void main(String[] args) throws IOException {
        PDFMergerUtility pdfMerger = new PDFMergerUtility();

        pdfMerger.setDestinationFileName("/home/tk/merged-output.pdf");
        pdfMerger.addSource(new File("/home/tk/sample_1.pdf"));
        pdfMerger.addSource(new File("/home/tk/sample_2.pdf"));
        pdfMerger.addSource(new File("/home/tk/sample_3.pdf"));

        pdfMerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());

        System.out.println("PDF files merged successfully");
    }
}

MemoryUsageSetting.setupTempFileOnly() tells PDFBox to use temporary files during the merge process. This is helpful when the source PDFs are large or when the Java process has limited heap memory.
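Whether temporary files are worth their extra disk I/O depends on how much data is being merged. As a rough sketch, an application could sum the on-disk size of the sources and compare it with a threshold. MergeSizeCheck, its methods, and the threshold value are all hypothetical, and file size on disk is only an approximate proxy for the memory a merge actually needs.

```java
import java.io.File;
import java.util.List;

// Hypothetical helper (not part of PDFBox): estimates merge size from
// the on-disk size of the source files.
class MergeSizeCheck {

    // Sums the sizes of all sources in bytes. File.length() returns 0
    // for a file that does not exist, so missing files add nothing.
    static long totalBytes(List<File> sources) {
        long total = 0;
        for (File source : sources) {
            total += source.length();
        }
        return total;
    }

    // True when the combined size is above the caller-chosen threshold.
    static boolean preferTempFiles(List<File> sources, long thresholdBytes) {
        return totalBytes(sources) > thresholdBytes;
    }
}
```

When preferTempFiles() returns true, the merge can be called with MemoryUsageSetting.setupTempFileOnly(); otherwise MemoryUsageSetting.setupMainMemoryOnly() keeps everything in memory. A suitable threshold depends on the heap available to the Java process and is best found by measurement.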

Merge a List of PDF Files in Java using PDFBox

In real applications, PDF files are often collected from a folder, upload request, database process, or generated report list. Instead of manually calling addSource() for each file, you can loop through a list and add the files in order.

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class MergePDFListExample {

    public static void main(String[] args) throws IOException {
        List<File> sourceFiles = Arrays.asList(
                new File("/home/tk/invoice_1.pdf"),
                new File("/home/tk/invoice_2.pdf"),
                new File("/home/tk/invoice_3.pdf")
        );

        PDFMergerUtility pdfMerger = new PDFMergerUtility();
        pdfMerger.setDestinationFileName("/home/tk/all-invoices.pdf");

        for (File sourceFile : sourceFiles) {
            pdfMerger.addSource(sourceFile);
        }

        pdfMerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
    }
}

The merged PDF will contain pages in the same sequence as the files in the list. If the final document needs a different order, sort the list before adding the sources.
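When the sources come from a folder rather than a hard-coded list, the list can be built and sorted before the loop runs. The following sketch uses only the standard library; PdfFolderScan and listPdfsSortedByName() are hypothetical names, and sorting by file name is just one possible ordering rule.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of PDFBox): collects the PDF files in
// one folder (no subfolders), sorted by file name.
class PdfFolderScan {

    static List<File> listPdfsSortedByName(File folder) {
        // Keep regular files whose name ends with ".pdf", ignoring case.
        File[] found = folder.listFiles((dir, name) ->
                name.toLowerCase(Locale.ROOT).endsWith(".pdf")
                        && new File(dir, name).isFile());
        if (found == null) {
            // listFiles returns null if the path is not a readable directory.
            return new ArrayList<>();
        }
        List<File> result = new ArrayList<>(Arrays.asList(found));
        result.sort(Comparator.comparing(File::getName));
        return result;
    }
}
```

The returned list can replace the hard-coded sourceFiles list in the example above. Note that the filter looks only at the file extension, so a file that is misnamed or damaged would still be picked up.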

Important Notes for PDFBox PDF Merging

  • Page order is based on addSource order. Add files in the exact sequence in which they should appear in the final PDF.
  • Formatting is normally preserved. PDFBox appends the pages from source documents, so the visual layout of each page is retained in most normal PDF merge cases.
  • Encrypted PDFs may need handling first. If a source PDF is password-protected or has restrictions, the merge may fail unless your application handles the protected document correctly.
  • Destination folder must be writable. If the output path is invalid or the application has no write permission, PDFBox cannot create the merged PDF.
  • Use temporary storage for large merges. For large PDF files, use MemoryUsageSetting to avoid loading too much data into heap memory.

Common PDFBox Merge Errors and Fixes

| Problem while merging PDFs | Possible reason | What to check |
|---|---|---|
| FileNotFoundException | One of the source PDF paths is wrong. | Check that every source file exists and the Java process has read permission. |
| Output PDF is not created | Destination folder is missing or not writable. | Create the folder first and verify write permission. |
| PDF files merge in the wrong order | Files were added in the wrong sequence. | Add sources in the required final page order. |
| Application runs out of memory | Large PDFs are merged using default memory behavior. | Use MemoryUsageSetting.setupTempFileOnly() or another suitable memory setting. |
| Merge fails for a protected PDF | The source PDF may be encrypted or restricted. | Handle password-protected PDFs according to your application requirements before merging. |

PDFBox Merge Multiple PDFs: Quick Checklist

  • Add the Apache PDFBox dependency to the Java project.
  • Create a PDFMergerUtility object.
  • Set the output PDF path using setDestinationFileName().
  • Add each input PDF using addSource() in the required order.
  • Call mergeDocuments() to create the final merged PDF.
  • Use MemoryUsageSetting when merging larger PDF files.

FAQs on Merging Multiple PDFs using PDFBox

Can PDFBox combine multiple PDFs into a single PDF?

Yes. Apache PDFBox can combine multiple PDF files into a single PDF using the PDFMergerUtility class. Add each source PDF with addSource(), set the output path, and call mergeDocuments().

Does PDFBox preserve formatting while merging PDF files?

In normal merge operations, PDFBox appends the pages from each source PDF, so the page layout and visual formatting are generally preserved. However, encrypted, damaged, or non-standard PDFs may need additional handling.

How do I control the order of PDFs in the merged output?

The order depends on the sequence of addSource() calls. Add the first PDF first, then the second PDF, and continue in the exact order in which the pages should appear in the final document.
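A common ordering pitfall is numbered file names: a plain alphabetical sort places invoice_10.pdf before invoice_2.pdf. The sketch below shows one possible comparator that orders files by the first number found in the name. NumberedNameOrder and its members are hypothetical, and the rule assumes that the first run of digits in each name is the sequence number.

```java
import java.io.File;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of PDFBox): orders files by the first
// number in their name, so invoice_2.pdf sorts before invoice_10.pdf.
class NumberedNameOrder {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the file name as a number.
    // Names without digits (or with a number too large for a long)
    // get Long.MAX_VALUE, which sorts them last.
    static long firstNumber(File file) {
        Matcher matcher = DIGITS.matcher(file.getName());
        if (!matcher.find()) {
            return Long.MAX_VALUE;
        }
        try {
            return Long.parseLong(matcher.group());
        } catch (NumberFormatException e) {
            return Long.MAX_VALUE;
        }
    }

    // Compares by the extracted number, then by file name as a tie-breaker.
    static final Comparator<File> BY_FIRST_NUMBER =
            Comparator.comparingLong(NumberedNameOrder::firstNumber)
                    .thenComparing(File::getName);
}
```

Sorting a mutable list with files.sort(NumberedNameOrder.BY_FIRST_NUMBER) before the addSource() loop gives the numeric order.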

Why should I use MemoryUsageSetting when merging PDFs with PDFBox?

MemoryUsageSetting helps control how PDFBox uses memory during the merge process. It is useful for large PDF files because temporary files can be used instead of keeping all processing data in memory.

Can PDFBox merge password-protected PDF files?

PDFBox may fail to merge a protected PDF if the file is encrypted or restricted. In such cases, your application must handle the protected document correctly before adding it to the merge process.
