How to merge two PDF files into one in Java?

Tags: Java, PDF, PDFBox

Java Problem Overview


I want to merge many PDF files into one using PDFBox and this is what I've done:

PDDocument document = new PDDocument();
for (String pdfFile: pdfFiles) {
    PDDocument part = PDDocument.load(pdfFile);
    List<PDPage> list = part.getDocumentCatalog().getAllPages();
    for (PDPage page: list) {
        document.addPage(page);
    }
    part.close();
}
document.save("merged.pdf");
document.close();

Where pdfFiles is an ArrayList<String> containing all the PDF files.

When I'm running the above, I'm always getting:

org.apache.pdfbox.exceptions.COSVisitorException: Bad file descriptor

Am I doing something wrong? Is there any other way of doing it?

Java Solutions


Solution 1 - Java

Why not use PDFBox's PDFMergerUtility?

PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();

Solution 2 - Java

A quick Google search returned this bug: "Bad file descriptor while saving a document w. imported PDFs".

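It looks like you need to keep the PDFs to be merged open until after you have saved and closed the combined PDF. In the question's code, part.close() runs inside the loop, before document.save(...), so the pages added to the combined document come from sources that are already closed by the time the save happens.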
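
The ordering described in this answer can be sketched without PDFBox on the classpath. In the sketch below, `Part` is a hypothetical stand-in for a source `PDDocument` and `save` is a stand-in for `document.save(...)`; neither is PDFBox API. It only illustrates the lifetime rule: collect the opened parts, save the merged result, and close the parts last.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeepSourcesOpen {

    /** Hypothetical stand-in for a source PDDocument; not part of PDFBox. */
    static class Part implements Closeable {
        final String name;
        boolean open = true;

        Part(String name) {
            this.name = name;
        }

        /** Page data is only readable while the part is still open. */
        String readPage() {
            if (!open) {
                throw new IllegalStateException("Bad file descriptor: " + name);
            }
            return "page from " + name;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Stand-in for document.save(...): reads every imported page at save time. */
    static List<String> save(List<Part> importedFrom) {
        List<String> written = new ArrayList<>();
        for (Part part : importedFrom) {
            written.add(part.readPage());
        }
        return written;
    }

    /** Opens all parts, saves, and only then closes the parts. */
    static List<String> merge(List<String> names) {
        List<Part> parts = new ArrayList<>();
        try {
            for (String name : names) {
                parts.add(new Part(name)); // with PDFBox this would be a loaded source document
            }
            return save(parts);            // sources are still open here
        } finally {
            for (Part part : parts) {
                part.close();              // close the sources last
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList("a.pdf", "b.pdf")));
    }
}
```

Moving the `close()` call into the first loop, as the question's code does, would make `save` fail in this sketch as well.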

Solution 3 - Java

This is ready-to-use code that merges four PDF files with iText 5.5.0 (itextpdf-5.5.0.jar, from http://central.maven.org/maven2/com/itextpdf/itextpdf/5.5.0/itextpdf-5.5.0.jar). More at http://tutorialspointexamples.com/

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

/**
 * Merges two or more existing PDF files using the iText jar.
 */
public class PDFMerger {

    static void mergePdfFiles(List<InputStream> inputPdfList,
            OutputStream outputStream) throws Exception {
        // Create a reader for each input PDF.
        List<PdfReader> readers = new ArrayList<PdfReader>();
        for (InputStream pdf : inputPdfList) {
            readers.add(new PdfReader(pdf));
        }

        // Create the document and a writer for the output stream.
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, outputStream);

        // Open the document.
        document.open();

        // Direct content of the page currently being written.
        PdfContentByte pageContentByte = writer.getDirectContent();

        // Copy every page of every reader onto a new page.
        // New pages use the Document's default size (A4); imported pages are not resized.
        for (PdfReader pdfReader : readers) {
            for (int page = 1; page <= pdfReader.getNumberOfPages(); page++) {
                document.newPage();
                PdfImportedPage pdfImportedPage =
                        writer.getImportedPage(pdfReader, page);
                pageContentByte.addTemplate(pdfImportedPage, 0, 0);
            }
        }

        // Close the document first so the PDF is complete, then the stream and readers.
        document.close();
        outputStream.close();
        for (PdfReader pdfReader : readers) {
            pdfReader.close();
        }

        System.out.println("Pdf files merged successfully.");
    }

    public static void main(String args[]) {
        try {
            // Prepare the input PDFs as a list of input streams.
            List<InputStream> inputPdfList = new ArrayList<InputStream>();
            inputPdfList.add(new FileInputStream("..\\pdf\\pdf_1.pdf"));
            inputPdfList.add(new FileInputStream("..\\pdf\\pdf_2.pdf"));
            inputPdfList.add(new FileInputStream("..\\pdf\\pdf_3.pdf"));
            inputPdfList.add(new FileInputStream("..\\pdf\\pdf_4.pdf"));

            // Prepare the output stream for the merged PDF.
            OutputStream outputStream =
                    new FileOutputStream("..\\pdf\\MergeFile_1234.pdf");

            // Merge the PDFs.
            mergePdfFiles(inputPdfList, outputStream);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Solution 4 - Java

A method that merges multiple PDFs using org.apache.pdfbox. Add every source first, then merge once:

public void mergePDFFiles(List<File> files,
                          String mergedFileName) {
    try {
        PDFMergerUtility pdfmerger = new PDFMergerUtility();
        pdfmerger.setDestinationFileName(mergedFileName);
        // Register all source files; nothing is merged yet.
        for (File file : files) {
            pdfmerger.addSource(file);
        }
        // Merge all registered sources into the destination in one pass.
        pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
    } catch (IOException e) {
        logger.error("Error merging files: " + e.getMessage());
    }
}

From the main program, call mergePDFFiles with the list of files and the target file name:

        String mergedFileName = "Merged.pdf";
        mergePDFFiles(files, mergedFileName);

After calling mergePDFFiles, load the merged file:

        File mergedFile = new File(mergedFileName);

Solution 5 - Java

package article14;

import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFMergerUtility;

public class Pdf
{
    public static void main(String args[])
    {
        new Pdf().createNew();
        new Pdf().combine();
    }

    // Merges every file found in the "pdf" folder into Combined.pdf.
    public void combine()
    {
        try
        {
            PDFMergerUtility mergePdf = new PDFMergerUtility();
            File folder = new File("pdf");
            File[] filesInFolder = folder.listFiles();
            if (filesInFolder == null)
            {
                return; // "pdf" is missing or is not a directory
            }
            for (File file : filesInFolder)
            {
                mergePdf.addSource(file);
            }
            mergePdf.setDestinationFileName("Combined.pdf");
            mergePdf.mergeDocuments();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    // Creates test.pdf containing a single blank page.
    public void createNew()
    {
        PDDocument document = null;
        try
        {
            document = new PDDocument();
            document.addPage(new PDPage());
            document.save("test.pdf");
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
        finally
        {
            if (document != null)
            {
                try { document.close(); } catch (IOException ignored) { }
            }
        }
    }
}
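
`File.listFiles()` makes no guarantee about the order of the entries it returns, and it returns everything in the folder, so the merge order above depends on the file system. Below is a minimal stdlib-only sketch that collects just the `.pdf` files in name order before they are handed to `addSource`. The class and method names (`PdfFileLister`, `listPdfFiles`) are hypothetical, and sorting by file name is an assumption about the order you want.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PdfFileLister {

    /** Returns the .pdf files directly inside folder, sorted by pathname. */
    static List<File> listPdfFiles(File folder) {
        List<File> result = new ArrayList<>();
        File[] entries = folder.listFiles(); // null if folder is not a directory
        if (entries == null) {
            return result;
        }
        for (File entry : entries) {
            String name = entry.getName().toLowerCase(Locale.ROOT);
            if (entry.isFile() && name.endsWith(".pdf")) {
                result.add(entry);
            }
        }
        Collections.sort(result); // File is Comparable and orders by pathname
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway folder with two PDFs and one unrelated file.
        File dir = Files.createTempDirectory("pdfs").toFile();
        new File(dir, "b.pdf").createNewFile();
        new File(dir, "a.pdf").createNewFile();
        new File(dir, "notes.txt").createNewFile();
        for (File f : listPdfFiles(dir)) {
            System.out.println(f.getName()); // a.pdf, then b.pdf
        }
    }
}
```

The returned list can be looped over in place of filesInFolder in combine().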

Solution 6 - Java

If you want to combine two files where one overlays the other (for example, document A is a template and document B has the text you want to put on the template), this works.

After creating "doc", write your template (templateFile) on top of it:

    PDDocument watermarkDoc = PDDocument.load(getServletContext()
            .getRealPath(templateFile));
    Overlay overlay = new Overlay();

    overlay.overlay(watermarkDoc, doc);

Solution 7 - Java

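Using iText, with each existing PDF supplied as a byte array:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfCopy;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;

public static byte[] mergePDF(List<byte[]> pdfFilesAsByteArray) throws DocumentException, IOException {

    ByteArrayOutputStream outStream = new ByteArrayOutputStream();
    Document document = null;
    PdfCopy writer = null;

    for (byte[] pdfByteArray : pdfFilesAsByteArray) {
        try {
            PdfReader reader = new PdfReader(pdfByteArray);
            int numberOfPages = reader.getNumberOfPages();

            // Create the output document lazily, sized like the first page of the first PDF.
            if (document == null) {
                document = new Document(reader.getPageSizeWithRotation(1));
                writer = new PdfCopy(document, outStream);
                document.open();
            }
            // iText page numbers start at 1.
            for (int i = 1; i <= numberOfPages; i++) {
                PdfImportedPage page = writer.getImportedPage(reader, i);
                writer.addPage(page);
            }
        } catch (Exception e) {
            // A failure on one input is printed and the loop moves on to the next.
            e.printStackTrace();
        }
    }

    // document is still null if the list was empty or no input could be opened.
    if (document != null) {
        document.close();
    }
    outStream.close();
    return outStream.toByteArray();
}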
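
The method above expects each PDF to be in memory already as a `byte[]`. Below is a small stdlib-only sketch for building that list from files on disk, assuming the files are small enough to hold in memory at once. `PdfBytes` and `readAll` are hypothetical names, not part of iText.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class PdfBytes {

    /** Reads each file fully into memory, preserving the order of paths. */
    static List<byte[]> readAll(List<Path> paths) throws IOException {
        List<byte[]> result = new ArrayList<>();
        for (Path path : paths) {
            result.add(Files.readAllBytes(path));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Treat every command-line argument as the path of one PDF.
        List<Path> paths = new ArrayList<>();
        for (String arg : args) {
            paths.add(Paths.get(arg));
        }
        List<byte[]> pdfs = readAll(paths);
        System.out.println("Loaded " + pdfs.size() + " file(s)");
        // pdfs can now be passed to mergePDF(pdfs) from the answer above.
    }
}
```

The merged bytes returned by mergePDF can be written back to disk with Files.write.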

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Lipis | View Question on Stackoverflow
Solution 1 - Java | cherouvim | View Answer on Stackoverflow
Solution 2 - Java | Michael Lloyd Lee mlk | View Answer on Stackoverflow
Solution 3 - Java | benito | View Answer on Stackoverflow
Solution 4 - Java | arifng | View Answer on Stackoverflow
Solution 5 - Java | Sabapathy | View Answer on Stackoverflow
Solution 6 - Java | Dave W | View Answer on Stackoverflow
Solution 7 - Java | Ricardo Jl Rufino | View Answer on Stackoverflow