DEV Community

carlwils

Java/ Convert PDF to Excel

When a PDF file contains a table, you may need to convert it to Excel for further processing. In this article, you will learn how to convert each PDF page to its own Excel worksheet, as well as how to convert multiple PDF pages to a single Excel worksheet, using Free Spire.PDF for Java.

Two Methods to Import the JAR Dependency

Method 1: Download the free library and unzip it, then add the Spire.Pdf.jar file to your project as a dependency.
Method 2: If you use Maven, add the JAR dependency directly by putting the following configuration in your pom.xml.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf.free</artifactId>
        <version>4.4.1</version>
    </dependency>
</dependencies>

Convert a PDF File Containing Three Pages to Three Excel Worksheets

Step 1: Create a PdfDocument object.
Step 2: Load a sample PDF file using the PdfDocument.loadFromFile() method.
Step 3: Save the PDF file to Excel using the PdfDocument.saveToFile() method, passing FileFormat.XLSX as the target format.

import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;

public class ToXLS {
    public static void main(String[] args) {
        //Create a PdfDocument object
        PdfDocument pdf = new PdfDocument();
        //Load a sample PDF file
        pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\Members.pdf");
        //Save to Excel
        pdf.saveToFile("output/ToExcel.xlsx", FileFormat.XLSX);
    }
}
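
The sample above hard-codes an input path and writes into an output folder. As a minimal sketch using only the Java standard library, the helper below checks that the input PDF exists, creates the output folder if it is missing (in case the library does not create it for you), and derives the .xlsx file name from the PDF name. The class name ConversionPaths and the method prepareOutput are hypothetical names chosen for illustration; they are not part of Free Spire.PDF for Java.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Spire.PDF: prepares file paths before a conversion.
final class ConversionPaths {

    private ConversionPaths() {
    }

    // Returns <outputDir>/<input base name>.xlsx, creating outputDir if needed.
    // Throws IOException when the input PDF is missing or is not a regular file.
    static Path prepareOutput(Path inputPdf, Path outputDir) throws IOException {
        if (!Files.isRegularFile(inputPdf)) {
            throw new IOException("Input PDF not found: " + inputPdf);
        }
        Files.createDirectories(outputDir);

        // Swap the extension of the input file name for .xlsx
        String name = inputPdf.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outputDir.resolve(base + ".xlsx");
    }
}
```

The returned path can then be handed to the calls shown above, for example pdf.loadFromFile(input.toString()) followed by pdf.saveToFile(output.toString(), FileFormat.XLSX).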


Convert a PDF File Containing Three Pages to One Excel Worksheet

Free Spire.PDF for Java offers the PdfDocument.getConvertOptions().setConvertToOneSheet(true) method to convert multiple PDF pages to one Excel worksheet. Call it after loading the PDF and before saving it to Excel.

import com.spire.pdf.*;

public class ManyPagesToOneSheet {
    public static void main(String[] args) {

        //Create a PdfDocument object
        PdfDocument pdf = new PdfDocument();

        //Load a sample PDF file
        pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\Members.pdf");

        //Convert multiple PDF pages to one Excel worksheet
        pdf.getConvertOptions().setConvertToOneSheet(true);

        //Save to Excel
        pdf.saveToFile("output/ToOneSheet.xlsx", FileFormat.XLSX);
    }
}


