Friday, 24 September 2021

Convert PDF to HTML and vice versa in Java

This tutorial shows how to convert PDF to HTML, and HTML to PDF, in Java with the help of a third-party library called Spire.PDF for Java.

Add Spire.Pdf.jar to IDEA

Before writing any code, we need to add a jar file called Spire.Pdf.jar to IntelliJ IDEA. There are two ways to do this.

Method One: download the Spire.PDF for Java package from the link, unzip it, and find Spire.Pdf.jar in the “lib” folder. Then add the jar to the IDEA project manually.

Method Two: create a Maven project in IDEA, add the following configuration to the pom.xml file, and then click the “Import Changes” button.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>4.8.7</version>
    </dependency>
</dependencies>

Using the code

Convert PDF to HTML

import com.spire.pdf.*;

public class PDFToHTML {
    public static void main(String[] args) {
        //Load the PDF file
        PdfDocument pdf = new PdfDocument();
        pdf.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.pdf");
        //Save the document in HTML format
        pdf.saveToFile("output/ToHTML.html", FileFormat.HTML);
    }
}
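
When more than one PDF has to be converted, the output file name is usually derived from the input name rather than typed by hand. The following is a minimal sketch that uses only the standard java.nio.file API. The class name HtmlOutputPaths, the method htmlPathFor and the folder names are assumptions made for illustration and are not part of Spire.PDF. The resulting path could then be passed, as a string, to pdf.saveToFile(...) as in the sample above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper for illustration, not part of Spire.PDF
final class HtmlOutputPaths {

    // Maps e.g. "docs/Sample.pdf" to "<outputDir>/Sample.html"
    static Path htmlPathFor(Path pdfFile, Path outputDir) {
        String name = pdfFile.getFileName().toString();
        // Strip a trailing ".pdf" (case-insensitive) if there is one
        String base = name.toLowerCase(Locale.ROOT).endsWith(".pdf")
                ? name.substring(0, name.length() - 4)
                : name;
        return outputDir.resolve(base + ".html");
    }

    public static void main(String[] args) {
        // Assumed input and output locations, used only for the demo
        Path out = htmlPathFor(Paths.get("docs", "Sample.pdf"), Paths.get("output"));
        System.out.println(out);
    }
}
```

Keeping the name mapping in one small method means the conversion loop only has to call loadFromFile and saveToFile for each file.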

Convert PDF to HTML with embedded SVG/Image and save HTML to stream

import com.spire.pdf.*;
import java.io.*;

public class PDFToHTMLWithEmbeddedSVG {
    public static void main(String[] args) throws IOException {

        //Load the sample document
        PdfDocument pdf = new PdfDocument();
        pdf.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.pdf");

        //Set both useEmbeddedSvg and useEmbeddedImg to true
        pdf.getConvertOptions().setPdfToHtmlOptions(true, true);

        //Save to a stream; try-with-resources closes the stream afterwards
        File outFile = new File("output/toHTML_out.html");
        try (OutputStream outputStream = new FileOutputStream(outFile)) {
            pdf.saveToStream(outputStream, FileFormat.HTML);
        }
        pdf.close();
    }
}
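
A FileOutputStream cannot be opened when the parent folder of the target file does not exist, so the stream sample fails with a FileNotFoundException if there is no "output" folder yet. The following is a minimal sketch of one way to guard against that with the standard java.nio.file API. The class OutputStreams and the method openForWriting are hypothetical names used for illustration and are not part of Spire.PDF.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration, not part of Spire.PDF
final class OutputStreams {

    // Creates any missing parent folders, then opens the file for writing
    static OutputStream openForWriting(Path file) {
        try {
            Path parent = file.toAbsolutePath().getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            return Files.newOutputStream(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary folder stands in for the real output location
        Path target = Files.createTempDirectory("demo")
                .resolve("output").resolve("toHTML_out.html");
        // In the sample above, the stream would be passed to pdf.saveToStream(...)
        try (OutputStream out = openForWriting(target)) {
            out.write("<!-- placeholder -->".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Wrote " + target);
    }
}
```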

Convert HTML to PDF

import com.spire.pdf.graphics.PdfMargins;
import com.spire.pdf.htmlconverter.qt.HtmlConverter;
import com.spire.pdf.htmlconverter.qt.Size;

public class HTMLToPDF {
    public static void main(String[] args) {
        //Define the URL of the HTML page and the path of the result PDF
        String url = "https://www.e-iceblue.com/";
        String fileName = "output/HTMLToPDF.pdf";
        //Set the plugin path
        String pluginPath = "D:/Qt/plugins_32";
        HtmlConverter.setPluginPath(pluginPath);
        //Convert HTML to PDF and set the page size of the result PDF
        HtmlConverter.convert(url, fileName, true, 1000000, new Size(600f, 900f), new PdfMargins(0));
    }
}
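
The plugin path in the sample is specific to one machine, and a typo in it is easy to make, so it can help to check that the folder exists before handing it to the converter. The following is a minimal sketch using only the standard library. The class PluginPaths and the method requireDirectory are hypothetical names used for illustration and are not part of Spire.PDF.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration, not part of Spire.PDF
final class PluginPaths {

    // Returns the path unchanged if it is an existing folder, otherwise throws
    static String requireDirectory(String pluginPath) {
        Path dir = Paths.get(pluginPath);
        if (!Files.isDirectory(dir)) {
            throw new IllegalArgumentException(
                    "Plugin folder not found: " + dir.toAbsolutePath());
        }
        return pluginPath;
    }

    public static void main(String[] args) {
        // The system temp folder stands in for the real plugin folder here
        String checked = requireDirectory(System.getProperty("java.io.tmpdir"));
        System.out.println("Plugin folder found: " + checked);
    }
}
```

In the sample above, the checked value would be the one passed to HtmlConverter.setPluginPath(...).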
