Wednesday, February 27, 2013

Convert Word document (.docx) to PDF

This post describes how to convert a Word document (.docx) to PDF using Java.

There are several approaches to converting a document to PDF.
In this post I am using docx4j. It is a good API for converting Word documents to PDF (it goes through XSL-FO) and to other formats.

We can do the conversion with a simple Java program.

Steps to follow:

Step 1: Open Eclipse and create a new Java project with any name you like.

Step 2: Create a new Java class with any name you like (e.g. ConvertDocToPDF).

Step 3: Add the docx4j jar and its dependencies to the project build path, then paste the code below into the main method of the class you created.

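 // These imports go at the top of the class file, outside the main method.
 import java.io.File;
 import java.io.FileInputStream;
 import java.io.FileOutputStream;
 import java.io.InputStream;
 import java.io.OutputStream;
 import java.util.List;

 import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
 import org.docx4j.fonts.IdentityPlusMapper;
 import org.docx4j.fonts.Mapper;
 import org.docx4j.fonts.PhysicalFont;
 import org.docx4j.fonts.PhysicalFonts;
 import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

 // Everything below goes inside the main method.
 try {
     long start = System.currentTimeMillis();

     // 1) Load DOCX into WordprocessingMLPackage
     InputStream is = new FileInputStream(new File("test.docx"));
     WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);
     is.close();

     // If your header and body content overlap, work on the sections here
     List<?> sections = wordMLPackage.getDocumentModel().getSections();
     for (int i = 0; i < sections.size(); i++) {
         System.out.println("sections Size" + sections.size());
     }

     // If you want to use a physical font, map it as shown below.
     Mapper fontMapper = new IdentityPlusMapper();
     wordMLPackage.setFontMapper(fontMapper);
     PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS");
     if (font != null) {
         // Render text set in "Algerian" with the physical "Comic Sans MS" font
         fontMapper.getFontMappings().put("Algerian", font);
     }

     // 2) Prepare Pdf settings
     PdfSettings pdfSettings = new PdfSettings();

     // 3) Convert WordprocessingMLPackage to Pdf
     org.docx4j.convert.out.pdf.PdfConversion conversion = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
     OutputStream out = new FileOutputStream(new File("test.pdf"));
     conversion.output(out, pdfSettings);
     out.close();

     System.err.println("Time taken to Generate pdf  " + (System.currentTimeMillis() - start) + "ms");
 } catch (Throwable e) {
     e.printStackTrace();
 }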


Step 4: Run the Java program. A PDF (test.pdf) will be generated from your document file.
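
Before running the conversion, it can help to check that the input really is a .docx file. A .docx is a ZIP archive, and when the load step in Step 3 is given something else (for example an HTML page saved with a .docx extension), docx4j may fail with an "unexpected element" error like the one reported in the comments below. The following is only a minimal sketch that uses nothing but the JDK. The class name DocxPreflight and the method looksLikeDocx are hypothetical names made up for this example, and the check assumes the main document part is stored under the usual name word/document.xml.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; it is not part of docx4j.
public class DocxPreflight {

    /**
     * Returns true if the file opens as a ZIP archive and contains the
     * entries a Word-produced .docx normally has. Assumption: the main
     * document part uses the conventional name word/document.xml.
     */
    public static boolean looksLikeDocx(File file) {
        if (!file.isFile()) {
            return false;
        }
        try (ZipFile zip = new ZipFile(file)) {
            return zip.getEntry("[Content_Types].xml") != null
                    && zip.getEntry("word/document.xml") != null;
        } catch (IOException e) {
            // Not a readable ZIP archive (e.g. plain HTML or an empty file)
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the file name used in Step 3
        File input = new File(args.length > 0 ? args[0] : "test.docx");
        System.out.println(input + " looks like a .docx: " + looksLikeDocx(input));
    }
}
```

If the check returns false for your test.docx, try opening the file in Word and saving it again as .docx before you run the converter.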


  1. Isn't it much easier to use a web-based app for the conversion process? I have been using GroupDocs Conversion for some time now. It is quite simple and provides an embed code to use within your web page.

  2. I think you forgot to mention the required jar files.

  3. You should try Aspose.Words for Java API also for converting word docs to pdf and to many other formats.

  4. This comment has been removed by the author.

  5. This comment has been removed by the author.

  6. I tried the provided code for converting Word to PDF, including all the required jars, but got some exceptions and errors (see my next post for the errors). Please help me in this regard.

  7. log4j:WARN No appenders could be found for logger (org.docx4j.utils.ResourceUtils).
    log4j:WARN Please initialize the log4j system properly.
    18 [main] INFO org.docx4j.utils.Log4jConfigurator - Since your log4j configuration (if any) was not found, docx4j has configured log4j automatically.
    37 [main] WARN org.docx4j.XmlUtils - Using default SAXParserFactory: null
    294 [main] INFO org.docx4j.jaxb.Context - JAXB: RI not present. Trying Java 6 implementation.
    295 [main] INFO org.docx4j.jaxb.Context - JAXB: Using Java 6 implementation.
    295 [main] INFO org.docx4j.jaxb.Context - loading Context jc
    4160 [main] INFO org.docx4j.jaxb.Context - loaded com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl .. loading others ..
    4294 [main] INFO org.docx4j.jaxb.Context - .. others loaded ..
    4303 [main] WARN org.docx4j.jaxb.JaxbValidationEventHandler - [(non)FATAL_ERROR] : unexpected element (uri:"", local:"html"). Expected elements are <{}p
    4303 [main] INFO org.docx4j.jaxb.JaxbValidationEventHandler - continuing (with possible element/attribute loss)
    4303 [main] ERROR org.docx4j.openpackaging.packages.OpcPackage - javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"html"). Expected elements are <{}package>,<{}xmlData>
    org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't load xml from stream
    at org.docx4j.openpackaging.packages.OpcPackage.load(
    at org.docx4j.openpackaging.packages.OpcPackage.load(
    at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(
    at asd.main(
    Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"html"). Expected elements are <{}package>,<{}xmlData>
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.SAXConnector.startElement(Unknown Source)
    at Source)
    at Source)
    at$NSContentDriver.scanRootElementHook(Unknown Source)
    at$ Source)
    at$ Source)
    at Source)
    at Source)
    at Source)
    at Source)
    at Source)
    at Source)
    at Source)

  8. I got an exception java.lang.NoClassDefFoundError:

  9. May I know where you are getting the NoClassDefFoundError exception?

  10. I got the following error when I tried to work with JBoss 6.1. The same code works fine with JBoss 4.0.
    13:04:34,423 ERROR [org.docx4j.utils.ResourceUtils] Couldn't get resource:
    13:04:34,438 ERROR [org.docx4j.Docx4jProperties] Error reading java.lang.NullPointerException
    at org.docx4j.utils.ResourceUtils.getResource( [docx4j-2.7.1.jar:]

    I tried with the latest docx4j jars (i.e. 3.1 and 3.2), but it didn't work for me.

  11. Very good post bro, thanks. It is very useful for me.