Learn and shine: Convert Word document (.docx) to PDF

Wednesday, February 27, 2013

Convert Word document (.docx) to PDF

This post describes how to convert a Word document to PDF using Java.

There are several approaches to converting a document to PDF.
In this post I am using docx4j, an API that can convert Word documents to PDF (here by way of XSL-FO), among other formats.

The conversion takes only a simple Java program.

Steps to follow:

Step 1: Open Eclipse and create a new Java project with any name you like. Add the docx4j jar and its dependencies to the project's build path.

Step 2: Create a new Java class with any name you like (for example, ConvertDocToPDF).

Step 3: Paste the following code inside the main method of the class you created:

// Imports needed at the top of the class (package names as in docx4j 2.x;
// in Eclipse, Ctrl+Shift+O adds them): java.io.*, java.util.List,
// org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings, org.docx4j.fonts.*,
// org.docx4j.model.structure.SectionWrapper,
// org.docx4j.openpackaging.packages.WordprocessingMLPackage
long start = System.currentTimeMillis();

// 1) Load the DOCX into a WordprocessingMLPackage
try (InputStream is = new FileInputStream("test.docx")) {
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);

    // If the header and body text overlap in the PDF, enlarge the header extent
    List<SectionWrapper> sections = wordMLPackage.getDocumentModel().getSections();
    System.out.println("Sections size: " + sections.size());
    for (SectionWrapper section : sections) {
        section.getPageDimensions().setHeaderExtent(3000);
    }

    // Optional: map a font named in the document to a physical font on this machine
    Mapper fontMapper = new IdentityPlusMapper();
    PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS");
    if (font != null) { // skip the mapping if the font is not installed
        fontMapper.getFontMappings().put("Algerian", font);
    }
    wordMLPackage.setFontMapper(fontMapper);

    // 2) Prepare the PDF settings
    PdfSettings pdfSettings = new PdfSettings();

    // 3) Convert the WordprocessingMLPackage to PDF via XSL-FO
    org.docx4j.convert.out.pdf.PdfConversion conversion =
            new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
    try (OutputStream out = new FileOutputStream("test.pdf")) {
        conversion.output(out, pdfSettings);
    }
    System.err.println("Time taken to generate PDF: " + (System.currentTimeMillis() - start) + " ms");
} catch (Exception e) {
    e.printStackTrace();
}

Step 4: Run the Java program. A PDF (test.pdf) will be generated from your document file.
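
Before running the program, it can help to rule out two common setup problems: an input file that is not really a .docx package, and docx4j jars missing from the classpath. The following is a minimal sketch using only the Java standard library. The class name ConversionPreflight and both helper methods are hypothetical names chosen for illustration; they are not part of docx4j. The sketch relies on two facts: a .docx file is a ZIP archive, so it begins with the bytes 'P', 'K', 3, 4, and the conversion code above needs the class org.docx4j.openpackaging.packages.WordprocessingMLPackage.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of docx4j.
class ConversionPreflight {

    // Returns true if the file exists and starts with the ZIP local-file
    // signature 'P', 'K', 0x03, 0x04, as every .docx package does.
    public static boolean looksLikeZipPackage(Path file) throws IOException {
        if (!Files.isRegularFile(file)) {
            return false;
        }
        byte[] header = new byte[4];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until
            // four bytes are available or the stream ends
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return filled == 4
                && header[0] == 'P' && header[1] == 'K'
                && header[2] == 3 && header[3] == 4;
    }

    // Returns true if the named class can be found on the current classpath.
    // The class is located but not initialized.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ConversionPreflight.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the input is test.docx in the working directory,
        // matching the file name used in the conversion code.
        Path input = Paths.get(args.length > 0 ? args[0] : "test.docx");
        System.out.println("Input looks like a ZIP package: " + looksLikeZipPackage(input));
        System.out.println("docx4j found on classpath: "
                + isOnClasspath("org.docx4j.openpackaging.packages.WordprocessingMLPackage"));
    }
}
```

Run it from the project with the same classpath as the converter. If the first check fails, the file may have been saved in another format (HTML, for instance) and only renamed to .docx. If the second check fails, the docx4j jars probably still need to be added to the build path.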

38 comments:

Anonymous, March 6, 2013 at 4:31 PM
Isn't it much easier to use a web-based app for the conversion process? I have been using GroupDocs Conversion for some time now; it is quite simple and provides an embed code to use within your web page.

Unknown, October 7, 2013 at 12:02 PM
I think you forgot to mention the required jar files.

wiez, November 19, 2013 at 3:31 PM
You could also try the Aspose.Words for Java API for converting Word docs to PDF and to many other formats.

Priyatham, February 13, 2014 at 1:51 PM
I tried the provided code for converting Word to PDF and included all the required jars, but got some exceptions and errors (see my next comment for the errors). Please help me in this regard.

Priyatham, February 13, 2014 at 1:52 PM
log4j:WARN No appenders could be found for logger (org.docx4j.utils.ResourceUtils).
log4j:WARN Please initialize the log4j system properly.
18 [main] INFO org.docx4j.utils.Log4jConfigurator - Since your log4j configuration (if any) was not found, docx4j has configured log4j automatically.
37 [main] WARN org.docx4j.XmlUtils - Using default SAXParserFactory: null
294 [main] INFO org.docx4j.jaxb.Context - JAXB: RI not present. Trying Java 6 implementation.
295 [main] INFO org.docx4j.jaxb.Context - JAXB: Using Java 6 implementation.
295 [main] INFO org.docx4j.jaxb.Context - loading Context jc
4160 [main] INFO org.docx4j.jaxb.Context - loaded com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl .. loading others ..
4294 [main] INFO org.docx4j.jaxb.Context - .. others loaded ..
4303 [main] WARN org.docx4j.jaxb.JaxbValidationEventHandler - [(non)FATAL_ERROR] : unexpected element (uri:"", local:"html"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}p
4303 [main] INFO org.docx4j.jaxb.JaxbValidationEventHandler - continuing (with possible element/attribute loss)
4303 [main] ERROR org.docx4j.openpackaging.packages.OpcPackage - javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"html"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't load xml from stream
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:238)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:210)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:184)
at asd.main(asd.java:25)
Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"html"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(Unknown Source)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.SAXConnector.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDriver.scanRootElementHook(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)

Shahna, July 4, 2014 at 4:27 PM
I got an exception: java.lang.NoClassDefFoundError

Siva Raju, July 21, 2014 at 11:47 AM
May I know where you are getting the NoClassDefFoundError exception?

punitpannu, August 28, 2014 at 7:27 PM
I got the following error when I tried to run this on JBoss 6.1. The same code works fine on JBoss 4.0.
13:04:34,423 ERROR [org.docx4j.utils.ResourceUtils] Couldn't get resource: docx4j.properties
13:04:34,438 ERROR [org.docx4j.Docx4jProperties] Error reading docx4j.properties: java.lang.NullPointerException
at org.docx4j.utils.ResourceUtils.getResource(ResourceUtils.java:45) [docx4j-2.7.1.jar:]

I tried the latest docx4j jars (3.1 and 3.2), but that didn't work for me either.

bala, September 24, 2015 at 3:03 PM
Very good post, thanks. It is very useful for me.
