Programmatically convert MS Word DOC and DOCX files to PDF in ASP.NET C#

How to convert one or more MS Word DOC and DOCX files into a PDF using Microsoft Office primary interop assemblies (PIAs), also known as Microsoft Office Interop

Who hasn't felt the urge to convert one or more MS Word DOC and DOCX files into a PDF at least once? Truth be told, it wasn't that trivial back in the day: until the release of Office 2010, when PDF appeared among the formats supported by the Save As… command, using Ghostscript-based software or installing PDF printer drivers was the only way to go.

After Office 2010 the problem was finally solved even for the average user, with the sole exception that MS Office still has to be installed on the user's machine. Those who don't have it can continue to use the aforementioned free alternatives or purchase software that will take care of the job for them.

What about doing that programmatically? What if we are developing a C# application and we need to convert some DOC or DOCX files into PDF, thus making them available for download without giving the source document to the users, possibly without having to spend an Office license on our web server/web publishing machine?

The answer, still MS-branded, goes by the name of Microsoft Office primary interop assemblies (PIAs), aka Microsoft Office Interop. Specifically, to work with Word files, you're going to need Microsoft.Office.Interop.Word.dll. If you're using Visual Studio, you can get it from NuGet and add it to your project using the NuGet Package Manager, otherwise you will have to download and install the official distribution package. Keep in mind that the PIAs only drive an existing Word installation through COM, so Word itself still has to be installed on the machine that runs the conversion.

As soon as you do that, you'll be able to open and edit any MS Word document from the file system or from a byte array, as explained in this post. Here's a brief example showing what you can do:
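A minimal sketch of the SaveAs2 approach, assuming Word is installed on the machine and the project references Microsoft.Office.Interop.Word. The class name, the method name and the parameters are hypothetical placeholders, not part of the Interop API:

```csharp
using System.IO;
// Namespace alias to avoid writing the full namespace every time
using Word = Microsoft.Office.Interop.Word;

public static class WordToPdf
{
    // Hypothetical helper: opens a DOC/DOCX file and saves it as a PDF.
    public static void ConvertWithSaveAs2(string sourcePath, string pdfPath)
    {
        // Starts a Word instance through COM automation
        Word.Application app = new Word.Application();

        // Absolute paths are passed in, since Word may resolve relative
        // paths against its own working directory
        Word.Document doc = app.Documents.Open(Path.GetFullPath(sourcePath));

        // Save a copy of the document in PDF format
        doc.SaveAs2(Path.GetFullPath(pdfPath), Word.WdSaveFormat.wdFormatPDF);

        // Close the document and shut down the Word instance
        doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
        app.Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
    }
}
```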

Alternatively, if you don’t like the SaveAs2 method, you can use the ExportAsFixedFormat() method instead and achieve a nearly identical result:
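A comparable sketch based on ExportAsFixedFormat(), under the same assumptions (Word installed, project referencing Microsoft.Office.Interop.Word); the class and method names are again hypothetical:

```csharp
using System.IO;
using Word = Microsoft.Office.Interop.Word;

public static class WordToPdfExport
{
    // Hypothetical helper: exports a DOC/DOCX file to PDF.
    public static void ConvertWithExport(string sourcePath, string pdfPath)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = app.Documents.Open(Path.GetFullPath(sourcePath));

        // Only the output file name and the export format are required;
        // the remaining parameters are optional and left at their defaults
        doc.ExportAsFixedFormat(
            Path.GetFullPath(pdfPath),
            Word.WdExportFormat.wdExportFormatPDF);

        doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
        app.Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
    }
}
```

The optional parameters of ExportAsFixedFormat() cover things such as the page range and the output optimization, which is the main reason to prefer it when more control over the resulting PDF is needed.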

It's worth noting that everything we said about MS Word can also be done with the other applications contained within the MS Office bundle, such as MS Excel, MS PowerPoint and more.
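As an illustration for Excel, here is a rough sketch that assumes Excel is installed and the project references Microsoft.Office.Interop.Excel; the class and method names are hypothetical:

```csharp
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;

public static class ExcelToPdf
{
    // Hypothetical helper: exports a workbook to PDF.
    public static void Convert(string sourcePath, string pdfPath)
    {
        Excel.Application app = new Excel.Application();
        Excel.Workbook book = app.Workbooks.Open(Path.GetFullPath(sourcePath));

        // Export the whole workbook as a PDF file
        book.ExportAsFixedFormat(
            Excel.XlFixedFormatType.xlTypePDF,
            Path.GetFullPath(pdfPath));

        // Close without saving changes, then shut down Excel
        book.Close(false);
        app.Quit();
    }
}
```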

IMPORTANT: Do not underestimate the call to Quit()! If you don't do that, the MS Word instance will be left open on your server (see this thread on StackOverflow for more info on that issue). If you want to be sure to avoid such a dreadful scenario entirely, you should strengthen the given implementation by adding a try/catch fallback strategy such as the following:
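One possible shape for that safeguard, sketched with try/finally so that the cleanup runs whether or not the conversion throws. The class and method names are hypothetical, and the explicit Marshal.ReleaseComObject calls are an extra precaution assumed here, not something the text above prescribes:

```csharp
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

public static class SafeWordToPdf
{
    // Hypothetical helper: converts to PDF and always shuts Word down.
    public static void Convert(string sourcePath, string pdfPath)
    {
        Word.Application app = null;
        Word.Document doc = null;
        try
        {
            app = new Word.Application();
            doc = app.Documents.Open(Path.GetFullPath(sourcePath));
            doc.SaveAs2(Path.GetFullPath(pdfPath), Word.WdSaveFormat.wdFormatPDF);
        }
        finally
        {
            try
            {
                // Close the document if it was opened
                if (doc != null)
                {
                    doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
                    Marshal.ReleaseComObject(doc);
                }
            }
            finally
            {
                // Quit Word even if closing the document failed
                if (app != null)
                {
                    app.Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
                    Marshal.ReleaseComObject(app);
                }
            }
        }
    }
}
```

Any exception still propagates to the caller; a catch block can be added around the conversion if the application needs to log or translate the error.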

Unfortunately these objects don't implement IDisposable, otherwise it would've been even easier.
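One way to get closer to that convenience is to wrap the Word instance in a small disposable class of your own. The following is only a sketch: WordSession and ConvertToPdf are hypothetical names, and the file paths in the usage example are placeholders.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical wrapper: owns one Word instance and quits it on Dispose()
public sealed class WordSession : IDisposable
{
    private Word.Application _app = new Word.Application();

    public void ConvertToPdf(string sourcePath, string pdfPath)
    {
        Word.Document doc = _app.Documents.Open(Path.GetFullPath(sourcePath));
        try
        {
            doc.SaveAs2(Path.GetFullPath(pdfPath), Word.WdSaveFormat.wdFormatPDF);
        }
        finally
        {
            // Always close the document, even if the save failed
            doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
            Marshal.ReleaseComObject(doc);
        }
    }

    public void Dispose()
    {
        if (_app == null) return; // already disposed
        _app.Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
        Marshal.ReleaseComObject(_app);
        _app = null;
    }
}
```

With such a wrapper, a using block takes care of shutting Word down:

```csharp
using (var session = new WordSession())
{
    session.ConvertToPdf(@"C:\docs\report.docx", @"C:\docs\report.pdf");
}
```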

That’s about it: happy converting!

UPDATE: in case you run into DCOM issues and/or permission errors when publishing a web application project based on the Microsoft.Office.Interop package to a Windows Server + IIS production machine, read this post to fix them!




About Ryan

IT Project Manager, Web Interface Architect and Lead Developer for many high-traffic web sites & services hosted in Italy and Europe. Since 2010 he has also been a lead designer of many apps and games for Android, iOS and Windows Phone mobile devices for a number of Italian companies.