Well, this week I was in the technical world, getting my hands dirty. A couple of our products integrate seamlessly with Microsoft Office, which works well for the business. To support this we have our own technical service layer that handles the integration with Word. This time, however, something a little different was required. A client needed us to generate ordering catalogues automatically, and we are talking Word documents of around 200 pages. The automation would happen within our LOB (line of business) .NET application.
Their formatting was very specific and, in some cases, dependent on the type of products being added to the catalogue. Because of this, the usual option of bookmarks and/or data fields within a template wasn't available, especially as I wanted to use a single generic template for all their catalogues (they have 5, which they produce every 6 months)…
Referencing the Word Object Library
This is quick and easy, and I am presuming you know how to use Visual Studio. In your project, choose to add a reference. Select the COM tab, then make your way down to “Microsoft Word 12.0 Object Library” (the version number depends on the version of Word installed on your machine; 12.0 is Word 2007).
NB: Once you have added this you will notice two references in your project to Microsoft Word. Look at the paths and you will see they point to 2 different locations, one of which is “Microsoft.Office.Interop.Word”.
It's good practice to import the namespace you are working with, so add an Imports statement for the Word interop namespace at the top of your class file (VB.NET).
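A minimal sketch of those statements, assuming the Word alias that the code in this post relies on (Word.Application, Word.Paragraph and so on) and the Marshal calls used further down. The exact lines are my assumption rather than a copy from the project:

```vbnet
'Assumed imports: the alias lets the code refer to Word.Application, Word.Document, Word.Range...
Imports Word = Microsoft.Office.Interop.Word
'Needed for Marshal.ReleaseComObject in the clean-up code
Imports System.Runtime.InteropServices
```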
Opening a Word document / template
Some posts talk about how to create a Word document on the fly. This is great, but it's much better, and easier, if you have a document to work with from the beginning. A template is ideal, as you can then use its styles from your .NET code. In my own case, I created a Word document that contained only a table of contents and a second page. The document is otherwise blank, but it does carry some customised styles that I created. Also, our template is stored within our workFile ECM repository, so our application can grab a copy of the template from the repository at any time and use it.
You will need to create a Word.Application object in your code. I like to set up module-level variables that hold my Word application object and my Word document. In addition, I choose to open the Word document in its own procedure; again, just good coding practice. Also remember I am inserting this code into an existing Word service layer that we have written…
    Private Sub openWordDocument()
        'check if we have an active word application object
        If myWordApplication Is Nothing Then
            myWordApplication = New Word.Application
        End If
        'set the word document
        myWordDocument = myWordApplication.Documents.Open(myWordDocFile)
    End Sub
myWordDocFile in this case is a string value holding the location and name of the document I wish to work with.
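The procedure above leans on three module-level variables. A minimal sketch of their declarations, with the types inferred from how they are used in this post (the declarations themselves are an assumption):

```vbnet
'Module-level state assumed by openWordDocument and the procedures that follow
Private myWordApplication As Word.Application   'the running Word instance
Private myWordDocument As Word.Document         'the document currently being built
Private myWordDocFile As String                 'full path of the local working copy, set elsewhere
```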
Inserting a paragraph, text and style
Now, if you don't have the luxury of using bookmarks or data fields, simply because you are not sure what text you will be inserting, you are going to need to create paragraphs and text lines yourself and give them some form of style.
Creating a paragraph object is easy; however, make sure you are inserting it into the document where you want it. Typically this will be the end of your document, as you are appending to it. Remember, in the code below myWordDocument is the actual Word document we opened earlier.
    Public Sub insertParagraph(ByVal pText As String, Optional ByVal pStyleName As String = vbNullString)
        'append a new paragraph at the end of the document
        Dim para As Word.Paragraph = myWordDocument.Content.Paragraphs.Add(myWordDocument.Bookmarks.Item("\endofdoc").Range)
        Try
            para.Range.Text = pText
            'only apply a style when one was supplied
            If Not String.IsNullOrEmpty(pStyleName) Then
                para.Range.Style = pStyleName
            End If
            para.Range.InsertParagraphAfter()
        Catch ex As Exception
            System.Diagnostics.Debug.WriteLine(ex.Message)
            'rethrow without resetting the stack trace
            Throw
        Finally
            Marshal.ReleaseComObject(para)
            para = Nothing
            'collect first, then wait for the finalizers to run
            GC.Collect()
            GC.WaitForPendingFinalizers()
        End Try
    End Sub
Points to notice here:
- A paragraph exposes a “range”: if you like, a selection within the Word document.
- We simply set the text and style values for the paragraph.
- We insert a paragraph after it, so that the next time we call this method we are again at the end of the document and working in a new paragraph.
- I choose to release the paragraph object explicitly with Marshal.ReleaseComObject. It is a COM-based object, and without this we can get memory issues and weird errors raised by the DLL when generating larger documents.
- I call the garbage collector to ensure everything is cleaned up properly (this isn't over the top; without it I received error messages for larger documents, such as “The callee refused the call”. Nice…)
Inserting text in a range without a new paragraph
If you simply want to add text and don't want to create a new paragraph, you again need a range; this time, however, it is just a range (not a paragraph).
    Public Sub insertTextLine(ByVal pText1 As String, ByVal pText2 As String, ByVal pText3 As String)
        Dim textPart1 As Word.Range = myWordDocument.Bookmarks.Item("\endofdoc").Range
        Dim textPart2 As Word.Range = Nothing
        Dim textPart3 As Word.Range = Nothing
        'the last range written to, so the closing paragraph works when parts 2 or 3 are empty
        Dim lastPart As Word.Range = textPart1
        Try
            textPart1.Style = "BookTitle"
            textPart1.Bold = True
            textPart1.InsertAfter(pText1)
            textPart1.Bold = True
            If pText2 <> vbNullString Then
                textPart2 = myWordDocument.Bookmarks.Item("\endofdoc").Range
                textPart2.Bold = False
                textPart2.InsertAfter(" " & pText2)
                lastPart = textPart2
            End If
            If pText3 <> vbNullString Then
                textPart3 = myWordDocument.Bookmarks.Item("\endofdoc").Range
                textPart3.Bold = False
                'tab across to the tab stop defined in the style
                '----------
                textPart3.InsertAfter(vbTab & pText3)
                lastPart = textPart3
            End If
            'insert a new paragraph...
            lastPart.InsertParagraphAfter()
        Catch ex As Exception
            System.Diagnostics.Debug.WriteLine(ex.Message)
            'rethrow without resetting the stack trace
            Throw
        Finally
            'lastPart points at one of these ranges, so it needs no release of its own
            If Not textPart1 Is Nothing Then
                Marshal.ReleaseComObject(textPart1)
                textPart1 = Nothing
            End If
            If Not textPart2 Is Nothing Then
                Marshal.ReleaseComObject(textPart2)
                textPart2 = Nothing
            End If
            If Not textPart3 Is Nothing Then
                Marshal.ReleaseComObject(textPart3)
                textPart3 = Nothing
            End If
            GC.Collect()
            GC.WaitForPendingFinalizers()
        End Try
    End Sub
What we have done here is effectively append 3 text values to a single line within our Word document. Notice that by using “InsertAfter” on our Range object we are literally inserting text, no paragraphs. I have also used vbTab to space out the last value. My Word document has a tab stop at a set location within the selected style, which ensures I know where the text will land on that line.
Again, ensure you clean up your objects and release them from memory.
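The release-and-clear pattern repeats for every range, so it could be wrapped in a small helper. A minimal sketch, where releaseCom is a hypothetical name rather than something from our service layer:

```vbnet
Imports System.Runtime.InteropServices

Module ComCleanup
    'Hypothetical helper: releases one COM wrapper and clears the caller's reference
    Public Sub releaseCom(Of T As Class)(ByRef pComObject As T)
        If pComObject IsNot Nothing Then
            Marshal.ReleaseComObject(pComObject)
            pComObject = Nothing
        End If
    End Sub
End Module
```

With that in place, each If block in a Finally section shrinks to a single call such as releaseCom(textPart1).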
Saving your document
In my case, we are working with a temporary file that has been copied locally from the workFile repository. You may instead be working with a template sitting on your hard drive somewhere, so make sure you don't save your document over the top of that template! Schoolboy error if you do…
Saving the file is really easy: provide your directory and file name and you are almost done:
    Public Sub saveCatalogue(ByVal pCatalogueName As String, ByVal pCatalogueLocation As String)
        If Not System.IO.Directory.Exists(pCatalogueLocation) Then
            System.IO.Directory.CreateDirectory(pCatalogueLocation)
        End If
        'Path.Combine copes with a location that has no trailing backslash
        myWordDocument.SaveAs(System.IO.Path.Combine(pCatalogueLocation, pCatalogueName))
        'close the document without prompting to save again
        myWordDocument.Close(False)
    End Sub
Tidy up memory and Word
Your file has been saved, but you have yet to finish. If you look in Task Manager you will notice that WINWORD.EXE is still running, and its memory footprint could be quite large. If you don't kill this off correctly and you continue to create Word documents in this fashion, you will cause havoc with performance. So, we have to clean up after ourselves.
    Private Sub closeWord()
        Try
            'quit the word application
            '----------------------------------------------------------
            If myWordApplication IsNot Nothing Then
                myWordApplication.Quit()
            End If
        Catch ex As Exception
            'best effort: Word may already have gone away
            System.Diagnostics.Debug.WriteLine(ex.Message)
        Finally
            'marshal out the com objects, dont want any memory leaks here...
            If myWordDocument IsNot Nothing Then
                Marshal.ReleaseComObject(myWordDocument)
            End If
            If myWordApplication IsNot Nothing Then
                Marshal.ReleaseComObject(myWordApplication)
            End If
            myWordDocument = Nothing
            myWordApplication = Nothing
        End Try
    End Sub
Quit Word, then clean up… Again, we are releasing the COM objects from memory and tidying up everything.
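To show how the pieces fit together, here is a rough sketch of a driver procedure living in the same class as the procedures above. buildCatalogue, the style name and the sample values are all made up for illustration; the point is simply that closeWord sits in a Finally block so Word is shut down even if generation fails part way through:

```vbnet
'Hypothetical driver: the name, style and sample values are illustrative only
Public Sub buildCatalogue()
    Try
        openWordDocument()
        'a heading followed by one product line
        insertParagraph("Garden Furniture", "Heading 1")
        insertTextLine("GF-001", "Oak bench", "120.00")
        saveCatalogue("catalogue.docx", "C:\Temp\Catalogues\")
    Finally
        'always quit Word and release the COM objects
        closeWord()
    End Try
End Sub
```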
If you have the luxury of knowing the format of the document in advance (such as when populating an invoice or a letter), then you can use fields to make life a lot easier. Again, set up a template Word document with the content you desire. For the content that is to be added dynamically, insert a data field (see the help within your version of Office for how to do this).
From .NET, once you have the document open, you can loop round or search for those fields in the document. The fields live on the Word document object itself, as a collection of Word.Field objects.
You can then update the field text and carry on… See the sample line of code below, which inserts an invoice reference into the data field.
field.Result.Text = CStr(invoiceRef)
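For context, a minimal sketch of the loop around that line. It assumes the field can be recognised by a name appearing in its field code (for example a code along the lines of MERGEFIELD InvoiceRef); setFieldText and both parameter names are hypothetical:

```vbnet
'Hypothetical helper: writes pValue into every field whose code text mentions pFieldName
Public Sub setFieldText(ByVal pFieldName As String, ByVal pValue As String)
    For Each field As Word.Field In myWordDocument.Fields
        'field.Code is the range holding the field code, field.Result the displayed value
        If field.Code.Text.Contains(pFieldName) Then
            field.Result.Text = pValue
        End If
    Next
End Sub
```

It would be called as setFieldText("InvoiceRef", CStr(invoiceRef)). For larger documents the same COM clean-up advice applies to the field objects.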
Word is great to automate and can be very powerful for your .NET applications. Sometimes you may struggle to find good documentation on this; however, it's worth searching for… Just remember: always clean up after your code and look after your memory. If you don't, you will get some weird and wonderful error messages once you start processing larger files…