How to manipulate Word images programmatically
Working with Microsoft Word images programmatically has never been
so easy. With the Aspose.Words .NET component, it has become much simpler
to manipulate the images in a Word DOC document.
I came across a problem in my project: given a user-supplied Word DOC
file, I had to extract all the images in the document and save them as files
in a folder. The purpose was to give instant access to all the images
contained in the DOC document without opening it in Microsoft Word, so that
the user could open the images in any image viewer and work with them
directly.
The first problem was extracting all the images contained in the
document. I had to use the component from C# for this purpose, so all the
code discussed here is in C# syntax.
Step 1: Create a new C# project
To create a new project, choose File > New > Project from the main menu.
The dialog offers several options. First select a project type on the left
side of the dialog: Visual Basic Projects or Visual C# Projects, depending on
the language you plan to use for development. Since this tutorial uses C#,
choose Visual C# Projects.
After selecting a type, choose a template on the right side. You may
choose Windows Application, ASP.NET Web Application, or any other template
that suits the application you want. I used the ConsoleApplication template
for this tutorial, so select the ConsoleApplication template as well.
When you create a project from the ConsoleApplication template, VS.NET
adds a sample file by default. You can simply build your new project.
Step 2: Add a reference to the Aspose.Words assembly in the project
The Add Reference dialog box is used to add project references. It can
be accessed from the Project menu.
To add the Aspose.Words project reference:
- In Solution Explorer, select the project.
- On the Project menu, choose Add Reference. The Add Reference dialog box opens.
- On the .NET tab, select the Aspose.Words component.
- Click OK once the Aspose.Words component is selected.
The selected Aspose.Words reference will now appear under the References
node of the project.
Step 3: Open an existing DOC document
To open a document, the Aspose.Words library provides the Document class,
which is central to the library. A Document object can load documents in many
formats; the file I had to read was in Microsoft Office DOC format. I stored
the folder path in a string variable, ImageFilePath, and passed it,
concatenated with the file name, to the constructor of the Document class.
I added the following lines of code to read the file.
    //open an existing DOC document using the Document object class
    string ImageFilePath = "c:\\imagefolder";
    Document doc = new Document(ImageFilePath + "\\ImageFile.doc");
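Joining the folder and the file name by hand works, but it is easy to drop or double a separator. The following is a minimal sketch, using only the standard System.IO classes, of building the same path with Path.Combine and checking that the file exists before handing it to the Document constructor. The class name DocumentPathSketch and the helper BuildDocumentPath are hypothetical names chosen for illustration; they are not part of Aspose.Words.

```csharp
using System;
using System.IO;

static class DocumentPathSketch
{
    // Hypothetical helper: joins the folder and the file name with the
    // platform's directory separator instead of a hard-coded "\\".
    public static string BuildDocumentPath(string folder, string fileName)
    {
        return Path.Combine(folder, fileName);
    }

    static void Main()
    {
        // Same folder and file name as used in this tutorial
        string docPath = BuildDocumentPath(@"c:\imagefolder", "ImageFile.doc");

        // Fail early with a readable message if the input file is missing
        if (!File.Exists(docPath))
        {
            Console.WriteLine("Document not found: " + docPath);
            return;
        }

        // At this point docPath could be passed to new Document(docPath)
        Console.WriteLine("Ready to load: " + docPath);
    }
}
```

Checking for the file first means a missing or mistyped path is reported clearly, rather than surfacing as an exception from inside the library.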
Step 4: Access the images in the document
Next I had to access the images contained in the doc object. The
Document object follows a Document Object Model (DOM) design, so accessing
the images in the document was fairly easy: I called the GetChildNodes method
on the document tree and asked it for the nodes of type Shape. The
NodeCollection class represents a collection of nodes of a specific type.
    //It gives a collection of all shape nodes in the tree
    NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true, false);
Step 5: Iterate through the node collection
Now I had to iterate through the node collection and save every shape
that holds an image. Here is the code for doing it.
    int imageIndex = 0;
    foreach (Shape shape in shapes)
    {
        if (shape.HasImage)
        {
            String name = "DocumentImage" + "_" + imageIndex.ToString() + ".bmp";
            shape.ImageData.Save(ImageFilePath + "\\" + name);
            imageIndex++;
        }
    }
After executing the code, I could see all the image files in
the folder "c:\imagefolder".
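One caveat about the loop above: every file is named with a .bmp extension, whatever kind of image the document actually stores. Assuming ImageData.Save writes the image in its stored format rather than converting it, a JPEG or PNG would end up in a file named .bmp. The sketch below is one possible variation, not the code used in this article: it asks the library for a matching extension through FileFormatUtil.ImageTypeToExtension and uses the two-argument GetChildNodes overload found in recent Aspose.Words releases. Both are assumptions about the installed version, so check them against the release you reference. The class name ImageExportSketch and the helper BuildImageFileName are hypothetical.

```csharp
using System;
using System.IO;
using Aspose.Words;
using Aspose.Words.Drawing;

static class ImageExportSketch
{
    // Hypothetical helper: builds names such as "DocumentImage_0.png".
    // The extension is expected to carry its own leading dot.
    public static string BuildImageFileName(int index, string extension)
    {
        return "DocumentImage_" + index + extension;
    }

    static void Main()
    {
        string imageFolder = @"c:\imagefolder";
        Document doc = new Document(Path.Combine(imageFolder, "ImageFile.doc"));

        // Assumption: recent releases expose GetChildNodes(NodeType, bool isDeep)
        NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);

        int imageIndex = 0;
        foreach (Shape shape in shapes)
        {
            if (!shape.HasImage)
                continue;

            // Documented as returning a lower-case extension with a leading dot
            string extension =
                FileFormatUtil.ImageTypeToExtension(shape.ImageData.ImageType);

            string target = Path.Combine(
                imageFolder, BuildImageFileName(imageIndex, extension));
            shape.ImageData.Save(target);
            imageIndex++;
        }
    }
}
```

With this variation the saved file names reflect the stored image type, so an image viewer that relies on the extension is more likely to open each file correctly.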
The folder contents before executing the code are shown in figure1
(available as a zip file via the link). The folder contents after executing
the code are shown in figure2 (available as a zip file via the link).
Here is the complete code, defined in the Class1.cs file:
    using System;
    using Aspose.Words;
    using Aspose.Words.Drawing;

    namespace ConsoleApplication1
    {
        class Class1
        {
            [STAThread]
            static void Main(string[] args)
            {
                //open an existing DOC document using the Document object class
                string ImageFilePath = "c:\\imagefolder";
                Document doc = new Document(ImageFilePath + "\\ImageFile.doc");

                //It gives a collection of all shape nodes in the tree
                NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true, false);

                int imageIndex = 0;
                foreach (Shape shape in shapes)
                {
                    if (shape.HasImage)
                    {
                        String name = "DocumentImage" + "_" + imageIndex.ToString() + ".bmp";
                        shape.ImageData.Save(ImageFilePath + "\\" + name);
                        imageIndex++;
                    }
                }
            }
        }
    }