Read any document (.doc, .rtf, .txt) in ASP.NET with C#

By J S

Here is code that reads a document (such as .doc, .rtf, or .txt) from a
specified location. It is part of a web application and is written in C#
as code-behind for ASP.NET 2.0, where a Word document is hard to upload
from the client side. The code opens the document file, stores its
contents in a string, and then places that string in a textbox.

The first step is to add a COM reference to the project (that is how the
Word application gets defined). In Solution Explorer, right-click
References and choose Add Reference. Click the COM tab and look for the
Microsoft Word 9.0 Object Library. Click Select and then OK.

Next, add the line <identity impersonate="true"/> inside the <system.web>
section of your web.config, like this:

          <system.web>
            <identity impersonate="true"/>
          </system.web>

Then add the following code to the click handler of your Submit button:

protected void Submit1_ServerClick(object sender, EventArgs e)
{
    Word.ApplicationClass wordApp = new Word.ApplicationClass();

    // The input box supplies the path of the file whose contents
    // are loaded into the textbox.
    string filePath = inputbox.Value;
    object file = filePath;
    object nullobj = System.Reflection.Missing.Value;

    // Documents.Open is called here with 12 arguments; the number it
    // expects depends on the version of the Word object library referenced.
    Word.Document doc = wordApp.Documents.Open(ref file,
        ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj,
        ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj,
        ref nullobj);

    // Copy the Word content into a string so that it can be placed
    // in the textbox.
    Word.Document doc1 = wordApp.ActiveDocument;
    string m_Content = doc1.Content.Text;

    // Store the content in the textbox.
    m_Textbox.Text = m_Content;

    doc.Close(ref nullobj, ref nullobj, ref nullobj);
}
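
The handler above never shuts Word down, so each request may leave a WINWORD.EXE process running on the server. A minimal sketch of one way to guard against that is shown here, assuming the same Word 9.0 reference and 12-argument Documents.Open call used in this article. The class name WordTextReader and the method name ReadText are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical helper (not from the original article): reads the plain text
// of a document and always closes Word afterwards.
public static class WordTextReader
{
    public static string ReadText(string filePath)
    {
        object file = filePath;
        object missing = Missing.Value;
        object readOnly = true;
        object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        Word.ApplicationClass wordApp = new Word.ApplicationClass();
        Word.Document doc = null;
        try
        {
            // Assumption: 12-argument overload, as in the article's code.
            // Other library versions expect a different number of arguments.
            // The third argument (ReadOnly) is set to true here.
            doc = wordApp.Documents.Open(ref file,
                ref missing, ref readOnly, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing);

            return doc.Content.Text;
        }
        finally
        {
            if (doc != null)
            {
                // The casts select the Close/Quit methods explicitly rather
                // than the events that share the same names.
                ((Word._Document)doc).Close(ref doNotSave, ref missing, ref missing);
                Marshal.ReleaseComObject(doc);
            }
            ((Word._Application)wordApp).Quit(ref doNotSave, ref missing, ref missing);
            Marshal.ReleaseComObject(wordApp);
        }
    }
}
```

With a helper like this, the body of the click handler could shrink to a single line such as m_Textbox.Text = WordTextReader.ReadText(inputbox.Value);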


Hope this helps,


Cheers,

Jay

Article: Read a word document (.Doc , .rtf , .txt ) from ASP.Net and C#
J S posted at Friday, September 08, 2006 6:55 AM

reply
Error Msg An unhandled exception of type 'System.Runtime.InteropServices.COMException' occurred in file_Management.exe
soma sundaram replied to J S at Wednesday, January 03, 2007 10:45 PM

I have to read .doc files and display their contents in a RichTextBox.

I only open the .doc file for reading. I wrote the code below, but I am getting this error message:

"An unhandled exception of type 'System.Runtime.InteropServices.COMException' occurred in file_Management.exe"

Dim wordApp As Word.ApplicationClass = New ApplicationClass
Dim file As Object = myFileName
Dim Nothingobj As Object = System.Reflection.Missing.Value

doc = wordApp.Documents.Open(file, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj, Nothingobj)

doc.ActiveWindow.Selection.WholeStory()
doc.ActiveWindow.Selection.Copy()

Dim data As IDataObject = Clipboard.GetDataObject()
RichTextBox1.Text = data.GetData(DataFormats.Text).ToString()

doc.Close()

reply
Working perfectly, but without images
Soumya Panda replied to J S at Wednesday, January 03, 2007 10:45 PM
Dear Sir,

This is a very nice post. Using the above code, I can retrieve the Word contents in ASP.NET, but it does not retrieve the images that are present in the Word documents. Any help regarding this would be greatly appreciated.

Thanking you.
reply
Thanks.
s v replied to soma sundaram at Wednesday, January 03, 2007 10:45 PM
Thanks, it's working fine. :)
reply
Hi, I got an error while running the above-mentioned code
Harish Kumar replied to J S at Wednesday, January 03, 2007 10:45 PM
CS1501: No overload for method 'Open' takes '12' arguments

This was the error. How can I resolve it?
Please let me know the possible ways to resolve it as soon as possible.

Thanks in advance,
reply
Not able to read the images from the Word document
SudharshanRaj Talari replied to Soumya Panda at Wednesday, January 03, 2007 10:45 PM
Hi All,

I am facing the same problem. Using the above code, I am able to read the content of the document, but the images are not read from the document.

Does anyone have a solution for this? If so, please help me.
It is very urgent for me.

Thanks in advance,

Sudharshan
reply
How to get rich text from data and view it in Word (C#.NET)?
Mary Tor replied to J S at Wednesday, January 03, 2007 10:45 PM

Hi, I saw your solution, and it looks marvelous if it can do what I need.

I have an original Word document, and I have rich text or HTML text that I need to import from data (Oracle), stored in a field called TextToMerge.

So I need to append the fields that contain text (HTML or rich text) to a Word document.

Is this possible? If yes, could you please send me some sample code, or a link for my study?

Thanks in advance.

Regards.

reply
No format retained
Vishal Gadilkar replied to J S at Wednesday, January 03, 2007 10:45 PM
Hi,
The code works fine to get the text from RTF to txt, but it removes all the formatting from the text, such as alignment, font style, and font size.
Are there any options to retain the formatting from the original RTF?

Thanks, and hoping for a solution.

Vishal Gadilkar,
Pune.
reply
Error while running the application on the web server
Monica Jeevan replied to Vishal Gadilkar at Wednesday, January 03, 2007 10:45 PM

Hi,

I have used the same code in my application to get the content from a Word document and display it in a textbox. The application works fine on my local PC, but when I deploy it to the web server, it throws an error on the following line of my code:

ApplicationClass wordApp = new ApplicationClass();

with the following error message:

Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80070005.

Can anyone tell me what the problem could be and how to fix it?

Thanks in advance.

Monica Jeevan

reply
Reply to Monica
Vishal Gadilkar replied to Monica Jeevan at Wednesday, January 03, 2007 10:45 PM
Hi Monica,
Have you added a reference to the DLL Interop.Office.dll?

Alternatively, simply add the DLL to the bin folder of your project on the web server.

Also make sure that MS Office is installed on the server.

reply
No overload for method 'Open' takes '12' arguments
prabhu r replied to Harish Kumar at Wednesday, January 03, 2007 10:45 PM
Use the following call to fix "No overload for method 'Open' takes '12' arguments":

Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Open(ref file, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj);

You can find the DLL at this link: http://www.microsoft.com/downloads/details.aspx?FamilyId=C41BD61E-3060-4F71-A6B4-01FEBA508E52
reply
Why use clipboard when you can get the text from the file
Omar Hussain replied to soma sundaram at Wednesday, January 03, 2007 10:45 PM
I have not used the older DLLs of Microsoft.Office.Interop.Word, but version 12.0 supports direct access to the contents of the file. So instead of selecting the contents from the active window, copying them to the clipboard, and finally reading them off the clipboard, you can just read the doc.Content.Text property to get the full contents.

// Requires: using System.Reflection; using Microsoft.Office.Interop.Word;
string strPath = @"c:\text.doc";   // verbatim string, so \t is not read as a tab
object objPath = null;
object missing = Missing.Value;
object objTrue = true;
object objFalse = false;
ApplicationClass wordApp = new ApplicationClass();

try
{
    objPath = strPath;
    Document doc = wordApp.Documents.Open(ref objPath, ref objFalse, ref objTrue, ref objFalse,
        ref missing, ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref objFalse, ref missing, ref missing, ref missing, ref missing);

    string newText = doc.Content.Text;
}
finally
{
    // Always close Word, even if opening or reading the document fails.
    wordApp.Application.Quit(ref missing, ref missing, ref missing);
}
reply
abc replied to J S at Wednesday, January 03, 2007 10:45 PM
The code is not working.
It can't read the contents.
reply
Edwin replied to abc at Wednesday, January 03, 2007 10:45 PM
Hi,
The above code is not working with documents that contain tables.

I mean, if a Word document is full of tables, then this code returns just "\r\rr\r\r\r\rr\r\r\r\rr\r\"

Can anyone help me with this?

Thank you
reply
jason rle replied to Edwin at Wednesday, January 03, 2007 10:45 PM
To solve the problem, you can try a C# Word component named Spire.Doc.
http://www.e-iceblue.com/Introduce/word-for-net-introduce.html
reply
rahul replied to J S at Wednesday, January 03, 2007 10:45 PM
Hi, I am getting this error on the server, but it works on localhost.

Server Error in '/' Application.

Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80040154 Class not registered (Exception from HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG)).

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. 

Exception Details: System.Runtime.InteropServices.COMException: Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80040154 Class not registered (Exception from HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG)).

Source Error: 

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

Stack Trace: 

[COMException (0x80040154): Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80040154 Class not registered (Exception from HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG)).]
   cvparsing_c_sharp._Default.Button2_Click(Object sender, EventArgs e) +69
   System.Web.UI.WebControls.Button.OnClick(EventArgs e) +118
   System.Web.UI.WebControls.Button.RaisePostBackEvent(String eventArgument) +112
   System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument) +10
   System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument) +13
   System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData) +36
   System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +5563


Version Information: Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.0.30319.1
reply
rahul replied to prabhu r at Wednesday, January 03, 2007 10:45 PM
Hi

I did exactly that. It works on localhost, but it does not work on the server; it gives the error below.
Please tell me if there is a solution.

Rahul

Server Error in '/' Application.

Method not found: 'Word.Document Word.Documents.Open(System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef)'.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. 

Exception Details: System.MissingMethodException: Method not found: 'Word.Document Word.Documents.Open(System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef)'.

Source Error: 

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

Stack Trace: 

[MissingMethodException: Method not found: 'Word.Document Word.Documents.Open(System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef, System.Object ByRef)'.]
   cvparsing_c_sharp._Default.Button2_Click(Object sender, EventArgs e) +0
   System.Web.UI.WebControls.Button.OnClick(EventArgs e) +111
   System.Web.UI.WebControls.Button.RaisePostBackEvent(String eventArgument) +110
   System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument) +10
   System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument) +13
   System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData) +36
   System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +1565


Version Information: Microsoft .NET Framework Version:2.0.50727.4952; ASP.NET Version:2.0.50727.4955
reply
GotiBandhu replied to rahul at Wednesday, January 03, 2007 10:45 PM
Please check out the following links:
http://www.mindstick.com/Articles/a25ba73f-324d-4926-93b5-89460f77621d/?Create%20Microsoft%20Word%20Document%20by%20using%20C#
http://www.mindstick.com/Articles/5cd1b721-9b94-4ea0-bd6e-2bb157401069/?Read%20Microsoft%20Word%20Document%20File%20by%20using%20C#
reply
Pandey replied to J S at Wednesday, January 03, 2007 10:45 PM
reply
Shashidhar replied to J S at Wednesday, January 03, 2007 10:45 PM
It works nicely, but my Word document has some images. Can you suggest how I can include them on the web page?

Thanks,

Best wishes,

Shashi
reply
How to read a Word file in other encodings?
masoud jabbarvand replied to Shashidhar at Wednesday, January 03, 2007 10:45 PM
Hi,
I need to open a .docx file in C# and save its content. I use the above code. It works properly for English, but it doesn't work for Persian and other languages!
Could you help me?
Thanks.
reply
genet genevi replied to J S at Wednesday, January 03, 2007 10:45 PM
Error 1 The name 'm_Textbox' does not exist in the current context

Error 2 The type or namespace name 'Word' could not be found (are you missing a using directive or an assembly reference?)

Error 4 The name 'inputbox' does not exist in the current context

Error 7 The name 'm_Textbox' does not exist in the current context

I have added a reference to the Microsoft Word 11.0 Object Library.

reply