Automating Form POSTs with Script and IE

By Peter A. Bromberg, Ph.D.


The InternetExplorer.Application COM Automation interface can be very handy for doing things with script from the client side, especially when we can't ensure that other HTTP components are available. Once you get the hang of this technique, it becomes possible to submit forms and get HTML-based information such as stock quotes not only inside your web pages, but also from Excel, Word and any other script-compliant program, including Visual Basic itself.



The InternetExplorer.Application scripting interface has a number of properties and methods, but the two most important ones to know are the Navigate method, and the Document property. Let's take a look at the MSDN info for Navigate:

Navigate Method
Navigates to a resource identified by a URL or to the file identified by a full path.

Syntax:
object.Navigate( _
url As String, _
[Flags As Variant,] _
[TargetFrameName As Variant,] _
[PostData As Variant,] _
[Headers As Variant])

Parameters:

url
Required. A String expression that evaluates to the URL, full path, or Universal Naming Convention (UNC) location and name of the resource to display.
Flags
Optional. A constant or value that specifies whether to add the resource to the history list, whether to read from or write to the cache, and whether to display the resource in a new window. The variable can be a combination of the values defined by the BrowserNavConstants enumeration.
TargetFrameName
Optional. String expression that evaluates to the name of an HTML frame in URL to display in the browser window. The possible values for this parameter are:

_BLANK
Load the link into a new unnamed window.
_PARENT
Load the link into the immediate parent of the document the link is in.
_SELF
Load the link into the same window the link was clicked in.
_TOP
Load the link into the full body of the current window.
<WINDOW_NAME>
A named HTML frame. If no frame or window exists that matches the specified target name, a new window is opened for the specified link.

PostData
Optional. Data to send to the server during the HTTP POST transaction. For example, the POST transaction is used to send data gathered by an HTML form to a program or script. If this parameter does not specify any post data, the Navigate method issues an HTTP GET transaction. This parameter is ignored if URL is not an HTTP URL.
Headers
Optional. A value that specifies additional HTTP headers to send to the server. These headers are added to the default Microsoft® Internet Explorer headers. The headers can specify things like the action required of the server, the type of data being passed to the server, or a status code. This parameter is ignored if URL is not an HTTP URL.

Remarks:

The WebBrowser control or InternetExplorer object can browse to any location in the local file system, on the network, or on the World Wide Web. Use the Navigate method to tell the browser which location to browse to. By including a text box in your application, you can let the user specify the location to browse to and then pass the location to the Navigate method.

Well, that was a mouthful! One of the most important items above is where it talks about HEADERS. From this you could infer that it's possible to add a header like so:

vHeaders = "Content-Type: application/x-www-form-urlencoded" & chr(13) & chr(10)

and, according to the documentation, as long as something is sent in the PostData parameter (e.g., strPostData = "name=John&address=123 High St"), you will get a form post. Hurray! Our form posting problem is solved, right? NOT!
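As a side note, here is a minimal sketch of how such a PostData string and header could be assembled, written in plain JavaScript that should also run as JScript. The helper name buildPostBody is hypothetical and not part of any IE API. Note that encodeURIComponent encodes a space as %20 rather than the + that browsers send for form posts; servers generally accept both.

```javascript
// Hypothetical helper (not part of any IE API): turn a plain object of
// form fields into an application/x-www-form-urlencoded body.
function buildPostBody(fields) {
  var pairs = [];
  for (var name in fields) {
    if (Object.prototype.hasOwnProperty.call(fields, name)) {
      pairs.push(encodeURIComponent(name) + "=" +
                 encodeURIComponent(String(fields[name])));
    }
  }
  return pairs.join("&");
}

// HTTP header lines end with CR LF, i.e. chr(13) & chr(10) in VBScript.
var formHeaders = "Content-Type: application/x-www-form-urlencoded\r\n";

var postBody = buildPostBody({ name: "John", address: "123 High St" });
// postBody is "name=John&address=123%20High%20St"
```

Encoding each name and value separately keeps characters such as & and = inside a value from being read as field separators.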

It will work on some sites, but on many others you'll get an error. Apparently, it's just not a fully RFC-compliant POST. So there has to be another way. Remember I mentioned the two most important methods or properties? The other one was the Document property, and fortunately, this is the same Document object that you (hopefully) are so familiar with from all the cool VBScript and JavaScript stuff you've already been doing! So you can access its BODY element, its innerHTML property, and its other members. In other words, why don't we just load a blank document, dynamically BUILD our own FORM, and then just call its SUBMIT method? So now let's take a look at some real code that does work:

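<HTML>
<HEAD>
<TITLE>Automating Form Post with IE</TITLE>
<script language=JScript></script>
<script language="VBScript">
Function doIEPost(formdata)
    ' Start a separate instance of Internet Explorer (invisible by default).
    Dim IE: Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate "about:blank" '<-- You have to do this first!!!!
    ' Wait until the blank document has loaded (4 = READYSTATE_COMPLETE).
    Do While IE.Busy Or IE.ReadyState <> 4
    Loop
    ' Build the form as an HTML string; escape() keeps quotes and spaces out of the value.
    IE.Document.body.innerHTML = "<form name=frm1 id=frm1 action=""http://localhost/getform.asp"" method=POST>" & _
        "<input type=text name=data1 value=""" & escape(formdata) & """><input type=submit></form>"
    IE.Document.frm1.submit()
    ' Wait for the response page to finish loading.
    Do While IE.Busy Or IE.ReadyState <> 4
    Loop
    ' The value was escape()d before posting, so unescape the echoed page.
    result.innerHTML = unescape(IE.Document.body.innerHTML)
    IE.Quit
    Set IE = Nothing
End Function
</script>
</HEAD>
<BODY>
<input type=text id=frmData>Enter Data to Post<BR>
<input type=button onclick="doIEPost(frmData.value)" value="DO POST"><BR>
<div id=result align="center"></div>
</BODY>
</HTML>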

And here is the receiving "test page":

<%
' GetForm.asp
Response.write "You sent: " & Request.Form("Data1")
%>

So what we are doing here is:

1) Create an instance of InternetExplorer.Application.
2) Fill it with a blank HTML DOM via Navigate "about:blank".
3) Add our form HTML, complete with all form fields and their data, as a string to the innerHTML property of IE.Document.body.
4) Call the submit() method on our new FORM.
5) Loop while IE is busy to give it time to receive the new "page" (you can also check ReadyState, etc.).
6) Read the result page of the FORM submit from the new Document.
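Step 3 builds the form by string concatenation, which gets awkward once there are several fields or values that contain quotes. A minimal sketch of a helper that generates the markup is shown here in plain JavaScript, which should also run as JScript. The names escapeAttr and buildFormHtml are hypothetical and exist only for illustration.

```javascript
// Hypothetical helpers for illustration; the names are not from the article.

// Escape the characters that could break out of a quoted HTML attribute.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the markup for a POST form with one text input per field.
function buildFormHtml(formName, action, fields) {
  var html = '<form name="' + escapeAttr(formName) +
             '" id="' + escapeAttr(formName) +
             '" action="' + escapeAttr(action) + '" method="POST">';
  for (var name in fields) {
    if (Object.prototype.hasOwnProperty.call(fields, name)) {
      html += '<input type="text" name="' + escapeAttr(name) +
              '" value="' + escapeAttr(fields[name]) + '">';
    }
  }
  return html + "</form>";
}

var formHtml = buildFormHtml("frm1", "http://localhost/getform.asp",
                             { data1: 'say "hi" & bye' });
```

The resulting string would be assigned to the blank document's body.innerHTML. Because the values are HTML-escaped rather than passed through escape(), the receiving page should see the original text, so the unescape() call on the result would presumably no longer be needed.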

Another technique, which is simply a logical extension of this one, is to first Navigate to an actual FORM page on the Internet, then "fill out" that form programmatically via the DOM, and then call the submit method on the actual public form page that is sitting in your invisible IE browser window. You will then of course have an IE object holding whatever result page normally comes up in the browser after you've pressed the submit button on that public page's form, and you can manipulate it or "screenscrape" it to your heart's content.
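A rough sketch of that fill-and-submit step, in plain JavaScript that should also run as JScript, might look like the following. The helper fillAndSubmit, the form and field names, and the URL in the comments are all made up for illustration. It assumes the page exposes its form by name through the standard forms and elements collections.

```javascript
// Hypothetical helper: set named fields on an existing form, then submit it.
// `doc` is expected to be a DOM document, such as the IE object's Document.
function fillAndSubmit(doc, formName, values) {
  var form = doc.forms[formName];
  if (!form) {
    throw new Error("No form named '" + formName + "' on this page");
  }
  for (var field in values) {
    if (Object.prototype.hasOwnProperty.call(values, field)) {
      var input = form.elements[field];
      if (!input) {
        throw new Error("No field named '" + field + "' in form '" + formName + "'");
      }
      input.value = values[field];
    }
  }
  form.submit();
}

// Intended use from JScript, once the page has finished loading
// (URL, form name and field name are invented for this sketch):
//   var ie = new ActiveXObject("InternetExplorer.Application");
//   ie.Navigate("http://localhost/publicform.htm");
//   ... wait until ie.Busy is false ...
//   fillAndSubmit(ie.Document, "quoteForm", { symbol: "MSFT" });
```

Failing loudly on a missing form or field is a deliberate choice here: public pages change their markup without notice, and a clear error is easier to diagnose than a submit that silently posts empty data.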

You may say, "Well, OK, that's cool, but I can just create a form with JavaScript and call its submit method the same way." You can do that. But unless you do it from a hidden frame or find another way to retrieve the return data, your users will most likely end up looking at a brand new page in their browser.

There are a number of other things you can do with the WebBrowser (IE) automation object. This short article is just the "tip of the iceberg". You can make it visible, control its features, even use it to navigate the file system! Have fun, and, er, don't get too ambitious!

Peter Bromberg is an independent consultant specializing in distributed .NET solutions in Orlando and a co-developer of the EggheadCafe.com developer website. He can be reached at pbromberg@yahoo.com

 