C# .NET - Screen scraping for asp page from asp.net

Asked By John M
01-Jun-07 06:18 AM

I am using the following code to scrape results from another website, which requires a userName and DateofBirth to validate and then returns the results. The code works fine on my local PC, but on the production server it fails with the error: The remote server returned an error: (403) Forbidden.

Please suggest some ways to resolve this.

 

   // Requires: using System.IO; using System.Net; using System.Text;
   string postData = "Pass=1&UserName=Bob&Dob=03081982&submit1=Submit";
   try
   {
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.xxx.org/Login.asp");

    request.Method = "POST";
    //request.Accept = "*/*";
    request.ContentType = "application/x-www-form-urlencoded";
    request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";
    CookieContainer cookies = new CookieContainer();
    request.CookieContainer = cookies;

    // Take ContentLength from the encoded byte count, not the string length.
    byte[] b = Encoding.ASCII.GetBytes(postData);
    request.ContentLength = b.Length;

    // Write the form body and close the request stream before reading the response.
    using (Stream s = request.GetRequestStream())
    {
     s.Write(b, 0, b.Length);
    }

    Encoding enc = Encoding.ASCII;
    using (WebResponse response = request.GetResponse())
    using (StreamReader responseStream = new StreamReader(response.GetResponseStream(), enc))
    {
     //StreamReader responseStream = new StreamReader(response.GetResponseStream(), enc, true);
     string responseHtml = responseStream.ReadToEnd();
     int resindex = responseHtml.IndexOf("Registration Number:");
     litHTMLfromScrapedPage.Text = responseHtml;
    }
   }
   catch (Exception ex)
   {
    litHTMLfromScrapedPage.Text = ex.Source + " " + ex.Message;
   }

If you are getting back a 403

01-Jun-07 09:58 AM

A 403 usually means one of three things: you aren't logged in (authenticated); the server is set up to disallow anything except requests whose Referer header names the page that is "supposed" to be making the request; or you aren't transmitting a valid User-Agent request header string.
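
As a rough illustration of the Referer and User-Agent points, the request could be built along these lines. This is only a sketch: the class and method names (LoginScraper, BuildLoginRequest) are made up for the example, and using the login page itself as the Referer is an assumption, since the thread does not say which page the remote site expects.

```csharp
using System;
using System.Net;

public static class LoginScraper
{
    // Hypothetical helper: prepares the POST request but does not send it.
    // Which Referer the remote site actually checks for is an assumption.
    public static HttpWebRequest BuildLoginRequest(string url, string referer, CookieContainer cookies)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // Some servers reject requests whose Referer is missing or unexpected.
        request.Referer = referer;

        // Send a browser-like User-Agent (same string as in the question).
        request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";

        // Reuse one container across requests so a session cookie set by
        // an earlier page is sent back with the login POST.
        request.CookieContainer = cookies;
        return request;
    }
}
```

The returned request can then be used exactly as in the question: set ContentLength, write the form data to GetRequestStream(), and call GetResponse(). If the site hands out a session cookie on its login form, it may help to GET that page first with the same CookieContainer before posting.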

 

Getting 403 error

03-Jun-07 10:14 PM
How should I handle this in code to avoid getting the 403?
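
One place to start is finding out what the remote server actually says when it refuses the request, since the catch block in the question only shows ex.Message. The sketch below reads the status code and the body of the error page from the WebException. The names ForbiddenDiagnostics and Describe are hypothetical, and whether the body contains a useful explanation depends on the remote server.

```csharp
using System;
using System.IO;
using System.Net;

public static class ForbiddenDiagnostics
{
    // Hypothetical helper: summarises a WebException as status code,
    // status description and whatever body the server sent back.
    public static string Describe(WebException ex)
    {
        HttpWebResponse http = ex.Response as HttpWebResponse;
        if (http == null)
        {
            // No HTTP response was received (DNS failure, timeout, refused connection, ...).
            return "No HTTP response: " + ex.Status + " " + ex.Message;
        }

        string body = "";
        using (Stream stream = http.GetResponseStream())
        {
            if (stream != null)
            {
                using (StreamReader reader = new StreamReader(stream))
                {
                    body = reader.ReadToEnd();
                }
            }
        }

        // For a 403 this starts with "403" followed by the server's description.
        return (int)http.StatusCode + " " + http.StatusDescription + Environment.NewLine + body;
    }
}
```

To use it, a catch (WebException ex) block placed before the general catch (Exception ex) could assign ForbiddenDiagnostics.Describe(ex) to litHTMLfromScrapedPage.Text. Comparing that output between the local PC and the production server may show which of the causes mentioned above applies.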