Unable to read data from the transport connection: C# HtmlAgilityPack

c# html-agility-pack inner-exception web-scraping


I'm writing a program (for my own use) in C# with HtmlAgilityPack that, at a certain point, loads a webpage. After loading many pages, I get this error:

Unhandled Exception: System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 s
   --- End of inner exception stack trace ---
   at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.StreamReader.ReadBuffer()
   at System.IO.StreamReader.ReadToEnd()
   at HtmlAgilityPack.HtmlDocument.Load(TextReader reader) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 612
   at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1422
   at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1479
   at HtmlAgilityPack.HtmlWeb.Load(String url, String method) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1103
   at HtmlAgilityPack.HtmlWeb.Load(String url) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1061
   at ConsoleApplication1.Program.Main(String[] args) in 
s:line 37

Line 37 is where I load a page inside a for loop:

for (var i = 0; i < 5000; i++)
{
    var page = web.Load(url + Convert.ToString(i + 1) + "/");
}

I have tried to research the error, but there wasn't much information out there.

5/3/2014 11:30:03 PM

Popular Answer

I got the same error after downloading 1000+ webpages. I solved it by adding an extra catch for IOException inside the loop. Here is my code:

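HtmlWeb web = new HtmlWeb();
// Set a 15-second timeout on every request before it is sent.
web.PreRequest = delegate(HttpWebRequest webRequest)
{
    webRequest.Timeout = 15000;
    return true;
};

try { doc = web.Load(yUrl); }
catch (WebException ex)
{
    // Only report the WebException once the retry counter reaches 19.
    if (reTryCounter == 19) { MessageBox.Show("Error Program 1121 , Download webpage \n" + ex.ToString()); }
}
catch (IOException ex2)
{
    // The connection was closed while reading: report it and give up on this page.
    MessageBox.Show("Error Program 1125 , IOException Download webpage \n" + ex2.ToString());
    return null;
}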
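
As a rough sketch of the same idea, the catch blocks above could be wrapped in a small retry helper that waits a little longer after each failure before trying again. The Retry class, its WithBackoff method, and the attempt count and delay values are hypothetical names and numbers chosen for illustration; they are not part of HtmlAgilityPack. It also assumes the failure is transient (for example, the server dropping connections under load), which may not hold for every site.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class Retry
{
    // Runs action and returns its result. If it throws IOException or
    // WebException, waits and tries again, doubling the delay each time.
    // After maxAttempts failed attempts the last exception propagates.
    public static T WithBackoff<T>(Func<T> action, int maxAttempts, TimeSpan initialDelay)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when ((ex is IOException || ex is WebException)
                                       && attempt < maxAttempts)
            {
                // Back off before the next attempt.
                Thread.Sleep(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}

// Possible use inside the question's loop (web and url as in the question):
//   var page = Retry.WithBackoff(() => web.Load(url + (i + 1) + "/"),
//                                5, TimeSpan.FromSeconds(1));
```

Exception filters (the when clause) need C# 6 or later; on older compilers the same check can be done inside the catch block with a rethrow. Pausing between requests, even when they succeed, may also reduce how often the remote host closes the connection.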
4/23/2014 3:06:23 PM

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow