I'm writing a program (for personal use) in C# with HtmlAgilityPack that at some point loads web pages. After loading a lot of pages, I get this error:
Unhandled Exception: System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.StreamReader.ReadBuffer()
   at System.IO.StreamReader.ReadToEnd()
   at HtmlAgilityPack.HtmlDocument.Load(TextReader reader) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 612
   at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1422
   at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1479
   at HtmlAgilityPack.HtmlWeb.Load(String url, String method) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1103
   at HtmlAgilityPack.HtmlWeb.Load(String url) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1061
   at ConsoleApplication1.Program.Main(String[] args) in c:\Users\...ConsoleApplication1\Program.cs:line 37
Line 37 is where I load a page inside a for loop:
for (var i = 0; i < 5000; i++)
{
    var page = web.Load(url + Convert.ToString(i + 1) + "/");
}
I tried searching for the error, but there wasn't much information out there.
I got the same error after downloading more than 1000 web pages. I fixed it by adding an extra catch for IOException inside the loop. Here is my code:
HtmlWeb web = new HtmlWeb();
// Give each request at most 15 seconds before it times out.
web.PreRequest = delegate(HttpWebRequest webRequest)
{
    webRequest.Timeout = 15000;
    return true;
};
try { doc = web.Load(yUrl); }
catch (WebException ex)
{
    // Count the failure; only show a message once the counter reaches 19.
    reTryCounter++;
    if (reTryCounter == 19) { MessageBox.Show("Error Program 1121 , Download webpage \n" + ex.ToString()); }
}
catch (IOException ex2)
{
    // The remote host closed the connection mid-read: report it and skip this page.
    MessageBox.Show("Error Program 1125 , IOException Download webpage \n" + ex2.ToString());
    return null;
}
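If you would rather retry the page than skip it, the same idea can be wrapped in a small helper. This is only a rough sketch: `Retry` and `WithRetries` are names I made up (they are not part of HtmlAgilityPack), and the attempt count and delay are arbitrary values you would need to tune. It retries on the same two exception types caught above and lets the last failure propagate.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

public static class Retry
{
    // Hypothetical helper: runs `action` and retries it when the
    // connection drops (IOException) or the request fails (WebException).
    public static T WithRetries<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            // The filter is false on the last attempt, so that failure
            // is rethrown to the caller instead of being swallowed.
            catch (Exception ex) when ((ex is IOException || ex is WebException)
                                       && attempt < maxAttempts)
            {
                // Wait a little longer after each failed attempt.
                Thread.Sleep(TimeSpan.FromMilliseconds(delay.TotalMilliseconds * attempt));
            }
        }
    }
}
```

In the loop from the question it might be used like this: `var page = Retry.WithRetries(() => web.Load(url + Convert.ToString(i + 1) + "/"), 5, TimeSpan.FromSeconds(2));`. The growing delay is a guess on my part that the server may be dropping connections because requests arrive too quickly; if that is not the cause, a fixed delay would work just as well.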