Error when try to load html with htmlagiltypack

.net c# html-agility-pack

Question

I am trying to run this code

string path = "http://warisons.rssing.com/chan1729325/all_p43.html";
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(path);
var div = htmlDoc.DocumentNode.Descendants("div");
foreach (var x in div)
{
    Console.WriteLine(x.Attributes["class"].Value);
}

when I debug this code in htmlDoc.LoadHtml(path); I got this error

Locating source for 'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'. Checksum: MD5 {4e 14 d3 b d5 30 6e 2c bf 84 ab 8a 96 82 4a 8f} The file 'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs' does not exist. Looking in script documents for 'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'... Looking in the projects for 'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'. The file was not found in a project. Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\crt\src\'... Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\crt\src\vccorlib\'... Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\atlmfc\src\mfc\'... Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\atlmfc\src\atl\'... Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\atlmfc\include'... The debug source files settings for the active solution indicate that the debugger will not ask the user to find the file: d:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs. The debugger could not locate the source file 'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'.

Accepted Answer

Your attempt to load html document from URI is incorrect.

Methof HtmlDocument.LoadHtml loads html from string provided, so its argument is html text itself, not URI.

To load html from provided URI you need something like:

string path = "http://warisons.rssing.com/chan1729325/all_p43.html";
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlWeb().Load(path);

Also note you can get NullReferenceException here:

x.Attributes["class"].Value

since you're not checking if there is class attribute (x.Attributes["class"] != null) before accessing its value.




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why