C# HTML Agility Pack not iterating correctly over node collection

c# collections foreach html-agility-pack

Question

I'm using HTML Agility Pack to fetch URLs from a webpage. The URL is:

http://goo.gl/DqfQl

If I use the code below, I get the links I want:

String html = getHtml("http://goo.gl/DqfQl");

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

doc.LoadHtml(html);

HtmlNodeCollection address_rows = doc.DocumentNode.SelectNodes("//div[@class='name']/a"); 

foreach (HtmlNode row in address_rows)
{
    MessageBox.Show(row.GetAttributeValue("href",LINK_NOT_FOUND));
}

But when I change the HtmlNodeCollection to fetch the containing div with class="row" and then try to fetch the URL from each one, I always get the first URL.

HtmlNodeCollection address_rows = doc.DocumentNode.SelectNodes("//div[@class='row']"); 

foreach (HtmlNode element in address_rows) {
    MessageBox.Show(element.SelectSingleNode("//div[@class='name']/a").GetAttributeValue("href",LINK_NOT_FOUND));
}   

I played with this code a little, and for a while I thought it worked. But now I can't select all the URLs I want using the second code snippet. Can you help?

Accepted Answer

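You have to add a dot "." to the start of the XPath. Otherwise the leading "//" matches from the beginning of the document and not inside the current node, so SelectSingleNode returns the first match in the whole document on every iteration.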

Just change your second string to ".//div[@class='name']/a" and it should work.
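
As a minimal sketch of that change, the loop from the question could be written roughly as below. The class name RowLinks, the method Extract, and the LinkNotFound constant are assumptions made for illustration (LinkNotFound stands in for the question's LINK_NOT_FOUND, whose definition is not shown). The null checks are there because SelectNodes and SelectSingleNode return null when nothing matches.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RowLinks
{
    // Hypothetical placeholder; stands in for the question's LINK_NOT_FOUND.
    private const string LinkNotFound = "LINK_NOT_FOUND";

    // Returns the href of the first name link inside each div.row.
    public static List<string> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // Absolute XPath is fine here: we want every row in the document.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//div[@class='row']");
        if (rows == null)
        {
            return links; // no rows matched
        }

        foreach (HtmlNode row in rows)
        {
            // The leading "." makes the search relative to this row.
            // Without it, "//" starts at the document root again.
            HtmlNode anchor = row.SelectSingleNode(".//div[@class='name']/a");
            if (anchor != null)
            {
                links.Add(anchor.GetAttributeValue("href", LinkNotFound));
            }
        }

        return links;
    }
}
```

Calling RowLinks.Extract(html) with the page's HTML should then give one link per row instead of the first link repeated.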




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow