HtmlAgilityPack does not pick childNodes as intended.

.net asp.net c# html-agility-pack xpath

Question

I'm trying to parse certain links on a page with the HTML Agility Pack library, but I'm not getting the results I expected. The code below selects an HtmlNodeCollection of links. For each link, I want to check whether an image node exists before parsing its attributes, but the SelectNodes and SelectSingleNode methods of linkNode seem to search the whole parent document rather than the child nodes of linkNode. Why is this?

HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(content);
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");

foreach(HtmlNode linkNode in linkNodes)
{
    string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
    if (linkTitle == string.Empty)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");     
    }
}

If there is an alternative approach, how can I get the alt attribute of the image child node of linkNode?

5/13/2009 10:41:21 AM

Accepted Answer

The leading forward slash in "/img[@alt]" makes the path absolute, so it is evaluated from the root of the document instead of from linkNode. Remove it:

HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
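
To tie this back to the question, here is a minimal sketch of the whole loop with the relative path in place. The class name LinkTitles and the method Extract are made up for illustration and are not part of HtmlAgilityPack. It also assumes the default behavior I'm aware of, where SelectNodes returns null rather than an empty collection when nothing matches, hence the null checks.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkTitles
{
    // Hypothetical helper: one entry per link, using the title attribute
    // or, if that is missing, the alt text of a direct child image.
    public static List<string> Extract(string content)
    {
        var titles = new List<string>();
        var htmldoc = new HtmlDocument();
        htmldoc.LoadHtml(content);

        // Assumption: with default options SelectNodes yields null when no node matches.
        HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return titles;
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
            if (linkTitle == string.Empty)
            {
                // Relative path: evaluated against the children of linkNode.
                HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
                if (imageNode != null)
                {
                    linkTitle = imageNode.GetAttributeValue("alt", string.Empty);
                }
            }
            titles.Add(linkTitle);
        }
        return titles;
    }
}
```

A link with neither a title nor a child image ends up as an empty string here; whether to skip such links instead is a design choice the question leaves open.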
5/13/2009 10:46:27 AM

Popular Answer

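You can also start the XPath query with "." to make it explicit that the search begins at the current node. Note that ".//" matches img elements at any depth below linkNode, not only its direct children.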

HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
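
If it helps to see the different path prefixes side by side, here is a small sketch. The helper names AltOf and FirstLink and the sample markup are invented for this illustration; only SelectSingleNode, GetAttributeValue, LoadHtml and DocumentNode come from the library.

```csharp
using HtmlAgilityPack;

public static class XPathContextDemo
{
    // Hypothetical helper: alt text of the first node matching xpath,
    // evaluated with 'context' as the current node; null if nothing matches.
    public static string AltOf(HtmlNode context, string xpath)
    {
        HtmlNode match = context.SelectSingleNode(xpath);
        return match == null ? null : match.GetAttributeValue("alt", string.Empty);
    }

    // Hypothetical helper: parses the markup and returns its first link node.
    public static HtmlNode FirstLink(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode("//a[@href]");
    }
}
```

With the sample markup <div><img alt='outside'></div><a href='/x'><span><img alt='inside'></span></a> and link = FirstLink(markup), I would expect:

AltOf(link, "img[@alt]") to be null, because the image is not a direct child of the link
AltOf(link, ".//img[@alt]") to be "inside"
AltOf(link, "//img[@alt]") to be "outside", because the path is absolute and the first image in the document wins
AltOf(link, "/img[@alt]") to be null, because no img sits directly under the document root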


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow