Problems getting childNodes using HTMLAgilityPack and XPath

c# html-agility-pack xpath

Question

I am trying to parse the following HTML. I need to get the inner text of all links listed under an h4 tag whose text is "Title".

<h4>Title</h4>
    <ul>
         <li>
             <a>One</a>
         </li>
         <li>
             <a>Two</a>
         </li>
         <li>
             <a>Three</a>
         </li>
    </ul>

I can get the h4 element ok using the following code:

var links = document.DocumentNode.SelectNodes("//h4[contains(text(),'Title')]");

The problem comes when I try to get the a nodes. I have tried the following code, but it does not work:

var urls = member.SelectNodes(".//a");

foreach (var url in urls)
{
    Console.WriteLine(url.InnerText);
}

Accepted Answer

From what I can gather, it's not working because the XPath you're using (.//a) expects the a nodes to be descendants of your h4 node, whereas in your HTML they sit inside the ul that follows the h4. I've not tested this, and I may be misinterpreting your requirements, but you could try:

var links = document.DocumentNode.SelectNodes("//h4[contains(text(),'Title')]/following-sibling::*[1]//a");

This would get all of the a nodes found inside the first element sibling that follows the h4 node. In your example HTML that sibling is the ul, so it should return every a node within the ul.
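
For completeness, here is a minimal sketch of how that XPath could be wired into a small program. It assumes the HtmlAgilityPack NuGet package is referenced; the LinkExtractor class and GetLinkTexts method are hypothetical names used only for illustration, and the HTML is the sample from the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class LinkExtractor
{
    // Hypothetical helper (not part of the original post): collects the inner
    // text of every a node inside the first element sibling that follows an
    // h4 whose text contains "Title".
    public static List<string> GetLinkTexts(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var links = document.DocumentNode.SelectNodes(
            "//h4[contains(text(),'Title')]/following-sibling::*[1]//a");

        var texts = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        if (links == null)
        {
            return texts;
        }

        foreach (var link in links)
        {
            texts.Add(link.InnerText.Trim());
        }

        return texts;
    }

    public static void Main()
    {
        // Sample markup copied from the question.
        const string html = @"<h4>Title</h4>
    <ul>
         <li>
             <a>One</a>
         </li>
         <li>
             <a>Two</a>
         </li>
         <li>
             <a>Three</a>
         </li>
    </ul>";

        foreach (var text in GetLinkTexts(html))
        {
            Console.WriteLine(text);
        }
    }
}
```

With the sample HTML this should print One, Two and Three on separate lines. If you would rather keep your loop over the h4 nodes, the same idea should work as a relative query from each h4, for example member.SelectNodes("following-sibling::*[1]//a"), again checking the result for null before iterating.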

Hope this helps




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow