Getting text from an HTML document using HtmlAgilityPack via XPath

c# html-agility-pack xpath

Question

I have the following HTML in a file, and I am loading this file into an HtmlDocument using HtmlAgilityPack.

The problem is that I want to get only Hello world! using XPath, not the full inner text of the li, which also includes the text of the nested list.

How do I achieve this?

<ul>
    <li>
        Hello world!
        <ul>
            <li>
                Welcome to planet!
            </li>
        </ul>
    </li>
</ul>

Accepted Answer

The XPath:

//ul/li[1]/text()

should select the actual text "Hello world!"

You can then read the text of this node.

In use:

string text = doc.DocumentNode.SelectSingleNode("//ul/li[1]/text()").InnerText.Trim();

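In essence, this says: navigate to a ul node, select its first li, and then select the text() nodes that are direct children of that li. Text inside the nested ul belongs to other nodes, so it is not included.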
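
The XPath approach above can be sketched as a small console program. This is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the markup is inlined rather than loaded from a file, and the class and variable names are illustrative only.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Same markup as in the question, inlined here instead of read from a file.
        const string html = @"<ul>
    <li>
        Hello world!
        <ul>
            <li>
                Welcome to planet!
            </li>
        </ul>
    </li>
</ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // text() matches only text nodes that are direct children of the li,
        // so the nested list's text is not part of the selected node.
        HtmlNode textNode = doc.DocumentNode.SelectSingleNode("//ul/li[1]/text()");

        // SelectSingleNode returns null when nothing matches, hence the ?. guard.
        string text = textNode?.InnerText.Trim();

        Console.WriteLine(text);
    }
}
```

Trim() is there because the text node also carries the newline and indentation that surround the text in the source markup.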


Popular Answer

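htmlDocument.DocumentNode.SelectNodes("//ul/li").First().FirstChild.InnerText.Trim();

will return Hello world! Note that First() requires using System.Linq, and Trim() removes the whitespace that surrounds the text in the source markup.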
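
Relying on FirstChild assumes the text comes before any child element of the li. A slightly more defensive variant, sketched here under the same assumption that the HtmlAgilityPack package is referenced, filters the li's direct children by node type instead. The markup is a condensed, inlined version of the question's HTML and the names are illustrative.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<ul><li>Hello world!<ul><li>Welcome to planet!</li></ul></li></ul>");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes("//ul/li");
        HtmlNode firstItem = items?.FirstOrDefault();

        // Keep only the li's direct text-node children, skipping the nested ul element.
        string text = firstItem == null
            ? null
            : string.Concat(firstItem.ChildNodes
                  .Where(n => n.NodeType == HtmlNodeType.Text)
                  .Select(n => n.InnerText))
              .Trim();

        Console.WriteLine(text);
    }
}
```

The null checks matter because a document without a matching li would otherwise throw a NullReferenceException.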




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow