How to get only the inner text, without the child tags, using HtmlAgilityPack?

c# html-agility-pack

Question

I have an HTML page like the one below. I need to get only the 'blah blah blah' text from the 'span' tag.

<span class="news">
blah blah blah
<div>hello</div>
<div>bye</div> 
</span>

This gives me the text of the child 'div' tags as well:

div.SelectSingleNode(".//span[@class='news']").InnerText.Trim();

This gives me null:

div.SelectSingleNode(".//span[@class='news']/preceding-sibling::text()").InnerText.Trim();

How do I get the text before the 'div' tag using HtmlAgilityPack?

Accepted Answer

Your second try was pretty close. Use /text() instead of /preceding-sibling::text(), because the text node is a child of span[@class='news'], not a sibling (neither preceding nor following):

div.SelectSingleNode(".//span[@class='news']/text()")
   .InnerText
   .Trim();
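
For completeness, here is a minimal, self-contained sketch of the same idea. It assumes the HtmlAgilityPack NuGet package is referenced, and the inline HTML string, the `Program` class, and the variable names are made up for illustration. It also guards against SelectSingleNode returning null when nothing matches, and shows one possible way to collect every direct text child in case the span holds text in more than one place:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup mirroring the question.
        var html = "<span class=\"news\">\nblah blah blah\n<div>hello</div>\n<div>bye</div>\n</span>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var span = doc.DocumentNode.SelectSingleNode("//span[@class='news']");

        // SelectSingleNode returns null when no node matches, so use
        // null-conditional access instead of calling InnerText directly.
        var firstText = span?.SelectSingleNode("./text()")?.InnerText.Trim();
        Console.WriteLine(firstText ?? "(no text node found)");

        // Alternative: gather all direct text children of the span,
        // dropping the whitespace-only nodes between the child divs.
        var ownText = span == null
            ? string.Empty
            : string.Join(" ", span.ChildNodes
                .Where(n => n.NodeType == HtmlNodeType.Text)
                .Select(n => n.InnerText.Trim())
                .Where(t => t.Length > 0));
        Console.WriteLine(ownText);
    }
}
```

The first approach matches the answer above and picks only the first text node. The second is useful if text might also appear between or after the child tags.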



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow