How to get the innertext alone without the child tags using HtmlAgilityPack?

c# html-agility-pack

Question

I have an HTML page like the one below. I need to extract only the 'blah blah blah' text from the 'span' tag, without the text of its child tags.

<span class="news">
blah blah blah
<div>hello</div>
<div>bye</div> 
</span>

This gives me all values:

div.SelectSingleNode(".//span[@class='news']").InnerText.Trim();

This gives me null:

div.SelectSingleNode(".//span[@class='news']/preceding-sibling::text()").InnerText.Trim();

How do I get the text before the 'div' tag using HtmlAgilityPack?

10/18/2014 10:32:29 AM

Accepted Answer

Your second attempt was pretty close. Use /text() instead of /preceding-sibling::text(), because the text node is a child of span[@class='news'], not a sibling of it (neither preceding nor following):

div.SelectSingleNode(".//span[@class='news']/text()")
   .InnerText
   .Trim();
10/18/2014 10:35:59 AM
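
For completeness, here is a minimal, self-contained sketch of the /text() approach applied to the markup from the question. The Program class, the variable names, and the inline HTML string are illustrative assumptions, not part of the original answer. The null checks are included because SelectSingleNode and SelectNodes return null when nothing matches.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Sample markup copied from the question (inlined here for illustration).
        const string html = @"<span class=""news"">
blah blah blah
<div>hello</div>
<div>bye</div>
</span>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // text() matches only text nodes that are direct children of the span,
        // so the text inside the nested <div> elements is not included.
        var textNode = doc.DocumentNode
            .SelectSingleNode("//span[@class='news']/text()");

        // SelectSingleNode returns null when nothing matches; guard before
        // reading InnerText to avoid a NullReferenceException.
        string firstText = textNode?.InnerText.Trim() ?? string.Empty;
        Console.WriteLine(firstText); // should print: blah blah blah

        // If the span had direct text in several places (for example, between
        // the <div> elements), SelectNodes would collect all of it.
        // It also returns null, not an empty collection, when nothing matches.
        var textNodes = doc.DocumentNode
            .SelectNodes("//span[@class='news']/text()");

        var directText = textNodes == null
            ? string.Empty
            : string.Join(" ", textNodes
                .Select(n => n.InnerText.Trim())
                .Where(t => t.Length > 0)); // skip whitespace-only nodes

        Console.WriteLine(directText);
    }
}
```

Note that SelectSingleNode only returns the first matching text node. The whitespace between the child tags also forms text nodes of its own, which is why the second variant filters out empty strings after trimming.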



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow