How can HtmlAgilityPack pull text from an html node with a dynamically added class attribute?

html-agility-pack

Question

Hi all, I want to extract the text "平均3.6 星" ("average 3.6 stars") from the following HTML, taken from amazon.cn.

<div class="content"><ul>
<li><b>用户评分:</b>
<span class="crAvgStars" style="white-space:no-wrap;">
<span class="asinReviewsSummary" ref="dp_db_cm_cr_acr_pop_" name="B004GUSIKO">
<a>
  <span class="swSprite s_star_3_5 " title="平均3.6 星">
  <span>平均3.6 星</span>
  </span>
</a>

My problem is that the second class value on the span, "s_star_3_5 ", changes with the customer rating and is added dynamically. I tried doc.DocumentNode.SelectSingleNode("//span[@class='swSprite']").InnerText and also //span[@class='swSprite s_star_3_5 '], but the result is wrong or not what I wanted.

Any recommendations?

1
1
5/31/2011 2:20:07 AM

Accepted Answer

First, I suggest saving the value of doc.DocumentNode.OuterHtml to a local .html file and checking whether the markup you expect is actually in that file. When you parse a website with HTML Agility Pack, the first problem is often that you are not receiving the HTML you think you are: you might be getting a 404 error page, a redirect, and so on.
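
A minimal sketch of that check, assuming the page is fetched with HtmlWeb. The URL and the output file name are placeholders and do not come from the original post:

```csharp
using System.IO;
using System.Text;
using HtmlAgilityPack;

class DumpHtml
{
    static void Main()
    {
        // Placeholder URL (assumption): substitute the page you are actually scraping.
        string url = "http://example.com/product-page";

        var web = new HtmlWeb();
        HtmlDocument doc = web.Load(url);

        // Write out exactly what the parser received so it can be opened
        // in a browser or editor and compared with the expected markup.
        File.WriteAllText("local.html", doc.DocumentNode.OuterHtml, Encoding.UTF8);
    }
}
```

If the span you are looking for is missing from local.html, the problem is in the download step rather than in the XPath expression.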

I say this because I tested //span[@class='swSprite s_star_3_5 '] and it worked as expected.

That was also the cause of the problem in the following questions:

If that does not help, post the HTML you are actually receiving and I will help you.

2
5/23/2017 12:31:02 PM

Popular Answer

For me, this works:

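HtmlDocument doc = new HtmlDocument();
// myHtml is assumed to be a file path here; for an in-memory HTML string use doc.LoadHtml(myHtml).
doc.Load(myHtml);
// starts-with() matches any class attribute beginning with "swSprite", whatever rating class follows it.
HtmlNode node = doc.DocumentNode.SelectSingleNode("//span[starts-with(@class, 'swSprite')]");
Console.WriteLine("Text=" + node.InnerText.Trim());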

and produces

Text=平均3.6 星

Note that I am using the XPath starts-with function.
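
For context: @class='swSprite' compares the whole attribute value, so it cannot match "swSprite s_star_3_5 ", and starts-with only helps while swSprite is the first class listed. The sketch below is an alternative, not part of the original answer. It uses the common XPath 1.0 idiom for matching a single class token, which should work regardless of order or extra whitespace. The inline HTML string is a cut-down version of the snippet in the question, and the class name ClassTokenMatch is made up for illustration:

```csharp
using System;
using HtmlAgilityPack;

class ClassTokenMatch
{
    static void Main()
    {
        // Cut-down copy of the markup from the question (assumption: the real page is larger).
        string html =
            "<span class=\"crAvgStars\"><a>" +
            "<span class=\"swSprite s_star_3_5 \" title=\"平均3.6 星\">" +
            "<span>平均3.6 星</span></span></a></span>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // LoadHtml parses a string; Load reads from a file or stream.

        // Pad the normalized class list with spaces, then look for " swSprite "
        // so that only the whole token matches, wherever it sits in the list.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(
            "//span[contains(concat(' ', normalize-space(@class), ' '), ' swSprite ')]");

        // SelectSingleNode returns null when nothing matches, so guard before use.
        if (node != null)
            Console.WriteLine("Text=" + node.InnerText.Trim());
        else
            Console.WriteLine("No matching span found.");
    }
}
```

The extra padding avoids false matches on class names that merely contain the text, such as a hypothetical "swSpriteLarge".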



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow