How to get the inner text for a single node using HtmlAgilityPack

c# html-agility-pack

Question

The HTML I use is as follows:

        <div id="footer">
            <div id="footertext">
                <p> 
                    Copyright &copy; FUCHS Online Ltd, 2013. All Rights Reserved.
                </p>
             </div>
        </div>

In my C# code, I want to extract the following text from the markup: "Copyright © FUCHS Online Ltd, 2013. All Rights Reserved."

I have tried the following:

    public string getvalue()
    {
        HtmlWeb web = new HtmlWeb();
        HtmlAgilityPack.HtmlDocument doc = web.Load("www.fuchsonline.com");
        var link = doc.DocumentNode.SelectNodes("//div[@id='footertext']");
        return link.ToString();
    }

This returns the string "HtmlAgilityPack.HtmlNodeCollection" instead of the footer text. How can I get just the text value?

3/15/2018 9:17:28 PM

Popular Answer

You need the value of a single node, so the SelectSingleNode method is the better choice.

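    HtmlWeb web = new HtmlWeb();
    var doc = web.Load("http://www.fuchsonline.com");

    // SelectSingleNode returns the first matching node, or null if none matches.
    var link = doc.DocumentNode.SelectSingleNode("//div[@id='footertext']/p");

    string rawText = link.InnerText.Trim();
    // HttpUtility lives in System.Web; WebUtility (System.Net) offers the same method.
    string decodedText = HttpUtility.HtmlDecode(rawText);

    return decodedText;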

You may also have to decode HTML entities such as &copy;, which is what the HtmlDecode call does.
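To try this without hitting the network, here is a minimal sketch that parses the markup from the question with `HtmlDocument.LoadHtml` and adds a null check. The class name `FooterTextExample` and the helper `GetInnerText` are hypothetical names made up for illustration, and the sketch assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class FooterTextExample
{
    // Hypothetical helper: returns the decoded, trimmed text of the first node
    // matching the XPath, or null when nothing matches.
    public static string GetInnerText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of fetching a URL

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return null;
        }

        // Decode entities such as &copy; in case InnerText leaves them encoded;
        // decoding text that is already plain leaves it unchanged.
        return WebUtility.HtmlDecode(node.InnerText).Trim();
    }

    public static void Main()
    {
        // The markup from the question.
        const string html = @"
<div id=""footer"">
    <div id=""footertext"">
        <p>
            Copyright &copy; FUCHS Online Ltd, 2013. All Rights Reserved.
        </p>
    </div>
</div>";

        Console.WriteLine(GetInnerText(html, "//div[@id='footertext']/p"));
    }
}
```

Returning null for a missing node is only one possible choice; throwing an exception or returning an empty string may suit your code better. HtmlAgilityPack also ships `HtmlEntity.DeEntitize`, which can be used in place of `WebUtility.HtmlDecode` if you prefer not to depend on System.Net.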

7/3/2016 11:21:42 AM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow