Html Agility Pack, SelectSingleNode

.net c# html-agility-pack


This code works:

        WebClient client = new WebClient();
        client.Encoding = Encoding.UTF8;
        html = client.DownloadString("");
        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

The relevant HTML is:

<a href="/title/tt4972582/?pf_rd_m=A2FGELUUNOQJNL&amp;pf_rd_p=2240084082&amp;pf_rd_r=1QW31NGD6JSE46F79CKQ&amp;pf_rd_s=center-1&amp;pf_rd_t=15506&amp;pf_rd_i=moviemeter&amp;ref_=chtmvm_tt_1" title="M. Night Shyamalan (dir.), James McAvoy, Anya Taylor-Joy">Split</a>

The message box displays the text "Split". However, consider this HTML:

<div class="summary_text" itemprop="description">
                Three girls are kidnapped by a man with a diagnosed 23 distinct personalities, and must try and escape before the apparent emergence of a frightful new 24th.

I want the message box to display the text that begins "Three girls are kidnapped", so I wrote the following code:

        WebClient client2 = new WebClient();
        client2.Encoding = Encoding.UTF8;
        HtmlAgilityPack.HtmlDocument doc2 = new HtmlAgilityPack.HtmlDocument();
        MessageBox.Show(doc2.DocumentNode.SelectSingleNode("//*[@id='title-overview-widget']/div[3]/div[1]/div[1]").InnerText);

When I ran this code, an unhandled exception of type "System.NullReferenceException" occurred.

I have checked a hundred times that the XPath is correct. What should I do?

2/3/2017 4:22:40 PM

Accepted Answer

Can you try this?

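        // Requires: using System.Linq; using HtmlAgilityPack;
        HtmlWeb web = new HtmlWeb();
        HtmlDocument doc = web.Load("");
        // FirstOrDefault returns null when no div matches, so ?. avoids a NullReferenceException.
        var desNodeText = doc.DocumentNode.Descendants("div").FirstOrDefault(o => o.GetAttributeValue("class", "") == "summary_text")?.InnerText;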
2/3/2017 6:23:19 PM
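
A note on the likely cause: in the question's second snippet as posted, `doc2` is created but never given any HTML (there is no `DownloadString` or `LoadHtml` call), so `SelectSingleNode` has nothing to match and returns null, and reading `.InnerText` on null throws. The sketch below shows one way to guard against that. It is a minimal illustration, not the asker's actual program: the class name `SummaryTextDemo`, the helper `ExtractSummary`, and the inline sample markup are hypothetical, and it selects by the `summary_text` class instead of the positional XPath from the question.

```csharp
using System;
using HtmlAgilityPack;

public static class SummaryTextDemo
{
    // Hypothetical helper: returns the trimmed text of the first div whose
    // class attribute is exactly "summary_text", or null if there is none.
    public static string ExtractSummary(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // without loading HTML, the document stays empty

        // SelectSingleNode returns null when nothing matches the XPath.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='summary_text']");
        if (node == null)
        {
            return null;
        }

        // Decode entities such as &amp; and strip the surrounding whitespace.
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    public static void Main()
    {
        // Sample markup modeled on the snippet in the question (assumption).
        string sample =
            "<div class=\"summary_text\" itemprop=\"description\">\n" +
            "    Three girls are kidnapped by a man.\n" +
            "</div>";

        Console.WriteLine(ExtractSummary(sample) ?? "(no summary_text node found)");
        Console.WriteLine(ExtractSummary("<p>nothing here</p>") ?? "(no summary_text node found)");
    }
}
```

In a WinForms application, the result of `ExtractSummary` could be passed to `MessageBox.Show` after the null check, with the HTML string coming from `WebClient.DownloadString` as in the working snippet from the question.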

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow