Can't get XPATH working with Html Agility Pack

.net c# html-agility-pack xpath


I'm trying to use Firebug to get the XPath so I can scrape Wikipedia's "Today's featured article."


I then pasted it into my code:

string result = wc.DownloadString("");

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(result);

var featuredArticle = doc.DocumentNode.SelectSingleNode("/html/body/div[3]/div[3]/div[4]/table[2]/tbody/tr/td/table/tbody/tr[2]/td/div/p");

But featuredArticle is always null. What am I doing wrong?

8/8/2012 7:08:29 PM

Popular Answer

Firebug shows the XPath for the DOM as Firefox built it, which may or may not match the HTML the server actually sent. For example, browsers insert tbody elements into tables even when the source markup has none. Also, the path from Firebug is absolute, so even the smallest change to the page can break it.
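To illustrate the difference between the browser's DOM and the raw markup, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the sample markup and the TbodyDemo and Select names are made up for the illustration. Html Agility Pack parses the markup as written, so a path containing a tbody that only exists in the browser's DOM should find nothing, while the same path without it should match.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class TbodyDemo
{
    // Hypothetical sample markup: the table has no <tbody>, as is common in server-sent HTML.
    public const string Html =
        "<html><body><table><tr><td><p>Hello</p></td></tr></table></body></html>";

    // Returns the inner text of the first node matching xpath, or null if nothing matches.
    public static string Select(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node?.InnerText;
    }

    public static void Main()
    {
        // Path as a browser inspector might report it, including a tbody step.
        Console.WriteLine(Select("/html/body/table/tbody/tr/td/p") ?? "(no match)");

        // Same path with the tbody step removed, matching the markup as written.
        Console.WriteLine(Select("/html/body/table/tr/td/p") ?? "(no match)");
    }
}
```

If a path copied from the browser returns null, removing the tbody steps is a quick first thing to try.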

The p tag you're looking for is inside a div with the id mp-tfa, so it is simpler to use XPath to find that div and then take the first p inside it.

Something like this:

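var wc = new WebClient();
string result = wc.DownloadString(""); // URL of the page to scrape
var doc = new HtmlDocument();
doc.LoadHtml(result); // parse the downloaded HTML before querying it
var featuredArticle = doc.DocumentNode.SelectSingleNode("//div[@id='mp-tfa']/p");
Console.WriteLine(featuredArticle.InnerText);

[link] is the best resource for learning how to use XPath.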

LINQ is another option, though I think XPath is a little clearer.

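// Requires: using System.Linq;
var featuredArticle = doc.DocumentNode.Descendants("div")
    .First(n => n.Id == "mp-tfa")
    .Elements("p").First(); // first <p> directly under the div, like /p in the XPath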
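Note that SelectSingleNode returns null when nothing matches, and First throws when the sequence is empty, so either approach fails hard if the page layout changes. The following is only a sketch of a more defensive variant, again assuming the HtmlAgilityPack NuGet package; the SafeLookup class, the FirstParagraphText helper, and the sample markup are hypothetical names invented here.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class SafeLookup
{
    // Returns the text of the first <p> directly under the div with the given id,
    // or null when the div or the paragraph is missing.
    public static string FirstParagraphText(string html, string divId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode paragraph = doc.DocumentNode.Descendants("div")
            .FirstOrDefault(n => n.Id == divId)   // null instead of an exception
            ?.Elements("p")                       // direct children only, like /p
            .FirstOrDefault();

        return paragraph?.InnerText;
    }

    public static void Main()
    {
        // Hypothetical sample markup standing in for the downloaded page.
        const string html = "<div id='mp-tfa'><p>Featured text</p></div>";

        Console.WriteLine(FirstParagraphText(html, "mp-tfa") ?? "not found");
        Console.WriteLine(FirstParagraphText(html, "missing") ?? "not found");
    }
}
```

Returning null and letting the caller decide keeps the scraping code from crashing on a layout change, at the cost of one extra check at the call site.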
8/11/2014 8:44:05 AM

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow