Can't get XPATH working with Html Agility Pack

I'm using Firebug to get the XPath for Wikipedia's "Today's featured article" so that I can scrape it.


I then pasted the XPath into my code:

WebClient wc = new WebClient();
string result = wc.DownloadString("");

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(result);

var featuredArticle = doc.DocumentNode.SelectSingleNode("/html/body/div[3]/div[3]/div[4]/table[2]/tbody/tr/td/table/tbody/tr[2]/td/div/p");

But featuredArticle is always null. What am I doing wrong?

8/8/2012 7:08:29 PM

Popular Answer

Firebug shows the XPath for the DOM as Firefox built it, which may or may not match the HTML the server actually sent. The path from Firebug is also absolute, so even the smallest change to the page can break it.
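One common difference between the browser's DOM and the server's HTML is the tbody element, which browsers add to tables even when the markup omits it. Here is a minimal sketch of the effect, assuming the HtmlAgilityPack package is referenced. The markup, the class name TbodyDemo and the Find helper are made up for illustration and are not taken from the Wikipedia page:

```csharp
using System;
using HtmlAgilityPack;

static class TbodyDemo
{
    // Hypothetical markup standing in for server-sent HTML: the table has no <tbody>.
    public const string ServerHtml =
        "<html><body><table><tr><td><p>Featured text</p></td></tr></table></body></html>";

    // Hypothetical helper: parse the markup and return the first match, or null if there is none.
    public static HtmlNode Find(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode(xpath);
    }

    static void Main()
    {
        // Path as a browser tool would report it, including the tbody the browser inserted.
        var browserPath = Find(ServerHtml, "/html/body/table/tbody/tr/td/p");
        // Same path written against the markup as it was sent.
        var sourcePath = Find(ServerHtml, "/html/body/table/tr/td/p");

        Console.WriteLine(browserPath == null);   // the source has no tbody, so nothing matches
        Console.WriteLine(sourcePath?.InnerText);
    }
}
```

If this is what is happening in the question, removing the tbody steps from the path might be enough, but an absolute path would still be fragile.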

The p tag you're looking for is inside a div with the id mp-tfa, so it's simpler to use XPath to find that div and then grab the first p inside it.

Something like this:

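var wc = new WebClient();
// Download the page; the URL is left blank here, as in the question.
string result = wc.DownloadString("");
var doc = new HtmlDocument();
doc.LoadHtml(result);
// Find the div by id and take its first p child, wherever the div sits in the page.
var featuredArticle = doc.DocumentNode.SelectSingleNode("//div[@id='mp-tfa']/p");
Console.WriteLine(featuredArticle.InnerText);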

LINQ is another option, though I think XPath is a little clearer.

// Requires: using System.Linq;
var featuredArticle = doc.DocumentNode.Descendants("div")
    .First(n => n.Id == "mp-tfa")
    .Element("p");

8/11/2014 8:44:05 AM

