XPath Query Problem using HTML Agility Pack

c# html-agility-pack xpath

Question

I'm trying to scrape the price field from this website using the HTML Agility Pack.

My code is as follows:

    var web = new HtmlWeb();
    var doc = web.Load(String.Format(overClockersURL, componentID));
    var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");

I obtained the XPath query by using Firebug's "Copy as XPath" feature.

The problem I'm having is that SelectSingleNode is returning null - it doesn't seem to find the element specified by the query. I'm a bit stumped as to why, but I don't have much experience with XPath, so I would appreciate some pointers on what I've done wrong.

Accepted Answer

When that happens, first check whether the page is being loaded correctly (you said you're behind an HTTP proxy?).

Try writing the content of doc.DocumentNode.OuterHtml to a text file so you can see if the page is being loaded correctly. Maybe you're getting an error page instead of the original page.
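
As a rough sketch of that check, the helper below writes the HTML that was actually parsed to a file and reports whether an element with a given id exists in it. The class name PageDump, the method name DumpAndCheck, and the file path are made up for illustration; only HtmlDocument, DocumentNode.OuterHtml, and SelectSingleNode come from HTML Agility Pack.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

// Hypothetical helper, not part of HTML Agility Pack.
public static class PageDump
{
    // Saves the parsed markup to 'path' so it can be inspected by hand,
    // then returns true if an element with the given id is present.
    // Assumes elementId contains no single quotes.
    public static bool DumpAndCheck(HtmlDocument doc, string path, string elementId)
    {
        File.WriteAllText(path, doc.DocumentNode.OuterHtml);
        var node = doc.DocumentNode.SelectSingleNode("//*[@id='" + elementId + "']");
        return node != null;
    }
}
```

It could be called right after web.Load(...), for example PageDump.DumpAndCheck(doc, "page.html", "prodprice"). If it returns false and the saved file shows a proxy or error page, the problem is likely the download rather than the XPath query.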


Popular Answer

If I run this code:

    var web = new HtmlWeb();
    var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");
    var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
    Console.WriteLine("price=" + priceContent.InnerHtml);

It outputs:

price=529.99

So it seems to be working. You can also use "//span[@id=\"prodprice\"]", which is better because it only matches span elements.
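
A minimal sketch of the span-only query with a null check, so a missing element does not throw a NullReferenceException. It parses an HTML string instead of downloading the page; the class name PriceScraper and the method name ExtractPrice are hypothetical, and the markup passed in is assumed to resemble the real page.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper, not part of HTML Agility Pack.
public static class PriceScraper
{
    // Returns the trimmed text of <span id="prodprice">, or null if
    // no such span exists in the supplied HTML.
    public static string ExtractPrice(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var node = doc.DocumentNode.SelectSingleNode("//span[@id='prodprice']");
        return node == null ? null : node.InnerText.Trim();
    }
}
```

With a document loaded through HtmlWeb, the same query and null check can be applied to doc.DocumentNode directly.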




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow