I'm trying to scrape content from an example page using the HTML Agility Pack. DocumentNode.SelectNodes is returning null for an XPath query when I think it shouldn't. Could someone tell me why? The code is:
HtmlDocument doc = new HtmlDocument();
string xpath = "//h1[@class='product-title fn']"; // note, it still returns
// null even with "//div"
doc.OptionFixNestedTags = true;
HtmlNode.ElementsFlags.Remove("form");
HtmlNode.ElementsFlags.Remove("option");
HtmlNodeCollection coll = doc.DocumentNode.SelectNodes(xpath);
if (coll != null)
{
    // do stuff
}
else
{
    // not expecting it to be null unless no matches
}
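SelectNodes returns null, rather than an empty collection, when the XPath query matches nothing. According to the upstream bug comments, this is deliberate and kept for consistency: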
DarthObiwan wrote Jan 11 2011 at 9:27 PM
This has been covered before, this function is written to mimic the way the System.XML works. Doing so will cause a major breaking change and thus will probably be slated for 2.0
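Until that changes, one way to cope is to treat null as "no matches" at the call site. The following is only a minimal sketch: SelectNodesOrEmpty is a hypothetical helper name, not part of the library, and the inline markup is a placeholder standing in for the real page. Note also that the snippet in the question does not show a load call; if the document is never loaded, every query, including "//div", has nothing to match and will come back null as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class XPathDemo
{
    // Hypothetical helper (not part of HtmlAgilityPack): converts the null
    // that SelectNodes returns for "no matches" into an empty sequence.
    public static IEnumerable<HtmlNode> SelectNodesOrEmpty(HtmlNode root, string xpath)
    {
        return root.SelectNodes(xpath) ?? Enumerable.Empty<HtmlNode>();
    }

    static void Main()
    {
        var doc = new HtmlDocument();

        // Placeholder markup (an assumption for this sketch); a real page
        // would be loaded with doc.Load(...) or doc.LoadHtml(...).
        doc.LoadHtml("<html><body><h1 class='product-title fn'>Widget</h1></body></html>");

        // Safe to iterate even when nothing matches.
        foreach (HtmlNode h1 in SelectNodesOrEmpty(doc.DocumentNode, "//h1[@class='product-title fn']"))
        {
            Console.WriteLine(h1.InnerText);
        }

        // The sample markup has no <table>, so SelectNodes alone would
        // return null here; the helper yields an empty sequence instead.
        int tableCount = SelectNodesOrEmpty(doc.DocumentNode, "//table").Count();
        Console.WriteLine(tableCount);
    }
}
```

With a wrapper like this the calling code can drop the explicit null check and simply loop, while the library call itself stays untouched.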