When using HTML Agility Pack to query HTML for ID, an exception occurs.

I'm using the HTML Agility Pack to parse an ASPX file inside Visual Studio.

I'm searching for an element with a specified ID attribute.

The code I'm using is:

var html = new HtmlAgilityPack.HtmlDocument();
if (html.DocumentNode != null)
    var tagsWithId = html.DocumentNode.SelectNodes(string.Format("//[@id='{0}']", selector.Id));

However, when I run this code it throws the exception "Expression must evaluate to a node-set".

Can anyone tell me why this "must" evaluate to a node-set? Why can't it simply return no nodes (the next line calls tagsWithId.Count)? Surely the HtmlNodeCollection that is returned by the SelectNodes method can contain 0 nodes?

Or is the error due to a malformed XPath expression? [The selector ID which I'm testing this with definitely exists in the file as <div id="thisId">.]

Is it even possible to load an ASPX file straight from Visual Studio (I'm building an add-in), or will this contain XML errors, so that I instead have to load the output HTML stream (i.e. without the page declaration at the start of the file, etc.)?

7/4/2010 12:48:20 PM

Accepted Answer

The problem is in the argument to SelectNodes():

//[@id='{0}']

(after carrying out the replacement) is not a syntactically legal XPath expression. So the problem is not that the XPath expression "returns no nodes" -- the problem is that it is syntactically illegal.

As per the XPath W3C Spec:

"// is short for /descendant-or-self::node()/"

Thus the above is expanded to:

/descendant-or-self::node()/[@id='thisId']

Notice that the last location step has no node-test and starts with the predicate. This is illegal according to the syntax rules of XPath.
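The syntax error can be reproduced without HTML Agility Pack by compiling both forms with .NET's own XPath parser. This is a minimal sketch: the id value thisId is taken from the question, and the helper name IsValidXPath is hypothetical.

```csharp
using System;
using System.Xml.XPath;

static class XPathSyntaxCheck
{
    // Hypothetical helper: returns true if the string compiles as an XPath expression.
    static bool IsValidXPath(string expression)
    {
        try
        {
            XPathExpression.Compile(expression);
            return true;
        }
        catch (XPathException)
        {
            // Thrown when the expression is not syntactically legal.
            return false;
        }
    }

    static void Main()
    {
        // Last location step has a predicate but no node-test.
        Console.WriteLine(IsValidXPath("//[@id='thisId']"));
        // Same step with the * node-test added.
        Console.WriteLine(IsValidXPath("//*[@id='thisId']"));
    }
}
```

The first expression should be rejected and the second accepted, because only the second gives the final location step a node-test.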

Probably you want:

//*[@id='{0}']
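Putting the corrected expression back into the question's code might look like the sketch below. The inline markup and the LoadHtml call are assumptions, since the question does not show how the document is loaded, and the local variable id stands in for selector.Id.

```csharp
using System;
using HtmlAgilityPack;

static class FindById
{
    static void Main()
    {
        var html = new HtmlDocument();
        // Assumption: markup loaded from a string for illustration only.
        html.LoadHtml("<html><body><div id=\"thisId\">x</div></body></html>");

        string id = "thisId"; // stands in for selector.Id from the question

        // The * node-test makes the last location step legal XPath.
        HtmlNodeCollection tagsWithId =
            html.DocumentNode.SelectNodes(string.Format("//*[@id='{0}']", id));

        // SelectNodes returns null rather than an empty collection when
        // nothing matches, so guard before reading Count.
        int count = tagsWithId == null ? 0 : tagsWithId.Count;
        Console.WriteLine(count);
    }
}
```

The null check matters for the follow-up call to tagsWithId.Count mentioned in the question: with no matching element, reading Count directly would throw a NullReferenceException.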

7/4/2010 3:35:30 PM

