I want all <p>=.+=</p> tags. The regex works on its own, without the <p> tags.
Here's my XPath: "//p[re:test(.,'^=.+=$', 'i')]"
But I get an exception when I plug it into SelectNodes:
HtmlNodeCollection pNodes = htmlDoc.DocumentNode.SelectNodes("//p[re:test(.,'^=.+=$', 'i')]");
The exception is:
Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.
Edit: The HTML is generated by FCKEditor and has no namespace defined. Do I need to set something for this to work?
The HTML:
<p><style type="text/css">
h2 a { color: black; }</style></p>
<p>----</p>
<h2>test <a href="http://searisen.com">link</a></h2>
<p>== Heading 2 ==</p>
<p>----</p>
<p>=== Heading [http://searisen.com SeaRisen.com] ===</p>
The error you have is due to the fact that the expression re:test uses an XPath function named test (declared in a namespace whose prefix is re) that is unknown to the XSLT context.
I don't know where you got that expression from, but it's not standard, so it means nothing in the Html Agility Pack context :-)
For an in-depth explanation, see this article: Adding Custom Functions to XPath. Note that you could make your expression work using these techniques.
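That said, here is a "pure" Html Agility Pack / XPath implementation:
var pNodes = htmlDoc.DocumentNode.SelectNodes("//p[starts-with(., '=') and substring(., string-length(.)) = '=' and string-length(.) > 2]");
It uses a filter (between [ and ]) and only standard XPath 1.0 string functions, because XPath 1.0 has no regular-expression support. Here . is the string value of the <p> element (its inner text): starts-with(., '=') checks the leading =, substring(., string-length(.)) picks out the last character, and string-length(.) > 2 requires at least one character in between, which approximates ^=.+=$.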
Apparently HtmlAgilityPack doesn't handle namespaces (not that I had one), so I've come up with this hack:
var pNodes = htmlDoc.DocumentNode.SelectNodes("//p")
    .Where(node => Regex.IsMatch(node.InnerText, "^=.+=$"));
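For anyone who wants to try the filter above end to end, here is a minimal, self-contained sketch. The class and method names (WikiHeadings, IsHeading, FindHeadingParagraphs) are made up for illustration, and the Trim and DeEntitize calls are my own assumptions about what the editor output might contain (stray whitespace, entities such as &amp;), not something the original HTML requires. It also guards against SelectNodes returning null, which it does when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

public static class WikiHeadings
{
    // Same pattern as in the question: starts with '=', ends with '=',
    // with at least one character in between.
    private static readonly Regex HeadingPattern = new Regex("^=.+=$");

    // Hypothetical helper: true if the text looks like "== Heading ==".
    public static bool IsHeading(string text)
    {
        // Assumption: surrounding whitespace should not prevent a match.
        return HeadingPattern.IsMatch(text.Trim());
    }

    // Hypothetical helper: returns all <p> nodes whose inner text matches.
    public static List<HtmlNode> FindHeadingParagraphs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when no node matches.
        var paragraphs = doc.DocumentNode.SelectNodes("//p");
        if (paragraphs == null)
        {
            return new List<HtmlNode>();
        }

        return paragraphs
            // Assumption: decode entities before matching, so "&amp;" is seen as "&".
            .Where(p => IsHeading(HtmlEntity.DeEntitize(p.InnerText)))
            .ToList();
    }

    public static void Main()
    {
        const string html =
            "<p>----</p>" +
            "<h2>test <a href=\"http://searisen.com\">link</a></h2>" +
            "<p>== Heading 2 ==</p>" +
            "<p>----</p>" +
            "<p>=== Heading [http://searisen.com SeaRisen.com] ===</p>";

        foreach (var p in FindHeadingParagraphs(html))
        {
            Console.WriteLine(p.InnerText);
        }
    }
}
```

Doing the regex match in LINQ rather than in the XPath string keeps the query itself free of prefixes, so no namespace manager or XsltContext is involved.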
If there is an HtmlAgilityPack solution I'd love to hear it!