XPath "following siblings before"

html-agility-pack xpath

Question

I'm trying to select elements (a) with XPath 1.0 (or possibly with a regex) that are following siblings of a particular element (b) but precede the next b element.

<img><b>First</b><br>&nbsp;&nbsp;
<img>&nbsp;&nbsp;<a href="/first-href">First Href</a> - 19:30<br>
<img><b>Second</b><br>&nbsp;&nbsp;
<img>&nbsp;&nbsp;<a href="/second-href">Second Href</a> - 19:30<br>
<img>&nbsp;&nbsp;<a href="/third-href">Third Href</a> - 19:30<br>

I tried to keep the sample as close to the real markup as possible. In this scenario, when I'm at the element

<b>First</b>

I need to select

<a href="/first-href">First Href</a> 

and when I'm at

<b>Second</b> 

I need to select

<a href="/second-href">Second Href</a> 
<a href="/third-href">Third Href</a>

Any idea how to achieve that? Thank you!

Accepted Answer

Dynamically create this XPath:

following-sibling::a[preceding-sibling::b[1][.='xxxx']]

where 'xxxx' is replaced with the text of the current <b>.

This assumes that all the elements actually are siblings. If they are not, you can try working with the preceding and following axes, or write a more specific XPath that better reflects the document structure.
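As a rough illustration of the XPath above, here is a minimal sketch in Python. It assumes the third-party lxml library (the question itself uses Html Agility Pack, which is not shown here), a well-formed copy of the sample wrapped in a hypothetical <t> root, and a helper name, links_after, invented for this example. Instead of pasting the text into the expression string, the sketch binds it as an XPath variable ($label), which lxml supports and which avoids quote-escaping problems. Note that matching on text cannot tell apart two <b> elements that carry the same text.

```python
from lxml import etree  # assumption: lxml is installed (third-party)

# Well-formed version of the question's sample; the <t> root is hypothetical.
SAMPLE = (
    "<t><img/><b>First</b><br/>"
    "<img/><a href='/first-href'>First Href</a> - 19:30<br/>"
    "<img/><b>Second</b><br/>"
    "<img/><a href='/second-href'>Second Href</a> - 19:30<br/>"
    "<img/><a href='/third-href'>Third Href</a> - 19:30<br/></t>"
)


def links_after(b):
    """Return hrefs of <a> siblings whose nearest preceding <b> has b's text."""
    expr = "following-sibling::a[preceding-sibling::b[1][. = $label]]"
    # $label is bound by lxml, so the text needs no manual quote escaping.
    return [a.get("href") for a in b.xpath(expr, label=b.text)]


root = etree.fromstring(SAMPLE)
for b in root.findall("b"):
    print(b.text, links_after(b))
```

For this sample, the <b>First</b> node should yield only /first-href, and <b>Second</b> should yield /second-href and /third-href.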

In XSLT you could also use:

following-sibling::a[
  generate-id(preceding-sibling::b[1]) = generate-id(current())
]

Popular Answer

Here is a solution which is just a single XPath expression.

Using the Kaysian formula for intersection of two nodesets $ns1 and $ns2:

  $ns1[count(. | $ns2) = count($ns2)]

We simply substitute $ns1 with the nodeset of <a> siblings that follow the current <b> node, and we substitute $ns2 with the nodeset of <a> siblings that precede the next <b> node.
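The intersection idea can also be tried outside XSLT. The following is a sketch only, again assuming the third-party lxml library for Python, which accepts lists of elements as XPath variables. The names links_between, ns1 and ns2 and the <t> wrapper are assumptions made for this example. When there is no next <b>, the sketch treats every following <a> as qualifying, mirroring the fallback used in the stylesheet.

```python
from lxml import etree  # assumption: lxml is installed (third-party)

# Well-formed version of the question's sample; the <t> root is hypothetical.
SAMPLE = (
    "<t><img/><b>First</b><br/>"
    "<img/><a href='/first-href'>First Href</a> - 19:30<br/>"
    "<img/><b>Second</b><br/>"
    "<img/><a href='/second-href'>Second Href</a> - 19:30<br/>"
    "<img/><a href='/third-href'>Third Href</a> - 19:30<br/></t>"
)


def links_between(b):
    """Return hrefs of <a> siblings after b that also precede the next <b>."""
    after_current = b.xpath("following-sibling::a")   # plays the role of $ns1
    next_b = b.xpath("following-sibling::b[1]")
    # With no next <b>, every following <a> qualifies (the stylesheet's fallback).
    before_next = (
        next_b[0].xpath("preceding-sibling::a") if next_b else after_current
    )
    # Kaysian intersection: keep a node of $ns1 only if adding it to $ns2
    # does not increase the size of $ns2, i.e. it is already a member.
    hits = b.xpath(
        "$ns1[count(. | $ns2) = count($ns2)]",
        ns1=after_current,
        ns2=before_next,
    )
    return [a.get("href") for a in hits]


root = etree.fromstring(SAMPLE)
for b in root.findall("b"):
    print(b.text, links_between(b))
```

Unlike the text-matching approach, this does not depend on the <b> elements having distinct text, because the node-sets are computed from the current node itself.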

Here is a complete transformation that uses this:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
   <xsl:apply-templates select="*/b"/>
  </xsl:template>

  <xsl:template match="b">
    At: <xsl:value-of select="."/>

    <xsl:variable name="vNextB" select="following-sibling::b[1]"/>

    <xsl:variable name="vA-sAfterCurrentB" select="following-sibling::a"/>

    <xsl:variable name="vA-sBeforeNextB" select=
    "$vNextB/preceding-sibling::a
    |
     $vA-sAfterCurrentB[not($vNextB)]
    "/>

    <xsl:copy-of select=
     "$vA-sAfterCurrentB
              [count(.| $vA-sBeforeNextB)
              =
               count($vA-sBeforeNextB)
               ]
    "/>
  </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<t>
    <img/>
    <b>First</b>
    <br />&#xA0;&#xA0;
    <img/>&#xA0;&#xA0;
    <a href="/first-href">First Href</a> - 19:30
    <br />
    <img/>
    <b>Second</b>
    <br />
    <img/>&#xA0;&#xA0;
    <a href="/second-href">Second Href</a> - 19:30
    <br />
    <img/>&#xA0;
    <a href="/third-href">Third Href</a> - 19:30
    <br />
</t>

the correct result is produced:

   At: First <a href="/first-href">First Href</a>
    At: Second <a href="/second-href">Second Href</a>
<a href="/third-href">Third Href</a>



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow