How to get the next 2 nodes in HTML with HtmlAgilityPack

c# html-agility-pack

Question

I have a table in the HTML code below:

<table style="padding: 0px; border-collapse: collapse;">
    <tr>
        <td><h3>My Regional Financial Office</h3></td>
    </tr>
    <tr>
        <td>&#160;</td>
    </tr>
    <tr>
        <td><h3>My Address</h3></td>
    </tr>
    <tr>
        <td>000 Test Ave S Ste 000</td>
    </tr>
    <tr>
        <td>Golden Valley, MN 00000</td>
    </tr>
    <tr>
        <td><a href="javascript:submitForm('0000','0000000');">Get Directions</a></td>
    </tr>
    <tr>
        <td>&#160;</td>
    </tr>
</table>

How can I get the inner text of the next 2 <tr> tags after the table row containing the text "My Address"?

Accepted Answer

You can use the following XPath:

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var tdOfInterests = 
        htmlDoc.DocumentNode
               .SelectNodes("//tr[td/h3[.='My Address']]/following-sibling::tr[position() <= 2]/td");
foreach (HtmlNode td in tdOfInterests)
{
    // Given the HTML from the question, this loop prints the following 2 lines:
    // 000 Test Ave S Ste 000
    // Golden Valley, MN 00000
    Console.WriteLine(td.InnerText);
}

The key to the XPath above is the following-sibling axis combined with a position() filter.

UPDATE:

A brief explanation of the XPath used in this answer:

//tr[td/h3[.='My Address']]

The part above selects each <tr> element that has:

  • a child <td> element that has a child <h3> element whose value equals 'My Address'

/following-sibling::tr[position() <= 2]

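The next part selects the following sibling <tr> elements whose position is <= 2, counted from the current <tr> element (the one selected by the previous part of the XPath). In other words, it selects the first two rows that come after it.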

/td

The last part selects the child <td> element of each of those <tr> elements.
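
The three parts explained above can be combined into a small reusable helper. This is only a sketch: the class name RowTextExtractor and the method GetRowsAfter are made up for illustration and are not part of HtmlAgilityPack. It assumes that SelectNodes returns null when nothing matches (the library's default behaviour, as far as I know) and that the heading text contains no single quote, since it is pasted directly into the XPath string.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class RowTextExtractor
{
    // Hypothetical helper (not a library API): returns the text of up to
    // `count` rows that follow the row whose <h3> equals `heading`.
    public static List<string> GetRowsAfter(string html, string heading, int count)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: `heading` contains no single quote, because it is
        // concatenated into the XPath literal without any escaping.
        string xpath =
            $"//tr[td/h3[.='{heading}']]/following-sibling::tr[position() <= {count}]/td";

        HtmlNodeCollection cells = doc.DocumentNode.SelectNodes(xpath);

        // Guard against a missing heading: with default settings
        // SelectNodes gives back null rather than an empty collection.
        if (cells == null)
        {
            return new List<string>();
        }

        // Decode entities such as &#160; and trim surrounding whitespace.
        return cells
            .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
            .ToList();
    }
}
```

Calling RowTextExtractor.GetRowsAfter(html, "My Address", 2) on the table from the question should give back the two address lines, and an empty list if no row contains that heading, so the calling foreach does not throw.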




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow