I'm using HtmlAgilityPack to retrieve the following HTML (notice the nested table):
<table class="123">
  <tr>
    <table class="789">
      <tr> <td>abc</td> </tr>
      <tr> <td>def</td> </tr>
    </table>
  </tr>
  <tr> <td>info 1</td> </tr>
  <tr> <td>info 2</td> </tr>
  <tr> <td>info 3</td> </tr>
</table>
Now, I'm trying to find a clever way to obtain some information from the parent table and some information from the nested table...
So far I have the following:
var parentTable = document.DocumentNode.SelectNodes("//table[@class='123']").FirstOrDefault();
var nestedTable = parentTable.SelectNodes("//table[@class='789']").FirstOrDefault();
I can now play around with the nestedTable and get what I want (abc, def)...
But when I try to get the <tr>'s from the parent table like so:
var parentTableRows = parentTable.SelectNodes(".//tr");
It seems to include (in the collection) the <tr>'s from the nested table as well...
In other words, given the HTML above, I was expecting a collection of 4 <tr>'s, but since it includes the <tr>'s from the nested table, I'm getting a collection of 6.
How can I skip the first <tr>, which happens to hold the nested table, so I can play around and get the information I want (info 1, info 2, info 3)?
(Hope I'm making sense...)
Thanks in advance!
// is an XPath expression that means "scan all nodes and sub-nodes". That's why .//tr gets every tr below the parent table, at any depth, including the ones inside the nested table.
If you just use "tr" (or "./tr", which is equivalent), you will select only the TRs directly below the parent table.
If you want to skip the first one, you can add an XPath filter on the element's position() (an XPath function):
var parentTableRows = parentTable.SelectNodes("tr[position() > 1]");
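To see the difference between the three expressions side by side, here is a minimal sketch that runs them against the markup from the question. The class name NestedTableDemo and the helpers LoadParentTable, Count and InfoRows are made up for illustration. It assumes the HtmlAgilityPack package is referenced and that the parser keeps the nested table inside the first row, which is what the 6 rows reported in the question suggest.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class NestedTableDemo
{
    // The markup from the question, unchanged.
    public const string Html =
        @"<table class=""123""> <tr> <table class=""789""> <tr> <td>abc</td> </tr> " +
        @"<tr> <td>def</td> </tr> </table> </tr> <tr> <td>info 1</td> </tr> " +
        @"<tr> <td>info 2</td> </tr> <tr> <td>info 3</td> </tr> </table>";

    // Hypothetical helper: parses the markup and returns the outer table.
    private static HtmlNode LoadParentTable()
    {
        var document = new HtmlDocument();
        document.LoadHtml(Html);
        return document.DocumentNode.SelectSingleNode("//table[@class='123']");
    }

    // Counts the nodes an XPath selects, relative to the parent table.
    public static int Count(string xpath)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = LoadParentTable().SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    // Returns the trimmed text of every direct child row except the first.
    public static List<string> InfoRows()
    {
        var result = new List<string>();
        var rows = LoadParentTable().SelectNodes("tr[position() > 1]");
        if (rows == null)
        {
            return result;
        }
        foreach (var row in rows)
        {
            result.Add(row.InnerText.Trim());
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Count(".//tr")); // rows at any depth (the question reports 6)
        Console.WriteLine(Count("tr"));    // direct child rows only
        Console.WriteLine(Count("./tr"));  // same selection as "tr"
        Console.WriteLine(string.Join(", ", InfoRows()));
    }
}
```

Note that position() counts among the tr children only, so whitespace text nodes between the rows do not shift the index. Skipping by position does rely on the nested table always sitting in the first row.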