Trouble selecting nodes with Html Agility Pack

c# html html-agility-pack

Question

I have the following HTML layout.

<table> //table[1]
</table>
<table> //table[2]
<tbody>
   <tr>
      <td>
         <p>
            &nbsp;
         </p>
      </td>
   </tr>
   <tr>
      <td>
         <table> //table[1]//table[1]
            <tbody>
               <tr>
                  <td>
                     <p>
                        INFO 1
                     </p>
                  </td>
                  <td>
                     <p>
                        INFO 2
                     </p>
                  </td>
                  <td>
                     <p>
                        INFO 3
                     </p>
                  </td>
                  <td>
                     <p>
                        INFO 4
                     </p>
                  </td>
               </tr>
            </tbody>
         </table>
      </td>
   </tr>
   <tr>
      <td>
         <table> //table[1]//table[2]
            <tbody>
               <tr>
                  <td>
                     <p><strong>Name</strong></p>
                  </td>
                  <td>
                     <p><strong>Quantity</strong></p>
                  </td>
               </tr>
               <tr>
                  <td>
                     <p>Apples </p>
                  </td>
                  <td>10</td>
               </tr>
            </tbody>
         </table>
      </td>
   </tr>
   <tr>
      <td>
         <table>  //table[1]//table[3]
         </table>
      </td>
   </tr>
</tbody>
</table>

I'm trying to access the data inside //table[1]//table[2], but I keep getting a null HtmlNode (and then a System.NullReferenceException) for the following:

Doesn't work: doc.DocumentNode.SelectSingleNode("//table[2]//tbody//tr//td//table[2]//tbody//tr");

Why does this happen? When I collect the data for //table[1]//table[1], the same syntax works flawlessly.

Works: doc.DocumentNode.SelectSingleNode("//table[2]//tbody//tr//td//table[1]//tbody//tr");

Am I misunderstanding how indexing works in Html Agility Pack?

10/3/2014 6:10:24 PM

Accepted Answer

That's because the XPath //table[2] looks for the second <table> element within the same parent.

The predicate ([]) has a higher precedence (priority) than the path operators (// and /). [For Reference]

In your case there is only one <table> in each <td>, so the XPath expression returned an empty result. One option is to add parentheses to change the precedence:

(//table[2]//tbody//tr//td//table)[2]//tbody//tr

The XPath above selects the second <table> among all the <table> elements returned by the inner XPath //table[2]//tbody//tr//td//table. Then, starting from that <table>, it goes on to return the descendant //tbody//tr elements.
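To make the difference concrete, here is a minimal sketch that runs both expressions against a trimmed-down version of the question's layout. It assumes the HtmlAgilityPack NuGet package is referenced and a compiler that supports top-level statements (C# 9 or later); the markup and the variable names perParent and overall are illustrative, not taken from the original post. The behaviour follows from //table[2] being shorthand for /descendant-or-self::node()/child::table[2], so the index counts <table> children of one parent rather than all matches.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

// Trimmed-down, illustrative version of the question's layout:
// an empty first table, then a second table whose rows each wrap one nested table.
const string html = @"
<table></table>
<table><tbody>
  <tr><td>spacer</td></tr>
  <tr><td><table><tbody><tr><td>INFO 1</td><td>INFO 2</td></tr></tbody></table></td></tr>
  <tr><td><table><tbody><tr><td>Apples</td><td>10</td></tr></tbody></table></td></tr>
</tbody></table>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Here [2] applies to the last step only: "a <table> that is the second <table>
// child of its own parent". Every <td> holds a single <table>, so nothing matches
// and SelectSingleNode returns null.
HtmlNode perParent = doc.DocumentNode.SelectSingleNode(
    "//table[2]//tbody//tr//td//table[2]//tbody//tr");

// The parentheses make [2] index into the whole node-set produced by the inner path,
// i.e. the second nested table in document order.
HtmlNode overall = doc.DocumentNode.SelectSingleNode(
    "(//table[2]//tbody//tr//td//table)[2]//tbody//tr");

Console.WriteLine(perParent == null ? "per-parent index: no match" : "per-parent index: match");
Console.WriteLine(overall == null ? "grouped index: no match" : "grouped index: " + overall.InnerText.Trim());
```

Checking the result for null before dereferencing it, as the last two lines do, is also what avoids the NullReferenceException described in the question.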

5/23/2017 11:59:34 AM

Popular Answer

I ultimately had to base this on tr. I'm not sure why my previous approach failed, but this one works.

I essentially moved my table indexing up one level. In the first tbody, each subsequent table is contained inside a tr/td, so I just built my HtmlNode to index off of the tr's. Maybe broadening the selection criteria makes Agility Pack behave better? IDK.

Anyways...

For table[2]//table[1] I used:

HtmlNode table = doc.DocumentNode.SelectSingleNode("//table[2]//tbody//tr[2]//table");
foreach (var cell in table.SelectNodes(".//tr//td/p"))
...

If you look at the sample HTML above, the first tr/td contains only a blank space, which is why I chose tr[2].

For table[2]//table[2] I used:

HtmlNode table = doc.DocumentNode.SelectSingleNode("//table[2]//tbody//tr[3]//table[1]");
foreach (var cell in table.SelectNodes(".//tr//td"))
...

If you're having trouble, try broadening your search by moving the index onto a more general (higher-level) tag.
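The two row-indexed selections above can be sketched together as follows. This is only an illustration under a few assumptions: the HtmlAgilityPack NuGet package is referenced, top-level statements are available (C# 9 or later), the markup is a trimmed-down stand-in for the question's layout, and CellTexts is a hypothetical helper rather than part of Html Agility Pack. It also guards against the null that SelectSingleNode and SelectNodes return when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

// Illustrative stand-in for the question's layout: a spacer row first,
// then one nested table per row of the second top-level table.
const string html = @"
<table></table>
<table><tbody>
  <tr><td>spacer</td></tr>
  <tr><td><table><tbody><tr><td>INFO 1</td><td>INFO 2</td></tr></tbody></table></td></tr>
  <tr><td><table><tbody><tr><td>Apples</td><td>10</td></tr></tbody></table></td></tr>
</tbody></table>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Hypothetical helper: returns the trimmed text of every matching cell,
// or an empty list when either XPath matches nothing (both calls return null then).
List<string> CellTexts(string tableXPath, string cellXPath)
{
    var result = new List<string>();
    HtmlNode table = doc.DocumentNode.SelectSingleNode(tableXPath);
    if (table == null) return result;

    HtmlNodeCollection cells = table.SelectNodes(cellXPath);
    if (cells == null) return result;

    foreach (HtmlNode cell in cells)
        result.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());
    return result;
}

// Index the outer rows instead of the nested tables: tr[2] holds the first nested
// table (tr[1] is the spacer row) and tr[3] holds the second one.
List<string> info = CellTexts("//table[2]//tbody//tr[2]//table", ".//tr//td");
List<string> fruit = CellTexts("//table[2]//tbody//tr[3]//table[1]", ".//tr//td");

// The expression from the question matches nothing, so the helper yields an empty list.
List<string> missing = CellTexts("//table[2]//tbody//tr//td//table[2]", ".//tr//td");

Console.WriteLine(string.Join(", ", info));
Console.WriteLine(string.Join(", ", fruit));
Console.WriteLine(missing.Count);
```

Indexing the row works here because the outer rows really are siblings under one tbody, whereas each nested table is the only table inside its td.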




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow