Parse HTML table with HtmlAgilityPack?

html-agility-pack vb.net

Question

I'm tearing my hair out trying to figure out this HTML Agility Pack business. None of the examples I can find work with my table, no matter what I modify. Here's the table I'm working with:

<td class="trow1"><strong><a href="NEED1"><span style="color:#383838">NEED2</span></a></strong></td>
<td class="trow1">NEED3</td>
<td class="trow1" align="center"" alt="" /></td>
<td class="trow1" align="center"><strong>NEED4</strong></td>
</tr><tr>
<td class="trow2"><strong><a href="NEED1"><span class="group9">NEED2</span></a></strong></td>
<td class="trow2">NEED3</td>
<td class="trow2" align="center"" alt="" /></td>
<td class="trow2" align="center"><strong>NEED4</strong></td>
</tr><tr>
<td class="trow1"><strong><a href="NEED1"><span class="group0">NEED2</span></a></strong></td>
<td class="trow1">NEED3</td>
<td class="trow1" align="center"" alt="" /></td>
<td class="trow1" align="center"><strong>NEED4</strong></td>
</tr><tr>
<td class="trow2"><strong><a href="NEED1"><span class="group7">NEED2</span></a></strong></td>
<td class="trow2">NEED3</td>
<td class="trow2" align="center"" alt="" /></td>
<td class="trow2" align="center"><strong>NEED4</strong></td>
</tr><tr>
<td class="trow1"><strong><a href="NEED1"><span class="group0">NEED2</span></a></strong></td>
<td class="trow1">NEED3</td>
<td class="trow1" align="center"" alt="" /></td>
<td class="trow1" align="center"><strong>NEED4</strong></td>
</tr>

I've replaced the values I need with the placeholders NEED1 through NEED4 in each row. I want to populate a ListView with them (that part is already done), but I'm lost on how to extract the values.

Any help? Thank you.

3/6/2015 12:11:53 AM

Popular Answer

The code below is C#, but translating it to VB.NET is not difficult. First load the HTML into a document:

var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
  • NEED1 (the href of the link)

    var need1 = doc.DocumentNode.SelectSingleNode("//td[@class='trow1']/strong/a").Attributes["href"].Value;

  • NEED2 (the text of the span inside the link)

    var need2 = doc.DocumentNode.SelectSingleNode("//td[@class='trow1']/strong/a/span").InnerText;

  • NEED3 (the first cell that has no child elements)

    var need3 = doc.DocumentNode.SelectSingleNode("//td[@class='trow1' and not(*)]").InnerText;

  • NEED4 (the strong element that does not contain a link)

    var need4 = doc.DocumentNode.SelectSingleNode("//td[@class='trow1']/strong[not(a)]").InnerText;

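The queries above each select a single node, so they return only the first match, and they match only cells with class trow1 (the trow2 rows are skipped). To select all matching nodes at once, use the SelectNodes method instead.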
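As a rough sketch of the SelectNodes approach, the following walks every row and collects the four values. It assumes the full page wraps the cells in proper tr elements (the snippet in the question starts mid-row, so the sample HTML here is a cleaned-up stand-in), and the names TableScraper and ExtractRows are hypothetical, not part of HtmlAgilityPack. The per-row XPath expressions mirror the single-node queries above, made relative to each row.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class TableScraper
{
    // Hypothetical helper: returns { NEED1, NEED2, NEED3, NEED4 } for each row.
    public static List<string[]> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string[]>();

        // Match rows whose cells use either trow1 or trow2.
        // SelectNodes returns null by default when nothing matches.
        var rows = doc.DocumentNode.SelectNodes("//tr[td[starts-with(@class, 'trow')]]");
        if (rows == null)
        {
            return result;
        }

        foreach (var row in rows)
        {
            // Queries starting with "./" are evaluated relative to the current row.
            var link = row.SelectSingleNode("./td/strong/a");
            var plainCell = row.SelectSingleNode("./td[not(*)]");
            var strongCell = row.SelectSingleNode("./td/strong[not(a)]");

            // Skip rows that do not have the expected shape.
            if (link == null || plainCell == null || strongCell == null)
            {
                continue;
            }

            result.Add(new[]
            {
                link.GetAttributeValue("href", ""), // NEED1
                link.InnerText.Trim(),              // NEED2
                plainCell.InnerText.Trim(),         // NEED3
                strongCell.InnerText.Trim()         // NEED4
            });
        }

        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed, cleaned-up sample with the same cell layout as the question.
        const string html =
            "<table>" +
            "<tr><td class=\"trow1\"><strong><a href=\"url-a\"><span>name-a</span></a></strong></td>" +
            "<td class=\"trow1\">text-a</td><td class=\"trow1\" align=\"center\"><img alt=\"\" /></td>" +
            "<td class=\"trow1\" align=\"center\"><strong>count-a</strong></td></tr>" +
            "<tr><td class=\"trow2\"><strong><a href=\"url-b\"><span>name-b</span></a></strong></td>" +
            "<td class=\"trow2\">text-b</td><td class=\"trow2\" align=\"center\"><img alt=\"\" /></td>" +
            "<td class=\"trow2\" align=\"center\"><strong>count-b</strong></td></tr>" +
            "</table>";

        foreach (var values in TableScraper.ExtractRows(html))
        {
            Console.WriteLine(string.Join(" | ", values));
        }
    }
}
```

Each string array can then be turned into one ListView item. The same HtmlAgilityPack calls (LoadHtml, SelectNodes, SelectSingleNode, GetAttributeValue) are available from VB.NET, so only the surrounding syntax changes.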

I hope this helps.

3/6/2015 1:14:51 AM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow