C# HtmlAgilityPack Select table from specific h2

c# html html-agility-pack parsing

Question

I have some html:

<h2>Results</h2>
 <div class="box">
 <table class="tFormat">
      <th>Head</th>
      <tr>1</tr>
 </table>
</div>

<h2>Grades</h2>
 <div class="box">
 <table class="tFormat">
      <th>Head</th>
      <tr>1</tr>
 </table>
</div>

How can I get only the table under the "Results" heading?

I've tried:

        var nodes = doc.DocumentNode.SelectNodes("//h2");

        foreach (var o in nodes)
        {
            if (o.InnerText.Equals("Results"))
            {
                foreach (var c in o.SelectNodes("//table"))
                {
                    Console.WriteLine(c.InnerText);             
                }
            }
        }

It works, but it also returns the table under the "Grades" h2.

12/26/2012 12:40:50 PM

Accepted Answer

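Note that the div is not nested inside the h2; it is a sibling of it, so it doesn't make sense to look for the table inside the header. Also, an XPath that starts with // searches the whole document even when it is called on a single node, which is why "//table" returns both tables.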

This can work for you - it finds the next element after the title:

if (o.InnerText.Equals("Results"))
{
    var nextDiv = o.NextSibling;
    while (nextDiv != null && nextDiv.NodeType != HtmlNodeType.Element)
        nextDiv = nextDiv.NextSibling;
    // nextDiv should be correct here.
}
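
To get from that sibling to the table itself, the loop can be wrapped in a small helper. This is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The names `SiblingWalkDemo`, `NextElementSibling` and `TablesUnder` are hypothetical and are not part of the library.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class SiblingWalkDemo
{
    // Hypothetical helper: skips text and comment nodes and returns the
    // next element sibling, or null when there is none.
    public static HtmlNode NextElementSibling(HtmlNode node)
    {
        var next = node.NextSibling;
        while (next != null && next.NodeType != HtmlNodeType.Element)
            next = next.NextSibling;
        return next;
    }

    // Returns the trimmed text of every table inside the element that
    // directly follows the first <h2> whose text equals `heading`.
    public static string[] TablesUnder(string html, string heading)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var h2 = doc.DocumentNode.Descendants("h2")
                    .FirstOrDefault(h => h.InnerText.Trim() == heading);
        if (h2 == null) return new string[0];

        var container = NextElementSibling(h2);
        if (container == null) return new string[0];

        // Descendants is scoped to `container`, unlike SelectNodes("//table").
        return container.Descendants("table")
                        .Select(t => t.InnerText.Trim())
                        .ToArray();
    }
}
```

With the markup from the question, `SiblingWalkDemo.TablesUnder(html, "Results")` should return one entry, for the first table only.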

You can also write a more specific XPath that selects just that div:

doc.DocumentNode.SelectNodes("//h2[text()='Results']/following-sibling::div[1]");
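
The same expression can be extended with `//table` to reach the table directly. This is a sketch under the same assumption that HtmlAgilityPack is referenced. `XPathDemo` and `TableTextUnder` are hypothetical names, and the heading is spliced into the XPath unescaped, so this is only suitable for fixed strings without quotes.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class XPathDemo
{
    // Returns the trimmed text of every table inside the first <div>
    // that follows the <h2> whose text is exactly `heading`.
    public static List<string> TableTextUnder(string html, string heading)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: `heading` contains no single quotes.
        var xpath = "//h2[text()='" + heading + "']/following-sibling::div[1]//table";
        var tables = doc.DocumentNode.SelectNodes(xpath);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        if (tables == null) return new List<string>();

        return tables.Select(t => t.InnerText.Trim()).ToList();
    }
}
```

The `[1]` keeps the match to the nearest following div, so the table under "Grades" is not picked up when searching for "Results".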
12/26/2012 1:03:00 PM

Popular Answer

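var nodes = doc.DocumentNode.SelectNodes("//h2");

// SelectNodes returns null when nothing matches, so guard with ?.
var o = nodes?.FirstOrDefault(h => h.InnerText.Trim().Equals("Results"));
if (o != null)
{
    // "//table" would search the whole document even when called on o;
    // this path is relative to the h2 and stays inside the div that follows it.
    var tables = o.SelectNodes("following-sibling::div[1]//table");
    if (tables != null)
    {
        foreach (var c in tables)
        {
            Console.WriteLine(c.InnerText);
        }
    }
}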


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow