HTMLAgilityPack parse table within another table cell

arraylist c# html-agility-pack

Question

I have the following table:

<table>
    <tr><th>header1</th><th>header2</th><th>header3</th></tr>
    <tr><td>value01</td><td>value02</td><td>value03</td></tr>
    <tr><td>value11</td><td>value12</td><td>value13</td></tr>
    <tr>
        <td colspan="3">
            <table>
                <tr><td>subvalue01</td><td>subvalue02</td></tr>
            </table>
        </td>
    </tr>
</table>

I'm using this code to save the main table's cell values into one ArrayList and the subtable's cell values into another. But the ArrayList for the subtable ends up holding all of the values, from both the main table and the subtable:

foreach (HtmlNode table in hdoc.DocumentNode.SelectNodes("//table"))
{
    // This is the table.
    foreach (HtmlNode row in table.SelectNodes("tr").Skip(1))
    {
        // This is the row.
        // "th|td" matches both header and data cells, but right now we ONLY need td.
        foreach (HtmlNode cell in row.SelectNodes("th|td"))
        {
            // This is the cell.
            if (cell.InnerHtml.Contains("<table>"))
            {
                foreach (HtmlNode subtable in cell.SelectNodes("//table"))
                {
                    foreach (HtmlNode subrow in subtable.SelectNodes("tr").Skip(1))
                    {
                        foreach (HtmlNode subcell in subrow.SelectNodes("th|td"))
                        {
                            arrSubList.Add(subcell.InnerText);
                        }
                    }
                }
            }
            else
            {
                arrList.Add(cell.InnerText);
            }
        }
    }
}

What is wrong with my code?

Accepted Answer

I believe your first line

foreach (HtmlNode table in hdoc.DocumentNode.SelectNodes("//table"))

will select ALL tables at any level, including the nested ones.

Per: http://www.w3schools.com/XPath/xpath_syntax.asp

// Selects nodes in the document from the current node that match the selection no matter where they are
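To see the scope of a leading // in practice, here is a minimal sketch that counts the tables found from the colspan cell with an absolute path and with a relative one. The class and method names (XPathScopeDemo, CountTables) and the trimmed-down sample markup are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using HtmlAgilityPack;

public static class XPathScopeDemo
{
    // Hypothetical sample: an outer table with one table nested in a cell.
    public const string SampleHtml = @"<html><body>
<table>
    <tr><td>value01</td></tr>
    <tr><td colspan=""3""><table><tr><td>subvalue01</td></tr></table></td></tr>
</table>
</body></html>";

    public static (int Absolute, int Relative) CountTables(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Pick the cell that contains the nested table.
        HtmlNode cell = doc.DocumentNode.SelectSingleNode("//td[@colspan]");
        if (cell == null) return (0, 0);

        // "//table" starts at the document root, even though it is called on the cell.
        int absolute = cell.SelectNodes("//table")?.Count ?? 0;

        // ".//table" starts at the cell and only looks at its descendants.
        int relative = cell.SelectNodes(".//table")?.Count ?? 0;

        return (absolute, relative);
    }
}
```

Called as XPathScopeDemo.CountTables(XPathScopeDemo.SampleHtml), the absolute count should include the outer table as well, while the relative count should cover only the nested one.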

So, change your first line to

foreach (HtmlNode table in hdoc.DocumentNode.SelectNodes("/html/body/table"))

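And see how that goes. Note that this path only matches a table that is a direct child of <body>. The inner cell.SelectNodes("//table") has the same problem: a leading // searches the whole document even when called on a cell, so use a relative path such as "table" or ".//table" there. Also, Skip(1) on the subtable rows skips its only row, since the subtable has no header row.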
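Putting this together, the whole loop could look roughly like the sketch below. It is not a drop-in replacement: it assumes the table sits directly under <body>, uses List<string> instead of the question's ArrayList, and the names NestedTableSketch, Parse and SampleHtml are invented for the example. The inner lookup uses a relative path so that only tables inside the current cell are visited.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class NestedTableSketch
{
    // The question's markup, wrapped in <html><body> so "/html/body/table" can match.
    public const string SampleHtml = @"<html><body>
<table>
    <tr><th>header1</th><th>header2</th><th>header3</th></tr>
    <tr><td>value01</td><td>value02</td><td>value03</td></tr>
    <tr><td>value11</td><td>value12</td><td>value13</td></tr>
    <tr><td colspan=""3""><table><tr><td>subvalue01</td><td>subvalue02</td></tr></table></td></tr>
</table>
</body></html>";

    public static (List<string> Main, List<string> Sub) Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var mainValues = new List<string>();
        var subValues = new List<string>();

        // SelectNodes returns null when nothing matches, so guard each call.
        var tables = doc.DocumentNode.SelectNodes("/html/body/table");
        if (tables == null) return (mainValues, subValues);

        foreach (HtmlNode table in tables)
        {
            var rows = table.SelectNodes("tr");
            if (rows == null) continue;

            foreach (HtmlNode row in rows.Skip(1)) // skip the header row
            {
                var cells = row.SelectNodes("th|td");
                if (cells == null) continue;

                foreach (HtmlNode cell in cells)
                {
                    // Relative path: only tables that are direct children of this cell.
                    var nested = cell.SelectNodes("table");
                    if (nested == null)
                    {
                        mainValues.Add(cell.InnerText.Trim());
                        continue;
                    }

                    foreach (HtmlNode subtable in nested)
                    {
                        // No Skip(1) here: the subtable in the question has no header row.
                        var subcells = subtable.SelectNodes("tr/th|tr/td");
                        if (subcells == null) continue;

                        foreach (HtmlNode subcell in subcells)
                            subValues.Add(subcell.InnerText.Trim());
                    }
                }
            }
        }

        return (mainValues, subValues);
    }
}
```

With the sample markup, NestedTableSketch.Parse(NestedTableSketch.SampleHtml) should put the six valueNN strings in Main and the two subvalue strings in Sub.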




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow