How to count rows in a table in an HTML file with C#

c# html-agility-pack html-parsing linq

Question

How does one count the rows of the parent table when there is a composite table within an HTML file?

What I mean by a composite table is a table that has cells that include data from several tables.

Here is my attempt. Note that it produces the wrong values.

        String htmlFile = "C:/Temp/Test_13.html";
        HtmlDocument doc = new HtmlDocument();
        doc.Load(htmlFile);

        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
        HtmlNodeCollection rows = tables[1].SelectNodes(".//tr");
        Console.WriteLine(" Rows in second (Parent) table: " + rows.Count());

Please specify in your response which namespaces are used.

Here is an example of a sample file:

<html>
<body>
<table border="1">
<tr>
<td>Apps</td>
</tr>
<tr>
<td>Office Web Apps</td>
</tr>
</table>
<br/>
<table border="1">
<tr>
<td>Application</td>
<td>Status</td>
<td>Instances</td>
</tr>
<tr>
<td>PowerPoint</td>
<td>Online</td>
<td>
    <table border="1">
    <tr>
        <td>Server1</td>
        <td>Online</td>
    </tr>
    <tr>
        <td>Server2</td>
        <td>Disabled</td>
    </tr>
    </table>
</td>
</tr>
<tr>
<td>Word</td>
<td>Online</td>
<td>
    <table border="1">
    <tr>
        <td>Server1</td>
        <td>Online</td>
    </tr>
    <tr>
        <td>Server2</td>
        <td>Disabled</td>
    </tr>
    </table>
</td>
</tr>
</table>
</body>
</html>

I'm grateful.

1
3
6/4/2013 2:39:30 AM

Accepted Answer

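If I understand you correctly, this is what you want.

// Requires: using System.Linq; using System.Windows.Forms; using HtmlAgilityPack;
int i = 1; // 1-based table number, in document order
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
foreach (HtmlNode table in tables)
{
    HtmlNode cell = table.ParentNode;
    // A table whose parent is a <td> is nested inside another table.
    if (cell.Name == "td")
    {
        // cell -> row -> parent table (assumes the markup has no <tbody>)
        HtmlNode parentTable = cell.ParentNode.ParentNode;
        MessageBox.Show("The parent of table #" + i + " has " + parentTable.Elements("tr").Count() + " rows.");
    }
    i++;
}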

The MessageBox will appear twice:

"The parent of table #3 has 3 rows."
"The parent of table #4 has 3 rows."

EDIT: (RESPONDING TO QUESTIONS)

1) I started counting with int i = 1. Writing var i = 1 is the same thing; the compiler automatically replaces var with int.

2) I changed the code, so you should now get the same result as I do.

3) There are tables #1, #2, #3, and #4 because I started counting from 1. Table #2, which has 3 rows, is the parent table of your last two tables (tables #3 and #4). The code above only reports on tables that are nested inside other tables. Can you show the output you are looking for?

EDIT 2:

int i = 1;
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
foreach (HtmlNode table in tables)
{
    if (table.ParentNode.Name != "td") // The table is not nested in another table
        MessageBox.Show("Table #" + i + " has " + table.Elements("tr").Count() + " rows.");
    i++;
}

The MessageBox will appear twice:

"Table #1 has 2 rows."
"Table #2 has 3 rows."
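
As for why the code in the question reports the wrong value: the XPath ".//tr" matches every descendant row, including the rows of nested tables, whereas Elements("tr") only looks at direct children. The following is a minimal console sketch of that difference, assuming the HtmlAgilityPack NuGet package. The RowCounter class, its method names, and the condensed SampleHtml string are hypothetical and only mirror the structure of the sample file. It also shows the namespaces involved: HtmlAgilityPack for the parser and System.Linq for Count().

```csharp
using System;
using System.Linq;          // Count() on IEnumerable<HtmlNode>
using HtmlAgilityPack;      // HtmlDocument, HtmlNode, HtmlNodeCollection

public static class RowCounter
{
    // Condensed stand-in for the sample file: two top-level tables,
    // the second of which contains two nested tables.
    public const string SampleHtml = @"<html><body>
<table><tr><td>Apps</td></tr><tr><td>Office Web Apps</td></tr></table>
<br/>
<table>
<tr><td>Application</td><td>Status</td><td>Instances</td></tr>
<tr><td>PowerPoint</td><td>Online</td><td>
  <table><tr><td>Server1</td><td>Online</td></tr><tr><td>Server2</td><td>Disabled</td></tr></table>
</td></tr>
<tr><td>Word</td><td>Online</td><td>
  <table><tr><td>Server1</td><td>Online</td></tr><tr><td>Server2</td><td>Disabled</td></tr></table>
</td></tr>
</table>
</body></html>";

    // Returns the table at the given zero-based position in document order.
    private static HtmlNode GetTable(string html, int tableIndex)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
        // SelectNodes returns null when nothing matches.
        if (tables == null || tableIndex < 0 || tableIndex >= tables.Count)
            throw new ArgumentOutOfRangeException(nameof(tableIndex));
        return tables[tableIndex];
    }

    // Counts only the <tr> elements that are direct children of the table.
    // Assumes rows are not wrapped in an explicit <tbody>.
    public static int CountDirectRows(string html, int tableIndex)
    {
        return GetTable(html, tableIndex).Elements("tr").Count();
    }

    // Counts every <tr> below the table, nested tables included.
    // This is what ".//tr" in the question does.
    public static int CountAllRows(string html, int tableIndex)
    {
        return GetTable(html, tableIndex).Descendants("tr").Count();
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("Direct rows: " + RowCounter.CountDirectRows(RowCounter.SampleHtml, 1));
        Console.WriteLine("All rows:    " + RowCounter.CountAllRows(RowCounter.SampleHtml, 1));
    }
}
```

For the second table of the sample, the direct count is 3, while the descendant count is 7 (3 own rows plus 2 rows in each of the two nested tables).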
0
6/10/2013 3:01:05 PM

Popular Answer

In my opinion, you should take a look at the CsQuery NuGet package. It is designed to remove most of the hassle of doing exactly this kind of thing, and it lets you use CSS selector syntax, which is familiar to most web developers. In this case you could probably get away with body > table:nth-of-type(2) > tr, which returns a collection of all the matching TRs; you can then count them or read the length of the collection. Alternatively, body > table ~ table > tr would also work for the example you provided, as would br + table > tr. If the parser inserts an implicit tbody element, as HTML5-compliant parsers do, the selectors need tbody before the final tr, for example body > table:nth-of-type(2) > tbody > tr.
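
A minimal sketch of the selector approach, assuming the CsQuery NuGet package with its CQ.Create factory, selector indexer, and Length property. The class and method names are hypothetical. Because the parser may or may not wrap rows in an implicit tbody, the selector lists both forms, separated by a comma, so the count is the same either way.

```csharp
using System;
using CsQuery; // NuGet package "CsQuery"

public static class CsQueryRowCounter
{
    // Counts the rows that belong directly to the second top-level table.
    // The input is expected to be a full document starting with <html>,
    // so that the "body >" part of the selector has something to match.
    public static int CountRowsOfSecondTable(string html)
    {
        CQ dom = CQ.Create(html);

        // First alternative: rows directly under the table.
        // Second alternative: rows under a tbody inserted by the parser.
        CQ rows = dom["body > table:nth-of-type(2) > tr, " +
                      "body > table:nth-of-type(2) > tbody > tr"];

        return rows.Length;
    }
}

public static class CsQueryDemo
{
    public static void Main()
    {
        // Small stand-in document: the second table has three rows,
        // one of which contains a nested table with two rows.
        string html =
            "<html><body>" +
            "<table><tr><td>Apps</td></tr></table>" +
            "<br/>" +
            "<table>" +
            "<tr><td>Application</td></tr>" +
            "<tr><td><table><tr><td>Server1</td></tr><tr><td>Server2</td></tr></table></td></tr>" +
            "<tr><td>Word</td></tr>" +
            "</table>" +
            "</body></html>";

        Console.WriteLine(CsQueryRowCounter.CountRowsOfSecondTable(html));
    }
}
```

The child combinator (>) is what keeps the rows of nested tables out of the result; a descendant selector such as table tr would count them as well.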



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow