Delete Table Column with HTML Agility Pack

asp.net c# html-agility-pack html-table

Question

I have scraped a table from another website using C#, for use on my own site, and loaded it into a string. It has too many columns, so I was wondering whether there is an easy way to delete some of them, preferably with HTML Agility Pack, or in plain C# if necessary.

The table in the string looks like this:

    <table>
        <tr>
            <th scope="col">&nbsp; </th>
            <th scope="col">&nbsp; </th>
            <th scope="col">P </th>
            <th scope="col">W </th>
            <th scope="col">L </th>
            <th scope="col">T </th>
            <th scope="col">NR </th>
            <th scope="col">Bat </th>
            <th scope="col">Bowl </th>
            <th scope="col">Pen </th>
            <th scope="col">Pts </th>
        </tr>
        <tr>
            <td>1 </td>
            <td><a href="fixbyteam.aspx?clubid=44576&teamid=58170&divid=32181">Rayleigh 2nd</a> </td>
            <td>12 </td>
            <td>8 </td>
            <td>1 </td>
            <td>0 </td>
            <td>3 </td>
            <td>14 </td>
            <td>52 </td>
            <td>0 </td>
            <td>209 </td>
        </tr>
        <tr>
            <td>2 </td>
            <td><a href="fixbyteam.aspx?clubid=44612&teamid=58169&divid=32181">Rainham 1st</a> </td>
            <td>12 </td>
            <td>8 </td>
            <td>1 </td>
            <td>1 </td>
            <td>2 </td>
            <td>12 </td>
            <td>56 </td>
            <td>-15 </td>
            <td>199 </td>
        </tr>
        <tr class="lineAbove">
            <td>3 </td>
            <td><a href="fixbyteam.aspx?clubid=44571&teamid=58162&divid=32181">Old Chelmsfordians 2nd</a> </td>
            <td>12 </td>
            <td>5 </td>
            <td>5 </td>
            <td>0 </td>
            <td>2 </td>
            <td>10 </td>
            <td>48 </td>
            <td>0 </td>
            <td>148 </td>
        </tr>
        <tr>
            <td>4 </td>
            <td><a href="fixbyteam.aspx?clubid=44570&teamid=58161&divid=32181">Little Baddow 2nd</a> </td>
            <td>12 </td>
            <td>5 </td>
            <td>4 </td>
            <td>0 </td>
            <td>3 </td>
            <td>21 </td>
            <td>43 </td>
            <td>-15 </td>
            <td>144 </td>
        </tr>
        <tr>
            <td>5 </td>
            <td><a href="fixbyteam.aspx?clubid=44606&teamid=58159&divid=32181">Rayne 1st</a> </td>
            <td>12 </td>
            <td>5 </td>
            <td>4 </td>
            <td>0 </td>
            <td>3 </td>
            <td>6 </td>
            <td>39 </td>
            <td>0 </td>
            <td>140 </td>
        </tr>
        <tr>
            <td>6 </td>
            <td><a href="fixbyteam.aspx?clubid=44605&teamid=58158&divid=32181">Terling 1st</a> </td>
            <td>12 </td>
            <td>4 </td>
            <td>5 </td>
            <td>1 </td>
            <td>2 </td>
            <td>12 </td>
            <td>35 </td>
            <td>0 </td>
            <td>129 </td>
        </tr>
        <tr>
            <td>7 </td>
            <td><a href="fixbyteam.aspx?clubid=44602&teamid=58154&divid=32181">Willow Herbs 1st</a> </td>
            <td>12 </td>
            <td>4 </td>
            <td>6 </td>
            <td>0 </td>
            <td>2 </td>
            <td>9 </td>
            <td>34 </td>
            <td>0 </td>
            <td>117 </td>
        </tr>
        <tr>
            <td>8 </td>
            <td><a href="fixbyteam.aspx?clubid=50925&teamid=68864&divid=32181">Ongar 1st</a> </td>
            <td>12 </td>
            <td>3 </td>
            <td>5 </td>
            <td>0 </td>
            <td>4 </td>
            <td>3 </td>
            <td>42 </td>
            <td>-5 </td>
            <td>108 </td>
        </tr>
        <tr class="lineAbove">
            <td>9 </td>
            <td><a href="fixbyteam.aspx?clubid=44607&teamid=58163&divid=32181">Sandon Sports 1st</a> </td>
            <td>12 </td>
            <td>3 </td>
            <td>6 </td>
            <td>0 </td>
            <td>3 </td>
            <td>8 </td>
            <td>27 </td>
            <td>0 </td>
            <td>98 </td>
        </tr>
        <tr>
            <td>10 </td>
            <td><a href="fixbyteam.aspx?clubid=44582&teamid=58156&divid=32181">Little Waltham 2nd</a> </td>
            <td>12 </td>
            <td>1 </td>
            <td>9 </td>
            <td>0 </td>
            <td>2 </td>
            <td>14 </td>
            <td>25 </td>
            <td>0 </td>
            <td>65 </td>
        </tr>
    </table>

And I want to delete columns 8-10 (Bat, Bowl and Pen). I'm not really sure where to start so any pointers would be helpful!

Accepted Answer

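You need to iterate over each tr and remove the 8th, 9th and 10th cells from it: th nodes in the header row, td nodes in every other row. Remove them from right to left (10, then 9, then 8) so that a removal never shifts the positions of the cells still to be removed.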

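    using HtmlAgilityPack;

    // Assumes the scraped table markup is in a string named html.
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);

    bool first = true;
    foreach (HtmlNode row in doc.DocumentNode.SelectNodes("//tr"))
    {
        if (first)
        {
            // The first row is the header, so its cells are th nodes.
            // Remove from right to left so the remaining positions do not shift.
            row.RemoveChild(row.SelectSingleNode("th[10]"));
            row.RemoveChild(row.SelectSingleNode("th[9]"));
            row.RemoveChild(row.SelectSingleNode("th[8]"));
            first = false;
        }
        else
        {
            // Data rows: same columns, but the cells are td nodes.
            row.RemoveChild(row.SelectSingleNode("td[10]"));
            row.RemoveChild(row.SelectSingleNode("td[9]"));
            row.RemoveChild(row.SelectSingleNode("td[8]"));
        }
    }

    // Write the trimmed table back to the string.
    html = doc.DocumentNode.OuterHtml;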
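
If you need this for more than one table, the same idea can be wrapped in a helper. This is only a sketch: the `TableColumns` class and `RemoveColumns` method are names made up for illustration and are not part of HTML Agility Pack, and it assumes every row has one cell per column (no colspan). It takes a snapshot of each row's th/td cells before removing anything, so the order of the column numbers does not matter, and it skips columns a row does not have instead of throwing.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableColumns
{
    // Hypothetical helper: removes the given 1-based columns from every row
    // of the markup and returns the resulting HTML.
    public static string RemoveColumns(string html, params int[] columns)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//tr");
        if (rows == null)
        {
            return html;
        }

        foreach (HtmlNode row in rows)
        {
            // Snapshot the cells in document order; header rows use th, data rows td.
            List<HtmlNode> cells = row.ChildNodes
                .Where(n => n.Name == "th" || n.Name == "td")
                .ToList();

            foreach (int column in columns.Distinct())
            {
                // Ignore column numbers this row does not have.
                if (column >= 1 && column <= cells.Count)
                {
                    cells[column - 1].Remove();
                }
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

For the table in the question, the call would be `string trimmed = TableColumns.RemoveColumns(html, 8, 9, 10);`.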


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow