Parse Table with LINQ and HtmlAgilityPack

c# html-agility-pack linq

Question

How can I use LINQ to parse the HTML of a web page and get the InnerHtml values from the cells of a table?

I am using the HtmlAgilityPack and would like to parse the values as cleanly as possible.

The numbers you see (00000, 00001, 00002, ...) are the agents' unique numbers.

Is there a way to use LINQ to parse those numbers and get the remaining values from each row's td's (name, 123, state, and info)? For example, the first row should give me 00000, John, 123, IDLE, coffee, so that I can access each value separately and work with it, maybe in an array?

</TH>
    </TR>
    <TR ALIGN=RIGHT>
        <TD ALIGN=LEFT>00000</TD>
        <TD ALIGN=LEFT>John</TD>
        <TD ALIGN=CENTER>123</TD>
        <TD ALIGN=LEFT>IDLE</TD>
        <TD ALIGN=LEFT>coffee</TD>
    </TR>
    <TR ALIGN=RIGHT>
        <TD ALIGN=LEFT>00001</TD>
        <TD ALIGN=LEFT>Lisa</TD>
        <TD ALIGN=CENTER>123</TD>
        <TD ALIGN=LEFT>IDLE</TD>
        <TD ALIGN=LEFT>coffee</TD>
    </TR>
    <TR ALIGN=RIGHT>
        <TD ALIGN=LEFT>00002</TD>
        <TD ALIGN=LEFT>Mary</TD>
        <TD ALIGN=CENTER>123</TD>
        <TD ALIGN=LEFT>IDLE</TD>
        <TD ALIGN=LEFT>coffee</TD>
    </TR>
    <TR ALIGN=RIGHT>
        <TD ALIGN=LEFT>00003</TD>
        <TD ALIGN=LEFT>Tim</TD>
        <TD ALIGN=CENTER>123</TD>
        <TD ALIGN=LEFT>IDLE</TD>
        <TD ALIGN=LEFT>coffee</TD>
    </TR>
....

Thanks in advance!

Popular Answer

This seems a lot like a "please give me the code I need" question, which I seriously dislike. Have a look at the following and make sure you understand it:

var doc = ... // Load the document
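var trs = doc.DocumentNode.Descendants("TR"); // Gives you all the TRs
foreach (var tr in trs)
{
  var tds = tr.Descendants("TD").ToArray(); // Get all the TDs in this row (ToArray needs System.Linq)
  if (tds.Length < 5) continue; // Skip the header row, which has TH cells instead of TDs
  // Turn them into our data structure
  var data = new {
    Id     = tds[0].InnerText, // The agent's unique number
    Name   = tds[1].InnerText,
    Number = tds[2].InnerText,
    State  = tds[3].InnerText,
    Info   = tds[4].InnerText,
  };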
  // Do something with data
}

Doing it with LINQ only:

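var data = from tr in doc.DocumentNode.Descendants("TR")
           let tds = tr.Descendants("TD").ToArray()
           where tds.Length >= 5 // Skip the header row, which has TH cells instead of TDs
           select new {
             Id     = tds[0].InnerText, // The agent's unique number
             Name   = tds[1].InnerText,
             Number = tds[2].InnerText,
             State  = tds[3].InnerText,
             Info   = tds[4].InnerText,
           };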
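
If you want to keep the rows around and look an agent up by its unique number, one option is to project into a small named type instead of an anonymous one. The following is only a sketch under a few assumptions: the Agent record, the AgentTable class and the sample HTML string (including its TH header row) are made up for illustration, the HTML is already available as a string, and records require C# 9 or later. LoadHtml and Descendants are the usual HtmlAgilityPack calls.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical type for illustration; the property names are not from the question.
public sealed record Agent(string Id, string Name, string Number, string State, string Info);

public static class AgentTable
{
    // Turns every data row (a TR with at least five TD cells) into an Agent.
    public static Agent[] Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return (from tr in doc.DocumentNode.Descendants("tr")
                let tds = tr.Descendants("td").ToArray()
                where tds.Length >= 5 // header rows only have TH cells, so they are skipped
                select new Agent(
                    tds[0].InnerText.Trim(),
                    tds[1].InnerText.Trim(),
                    tds[2].InnerText.Trim(),
                    tds[3].InnerText.Trim(),
                    tds[4].InnerText.Trim())).ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample input shaped like the fragment in the question.
        const string html = @"<TABLE>
    <TR><TH>Id</TH><TH>Name</TH><TH>Number</TH><TH>State</TH><TH>Info</TH></TR>
    <TR ALIGN=RIGHT>
        <TD ALIGN=LEFT>00000</TD>
        <TD ALIGN=LEFT>John</TD>
        <TD ALIGN=CENTER>123</TD>
        <TD ALIGN=LEFT>IDLE</TD>
        <TD ALIGN=LEFT>coffee</TD>
    </TR>
    <TR ALIGN=RIGHT>
        <TD ALIGN=LEFT>00001</TD>
        <TD ALIGN=LEFT>Lisa</TD>
        <TD ALIGN=CENTER>123</TD>
        <TD ALIGN=LEFT>IDLE</TD>
        <TD ALIGN=LEFT>coffee</TD>
    </TR>
</TABLE>";

        Agent[] agents = AgentTable.Parse(html);

        // The agent numbers are unique, so they can serve as dictionary keys.
        var byId = agents.ToDictionary(a => a.Id);
        Console.WriteLine(byId["00001"].Name);
    }
}
```

The array gives you the rows in document order, and the dictionary lets you fetch a single agent by its number. If the numbers ever turn out not to be unique, ToDictionary will throw, so treat that part as an assumption taken from the question.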



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow