Html Agility Pack parsing table into object

c# foreach html-agility-pack html-parsing

Question

The HTML I have is as follows:

<tr class="row1">
        <td class="id">123</td>
        <td class="date">2014-08-08</td>
        <td class="time">12:31:25</td>
        <td class="notes">something here</td>
</tr>
<tr class="row0">
        <td class="id">432</td>
        <td class="date">2015-02-09</td>
        <td class="time">12:22:21</td>
        <td class="notes">something here</td>
</tr>

Each subsequent customer row follows the same pattern. I want to parse the contents of each table row into an object. I've tried a few approaches, but I can't get any of them to work correctly.

This is what I currently have:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
foreach (HtmlNode row in doc.DocumentNode.SelectNodes("//table[@id='customerlist']//tr"))
{
    Customer cust = new Customer();
    foreach (HtmlNode info in row.SelectNodes("//td"))
    {
        if (info.GetAttributeValue("class", String.Empty) == "id")
        {
            cust.ID = info.InnerText;
        }
        if (info.GetAttributeValue("class", String.Empty) == "date")
        {
            cust.DateAdded = info.InnerText;
        }
        if (info.GetAttributeValue("class", String.Empty) == "time")
        {
            cust.TimeAdded = info.InnerText;
        }
        if (info.GetAttributeValue("class", String.Empty) == "notes")
        {
            cust.Notes = info.InnerText;
        }
    }
    Console.WriteLine(cust.ID + " " + cust.TimeAdded + " " + cust.DateAdded + " " + cust.Notes);
}

It works up to a point, but every iteration prints the information from the table's last row. I don't know what I'm missing, but it's probably something basic.

Also, is my way of creating the object fine, or should I use a constructor and create the object from variables? E.g.

string Notes = String.Empty;
if (info.GetAttributeValue("class", String.Empty) == "notes")
{
    Notes = info.InnerText;
}
..
Customer cust = new Customer(id, other_variables, Notes, etc);
10/8/2016 6:59:55 PM

Popular Answer

Your XPath query is incorrect. You must use td in place of //td:

foreach (HtmlNode info in row.SelectNodes("td"))

Passing //td to SelectNodes() matches every <td> element in the document, not just those in the current row. Consequently, your inner loop executes 8 times rather than 4, and the last 4 iterations always overwrite the values set earlier on your Customer object.

See the XPath examples for more on relative versus absolute paths.
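
To make the fix concrete, here is a minimal sketch of the corrected loop. It rests on several assumptions: the Customer class is hypothetical (the question does not show its definition), the rows are assumed to sit inside a table with id 'customerlist' as in the question's XPath, and CustomerParser and Parse are names invented for illustration. The null checks are there because SelectNodes can return null instead of an empty collection when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical Customer type; the question does not show its definition.
public class Customer
{
    public string ID { get; set; }
    public string DateAdded { get; set; }
    public string TimeAdded { get; set; }
    public string Notes { get; set; }
}

// Hypothetical helper class, named here only for illustration.
public static class CustomerParser
{
    public static List<Customer> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var customers = new List<Customer>();

        // SelectNodes can return null when nothing matches, so guard against it.
        var rows = doc.DocumentNode.SelectNodes("//table[@id='customerlist']//tr");
        if (rows == null)
        {
            return customers;
        }

        foreach (HtmlNode row in rows)
        {
            // "td" is evaluated relative to the current row.
            // "//td" would start from the document root and match every cell.
            var cells = row.SelectNodes("td");
            if (cells == null)
            {
                continue; // e.g. a header row that contains only <th> cells
            }

            var cust = new Customer();
            foreach (HtmlNode cell in cells)
            {
                string text = cell.InnerText.Trim();
                switch (cell.GetAttributeValue("class", string.Empty))
                {
                    case "id": cust.ID = text; break;
                    case "date": cust.DateAdded = text; break;
                    case "time": cust.TimeAdded = text; break;
                    case "notes": cust.Notes = text; break;
                }
            }
            customers.Add(cust);
        }

        return customers;
    }
}
```

Calling CustomerParser.Parse(html) and iterating over the returned list should then yield one Customer per row. Setting properties as shown or passing the values to a constructor are both workable; the sketch uses properties only because that is what the question's code already does.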

10/8/2016 7:14:12 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow