LINQ and HTML Agility Pack: filling parsed HTML table data into a DataTable

datatable html-agility-pack linq


I am using the following query to parse HTML table data.

Dim q = From table In htmldoc.DocumentNode.SelectNodes("//table[@class='Seller']").Cast(Of HtmlNode)()
                    From row In table.SelectNodes("tr").Cast(Of HtmlNode)()
                    From header In row.SelectNodes("th").Cast(Of HtmlNode)()
                    From cell In row.SelectNodes("td").Cast(Of HtmlNode)()
               Select New With {Key .Table = table.Id, Key .CellText = cell.InnerText, Key .headerText = header.InnerText}

How can I use For Each loops to fill this into a DataTable?

I would create the columns first from the header data, then use a nested For Each loop to fill the cell data into the table, but I am not sure how. Also, are there any suggested changes to the LINQ query above?

Note: The HTML page always contains exactly one table.

11/2/2012 7:32:00 AM

Accepted Answer

Given the following html

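' Only the "Another ..." cells survive from the original markup; the header
' texts and the first cell of each row below are assumed for illustration.
Dim t = <table class='Seller' id='MyTable'>
            <tr><th>Foo</th><td>Foo</td><td>Another Foo</td></tr>
            <tr><th>Bar</th><td>Bar</td><td>Another Bar</td></tr>
            <tr><th>Third</th><td>Third</td><td>Another Third</td></tr>
        </table>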

Dim htmldoc = New HtmlAgilityPack.HtmlDocument()
htmldoc.LoadHtml(t.ToString())

and your query

Dim q = From table In htmldoc.DocumentNode.SelectNodes("//table[@class='Seller']")
            From row In table.SelectNodes("tr")
                From header In row.SelectNodes("th")
                From cell In row.SelectNodes("td")
        Select New With {.Table = table.Id, .CellText = cell.InnerText, .headerText = header.InnerText}

you can use GroupBy or ToLookup to group the objects by columns:

Dim grouped = q.ToLookup(Function(a) a.headerText)

and use this grouping to create a DataTable with the appropriate DataColumns:

Dim dt = New DataTable()

For Each h In grouped.Select(Function(g) g.Key)
    dt.Columns.Add(h)
Next
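As a minimal sketch of the grouping step without any HTML involved, the snippet below groups (header, cell) pairs with ToLookup and adds one column per key. The module name LookupSketch, the function BuildColumns, and the sample values are assumptions made for illustration; tuples stand in for the anonymous objects produced by the query.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module LookupSketch
    ' Hypothetical helper: builds a DataTable with one column per distinct header text.
    ' Each tuple is (headerText, cellText), standing in for the query result.
    Function BuildColumns(pairs As IEnumerable(Of Tuple(Of String, String))) As DataTable
        ' Group all cells that share the same header text.
        Dim grouped = pairs.ToLookup(Function(p) p.Item1)

        Dim dt = New DataTable()
        For Each g In grouped
            ' The group key is the header text, which becomes the column name.
            dt.Columns.Add(g.Key)
        Next
        Return dt
    End Function

    Sub Main()
        Dim pairs = {
            Tuple.Create("Name", "Alice"),
            Tuple.Create("Name", "Bob"),
            Tuple.Create("City", "Oslo"),
            Tuple.Create("City", "Rome")
        }
        Dim dt = BuildColumns(pairs)
        For Each c As DataColumn In dt.Columns
            Console.WriteLine(c.ColumnName)
        Next
    End Sub
End Module
```

Each group still holds every cell for its header, which is why the data has to be turned from columns into rows before it can be added to the table.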

Now, for filling the DataTable, you have to "rotate" the grouping, since each group contains the data for one column, but we want the data for each row. Let's use a little helper method

' Turns column-wise sequences into row-wise sequences.
Function Rotate(Of T, TR)(source As IEnumerable(Of IEnumerable(Of T)),
                          selector As Func(Of IEnumerable(Of T), IEnumerable(Of TR))) As IEnumerable(Of IEnumerable(Of TR))

    Dim result = New List(Of IEnumerable(Of TR))
    ' One enumerator per column, advanced in lockstep.
    Dim enums = source.Select(Function(e) e.GetEnumerator()).ToArray()
    ' Stops as soon as any column runs out of items.
    While enums.All(Function(e) e.MoveNext())
        ' The current item of every column forms one row.
        result.Add(selector(enums.Select(Function(e) e.Current)).ToArray())
    End While

    Return result
End Function

to fill the DataTable.

For Each rrow In Rotate(grouped, Function(row) row.Select(Function(e) e.CellText))
    dt.Rows.Add(rrow.ToArray())
Next

And now the DataTable will look like this:

[Image: the resulting DataTable]
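To see the rotation in isolation, here is a small sketch that feeds plain string arrays through the same Rotate helper and adds the resulting rows to a DataTable. The module name RotateSketch, the function ToTable, and the sample data are assumptions for illustration only; no HTML parsing is involved.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module RotateSketch
    ' Same helper as in the answer: walks all column sequences in lockstep
    ' and emits one projected row per step.
    Function Rotate(Of T, TR)(source As IEnumerable(Of IEnumerable(Of T)),
                              selector As Func(Of IEnumerable(Of T), IEnumerable(Of TR))) As IEnumerable(Of IEnumerable(Of TR))
        Dim result = New List(Of IEnumerable(Of TR))
        Dim enums = source.Select(Function(e) e.GetEnumerator()).ToArray()
        While enums.All(Function(e) e.MoveNext())
            result.Add(selector(enums.Select(Function(e) e.Current)).ToArray())
        End While
        Return result
    End Function

    ' Hypothetical helper: builds a DataTable from column-wise data.
    Function ToTable(headers As String(), columns As IEnumerable(Of IEnumerable(Of String))) As DataTable
        Dim dt = New DataTable()
        For Each h In headers
            dt.Columns.Add(h)
        Next
        ' Each rotated sequence holds one value per column, i.e. one row.
        For Each r In Rotate(Of String, String)(columns, Function(row) row)
            dt.Rows.Add(r.Cast(Of Object)().ToArray())
        Next
        Return dt
    End Function

    Sub Main()
        Dim columns = New List(Of IEnumerable(Of String)) From {
            New String() {"Alice", "Bob"},
            New String() {"Oslo", "Rome"}
        }
        Dim dt = ToTable({"Name", "City"}, columns)
        For Each r As DataRow In dt.Rows
            Console.WriteLine(String.Join(", ", r.ItemArray))
        Next
        ' Prints:
        ' Alice, Oslo
        ' Bob, Rome
    End Sub
End Module
```

Because the loop stops as soon as any column runs out of items, columns of unequal length produce only as many rows as the shortest column.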

11/2/2012 10:35:00 AM

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow