I am using the following query to parse html table data.
Dim q = From table In htmldoc.DocumentNode.SelectNodes("//table[@class='Seller']").Cast(Of HtmlNode)()
From row In table.SelectNodes("tr").Cast(Of HtmlNode)()
From header In row.SelectNodes("th").Cast(Of HtmlNode)()
From cell In row.SelectNodes("td").Cast(Of HtmlNode)()
Select New With {Key .Table = table.Id, Key .CellText = cell.InnerText, Key .headerText = header.InnerText}
How can i use for each loops how can to fill this into a datatable?
I would create columns first using the header data then use a nested for each loop to fill the cell data in the table, but i am not sure how to, also any suggested changes on the above LINQ query?
Note: The html page contains only one table always.
Given the following html
Dim t = <table class='Seller' id='MyTable'>
<tr>
<th>FooColumn</th>
<td>Foo</td>
<td>Another Foo</td>
</tr>
<tr>
<th>BarColumn</th>
<td>Bar</td>
<td>Another Bar</td>
</tr>
<tr>
<th>ThirdColumn</th>
<td>Third</td>
<td>Another Third</td>
</tr>
</table>
Dim htmldoc = New HtmlAgilityPack.HtmlDocument()
htmldoc.LoadHtml(t.ToString())
and your query
Dim q = From table In htmldoc.DocumentNode.SelectNodes("//table[@class='Seller']")
From row In table.SelectNodes("tr")
From header In row.SelectNodes("th")
From cell In row.SelectNodes("td")
Select New With {.Table = table.Id, .CellText = cell.InnerText, .headerText = header.InnerText}
you can use GroupBy
or ToLookup
to group the objects by columns:
Dim grouped = q.ToLookup(Function(a) a.headerText)
and use this grouping to create a DataTable
with the appropriate DataColumn
s:
Dim dt = new DataTable()
For Each h in grouped.Select(Function(g) g.Key)
dt.Columns.Add(h)
Next
Now, for filling the DataTable
, you have to "rotate" the grouping, since each group contains the data for one column, but we want the data for each row. Let's use a little helper method
Function Rotate(Of T, TR)(source As IEnumerable(Of IEnumerable(Of T)),
selector As Func(Of IEnumerable(Of T), IEnumerable(Of TR))) As IEnumerable(Of IEnumerable(Of TR))
Dim result = new List(Of IEnumerable(Of TR))
Dim enums = source.Select(Function(e) e.GetEnumerator()).ToArray()
While enums.All(Function(e) e.MoveNext())
result.Add(selector(enums.Select(Function(e) e.Current)).ToArray())
End While
Return result
End Function
to fill the DataTable
.
For Each rrow in Rotate(grouped, Function(row) row.Select(Function(e) e.CellText))
dt.Rows.Add(rrow.ToArray())
Next
And now the DataTable
will look like this: