HtmlAgilityPack - C# 4.0 - How to read a specific table

c# html-agility-pack

Question

How can I read data from a specific table using C# 4.0 and HTML Agility Pack? If there are ten tables, for example, and I want to read numbers from table 6 or I know the table id,

Or, let's assume I want to read the td value that follows a certain td.

Create a table after a certain div, element, or text. Are they even feasible?

1
5
10/17/2011 11:24:30 PM

Accepted Answer

Everything you asked for could be accomplished rather simply. It doesn't matter if there may be gaps in its documentation; it should be comparable to XML and the network'sXmlDocument both in terms of usability and functionality.

How can I read values inside certain table? Let's say there are 10 tables and I want to read values from 6th or I have table id.

locating table number six:

// XPath
var table6 = doc.DocumentNode.SelectSingleNode("//table[6]");

// LINQ
var table6 = doc.DocumentNode.Descendants("table").Skip(5).FirstOrDefault();

by id locating the table/element

var myTable = doc.GetElementById("myTable");

// XPath
var myTable = doc.DocumentNode.SelectSingleNode("//table[@id='myTable']");
var myTable = doc.DocumentNode.SelectSingleNode("//*[@id='myTable']");

// LINQ
var myTable = doc.DocumentNode
    .Descendants("table")
    .Where(table => table.Attributes.Contains("id"))
    .SingleOrDefault(table => table.Attributes["id"].Value == "myTable");
var myTable = doc.DocumentNode
    .Descendants()
    .Where(e => e.Attributes.Contains("id"))
    .SingleOrDefault(e => e.Attributes["id"].Value == "myTable");
var myTable = doc.DocumentNode
    .Descendants("table")
    .SingleOrDefault(table => table.GetAttributeValue("id", null) == "myTable");
var myTable = doc.DocumentNode
    .Descendants()
    .SingleOrDefault(e => e.GetAttributeValue("id", null) == "myTable");

Let's say I want to read td value coming after certain td.

// XPath
var certainTd = table6.SelectSingleNode("//td[2]");
var tdAfterCertainTd = certainTd.SelectSingleNode("following-sibling::td[1]");

// LINQ (not so easy)
var certainTd = table6.Descendants("td").Skip(1).FirstOrDefault();
var tdAfterCertainTd = certainTd.NextSibling;
while (tdAfterCertainTd != null)
{
    if (tdAfterCertainTd.Name == "td")
        break;
    tdAfterCertainTd = tdAfterCertainTd.NextSibling;
}

Table coming after certain div or element or text.

// XPath
var certainDiv = doc.DocumentNode.SelectSingleNode("//div[1]");
var tableAfterCertainDiv = certainDiv.SelectSingleNode("following-sibling::table[1]");

// LINQ (not so easy)
var certainDiv = doc.DocumentNode.Descendants("div").FirstOrDefault();
var tableAfterCertainDiv = certainDiv.NextSibling;
while (tableAfterCertainDiv != null)
{
    if (tableAfterCertainDiv.Name == "table")
        break;
    tableAfterCertainDiv = tableAfterCertainDiv.NextSibling;
}

You ought to spot some trends.

16
10/18/2011 3:57:08 PM


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow