HtmlAgilityPack - How to read certain table - c# 4.0

c# html-agility-pack

Question

With using c# 4.0 and htmlagilitypack how can i read values inside certain table. I mean lets say there are 10 tables and i want to read values from 6 th or i have table id.

Or lets say i want to read td value coming after certain td.

Or table coming after certain div or element or text. Are these possible ?

Accepted Answer

Everything you have asked about could be done relatively easily. It doesn't matter that its documentation might be lacking, it should be similar to XML and the network's XmlDocument implementation in both use and functionality.

How can I read values inside certain table? Let's say there are 10 tables and I want to read values from 6th or I have table id.

Finding the 6th table:

// XPath
var table6 = doc.DocumentNode.SelectSingleNode("//table[6]");

// LINQ
var table6 = doc.DocumentNode.Descendants("table").Skip(5).FirstOrDefault();

Finding the table/element by id:

var myTable = doc.GetElementById("myTable");

// XPath
var myTable = doc.DocumentNode.SelectSingleNode("//table[@id='myTable']");
var myTable = doc.DocumentNode.SelectSingleNode("//*[@id='myTable']");

// LINQ
var myTable = doc.DocumentNode
    .Descendants("table")
    .Where(table => table.Attributes.Contains("id"))
    .SingleOrDefault(table => table.Attributes["id"].Value == "myTable");
var myTable = doc.DocumentNode
    .Descendants()
    .Where(e => e.Attributes.Contains("id"))
    .SingleOrDefault(e => e.Attributes["id"].Value == "myTable");
var myTable = doc.DocumentNode
    .Descendants("table")
    .SingleOrDefault(table => table.GetAttributeValue("id", null) == "myTable");
var myTable = doc.DocumentNode
    .Descendants()
    .SingleOrDefault(e => e.GetAttributeValue("id", null) == "myTable");

Let's say I want to read td value coming after certain td.

// XPath
var certainTd = table6.SelectSingleNode("//td[2]");
var tdAfterCertainTd = certainTd.SelectSingleNode("following-sibling::td[1]");

// LINQ (not so easy)
var certainTd = table6.Descendants("td").Skip(1).FirstOrDefault();
var tdAfterCertainTd = certainTd.NextSibling;
while (tdAfterCertainTd != null)
{
    if (tdAfterCertainTd.Name == "td")
        break;
    tdAfterCertainTd = tdAfterCertainTd.NextSibling;
}

Table coming after certain div or element or text.

// XPath
var certainDiv = doc.DocumentNode.SelectSingleNode("//div[1]");
var tableAfterCertainDiv = certainDiv.SelectSingleNode("following-sibling::table[1]");

// LINQ (not so easy)
var certainDiv = doc.DocumentNode.Descendants("div").FirstOrDefault();
var tableAfterCertainDiv = certainDiv.NextSibling;
while (tableAfterCertainDiv != null)
{
    if (tableAfterCertainDiv.Name == "table")
        break;
    tableAfterCertainDiv = tableAfterCertainDiv.NextSibling;
}

You should notice some patterns.




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why