Using XPath with HTML Agility Pack to extract one td at a time from a tbody

html html-agility-pack xpath

Question

I'm trying to parse the table at the following Google Finance URL.

http://www.google.com/finance/historical?q=BOM:533278

I'm trying to extract just the data from the Close column of that table. However, when I use the XPath

hd.DocumentNode.SelectSingleNode("//td[@class='rgt']")

I get every node whose class attribute has the value rgt, all in a single node's InnerText.

I need each value separately, not all at once. I must be doing something foolish here. Thanks in advance.

The actual XPath that Firebug reports is as follows.

/html/body/div/div/div[3]/div[2]/div/div[2]
     /div[2]/div/form/div[2]/table/tbody/tr[2]/td[5]

But somehow, after the form tag, HTML Agility Pack returns a null node. I never expected this to take so long to implement.

3/6/2011 9:10:26 PM

Accepted Answer

If you're obtaining the XPath of the elements you need to parse via Firebug or any Firefox extension (such as XPather), you may need to remove the tbody tags from the XPath.

See the following answer on SO: Why is 'tbody' added to 'table' by Firebug?

When using HtmlAgilityPack, the XPath produced by Firebug or by any other tool tied to Firefox may not match, because the HTML source you're processing can differ from the HTML source as Firefox represents it.

If necessary, open the same page in Internet Explorer 8 and use Developer Tools (F12) to do the same things you would do with Firebug. Otherwise, use another tool, such as HAP Explorer, which can be downloaded from the HtmlAgilityPack page.
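
To illustrate the tbody point, here is a minimal sketch. The markup is a hypothetical stand-in, not the real Google Finance page, and the class name TbodyDemo is made up for the example. It assumes the raw source has no tbody element and that HtmlAgilityPack does not add one while parsing, so a browser-style path that includes tbody is expected to come back null while the same path without it finds the cell.

```csharp
using System;
using HtmlAgilityPack;

class TbodyDemo
{
    static void Main()
    {
        // Hypothetical markup: a table written without <tbody>, as raw page sources often are.
        const string html =
            "<html><body><table>" +
            "<tr><td class='lm'>Mar 4, 2011</td><td class='rgt'>10.50</td></tr>" +
            "</table></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Path as a browser tool might report it, including the tbody the browser inserted.
        HtmlNode withTbody =
            doc.DocumentNode.SelectSingleNode("/html/body/table/tbody/tr/td[2]");

        // The same path with the tbody step removed.
        HtmlNode withoutTbody =
            doc.DocumentNode.SelectSingleNode("/html/body/table/tr/td[2]");

        Console.WriteLine("with tbody: " + (withTbody == null ? "null" : withTbody.InnerText));
        Console.WriteLine("without tbody: " + (withoutTbody == null ? "null" : withoutTbody.InnerText));
    }
}
```

If both lookups return null on a real page, the raw HTML probably differs from the browser's view in more ways than the tbody step, so it is worth inspecting doc.DocumentNode.OuterHtml directly.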

5/23/2017 12:22:40 PM

Popular Answer

There are several options. Here is one solution that is based on the Date td (the one with the "lm" class):

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// ... load the doc ...

// For each row that contains a td with class 'lm', select that row's fifth td.
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//td[@class='lm']/../td[5]"))
{
    Console.WriteLine("node=" + node.InnerText);
}
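
As a self-contained variation on the loop above, the following sketch runs the same XPath against a small inline table. The rows and values are invented sample data, the class name CloseColumnDemo is hypothetical, and the column layout (a first cell with class "lm" followed by numeric cells, with the fifth cell holding the close value) is an assumption about the page rather than something verified against it. It also guards against SelectNodes returning null, which HtmlAgilityPack does when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class CloseColumnDemo
{
    static void Main()
    {
        // Invented sample rows: first cell (class 'lm'), then four numeric cells, then a sixth.
        const string html =
            "<table>" +
            "<tr><td class='lm'>Mar 4, 2011</td><td class='rgt'>10.00</td><td class='rgt'>10.90</td>" +
            "<td class='rgt'>9.80</td><td class='rgt'>10.50</td><td class='rgt'>1200</td></tr>" +
            "<tr><td class='lm'>Mar 3, 2011</td><td class='rgt'>9.70</td><td class='rgt'>10.20</td>" +
            "<td class='rgt'>9.60</td><td class='rgt'>10.05</td><td class='rgt'>900</td></tr>" +
            "</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns every match; SelectSingleNode would return only the first.
        HtmlNodeCollection cells =
            doc.DocumentNode.SelectNodes("//td[@class='lm']/../td[5]");

        // SelectNodes yields null (not an empty collection) when the XPath matches nothing.
        if (cells == null)
        {
            Console.WriteLine("No matching cells found.");
            return;
        }

        foreach (HtmlNode cell in cells)
        {
            // One fifth-column value per row, trimmed of surrounding whitespace.
            Console.WriteLine(cell.InnerText.Trim());
        }
    }
}
```

Anchoring on the first cell's class and then stepping to td[5] keeps the XPath independent of the long absolute path from the page root, so it does not depend on whether a tbody element is present.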


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow