How to select a specific table cell using HTML Agility Pack

html-agility-pack vb.net xpath

Question

I have to pull out particular fields from cells in an HTML table. Using Firebug I was able to get the exact XPath to the cells I need (unfortunately, the cells don't have an id attribute). I thought I could use DocumentNode.SelectSingleNode and pass in that path, but it doesn't seem to be working right. What am I doing wrong? Or is there a better approach than the one I am using? Unfortunately, I have no experience with XPath, so this is turning out to be harder than I expected. Here's what I have so far (I know the HTML is particularly messy, but that's not in my control to change):

Dim page As New HtmlAgilityPack.HtmlDocument
Dim node As HtmlAgilityPack.HtmlNode
page.LoadHtml(fileContents)
node = page.DocumentNode.SelectSingleNode("/html/body/form/div[6]/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td[2]")

Much appreciated.

Accepted Answer

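Firebug may have shown you a repaired version of the HTML. It displays the DOM after the browser has fixed up the markup (browsers insert tbody elements into tables, for instance), whereas HTML Agility Pack parses the raw source, so a path copied from Firebug may not match. To select an HTML node, it is better to anchor on a class or id. For example: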

//div[@class='content']//table//tr[1]/td[2]

Shorten the path and anchor it with a class or id selector.

If the table has its own id, you can use:

//table[@id='tableid']/tr[1]/td[2]

Try it; you will find XPath interesting.
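
As a rough illustration of the id-anchored approach, here is a minimal VB.NET sketch. It assumes the HtmlAgilityPack package is referenced; the sample markup, the id value, and the module name CellLookupSketch are hypothetical and only stand in for the real page. It uses //tr rather than /tr after the table so that the path matches whether or not the source contains tbody elements.

```vbnet
Imports System
Imports HtmlAgilityPack

Module CellLookupSketch
    Sub Main()
        ' Hypothetical sample markup: note there is no <tbody> in the source.
        Dim html As String =
            "<html><body><div class='content'><table id='tableid'>" &
            "<tr><td>Name</td><td>Alice</td></tr>" &
            "<tr><td>City</td><td>Paris</td></tr>" &
            "</table></div></body></html>"

        Dim page As New HtmlDocument()
        page.LoadHtml(html)

        ' Anchor on the id, then use // so the path matches with or without <tbody>.
        Dim cell As HtmlNode =
            page.DocumentNode.SelectSingleNode("//table[@id='tableid']//tr[1]/td[2]")

        ' SelectSingleNode returns Nothing when the XPath matches no node,
        ' so check before reading the cell.
        If cell Is Nothing Then
            Console.WriteLine("No matching cell found")
        Else
            Console.WriteLine(cell.InnerText.Trim())
        End If
    End Sub
End Module
```

If SelectSingleNode returns Nothing for a path copied from a browser tool, removing the tbody steps from the path (or switching to // as above) is a reasonable first thing to try.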



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow