I have to pull out particular fields from cells in an HTML table. Using Firebug I was able to get the exact XPath to the cells I need (unfortunately, the cells don't have an id tag). I thought I could use DocumentNode.SelectSingleNode and pass in that path, but it doesn't seem to be working right. What am I doing wrong? Or is there a better approach to this than how I am doing it? Unfortunately, I have no experience with XPath so this is turning out harder than I expected it to be. Here's what I have so far (I know the HTML is particuarly messy, but that's not in my control to change):
Dim page As New HtmlAgilityPack.HtmlDocument
Dim node As HtmlAgilityPack.HtmlNode
page.LoadHtml(fileContents)
node = page.DocumentNode.SelectSingleNode("/html/body/form/div[6]/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td[2]")
Much appreciated.
Firebug maybe fixed broken html tags. If you want to pick and Html node,it is recommend use class or id. For example:
//div[@class='content']//table//tr[1]/td[2]
shorten the path,and use class or id selector.
if the table has it's own id,you can use:
//table[@id='tableid']/tr[1]/td[2]
try it,you will find XPATH is interesting.