XPath in VB.NET with HTML Agility Pack

html-agility-pack vb.net xpath

Question

I have the following VB.NET code, which runs without a hitch; the message box shows the exact number of elements that have an id attribute.

Dim hreftext = htmldoc.DocumentNode.SelectNodes("//*[@id]")
 MsgBox(hreftext.Count)

The issue is that, although there are 6 elements with the id rso, the following fails with the error "Object reference not set to an instance of an object".

Dim hreftext = htmldoc.DocumentNode.SelectNodes("//*[@id='rso']")
 MsgBox(hreftext.Count)

Is the second snippet flawed in any way?

1
3
11/2/2012 3:24:20 AM

Accepted Answer

I've read some of your other SO questions, and it seems you are trying to scrape Google Shopping, but you examined the generated DOM rather than the downloaded HTML source.

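You are seeing the error because the id "rso" is missing from the downloaded HTML source. SelectNodes returns Nothing when no node matches, so reading hreftext.Count throws. Google does not like being scraped and makes its content harder to scrape.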
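A minimal sketch of guarding against that case, assuming a console project that references HtmlAgilityPack. The inline markup is a hypothetical stand-in for the downloaded page, not Google's actual HTML:

```vbnet
Imports System
Imports HtmlAgilityPack

Module NullSafeSelect
    Sub Main()
        ' Hypothetical stand-in for the downloaded page: no element has id="rso".
        Dim htmldoc As New HtmlAgilityPack.HtmlDocument()
        htmldoc.LoadHtml("<html><body><div id='main'>hello</div></body></html>")

        ' With default settings, SelectNodes returns Nothing (not an empty
        ' collection) when the XPath expression matches no node.
        Dim hreftext As HtmlNodeCollection =
            htmldoc.DocumentNode.SelectNodes("//*[@id='rso']")

        If hreftext Is Nothing Then
            Console.WriteLine("No element with id 'rso' in the downloaded HTML.")
        Else
            Console.WriteLine(hreftext.Count)
        End If
    End Sub
End Module
```

Checking for Nothing before touching Count turns the exception into a readable message, which makes it easier to tell that the id is absent from the source.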

To see what I mean, add a multiline textbox to your form and replace your existing XPath code with the following:

TextBox1.Text = htmldoc.DocumentNode.OuterHtml

Pretty, indeed!

2
11/5/2012 12:51:10 PM

Popular Answer

To elaborate on checking the case: XPath string comparisons are case-sensitive, so @id='rso' will not match an id such as "RSO".

Try:

Dim hreftext = htmldoc.DocumentNode.SelectNodes("//*[translate(@id,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')='rso']")
 MsgBox(hreftext.Count)

to match any node whose id is "rso" in any combination of upper and lower case.
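As a rough illustration of the translate() approach, here is a self-contained sketch. It assumes a console project that references HtmlAgilityPack, and the mixed-case markup is made up for the example. It also guards against Nothing, since SelectNodes returns Nothing when no node matches:

```vbnet
Imports System
Imports HtmlAgilityPack

Module CaseInsensitiveId
    Sub Main()
        ' Hypothetical markup with mixed-case ids, for illustration only.
        Dim htmldoc As New HtmlAgilityPack.HtmlDocument()
        htmldoc.LoadHtml("<div id='RSO'>a</div><div id='Rso'>b</div><div id='other'>c</div>")

        ' translate() lowercases the id value before comparing, so ids such as
        ' "RSO" and "Rso" should both match, while "other" should not.
        Dim xpath As String =
            "//*[translate(@id,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')='rso']"
        Dim nodes As HtmlNodeCollection = htmldoc.DocumentNode.SelectNodes(xpath)

        ' Treat "no match" (Nothing) as a count of zero instead of throwing.
        Dim count As Integer = If(nodes Is Nothing, 0, nodes.Count)
        Console.WriteLine(count)
    End Sub
End Module
```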




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow