HTML Agility Pack parsing error?

c# html-agility-pack

Question

I'm working through a few hundred pages of Amazon search results for some data analysis, and I'm using HAP (HTML Agility Pack) to parse out the results:

hap.DocumentNode.SelectNodes("//ul[@id='s-results-list-atf']/li")

This returns only the first four li elements of the result list, even though the list contains more. I've checked and double-checked the XPath and I'm sure it's right. Am I doing something very wrong? I can't see why the results are limited to just four. A typical page might be: https://www.amazon.com/s/?url=search-alias%3Daps&field-keywords=100+percent+barstow

Other search results pages show the same problem: the query always returns significantly fewer items than the page actually contains.

Accepted Answer

Give this a try:

hap.DocumentNode.SelectNodes("//div[contains(@id,'tfResults')]//li[contains(@class,'s-result-item')]");

No guarantees for other pages, though, because I'm basing this only on the link you gave.
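
For illustration, here is a minimal sketch of how that selector might be used. The sample markup is made up and is not real Amazon HTML; in particular, the container ids atfResults and btfResults are assumptions chosen only so that more than one div id contains 'tfResults'. The point it tries to show is that the broader XPath collects matching li elements from every such container rather than from a single ul, and that SelectNodes returns null when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical stand-in markup: results split across two containers.
        // The ids "atfResults" and "btfResults" are assumptions for this sketch.
        const string html = @"
<div id='atfResults'>
  <ul id='s-results-list-atf'>
    <li class='s-result-item'>Item A</li>
    <li class='s-result-item'>Item B</li>
  </ul>
</div>
<div id='btfResults'>
  <ul>
    <li class='s-result-item'>Item C</li>
  </ul>
</div>";

        var hap = new HtmlDocument();
        hap.LoadHtml(html);

        // Same XPath as in the answer: any div whose id contains 'tfResults',
        // then every descendant li whose class contains 's-result-item'.
        var items = hap.DocumentNode.SelectNodes(
            "//div[contains(@id,'tfResults')]//li[contains(@class,'s-result-item')]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (items == null)
        {
            Console.WriteLine("No result items found.");
            return;
        }

        Console.WriteLine($"Found {items.Count} result items.");
        foreach (var li in items)
        {
            Console.WriteLine(li.InnerText.Trim());
        }
    }
}
```

If the real pages really do split results this way, the original query against a single ul id would see only the items in that one list, which would be consistent with the behavior described in the question.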



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow