I cannot find the supporting documentation for HtmlAgilityPack on its Codeplex page. What I want to do is access a div on the Amazon website and scrape its text data for a WPF application.
var getWeb = new HtmlWeb();
var doc = getWeb.Load(uri);
HtmlNode ourNode = doc.DocumentNode.SelectSingleNode("//div[@id = 'zg_centerListWrapper']");
About 12 additional divs, each of which is a component of the
It would seem laborious to access the characteristics of each one (and I'm also not exactly sure how I'd accomplish it at first sight). So, should I replace with
? And how would I go about doing it? In addition, I find it difficult to imagine that there isn't documentation for the
... Considering that YouTube now seems to be my greatest source, maybe I'm searching in the wrong areas.
In fact, the parameter of SelectSingleNode() (and of SelectNodes()) is an XPath expression, specifically XPath version 1.0 (see the XPath 1.0 specification).
XPath is a separate technology with its own specification, discussions, and documentation. Instead of looking for HtmlAgilityPack (HAP) specific information, you should generally look for XPath tutorials or articles to get a better understanding of the kind of expression you need to pass to HAP to reach particular HTML elements.
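To illustrate that XPath is independent of HAP, here is a minimal sketch that runs an XPath 1.0 expression through .NET's built-in System.Xml classes instead. The class name XPathOnlyDemo, the helper SelectTexts, and the sample markup are made up for illustration, and this only works because the sample is well-formed XML (real-world HTML usually is not, which is why HAP exists).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathOnlyDemo
{
    // Hypothetical helper: evaluates an XPath 1.0 expression against
    // well-formed XML and returns the text of each matching node.
    public static List<string> SelectTexts(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var texts = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            texts.Add(node.InnerText);
        }
        return texts;
    }

    public static void Main()
    {
        // Made-up sample markup, not taken from any real page.
        const string xml =
            "<root><div id=\"a\"><div>one</div><div>two</div></div><div id=\"b\"/></root>";

        // "//div[@id = 'a']" finds the div with id "a" anywhere in the document;
        // the trailing "/div" then selects its direct div children.
        foreach (string text in SelectTexts(xml, "//div[@id = 'a']/div"))
        {
            Console.WriteLine(text);
        }
    }
}
```

The expression syntax is the same kind of string you would hand to HAP's SelectNodes, so time spent on a general XPath tutorial carries over directly.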
Let's say your HTML looks like this for the purpose of illustration:
<div id="zg_centerListWrapper">
    <div>I want this</div>
    <div>..and this</div>
    <div>..and this one too</div>
</div>
and the divs you're interested in are the direct children of div[@id = 'zg_centerListWrapper'], you may use the XPath shown below to get them:
var xpath = "//div[@id = 'zg_centerListWrapper']/div";
HtmlNodeCollection ourNodes = doc.DocumentNode.SelectNodes(xpath);
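For completeness, here is a rough end-to-end sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced. It parses the illustrative HTML from a string with LoadHtml rather than fetching a live page, and the class name ChildDivDemo and helper GetChildDivTexts are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class ChildDivDemo
{
    // Hypothetical helper: returns the trimmed text of every div that is a
    // direct child of the div with id "zg_centerListWrapper".
    public static List<string> GetChildDivTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var xpath = "//div[@id = 'zg_centerListWrapper']/div";
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);

        var texts = new List<string>();
        // SelectNodes can return null when nothing matches, so guard first.
        if (nodes == null)
        {
            return texts;
        }

        foreach (HtmlNode node in nodes)
        {
            texts.Add(node.InnerText.Trim());
        }
        return texts;
    }

    public static void Main()
    {
        // The illustrative HTML from above, not the real Amazon markup.
        const string html = @"<div id=""zg_centerListWrapper"">
    <div>I want this</div>
    <div>..and this</div>
    <div>..and this one too</div>
</div>";

        foreach (string text in GetChildDivTexts(html))
        {
            Console.WriteLine(text);
        }
    }
}
```

Against a live page you would swap LoadHtml for the HtmlWeb().Load(uri) call from the question; the XPath and the loop stay the same.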
You may utilize
then something like this:
.Where(div => div.Attributes.Contains("class") && div.Attributes["class"].Value.Contains("best category"))
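As a sketch of how that filter might fit into a complete program: the snippet below applies the same Where clause to the child divs selected by XPath. It assumes the HtmlAgilityPack NuGet package; the class attributes in the sample HTML and the names ClassFilterDemo and GetBestCategoryTexts are invented for illustration and do not reflect Amazon's actual markup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class ClassFilterDemo
{
    // Hypothetical helper: selects the child divs via XPath, then keeps only
    // those whose class attribute contains the substring "best category".
    public static List<string> GetBestCategoryTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//div[@id = 'zg_centerListWrapper']/div");

        // SelectNodes can return null when nothing matches.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes
            .Where(div => div.Attributes.Contains("class")
                       && div.Attributes["class"].Value.Contains("best category"))
            .Select(div => div.InnerText.Trim())
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample: only the first child div carries the class we filter on.
        const string html = @"<div id=""zg_centerListWrapper"">
    <div class=""best category"">Books</div>
    <div class=""other"">Music</div>
    <div>No class at all</div>
</div>";

        foreach (string text in GetBestCategoryTexts(html))
        {
            Console.WriteLine(text);
        }
    }
}
```

Note that Contains on the attribute value is a plain substring match, so it would also accept a class string such as "best category-wide"; whether that matters depends on the page you are scraping.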
However, documentation would undoubtedly be helpful.