HTML Agility Pack: Parsing Div Blocks

.net c# html html-agility-pack

Question

I need to parse items from an online shop: for each item I need its name and price. Each item block is a separate div inside a catalog div that holds all of the items.
I tried the code below and it kind of works, but I would prefer to get both the name and the price in one loop. How can I do that? Thanks!

        // Requires: using HtmlAgilityPack;
        var url = "http://bestaqua.com.ua/catalog/filtry-obratnogo-osmosa";
        HtmlWeb web = new HtmlWeb();
        HtmlDocument HtmlDoc = web.Load(url);
        foreach (HtmlNode node in HtmlDoc.DocumentNode.SelectNodes("//div[@class='catalog_blocks']"))
        {
            // The leading '.' makes the XPath relative to node;
            // a bare '//' would search the whole document again.
            foreach (HtmlNode item_name in node.SelectNodes(".//div[@class='catalog_blocks-item-name']"))
            {
                string name = item_name.InnerText;
                System.Diagnostics.Debug.Write("NAME: " + name + "\n");
            }
            // Second pass over the same block, this time for prices.
            foreach (HtmlNode item_price in node.SelectNodes(".//span[@class='price-new']"))
            {
                string price = item_price.InnerText;
                System.Diagnostics.Debug.Write("PRICE: " + price + "\n");
            }
        }

Accepted Answer

Since SelectNodes takes an XPath expression, you can combine both selectors with the union operator "|", which gives you a single collection to loop over. Note that inside the foreach loop you still need to check which kind of element each node actually is.
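
A minimal sketch of that idea, not a definitive implementation. It reuses the class names from the question and assumes that the matched nodes come back in document order and that each item's name div appears before its price span. The `CatalogParser` class and `ParseItems` method are hypothetical names introduced only for illustration. The method takes an HTML string so it can be tried on a small snippet without hitting the network.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class CatalogParser
{
    // Hypothetical helper: returns (name, price) pairs found in the given HTML.
    public static List<(string Name, string Price)> ParseItems(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var items = new List<(string Name, string Price)>();

        // One XPath with a union: name divs and price spans in a single collection.
        var nodes = doc.DocumentNode.SelectNodes(
            "//div[@class='catalog_blocks-item-name'] | //span[@class='price-new']");
        if (nodes == null)
        {
            return items; // SelectNodes returns null when nothing matches
        }

        string currentName = null;
        foreach (HtmlNode node in nodes)
        {
            if (node.Name == "div")
            {
                // A name div starts a new item (assumption: name precedes price).
                currentName = node.InnerText.Trim();
            }
            else if (currentName != null)
            {
                // A price span completes the item that is currently open.
                items.Add((currentName, node.InnerText.Trim()));
                currentName = null;
            }
        }
        return items;
    }
}
```

With this pairing, a name that is never followed by a price is silently dropped, as is a price with no preceding name. If the page can contain such items, that case would need explicit handling.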




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow