HTML Agility Pack find Descendants

c# html-agility-pack xpath

Question

I am crawling a web page and want to display some of its content on my own page, but I am stuck on the code that finds descendants. Below is the page's HTML:

<ul class="results">
    <li class="gts" data-webm-section="OAG-AD-14960184">
        <a class="item-link-container" href="/bikes/details/2016-Indian-Chieftain-Dark-Horse-MY17/OAG-AD-14960184/?cr=0&amp;gts=OAG-AD-14960184&amp;gtsviewtype=TopSpot&amp;gtssaleid=OAG-AD-14960184&amp;psq=%28%28Service%3D%5BBikesales%5D%26State%3D%5BNSW%5D%29%26%28%28%28%28SiloType%3D%5BBrand%20new%20bikes%20available%5D%7CSiloType%3D%5BBrand%20new%20bikes%20in%20stock%5D%29%7CSiloType%3D%5BDealer%20used%20bikes%5D%29%7CSiloType%3D%5BDemo%20%26%20near%20new%20bikes%5D%29%7CSiloType%3D%5BPrivate%20used%20bikes%5D%29%29&amp;pso=0&amp;pss=Premium">
            <header>
                <h3><span class=></span>Heading</h3>
                <div class="spotlight flag non-textual">Spotlight</div>
            </header>
            <div class="primary panel">
                <ul class="photos" data-js-lazy-load-length="3" style="width:8350px">
                    <li>
                        <img src="http//" height="221" width="334" alt="2016 Indian Chieftain Dark Horse MY17" />
                    </li>
                </ul>
                <div class="image-nav previous" data-webm-clickvalue="previous-image">
                    <span class="arrow"></span>
                    <span class="background"></span>
                </div>
                <div class="image-nav next" data-webm-clickvalue="next-image">
                    <span class="arrow"></span>
                    <span class="background"></span>
                </div>
                <div class="image-nav-count">
                    <span class="current">1</span> of 24
                </div>
            </div>
            <div class="secondary panel">
                <span class="price">$29,995*</span>
                <div data-fancybox-href="/mvcajax/bikes/PriceGuide/" class="pricing-message light-box-iframe">
                    Ride Away No More To Pay
                </div>
                <div class="features">
                    <ul>
                        <li class="ui-category">
                            <i></i>Cruiser
                        </li>
                        <li class="engine-size">
                            <i></i>1,811 cc
                        </li>
                        <li class="odometer">
                            <i></i>2,552 km
                        </li>
                    </ul>
                    <div class="bike-facts non-textual"></div>
                </div>
            </div>
            <p class="description">**NO REGRETS - 7 Day Money Back guarantee** PLUS 12 months Warranty &amp; Roadside Assist.Conditions App...</p>

        </a>
    </li>
</ul>

From the HTML above, I want the image, price and bike category.

Below is my code:

 public async Task<ActionResult> Webcrawl()
    {
        string URL = "https://www.bikesales.com.au/bikes/new-south-wales/";
        List<bikes> bikelist = new List<bikes>();
        using (var client = new HttpClient())
        {
            var html = await client.GetStringAsync(URL);

            HtmlDocument Doc = new HtmlDocument();
            Doc.LoadHtml(html);

            var ProductsHtml = Doc.DocumentNode.Descendants("ul").Where(node => node.GetAttributeValue("class", "").Equals("results")).ToList();

            var ProductsList = ProductsHtml[0].Descendants("li").Where(node => node.GetAttributeValue("class", "").Equals("gts")).ToList();
            foreach (var list in ProductsList)
            {
                var PriceNode = list.SelectSingleNode("//div[@class='secondary panel']");
                var bike = new bikes
                {
                    Name = list.Descendants("h3").FirstOrDefault().InnerText,
                    Title = list.Descendants("p").FirstOrDefault().InnerText,
                    Price = PriceNode.Descendants("span").FirstOrDefault().InnerText,
                    Image = list.SelectNodes("//div[@class='primary panel']/ul[1]/li[1]/img").FirstOrDefault().ChildAttributes("src").FirstOrDefault().Value,
                    Type = list.SelectNodes("//div[@class='secondary panel']/div[2]/ul[1]/li[1]").FirstOrDefault().InnerText.Trim('\r', '\n', '\t'),
                    Engine = list.SelectNodes("//div[@class='secondary panel']/div[2]/ul[1]/li[2]").FirstOrDefault().InnerText.Trim('\r', '\n', '\t'),
                    Odometer = list.SelectNodes("//div[@class='secondary panel']/div[2]/ul[1]/li[3]").FirstOrDefault().InnerText.Trim('\r', '\n', '\t'),
                };
                bikelist.Add(bike);
            }
            return View(bikelist);
        }
    }

When I run the code above, every entry repeats the values of the first list item except for the name, i.e. the same image, same type and same price.

Please point out the mistake in my code. Thanks in advance.

Accepted answer

To search within a specific element, you need to start your XPath with a dot. Otherwise the search starts at the document root, so every product returns the first match in the whole page. I also rewrote some of the XPath expressions because they returned no data in some cases.

// Select every product <li> inside the results list.
var products = Doc.DocumentNode.SelectNodes("//ul[@class='results']/li[@class='gts']");
foreach (HtmlNode product in products)
{
    // The leading "." makes each XPath relative to the current product node.
    var bike = new bikes
    {
        Name = product.SelectSingleNode(".//li[@class='ui-category']").InnerText,
        Title = product.SelectSingleNode(".//header/h3").InnerText,
        Price = product.SelectSingleNode(".//span[@class='price']").InnerText,
        Image = product.SelectSingleNode(".//ul[@class='photos']//img").Attributes["src"].Value,
        Type = product.SelectSingleNode(".//li[@class='ui-category']").InnerText.Trim('\r', '\n', '\t'),
        Engine = product.SelectSingleNode(".//li[@class='engine-size']").InnerText.Trim('\r', '\n', '\t'),
        Odometer = product.SelectSingleNode(".//li[@class='odometer']").InnerText.Trim('\r', '\n', '\t'),
    };
    bikelist.Add(bike);
}

It outputs:

Touring
2018 Indian Roadmaster Elite
$49,995*
Touring
1,811 cc
0 km

Cruiser
2018 Indian Scout
$19,995*
Cruiser
1,133 cc
301 km

Naked
2018 Suzuki GSX-S1000
$15,690*
Naked
999 cc
0 km

Touring
2017 Indian Roadmaster
$37,995*
Touring
1,811 cc
6,577 km

Super Sport
2018 Suzuki GSX-R600
$13,790*
Super Sport
599 cc
0 km

Cruiser
2018 Indian Scout
$18,995*
Cruiser
1,133 cc
1,901 km

...and so on
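
The difference between the two XPath forms can be seen in isolation with a small sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the class XPathContextDemo, the method Prices and the two-item SampleHtml are hypothetical names made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class XPathContextDemo
{
    // Hypothetical two-item listing, reduced from the markup in the question.
    public const string SampleHtml = @"
<ul class='results'>
  <li class='gts'><span class='price'>$29,995*</span></li>
  <li class='gts'><span class='price'>$19,995*</span></li>
</ul>";

    // Collects one price per product, evaluating priceXPath with each
    // product <li> as the context node.
    public static List<string> Prices(string html, string priceXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var prices = new List<string>();
        var products = doc.DocumentNode.SelectNodes("//ul[@class='results']/li[@class='gts']");
        if (products == null) return prices; // SelectNodes returns null when nothing matches

        foreach (HtmlNode product in products)
        {
            HtmlNode priceNode = product.SelectSingleNode(priceXPath);
            if (priceNode != null) prices.Add(priceNode.InnerText.Trim());
        }
        return prices;
    }

    public static void Main()
    {
        // "//..." starts at the document root, so every product yields the first price.
        Console.WriteLine(string.Join(", ", Prices(SampleHtml, "//span[@class='price']")));
        // ".//..." starts at the current <li>, so each product yields its own price.
        Console.WriteLine(string.Join(", ", Prices(SampleHtml, ".//span[@class='price']")));
    }
}
```

With this sample the first line should print $29,995*, $29,995* and the second $29,995*, $19,995*, which mirrors the symptom described in the question and the effect of the fix.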



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow