HTMLAgilityPack get class innerText


I am trying to get the InnerText of an element selected by its class. This is my code:

using (HttpClient clientduplicate = new HttpClient())
{
    // The start of this statement is missing from the post; setting the
    // User-Agent header is assumed from the surviving string argument.
    clientduplicate.DefaultRequestHeaders.Add("User-Agent",
        "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident / 6.0)");

    using (HttpResponseMessage responseduplicate = await clientduplicate.GetAsync(@"$12-billion-of-stock-after-trump-won-456954"))
    using (HttpContent contentduplicate = responseduplicate.Content)
    {
        try
        {
            string resultduplicate = await contentduplicate.ReadAsStringAsync();

            // Parse the downloaded markup; without this the document is empty.
            var websiteduplicate = new HtmlDocument();
            websiteduplicate.LoadHtml(resultduplicate);

            var titlesduplicate = websiteduplicate.DocumentNode.Descendants("div").FirstOrDefault(o => o.GetAttributeValue("class", "") == "arial_14 clear WYSIWYG newsPage");
            var match = Regex.Match(titlesduplicate.InnerText, @"(.*?)<!--", RegexOptions.Singleline).Groups[1].Value;
        }
        catch (Exception ex1)
        {
            var dialog2 = new MessageDialog(ex1.Message);
            await dialog2.ShowAsync();
        }
    }
}

The problem is that this also returns the text from the picture. I can find a workaround, but I was wondering whether there is another approach, something simpler or faster.

Also, when I use this on other articles/URLs, other minor bugs show up.

Accepted Answer

There are many ways to do this. One way is to remove the carousel div before getting InnerText (doc is the loaded HtmlDocument, websiteduplicate in the question):

    doc.DocumentNode.Descendants("div").FirstOrDefault(_ => _.Id.Equals("imgCarousel"))?.Remove();
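
To show the whole flow in one place, here is a minimal sketch of the remove-then-read approach. The class ArticleText, the method Extract, and its parameter names are hypothetical and only there for illustration; the HtmlAgilityPack calls (LoadHtml, Descendants, GetAttributeValue, Id, Remove, InnerText, HtmlEntity.DeEntitize) are the library's own. It assumes the unwanted block sits inside the article div, which may not hold for every page.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class ArticleText
{
    // Hypothetical helper: returns the text of the first div whose class
    // attribute equals articleClass, skipping the nested div with excludedId.
    public static string Extract(string html, string articleClass, string excludedId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var article = doc.DocumentNode.Descendants("div")
            .FirstOrDefault(d => d.GetAttributeValue("class", "") == articleClass);
        if (article == null)
        {
            // No matching div on this page; return nothing instead of throwing.
            return string.Empty;
        }

        // Detach the unwanted block first so its caption text never reaches InnerText.
        article.Descendants("div")
            .FirstOrDefault(d => d.Id == excludedId)
            ?.Remove();

        // Decode entities such as &amp; and trim surrounding whitespace.
        return HtmlEntity.DeEntitize(article.InnerText).Trim();
    }
}
```

With the question's page this would be called as ArticleText.Extract(resultduplicate, "arial_14 clear WYSIWYG newsPage", "imgCarousel"). The null check on the article div is a guess at one source of the "minor bugs" on other URLs, since FirstOrDefault returns null when a page uses a different layout.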

Licensed under: CC-BY-SA