Scrape a table using ScrapySharp

c# html-agility-pack scrapysharp web-scraping

Question

I am trying to scrape data from a website. The data looks like the example below. How do I extract the table using ScrapySharp?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using HtmlAgilityPack;
using ScrapySharp.Extensions;
using ScrapySharp.Network;

namespace Scrape
{
    class Program
    {
        static void Main(string[] args)
        {
            ScrapingBrowser browser = new ScrapingBrowser();

            //set UseDefaultCookiesParser as false if a website returns invalid cookies format
            //browser.UseDefaultCookiesParser = false;

            WebPage homePage = browser.NavigateToPage(new Uri("http://www.nasdaq.com/earnings/earnings-calendar.aspx"));
            var divs = homePage.Html.CssSelect("div");  //all div elements
            var trs = homePage.Html.SelectNodes("//div")
                .Where(n => !String.IsNullOrEmpty(n.GetAttributeValue("class"))
                //(n.GetAttributeValue("class") == "genTable")
                );                       
        }
    }
}

Here is the relevant part of the HTML:

[HTML excerpt not available]

Accepted answer

I imagine something like this code:

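// Requires: using System; using System.Linq; using HtmlAgilityPack;
var hw = new HtmlWeb();
var doc = hw.Load("http://www.nasdaq.com/earnings/earnings-calendar.aspx");

// Look up the table by its id; FirstOrDefault returns null when no such table exists.
var table = doc.DocumentNode.Descendants("table").FirstOrDefault(t => t.Id == "ECCompaniesTable");
if (table != null)
{
    foreach (HtmlNode row in table.Descendants("tr"))
    {
        Console.WriteLine(row.InnerText);
    }
}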
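
Since the question asks about ScrapySharp specifically, the same idea can probably be expressed with its CssSelect extension, as in the rough sketch below. The class name TableScraper, the helper ExtractRows, and the sample markup are made up for illustration. Only the table id ECCompaniesTable comes from the code above, and the real page's cell layout may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using ScrapySharp.Extensions;

public static class TableScraper
{
    // Hypothetical helper: returns one list of trimmed cell texts per <tr>
    // of the table with the given id, or an empty list if the table is absent.
    public static List<List<string>> ExtractRows(HtmlNode root, string tableId)
    {
        var table = root.CssSelect("table#" + tableId).FirstOrDefault();
        if (table == null)
        {
            return new List<List<string>>();
        }

        return table.CssSelect("tr")
            .Select(row => row.ChildNodes
                // Keep only header and data cells; skips whitespace text nodes.
                .Where(cell => cell.Name == "td" || cell.Name == "th")
                .Select(cell => HtmlEntity.DeEntitize(cell.InnerText).Trim())
                .ToList())
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the real page (assumption).
        const string html = @"<table id='ECCompaniesTable'>
<tr><th>Company</th><th>EPS</th></tr>
<tr><td>Example Corp (EXM)</td><td>1.23</td></tr>
</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (var cells in ExtractRows(doc.DocumentNode, "ECCompaniesTable"))
        {
            Console.WriteLine(string.Join(" | ", cells));
        }
    }
}
```

Against the live site, homePage.Html from the question's ScrapingBrowser code could be passed as the root node in place of doc.DocumentNode. If the table is filled in by JavaScript after the page loads, neither approach will see the rows.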



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow