Site uses JavaScript and I am having trouble accessing it using HtmlAgilityPack

c# html-agility-pack web-scraping windows-phone-8

Question

I am trying to scrape data from the PlayStation store in a Windows Phone 8.0 Silverlight app.

However, I believe the site uses JavaScript, and I am having trouble accessing it using HtmlAgilityPack.

My code so far is:

protected async override void OnNavigatedTo(NavigationEventArgs e)
{
   base.OnNavigatedTo(e);
   string htmlPageLive = "";

   using (var client = new HttpClient())
   {
      htmlPageLive = await client.GetStringAsync("https://store.sonyentertainmentnetwork.com/#!/en-us/free-games/cid=STORE-MSF77008-PSPLUSFREEGAMES?smcid=pdc:us-en:ps-plus:sub-nav-new-arrivals");
   }

   HtmlDocument htmlDocumentLive = new HtmlDocument();
   htmlDocumentLive.LoadHtml(htmlPageLive);

   foreach (var div in htmlDocumentLive.DocumentNode.SelectNodes("//ul[@class= 'pane pane0']"))
   {
      PSPGames newGame = new PSPGames();
      newGame.Title = div.SelectSingleNode(".//h3[@class= 'cellTitle']").InnerText.Trim();
   }
   lstPSPGame.ItemsSource = PSPgame;
   customIndeterminateProgressBar.Visibility = Visibility.Collapsed;
}

However, the app is crashing on the 'foreach' line when it tries to look up the node 'pane pane0'.

Is it possible to scrape the data? If so, what would I need to do?

Thanks in advance.

Popular Answer

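It is not possible to scrape this data with HtmlAgilityPack alone, because the page loads it asynchronously with JavaScript. HtmlAgilityPack only parses the HTML the server returns and does not execute scripts, so what you get when querying the page is its skeleton. The 'pane pane0' list is not in that skeleton, SelectNodes returns null when nothing matches, and the 'foreach' then throws.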
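
One way to see why only the skeleton comes back is to look at the URL itself: everything after the '#' is a fragment, which a browser keeps for client-side scripts and an HTTP client does not send to the server. The following is a minimal sketch of that check using System.Uri. The class and method names (FragmentDemo, RequestedPart, ClientSidePart) are made up for illustration, and it is meant as a quick check in a desktop .NET project rather than as part of the phone app.

```csharp
using System;

public static class FragmentDemo
{
    // The URL from the question, unchanged.
    public const string StoreUrl =
        "https://store.sonyentertainmentnetwork.com/#!/en-us/free-games/cid=STORE-MSF77008-PSPLUSFREEGAMES?smcid=pdc:us-en:ps-plus:sub-nav-new-arrivals";

    // Scheme, host, path and query: the part that identifies the resource
    // requested from the server.
    public static string RequestedPart(string url)
    {
        return new Uri(url).GetLeftPart(UriPartial.Query);
    }

    // The fragment ('#' and everything after it): used only by scripts
    // running in the browser, so the server never sees it.
    public static string ClientSidePart(string url)
    {
        return new Uri(url).Fragment;
    }
}
```

Calling FragmentDemo.RequestedPart(FragmentDemo.StoreUrl) shows that the request is effectively for the site root, while the "free-games" route lives entirely in the fragment that the page's JavaScript interprets after loading.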

What you could try is to watch the network calls the page makes (for example, in a browser's developer tools) and see whether a public web service is called. Look for responses containing JSON or XML data.
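
If such a call turns up, the app can request it directly and skip HTML parsing altogether. Below is a rough sketch of that approach, not the store's actual API: the endpoint URL is a placeholder, and the JSON shape ({ "games": [ { "title": "..." } ] }) and the GameDto, GameListDto and FreeGamesClient names are assumptions made up for illustration. The real URL and field names would have to be taken from the network trace. It uses DataContractJsonSerializer to avoid an extra dependency; a library such as Json.NET would work as well.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;
using System.Threading.Tasks;

// Hypothetical response shape: { "games": [ { "title": "..." }, ... ] }
[DataContract]
public class GameDto
{
    [DataMember(Name = "title")]
    public string Title { get; set; }
}

[DataContract]
public class GameListDto
{
    [DataMember(Name = "games")]
    public List<GameDto> Games { get; set; }
}

public static class FreeGamesClient
{
    // Placeholder only: replace with the endpoint seen in the network trace.
    private const string EndpointUrl = "https://example.com/api/free-games";

    // Deserializes the (assumed) JSON payload and returns the trimmed titles.
    // Returns an empty list when the "games" array is missing.
    public static List<string> ParseTitles(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(GameListDto));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            var result = (GameListDto)serializer.ReadObject(stream);
            if (result == null || result.Games == null)
            {
                return new List<string>();
            }

            return result.Games
                .Where(g => g != null && !string.IsNullOrWhiteSpace(g.Title))
                .Select(g => g.Title.Trim())
                .ToList();
        }
    }

    // Downloads the JSON from the placeholder endpoint and parses it.
    public static async Task<List<string>> GetTitlesAsync()
    {
        using (var client = new HttpClient())
        {
            string json = await client.GetStringAsync(EndpointUrl);
            return ParseTitles(json);
        }
    }
}
```

In OnNavigatedTo, the titles returned by await FreeGamesClient.GetTitlesAsync() could then be mapped to PSPGames objects and assigned to lstPSPGame.ItemsSource. Keeping ParseTitles separate from the download makes it possible to check the parsing against a saved response before touching the network.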




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow