HttpClient doesn't get full website html source

c# html-agility-pack http web-scraping win-universal-app

Question

Using HttpClient, I'm trying to scrape offers from the http://olx.pl/ website. The problem is that the HTML returned by the client is very different from the page source I see in the browser, and it lacks the list of offers. Any ideas? My code is below:

  string url = "http://olx.pl/oferty/q-diablo/?search%5Bdescription%5D=1";
  HttpClient client = new HttpClient();
  string result = await client.GetStringAsync(url);
4/16/2016 12:31:05 PM

Accepted Answer

HttpClient won't load any JavaScript-generated content; it only downloads the raw response. Use WebView instead, which will execute JavaScript. Running both gave these result lengths: HttpClient 235507 characters, WebView 464476 characters.
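If you want to confirm that the missing content really is script-generated, one quick check is to look for a piece of text you can see in the browser inside the raw HttpClient response. This is only a rough sketch: the class name, method names and the marker string are made up for illustration, and a case-insensitive substring search is a crude test, not proof.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class StaticHtmlCheck
{
    // True if the marker text occurs anywhere in the HTML (case-insensitive).
    public static bool ContainsMarker(string html, string marker)
    {
        return html.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Downloads the raw response body (no scripts are executed) and
    // checks it for the marker. The marker is whatever text you expect
    // to see in the rendered page, e.g. part of an offer title.
    public static async Task<bool> IsInStaticHtmlAsync(string url, string marker)
    {
        using (var client = new HttpClient())
        {
            string html = await client.GetStringAsync(url);
            return ContainsMarker(html, marker);
        }
    }
}
```

If `IsInStaticHtmlAsync` returns false for text that is clearly visible in the browser, the content is most likely added by JavaScript after the initial response.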

    // Create a WebView and subscribe before navigating, so the event is not missed.
    WebView wv = new WebView();
    wv.NavigationCompleted += Wv_NavigationCompleted;
    wv.Navigate(new Uri(url));

    private async void Wv_NavigationCompleted(WebView sender, WebViewNavigationCompletedEventArgs args)
    {
        // Evaluate a script inside the loaded page to get the current DOM as HTML.
        string wvresult = await sender.InvokeScriptAsync("eval", new string[] { "document.documentElement.outerHTML;" });
    }
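If you would rather await the rendered HTML the same way you await GetStringAsync, the event handler above can be wrapped in a TaskCompletionSource. This is a minimal sketch, not a tested implementation: the `RenderedHtml` class and `LoadAsync` method are hypothetical names, it assumes it is called on the UI thread of a UWP app, and NavigationCompleted can fire before scripts that load data asynchronously have finished, so the result may still be incomplete on some pages.

```csharp
using System;
using System.Threading.Tasks;
using Windows.UI.Xaml.Controls;

public static class RenderedHtml
{
    // Hypothetical helper: navigates a WebView to the URI and returns the
    // DOM as HTML once navigation completes. Call it from the UI thread.
    public static Task<string> LoadAsync(Uri uri)
    {
        var tcs = new TaskCompletionSource<string>();
        var webView = new WebView();

        webView.NavigationCompleted += async (sender, args) =>
        {
            if (!args.IsSuccess)
            {
                // Surface navigation errors to the awaiting caller.
                tcs.TrySetException(
                    new Exception("Navigation failed: " + args.WebErrorStatus));
                return;
            }

            try
            {
                // Same script as in the handler above.
                string html = await sender.InvokeScriptAsync(
                    "eval", new[] { "document.documentElement.outerHTML;" });
                tcs.TrySetResult(html);
            }
            catch (Exception ex)
            {
                tcs.TrySetException(ex);
            }
        };

        webView.Navigate(uri);
        return tcs.Task;
    }
}
```

Usage would then look like `string result = await RenderedHtml.LoadAsync(new Uri(url));`, and the string can be handed to whatever HTML parser you already use.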
4/16/2016 12:25:49 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow