Get web page using HtmlAgilityPack.NETCore

.net-core c# html-agility-pack


I have been using HtmlAgilityPack to work with HTML pages. In the past, I did this:

HtmlWeb web = new HtmlWeb();
HtmlDocument document = web.Load(url);
var nodes = document.DocumentNode.SelectNodes("necessary node");

However, I now have to use HtmlAgilityPack.NETCore, where HtmlWeb is missing. What should I use instead of HtmlWeb to get the same result?

8/16/2017 10:11:23 AM

Accepted Answer

Use HttpClient, the newer way to interact with remote resources over HTTP.

For your solution, you will probably want to use the async methods so that you don't block your thread, instead of calling .Result. Also note that HttpClient, available since .NET 4.5, was designed to be used from multiple threads, so you should not recreate it for every request:

// instance or static variable
HttpClient client = new HttpClient();

// get the response in a non-blocking way
using (var response = await client.GetAsync(url))
using (var content = response.Content)
{
    // read the response body in a non-blocking way
    var result = await content.ReadAsStringAsync();
    var document = new HtmlDocument();
    // parse the downloaded HTML; without this call the document stays empty
    document.LoadHtml(result);
    var nodes = document.DocumentNode.SelectNodes("Your nodes");
    // Some work with page....
}
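The pieces of this answer (one shared HttpClient, awaited calls, parsing the downloaded string) can be put together as a small helper. This is only a minimal sketch, not the answerer's code: the names PageLoader, LoadAsync and Parse are hypothetical, and the EnsureSuccessStatusCode check is an addition that the original snippet does not have.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper; the class and method names are illustrative only.
public static class PageLoader
{
    // One HttpClient shared by all callers instead of one per request.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the page without blocking the calling thread, then parses it.
    public static async Task<HtmlDocument> LoadAsync(string url)
    {
        using (var response = await Client.GetAsync(url))
        {
            // Assumption: fail early on a non-success status code
            // (throws HttpRequestException).
            response.EnsureSuccessStatusCode();
            var html = await response.Content.ReadAsStringAsync();
            return Parse(html);
        }
    }

    // Parses an HTML string that is already in memory.
    public static HtmlDocument Parse(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);
        return document;
    }
}
```

From an async method it could be called as `var document = await PageLoader.LoadAsync(url);`, after which `document.DocumentNode.SelectNodes(...)` works the same way it did with HtmlWeb.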

A great article on async/await: "Async/Await - Best Practices in Asynchronous Programming" by @StephenCleary, March 2013.

9/7/2017 7:46:40 PM

Popular Answer

I had the same problem in Visual Studio Code with netcoreapp1.0. I ended up using HtmlAgilityPack version 1.5.0-beta5 instead.

Remember to include:

using HtmlAgilityPack;
using System.Net.Http;
using System.IO;

This is how I did it:

HttpClient hc = new HttpClient();
HttpResponseMessage result = await hc.GetAsync($""); // put the page URL inside the string
Stream stream = await result.Content.ReadAsStreamAsync();
HtmlDocument doc = new HtmlDocument();
doc.Load(stream); // parse the response stream; without this call the document stays empty
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class='whateverclassyouarelookingfor']");
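One thing to watch for with the query above: by default SelectNodes returns null rather than an empty collection when nothing matches the XPath, so iterating over the result directly can throw a NullReferenceException. A small guard might look like the sketch below; NodeQuery and InnerTexts are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper; the class and method names are illustrative only.
public static class NodeQuery
{
    // Returns the trimmed inner text of every node matching the XPath,
    // or an empty list when nothing matches.
    public static List<string> InnerTexts(HtmlDocument doc, string xpath)
    {
        // With default options SelectNodes yields null on no match.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }
        return nodes.Select(n => n.InnerText.Trim()).ToList();
    }
}
```

With this in place, `foreach (var text in NodeQuery.InnerTexts(doc, "//div[@class='whateverclassyouarelookingfor']"))` is safe even on a page that has no such div.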

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow