Get the content of a website using HttpClient, but without an async method

asp.net-mvc asynchronous c# html-agility-pack http

Question

I am trying to fetch a website's content using HttpClient, as you can see here:

public async Task<List<NewsContent>> parsing(string newsArchive)
{
    List<NewsContent> lstResult=new List<NewsContent>();
    HttpClient http = new HttpClient();
    var response = await http.GetByteArrayAsync(newsArchive);
    String source = Encoding.GetEncoding("utf-8").GetString(response, 0, response.Length - 1);
    source = WebUtility.HtmlDecode(source);
    HtmlDocument resultat = new HtmlDocument();
    resultat.LoadHtml(source);

    List<HtmlNode> toftitle = resultat.DocumentNode.Descendants().Where
    (x => (x.Name == "div" && x.Attributes["class"] != null && x.Attributes["class"].Value.Contains("news_list"))).ToList();
    var li = toftitle[0].Descendants().Where
    (x => (x.Name == "div" && x.Attributes["class"] != null && x.Attributes["class"].Value=="news_item")).ToList();

    foreach (var item in li)
    {
        NewsContent newsContent = new NewsContent();
        newsContent.Url = item.Descendants("a").ToList()[0].GetAttributeValue("href", null);
        newsContent.Img = item.Descendants("img").ToList()[0].GetAttributeValue("src", null);
        newsContent.Title = item.Descendants("h2").ToList()[0].InnerText;

        //finding main news content
        var response1 = await http.GetByteArrayAsync("http://www.nsfund.ir/news" + newsContent.Url);
        String source1 = Encoding.GetEncoding("utf-8").GetString(response1, 0, response1.Length - 1);
        source1 = WebUtility.HtmlDecode(source1);
        HtmlDocument resultat1 = new HtmlDocument();
        resultat1.LoadHtml(source1);
        newsContent.Content = resultat1.DocumentNode.SelectSingleNode("//div[@class='news_content_container']").InnerText;


    }
    return  lstResult;
}

As you can see, I used an async method to get the data, here:

var response = await http.GetByteArrayAsync(newsArchive);

But the problem occurs when I call my async function:

News newagent = new News();
Task<List<NewsContent>> lst = newagent.parsing("http://www.nsfund.ir");
Task.WaitAll(lst);
List<NewsContent> enresult = lst.Result;

I don't get any result, so I decided to convert this async function to a normal (synchronous) function. What code should replace this line?

var response = await http.GetByteArrayAsync(newsArchive);

Accepted Answer

I think I found the problem with your code: you're not adding the NewsContent object to your list.

In the foreach loop, add it to the list:

lstResult.Add(newsContent);

Hope that solves the issue with your async approach.
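
To make the fix concrete, here is a minimal sketch of the loop pattern with the missing Add call in place. It is not the asker's full method: NewsListBuilder and Build are hypothetical names, the NewsContent class body is assumed from the property names used in the question, and plain (url, title) pairs stand in for the HtmlNode items so the example runs without HtmlAgilityPack or network access.

```csharp
using System.Collections.Generic;

// Assumed shape: only the property names come from the question.
public class NewsContent
{
    public string Url { get; set; }
    public string Title { get; set; }
}

// Hypothetical helper used only for illustration.
public static class NewsListBuilder
{
    public static List<NewsContent> Build(IEnumerable<KeyValuePair<string, string>> items)
    {
        var lstResult = new List<NewsContent>();
        foreach (var item in items)
        {
            var newsContent = new NewsContent { Url = item.Key, Title = item.Value };

            // Without this line the method always returns an empty list,
            // no matter how many items were parsed.
            lstResult.Add(newsContent);
        }
        return lstResult;
    }
}
```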


Popular Answer

But the problem occurs when I call my async function:

Task.WaitAll(lst);
List<NewsContent> enresult = lst.Result;

Yes, that's a problem, all right. Two problems, actually: Task.WaitAll and Result. They should both be replaced with a single await:

List<NewsContent> enresult = await lst;

The core problem is a deadlock scenario that I explain in full on my blog. In summary, await will capture the current context and use that to resume the async method. But ASP.NET only allows one thread within its request context at a time. So when parsing is first called, it executes until it hits its await and then returns. The calling method then blocks; this is where the problem is, because by blocking, the calling method is keeping a thread in that ASP.NET request context.

Later, when the await inside parsing is done, it attempts to resume the parsing method in that ASP.NET request context but it cannot, because there's a thread stuck in that context and ASP.NET only allows one thread at a time. The calling method is waiting for parsing to complete, and parsing is waiting for the context to be free. Classic deadlock.
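
A minimal sketch of the two usual ways out of this deadlock, under stated assumptions: FetchAsync is a hypothetical stand-in for http.GetByteArrayAsync so the example runs without network access, and ParsingAsync and CallerAsync are illustrative names rather than the asker's actual code. The caller awaits instead of blocking, and the inner method uses ConfigureAwait(false) so that its continuation does not need to re-enter the captured context.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitAllTheWayDemo
{
    // Hypothetical stand-in for http.GetByteArrayAsync(...):
    // completes asynchronously without touching the network.
    private static async Task<string> FetchAsync(string url)
    {
        await Task.Delay(10).ConfigureAwait(false);
        return "<html>" + url + "</html>";
    }

    // Library-style method. ConfigureAwait(false) tells the await not to
    // resume on the captured context, so a blocked caller cannot starve it.
    public static async Task<List<string>> ParsingAsync(string newsArchive)
    {
        var result = new List<string>();
        string source = await FetchAsync(newsArchive).ConfigureAwait(false);
        result.Add(source);
        return result;
    }

    // Caller: a single await replaces Task.WaitAll(lst) and lst.Result.
    public static async Task<int> CallerAsync()
    {
        List<string> enresult = await ParsingAsync("http://www.nsfund.ir");
        return enresult.Count;
    }
}
```

In an MVC controller, this would typically mean declaring the action as async (returning Task<ActionResult>) and awaiting the call there, so that nothing in the call chain blocks on the task.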



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow