How to read HTML source code from an HTTPS URL

.net c# html html-agility-pack

I am trying to read the HTML source of an HTTPS URL in C# using the following code:

 WebClient webClient = new WebClient();
 string htmlString = webClient.DownloadString("https://www.targetUrl.com");


This does not work for me because I get an encoded HTML string back. I tried using HtmlAgilityPack, but it did not help.

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlString);

Accepted answer

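That URL returns a gzip-compressed response. WebClient does not decompress this by default, so you need to drop down to the underlying HttpWebRequest class. Shamelessly borrowing from feroze's answer here: Automatically decompress gzip response via WebClient.DownloadData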

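using System;
using System.Net;

// WebClient subclass that asks the underlying HttpWebRequest to decompress gzip/deflate responses.
class MyWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Only HTTP(S) requests expose AutomaticDecompression; other schemes pass through unchanged.
        if (request is HttpWebRequest httpRequest)
        {
            httpRequest.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
        }
        return request;
    }
}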
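
To see why the downloaded string looked "encoded", here is a minimal offline sketch. It gzip-compresses a small HTML snippet locally, standing in for the server's response, and then decompresses it by hand with GZipStream. The names GzipDemo, Compress, Decompress and LooksLikeGzip are hypothetical helpers made up for illustration; this only shows the kind of work AutomaticDecompression saves you, not how WebClient is implemented internally.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipDemo
{
    // Gzip-compress a UTF-8 string (stand-in for a compressed response body).
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the gzip trailer into the MemoryStream.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Reverse the compression and decode the bytes as UTF-8 text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Gzip data starts with the magic bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }
}
```

If the bytes returned by WebClient.DownloadData pass LooksLikeGzip, the body was still compressed when it arrived, and decoding it directly as text gives the unreadable string described in the question.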

Popular answer

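// Warning: this accepts every server certificate for the whole process, which disables HTTPS validation.
ServicePointManager.ServerCertificateValidationCallback = delegate { return true; };
WebClient webClient = new WebClient();
string htmlString = webClient.DownloadString(url);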
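
Because the ServicePointManager callback applies to every request in the process, a narrower alternative is to keep both settings on a single handler. The sketch below is an assumption, not part of either answer: it uses HttpClient instead of WebClient, and HtmlFetcher, CreateHandler and DownloadHtmlAsync are hypothetical names. It also assumes a runtime where HttpClientHandler.ServerCertificateCustomValidationCallback is available (.NET Core, or .NET Framework 4.7.1 and later). Certificate validation stays on unless the caller opts out explicitly.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class HtmlFetcher
{
    // Build a handler that transparently decompresses gzip/deflate responses.
    public static HttpClientHandler CreateHandler(bool trustAnyCertificate)
    {
        var handler = new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
        if (trustAnyCertificate)
        {
            // Testing only: accepts any certificate, but just for requests sent through this handler.
            handler.ServerCertificateCustomValidationCallback =
                (request, certificate, chain, errors) => true;
        }
        return handler;
    }

    // Download the page body as a string; the handler is disposed together with the client.
    public static async Task<string> DownloadHtmlAsync(string url, bool trustAnyCertificate = false)
    {
        using (var client = new HttpClient(CreateHandler(trustAnyCertificate)))
        {
            return await client.GetStringAsync(url);
        }
    }
}
```

A call such as `string html = await HtmlFetcher.DownloadHtmlAsync("https://www.targetUrl.com");` could then feed the result to HtmlAgilityPack's `doc.LoadHtml(html)` as in the question.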



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow