如何使用HTML Agility實用程序提取URL的標題,圖像和描述

asp.net c# html-agility-pack webforms

我想使用HTML Agility實用程序從URL中提取標題,描述和圖像到目前為止,我無法找到一個易於理解的示例,並且可以幫助我做到這一點。

我很感激,如果有些人可以幫我舉例,這樣我就可以提取標題,描述並讓用戶選擇從一系列圖像中選擇圖像(當我們共享一個鏈接時,有些東西類似於Facebook)。

更新:

我在.aspx頁面上放置了標題,desc和按鈕,文本框的標籤,然後按下按鈕點擊事件上的代碼。但它為所有值返回null。可能是我做錯了什麼。

我使用以下示例URLhttp://edition.cnn.com/2012/10/31/world/asia/india/index.html?hpt = hp_t2

protected void btnGetURLDetails_Click(object sender, EventArgs e)
{
    HtmlDocument doc = new HtmlDocument();
    var response = txtURL.Text;
    doc.LoadHtml(response);

    String title = (from x in doc.DocumentNode.Descendants()
                    where x.Name.ToLower() == "title"
                    select x.InnerText).FirstOrDefault();

    String desc = (from x in doc.DocumentNode.Descendants()
                   where x.Name.ToLower() == "description"
                   select x.InnerText).FirstOrDefault();

    List<String> imgs = (from x in doc.DocumentNode.Descendants()
                         where x.Name.ToLower() == "img"
                         select x.Attributes["src"].Value).ToList<String>();

    lblTitle.Text = title;
    lblDescription.Text = desc;
}

上面的代碼為我獲取所有變量的空值

如果我用這個修改代碼

protected void btnGetURLDetails_Click(object sender, EventArgs e)
{
    HtmlDocument doc = new HtmlDocument();
    var response = txtURL.Text;
    doc.LoadHtml(response);

    String title = (from x in doc.DocumentNode.Descendants()
                    where x.Name.ToLower() == "title"
                    select x.InnerText).FirstOrDefault();

    String desc = (from x in doc.DocumentNode.Descendants()
                   where x.Name.ToLower() == "description"
                   select x.InnerText).FirstOrDefault();

    List<String> imgs = (from x in doc.DocumentNode.Descendants()
                         where x.Name.ToLower() == "img"
                         select x.Attributes["src"].Value).ToList<String>();

    lblTitle.Text = title;
    lblDescription.Text = desc;
}

在這種情況下,它只給我標題和描述的值再次為null

一般承認的答案

protected void btnGetURLDetails_Click(object sender, EventArgs e)
{
    HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(new Uri(txtURL.Text));
    request.Method = WebRequestMethods.Http.Get;

    HttpWebResponse response = (HttpWebResponse)request.GetResponse();

    StreamReader reader = new StreamReader(response.GetResponseStream());

    String responseString = reader.ReadToEnd();

    response.Close();

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(responseString);

    String title = (from x in doc.DocumentNode.Descendants()
                where x.Name.ToLower() == "title"
                select x.InnerText).FirstOrDefault();

    String desc = (from x in doc.DocumentNode.Descendants()
               where x.Name.ToLower() == "meta"
               && x.Attributes["name"] != null
               && x.Attributes["name"].Value.ToLower() == "description"
               select x.Attributes["content"].Value).FirstOrDefault();

    List<String> imgs = (from x in doc.DocumentNode.Descendants()
                     where x.Name.ToLower() == "img"
                     select x.Attributes["src"].Value).ToList<String>();

   lblTitle.Text = title;
   lblDescription.Text = desc;

}




許可下: CC-BY-SA with attribution
不隸屬於 Stack Overflow
這個KB合法嗎? 是的,了解原因
許可下: CC-BY-SA with attribution
不隸屬於 Stack Overflow
這個KB合法嗎? 是的,了解原因