Parsing HTML with C#

c# html html-agility-pack windows-phone

I want to parse HTML pages in C#. Some of the pages contain a lot of HTML tags; here is a sample from one of them:

<span class=text14 id="article_content"><!-- RELEVANTI_ARTICLE_START --><span ></b>The 
     most important component for <a
     class=bluelink href="http://www.ynetnews.com/articles/0,7340,L-
     3284752,00.html%20"' onmouseover='this.href=unescape(this.href)' 
     target=_blank>Israel</a>'s
     security is its special relations with the American administration, and especially with its generous purse. When the Netanyahu government launches a great outcry against the <a  ...

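But I only want the content inside the <span class=text14 id="article_content"> tag. At first I considered regular-expression matching (preg-style), but then realized it is not efficient at all. Later I read about Html Agility Pack and FizzlerEx. Can these tools extract the text contained in a specific tag? I would appreciate it if someone could tell me how to get this done quickly.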

Accepted answer

This is very simple with Html Agility Pack:

using System;
using HtmlAgilityPack; // requires the HtmlAgilityPack NuGet package

// Sample markup from the question, closed off with </span> so it can stand alone.
var markup = @"<span class=text14 id=""article_content""><!-- RELEVANTI_ARTICLE_START --><span ></b>The most important component for <a class=bluelink href=""http://www.ynetnews.com/articles/0,7340,L-3284752,00.html%20""' onmouseover='this.href=unescape(this.href)' target=_blank>Israel</a>'s security is its special relations with the American administration, and especially with its generous purse. When the Netanyahu government launches a great outcry against the</span>";

// Parse the (possibly malformed) HTML into a DOM.
var doc = new HtmlDocument();
doc.LoadHtml(markup);

// Look up the element by its id and read its text with the tags stripped.
var content = doc.GetElementbyId("article_content").InnerText;

Console.WriteLine(content);
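
The snippet above assumes the element is always present. For real pages it may help to guard against a missing element and to decode HTML entities, since InnerText returns entities such as &amp; as written. Below is a minimal sketch of that idea, not a definitive implementation. The class name ArticleExtractor and the methods ExtractTextById and ExtractTextByClass are hypothetical helpers made up for illustration; GetElementbyId, Descendants, GetAttributeValue and HtmlEntity.DeEntitize are existing Html Agility Pack members. The second helper uses LINQ rather than XPath, on the assumption that XPath support may not be available in every build of the library (for example on Windows Phone).

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // requires the HtmlAgilityPack NuGet package

// Hypothetical helper class, for illustration only.
public static class ArticleExtractor
{
    // Returns the decoded text of the element with the given id,
    // or null when no element carries that id.
    public static string ExtractTextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // GetElementbyId returns null when the id is not found.
        var node = doc.GetElementbyId(id);
        return node == null ? null : Decode(node);
    }

    // Returns the decoded text of the first element with the given tag name
    // whose class attribute is exactly className, or null if there is none.
    // Note: this is an exact match on the whole attribute value, so an
    // element with class="text14 other" would not match "text14".
    public static string ExtractTextByClass(string html, string tagName, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var node = doc.DocumentNode
            .Descendants(tagName)
            .FirstOrDefault(n => n.GetAttributeValue("class", "") == className);
        return node == null ? null : Decode(node);
    }

    // InnerText leaves entities encoded; DeEntitize turns &amp; into &, etc.
    private static string Decode(HtmlNode node)
    {
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }
}
```

With the markup from the question, ArticleExtractor.ExtractTextById(markup, "article_content") plays the same role as the one-liner above, and the caller can check the result for null before printing it.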


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow