Remove attributes using HtmlAgilityPack

html html-agility-pack html-parsing


I'm trying to write code that removes all style attributes using HtmlAgilityPack, regardless of tag.

Below is my code:

var elements = htmlDoc.DocumentNode.SelectNodes("//*");

if (elements != null)
    foreach (var element in elements)
        element.Attributes.Remove("style"); // drop the style attribute, if present

But I can't seem to make it stick. If I inspect the element right after Remove("style"), I can see that the style attribute has been removed, yet it still shows up in the DocumentNode object. :/

I feel a little foolish, but something seems off. Has anybody done this with HtmlAgilityPack? Thanks!


My code was modified to the following, and it now works properly:

public static void RemoveStyleAttributes(this HtmlDocument html)
{
   var elementsWithStyleAttribute = html.DocumentNode.SelectNodes("//@style");

   if (elementsWithStyleAttribute != null)
      foreach (var element in elementsWithStyleAttribute)
         element.Attributes.Remove("style");
}

8/5/2014 7:28:27 PM

Accepted Answer

Your code sample looks correct, since it does remove the attributes. The catch is that DocumentNode.InnerHtml (I assume this is the property you were checking) is a complex property that may only be refreshed under circumstances that are hard to predict, so you shouldn't use it to get the document as a string. Use the HtmlDocument.Save method for this instead:

string result = null;
using (StringWriter writer = new StringWriter())
{
    htmlDoc.Save(writer); // write the current document tree to the writer
    result = writer.ToString();
}

Now the result variable holds the string representation of your document.

One more thing: you could improve your code by changing the XPath expression to "//*[@style]", which returns only the elements that have a style attribute.
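Putting both suggestions together, here is a minimal sketch of how the whole thing might look as a small console program. The sample markup and the variable name styledElements are made up for illustration; the sketch assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical input, only for this illustration.
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(
            "<div style=\"color:red\"><p style=\"margin:0\">Hi</p><span>No style</span></div>");

        // Select only the elements that have a style attribute.
        // SelectNodes returns null when nothing matches, hence the check.
        var styledElements = htmlDoc.DocumentNode.SelectNodes("//*[@style]");
        if (styledElements != null)
        {
            foreach (var element in styledElements)
            {
                element.Attributes.Remove("style");
            }
        }

        // Serialize through Save rather than reading DocumentNode.InnerHtml.
        string result;
        using (var writer = new StringWriter())
        {
            htmlDoc.Save(writer);
            result = writer.ToString();
        }

        Console.WriteLine(result);
    }
}
```

The printed markup should no longer contain any style attributes; if it still does, check whether you are reading a string that was captured before the removal.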

5/6/2011 1:49:52 PM

Popular Answer

Here is a fairly straightforward fix.





Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow