Strip Specific Styles from the Style attribute in Html string using Html Agility Pack

c# html-agility-pack html-parsing

Question

I have an HTML string with varied content, and it includes this:

<span style="display:block;position:fixed;width:100%;height:2000px;background-color:rgba(0,0,0,0);z-index:9999!important;top:0;left:0;cursor:default;"></span>

This will seem strange, but I only want to remove specific items within the style attribute (for all HTML elements). For example, I want to remove

position:fixed and z-index:9999!important; and top:0; and left:0;

To name a few, but keep everything else. The issue is that it's not necessarily position:fixed; it could be position:absolute or anything else, just as it could be z-index:9998 or top:20, etc.

I need to be able to remove style declarations by their key, so position:*anything*, top:*anything*, etc., and to do this case-insensitively, so it would also match POSITION:*anything* or PoSition:*anything*.

Is there a way to achieve this using the Html Agility Pack?

Popular Answer

There doesn't appear to be any support for parsing inline style strings in the HTML Agility Pack, but .NET does have some capability for this in System.Web.UI, where it supports WebForms controls. Note that this namespace is part of System.Web, so it requires the full .NET Framework.

The class is called CssStyleCollection. It parses your style string into a collection of string key/value pairs and lets you remove the specific keys you do not want.

However, since it's meant for WebControl use, it doesn't have a public constructor. Instead, you have to instantiate it via reflection, or use a hack like this (Panel lives in System.Web.UI.WebControls):

CssStyleCollection style = new Panel().Style;

Once created, assign your style string:

style.Value = "YOUR STYLE STRING";

Then remove the items you don't want:

style.Remove("position");
style.Remove("z-index");
style.Remove("top");
style.Remove("left");

Retrieve your new delimited style string from style.Value.
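
Putting these steps together with the Html Agility Pack might look like the untested sketch below. It assumes a .NET Framework project that references System.Web and the HtmlAgilityPack package. The names StyleStripper and StripStyles are made up for illustration and are not part of either library.

```csharp
using System.Web.UI;
using System.Web.UI.WebControls;
using HtmlAgilityPack;

public static class StyleStripper
{
    // Hypothetical helper: removes the given CSS properties from every
    // style attribute in the HTML string and returns the modified HTML.
    public static string StripStyles(string html, params string[] keysToRemove)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty list) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//*[@style]");
        if (nodes == null) return html;

        foreach (HtmlNode node in nodes)
        {
            // CssStyleCollection has no public constructor, so borrow one from a Panel.
            CssStyleCollection style = new Panel().Style;
            style.Value = node.GetAttributeValue("style", "");

            foreach (string key in keysToRemove)
                style.Remove(key);

            // Drop the attribute entirely if no declarations are left.
            if (string.IsNullOrEmpty(style.Value))
                node.Attributes.Remove("style");
            else
                node.SetAttributeValue("style", style.Value);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

A call such as StyleStripper.StripStyles(html, "position", "z-index", "top", "left") would then cover the keys from the question. The exact formatting of the rewritten style string depends on how CssStyleCollection serializes its contents.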

IMPORTANT: I haven't tested this, but the process seems simple enough, if a bit hacky. There may be some surprises I haven't come across yet. In particular, I have no idea how it handles duplicate style settings in the same string:

top:0;margin-left:20;top:10; 

In inline style strings, browsers respect the last specified value, so top:10 wins. However, since CssStyleCollection uses unique keys, it cannot store both top values and most likely discards one.
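
If the duplicate-key behaviour or the System.Web dependency is a concern, one alternative is to filter the declarations by hand. The following is a minimal sketch, not a full CSS parser: it splits naively on ';', so values that themselves contain a semicolon (for example some data URIs) would be broken. InlineStyle and RemoveProperties are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InlineStyle
{
    // Hypothetical helper: drops every declaration whose property name
    // matches one of keysToRemove, ignoring case. All other declarations
    // are kept in their original order, duplicates included.
    public static string RemoveProperties(string style, params string[] keysToRemove)
    {
        var remove = new HashSet<string>(keysToRemove, StringComparer.OrdinalIgnoreCase);

        var kept = style
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(d => d.Trim())
            .Where(d => d.Length > 0)
            .Where(d =>
            {
                int colon = d.IndexOf(':');
                // Leave malformed declarations (no colon) untouched.
                if (colon < 0) return true;
                string name = d.Substring(0, colon).Trim();
                return !remove.Contains(name);
            });

        return string.Join(";", kept);
    }
}
```

For example, InlineStyle.RemoveProperties("top:0;margin-left:20;TOP:10;", "top") returns "margin-left:20": both top declarations are removed regardless of case, and the trailing semicolon is not re-added. The result can be written back with the Html Agility Pack's SetAttributeValue on each node that has a style attribute.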



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow