I have a string of HTML that contains varied markup, including this:
<span style="display:block;position:fixed;width:100%;height:2000px;background-color:rgba(0,0,0,0);z-index:9999!important;top:0;left:0;cursor:default;"></span>
This will seem strange, but I only want to remove specific items within the style attribute (for all HTML elements). For example, I want to remove
position:fixed
and z-index:9999!important;
and top:0;
and left:0;
Those are just a few examples; I want to keep everything else. The issue is that the value is not necessarily position:fixed; it could be position:absolute; or anything else, just as it could be z-index:9998; or top:20; and so on.
I need to be able to remove style properties by their key, so position:*anything* and top:*anything*, etc. I also need to do this case-insensitively, so it would also match POSITION:*anything* or PoSition:*anything*.
Is there a way to achieve this using the Html Agility Pack?
There doesn't appear to be any support for parsing inline style strings in the HTML Agility Pack, but .NET does have a class for this in System.Web.UI, where it supports WebForms controls. It's called CssStyleCollection, and it will parse your style string into a collection of string key/value pairs and let you remove the specific keys you do not want.
However, since it's an internal tool for WebControl use, it doesn't have a public constructor. Instead, you have to instantiate it via reflection, or use a hack like this:
CssStyleCollection style = new Panel().Style;
Once created, assign your style string:
style.Value = "YOUR STYLE STRING";
Then remove the items you don't want:
style.Remove("position");
style.Remove("z-index");
style.Remove("top");
style.Remove("left");
Retrieve your new delimited style string from style.Value.
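To apply the steps above to every element, as the question asks, they can be wrapped in a loop over all nodes that carry a style attribute. The following is an untested sketch, assuming a .NET Framework project that references System.Web and the HtmlAgilityPack NuGet package. The class name InlineStyleCleaner and the method RemoveStyleKeys are hypothetical names chosen for illustration, and whether CssStyleCollection matches keys case-insensitively is not verified here.

```csharp
using System.Web.UI;              // CssStyleCollection (System.Web.dll, .NET Framework only)
using System.Web.UI.WebControls;  // Panel, used only to obtain a CssStyleCollection
using HtmlAgilityPack;            // HtmlDocument, HtmlNode

// Hypothetical helper; the names are illustrative, not part of either library.
public static class InlineStyleCleaner
{
    public static string RemoveStyleKeys(string html, params string[] keys)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//*[@style]");
        if (nodes == null)
        {
            return html;
        }

        foreach (HtmlNode node in nodes)
        {
            // Same hack as above: borrow the Style collection of a throwaway Panel.
            CssStyleCollection style = new Panel().Style;
            style.Value = node.GetAttributeValue("style", string.Empty);

            foreach (string key in keys)
            {
                style.Remove(key);
            }

            node.SetAttributeValue("style", style.Value);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

A call would look like InlineStyleCleaner.RemoveStyleKeys(html, "position", "z-index", "top", "left"). The serialized style string may differ from the input in spacing and separators, so compare results by property rather than by exact text.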
IMPORTANT: I haven't tested this, but the process seems simple enough, if a bit hacky. There may be some surprises I haven't come across yet. In particular, I have no idea how it handles a style string that declares the same property more than once, for example:
top:0;margin-left:20;top:10;
In inline style strings, browsers respect the last specified value, so top:10 wins. However, since CssStyleCollection uses unique keys, it cannot store both top values and most likely discards one.