How to remove the <br> tag in my HTML string using HtmlAgilityPack in C#?

c# c#-4.0 html-agility-pack

Question

I'm using HtmlAgilityPack to parse a single HTML string.

Here is my HTML string:

<p class="Normal-P" style="direction: ltr; unicode-bidi: normal;"><span class="Normal-H">sample<br/></span> <span class="Normal-H">texting<br></span></p>

This HTML contains the <br> tag twice, and I want to delete both occurrences.

Can you help me remove all <br> tags from my string?

12/15/2012 10:17:04 AM

Accepted Answer

It is as simple as:

  • loading the HTML snippet into an Agility Pack HtmlDocument
  • selecting all <br /> tags using the "//br" XPath expression
  • calling the Remove() method on each tag collected in the previous step
  • checking the result in the DocumentNode.OuterHtml property

The code is as follows:

const string htmlFragment =
    @"<p class=""Normal-P"" style=""direction: ltr; unicode-bidi: normal;"">" +
    @"<span class=""Normal-H"">sample<br/></span>" +
    @"<span class=""Normal-H"">texting<br></span></p> ";

var document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(htmlFragment);

foreach (var brTag in document.DocumentNode.SelectNodes("//br"))
    brTag.Remove();

Console.WriteLine(document.DocumentNode.OuterHtml);
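
One caveat with the loop above: as far as I know, SelectNodes returns null rather than an empty collection when the XPath matches nothing, so the foreach would throw a NullReferenceException on input that contains no <br> tags. A minimal sketch of a reusable helper with that guard added follows. The names BrStripper and RemoveBrTags are made up for illustration and are not part of HtmlAgilityPack.

```csharp
using System;
using HtmlAgilityPack;

static class BrStripper
{
    // Hypothetical helper: wraps the load / select / remove steps from the answer.
    public static string RemoveBrTags(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Assumption: SelectNodes yields null when no node matches "//br",
        // so check before iterating.
        var brTags = document.DocumentNode.SelectNodes("//br");
        if (brTags != null)
        {
            foreach (var brTag in brTags)
                brTag.Remove();
        }

        // Serialize the modified tree back to a string.
        return document.DocumentNode.OuterHtml;
    }
}
```

Calling BrStripper.RemoveBrTags(htmlFragment) should give the same result as the code above, and calling it on a string without any <br> tags should simply return the markup as serialized by the parser.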
12/15/2012 11:01:55 AM

Popular Answer

string html = ...;
html = Regex.Replace(html, "<br>", "", RegexOptions.Singleline);
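
Note that the literal pattern "<br>" matches only that exact spelling, so the <br/> in the question's string would survive. A hedged sketch of a broader pattern follows. BrRegex and StripBr are illustrative names, and the pattern assumes no attribute value inside the tag contains a ">" character. For anything less regular than this, the parser-based answer is the safer route.

```csharp
using System;
using System.Text.RegularExpressions;

static class BrRegex
{
    // Matches <br>, <br/>, <br /> and <BR>, with or without attributes.
    // The \b keeps longer tag names that merely start with "br" from matching.
    private static readonly Regex BrPattern =
        new Regex(@"<br\b[^>]*>", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the input with every <br> variant removed.
    public static string StripBr(string html)
    {
        return BrPattern.Replace(html, string.Empty);
    }
}
```

For example, BrRegex.StripBr("sample<br/> texting<br>") returns "sample texting".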


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow