How to comment out all script tags in an HTML document using HTML Agility Pack

c# comments html-agility-pack

Question

I would like to comment out all script tags in an HtmlDocument. That way, when I render the document, the scripts are not executed, but we can still see what was there. Unfortunately, my current approach is failing:

foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
{
    var commentedScript = new HtmlNode(HtmlNodeType.Comment, htmlDocument, 0) { InnerHtml = scriptTag.ToString() };
    scriptTag.ParentNode.AppendChild(commentedScript);
    scriptTag.Remove();
}

Note that I could do this with string replacements on the HTML, but I do not think that would be as robust:

domHtml = domHtml.Replace("<script", "<!-- <script");
domHtml = domHtml.Replace("</script>", "</script> -->");
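
To illustrate the robustness concern, here is a minimal sketch using only the standard library. The class and method names (ReplaceDemo, NaiveCommentOut) are made up for this example. It shows one way the string approach can miss tags: string.Replace is case-sensitive, so an uppercase <SCRIPT> element is left untouched.

```csharp
using System;

static class ReplaceDemo
{
    // Hypothetical helper that applies the two Replace calls from the question.
    public static string NaiveCommentOut(string domHtml)
    {
        domHtml = domHtml.Replace("<script", "<!-- <script");
        domHtml = domHtml.Replace("</script>", "</script> -->");
        return domHtml;
    }

    static void Main()
    {
        string html = "<SCRIPT>alert(1)</SCRIPT><script>alert(2)</script>";

        // Only the lowercase element is wrapped; the uppercase one still runs.
        Console.WriteLine(NaiveCommentOut(html));
        // prints: <SCRIPT>alert(1)</SCRIPT><!-- <script>alert(2)</script> -->
    }
}
```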

Accepted Answer

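Try this. It builds a comment node from the script's OuterHtml and replaces the script in place with ReplaceChild. The null check is there because SelectNodes returns null when no nodes match:

var scriptTags = htmlDocument.DocumentNode.SelectNodes("//script");
if (scriptTags != null) // SelectNodes returns null when nothing matches
{
    foreach (var scriptTag in scriptTags)
    {
        // Wrap the script's markup in a comment and swap it in for the original node.
        var commentedScript = HtmlNode.CreateNode(string.Format("<!--{0}-->", scriptTag.OuterHtml));
        scriptTag.ParentNode.ReplaceChild(commentedScript, scriptTag);
    }
}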
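
For a fuller picture, the snippet above can be turned into a small console program. This is a sketch rather than a definitive implementation. The names ScriptCommenter and CommentOutScripts are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced. It also breaks up any "-->" inside a script, on the assumption that altering the commented-out text is acceptable, because that sequence would otherwise end the wrapping comment early.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

static class ScriptCommenter
{
    // Hypothetical helper; not part of HtmlAgilityPack itself.
    public static void CommentOutScripts(HtmlDocument doc)
    {
        // Descendants() yields an empty sequence when there are no matches,
        // and ToList() takes a snapshot so the tree can be edited while looping.
        foreach (var script in doc.DocumentNode.Descendants("script").ToList())
        {
            // Assumption: it is fine to alter the commented-out text slightly.
            // A literal "-->" inside the script would close the comment early.
            string markup = script.OuterHtml.Replace("-->", "-- >");

            var comment = HtmlNode.CreateNode("<!--" + markup + "-->");
            script.ParentNode.ReplaceChild(comment, script);
        }
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><head><script>alert('hi');</script></head>" +
                     "<body><p>Text</p></body></html>");

        CommentOutScripts(doc);

        // The script element should now appear wrapped in <!-- ... -->.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Because the comment keeps the script's original markup, the rendered page shows no script behaviour while the source still records what was there.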

Popular Answer

See this Stack Overflow post for a very clean solution that uses the HTML Agility Pack's LINQ support: "htmlagilitypack - remove script and style?"
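
The linked post is not reproduced here. As a rough sketch of the kind of LINQ-based approach it refers to, the following removes script and style elements outright. The names ScriptStripper and RemoveScriptsAndStyles are hypothetical, and the HtmlAgilityPack NuGet package is assumed. Note that this deletes the elements rather than commenting them out, so it does not preserve what was there.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

static class ScriptStripper
{
    // Hypothetical helper; removes the elements instead of commenting them out.
    public static void RemoveScriptsAndStyles(HtmlDocument doc)
    {
        doc.DocumentNode.Descendants()
            .Where(n => n.Name == "script" || n.Name == "style")
            .ToList()                    // snapshot before modifying the tree
            .ForEach(n => n.Remove());
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<div><style>p{}</style><script>alert(1)</script><p>Text</p></div>");

        RemoveScriptsAndStyles(doc);

        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```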




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow