Html Agility Pack: replacing script tags

c# html html-agility-pack

Question

I want to replace the jQuery script tag in an HTML string with its source code. That means replacing the script tag whose src attribute is set to, for example, "scripts/jquery-1.9.1.js" with a script tag that contains the jQuery source code itself.

I create a new node using this code:

HtmlNode node = new HtmlNode(HtmlNodeType.Element, htmlDocument, index);
node.Name = "script";
node.PrependChild(HtmlNode.CreateNode(jQuerySourceCodeString));

No matter what I do to jQuerySourceCodeString, the result is always truncated to this:

<script>/*!
 * jQuery JavaScript Library v1.9.1
 * http://jquery.com/
 *
 * Includes Sizzle.js
 * http://sizzlejs.com/
 *
 * Copyright 2005, 2012 jQuery Foundation, Inc. and other contributors
 * Released under the MIT license
 * http://jquery.org/license
 *
 * Date: 2013-2-4
 */
(function( window, undefined ) {

// Can't do this because several apps including ASP.NET trace
// the stack via arguments.caller.callee and Firefox dies if
// you try to trace through "use strict" call chains. (#13335)
// Support: Firefox 18+
//"use strict";
var
    // The deferred used on DOM ready
    readyList,

    // A central reference to the root jQuery(document)
    rootjQuery,

    // Support: IE</script>

which is clearly not the full code that we can find here.

What am I doing wrong?

Update:

1 - I cannot use InnerHtml, since it tries to read the string as HTML.

2 - The HtmlNode.CreateNode method breaks when it finds a "<". It thinks that this is the beginning of a tag, but it is not.

Accepted Answer

Assuming your problem is how to append a script node to the parsed HTML document (because you want to remove the existing script node, retrieve the source from the URI in its src attribute, and append a new node with the result), I created a sample to reproduce what you want to do.

  1. I took jQuery 1.9.1 from the CDN and saved it to a local file
  2. I then tried to append it to the html document, under a script node

I made several attempts with HtmlAgilityPack, but the resulting HTML always ended with trailing garbage that looked like this:

</div></10></=></9></=8></"></$1></(?!area|br|col|embed|hr|img|input|link|meta|param)(([\w:-]+)[^></(?:"></use></9></table></tfoot></thead></tbody></table></tbody></9></=></"></[\w\w]+></tag></\></([\w-]+)\s*\></number></9></9></1.9.8+></10></=8></script>

I then gave up and tried another HTML parser that I use more often, AngleSharp. With it, I get correct resulting HTML.

Here are the code snippets for both attempts:

HtmlAgilityPack:

string html = @"
    <html>
    <head><title>SO Question</title></head>
    <body>
        <div>
            text text text
        </div>
    </body>

    <script>
        var a = 10;
    </script>
    </html>
";

var jsCode = File.ReadAllText("D:/jquery-1.12.4.js", Encoding.UTF8);

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);

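// Build a new <script> element by hand.
HtmlNode jsNode = new HtmlNode(HtmlNodeType.Element, doc, 0);
jsNode.Name = "script";
// Assigning InnerHtml makes HtmlAgilityPack read the JavaScript as HTML,
// which is where this attempt goes wrong.
jsNode.InnerHtml = jsCode;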

doc.DocumentNode.InsertAfter(jsNode, doc.DocumentNode.SelectSingleNode("body"));

File.WriteAllText("D:/jsCodeOut.html", doc.DocumentNode.InnerHtml);

AngleSharp:

string html = @"
    <html>
    <head><title>SO Question</title></head>
    <body>
        <div>
            text text text
        </div>
    </body>

    <script>
        var a = 10;
    </script>
    </html>
";

var jsCode = File.ReadAllText("D:/jquery-1.12.4.js", Encoding.UTF8);

HtmlParser hp = new HtmlParser();
var parsedHtml = hp.Parse(html);

var scriptNode = parsedHtml.CreateElement("script");
scriptNode.InnerHtml = jsCode;

parsedHtml.DocumentElement.AppendChild(scriptNode);

File.WriteAllText("D:/angleSharpOutput.html", parsedHtml.DocumentElement.InnerHtml);

Conclusion:

If you need to do this exclusively with HtmlAgilityPack, then my post is ultimately of no help. Otherwise, try AngleSharp and your problem should be solved.


Popular Answer

With HtmlAgilityPack, you can add the code as a text node:

jsNode.AppendChild(doc.CreateTextNode(jsCode));
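
Applied to the original question, the text-node approach might look like the sketch below. It is only an illustration, not a tested drop-in solution: the class name InlineScriptExample, the method InlineScript, its srcSuffix parameter, and the sample HTML and JavaScript strings are all hypothetical, and it assumes the HtmlAgilityPack NuGet package is referenced. The idea is to find the script tag by its src attribute and swap it for a new script element whose only child is a text node.

```csharp
using System;
using HtmlAgilityPack;

static class InlineScriptExample
{
    // Hypothetical helper: replaces the first <script src="..."> whose src
    // ends with srcSuffix by an inline <script> containing jsCode.
    public static string InlineScript(string html, string srcSuffix, string jsCode)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches.
        HtmlNodeCollection candidates = doc.DocumentNode.SelectNodes("//script[@src]");
        if (candidates == null)
        {
            return html;
        }

        foreach (HtmlNode oldNode in candidates)
        {
            string src = oldNode.GetAttributeValue("src", "");
            if (!src.EndsWith(srcSuffix, StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            // Add the JavaScript as a text node instead of assigning InnerHtml,
            // so a "<" in the code is not treated as the start of a tag.
            HtmlNode inlineNode = doc.CreateElement("script");
            inlineNode.AppendChild(doc.CreateTextNode(jsCode));

            oldNode.ParentNode.ReplaceChild(inlineNode, oldNode);
            break;
        }

        return doc.DocumentNode.OuterHtml;
    }

    static void Main()
    {
        // Sample input, made up for this sketch.
        string html =
            "<html><head><script src=\"scripts/jquery-1.9.1.js\"></script></head>" +
            "<body><div>text text text</div></body></html>";
        string jsCode = "if (1 < 2) { document.title = \"<b>\"; }";

        Console.WriteLine(InlineScript(html, "jquery-1.9.1.js", jsCode));
    }
}
```

In a real project, jsCode would come from File.ReadAllText, as in the accepted answer, rather than from a short literal.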


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow