HTML parsing using C#

c# html html-agility-pack replace string

Question

I am working on an ASP.NET website. I need to replace a particular string in the HTML using C#. The HTML is shown below; I need to replace "@name" with a valid name from C# code. I tried doing it with JavaScript and it works. How can I achieve this using C#?

How can I get the current page's HTML using C# or HtmlAgilityPack in order to parse it?

HTML:

<div>
In the @name, you may have configured an iPad in both the AppleDevices and the TabletDevices configuration. However, because AppleDevices may have been set for a small display size, you want an iPad to use the TableDevices configuration (which has a larger screen). Reorder the devices in the following order so that an iPad will use the TableDevices configuration first.
Tablet Devices
Apple Devices
</div>

Accepted Answer

Assuming this is MVC, take a look at my CsQuery project. CsQuery is a jQuery port and CSS selector engine, which you can use to work with the HTML directly. More importantly, the project includes an example with code that accesses a page's HTML in C# under MVC before it renders.

Accessing partial views is pretty easy; see Rick Strahl's blog post on the subject.
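As a rough illustration of the partial-view case, the sketch below renders a partial view into a string so that the HTML can be edited before it is written out. It follows a commonly used ASP.NET MVC pattern and is not necessarily the code from the blog post mentioned above. The class name ViewRenderer, the method name and the "@name" handling are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.Net;
using System.Web.Mvc;

// Hypothetical helper (not part of CsQuery or ASP.NET MVC itself).
public static class ViewRenderer
{
    // Renders the named partial view to a string and substitutes "@name".
    public static string RenderPartialWithName(
        ControllerContext context, string viewName, object model, string validName)
    {
        context.Controller.ViewData.Model = model;

        using (var writer = new StringWriter())
        {
            ViewEngineResult result =
                ViewEngines.Engines.FindPartialView(context, viewName);
            if (result.View == null)
                throw new InvalidOperationException("View not found: " + viewName);

            var viewContext = new ViewContext(
                context, result.View,
                context.Controller.ViewData, context.Controller.TempData, writer);

            // Render into the StringWriter instead of the response stream.
            result.View.Render(viewContext, writer);
            result.ViewEngine.ReleaseView(context, result.View);

            // HTML-encode the value so it cannot inject markup.
            string safeName = WebUtility.HtmlEncode(validName ?? string.Empty);
            return writer.ToString().Replace("@name", safeName);
        }
    }
}
```

From a controller action this could be called as ViewRenderer.RenderPartialWithName(ControllerContext, "_Devices", model, name), where "_Devices" is a made-up view name, and the result returned with Content(html, "text/html").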

If you want to access the entire HTML of a page and potentially alter it before it's rendered, however, you need to create a custom ViewEngine, and make callbacks to the controller where you will be able to access the HTML. There's quite a bit involved in doing this right. Rather than copy hundreds of lines of code, take a look at the example MVC app included with CsQuery, specifically the classes in the CsQueryView folder:

https://github.com/jamietre/CsQuery/tree/master/examples/CsQuery.MvcApp

This includes a custom view engine and a custom Controller base class that lets you add methods to controllers like this:

// runs for all actions
public void Cq_Start()
{
    Doc["a.not-allowed"]
        .Attr("onclick", "javascript:alert('You are not authorized to click this')");
}

// runs for the Index action
public void Cq_Index()
{
    Doc["div"].Css("border", "1px solid red");
}

These methods are called after the corresponding regular action methods, with Doc already set. Doc is a CQ object (the core object in CsQuery) that contains all the HTML for the page and behaves like a jQuery object. In your situation you could just use jQuery-style methods like:

// select all divs on the page
var div = Doc["div"];

// do parameter substitution
var newText = div.Text().Replace("@name", valid_name);

// update the text
div.Text(newText);
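One caveat with the snippet above: following jQuery semantics, Text() on a selection of several elements returns their combined text, and Text(value) writes the same value into every matched element, dropping any child markup. When the page has more than one div, a per-element replacement may be safer. The following is a minimal standalone sketch, assuming CsQuery's CQ.Create, Render and IDomObject.InnerHTML members; the class and method names are made up for illustration.

```csharp
using System.Linq;
using System.Net;
using CsQuery;

// Hypothetical helper showing a per-element "@name" substitution.
public static class CsQueryNameSubstitution
{
    public static string ReplaceName(string html, string validName)
    {
        CQ doc = CQ.Create(html);

        // HTML-encode the value so it is inserted as text, not markup.
        string safeName = WebUtility.HtmlEncode(validName ?? string.Empty);

        // Snapshot the selection first, because assigning InnerHTML
        // rebuilds the child nodes of the element being edited.
        foreach (IDomObject div in doc["div"].ToList())
        {
            string inner = div.InnerHTML;
            if (inner.Contains("@name"))
            {
                div.InnerHTML = inner.Replace("@name", safeName);
            }
        }

        // Serialize the modified document back to an HTML string.
        return doc.Render();
    }
}
```

Inside one of the Cq_ callbacks the same loop could run against Doc instead of a freshly parsed document. Note that replacing within InnerHTML also touches attribute values inside the div, which may or may not be what you want.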

To switch your MVC app to use this view engine you need to add this code to Application_Start:

ViewEngines.Engines.Clear();
ViewEngines.Engines.Add(new CsQueryViewEngine());

If you don't want to use CsQuery, though, the example should still show you how to access the HTML output in MVC before it renders. It uses reflection to figure out which methods to call back in your controller, and it could easily be adapted to provide a string of the HTML instead of a CsQuery object.


Popular Answer

The simplest way is to use the String.Replace(String, String) method:

string newString = html.Replace("@name", "valid name");
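A slightly fuller sketch of the same idea, assuming the page's HTML is already available as a string. It HTML-encodes the replacement value first so that a name containing characters such as & or < cannot break the markup. The class and method names here are illustrative only.

```csharp
using System;
using System.Net;

public static class NameSubstitution
{
    // Replaces every "@name" placeholder with the HTML-encoded name.
    public static string ReplaceName(string html, string name)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));

        string safeName = WebUtility.HtmlEncode(name ?? string.Empty);
        return html.Replace("@name", safeName);
    }

    public static void Main()
    {
        string html = "<div>In the @name, you may have configured an iPad.</div>";

        // Prints: <div>In the Device Manager, you may have configured an iPad.</div>
        Console.WriteLine(ReplaceName(html, "Device Manager"));
    }
}
```

String.Replace is a plain ordinal substring match, so it would also rewrite "@name" inside longer tokens such as "@names"; a more distinctive placeholder avoids that.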


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow