HTML parsing using C#

c# html html-agility-pack replace string

Question

I'm working on an ASP.NET website. I need to replace a certain string in the HTML using C#. The HTML is shown below; in it, I need to replace "@name" with a real name from C# code. I tried it with JavaScript and it works. How would I do this in C#?

How can I get the HTML for the current page using C# or HTMLAgilityPack so that I can parse it?

HTML:

<div>
In the @name, you may have configured an iPad in both the AppleDevices and the TabletDevices configuration. However, because AppleDevices may have been set for a small display size, you want an iPad to use the TableDevices configuration (which has a larger screen). Reorder the devices in the following order so that an iPad will use the TableDevices configuration first.
Tablet Devices
Apple Devices
</div>
7/12/2012 2:37:45 PM

Accepted Answer

If this is MVC, take a look at my CsQuery project. CsQuery is a jQuery port and CSS selector engine that lets you work with HTML directly. More importantly, the project includes an example with code for accessing a page's HTML in C# under MVC before it renders.

Getting the rendered HTML of partial views is not too difficult; see Rick Strahl's blog entry for details.

However, if you want to access the whole HTML of a page, and possibly change it before it is rendered, you have to create a custom ViewEngine, make callbacks to the controller where you can access the HTML, and so on. Doing this correctly involves a lot of steps. Rather than copying hundreds of lines of code here, take a look at the classes in the sample MVC app included with CsQuery, in the CsQueryView folder:

https://github.com/jamietre/CsQuery/tree/master/examples/CsQuery.MvcApp

This includes a custom view engine and a custom Controller base class that let you add methods like the following to your controllers:

// runs for all actions
public void Cq_Start()
{
    Doc["a.not-allowed"]
        .Attr("onclick", "javascript:alert('You are not authorized to click this')");
}

// runs for the Index action
public void Cq_Index()
{
    Doc["div"].Css("border", "1px solid red");
}

These methods are called after the corresponding regular action methods, by which point the value of Doc has been set. Doc is a CQ object (the core object in CsQuery). It contains the whole HTML of the page. It works like a jQuery object, so you can just use jQuery methods such as:

// select all divs on the page
var div = Doc["div"];

// do parameter substitution
var newText = div.Text().Replace("@name", valid_name);

// update the text
div.Text(newText);

To switch your MVC project to this view engine, add this code to Application_Start:

ViewEngines.Engines.Clear();
ViewEngines.Engines.Add(new CsQueryViewEngine());

If you don't want to use CsQuery, the example should still show you how to access the HTML output in MVC before it renders. It uses reflection to determine which methods to call back in your controller, and it could easily be adapted to pass a string of HTML instead of a CsQuery object.
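The reflection-based callback idea can be sketched without CsQuery. What follows is only an illustration of the technique, not the code from the CsQuery example: the HtmlCallbacks class, its Invoke method, the stand-in HomeController, and the convention of a public instance method named Cq_<Action> that takes and returns a string are all assumptions made for this sketch.

```csharp
using System;
using System.Reflection;

public static class HtmlCallbacks
{
    // Looks for a public instance method named "Cq_" + actionName on the
    // controller with the signature string -> string, and passes the HTML
    // through it. Returns the HTML unchanged when no such method exists.
    public static string Invoke(object controller, string actionName, string html)
    {
        if (controller == null) throw new ArgumentNullException(nameof(controller));

        MethodInfo method = controller.GetType().GetMethod(
            "Cq_" + actionName,
            BindingFlags.Public | BindingFlags.Instance,
            null,
            new[] { typeof(string) },
            null);

        if (method == null || method.ReturnType != typeof(string))
        {
            return html;
        }

        return (string)method.Invoke(controller, new object[] { html });
    }
}

// Hypothetical controller, used only to demonstrate the naming convention.
public class HomeController
{
    public string Cq_Index(string html)
    {
        return html.Replace("@name", "Contoso");
    }
}
```

Calling HtmlCallbacks.Invoke(new HomeController(), "Index", html) would run Cq_Index on the HTML, while an action without a matching Cq_ method leaves the HTML untouched. In a real MVC application the html string would have to come from the view engine, for example by rendering the view into a StringWriter first; that part is not shown here.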

7/12/2012 2:59:58 PM

Popular Answer

The easiest approach is to use the String.Replace(String, String) method:

string newString = html.Replace("@name", "valid name");
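One caveat with a plain Replace is that the substituted value ends up inside markup, so it is usually worth HTML-encoding it first. Here is a minimal sketch, assuming the page HTML is already available as a string; the TemplateFiller class and FillName method are hypothetical names chosen for illustration, not part of any library.

```csharp
using System;
using System.Net;

public static class TemplateFiller
{
    // Replaces every "@name" placeholder in the HTML with the given name.
    // The name is HTML-encoded first so that characters such as < or &
    // cannot break the surrounding markup.
    public static string FillName(string html, string name)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));

        string safeName = WebUtility.HtmlEncode(name ?? string.Empty);

        // String.Replace is case-sensitive and returns a new string;
        // the original html string is not modified.
        return html.Replace("@name", safeName);
    }
}
```

For example, TemplateFiller.FillName("<div>In the @name</div>", "Tom & Jerry") returns "<div>In the Tom &amp; Jerry</div>". Note that this matches the literal text "@name" wherever it occurs, including inside longer tokens such as "@nameSuffix", so the placeholder should be chosen to be unambiguous.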



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow