I am trying to parse an HTML form with HTML Agility Pack. It works fine for the following markup:
<p>Some Text</p>
But suppose I have this:
<p>Some Text in p Tag<span>Some text in span tag.</span> Again some text in p tag</p>
I am using HtmlNode nodeItem in htmlDoc.DocumentNode.Descendants(controlName).ToArray()
to get all values of a control (in our case p and span), but this only returns the text that is inside the span.
How can I get the values of both tags, "p" as well as "span"?
UPDATE: I am trying to develop a multilingual application where resource files and keys are generated through code. In the above example, I need to create 3 keys: 1-"Some Text in p Tag", 2-"Some text in span tag." and 3-"Again some text in p tag." How can I create these keys? Currently, a key is created for the span tag but not for the p tag.
Thanks in advance.
The question is not very clear. It would help to post the relevant code showing how you tried to get the values of <p> and <span>.
The following worked fine for getting the text in both <p> and <span>:
var html = @"<p>Some Text in p Tag<span>Some text in span tag.</span> Again some text in p tag</p>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
foreach (HtmlNode nodeItem in htmlDoc.DocumentNode.Descendants("p").ToArray())
{
    Console.WriteLine(nodeItem.InnerText);
}
foreach (HtmlNode nodeItem in htmlDoc.DocumentNode.Descendants("span").ToArray())
{
    Console.WriteLine(nodeItem.InnerText);
}
This single foreach loop yields the same output:
foreach (HtmlNode nodeItem in
    htmlDoc.DocumentNode
           .SelectNodes("//*[name() = 'p' or name() = 'span']"))
{
    Console.WriteLine(nodeItem.InnerText);
}
Or, if you don't care about the tag name, you can get all elements as follows:
foreach (HtmlNode nodeItem in
    htmlDoc.DocumentNode
           .SelectNodes("//*"))
{
    Console.WriteLine(nodeItem.InnerText);
}
If none of the above samples is useful for your case, please update the question to clarify the actual problem you're trying to solve.
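Regarding the update about creating three separate keys: InnerText on the <p> element includes the text of its child <span>, so it cannot give you the three pieces on its own. One possible approach is to walk the text nodes instead of the elements. The following is only a minimal sketch, assuming that every non-empty text node should become one key; the trimming and the entity decoding are my own assumptions about what a resource key should look like, not something the question specifies.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var html = @"<p>Some Text in p Tag<span>Some text in span tag.</span> Again some text in p tag</p>";
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // Descendants() with no argument returns every descendant node,
        // including text nodes, so filter on the node type.
        var texts = htmlDoc.DocumentNode
            .Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Text)
            // Assumption: keys should be entity-decoded and trimmed.
            .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
            // Assumption: whitespace-only text nodes are not wanted as keys.
            .Where(t => t.Length > 0);

        foreach (var text in texts)
        {
            // Each entry is one candidate resource key.
            Console.WriteLine(text);
        }
    }
}
```

Each text node is reported separately here, so the text before the <span>, the text inside it, and the text after it arrive as distinct entries. If you also need to know which tag a piece of text belongs to, the ParentNode.Name of each text node should give you that.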