C#, Html Agility Pack: selecting every paragraph within a div tag

c# html html-agility-pack

Question

How can I select every paragraph inside a div tag? For example:

<div id="body_text">
<p>Hi</p>
<p>Help Me Please</p>
<p>Thankyou</p>
</div>

I have downloaded Html Agility Pack and referenced it in my program. All I need is the paragraphs. There may be a variable number of paragraphs, and the page has lots of other div tags, but I only need the content within body_text. I assume the result can be stored as a string, which I then want to write to a .txt file for later reference. Thank you.

Accepted Answer

The valid XPath for your case is //div[@id='body_text']/p

// Note: SelectNodes returns null (not an empty collection) when nothing matches the XPath.
foreach (HtmlNode node in yourHTMLAgilityPackDocument.DocumentNode.SelectNodes("//div[@id='body_text']/p"))
{
    string text = node.InnerText; // the text of the current paragraph
}
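
Since the question also asks about saving the paragraphs to a .txt file, here is a minimal sketch of the whole flow built on the XPath above. The class name ParagraphExporter, the method name ExtractParagraphs, and the output file name body_text.txt are placeholders I made up for illustration; the null check is there because SelectNodes returns null when no node matches.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; the names are illustrative, not part of Html Agility Pack.
public static class ParagraphExporter
{
    // Returns the text of each <p> that is a direct child of <div id="body_text">.
    public static List<string> ExtractParagraphs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@id='body_text']/p");

        // SelectNodes returns null when nothing matches, so guard before iterating.
        if (nodes == null)
        {
            return new List<string>();
        }

        // DeEntitize turns entities such as &amp; back into plain characters.
        return nodes.Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim()).ToList();
    }

    public static void Main()
    {
        string html = "<div id=\"body_text\"><p>Hi</p><p>Help Me Please</p><p>Thankyou</p></div>";
        List<string> paragraphs = ExtractParagraphs(html);

        // Writes one paragraph per line; "body_text.txt" is a placeholder path.
        File.WriteAllLines("body_text.txt", paragraphs);
    }
}
```

If you would rather have a single string, string.Join(Environment.NewLine, paragraphs) followed by File.WriteAllText should work just as well.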

Popular Answer

Here's a solution that grabs the paragraphs as an enumeration of HtmlNodes:

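using System.Linq;      // needed for the Where extension method
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load("your.html");
var div = doc.GetElementbyId("body_text"); // null if no element has this id
var paragraphs = div.ChildNodes.Where(item => item.Name == "p");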

Without explicit LINQ, using HtmlNode.Elements:

var paragraphs = doc.GetElementbyId("body_text").Elements("p");  
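
One thing worth knowing: both ChildNodes and Elements("p") only look at direct children of the div, so a paragraph nested inside another element would be skipped. If that matters for your page, Descendants("p") walks the whole subtree. A rough sketch of the difference follows; the class name BodyTextReader and both method names are made up for illustration, and it assumes the div id is body_text as in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; the names are illustrative only.
public static class BodyTextReader
{
    // <p> elements that are direct children of the div.
    public static List<string> DirectParagraphs(HtmlDocument doc)
    {
        HtmlNode div = doc.GetElementbyId("body_text");

        // GetElementbyId returns null when no element has the id.
        if (div == null)
        {
            return new List<string>();
        }

        return div.Elements("p").Select(p => p.InnerText).ToList();
    }

    // <p> elements at any depth below the div, in document order.
    public static List<string> AllParagraphs(HtmlDocument doc)
    {
        HtmlNode div = doc.GetElementbyId("body_text");

        if (div == null)
        {
            return new List<string>();
        }

        return div.Descendants("p").Select(p => p.InnerText).ToList();
    }
}
```

For the sample in the question the two methods should give the same result, since every paragraph sits directly under the div.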



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow