Html Agility Pack c# Paragraph parsing problem

c# html html-agility-pack

Question

I am having a couple of issues with my code, I am trying to pull every paragraph from a page, but at the moment it is only selecting the last paragraph.

here is my code.

foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@id='body']/p"))
{
  string text = node.InnerText;
  lblTest2.Text = text;
}

Accepted Answer

In your loop you are taking the current node innerText and assigning it to the label. You do this to each node, so of course you only see the last one - you are not preserving the previous ones.

Try this:

foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@id='body']/p"))
{
  string text = node.InnerText;
  lblTest2.Text += text + Environment.NewLine;
}

Popular Answer

IMO, XPath is no fun. I'd recommend using LINQ syntax instead:

foreach (var node in doc.DocumentNode
    .DescendantNodes()
    .Single(x => x.Id == "body")
    .DescendantNodes()
    .Where(x => x.Name == "p")) 
{
    string text = node.InnerText;
    lblTest2.Text = text;
}



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why