Parsing a form with HTML Agility Pack

c# html-agility-pack

Question

I'm trying to extract all input elements from a form. When I parse the following form:

<form>
<input name='test1' type='text'>
<input name='test2' type='text'>
<input name='test3' type='text'>
</form>

everything works perfectly: HTML Agility Pack detects the input elements in the form. But if each input has a div parent node, as in the following form, the inputs are not detected.

<form>
<div><input name='test1' type='text'></div>
<div><input name='test2' type='text'></div>
<div><input name='test3' type='text'></div>
</form>

I'm using the following code:

HtmlNode.ElementsFlags.Remove("form");

foreach (HtmlAgilityPack.HtmlNode node in postForm.Elements("input"))
{
    HtmlAgilityPack.HtmlAttribute valueAttribute = node.Attributes["value"];
}

Can anyone tell me what went wrong? Thanks

Accepted Answer

The HtmlNode.Elements method returns only the first-generation (direct) child nodes with a matching name. Once you put your inputs inside a <div> tag, they become second-generation child nodes of the form element, so Elements("input") no longer finds them.

To make your code work, use the HtmlNode.Descendants method, which returns all descendant nodes with a matching name:

foreach (HtmlAgilityPack.HtmlNode node in postForm.Descendants("input"))
{
   HtmlAgilityPack.HtmlAttribute valueAttribute = node.Attributes["value"];
}
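
To see the difference side by side, here is a minimal sketch that parses the nested form from the question and counts the inputs both ways. The class name FormInputDemo and the helper LoadForm are made up for illustration, and it assumes the HtmlAgilityPack package is referenced. The comment about the default "form" flag reflects older releases of the library; on newer ones the Remove call is harmless.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class FormInputDemo
{
    // Sample markup from the question: each input is wrapped in a div.
    public const string NestedFormHtml =
        "<form>" +
        "<div><input name='test1' type='text'></div>" +
        "<div><input name='test2' type='text'></div>" +
        "<div><input name='test3' type='text'></div>" +
        "</form>";

    // Hypothetical helper: parses the markup and returns the first <form> node.
    public static HtmlNode LoadForm(string html)
    {
        // Older HAP releases flag <form> as an empty element, so its content
        // is not parsed as children. Removing the flag lets the inputs nest.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.Descendants("form").First();
    }

    public static void Main()
    {
        HtmlNode postForm = LoadForm(NestedFormHtml);

        // Elements looks at direct children only; the form's children are divs.
        Console.WriteLine(postForm.Elements("input").Count());    // 0

        // Descendants walks the whole subtree below the form.
        Console.WriteLine(postForm.Descendants("input").Count()); // 3
    }
}
```

With the first form from the question (inputs directly under the form), both calls would find the same three nodes, which is why the original code appeared to work.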

Popular Answer

Use Descendants() instead of Elements(). The latter only works on direct children, but your input elements are nested within divs:

 foreach (HtmlAgilityPack.HtmlNode node in postForm.Descendants("input"))
 {
     HtmlAgilityPack.HtmlAttribute valueAttribute = node.Attributes["value"];
 }
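
One thing to watch for once the loop finds the inputs: node.Attributes["value"] is null when an input has no value attribute, which is the case for every input in the sample form. A possible way to collect the fields without null checks is GetAttributeValue with a default, roughly like this. FormFields and ReadInputs are made-up names for illustration, not part of the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class FormFields
{
    // Hypothetical helper: maps each named input under the form to its value.
    public static Dictionary<string, string> ReadInputs(HtmlNode form)
    {
        var fields = new Dictionary<string, string>();

        foreach (HtmlNode input in form.Descendants("input"))
        {
            // GetAttributeValue returns the supplied default when the
            // attribute is missing, so no null check is needed.
            string name = input.GetAttributeValue("name", "");
            if (name.Length == 0)
            {
                continue; // skip inputs without a name
            }

            fields[name] = input.GetAttributeValue("value", "");
        }

        return fields;
    }
}
```

Called as FormFields.ReadInputs(postForm), this should give one entry per named input, with an empty string for inputs that carry no value attribute.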



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow