I'm trying to extract all input elements from a form. When I parse the following form:
<form>
<input name='test1' type='text'>
<input name='test2' type='text'>
<input name='test3' type='text'>
</form>
everything works perfectly: HTML Agility Pack detects the input elements in the form. But if each input has a div parent node, as in the following, the inputs are not detected.
<form>
<div><input name='test1' type='text'></div>
<div><input name='test2' type='text'></div>
<div><input name='test3' type='text'></div>
</form>
I'm using the following code:
HtmlNode.ElementsFlags.Remove("form");
foreach (HtmlAgilityPack.HtmlNode node in postForm.Elements("input"))
{
    HtmlAgilityPack.HtmlAttribute valueAttribute = node.Attributes["value"];
}
Can anyone tell me what went wrong? Thanks
The HtmlNode.Elements method returns only the first-generation (direct) child nodes with a matching name. Once you put your inputs inside a <div> tag, they become second-generation child nodes of the form element, so Elements("input") no longer finds them.
To make your code work, use the HtmlNode.Descendants method, which returns all descendant nodes with a matching name:
foreach (HtmlAgilityPack.HtmlNode node in postForm.Descendants("input"))
{
    HtmlAgilityPack.HtmlAttribute valueAttribute = node.Attributes["value"];
}
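To see the difference side by side, here is a minimal sketch that runs both methods against the nested markup from the question. It assumes the HtmlAgilityPack NuGet package is referenced; the FormInputs class, its Names method, and the includeNested parameter are hypothetical names made up for this illustration, and the form is located with Descendants("form").First(), which assumes the document contains at least one form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, for illustration only (not part of HTML Agility Pack).
static class FormInputs
{
    // Returns the "name" attribute of each <input> found under the first <form>.
    public static List<string> Names(string html, bool includeNested)
    {
        // Same call as in the question: let <form> keep its child nodes.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: the markup contains at least one <form>; First() throws otherwise.
        HtmlNode postForm = doc.DocumentNode.Descendants("form").First();

        // Elements: direct children only. Descendants: children at any depth.
        IEnumerable<HtmlNode> inputs = includeNested
            ? postForm.Descendants("input")
            : postForm.Elements("input");

        // GetAttributeValue returns the supplied default when the attribute is missing.
        return inputs.Select(n => n.GetAttributeValue("name", "")).ToList();
    }
}

static class Program
{
    static void Main()
    {
        const string html =
            "<form>" +
            "<div><input name='test1' type='text'></div>" +
            "<div><input name='test2' type='text'></div>" +
            "<div><input name='test3' type='text'></div>" +
            "</form>";

        Console.WriteLine("Elements:    " + FormInputs.Names(html, includeNested: false).Count);
        Console.WriteLine("Descendants: " + FormInputs.Names(html, includeNested: true).Count);
    }
}
```

Note that node.Attributes["value"] in the loop above yields null when an input has no value attribute, as in the sample form, so reading it through GetAttributeValue("value", "") may be more convenient.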
Use Descendants() instead of Elements(). The latter only works on direct children, but your input elements are nested within divs:
foreach (HtmlAgilityPack.HtmlNode node in postForm.Descendants("input"))
{
    HtmlAgilityPack.HtmlAttribute valueAttribute = node.Attributes["value"];
}