I'm having trouble figuring out how to traverse the DOM with HTML Agility Pack.
For example, let's say that I wanted to find an element with id="gbqfsa".
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(Url);
var foo = from bar in doc.DocumentNode.DescendantNodes()
          where bar.Attributes["id"].Value == "gbqfsa"
          select bar.InnerText;
Right now I'm doing this (above), but foo is coming out as null. What am I doing wrong?
EDIT: This is the if statement I was using. I was just testing to see if the element's InnerText equaled "Google Search".
if (foo.Equals("Google Search"))
{
    HasSucceeded = 1;
    MessageBox.Show(yay);
}
else
{
    MessageBox.Show("kms");
}
return HasSucceeded;
What you should do is:
var foo = (from bar in doc.DocumentNode.DescendantNodes()
           where bar.GetAttributeValue("id", null) == "gbqfsa"
           select bar.InnerText).FirstOrDefault();
You forgot FirstOrDefault() to select the first element that satisfies the condition in the where clause.
I also replaced Attributes["id"].Value with GetAttributeValue("id", null) so that no exception is thrown if an element does not have an id attribute.
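For a version that can be run on its own, here is a minimal sketch of the same query wrapped in a helper. The class name IdLookupSketch, the method FindInnerTextById, and the sample markup are made up for illustration; only LoadHtml, DescendantNodes, GetAttributeValue, and InnerText come from HTML Agility Pack. It assumes the HtmlAgilityPack package is referenced and that you already have the page's markup as a string, since LoadHtml parses markup and does not download a URL.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class IdLookupSketch
{
    // Hypothetical helper: returns the InnerText of the first node whose
    // id attribute equals the given id, or null when no node matches.
    public static string FindInnerTextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses a markup string; it does not fetch a URL

        return (from node in doc.DocumentNode.DescendantNodes()
                // GetAttributeValue returns the supplied default (null here)
                // for nodes that have no id attribute, instead of throwing.
                where node.GetAttributeValue("id", null) == id
                select node.InnerText).FirstOrDefault();
    }

    public static void Main()
    {
        // Made-up sample markup: one element without an id, one with it.
        const string html =
            "<div><p>no id here</p><span id=\"gbqfsa\">Google Search</span></div>";

        string text = FindInnerTextById(html, "gbqfsa");
        Console.WriteLine(text ?? "(no match)");
    }
}
```

Because FirstOrDefault() returns null when nothing matches, it is probably safer to check the result for null before calling Equals on it.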
I don't think foo is coming out as null. More likely, bar.Attributes["id"] is null for some of the elements in the tree, since not all descendant nodes have an "id" attribute. I would recommend using the GetAttributeValue method, which will return a default value if the attribute is not found.
var foo = from bar in doc.DocumentNode.DescendantNodes()
          where bar.GetAttributeValue("id", null) == "gbqfsa"
          select bar.InnerText;
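Note that a query written this way is an IEnumerable<string>, not a single string, so comparing it directly to "Google Search" with Equals will not succeed even when a matching element exists. As a rough sketch of how the check from the question could be done against the sequence: QueryResultSketch, InnerTextsById, and HasMatch are hypothetical names, and the markup is assumed to be available as a string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class QueryResultSketch
{
    // Hypothetical helper: the same query as above, returned as a lazy sequence.
    public static IEnumerable<string> InnerTextsById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return from node in doc.DocumentNode.DescendantNodes()
               where node.GetAttributeValue("id", null) == id
               select node.InnerText;
    }

    // Hypothetical helper: true when any matching element has exactly the expected text.
    public static bool HasMatch(string html, string id, string expectedText)
    {
        // Any() enumerates the query and compares each string,
        // rather than comparing the sequence object itself to a string.
        return InnerTextsById(html, id).Any(text => text == expectedText);
    }
}
```

Calling HasMatch(html, "gbqfsa", "Google Search") would then stand in for the foo.Equals("Google Search") test in the question.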