Selecting Inner Text Using HtmlAgilityPack

c# html html-agility-pack

Question

I am trying to extract some inner text using HtmlAgilityPack. Here is the HTML of interest:

<select name="Archives" onchange="javascript:setTimeout(&#39;__doPostBack(\&#39;Archives\&#39;,\&#39;\&#39;)&#39;, 0)" id="Archives" style="width:200px;">
    <option selected="selected" value="Dashboard_Jul-2012">Dashboard_Jul-2012</option>
    <option value="Dashboard_Jun-2012">Dashboard_Jun-2012</option>
</select>

I am using:

string output = htmlwriter.InnerWriter.ToString();
var doc = new HtmlDocument();
doc.LoadHtml(output);
string inner = doc.DocumentNode.SelectSingleNode("//option[@selected='selected']").InnerText;

but all I am getting is the empty string.

Any advice is appreciated.

Regards.

Accepted Answer

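By default, HtmlAgilityPack treats <option> tags as empty elements, so the text is not parsed as the option node's content and InnerText comes back empty (you can see the author's reason for this at HtmlAgilityPack -- Does <form> close itself for some reason?). To fix it, add this line before loading the HTML, since the flag is applied while the document is parsed: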

HtmlNode.ElementsFlags.Remove("option");
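
For context, here is a minimal sketch of how that line could fit into the code from the question. The class and method names (OptionTextDemo, GetSelectedOptionText) are hypothetical and only for illustration, and the HTML string is a shortened copy of the markup in the question. It assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

public static class OptionTextDemo
{
    // Hypothetical helper, not part of HtmlAgilityPack or the original post.
    public static string GetSelectedOptionText(string html)
    {
        // ElementsFlags is a static dictionary, so this changes parsing
        // for every document loaded afterwards in the process.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches, so guard
        // against that before reading InnerText.
        HtmlNode selected =
            doc.DocumentNode.SelectSingleNode("//option[@selected='selected']");
        return selected?.InnerText;
    }

    public static void Main()
    {
        const string html =
            "<select name=\"Archives\" id=\"Archives\" style=\"width:200px;\">" +
            "<option selected=\"selected\" value=\"Dashboard_Jul-2012\">Dashboard_Jul-2012</option>" +
            "<option value=\"Dashboard_Jun-2012\">Dashboard_Jun-2012</option>" +
            "</select>";

        Console.WriteLine(GetSelectedOptionText(html));
    }
}
```

Because the flag table is static, removing the entry once at application startup should be enough. If the entry is not present in the version you are using, Remove on the dictionary simply returns false, so the call should be harmless.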



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow