HtmlAgilityPack SelectNodes Syntax

.net c# html html-agility-pack xpath

Question

The HTML I have is as follows:

<tbody>
    <tr>
        <td class="metadata_name">Headquarters</td>
        <td class="metadata_content">Princeton New Jersey, United States</td>
    </tr>
    <tr>
        <td class="metadata_name">Industry</td>
        <td class="metadata_content"><ul><li><a href="/q-Engineering-Software-jobs.html" rel="nofollow">Engineering Software</a></li><li><a href="/q-Software-Development-&amp;-Design-jobs.html" rel="nofollow">Software Development &amp; Design</a></li><li><a href="/q-Software-jobs.html" rel="nofollow">Software</a></li><li><a href="/q-Custom-Software-&amp;-Technical-Consulting-jobs.html" rel="nofollow">Custom Software &amp; Technical Consulting</a></li></ul></td>
    </tr>
    <tr>
        <td class="metadata_name">Revenue</td>
        <td class="metadata_content">$17.5 Million</td>
    </tr>
    <tr>
        <td class="metadata_name">Employees</td>
        <td class="metadata_content">201 to 500</td>
    </tr>
    <tr>
        <td class="metadata_name">Links</td>
        <td class="metadata_content"><ul><li><a href="/url?q=http%3A%2F%2Fwww.site.com&amp;h=085df2ca" target="_blank">Company website</a></li></ul></td>
    </tr>
</tbody>

I want to load a metadata_content value (for example, "$17.5 Million") into a variable, given the metadata_name of its row (for example, "Revenue").

I have spent many hours trying combinations of code like this:

orgHtml.DocumentNode.SelectNodes("//td[@class='metadata_name']")[0].InnerHtml;

But I can't find the right combination. I would appreciate it if you could give me a SelectNodes syntax that retrieves the value.

12/16/2014 2:32:02 PM

Accepted Answer

It seems like this is what you're searching for:

var found = orgHtml.DocumentNode.SelectSingleNode(
    "//tr[td[@class = 'metadata_name'] = 'Revenue']/td[@class = 'metadata_content']");
if (found != null)
{
    string html = found.InnerHtml;
    // use html
}
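
As a minimal sketch of how the same expression could be reused for any metadata name, the lookup can be wrapped in a small helper. The class name `MetadataLookup`, the method `GetContent`, and the trimmed-down inline HTML are assumptions made for illustration; only the XPath shape comes from the snippet above. The helper also assumes the name contains no single quote, because it is spliced directly into the XPath string.

```csharp
using System;
using HtmlAgilityPack;

public static class MetadataLookup
{
    // Hypothetical helper: returns the text of the metadata_content cell in the
    // row whose metadata_name cell equals `name`, or null when no row matches.
    // Assumption: `name` contains no single quote (it is spliced into the XPath).
    public static string GetContent(HtmlDocument doc, string name)
    {
        string xpath =
            "//tr[td[@class = 'metadata_name'] = '" + name + "']" +
            "/td[@class = 'metadata_content']";

        // SelectSingleNode returns null when nothing matches.
        HtmlNode cell = doc.DocumentNode.SelectSingleNode(xpath);
        return cell == null ? null : cell.InnerText.Trim();
    }

    // Builds a shortened copy of the question's table for the demo.
    public static HtmlDocument LoadSample()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<table><tbody>" +
            "<tr><td class=\"metadata_name\">Revenue</td>" +
            "<td class=\"metadata_content\">$17.5 Million</td></tr>" +
            "<tr><td class=\"metadata_name\">Employees</td>" +
            "<td class=\"metadata_content\">201 to 500</td></tr>" +
            "</tbody></table>");
        return doc;
    }

    public static void Main()
    {
        HtmlDocument doc = LoadSample();
        Console.WriteLine(GetContent(doc, "Revenue"));
        Console.WriteLine(GetContent(doc, "Employees"));
        Console.WriteLine(GetContent(doc, "Missing") ?? "(not found)");
    }
}
```

The predicate `td[@class = 'metadata_name'] = 'Revenue'` keeps only the row that has a name cell with that text, and the trailing step then picks the content cell from the same row.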

Keep in mind that to get the text of an element you should use found.InnerText, not found.InnerHtml, unless you explicitly need its HTML content.
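
To make the difference concrete, here is a small sketch that prints both properties for the "Industry" cell, which contains nested markup. The class name `InnerTextDemo`, the method `FindIndustryCell`, and the shortened two-link HTML are assumptions for illustration. As far as I know, InnerText leaves entities such as `&amp;` encoded, so the sketch passes the link text through `HtmlEntity.DeEntitize`.

```csharp
using System;
using HtmlAgilityPack;

public static class InnerTextDemo
{
    // Loads a shortened copy of the "Industry" row and returns its content cell.
    public static HtmlNode FindIndustryCell()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<table><tbody><tr>" +
            "<td class=\"metadata_name\">Industry</td>" +
            "<td class=\"metadata_content\"><ul>" +
            "<li><a href=\"/q-Software-jobs.html\" rel=\"nofollow\">Software</a></li>" +
            "<li><a href=\"/q-Software-Development-&amp;-Design-jobs.html\" " +
            "rel=\"nofollow\">Software Development &amp; Design</a></li>" +
            "</ul></td>" +
            "</tr></tbody></table>");

        return doc.DocumentNode.SelectSingleNode(
            "//tr[td[@class = 'metadata_name'] = 'Industry']" +
            "/td[@class = 'metadata_content']");
    }

    public static void Main()
    {
        HtmlNode found = FindIndustryCell();

        // InnerHtml keeps the nested <ul>, <li> and <a> markup.
        Console.WriteLine(found.InnerHtml);

        // InnerText drops the tags and concatenates the text of all descendants.
        Console.WriteLine(found.InnerText);

        // For one value per link, select the anchors relative to the cell.
        // SelectNodes returns null when nothing matches; here two links exist.
        foreach (HtmlNode link in found.SelectNodes(".//a"))
        {
            Console.WriteLine(HtmlEntity.DeEntitize(link.InnerText));
        }
    }
}
```

For single-value cells such as "Revenue", InnerText alone is enough; the per-link loop only matters for cells that hold a list.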

12/16/2014 3:06:33 PM



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow