Getting the value of the 'href' inside of a div in HTMLAgilityPack in C#

c# href html-agility-pack xpath

Question

I'm trying to get the value of an href attribute. The HTML looks like this:

          <div class="s_newsbox" style="font-size:12px; vertical-align:middle; overflow: hidden; float:left; margin:10px; margin-bottom:15px; height: 270px; width:280px; border-radius:6px; position:relative; text-align:center; padding:0px">
            <div style="background-color:#292929; background-color:rgba(0,0,0,0.8); padding:5px; padding-left:2px; padding-right:10px; width:100%; position:absolute; top:0; left:0;"><b>Samsung nx30 + zoom kit 18/55</b>
            </div>
            <a href="vendo.php?t=1395911">
              <img style="width:100%; height:100%" src="http://img1.juzaphoto.com/shared_files/uploads_mercatino/sell_1395911_small.jpg" alt="">
              <br></a>
            <div style="line-height:150%; background-color:#292929; background-color:rgba(0,0,0,0.8); padding:5px; position:absolute; bottom:0; left:0; margin-left:auto; width:100%; text-align:left">Venditore: 
              <a href="me.php?l=it&amp;p=45923"><b>Pierobob</b></a>  
              <br> Prezzo: <b>350 &euro;</b>  
              <br> Zona: <b>Bologna</b>  
              <br> 
              <a href="vendo.php?t=1395911">Leggi annuncio</a> (8 visite)
              <br>
            </div>
          </div>

I'm attempting to perform the following:

           var list = page.DocumentNode.SelectNodes("//div[@class='s_newsbox']");
           foreach (var obj in list)
           {
               // first <a> descendant of the current s_newsbox div
               var url = obj.SelectSingleNode(".//a").Attributes["href"].Value;
           }

I want to get "vendo.php?t=1395911", but what I actually get is the href value from another line, one whose parent is not a div with the class "s_newsbox".

What am I doing wrong?

Many thanks!

7/2/2015 4:42:39 PM

Accepted Answer

If you don't need any of the other nodes inside the s_newsbox div, you can narrow the match with a more specific XPath expression, one that selects only the anchors that are direct children of that div and have a non-empty href:

       var list = page.DocumentNode.SelectNodes("//div[@class='s_newsbox']/a[string-length(@href)>0]");
       foreach (var obj in list)
       {
           // each obj is already the <a> node, so read its attribute directly
           var url = obj.Attributes["href"].Value;
       }
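
For reference, here is a minimal, self-contained sketch of the same idea. The class name `NewsboxLinks` and the method `Extract` are hypothetical helpers made up for illustration; only the XPath expression comes from the answer. It also guards against the case where nothing matches, since HtmlAgilityPack's `SelectNodes` returns null rather than an empty collection in that situation.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical helper, not part of HtmlAgilityPack.
public static class NewsboxLinks
{
    // Returns the href of every <a> that is a DIRECT child of a
    // <div class="s_newsbox"> and has a non-empty href attribute.
    public static List<string> Extract(string html)
    {
        var page = new HtmlDocument();
        page.LoadHtml(html);

        var urls = new List<string>();

        // "/a" (single slash) matches only direct children, so the links
        // nested in the inner "Venditore" div are skipped.
        var anchors = page.DocumentNode.SelectNodes(
            "//div[@class='s_newsbox']/a[string-length(@href)>0]");

        // SelectNodes returns null when no node matches.
        if (anchors == null)
        {
            return urls;
        }

        foreach (var anchor in anchors)
        {
            // GetAttributeValue falls back to the default if href is missing.
            urls.Add(anchor.GetAttributeValue("href", string.Empty));
        }

        return urls;
    }
}
```

Called on the markup from the question, `NewsboxLinks.Extract(html)` should return a single entry, `vendo.php?t=1395911`, because that anchor is the only direct child `<a>` of the s_newsbox div. Note that `@class='s_newsbox'` is an exact string comparison, so this sketch assumes the div carries no additional class names.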
7/2/2015 5:39:49 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow