Parsing an HTML document with HtmlAgilityPack

c# html-agility-pack linq

I am trying to parse the following HTML snippet with HtmlAgilityPack:

<td bgcolor="silver" width="50%" valign="top">
 <table bgcolor="silver" style="font-size: 90%" border="0" cellpadding="2" cellspacing="0"
                                                width="100%">
   <tr bgcolor="#003366">
       <td>
           <font color="white">Info
        </td>
        <td>
           <font color="white">
              <center>Price
                   </td>
                      <td align="right">
                         <font color="white">Hourly
                         </td>
              </tr>
               <tr>
                 <td>
                     <a href='test1.cgi?type=1'>Bookbags</a>
                 </td>
                   <td>
                      $156.42
                    </td>
                    <td align="right">
                        <font color="green">0.11%</font>
                      </td>
                  </tr>
                  <tr>
                    <td>
                       <a href='test2.cgi?type=2'>Jeans</a>
                     </td>
                         <td>
                            $235.92
                               </td>
                                  <td align="right">
                                     <font color="red">100%</font>
                                  </td>
                   </tr>
               </table>
          </td>

In this case, I want to select the items, Bookbags and Jeans, along with their associated prices, and store them in a dictionary.

Does anyone know how to do this correctly with HtmlAgilityPack and LINQ?

Top answer

Assuming there may be other rows and you are not looking specifically for Bookbags and Jeans, I would do it like this:

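// Requires: using System.Globalization; using System.Linq;
var table = htmlDoc.DocumentNode
    .SelectSingleNode("//table[@bgcolor='silver' and @width='100%']");
var query =
    from row in table.Elements("tr").Skip(1) // skip the header row
    let columns = row.Elements("td").Take(2) // take only the first two columns
        .Select(col => col.InnerText.Trim())
        .ToList()
    select new
    {
        Info = columns[0],
        // Assumes US-style prices such as "$156.42"; an explicit culture
        // keeps parsing independent of the machine's regional settings.
        Price = decimal.Parse(columns[1], NumberStyles.Currency,
            CultureInfo.GetCultureInfo("en-US")),
    };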
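
Since the question asks for a dictionary, the same row-by-row approach can feed `ToDictionary`. The following is only a sketch under a few assumptions: the `PriceTable` class and `ParsePrices` method are hypothetical names introduced for illustration, prices are assumed to be US-style (such as `$156.42`), and item names are assumed to be unique.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using HtmlAgilityPack;

public static class PriceTable
{
    // Hypothetical helper (not part of the original answer):
    // maps each item name in the table to its price.
    public static Dictionary<string, decimal> ParsePrices(string html)
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        var table = htmlDoc.DocumentNode
            .SelectSingleNode("//table[@bgcolor='silver' and @width='100%']");
        if (table == null)
            return new Dictionary<string, decimal>(); // table not found

        // Assumption: prices use US formatting ("$" symbol, "." decimal separator).
        var culture = CultureInfo.GetCultureInfo("en-US");

        return table.Elements("tr")
            .Skip(1)                                   // skip the header row
            .Select(row => row.Elements("td")
                .Take(2)                               // item name and price only
                .Select(col => col.InnerText.Trim())
                .ToList())
            .Where(cols => cols.Count == 2)            // ignore rows with too few cells
            .ToDictionary(
                cols => cols[0],
                cols => decimal.Parse(cols[1], NumberStyles.Currency, culture));
    }
}
```

Note that `ToDictionary` throws an `ArgumentException` if two rows share the same item name, so a table with repeated names would need grouping or a different container.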



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow