YouTube HTML Agility Pack C#

c# html html-agility-pack html-parsing

I am trying to retrieve all video IDs from a YouTube search results page.

Each result has the following markup:

<a href="/watch?v=aYIC-ebAD3o" class="ux-thumb-wrap result-item-thumb">
  <span class="video-thumb ux-thumb-128 ">
    <span class="clip">
      <img onload="tn_load(5)" alt="Thumbnail" src="//i2.ytimg.com/vi/aYIC-ebAD3o/default.jpg" >
    </span>
  </span>
  <span class="video-time">4:16</span>
  <span dir="ltr" class="yt-uix-button-group addto-container short video-actions" data-video-ids="aYIC-ebAD3o" data-feature="thumbnail">
    <button type="button" class="start master-sprite  yt-uix-button yt-uix-button-short yt-uix-tooltip" onclick=";return false;" title="" data-button-action="yt.www.addtomenu.add" role="button" aria-pressed="false">
      <img class="yt-uix-button-icon yt-uix-button-icon-addto" src="//s.ytimg.com/yt/img/pixel-vfl3z5WfW.gif" alt="">
        <span class="yt-uix-button-content">
          <span class="addto-label">Add to</span>
        </span>
    </button>
    <button type="button" class="end  yt-uix-button yt-uix-button-short yt-uix-tooltip yt-uix-button-empty" onclick=";return false;" title="" data-button-menu-id="shared-addto-menu" data-button-action="yt.www.addtomenu.load" role="button" aria-pressed="false">
      <img class="yt-uix-button-arrow" src="//s.ytimg.com/yt/img/pixel-vfl3z5WfW.gif" alt="">
    </button>
  </span>
  <span class="video-in-quicklist">Added to queue    </span>
</a>
<div class="result-item-main-content"> 

I am trying to parse the data in the "data-video-ids" class. What is the best way to do this with HTML Agility Pack?

I tried this: [snippet not preserved]

Any ideas?

Accepted answer

The 'data-video-ids' you are trying to filter on is an attribute, not a class. Try the following expression in SelectNodes:

"//span[@data-video-ids]"

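To retrieve the attribute values, you can try this approach (HtmlAgilityPack does not support selecting attributes directly, so you first have to get the elements and then read the actual attribute from each one):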

"//span[@data-video-ids]"

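The two steps described above (select the elements, then read the attribute from each) could look roughly like the following. This is a minimal sketch, not the answerer's original code: the class name VideoIdExtractor, the method name ExtractVideoIds, and the trimmed sample markup are assumptions made for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; the name is an assumption for this sketch.
public static class VideoIdExtractor
{
    // Returns the value of every data-video-ids attribute found on a <span>.
    public static List<string> ExtractVideoIds(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Step 1: select the elements that carry the attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection spans = doc.DocumentNode.SelectNodes("//span[@data-video-ids]");
        if (spans == null)
        {
            return new List<string>();
        }

        // Step 2: read the attribute value from each selected element.
        return spans
            .Select(span => span.GetAttributeValue("data-video-ids", string.Empty))
            .Where(id => id.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        // Sample markup trimmed from the question; a real page would be downloaded first.
        const string html =
            "<a href=\"/watch?v=aYIC-ebAD3o\" class=\"ux-thumb-wrap result-item-thumb\">" +
            "<span class=\"video-time\">4:16</span>" +
            "<span dir=\"ltr\" class=\"addto-container\" data-video-ids=\"aYIC-ebAD3o\"></span>" +
            "</a>";

        foreach (string id in ExtractVideoIds(html))
        {
            Console.WriteLine(id);
        }
    }
}
```

The null check matters because a page with no matching spans would otherwise throw a NullReferenceException as soon as the result is enumerated.
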
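Popular answer

If you use one of YouTube's APIs, I think you will be better off in the long run.

I would only use web requests and HtmlAgilityPack as a last resort, when no API is available. The main reason is that if YouTube changes its web pages, your code will break. Public APIs are usually designed to stay backward compatible, so in most cases your application should keep working indefinitely.

Here is a code sample for the YouTube API: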

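// Namespaces from the Google Data (GData) .NET client library.
using Google.GData.Client;
using Google.GData.YouTube;
using Google.YouTube;

// `request` is assumed to be an already configured YouTubeRequest; its setup is not shown here.
YouTubeQuery query = new YouTubeQuery(YouTubeQuery.DefaultVideoUri);

// Order results by the number of views (most viewed first).
query.OrderBy = "viewCount";

// Search for puppies and include restricted content in the search results.
// query.SafeSearch could also be set to YouTubeQuery.SafeSearchValues.Moderate.
query.Query = "puppy";
query.SafeSearch = YouTubeQuery.SafeSearchValues.None;

Feed<Video> videoFeed = request.Get<Video>(query);

// printVideoFeed is a helper that is not shown here; it prints each video in the feed.
printVideoFeed(videoFeed);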

Looks pretty simple, right?




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow