Html Agility Pack - How to select the first link in nested elements

c# html-agility-pack

Question

This is my HTML. How can I select only the first link (Main link 1, Main link 2, ...) directly inside each <li id="item1">, <li id="item2">, and so on?

<div id="mainmenu">
<div class="wrapper">
    <div class="main-menu">
        <ul class="navigation">
            <li class="menu-group">
                <ul>
                    <li id="item1">
                        <a href="/">Main link 1</a>

                        <div id="item1-sub">
                            <ul>
                                <li>
                                    <a href="#1">subLink1</a>
                                </li>
                                <li>
                                    <a href="#1">subLink2</a>
                                </li>
                            </ul>
                        </div>
                    </li>
                    <li id="item2">
                        <a href="/">Main link 2</a>

                        <div id="item2-sub">
                            <ul>
                                <li>
                                    <a href="#1">subLink1</a>
                                </li>
                                <li>
                                    <a href="#1">subLink2</a>
                                </li>
                            </ul>
                        </div>
                    </li>
                </ul>
            </li>
        </ul>
    </div>
</div>
</div>

This is my C# code. It does not work as expected; can someone help me fix it?

var webGet = new HtmlWeb();
var document = webGet.Load("file.html");
var menuGroup = document.DocumentNode.SelectNodes("//div[@id='mainmenu']//div[@class='wrapper']//div[@class='main-menu']//ul[@class='navigation']//li[@class='menu-group']//ul//li");
if (menuGroup != null)
{
  foreach (var Tag in menuGroup)
  {
    var atag = Tag.SelectSingleNode("./a");
  }
}

Accepted Answer

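You can select each main link by the id of its parent li. SelectSingleNode returns the first match in document order, which here is the main link rather than one of the sub-links: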

var html = "<html><head></head><body><div id=\"mainmenu\"><div class=\"wrapper\"><div class=\"main-menu\"><ul class=\"navigation\"><li class=\"menu-group\"><ul><li id=\"item1\">" +
        "<a href=\"/\">Main link 1</a><div id=\"item1-sub\"><ul><li><a href=\"#1\">subLink1</a></li><li><a href=\"#1\">subLink2</a></li></ul></div></li><li id=\"item2\">" +
        "<a href=\"/\">Main link 2</a><div id=\"item2-sub\"><ul><li><a href=\"#1\">subLink1</a></li><li><a href=\"#1\">subLink2</a></li></ul></div></li></ul></li></ul></div></div></div></body></html>";
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

var mainLink1 = doc.DocumentNode.SelectSingleNode("//li[@id='item1']//a");
var linkUrl1 = mainLink1.Attributes["href"].Value;
var linkText1 = mainLink1.InnerText;

var mainLink2 = doc.DocumentNode.SelectSingleNode("//li[@id='item2']//a");
var linkUrl2 = mainLink2.Attributes["href"].Value;
var linkText2 = mainLink2.InnerText;

EDIT:

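To get the main links in a loop, first empty the nested sub-menu div elements, then select the remaining a elements. Only the main links are left at that point. Note that this modifies the loaded document.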

foreach (var div in doc.DocumentNode.SelectNodes("//li[@class='menu-group']//ul//li//div"))
{
    div.InnerHtml = string.Empty;
}
foreach (var a in doc.DocumentNode.SelectNodes("//li[@class='menu-group']//ul//li//a"))
{
    var linkUrl = a.Attributes["href"].Value;
    var linkText = a.InnerText;
}
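
If you would rather not modify the document, a possible alternative is to use the XPath child axis (/) instead of the descendant axis (//), so that only a elements sitting directly under each top-level li are matched. The following is a minimal sketch under that assumption; the shortened HTML string and the links list are illustrative and not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Shortened version of the menu markup from the question (illustrative).
var html =
    "<div id='mainmenu'><div class='wrapper'><div class='main-menu'>" +
    "<ul class='navigation'><li class='menu-group'><ul>" +
    "<li id='item1'><a href='/'>Main link 1</a>" +
    "<div id='item1-sub'><ul><li><a href='#1'>subLink1</a></li></ul></div></li>" +
    "<li id='item2'><a href='/'>Main link 2</a>" +
    "<div id='item2-sub'><ul><li><a href='#1'>subLink1</a></li></ul></div></li>" +
    "</ul></li></ul></div></div></div>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Single slashes walk direct children only, so the sub-menu links
// (which sit under li/div/ul/li) are not matched.
var nodes = doc.DocumentNode.SelectNodes("//li[@class='menu-group']/ul/li/a");

var links = new List<(string Url, string Text)>();
if (nodes != null) // SelectNodes returns null when nothing matches
{
    foreach (var a in nodes)
    {
        links.Add((a.GetAttributeValue("href", string.Empty), a.InnerText.Trim()));
    }
}

foreach (var (url, text) in links)
{
    Console.WriteLine($"{text} -> {url}");
}
```

The same idea applies to the loop in the question: the inner "./a" is already a direct-child selection, so narrowing the outer expression to "//li[@class='menu-group']/ul/li" should be enough to skip the nested list items.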

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow