Parsing a form's action and its input names and values using Html Agility Pack

c# forms html-agility-pack

Question

I'm trying to parse the form's action value and each input's name and value from the following HTML:

            <form method="post" action="actionURL" autocomplete="" name="login_form" id="login_form" onsubmit="return hash2(this)">

            <input type="hidden" name=".tries" value="1">
            <input type="hidden" name=".src" value="ym">
            <input type="hidden" name=".md5" value="">
            <input type="hidden" name=".hash" value="">
            <input type="hidden" name=".js" value="">
            <input type="hidden" name=".last" value="">
            <input type="hidden" name="promo" value="">
            <input type="hidden" name=".intl" value="us">
            <input type="hidden" name=".lang" value="en">
            <input type="hidden" name=".bypass" value="">
            <input type="hidden" name=".partner" value="">
            <input type="hidden" name=".u" value="8013sg1858dp9">
            <input type="hidden" name=".v" value="0">
            <input type="hidden" name=".challenge" value="fUhehaaMq9c2lQjndCps_rNu1eSB">
            <input type="hidden" name=".yplus" value="">
            <input type="hidden" name=".emailCode" value="">
            <input type="hidden" name="pkg" value="">
            <input type="hidden" name="stepid" value="">
            <input type="hidden" name=".ev" value="">
            <input type="hidden" name="hasMsgr" value="0">
            <input type="hidden" name=".chkP" value="Y">
            <input type="hidden" name=".done" value="somevalue">
            <input type="hidden" name=".pd" value="ym_ver=0&c=&ivt=&sg=">
            <input type="hidden" name=".ws" id=".ws" value="0">
            <input type="hidden" name=".cp" id=".cp" value="0">     
            <input type="hidden" name="nr" value="0">

            <input type="hidden" name="pad" id="pad" value="5">
            <input type="hidden" name="aad" id="aad" value="5">

                            <div id='inputs'>

                <label for='username'>Yahoo! ID</label>
                                    <input name='login' id='username' maxlength='96' tabindex='1' value=''>

                    <p id='ex'>(e.g. test)</p>

                <label for='passwd'>Password</label>
                <input name='passwd' id='passwd' type='password' maxlength='64' tabindex='2'>


    <div id="captchaDiv"></div>
            </div>
<div id='fun'></div>

        <div id='persistency'>
            <input type='checkbox' name='.persistent' id='persistent' tabindex='4' value='y' >
            <p>
              <label for='persistent'>Keep me signed in</label>
              <br>
              <span id='uncheck'>(Uncheck if on a shared computer)</span>
            </p>
        </div>


    <div id='submit'>
        <button type='submit' id='.save' name='.save' class='secondaryCta' tabindex='5'>
          Sign In
        </button>           </div>
</form>

The form above contains input elements both as direct children and nested inside child elements. When I use the sample from https://stackoverflow.com/a/9890022/1007447 and step through the C# code, the node named "form" has no child elements and no Descendants.

How can I get the form action and all the input elements with their values? (Sometimes I also need to skip the username and password inputs.)

Popular Answer

This has been discussed a few times here on Stack Overflow.

The answer is in the same question you are referring to. You have to do:

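// Run this once, before the document is loaded.
HtmlNode.ElementsFlags.Remove("form");

var doc = ... //Load the document here

// Every input nested anywhere inside a form; SelectNodes returns null when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//form//input");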
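To get from those nodes to what the question asks for (the action plus name/value pairs, optionally leaving out the login fields), something along these lines should work. This is only a sketch: the class name `LoginFormParser`, the helpers `GetFormAction` and `GetInputs`, the `skipNames` parameter and the shortened HTML string are all made up for illustration, and it assumes the page source is already available as a string.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class LoginFormParser
{
    // Returns the action attribute of the first form, or "" if there is no form or no action.
    public static string GetFormAction(HtmlDocument doc)
    {
        var form = doc.DocumentNode.SelectSingleNode("//form");
        return form == null ? "" : form.GetAttributeValue("action", "");
    }

    // Collects name/value pairs for every input inside a form,
    // leaving out unnamed inputs and any name listed in skipNames (hypothetical parameter).
    public static List<KeyValuePair<string, string>> GetInputs(HtmlDocument doc, ICollection<string> skipNames)
    {
        var result = new List<KeyValuePair<string, string>>();
        var inputs = doc.DocumentNode.SelectNodes("//form//input");
        if (inputs == null)
            return result; // SelectNodes returns null when nothing matches

        foreach (var input in inputs)
        {
            var name = input.GetAttributeValue("name", "");
            if (name.Length == 0 || skipNames.Contains(name))
                continue;
            result.Add(new KeyValuePair<string, string>(name, input.GetAttributeValue("value", "")));
        }
        return result;
    }

    static void Main()
    {
        // Must happen before the document is loaded.
        HtmlNode.ElementsFlags.Remove("form");

        // Shortened stand-in for the form in the question.
        string html =
            "<form method='post' action='actionURL' name='login_form' id='login_form'>" +
            "<input type='hidden' name='.tries' value='1'>" +
            "<input type='hidden' name='.src' value='ym'>" +
            "<div id='inputs'>" +
            "<input name='login' id='username' value=''>" +
            "<input name='passwd' id='passwd' type='password'>" +
            "</div>" +
            "</form>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        Console.WriteLine("action = " + GetFormAction(doc));

        // Skip the username and password inputs, as mentioned in the question.
        var skip = new HashSet<string> { "login", "passwd" };
        foreach (var pair in GetInputs(doc, skip))
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

Pass an empty set as `skipNames` to keep every input. Note that `//form//input` only matches input elements, so the `.save` button in the question's form is not included; if it is needed, it would have to be selected separately.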

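The key is the line

HtmlNode.ElementsFlags.Remove("form");

which has to run before the document is loaded. As for why it is needed: in the Html Agility Pack versions this question is about, form is flagged by default as an element that may be empty and may overlap other elements, so the parser does not nest the inputs under the form node and it ends up with no children. Removing the flag makes form parse like any other container element.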
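One way to see what the flag does is to count the inputs found under the form before and after removing it. The sketch below does that on a tiny stand-in document; `FormFlagDemo` and `CountInputsUnderForm` are made-up names, and the first number printed depends on which Html Agility Pack version is installed, since the default flags have not been the same in every release.

```csharp
using System;
using HtmlAgilityPack;

static class FormFlagDemo
{
    // Counts the input elements that the parser nests under a form element.
    public static int CountInputsUnderForm(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var inputs = doc.DocumentNode.SelectNodes("//form//input");
        return inputs == null ? 0 : inputs.Count; // null means no match
    }

    static void Main()
    {
        // Tiny stand-in for the question's form: one direct child input, one nested input.
        string html =
            "<form action='actionURL'>" +
            "<input type='hidden' name='.tries' value='1'>" +
            "<div><input name='login' value=''></div>" +
            "</form>";

        // Parsed with the library's default element flags.
        Console.WriteLine("Default flags: " + CountInputsUnderForm(html));

        // ElementsFlags is static, so this affects every document loaded afterwards.
        HtmlNode.ElementsFlags.Remove("form");
        Console.WriteLine("After Remove(\"form\"): " + CountInputsUnderForm(html));
    }
}
```

Because `ElementsFlags` is a static dictionary, the removal is process-wide; doing it once at startup is usually enough.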



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow