How can I use VB.NET to read and print the inner HTML text of every HTML label on a web page?

html html-agility-pack innerhtml labels vb.net

Question

So I have the HTML Agility Pack.

I am attempting to read a web page's HTML. I need the contents of a label but am unsure how to obtain it.

I know what the label's for attribute is, but I don't know how to use it to get the label's inner HTML.

Can anyone help, please?

Private Sub SetTextboxText(ByVal Text As String)
    DirectCast(GetCurrentWebForm.item("frmLogin:strCustomerLogin_userID"), mshtml.HTMLInputElement).value = ""
    DirectCast(GetCurrentWebForm.item("frmLogin:strCustomerLogin_pwd"), mshtml.HTMLInputElement).value = ""
    ClickNormalButton()
    Memorable_Reader()
End Sub

'Gets and Sets Memorable Information
Private Sub Memorable_Reader()
    'Read Label 'For' Attribute
    'Display Innerhtml Text in msgbox
End Sub

'CLICKS THE SUBMIT BUTTON
Private Sub ClickNormalButton()
    GetCurrentWebForm.submit()
End Sub

Update:

Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    WebBrowser1.Navigate("https://online.lloydsbank.co.uk/personal/logon/login.jsp?WT.ac=PLO0512")
    Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
    htmlDoc.LoadHtml(WebBrowser1.DocumentText)
    Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='frmLogin:strCustomerLogin_userID']")
    Dim labelText = ""
    If labelElement IsNot Nothing Then
        labelText = labelElement.InnerText
    End If

    MsgBox(labelText) ' <---- Comes out with nothing, i.e. ""
    MsgBox(labelElement.InnerText) ' <---- Same as above
End Sub

Accepted Answer

First look at this simple example:

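Dim htmlString = "<form><label for='something'>text text</label></form>"
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
' Parse the string; without this call the document is empty and nothing is found
htmlDoc.LoadHtml(htmlString)
' Find the label whose for attribute matches
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If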

Now the labelText variable contains "text text".
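
Since the question title asks about every label on the page rather than a single one, here is a minimal sketch of that case. It assumes the HtmlAgilityPack package is referenced by the project; the sample markup, the for values and the module name LabelDump are made up for illustration. SelectNodes returns Nothing rather than an empty collection when no node matches, so the result is checked before looping.

```vbnet
Imports System
Imports HtmlAgilityPack

Module LabelDump
    Sub Main()
        ' Hypothetical sample markup, used only for illustration.
        Dim htmlString = "<form><label for='user'>User ID</label><label for='pwd'>Password</label></form>"

        Dim htmlDoc As New HtmlDocument()
        htmlDoc.LoadHtml(htmlString)

        ' "//label" matches every label element anywhere in the document.
        ' SelectNodes returns Nothing (not an empty collection) when there is no match.
        Dim labels = htmlDoc.DocumentNode.SelectNodes("//label")
        If labels Is Nothing Then
            Console.WriteLine("No labels found.")
            Return
        End If

        For Each labelNode As HtmlNode In labels
            ' GetAttributeValue falls back to the given default if the attribute is missing.
            Dim target = labelNode.GetAttributeValue("for", "")
            Console.WriteLine(target & ": " & labelNode.InnerText.Trim())
        Next
    End Sub
End Module
```

In a Windows Forms application, Console.WriteLine could be swapped for MsgBox or MessageBox.Show to display each label's text.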

And here is an example of loading the HTML from a given link using WebClient:

Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
Dim webClient As New System.Net.WebClient
Dim html As String = ""
' Add your web page link here
html = webClient.DownloadString("http://yourlink.com/")
htmlDoc.LoadHtml(html)
' Replace "something" with the for attribute value of your label
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If

Update: Since you said you already have the page open in a WebBrowser control, use its DocumentText property to get the HTML text, as follows:

Dim htmlDoc As New HtmlAgilityPack.HtmlDocument
htmlDoc.LoadHtml(WebBrowser1.DocumentText)
Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='something']")
Dim labelText = ""
If labelElement IsNot Nothing Then
    labelText = labelElement.InnerText
End If

Update: An example of how to get the HTML string from the WebBrowser control:

Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        WebBrowser1.Navigate("https://www.google.com")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show(WebBrowser1.DocumentText)
    End Sub
End Class
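
Putting the two pieces together: Navigate returns before the page has finished loading, so reading DocumentText immediately afterwards (as in the question's Form1_Load) most likely yields an empty or incomplete document, which would explain the empty message box. Below is a sketch that moves the parsing into the DocumentCompleted handler. It assumes a Windows Forms project with a WebBrowser control named WebBrowser1 placed on Form1 by the designer and a reference to HtmlAgilityPack; the URL and the for value are the ones from the question.

```vbnet
Imports System
Imports System.Windows.Forms

Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ' Navigate only starts loading; DocumentText is not ready at this point.
        WebBrowser1.Navigate("https://online.lloydsbank.co.uk/personal/logon/login.jsp?WT.ac=PLO0512")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' The page has loaded, so the HTML can now be handed to the parser.
        Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
        htmlDoc.LoadHtml(WebBrowser1.DocumentText)

        Dim labelElement = htmlDoc.DocumentNode.SelectSingleNode("//label[@for='frmLogin:strCustomerLogin_userID']")
        If labelElement Is Nothing Then
            MessageBox.Show("Label not found.")
        Else
            MessageBox.Show(labelElement.InnerText.Trim())
        End If
    End Sub
End Class
```

If the label is still not found, it may be that the page builds it with script after loading, in which case the downloaded source in DocumentText would not contain it; that is a guess, not something the question confirms.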



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why