I have searched for examples and tried many things, but nothing seems to work. I am using HtmlAgilityPack and I want to get the inner text between two specific tags.
Example:
<br>Terms of Service<br></br>Developers<br>
I want to put the inner text between the first <br> and the following <br> into label1, and the text between </br> and the last <br> into label2, so that the result is:
Label1.Text = "Terms of Service"
Label2.Text = "Developers"
How do I achieve this? PS: I am not very familiar with HtmlAgilityPack, so code that shows how to do this would help a lot. :-)
Thank you
This is a bit dirty, but it should work.
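Imports System.Text.RegularExpressions
Dim mystring As String = "<br>Terms of Service<br></br>Developers<br>"
' Lazy match: the shortest text that sits between two <br> tags.
Dim pattern1 As String = "(?<=<br>)(.*?)(?=<br>)"
' Greedy match: the text after </br>, up to the last <br>.
Dim pattern2 As String = "(?<=</br>)(.*)(?=<br>)"
Dim m1 As MatchCollection = Regex.Matches(mystring, pattern1)
Dim m2 As MatchCollection = Regex.Matches(mystring, pattern2)
' The first match of each pattern holds the wanted text.
MsgBox(m1(0).Value) ' Terms of Service
MsgBox(m2(0).Value) ' Developers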
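The two patterns above are case-sensitive and tied to this exact string. As a rough sketch of a more general variant, the input could instead be split on every <br>, </br> or <br/> tag, keeping only the non-empty pieces. The module and function names (BrTextExtractor, ExtractBrSegments) are made up for illustration and do not come from the answer above.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module BrTextExtractor
    ' Hypothetical helper: splits the input on every <br>, </br>, <br/> or
    ' <br /> tag (ignoring case) and returns the non-empty text pieces in
    ' document order.
    Function ExtractBrSegments(ByVal html As String) As List(Of String)
        Dim segments As New List(Of String)
        For Each piece As String In Regex.Split(html, "</?br\s*/?>", RegexOptions.IgnoreCase)
            Dim trimmed As String = piece.Trim()
            If trimmed.Length > 0 Then
                segments.Add(trimmed)
            End If
        Next
        Return segments
    End Function

    Sub Main()
        Dim parts As List(Of String) = ExtractBrSegments("<BR>Terms of Service<br></br>Developers<br>")
        ' Prints each recovered text piece on its own line.
        For Each part As String In parts
            Console.WriteLine(part)
        Next
    End Sub
End Module
```

With the list in hand, Label1.Text and Label2.Text can be assigned from parts(0) and parts(1), after checking that the list really contains two entries.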
The short answer is that HAP is not well suited to your task. My notes are below:
Imports HtmlAgilityPack
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim mystring As String = "<BR>Terms of Service<BR></BR>Developers<BR>"
        Dim myDoc As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument
        myDoc.LoadHtml(mystring)
        ' Here we notice that HAP immediately discards the junk tag </br>.
        MsgBox(myDoc.DocumentNode.OuterHtml)
        ' Below we notice that HAP did not close the BR tag, because it only
        ' attempts to close certain nested tags associated with tables
        ' (th, tr, td) and lists (li).
        ' If this were a supported tag that HAP could fix, the fixed output
        ' would be as follows:
        ' <br>Terms of Service<br></br>Developers<br></br></br>
        ' This string would be parsed as if the last tag closes the first and
        ' each set of inner tags closes itself without any text in between.
        ' This means that even if you changed BR to TD, or some other tag HAP
        ' fixes nesting on, it still would not help to parse this correctly.
        ' Also, HAP does not appear to support XHTML in this .NET 2.0 version.
        myDoc.OptionFixNestedTags = True
        MsgBox(myDoc.DocumentNode.OuterHtml)
        ' Here we put the BR tags into a collection. As we iterate through
        ' them, we notice there is no inner text on any BR tag, presumably
        ' for two reasons:
        ' 1. HAP will not close a BR.
        ' 2. It does not fix your broken nested tags as you expect or require.
        ' The XPath uses lower case because HAP normalizes element names to lower case.
        Dim myBR As HtmlNodeCollection = myDoc.DocumentNode.SelectNodes("//br")
        If Not myBR Is Nothing Then
            For Each br As HtmlNode In myBR
                MsgBox(br.InnerText)
            Next
        End If
    End Sub
End Class
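Since the BR elements themselves carry no inner text, one possible workaround is to read the text nodes that sit next to them. The sketch below assumes that HAP parses the loose text as top-level text nodes alongside the BR elements, which has not been verified against the .NET 2.0 build discussed here. HapTextNodes and ExtractTextNodes are illustrative names, not part of HAP.

```vb
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module HapTextNodes
    ' Hypothetical helper: collects the non-empty text nodes that are direct
    ' children of the document node. It does not descend into nested
    ' elements, so it only suits flat markup like the sample string.
    Function ExtractTextNodes(ByVal html As String) As List(Of String)
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)
        Dim result As New List(Of String)
        For Each node As HtmlNode In doc.DocumentNode.ChildNodes
            If node.NodeType = HtmlNodeType.Text Then
                ' Decode entities such as &amp; and drop surrounding whitespace.
                Dim value As String = HtmlEntity.DeEntitize(node.InnerText).Trim()
                If value.Length > 0 Then
                    result.Add(value)
                End If
            End If
        Next
        Return result
    End Function

    Sub Main()
        For Each value As String In ExtractTextNodes("<BR>Terms of Service<BR></BR>Developers<BR>")
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

If the assumption holds, the first and second entries of the returned list are the values intended for Label1 and Label2. This avoids relying on HAP to repair the tag nesting at all.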