Good day. I have a task where I need to convert a Word document to HTML.
This can be done with Interop by saving the document as HTML, but I need to clean up the HTML that Interop produces.
However, I have a problem with HtmlAgilityPack. I thought it worked much like XmlDocument in C#.
This is my C# code:
HtmlDocument doc = new HtmlDocument();
doc.Load(htmlLocation);
foreach (var item in doc.DocumentNode.Descendants("p"))
{
    if (item.HasChildNodes)
    {
        foreach (var itm in item.Descendants("span").ToList())
        {
            Console.WriteLine(itm.InnerText);
        }
    }
}
This is the HTML:
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=Generator content="Microsoft Word 12 (filtered)">
</head>
<body lang=EN-US link="#0066CC" vlink=purple style='text-justify-trim:punctuation'>
<div class=WordSection1>
<p class=Heading61 style='margin-bottom:0in;margin-bottom:.0001pt;text-indent:
.5in;line-height:normal;page-break-after:avoid;background:transparent'><span
class=Heading6><span style='font-size:12.0pt;color:black;background:yellow'>Epilogue</span></span></p>
<p class=MsoBodyText style='line-height:normal;background:transparent'><span
class=BodytextItalic2><span style='font-size:12.0pt;color:black;font-style:
normal'> </span></span></p>
<p class=MsoBodyText style='line-height:normal;background:transparent'><span
class=BodytextItalic2><span style='font-size:12.0pt;color:black;font-style:
normal'>Rebecca sat outside her lodge cradling her infant son in her arms. How
handsome he was, her little warrior, with his dusky skin and thick black hair.
For the first few days after his birth, she had been afraid to let him out of
her sight, out of her arms, for fear she would lose him, but he was a strong
healthy child.</span></span></p>
<p class=MsoBodyText style='text-indent:.5in;line-height:normal;background:
transparent'><span class=BodytextItalic2><span style='font-size:12.0pt;
color:black;font-style:normal'>Looking at him made her heart swell with love
for him and for his father. She had married Wolf Dreamer the day after they
returned to his people. Summer Moon Rising had left the village the following
day.</span></span></p>
</div>
</body>
</html>
This is the output of the code above:
Epilogue
Epilogue
Rebecca sat outside her lodge cradling her infant son in her arms. How
handsome he was, her little warrior, with his dusky skin and thick black hair.
For the first few days after his birth, she had been afraid to let him out of
her sight, out of her arms, for fear she would lose him, but he was a strong
healthy child.
Rebecca sat outside her lodge cradling her infant son in her arms. How
handsome he was, her little warrior, with his dusky skin and thick black hair.
For the first few days after his birth, she had been afraid to let him out of
her sight, out of her arms, for fear she would lose him, but he was a strong
healthy child.
Looking at him made her heart swell with love
for him and for his father. She had married Wolf Dreamer the day after they
returned to his people. Summer Moon Rising had left the village the following
day.
Looking at him made her heart swell with love
for him and for his father. She had married Wolf Dreamer the day after they
returned to his people. Summer Moon Rising had left the village the following
day.
What I expected is that the second foreach only depends on the elements of the current item. So why does it repeat the text?
You have 4 p tags, and each p tag contains two spans, one nested inside the other. Descendants returns all descendant nodes with a matching name at any depth, so your inner foreach runs once for the outer span and once for the inner span. Both have the same InnerText, which is why every paragraph is printed twice.
Your inner foreach could be:
foreach (var itm in item.ChildNodes)
{
    Console.WriteLine(itm.InnerText);
}
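Note that ChildNodes also returns text nodes, so if the markup had whitespace between the p tag and its span you would get extra blank lines. A minimal sketch of a stricter variant, assuming you only want the span elements that sit directly under each p: HtmlNode.Elements("span") returns direct child elements with that name only. The class name SpanText, the helper DirectSpanTexts, and the cut-down HTML string are made up for illustration and are not part of your code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SpanText
{
    // Hypothetical helper: returns the text of every <span> that is a
    // direct child of a <p>. Nested spans are not visited separately.
    public static List<string> DirectSpanTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("p")                    // every <p> at any depth
            .SelectMany(p => p.Elements("span")) // direct child spans only
            .Select(span => span.InnerText)      // includes the nested span's text
            .ToList();
    }

    public static void Main()
    {
        // Cut-down stand-in for the Word output: each <p> wraps two nested spans.
        const string html =
            "<div><p><span class=Heading6><span>Epilogue</span></span></p>" +
            "<p><span class=BodytextItalic2><span>Rebecca sat outside her lodge.</span></span></p></div>";

        foreach (var text in DirectSpanTexts(html))
        {
            Console.WriteLine(text);
        }
    }
}
```

With this, each paragraph's text should be written once, because only the outer span of each p is enumerated. If all you need is the text of each paragraph, printing item.InnerText for the p itself and dropping the inner loop would also do.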