I have a question about Chinese encoding and saving back to a file. I am currently using HtmlAgilityPack to parse HTML, modify it, and save it back to the same file. I am having a problem with encodings such as Chinese (GB2312, Simplified). When I open the file, I read the encoding and save the document back using HtmlAgilityPack,
but the Chinese characters come out completely garbled. Any ideas on how I can save back to the same file and keep its current encoding? I also tried getting the encoding from HtmlAgilityPack like this:
    FileStream fs = new FileStream(this._filePath, FileMode.Open);
    StreamReader reader = new StreamReader(fs);
    HtmlDocument doc = new HtmlDocument();
    doc.Load(reader);
    // Encoding named in the document's meta tag
    Encoding enc = doc.DeclaredEncoding;
    fs.Close();
    // Write back to the same path using that encoding
    doc.Save(this._filePath, enc);
but that didn't work either. Any ideas?
So after some work, I got it working by reading the declared encoding out of the meta tag. I thought the tag was badly formed at first, but it was actually correct, and DeclaredEncoding did read the encoding from it.
When the file was saved, it was still saved in ANSI format, and I couldn't find a way to change that. However, the encoding in the meta tag did seem to keep the file rendering correctly in the browser. Hope that helps someone.
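For anyone landing here later, here is a minimal sketch of one way to round-trip the file in its declared encoding. One thing worth noting about the snippet in the question: `new StreamReader(fs)` decodes as UTF-8 by default, so GB2312 bytes may already be damaged before HtmlAgilityPack sees them. The sketch below instead asks HtmlAgilityPack to detect the encoding first and then loads and saves with it. The class name `HtmlEncodingRoundTrip`, the method name `Resave`, and the UTF-8 fallback are my own assumptions for illustration, not something from the original code, and I have not tested this against every kind of page.

```csharp
using System.Text;
using HtmlAgilityPack;

// Hypothetical helper, named for illustration only.
public static class HtmlEncodingRoundTrip
{
    public static void Resave(string filePath)
    {
        // On .NET Core / .NET 5+, code-page encodings such as GB2312 are only
        // available after registering this provider. On .NET Framework they
        // are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Ask HtmlAgilityPack for the encoding declared in the file.
        // DetectEncoding returns null when none is found, so fall back to
        // UTF-8 (an assumption; pick whatever default suits your files).
        var probe = new HtmlDocument();
        Encoding enc = probe.DetectEncoding(filePath) ?? Encoding.UTF8;

        // Load with the detected encoding so the bytes are decoded correctly.
        var doc = new HtmlDocument();
        doc.Load(filePath, enc);

        // ... modify the document here ...

        // Save with the same encoding so it still matches the meta tag.
        doc.Save(filePath, enc);
    }
}
```

Usage would be something like `HtmlEncodingRoundTrip.Resave(this._filePath);`. The idea is simply that the same `Encoding` instance is used for both reading and writing, so the bytes on disk stay consistent with what the meta tag declares.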