ValleyHope:
Is there a way to call the codeCleaner('Word') function directly instead of calling CleanCode from the toolbar?
I tried using your regex function and it still left a lot of Word formatting junk in the saved content. Surely your codeCleaner function is more robust than the regex function you provided.
Here is my version of the regex function in C#:
// Requires: using System.Text.RegularExpressions;
public static string CleanWordHtml(string html)
{
    string cleanstring = html;

    // Remove <?xml ...> declarations and <xml> / </xml> tags.
    cleanstring = Regex.Replace(cleanstring,
        @"</?\??xml[^>]*>", "",
        RegexOptions.IgnoreCase);

    // Remove Word-specific mso-* style declarations.
    cleanstring = Regex.Replace(cleanstring,
        @"\s*mso-[^:]+:[^;""]+;?", "",
        RegexOptions.IgnoreCase);

    // Remove namespaced tags such as <o:p> or <w:WordDocument>.
    cleanstring = Regex.Replace(cleanstring,
        @"</?\w+:[^>]*>", "",
        RegexOptions.IgnoreCase);

    // Remove HTML comments. The lazy match stops at the first "-->",
    // and Singleline lets '.' cross line breaks in multi-line comments.
    cleanstring = Regex.Replace(cleanstring,
        @"<!--.*?-->", "",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replace curly double quotes with one straight double quote.
    cleanstring = Regex.Replace(cleanstring,
        @"[“”]", "\"");

    // Replace curly single quotes with a straight apostrophe.
    cleanstring = Regex.Replace(cleanstring,
        @"[‘’]", "'");

    // Collapse spans that contain only a non-breaking space.
    // (The &nbsp; entity appears to have been lost when the code was pasted.)
    cleanstring = Regex.Replace(cleanstring,
        @"<span\s*[^>]*>\s*&nbsp;\s*</span>", "&nbsp;",
        RegexOptions.IgnoreCase);

    // Remove empty spans.
    cleanstring = Regex.Replace(cleanstring,
        @"<span\s*[^>]*></span>", "",
        RegexOptions.IgnoreCase);

    // Remove elements hidden with display:none.
    cleanstring = Regex.Replace(cleanstring,
        @"<(\w+)[^>]*\sstyle=""[^""]*DISPLAY\s?:\s?none(.*?)</\1>", "",
        RegexOptions.IgnoreCase);

    return cleanstring;
}
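One possible reason Word junk survives a regex pass like this is the comment-stripping step: without RegexOptions.Singleline, '.' does not match a line break, so multi-line conditional comments are left in place. The sketch below compares a greedy single-line pattern with a lazy Singleline one. The class name, method names, and sample fragment are made up for illustration and are not part of the editor's API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentStripDemo
{
    // Greedy match without Singleline: '.' cannot cross '\n'.
    public static string StripCommentsGreedy(string html)
    {
        return Regex.Replace(html, @"<!--.*-->", "", RegexOptions.IgnoreCase);
    }

    // Lazy match with Singleline: stops at the first "-->" and spans line breaks.
    public static string StripCommentsLazy(string html)
    {
        return Regex.Replace(html, @"<!--.*?-->", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Hypothetical Word-style fragment with a multi-line conditional comment.
        string html = "<p>A</p><!--[if gte mso 9]>\n<xml>junk</xml>\n<![endif]-->"
                    + "<p>B</p><!-- note --><p>C</p>";

        // The multi-line comment survives here; only "<!-- note -->" is removed.
        Console.WriteLine(StripCommentsGreedy(html));

        // Prints: <p>A</p><p>B</p><p>C</p>
        Console.WriteLine(StripCommentsLazy(html));
    }
}
```

If the pasted content has no line breaks inside comments, the greedy form has the opposite problem: it can swallow everything between the first "<!--" and the last "-->", including real content.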
If you are using the .NET version, you can use the following method:
Editor.CleanUpMicrosoftWordHTML Method
Use the Clean Up Word HTML function to remove the extraneous HTML code generated by Microsoft Word.