Clean Up HTML on the server side

Last post 10-19-2009, 4:26 PM by Adam. 5 replies.
  •  09-23-2009, 6:47 PM 55827

    Clean Up HTML on the server side

     
    I found these in the help section but when I try to implement them in my code I get the following:
     
    Object doesn't support this property or method: 'CleanUpHTMLCode'
    Object doesn't support this property or method: 'CleanUpMicrosoftWordHTML'
     
     You can teach your end users to use the Clean Up HTML button in Cute Editor to remove extraneous tags and streamline your HTML code.

    But it's easy to make mistakes if your end users forget to use Clean Up HTML before saving the content into the database.

    Cute Editor provides two server-side methods which can fix these mistakes automatically and tidy up sloppy editing into nicely laid-out markup.

    You can use the Editor.CleanUpHTMLCode method to remove empty tags, combine nested FONT tags, and otherwise improve messy or unreadable HTML code.

    You can also use the Editor.CleanUpMicrosoftWordHTML method to remove the extraneous HTML code generated by Microsoft Word.
     
    Here's my object: 

    Dim editor
    Set editor = New CuteEditor

    editor.ID        = "event_comments"
    editor.FilesPath      = "/includes/components/CuteEditor_Files"
    editor.EditorWysiwygModeCss  = "/includes/components/CuteEditor_Files/style/text_editor.css"
    editor.AutoConfigure     = "Simple"
    editor.Width       = 733
    editor.Height       = 500
    editor.ThemeType       = "office2007"
    editor.CleanUpHTMLCode    = "true"
    editor.CleanUpMicrosoftWordHTML = "true"
    editor.UsePhysicalFormattingTags = "true"
    editor.Text        = theComments
    editor.ImageGalleryPath    = thePath
    editor.FlashGalleryPath    = thePath
    editor.MediaGalleryPath    = thePath
    editor.FilesGalleryPath    = thePath
    editor.TemplateGalleryPath   = thePath
    editor.Draw()

  •  09-24-2009, 10:16 AM 55836 in reply to 55827

    Re: Clean Up HTML on the server side

    You can use the following function:
     

    Function CleanUpHTMLCode(HTMLstring)

        Dim cleanstring, regex
        Set regex = New RegExp

        ' The same options apply to every pattern, so set them once.
        regex.IgnoreCase = True
        regex.Global = True

        cleanstring = HTMLstring

        ' Remove <xml>, </xml> and <?xml ...> tags.
        regex.Pattern = "<[\\/]?\??xml[^>]*>"
        cleanstring = regex.Replace(cleanstring, "")

        ' Remove mso-* style declarations written by Word.
        regex.Pattern = "\s*mso-[^:]+:[^;""]+;?"
        cleanstring = regex.Replace(cleanstring, "")

        ' Remove namespaced tags such as <o:p> and </o:p>.
        regex.Pattern = "<\/?\w+:[^>]*>"
        cleanstring = regex.Replace(cleanstring, "")

        ' Remove comments. [\s\S]*? also matches across line breaks
        ' and stops at the first closing -->.
        regex.Pattern = "<!--[\s\S]*?-->"
        cleanstring = regex.Replace(cleanstring, "")

        ' Replace curly double quotes with straight ones.
        regex.Pattern = "[”“]"
        cleanstring = regex.Replace(cleanstring, """")

        ' Replace curly single quotes with straight ones.
        regex.Pattern = "[‘’]"
        cleanstring = regex.Replace(cleanstring, "'")

        ' Collapse spans that only wrap a non-breaking space.
        regex.Pattern = "<span\s*[^>]*>\s*&nbsp;\s*<\/span>"
        cleanstring = regex.Replace(cleanstring, "&nbsp;")

        ' Remove empty spans.
        regex.Pattern = "<span\s*[^>]*><\/span>"
        cleanstring = regex.Replace(cleanstring, "")

        ' Remove elements hidden with display:none.
        regex.Pattern = "<(\w+)[^>]*\sstyle=""[^""]*DISPLAY\s?:\s?none[\s\S]*?<\/\1>"
        cleanstring = regex.Replace(cleanstring, "")

        CleanUpHTMLCode = cleanstring

        Set regex = Nothing
    End Function


    asp.net Chat http://cutesoft.net/ASP.NET+Chat/default.aspx
    Web Messenger: http://cutesoft.net/Web-Messenger/default.aspx
    asp.net wysiwyg editor: http://cutesoft.net/ASP.NET+WYSIWYG+Editor/default.aspx
    asp wysiwyg html editor: http://cutesoft.net/ASP
    asp.net Image Gallery: http://cutesoft.net/ASP.NET+Image+Gallery/default.aspx
    Live Support: http://cutesoft.net/live-support/default.aspx

  •  09-24-2009, 10:55 AM 55846 in reply to 55836

    Re: Clean Up HTML on the server side

    Thanks, but where do I add this function, and what about the 'CleanUpMicrosoftWordHTML' option?

  •  09-25-2009, 8:39 AM 55878 in reply to 55846

    Re: Clean Up HTML on the server side

    Big Kahuna,
     
    It's just a regular ASP function. You can use it anywhere in your ASP code.
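
    As a rough sketch of what that could look like, the function can live in an include file and the posted content can be run through it just before it is saved. The PrepareForSave wrapper and the assumption that the editor content is posted back in a form field named after editor.ID ("event_comments") are illustrative guesses, not something confirmed in this thread. A one-pass stand-in for CleanUpHTMLCode is included only so the snippet runs on its own.

```vbscript
' Stand-in for the CleanUpHTMLCode function posted earlier in this thread,
' reduced to a single pass so that this sketch is self-contained.
Function CleanUpHTMLCode(HTMLstring)
    Dim regex
    Set regex = New RegExp
    regex.IgnoreCase = True
    regex.Global = True
    ' Remove namespaced Word tags such as <o:p> and </o:p>.
    regex.Pattern = "<\/?\w+:[^>]*>"
    CleanUpHTMLCode = regex.Replace(HTMLstring, "")
    Set regex = Nothing
End Function

' Hypothetical wrapper: clean the submitted HTML just before it is stored.
Function PrepareForSave(postedHtml)
    If IsNull(postedHtml) Or IsEmpty(postedHtml) Then
        PrepareForSave = ""
    Else
        PrepareForSave = CleanUpHTMLCode(CStr(postedHtml))
    End If
End Function

' On the page that handles the form post (assumption: the editor content
' arrives in the form field named after editor.ID):
'   theComments = PrepareForSave(Request.Form("event_comments"))
'   ... save theComments to the database as before ...
```

    With this approach the two assignments editor.CleanUpHTMLCode = "true" and editor.CleanUpMicrosoftWordHTML = "true" from the first post would be dropped, since the error message suggests the classic ASP object does not expose them.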


  •  10-16-2009, 5:09 PM 56448 in reply to 55878

    Re: Clean Up HTML on the server side

    Is there a way to call the codeCleaner('Word') function instead of calling the CleanCode in the toolbar?
     
    I tried using your regex function and it still saved a lot of Word formatting junk. Surely your codeCleaner function is more robust than the regex function you provided.
     
    Here is my version of the regex function in C#:
     
    // Requires: using System.Text.RegularExpressions;
    public static string CleanWordHtml(string html)
    {
        const RegexOptions options = RegexOptions.IgnoreCase;
        string cleanstring = html;

        // Remove <xml>, </xml> and <?xml ...> tags.
        cleanstring = Regex.Replace(cleanstring, @"<[\\/]?\??xml[^>]*>", "", options);

        // Remove mso-* style declarations written by Word.
        cleanstring = Regex.Replace(cleanstring, @"\s*mso-[^:]+:[^;""]+;?", "", options);

        // Remove namespaced tags such as <o:p> and </o:p>.
        cleanstring = Regex.Replace(cleanstring, @"<\/?\w+:[^>]*>", "", options);

        // Remove comments, including ones that span several lines.
        cleanstring = Regex.Replace(cleanstring, @"<!--[\s\S]*?-->", "", options);

        // Replace curly double and single quotes with straight ones.
        cleanstring = Regex.Replace(cleanstring, @"[”“]", "\"", options);
        cleanstring = Regex.Replace(cleanstring, @"[‘’]", "'", options);

        // Collapse spans that only wrap a non-breaking space.
        cleanstring = Regex.Replace(cleanstring, @"<span\s*[^>]*>\s*&nbsp;\s*<\/span>", "&nbsp;", options);

        // Remove empty spans.
        cleanstring = Regex.Replace(cleanstring, @"<span\s*[^>]*><\/span>", "", options);

        // Remove elements hidden with display:none.
        cleanstring = Regex.Replace(cleanstring,
            @"<(\w+)[^>]*\sstyle=""[^""]*DISPLAY\s?:\s?none[\s\S]*?<\/\1>", "", options);

        return cleanstring;
    }
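
    One possible reason some Word markup survives is that none of these patterns touch attributes such as class=MsoNormal, style attributes left empty once the mso- declarations are gone, or the meta and link tags Word puts at the top of pasted content. The sketch below adds a few such passes in VBScript, to match the function earlier in the thread. The function name and the choice of patterns are assumptions for illustration only. This is not CuteSoft's codeCleaner.

```vbscript
' Hypothetical extra passes to run after CleanUpHTMLCode.
Function StripWordLeftovers(html)
    Dim regex, result
    Set regex = New RegExp
    regex.IgnoreCase = True
    regex.Global = True
    result = html

    ' <meta ...> and <link ...> tags, e.g. <meta name=Generator ...>.
    ' Note: this removes every meta and link tag, not only Word's.
    regex.Pattern = "<(meta|link)\b[^>]*>"
    result = regex.Replace(result, "")

    ' class attributes generated by Word, quoted or unquoted,
    ' e.g. class=MsoNormal or class="MsoListParagraph".
    regex.Pattern = "\sclass=(""Mso[^""]*""|'Mso[^']*'|Mso\w+)"
    result = regex.Replace(result, "")

    ' style attributes that contain nothing but whitespace.
    regex.Pattern = "\sstyle=(""\s*""|'\s*')"
    result = regex.Replace(result, "")

    StripWordLeftovers = result
    Set regex = Nothing
End Function
```

    It would be chained as StripWordLeftovers(CleanUpHTMLCode(html)). Regex-based cleanup stays best-effort, so output should still be checked against real pasted documents.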
  •  10-19-2009, 4:26 PM 56496 in reply to 56448

    Re: Clean Up HTML on the server side

    ValleyHope:
    is there a way to call the codeCleaner('Word') function instead of calling the CleanCode in the Toolbar. 
     
    I tried using your regex function and it still saved a lot of word formatting junk.  Surely your codeCleaner function is more robust than the regex function you provided.
     
     
    If you are using the .NET version, you can use the following method:
     

    Editor.CleanUpMicrosoftWordHTML Method 

    Use the Clean Up Word HTML function to remove the extraneous HTML code generated by Microsoft Word.


