Cleaning up Word HTML

If you are like me, you occasionally have people who want to paste directly from Word into a CMS or WordPress. I was bemoaning the fact that you either end up with Word HTML or lose all the formatting, when a colleague of mine, Steven Miles, suggested I use JavaScript to clean up the HTML and provided some code.

So this is what I created to clean up Word HTML: you paste the HTML into the editable div, hit the button, and if you are using Internet Explorer the converted HTML is copied to your clipboard. You can then paste the clean HTML (note it is HTML code) straight into the CMS editor in HTML mode or into WordPress.
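The page's own script is not reproduced here. As a rough sketch of the kind of clean-up involved, the function below strips the markup Word typically adds. The name `cleanWordHtml` and the exact list of rules are assumptions for illustration, not the code behind the button, and regular expressions like these are only a first pass: they can also touch ordinary text that happens to look like an attribute.

```javascript
// Hypothetical helper: a first-pass clean-up of HTML pasted from Word.
// The rule list is an assumption; extend or trim it for your own content.
function cleanWordHtml(html) {
  return html
    // Remove HTML comments, including Word's <!--[if gte mso 9]>...<![endif]--> blocks
    .replace(/<!--[\s\S]*?-->/g, "")
    // Remove bare conditional markers such as <![if !supportLists]> and <![endif]>
    .replace(/<!\[(if|endif)[^\]]*\]>/gi, "")
    // Remove <style>, <script> and <xml> blocks together with their contents
    .replace(/<(style|script|xml)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // Remove <meta> and <link> tags
    .replace(/<(meta|link)\b[^>]*>/gi, "")
    // Remove namespaced tags such as <o:p> and </o:p>, keeping any text inside them
    .replace(/<\/?\w+:[^>]*>/g, "")
    // Remove class, style and lang attributes (quoted or unquoted values)
    .replace(/\s+(class|style|lang)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, "")
    // Unwrap <span> and <font> tags, keeping their contents
    .replace(/<\/?(span|font)\b[^>]*>/gi, "")
    // Turn non-breaking space entities into ordinary spaces
    .replace(/&nbsp;/gi, " ")
    // Drop paragraphs that are now empty
    .replace(/<p>\s*<\/p>/gi, "")
    .trim();
}
```

Calling `cleanWordHtml` on the contents of the editable div would give a string that can be shown in a second box or passed on to the clipboard step.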

Note that while you can use JavaScript to add to the clipboard in Internet Explorer, other browsers need Flash to copy the contents to the clipboard. I have not done that yet, mainly because Internet Explorer is the corporate browser and this was created for work.

So have a look under the hood, see how it works, take the code and modify it for your own use. It appears to be working for what I need, but it may need to be modified for your situation.
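For anyone modifying the code, the Internet Explorer clipboard step can be sketched roughly as follows. It relies on IE's `window.clipboardData.setData("Text", ...)` call. The function name `copyToClipboard` and the choice to pass the window object in as a parameter are assumptions made so the sketch can be tried outside a browser; this is not the page's actual script.

```javascript
// Hypothetical helper: try the IE-only clipboard API and report whether it worked.
// `win` is the browser's window object (passed in so the function is easy to test).
function copyToClipboard(win, text) {
  // window.clipboardData only exists in Internet Explorer
  if (win && win.clipboardData && typeof win.clipboardData.setData === "function") {
    // setData returns true when the text was placed on the clipboard
    return Boolean(win.clipboardData.setData("Text", text));
  }
  // Other browsers: the caller should leave the cleaned HTML on the page
  // so it can be selected and cut by hand
  return false;
}
```

When the function returns `false`, the page can show an alert and leave the processed text in a box for manual copying.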

*** does not currently work in IE8, I need to investigate further ***

2 Responses to “Cleaning up Word HTML”

  1. Kim Says:

    Went to use your cleaner but keep coming up with errors on the page.
    The following text appears at the top of the page and I can’t paste or type in the entry field.
    I am using IE 8. The page opens fine in Firefox but won’t process the text.

    … 5 October 2009 Nick cut the JavaScript out of the comment, as it caused the comment to get stuck in the spam queue …

  2. Nick Says:

    Oops! Serves me right for not checking in IE8 (unavailable at work). I will attempt to fix that in the next couple of weeks. Currently hiding out in Sydney with a MacBook and an iPhone to connect to the web.

    Kim, you can use Firefox or Safari to process the text: paste it into the first box, hit the button, hit return to dismiss the alert about not using IE, and then cut the processed text from the second box. It is not automatically placed in your clipboard.