Many non-technical users copy content from a Word document, paste it into an HTML form (<textarea></textarea>), and expect it to retain its formatting as well as special characters such as smart quotes and em dashes. You may opt to translate smart quotes to regular quotes and em dashes to regular dashes with a PHP script. But if a user submits content containing non-ASCII characters, you’ll probably see weird characters in the database and on the HTML page. Finding and fixing just a few of them (curly quotes and em dashes) isn’t going to solve the real problem.
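To make the "translate a few characters" approach concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration; the same mapping could be passed to PHP's strtr(). The table SMART_PUNCTUATION and the function to_plain_punctuation are hypothetical names, and the list of characters is an assumption about what users typically paste, not a complete inventory.

```python
# Hypothetical mapping of common Word "smart" punctuation to ASCII.
# It only covers the characters listed here; anything else passes through.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (also used as apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(SMART_PUNCTUATION)


def to_plain_punctuation(text: str) -> str:
    """Replace the mapped smart punctuation with ASCII equivalents."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    pasted = "\u201cIt\u2019s done\u201d \u2014 caf\u00e9"
    # The quotes and dash become ASCII, but the accented letter is untouched.
    print(to_plain_punctuation(pasted))
```

Note that the accented letter in the sample survives unchanged, which illustrates the limitation described above: a fixed list of replacements handles only the characters someone remembered to add.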
How do you go about resolving this problem? One way is to teach users to convert the special characters into ASCII text before submitting them through the form. To convert special characters, they may use either of the following methods.
1. Save the Word document as an HTML document. Microsoft Word has an option to save a .DOC document as an .HTML file. Select the contents of the HTML document, copy them, and paste them into the HTML form.
2. Copy the Word contents and paste them into Notepad; then select the same contents in Notepad, copy them, and paste them into the HTML form.
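Both methods depend on the user remembering to do them. As a rough server-side counterpart, and only as a sketch, the submitted text could be reduced to ASCII by replacing every non-ASCII character with an HTML numeric character reference. This is one possible reading of "convert the special characters into ASCII text", not something the methods above prescribe. The function name to_ascii_with_references is hypothetical, and the example is in Python for illustration.

```python
def to_ascii_with_references(text: str) -> str:
    """Return ASCII-only text, using numeric character references
    (for example &#8220;) in place of each non-ASCII character.

    Assumption: the result will be displayed in an HTML page, where a
    browser renders the references as the original characters. This
    does not escape <, > or &, which needs separate handling.
    """
    # The built-in "xmlcharrefreplace" error handler does the substitution.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    pasted = "\u201cHi\u201d caf\u00e9"
    print(to_ascii_with_references(pasted))
```

Unlike a fixed replacement list, this handles any non-ASCII character, at the cost of storing references rather than the characters themselves.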