Weeding out UTF-8 characters using JavaScript


websmith99
October 18th, 2002, 06:48 PM
My application has a bug that affects IE users when they cut and paste UTF-8 characters, such as the "smart quotes" used by Microsoft Word, into a form field and then submit the form.

IE has a bug where it does not send the form data correctly when these UTF-8 characters are present, and this results in an application exception.

I was wondering if it is possible in JavaScript to detect these UTF-8 characters in the form elements in an onSubmit function and either remove them from the string or replace them with an approximation. For example, if I come across the "opening smart double quote" I would replace it with a standard double quote.

Can you detect these special characters using JavaScript?
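
The onSubmit idea above could be structured roughly as follows. This is only a sketch: `cleanFormFields` is a hypothetical helper name, not a built-in, and the actual cleaning rule is passed in as the `cleaner` function so it can be swapped out. It assumes `form` behaves like a normal form object with an `elements` collection.

```javascript
// Hypothetical helper: apply `cleaner` to every text field and textarea in a form.
// `form` is expected to look like an HTML form object (an `elements` collection
// whose items have `type` and `value` properties).
function cleanFormFields(form, cleaner) {
  var changed = 0;
  for (var i = 0; i < form.elements.length; i++) {
    var el = form.elements[i];
    // Only touch free-text controls; skip buttons, checkboxes, selects, etc.
    if (el.type === "text" || el.type === "textarea") {
      var cleaned = cleaner(el.value);
      if (cleaned !== el.value) {
        el.value = cleaned;
        changed++;
      }
    }
  }
  return changed; // number of fields that were modified
}
```

It could then be wired up with something like onsubmit="cleanFormFields(this, myCleaner); return true;" on the form tag, where myCleaner stands for whatever replacement function is chosen.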

websmith99
October 24th, 2002, 06:22 PM
Well, I've gotten this to work for the special cases of the following UTF-8 characters:

opening single smart quote   ‘   (&#8216;)
closing single smart quote   ’   (&#8217;)
opening double smart quote   “   (&#8220;)
closing double smart quote   ”   (&#8221;)

But I did this by simply cutting and pasting these characters from MS Word into the textarea and then copying them into the JavaScript in my Windows-based text editor. So I am comparing each character in the textarea against these hard-coded characters.
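
A minimal sketch of that hard-coded comparison is below. It is not the exact code from my application: the names `SMART_QUOTES` and `replaceSmartQuotes` are made up for illustration, and the characters are written as \u escapes (an assumption about how one might avoid pasting them literally) so the script file itself stays plain ASCII.

```javascript
// Map of the four smart quotes to plain ASCII quotes,
// written as \u escapes so the source file stays pure ASCII.
var SMART_QUOTES = {
  "\u2018": "'",  // opening single smart quote
  "\u2019": "'",  // closing single smart quote
  "\u201C": '"',  // opening double smart quote
  "\u201D": '"'   // closing double smart quote
};

// Walk the string one character at a time and swap any smart quote
// for its ASCII approximation; every other character is copied as-is.
function replaceSmartQuotes(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    out += SMART_QUOTES.hasOwnProperty(ch) ? SMART_QUOTES[ch] : ch;
  }
  return out;
}
```

The obvious limitation is that only the four listed characters are handled; anything else non-ASCII passes through untouched.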

However, I'm looking for a generic approach to detect any non-ASCII character in text fields and textareas. Ideally there would be a JavaScript method called entity() or unentity() for HTML entities, similar to escape() and unescape() for URL encoding, that I could use to compare each character against the ASCII HTML entities by looping through them. I am unaware of anything like this.

I don't want to hard-code comparisons for everything between ! (&#33;) and ~ (&#126;) and between ¡ (&#161;) and ÿ (&#255;)!
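
One possible generic approach, sketched here under the assumption that "non-ASCII" simply means any character code above 127, is to test the numeric code of each character instead of comparing against a list. `findNonAscii` and `stripNonAscii` are hypothetical names; `charCodeAt()` and `replace()` are standard String methods.

```javascript
// Returns the positions and character codes of every non-ASCII character.
// charCodeAt() returns the UTF-16 code unit; ASCII is 0 through 127.
function findNonAscii(text) {
  var hits = [];
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code > 127) {
      hits.push({ index: i, code: code });
    }
  }
  return hits;
}

// Replaces every non-ASCII character with `replacement`
// (defaults to the empty string, i.e. the character is removed).
function stripNonAscii(text, replacement) {
  if (replacement === undefined) replacement = "";
  return text.replace(/[^\x00-\x7F]/g, replacement);
}
```

For example, stripNonAscii(field.value, "?") would turn every character outside the ASCII range into a question mark. This does not give a sensible approximation for each character, so it would probably be combined with a small lookup table for the common cases such as smart quotes.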