Graham TerMarsch wrote:
> I've run into an issue on one of the projects that I'm working on and thought
> that I'd ping the list to see how others are handling this...
Lucky you, I just spent a few weeks fighting with this at $work and on Krang :)
> The app accepts form data from the user, runs it through Data::FormValidator
> to validate it, then stuffs it into our PostgreSQL database. We're expecting
> users are going to cut/paste from MS-Word and as a result we're going to have
> to deal with MS "smart quotes".
>
> My issue started with a DB error from DBD::Pg telling me that the input had an
> invalid byte sequence for UTF-8 (the tables in Pg are all encoded as UTF-8).
> Googling around brought me several possible solutions, but I can't say that
> I've found one yet that actually -works-.
The only thing that will really work is to use one character set all the way
through. I'd recommend UTF-8, because then you'll never have to change when
users want something that ISO-8859-1 or CP-1252 can't represent; UTF-8 can
encode every Unicode character. Be warned, though: because UTF-8 characters can
span multiple bytes, there's no magic switch to press. You have to make every
layer of your application aware of UTF-8.
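
To illustrate why the database complains: Word's smart quotes usually arrive as single CP-1252 bytes (0x93 and 0x94), which are not valid UTF-8 on their own. Here is a minimal sketch using the core Encode module. It assumes the browser really did submit CP-1252; the sample string and both helper names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# True if $bytes is well-formed UTF-8. Uses a strict decode on a copy,
# because decode() may modify its input when a CHECK value is given.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Convert CP-1252 bytes to UTF-8 bytes by going through Perl characters.
sub cp1252_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('cp1252', $bytes));
}

# Hypothetical form value: "hi" wrapped in CP-1252 smart quotes.
my $raw   = "\x93hi\x94";
my $fixed = cp1252_to_utf8($raw);

print "raw input valid UTF-8? ", (is_valid_utf8($raw)   ? "yes" : "no"), "\n";
print "converted valid UTF-8? ", (is_valid_utf8($fixed) ? "yes" : "no"), "\n";
```

Transcoding like this only works if you know which encoding the bytes are in, which is one more reason to make the whole chain UTF-8 rather than patch individual values.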
You need to do all of the following:
+ Tell the browser that the forms/pages are UTF-8 (using HTTP headers and <meta>
tags)
+ When the form data comes in, decode_utf8() it (from the Encode module). If
you're using CGI.pm you'll need 3.30, which hasn't been released yet (you can
find it on RT), because it has some UTF-8 fixes.
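+ When pushing data to or pulling it from the DB, you'll need to tell the
database driver that the data is UTF-8. In MySQL it's done with the
'mysql_enable_utf8' flag on the database handle; since you're on PostgreSQL,
the DBD::Pg equivalent is 'pg_enable_utf8'.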
+ If you're doing any file IO which may produce or read UTF-8 then you'll need
to open those filehandles with an encoding IO layer, e.g. ':encoding(UTF-8)'.
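
The steps above can be sketched roughly as follows. This is a hedged outline, not a drop-in implementation: `$raw_param` stands in for a value you would get from your query object, the database connection is left commented out with placeholder variables, and the temp file exists only to show the IO layers doing a round trip.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use File::Temp qw(tempfile);

# 1. Tell the browser the page is UTF-8 (header built by hand here).
my $header = "Content-Type: text/html; charset=utf-8\r\n\r\n";

# 2. Decode incoming form bytes into Perl characters.
#    Sample value: the UTF-8 bytes for the word "quoted" in curly quotes.
my $raw_param = "\xE2\x80\x9Cquoted\xE2\x80\x9D";
my $text      = decode_utf8($raw_param);

# 3. Database handle attributes (shown as data only, no connection made).
#    pg_enable_utf8 is the DBD::Pg flag; DBD::mysql uses mysql_enable_utf8.
my %dbh_attr = (RaiseError => 1, pg_enable_utf8 => 1);
# my $dbh = DBI->connect($dsn, $user, $pass, \%dbh_attr);  # placeholders

# 4. File IO: name the encoding in an IO layer for both writing and reading.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open my $out, '>:encoding(UTF-8)', $path or die "open for write: $!";
print {$out} $text;
close $out;

open my $in, '<:encoding(UTF-8)', $path or die "open for read: $!";
my $round_trip = do { local $/; <$in> };
close $in;

printf "%d characters decoded, %d bytes on disk, round trip %s\n",
    length($text), -s $path, ($round_trip eq $text ? "ok" : "FAILED");
```

The point of the last line is that character count and byte count differ once the data is decoded, so every boundary (browser, database, file) has to agree on where bytes become characters.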
The biggest help for me was reading the perluniintro and perlunicode perldoc pages.
--
Michael Peters
Developer
Plus Three, LP