Actually, Apache has:
AddDefaultCharset ISO-8859-1
And my content-type header meta tag in the HTML output also reads <meta
http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
Header looks like this under regular CGI:
HTTP/1.1.200.OK(CR)(LF)
Date:.Tue,.13.Feb.2007.23:21:28.GMT(CR)(LF)
Server:.Apache(CR)(LF)
Set-Cookie:.publish=;.domain=.lfpress.com;.path=/cgi-bin;.expires=Wed,.14-Feb-2007.23:21:29.GMT(CR)(LF)
Connection:.close(CR)(LF)
Transfer-Encoding:.chunked(CR)(LF)
Content-Type:.text/html;.charset=ISO-8859-1(CR)(LF)
(CR)(LF)
And like this under mod_perl:
HTTP/1.1.200.OK(CR)(LF)
Date:.Tue,.13.Feb.2007.23:22:15.GMT(CR)(LF)
Server:.Apache(CR)(LF)
Set-Cookie:.publish=;.domain=.lfpress.com;.path=/cgi-bin;.expires=Wed,.14-Feb-2007.23:22:15.GMT(CR)(LF)
Connection:.close(CR)(LF)
Transfer-Encoding:.chunked(CR)(LF)
Content-Type:.text/html;.charset=ISO-8859-1(CR)(LF)
(CR)(LF)
So it can't be the header... this is getting truly bizarre... The system
default charset for the Linux box is ISO-8859-1. MySQL is using ISO-8859-1
as its default charset. Dunno what else to check.
Here's another weird thing - the characters aren't showing up as encoded
entities under either regular CGI or mod_perl - they are actual raw
characters, not escaped or encoded. Under regular CGI, a caption line shows
up as:
Dennis.Garnhum,.former.mentor/instructor.at.the.National.Theatre.School,.says.that.(93)people.who.have.a.burning.passion.that.can(92)t.be.stopped(94).are.most.likely.to.succeed.as.actors.
Under mod_perl, it becomes:
Dennis.Garnhum,.former.mentor/instructor.at.the.National.Theatre.School,.says.that.(C2,93)people.who.have.a.burning.passion.that.can(C2,92)t.be.stopped(C2,94).are.most.likely.to.succeed.as.actors.
BTW I am using an HTTP viewer for this:
http://www.rexswain.com/httpview.html
And the URLs I am using are:
Regular:
http://www.calgarysun.com/cgi-bin/publish.cgi?p=171767&x=articles&s=events
Mod_perl:
http://www.calgarysun.com/perl-bin/publish.cgi?p=171767&x=articles&s=events
It looks like mod_perl is inserting two-byte sequences where single bytes
should go. I checked the ASCII and ISO Latin-1 tables, and those values
(hex 92, 93 and 94) are supposed to be empty there, with no printable
character assigned - and yet they're not. When I check the character table
in my text editor (UltraEdit), they show up as characters. Is this the
Windows codepage (1252, I think) in action, extending my ASCII set? And yet
the characters show up fine under regular CGI in Fedora... for some reason
mod_perl just seems to be adding a hex C2 before every non-ASCII character.
Is it an escape sequence messing up or.... ?
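One reading that fits both dumps: bytes 92, 93 and 94 (hex) are curly
quotes in Windows-1252 but invisible C1 control codes in ISO-8859-1, and a
string holding those code points, when written out as UTF-8, becomes C2 92,
C2 93 and C2 94. Below is a minimal sketch of that arithmetic, in Python
purely for illustration. The sample bytes are modelled on the caption
above; whether mod_perl on this server is really re-encoding the output
this way is an assumption, not something the dumps prove.

```python
# Bytes as they appear under regular CGI (modelled on the caption dump).
raw = b"that \x93people who have a burning passion that can\x92t be stopped\x94 are"

# Treat each byte as an ISO-8859-1 code point (0x93 becomes U+0093, a C1 control).
as_latin1 = raw.decode("iso-8859-1")

# Writing that string out as UTF-8 turns every code point in U+0080..U+00BF
# into two bytes: 0xC2 followed by the original byte value.
reencoded = as_latin1.encode("utf-8")
print(reencoded)

# The same raw bytes read as Windows-1252 are ordinary typographic quotes.
as_cp1252 = raw.decode("cp1252")
print(as_cp1252)
```

If this is what is happening, the data itself is Windows-1252 text labelled
as ISO-8859-1, and something in the mod_perl path is encoding it to UTF-8 on
output while the header still says ISO-8859-1.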
*scratches head 'till the blood comes*
-----Original Message-----
From: Jonathan Vanasco [mailto:suppressed
Sent: February-08-07 1:42 PM
To: mod_perl List
Subject: Re: Strange characters in output when filtered through mod_perl
Just to clarify:
On Feb 8, 2007, at 3:03 PM, Aaron Hawryluk wrote:
> Our publishing system doesn't use any strange character sets -
Your system is working with data in one character set, and publishing
it to the web in another character set. The fix is *likely* just
setting the right character set header in Apache. Personally, I do
everything either in UTF-8 or in ASCII with HTML entities for
everything else.
You could try doing:
AddDefaultCharset utf-8
in httpd.conf
or (I think this will work)
$r->content_type("text/html; charset=utf-8");
in your handler
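
The "ASCII with HTML entities" option mentioned above can be sketched as
follows. This is an illustration in Python, not handler code from this
thread: it assumes the stored text is really Windows-1252, and
to_ascii_entities is a hypothetical helper name.

```python
def to_ascii_entities(raw: bytes, source_charset: str = "cp1252") -> bytes:
    """Decode raw bytes and return pure ASCII, replacing every non-ASCII
    character with a numeric character reference such as &#8220;."""
    text = raw.decode(source_charset)
    return text.encode("ascii", errors="xmlcharrefreplace")


# Sample bytes modelled on the caption in this thread (curly quotes as 93/92/94 hex).
caption = b"that \x93people\x94 can\x92t be stopped"
print(to_ascii_entities(caption))
```

Output produced this way is byte-for-byte identical under ISO-8859-1,
Windows-1252 and UTF-8, so the charset label no longer affects how these
characters are displayed.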
// Jonathan Vanasco