LinuxSA Mailing list archives

  From: Alan Kennington <akenning@dog.topology.org>
  To  : LinuxSA <linuxsa@linuxsa.org.au>
  Date: Thu, 8 Jun 2000 17:36:12 +0930

which alphabet encoding for PHP/PostgreSQL/Perl/HTML?

Suppose you're putting together a database of your books,
and a database of nutrition data, and so forth and so forth,
and suppose you have a fair amount of Vietnamese, Russian,
West European text etc. to be stored in a "text" field.

Which alphabet encoding would you use?

I started with ASCII plus HTML entities like &uuml; for u-umlaut.
But that's really limited, and it sometimes confuses PHP3.
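
To make the entity approach concrete, here is a minimal sketch in Python
(my assumption for illustration; the setup described here uses PHP3 and Perl).
It turns HTML entities back into characters and checks whether the result
still fits in ISO 8859-1. The helper name fits_latin1 and the sample strings
are hypothetical.

```python
import html


def fits_latin1(text: str) -> bool:
    """Return True if every character of text can be encoded as ISO 8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample titles stored as ASCII plus HTML entities.
german = html.unescape("M&uuml;nchen")  # named entity for u-umlaut
russian = html.unescape("&#1050;&#1085;&#1080;&#1075;&#1072;")  # numeric entities

print(german, fits_latin1(german))
print(russian, fits_latin1(russian))
```

The German word survives a round trip through Latin-1, while the Cyrillic
one does not, which is where the entity-plus-Latin-1 scheme runs out.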

So then I tried converting all HTML entities to ISO 8859-1 (Latin-1).
That works okay for a fair subset of languages (mainly West European).
But I also have books in Vietnamese, Russian, and other weird stuff.
So I thought of using UTF-8, UTF-16, and a few other Unicode
encodings.
The disadvantage of the Unicode 16-bit encodings is that the text always
has to be read with special software.
"psql" will not make any sense of 16-bit encodings (as far as I know).

The disadvantage of UTF-8 is that every non-ASCII Latin-1 character
is encoded as two bytes instead of one, so it comes out wrong in
PHP3 and psql and Linux generally
(which are all happy with ISO 8859-1).
But UTF-8 does seem like a fair kludge.
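
The byte-level behaviour behind both complaints can be checked with a
short Python sketch (Python and the sample words are my assumptions, not
part of the setup described here). It encodes the same words as Latin-1,
UTF-8 and big-endian UTF-16, then shows what UTF-8 bytes look like when
a Latin-1 tool reads them.

```python
# Hypothetical sample words, written with escapes so the source stays ASCII:
# "fur" (plain ASCII), "f\u00fcr" (u-umlaut), "Vi\u1ec7t" (Vietnamese e with
# circumflex and dot below).
samples = ["fur", "f\u00fcr", "Vi\u1ec7t"]

for word in samples:
    utf8 = word.encode("utf-8")
    utf16 = word.encode("utf-16-be")  # big-endian, no byte-order mark
    try:
        latin1 = word.encode("iso-8859-1")
    except UnicodeEncodeError:
        latin1 = None  # at least one character lies outside Latin-1
    print(repr(word), latin1, utf8, utf16)

# Reading UTF-8 bytes as if they were Latin-1 garbles the u-umlaut.
misread = "f\u00fcr".encode("utf-8").decode("iso-8859-1")
print(repr(misread))
```

Pure ASCII text is byte-for-byte identical in Latin-1 and UTF-8. The
u-umlaut is the single byte 0xFC in Latin-1 but the two bytes 0xC3 0xBC
in UTF-8, which a Latin-1 tool shows as two separate characters. The
UTF-16 form puts a zero byte next to every ASCII letter, which is likely
to upset byte-oriented tools.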

Ideally, I would have liked a Unicode encoding which is
an extension of ISO 8859-1. I.e., there would be an escape character
which would escape the text out into something like UTF-8.
But I don't know of any such encoding.

My little set of links to alphabet encodings is at
http://www.topology.org/alpha.html
which proves that I've made a fair effort to research the matter
myself before asking the list about it.

Question 1:
Is there a "standard" encoding of Unicode which is an extension
to ISO 8859 Latin-1?

Question 2:
Do you have a solution which you have used for getting
full Unicode alphabets into PostgreSQL?
If so, which encoding did you use, and what sort of
software do you use instead of "psql" to view text?
And what do you do in PHP to access Unicode alphabets?

Cheers,
Alan Kennington.

===========================================================
PS.
Does anyone know the nature of the relationship between the
two people in this photo?
http://www.topology.org/images/photo46_i.jpg
