[albatross-users] A better page system
Andrew McNamara
andrewm at object-craft.com.au
Thu Jun 5 13:00:38 EST 2003
>Andrew> Put another way, if your application only uses characters in
>Andrew> the range 0-127, the unicode version of albatross works
>Andrew> identically to the old version. However, if you are using
>Andrew> foreign character sets (accented characters, etc) with
>Andrew> characters in the range 128-255, your application will need to
>Andrew> be changed (but the results are much cleaner).
>
>You probably need to provide an example here.
Difficult to do without scaring people off... 8-)
My first response when I encountered all the grief that unicode appears
to create was "bugger this for a joke". But the reality is that unicode
isn't the problem - the problem is the mess that existed before.
Whereas you might previously have got away with outputting a variable
containing an accented character (that's an a-umlaut, if it doesn't make
it through e-mail):
>>> import albatross
>>> ctx = albatross.SimpleContext('.')
>>> ctx.locals.name = 'Häring'
>>> albatross.Template(ctx, '<magic>', '''<al-value expr="name">''').to_html(ctx)
>>> ctx.flush_content()
Häring
Now you will get a traceback:
>>> import albatross
>>> ctx = albatross.SimpleContext('.')
>>> ctx.locals.name = 'Häring'
>>> albatross.Template(ctx, '<magic>', '''<al-value expr="name">''').to_html(ctx)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.3/site-packages/albatross/template.py", line 358, in to_html
    self.content.to_html(ctx)
  File "/usr/local/lib/python2.3/site-packages/albatross/template.py", line 152, in to_html
    item.to_html(ctx)
  File "/usr/local/lib/python2.3/site-packages/albatross/tags.py", line 1038, in to_html
    ctx.write_content(escape(value))
  File "/usr/local/lib/python2.3/site-packages/albatross/tags.py", line 20, in escape
    text = unicode(text)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
This is because python assumes that all standard strings use the 'ascii'
character set, rather than the browser-default iso-8859-1 character set.
So when it converts a standard string to a unicode string, it uses the
'ascii' codec, rather than the 'iso-8859-1' codec, and any byte outside
the range 0-127 is rejected.
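The codec behaviour can be seen without albatross at all. What follows is
only a rough sketch: try_decode is a made-up helper, not part of albatross,
and the input bytes are assumed to be UTF-8, which is what the 0xc3 byte in
the traceback suggests. It tries the same bytes against three codecs.

```python
# Bytes as a UTF-8 terminal would send them for the name used above
# (assumption: UTF-8 input, consistent with the 0xc3 byte in the traceback).
raw = u'H\xe4ring'.encode('utf-8')


def try_decode(data, codec):
    """Hypothetical helper: return the decoded text, or None on failure."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# 'ascii' rejects any byte outside 0-127, so this is None.
as_ascii = try_decode(raw, 'ascii')

# The codec that matches the data recovers the original text.
as_utf8 = try_decode(raw, 'utf-8')

# 'iso-8859-1' accepts every byte value, so a mismatched codec does not
# fail here; it silently yields two wrong characters instead of one.
as_latin1 = try_decode(raw, 'iso-8859-1')
```

The last case is the one to watch: a codec that accepts everything will
never raise, so picking the wrong one shows up as garbage on the page
rather than as a traceback.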
The "right" answer, if you are dealing with international character sets,
is probably to work with unicode throughout:
>>> ctx.locals.name = u'Häring'
>>> albatross.Template(ctx, '<magic>', '''<al-value expr="name">''').to_html(ctx)
>>> ctx.flush_content()
Häring
When you accept strings from other systems, you probably need to decode
them explicitly:
>>> import sys
>>> ctx.locals.name = sys.stdin.readline().decode('iso-8859-1')
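Put together, the pattern is probably: decode once on the way in, use
unicode inside the application, and encode once on the way out. A minimal
sketch, assuming the other system sends iso-8859-1 and the output should be
UTF-8 (both encodings are assumptions, and to_text is a made-up name, not
an albatross function):

```python
def to_text(raw, encoding='iso-8859-1'):
    """Hypothetical boundary helper: turn incoming bytes into unicode."""
    return raw.decode(encoding)


# Stand-in for a line read from another system (assumed iso-8859-1).
incoming = u'H\xe4ring\n'.encode('iso-8859-1')

# Decode at the boundary, then work with unicode from here on.
name = to_text(incoming).rstrip()

# Encode only when the text leaves the application again
# (UTF-8 chosen here purely for illustration).
outgoing = name.encode('utf-8')
```

Keeping the decode and encode calls at the edges means the rest of the
code never has to guess which character set a string is in.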
--
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/