[IronPython] x = unicode(someExtendedUnicodeString) fails.
vernondcole at gmail.com
Thu Dec 17 11:05:31 PST 2009
I just tripped over this one and it took some time to figure out what in
blazes was going on. You may want to watch for it when porting CPython code.
I was cleaning up an input argument using
s = unicode(S.strip().upper())
where S is the argument supplying the value I need to convert.
When I handed the function a genuine unicode string, such as in:
assert Roman(u'\u217b') == 12 #unicode Roman number 'xii' as a single
IronPython complains with:
UnicodeEncodeError: ('unknown', '\x00', 0, 1, '')
The Python manual says:
> If no optional parameters are given, unicode() will mimic the behaviour of
> str() except that it returns Unicode strings instead of 8-bit strings.
> More precisely, if *object* is a Unicode string or subclass it will return
> that Unicode string without any additional decoding applied.
It turns out that this was already reported on codeplex as:
but the reporting party did not catch the fact that he had located an
incompatibility with documented behavior.
It has been sitting on a back burner for some time.
Others may want to join me in voting this up. Meanwhile I will add an
unneeded exception handler to my own code.
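
For anyone wanting to do the same, here is a minimal sketch of such a handler. The names text_type and to_unicode are hypothetical and only for illustration, and the fallback assumes that a UnicodeEncodeError from unicode() means the argument was already a Unicode string, as in the case above. The shim at the top is only there so the snippet also runs on Python 3, where unicode does not exist.

```python
try:
    text_type = unicode      # Python 2 and IronPython 2.x
except NameError:
    text_type = str          # Python 3: str is already Unicode

def to_unicode(value):
    """Return value as a Unicode string (hypothetical helper)."""
    try:
        return text_type(value)
    except UnicodeEncodeError:
        # Assumption: the call failed on a string that was already
        # Unicode, so hand it back unchanged.
        return value

# Same clean-up as in my function, routed through the helper.
S = u' \u217b '              # small Roman numeral twelve, with padding
s = to_unicode(S.strip().upper())
```

On an implementation that follows the manual the except branch is never taken for Unicode input, so the handler costs nothing once the bug is fixed.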