Interpreter and memory
Comparing a byte string with a Unicode string
>>> s = 'a'
>>> u = u'a'
>>> s == u
True
>>> s = '中'
>>> u = u'中'
>>> s == u
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
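When a byte string is compared with a Unicode string, the byte string is implicitly decoded to Unicode using the default encoding. If that decoding fails, Python emits a warning and treats the two values as unequal, even though they are logically equal.
Converting the Unicode string to a byte string before comparing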
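One way to sidestep the implicit conversion is to decode the byte string explicitly before comparing. The sketch below assumes the bytes are UTF-8 encoded, as in the session above. `to_text` is a hypothetical helper, not a standard-library function, and the literals are written with escapes so the snippet should run unchanged on Python 2.7 and Python 3.

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return text, decoding byte strings explicitly."""
    # On Python 2, bytes is an alias of str; on Python 3 it is a separate type.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

s = b'\xe4\xb8\xad'   # UTF-8 bytes of the character U+4E2D
u = u'\u4e2d'         # the same character as a Unicode string

print(to_text(s) == to_text(u))   # True
```

Because both sides are text by the time `==` runs, no implicit decoding happens and no warning is raised.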
>>> s = '中'
>>> u = u'中'
>>> s == str(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u4e2d' in position 0: ordinal not in range(128)
>>> map(lambda i: hex(ord(i)), u)
['0x4e2d']
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> sys.setdefaultencoding('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'setdefaultencoding'
>>> reload(sys)
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('utf-8')
>>> s == str(u)
True
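str() encodes with Python's default encoding, which is ascii unless changed, so encoding a non-ASCII Unicode character to ASCII bytes raises UnicodeEncodeError. sys.setdefaultencoding is removed from the sys module by site.py at startup, which is why reload(sys) is needed before it can be called.
Converting the byte string to a Unicode string before comparing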
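Instead of changing the interpreter-wide default, the encoding can be named at the call site. A minimal sketch, assuming the byte string holds UTF-8 data; the variable names mirror the session above, and the escaped literals are only there so that it runs on both Python 2.7 and Python 3.

```python
u = u'\u4e2d'          # Unicode string containing U+4E2D
s = b'\xe4\xb8\xad'    # the same character as UTF-8 bytes

# Explicit encoding does not depend on sys.getdefaultencoding().
encoded = u.encode('utf-8')
print(encoded == s)    # True

# Encoding with ascii fails for this character, whatever the default is.
try:
    u.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)        # False
```

Naming the encoding keeps the result the same regardless of how the interpreter was configured.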
>>> s = '中'
>>> u = u'中'
>>> unicode(s) == u
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
>>> map(lambda i: hex(ord(i)), s)
['0xe4', '0xb8', '0xad']
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> sys.setdefaultencoding('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'setdefaultencoding'
>>> reload(sys)
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('utf-8')
>>> unicode(s) == u
True
unicode() decodes with Python's default encoding, which is ascii unless changed, so decoding non-ASCII bytes as ASCII raises UnicodeDecodeError.
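The same conversion can be done with an explicit decode, which leaves the default encoding untouched. This is a sketch under the assumption that the bytes are UTF-8; it uses `bytearray` to inspect the byte values so that it behaves the same on Python 2.7 and Python 3.

```python
s = b'\xe4\xb8\xad'    # UTF-8 bytes of the character U+4E2D
u = u'\u4e2d'          # the same character as a Unicode string

# Iterating a bytearray yields integers on both Python 2 and Python 3.
byte_values = [hex(b) for b in bytearray(s)]
print(byte_values)     # ['0xe4', '0xb8', '0xad']

# Explicit decoding does not depend on sys.getdefaultencoding().
decoded = s.decode('utf-8')
print(decoded == u)    # True

# 0xe4 is outside the ASCII range, so an ascii decode fails.
try:
    s.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)        # False
```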