Creating a GBK file

➜  file iconv -l | grep -i 'utf-8'
UTF-8 UTF8
UTF-8-MAC UTF8-MAC
➜  file iconv -l | grep -i 'gbk'  
GBK

➜  file iconv -f UTF8 -t GBK read_test.txt> read_gbk.txt 
➜  file file read_gbk.txt 
read_gbk.txt: ISO-8859 text
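
Note that `file` does not identify the result as GBK and labels it ISO-8859 text. For readers without iconv, the conversion above could be approximated in Python 3 as follows. This is only a sketch: the helper name `convert_to_gbk` is hypothetical, and the sample content is an assumption based on the output printed later on this page, since the original read_test.txt is not shown.

```python
import os
import tempfile


def convert_to_gbk(src_path, dst_path):
    """Re-encode a UTF-8 text file as GBK (hypothetical helper)."""
    # newline="" keeps line endings exactly as they are in the source file.
    with open(src_path, "r", encoding="utf-8", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="gbk", newline="") as dst:
        dst.write(text)


def demo():
    # Assumed sample content; the real read_test.txt is not shown in this page.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "read_test.txt")
    dst = os.path.join(workdir, "read_gbk.txt")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write("第1行\n第2行\n第3行\n")
    convert_to_gbk(src, dst)
    # Return the raw bytes so the GBK encoding can be inspected.
    with open(dst, "rb") as f:
        return f.read()


if __name__ == "__main__":
    print(demo())
```

The first byte of the converted file is 0xb5, the same byte reported in the UnicodeDecodeError traceback on this page.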

Reading the file

Without specifying an encoding

➜  file cat read_gbk.py
for line in open("read_gbk.txt"):
    print line.replace("\n", "")

➜  file python read_gbk.py
��1��
��2��
��3��

Without a specified encoding, the raw GBK bytes are printed as-is and the output is garbled.
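
The garbling comes from the same text having different byte sequences in GBK and UTF-8. A small Python 3 sketch of that mismatch follows; the function name `compare_encodings` is hypothetical, and the sample string is taken from the expected output of this page.

```python
def compare_encodings(text="第1行"):
    """Show how the same text is stored under GBK and UTF-8."""
    gbk_bytes = text.encode("gbk")
    utf8_bytes = text.encode("utf-8")
    # Check whether the GBK bytes happen to be valid UTF-8 (they are not here).
    try:
        gbk_bytes.decode("utf-8")
        valid_as_utf8 = True
    except UnicodeDecodeError:
        valid_as_utf8 = False
    return gbk_bytes.hex(), utf8_bytes.hex(), valid_as_utf8


if __name__ == "__main__":
    gbk_hex, utf8_hex, ok = compare_encodings()
    print("GBK bytes:  ", gbk_hex)
    print("UTF-8 bytes:", utf8_hex)
    print("GBK bytes valid as UTF-8:", ok)
```

Each Chinese character takes two bytes in GBK and three in UTF-8, so a reader that assumes UTF-8 cannot interpret the GBK bytes correctly.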

Specifying the encoding

➜  file cat read_gbk.py
import codecs

for line in codecs.open("read_gbk.txt", encoding="GBK"):
    print line.replace("\n", "")

➜  file python read_gbk.py
第1行
第2行
第3行

With the encoding specified, the output is correct.
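
The script above targets Python 2. In Python 3 the built-in `open` accepts an `encoding` argument directly, so `codecs.open` is not needed. A minimal sketch, assuming Python 3; `read_lines` is a hypothetical helper, and the temporary file stands in for read_gbk.txt:

```python
import os
import tempfile


def read_lines(path, encoding="gbk"):
    """Return the decoded lines of a text file without trailing newlines."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


def demo():
    # Stand-in for read_gbk.txt; the content is assumed from the output above.
    path = os.path.join(tempfile.mkdtemp(), "read_gbk.txt")
    with open(path, "wb") as f:
        f.write("第1行\n第2行\n第3行\n".encode("gbk"))
    return read_lines(path)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Using a `with` block also closes the file when reading finishes, which the one-line loop over `open(...)` leaves to the interpreter.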

Error handling

➜  file cat read_gbk.py 
import codecs

print "replace with ? ->"
for line in codecs.open("read_gbk.txt", encoding="utf-8", errors="replace"):
    print line.replace("\n", "")
print

print "ignore the error ->"
for line in codecs.open("read_gbk.txt", encoding="utf-8", errors="ignore"):
    print line.replace("\n", "")
print

print "raise the error ->"
for line in codecs.open("read_gbk.txt", encoding="utf-8", errors="strict"):
    print line.replace("\n", "")

➜  file python read_gbk.py
replace with ? ->
��1��
��2��
��3��

ignore the error ->
1
2
3

raise the error ->
Traceback (most recent call last):
  File "read_gbk.py", line 14, in <module>
    for line in codecs.open("read_gbk.txt", encoding="utf-8", errors="strict"):
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 699, in next
    return self.reader.next()
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 630, in next
    line = self.readline()
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 545, in readline
    data = self.read(readsize, firstline=True)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 492, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 0: invalid start byte
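
In a real script the `strict` failure would usually be caught rather than left to end the program. The following Python 3 sketch applies the three error handlers to the same GBK bytes and records the exception instead of raising it. The function name `decode_with_handlers` is hypothetical, and the sample bytes are assumed to match the first line of read_gbk.txt.

```python
def decode_with_handlers(raw):
    """Decode bytes as UTF-8 under each error handler and collect the results."""
    results = {}
    for handler in ("replace", "ignore", "strict"):
        try:
            results[handler] = raw.decode("utf-8", errors=handler)
        except UnicodeDecodeError as exc:
            # Only "strict" reaches this branch; keep the reason for reporting.
            results[handler] = "error: {}".format(exc.reason)
    return results


if __name__ == "__main__":
    gbk_line = "第1行".encode("gbk")  # assumed first line of read_gbk.txt
    for name, value in decode_with_handlers(gbk_line).items():
        print(name, "->", value)
```

With `replace`, each undecodable byte becomes U+FFFD; with `ignore`, it is silently dropped, which loses data without any warning.
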
Copyright © zhujipeng 2017, all rights reserved. Powered by Gitbook. File last revised: 2017-12-16 15:12:10
