Problem description
- Fetching a web page with r = requests.get(url, params) and then saving r.text to a newly created file raises an error:
>>> with open("webtext.txt", 'w') as f:
    f.write(r.text)
Traceback (most recent call last):
  File "<pyshell#18>", line 2, in <module>
    f.write(r.text)
UnicodeEncodeError: 'gbk' codec can't encode character '\xf6' in position 395497: illegal multibyte sequence
- That is, the 'gbk' codec cannot encode the Unicode character '\xf6' (ö, U+00F6), because GBK has no representation for it.
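The failure can be reproduced without any network request. The following is a minimal sketch: the sample string is made up for illustration and only stands in for r.text. It shows that the same character encodes fine as UTF-8 but not as GBK.

```python
# Stand-in for r.text (an assumption for illustration); '\xf6' is 'ö' (U+00F6).
text = "K\xf6ln"

try:
    text.encode("gbk")  # the same conversion open(..., 'w') performs under a GBK default
except UnicodeEncodeError as exc:
    gbk_failed = True
    # exc.start and exc.end delimit the characters that could not be encoded
    bad_chars = exc.object[exc.start:exc.end]
else:
    gbk_failed = False
    bad_chars = ""

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf-8")
```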
Solution
>>> import locale
>>> locale.getpreferredencoding()
'cp936'
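- The preferred encoding on this system is cp936 (GBK), and open() falls back to it when no encoding is given. Setting encoding to 'utf-8' in open() is therefore enough: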
>>> with open("webtext.txt", 'w', encoding='utf-8') as f:
    f.write(r.text)
1205641
>>>
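As a rough sketch of the same fix in script form, the snippet below wraps the write in a small helper and reads the file back. The name save_text, the sample string, and the temporary directory are assumptions made for illustration; they are not part of the original session. The file is read back with the same explicit encoding, since a plain open(path) would again use the locale default.

```python
import os
import tempfile


def save_text(path, text, encoding="utf-8"):
    """Hypothetical helper: write text using an explicit encoding."""
    with open(path, "w", encoding=encoding) as f:
        # write() returns the number of characters written
        return f.write(text)


sample = "K\xf6ln"  # stand-in for r.text (assumption)
path = os.path.join(tempfile.mkdtemp(), "webtext.txt")
written = save_text(path, sample)

# Read back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```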