{"code":"0","msg":"","res":{"topic":[],"page":"1","skip":"0","total":"0","list":[
{"tid":"12242860","title":"[抢完]C++ Enum","lastuid":"7530945","lastposter":"aduw","avatar":"https://baidu.com/s/tqxu.png","replies":"7","lastpost":"1523102297","isread":"0","helpcredit":"0","url":"zyj://3D12242860%26_tpl%3Dapp%26target%3D1","timages":[]},
{"tid":"12242847","title":"[抢完]Python List","lastuid":"7963672","lastposter":"aduw","avatar":"https://baidu.com/s/tqxu.png","replies":"11","lastpost":"1523102180","isread":"0","helpcredit":"0","url":"zyj://3Dapp%26target%3D1","timages":[]},
{"tid":"12242844","title":"[抢完]Java Final","lastuid":"1858970","lastposter":"aduw","avatar":"https://baidu.com/s/tqxu.png","replies":"34","lastpost":"1523102780","isread":"0","helpcredit":"0","url":"zyj://3Dapp%26target%3D1","timages":[]}]}}
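The above is what I captured from the client's traffic. I want to extract the 'title' and 'tid' fields from it.
Using req.json() seems to turn the Chinese characters into unicode escapes, which makes the output look odd.
I tried js_data['tid']['title'], but that raises an error.
I tried BS4, but the parser has to be specified as html/css/js. I tried html and it doesn't seem to work?
I'm a Python beginner, so any pointers would be appreciated.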
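On the escaped characters, here is a minimal sketch, assuming the response body is standard JSON. The `\uXXXX` sequences are only how JSON (and a Python 2 repr) spells non-ASCII text; once parsed, the string holds the real characters. The payload below is a hypothetical, shortened stand-in for the capture above.

```python
import json

# Hypothetical, shortened payload. The title is written with JSON \u escapes,
# which is how many servers send non-ASCII text over the wire.
raw = '{"res": {"list": [{"tid": "12242860", "title": "[\\u62a2\\u5b8c]C++ Enum"}]}}'

data = json.loads(raw)
title = data["res"]["list"][0]["title"]

# After parsing, the escapes are ordinary characters again.
print(title)

# json.dumps escapes non-ASCII by default; ensure_ascii=False keeps it readable.
escaped = json.dumps(data["res"]["list"][0])
readable = json.dumps(data["res"]["list"][0], ensure_ascii=False)
print(escaped)
print(readable)
```

So the data itself is intact; what looks strange is only the way the string is displayed or re-serialized.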
1
patx 2018-04-07 20:57:11 +08:00
Take a look at chardet.
|
2
RqPS6rhmP3Nyn3Tm 2018-04-07 22:16:15 +08:00 via iPhone
Is this Python 2? Better to move to 3, where strings are unicode by default.
|
3
alvin666 2018-04-07 22:18:40 +08:00 via Android
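Option 1: add .decode() after .json()
Option 2: the bs4 library can parse json, give it another try
Option 3: turn it into a string and brute-force the extraction with the re library /smirk
|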
5
noqwerty 2018-04-07 23:16:02 +08:00
import json
res = json.loads(r.text)
[(x["tid"], x["title"]) for x in res["res"]["list"]]
The error occurs because your tid and title are inside the list.
|
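A self-contained version of the same idea, as a rough sketch: `text` stands in for `r.text` and is a trimmed, hypothetical copy of the capture with only a few fields kept. With requests, `r.json()` should give the same structure as `json.loads(r.text)`.

```python
import json

# Stand-in for r.text; only the fields used here are kept.
text = """
{"code": "0", "res": {"total": "0", "list": [
  {"tid": "12242860", "title": "[抢完]C++ Enum", "isread": "0"},
  {"tid": "12242847", "title": "[抢完]Python List", "isread": "0"}
]}}
"""

res = json.loads(text)

# res["res"]["list"] is a list of dicts, so look up the keys on each item,
# not on the list (and not on the top-level dict).
pairs = [(item["tid"], item["title"]) for item in res["res"]["list"]]

for tid, title in pairs:
    print(tid, title)
```

`js_data['tid']['title']` fails because the top-level dict has no 'tid' key; the path to each record is `res` → `list` → item.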
6
lgh 2018-04-07 23:57:14 +08:00 via iPhone
You could also try ObjectPath.
|
7
codelover2016 2018-04-08 00:14:45 +08:00 via Android
@noqwerty That's the right answer.
|
8
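hackpro OP
@codelover2016 @lgh I went with @noqwerty's approach and it solved the problem perfectly.
[print(x["tid"], x["title"], x["url"]) for x in js_data["res"]["list"]]
One question though: why doesn't print(x) work on the next line? Why does it have to sit inside the square brackets like this? Is it a variable scope issue? I'd still like to use the value of x afterwards.
|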
9
noqwerty 2018-04-10 01:46:25 +08:00
@hackpro #8 That's a Python list comprehension. If you need the values later, just assign the result to a variable:
result = [(x["tid"], x["title"]) for x in res["res"]["list"]]
Or write an ordinary loop:
result = []
for x in res["res"]["list"]:
    result.append((x["tid"], x["title"]))
|
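To illustrate the scope question, a small sketch assuming Python 3 (the `items` list is made-up sample data): the loop variable of a comprehension exists only inside the brackets, while a plain `for` loop leaves its variable bound to the last item.

```python
items = [
    {"tid": "12242860", "title": "[抢完]C++ Enum"},
    {"tid": "12242847", "title": "[抢完]Python List"},
]

# In Python 3 the comprehension's loop variable is local to the comprehension.
titles = [x["title"] for x in items]
try:
    x  # not defined out here, so this raises NameError
    leaked = True
except NameError:
    leaked = False

# A plain for loop does leave its variable bound after the loop ends.
for y in items:
    pass
last_title = y["title"]

print(leaked, last_title)
```

Keeping the collected values in a list such as `titles` is usually more useful than relying on the leftover loop variable, which only ever holds the last item.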
10
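hackpro OP
@noqwerty Thanks a lot for the reply. I've been mulling this over for the past few days.
In the end I changed it to:
[ print(x["isread"], x["title"], x["url"]) if not x else print('No coins at current.') for x in js_data["res"]["list"] if (not x["title"].startswith('[抢完]') and x["isread"] == '0') ]
The earlier problem was in the last check, the comparison with 0. I had written it as a number, so the else branch never printed.
One more question: how does Python tell the number 0 apart from the string 0? Or does the response itself determine the data types of the fields?
|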
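On the last question, a minimal sketch assuming the body is parsed with the standard `json` module: the types come from how the values are written in the JSON text. A quoted value becomes a Python `str`, a bare number becomes an `int` or `float`. The payload is hypothetical; "count" is an invented field added only for contrast.

```python
import json

# Hypothetical payload: "isread" is quoted like in the capture, "count" is not.
data = json.loads('{"isread": "0", "count": 0, "title": "[抢完]C++ Enum"}')

isread_type = type(data["isread"]).__name__  # JSON string -> Python str
count_type = type(data["count"]).__name__    # JSON number -> Python int

# A str never compares equal to an int, so match the type or convert.
equals_int = data["isread"] == 0
equals_str = data["isread"] == "0"
equals_after_cast = int(data["isread"]) == 0

print(isread_type, count_type, equals_int, equals_str, equals_after_cast)
```

In the capture every value is quoted, so comparisons such as `x["isread"] == "0"` need the string form, or an explicit `int(...)` conversion first.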