1
xiandao7997 2014-08-01 14:04:29 +08:00 via Android
wget
|
3
zzetao 2014-08-01 14:16:56 +08:00
Actually, some browser extensions can do this.
|
4
androidBrant OP @faceair How did you grab those URLs so quickly?
|
5
nealv2ex 2014-08-01 14:23:55 +08:00 1
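var list = $('.pic img').map(function (i, item) {
    // Assigning to a.href makes the browser resolve the path to an absolute URL.
    var a = document.createElement('a');
    a.href = $(item).attr('original');
    return a.href;
}).get(); // plain array instead of a jQuery object
|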
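For anyone who wants to do the same thing outside the browser, here is a rough Python sketch of the snippet above. It assumes, as the jQuery does, that the real image path sits in a lazy-loading `original` attribute. `ImgCollector`, `collect_image_urls`, and the sample markup are made-up names for illustration, and the `.pic` container filter is dropped for brevity. `urljoin` stands in for the `a.href` trick and turns relative paths into absolute URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgCollector(HTMLParser):
    """Collect one attribute from every <img> tag (hypothetical helper)."""

    def __init__(self, attr="original"):
        super().__init__()
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Prefer the lazy-load attribute, fall back to plain src.
        value = attr_map.get(self.attr) or attr_map.get("src")
        if value:
            self.values.append(value)


def collect_image_urls(page_html, page_url, attr="original"):
    """Return absolute image URLs, resolved against the page URL."""
    parser = ImgCollector(attr)
    parser.feed(page_html)
    return [urljoin(page_url, value) for value in parser.values]


# Sample markup invented for this example, not taken from the real page.
sample = '<div class="pic"><img original="/UpFile/editor/a.jpg" src="/blank.gif"></div>'
urls = collect_image_urls(sample, "http://www.szeros-wedding.com/html/service/804.html")
```

Only the standard library is used here, so it should run without installing anything.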
6
androidBrant OP @xiandao7997
jiaqiqunaerdeiMac:pic jiaqiqunaer$ wget -r http://www.szeros-wedding.com/UpFile/editor/
--2014-08-01 14:20:58-- http://www.szeros-wedding.com/UpFile/editor/
Resolving www.szeros-wedding.com... 211.154.142.215
Connecting to www.szeros-wedding.com|211.154.142.215|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2014-08-01 14:20:59 ERROR 403: Forbidden.
|
7
NemoAlex 2014-08-01 14:27:23 +08:00
|
8
faceair 2014-08-01 14:29:01 +08:00
|
9
imn1 2014-08-01 14:29:58 +08:00 2
save as...
complete html |
10
Roboo 2014-08-01 14:32:35 +08:00
IDM
|
11
xiandao7997 2014-08-01 14:36:02 +08:00 via Android
wget -r --level=2 --accept=jpg [the URL in the title]
When it finishes, look in the upfile/editor subdirectory. |
12
xiandao7997 2014-08-01 14:36:51 +08:00 via Android
@imn1 I feel like I watched The Social Network for nothing.
|
13
wesley 2014-08-01 15:24:05 +08:00
Clear the browser cache first, then open the page, then look in the browser's cache folder.
|
14
androidBrant OP @faceair How would I find these URLs with XPath? What expression would I use? Thanks.
|
15
mengzhuo 2014-08-01 15:36:09 +08:00 1
Here's a Python version too:
import requests
from lxml import html

URL = 'http://www.szeros-wedding.com/html/service/804.html#1'
# Collect the src attribute of every <img> on the page.
[x.attrib['src'] for x in html.fromstring(requests.get(URL).text).xpath('//img')]
-------
['/skins/20140425/images/bg74.gif',
 '/skins/20140425/images/t0.gif',
 '/skins/20140425/images/t3.gif',
 '/skins/20140425/images/t01.gif',
 '/skins/20140425/images/t01.gif',
 '/skins/20140425/images/bg75.gif',
 '/skins/20140425/images/logo.jpg',
 '/skins/20140425/images/bg6.jpg',
 '/skins/20140425/images/bg7.jpg',
 '/skins/20140425/images/bg8.jpg',
 '/skins/20140425/images/bg9.jpg',
 '/skins/20140425/images/bg10.jpg',
 '/skins/20140425/images/f.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002455418.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002455652.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002456340.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002456480.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002457027.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002457496.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002457996.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002458527.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002458652.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002459152.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002460184.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002460340.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002460512.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002461262.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002461902.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002462480.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002463027.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002463746.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002464809.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002464934.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002465652.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002466230.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002466730.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002466918.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002467590.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002467746.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002468449.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002469090.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002469230.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002469902.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002470699.jpg',
 '/ueditor/asp/../../UpFile/editor/2014032002470840.jpg',
 '/skins/20140425/images/f.jpg',
 '/skins/20140425/images/jd.jpg',
 '/skins/20140425/hzjd/1.jpg',
 '/skins/20140425/hzjd/2.jpg',
 '/skins/20140425/hzjd/3.jpg',
 '/skins/20140425/hzjd/4.jpg',
 '/skins/20140425/hzjd/5.jpg',
 '/skins/20140425/hzjd/6.jpg',
 '/skins/20140425/hzjd/7.jpg',
 '/skins/20140425/hzjd/8.jpg',
 '/skins/20140425/hzjd/9.jpg',
 '/skins/20140425/hzjd/10.jpg',
 '/skins/20140425/images/link.jpg',
 '/skins/20140425/images/logo1.jpg']
|
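The `src` values in that output are site-relative, and some contain `../..` segments, so they still need resolving before anything can be downloaded. A minimal sketch of that step, assuming the photos of interest are the `.jpg` files under `UpFile/editor` (as the wget replies suggest). `to_absolute`, `pick_photos`, and `download_all` are hypothetical helper names, and `download_all` is defined but not called here because it hits the network.

```python
import os
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve

PAGE_URL = "http://www.szeros-wedding.com/html/service/804.html"


def to_absolute(src, page_url=PAGE_URL):
    """Resolve a src against the page URL; urljoin also collapses '..' segments."""
    return urljoin(page_url, src)


def pick_photos(srcs, marker="/UpFile/editor/"):
    """Keep unique .jpg URLs whose resolved path contains the marker directory.

    The marker is an assumption about where the wanted photos live.
    """
    urls = []
    for src in srcs:
        url = to_absolute(src)
        path = urlparse(url).path
        if marker in path and path.lower().endswith(".jpg") and url not in urls:
            urls.append(url)
    return urls


def download_all(urls, dest="pic"):
    """Save each URL into dest, named after the last path component."""
    os.makedirs(dest, exist_ok=True)
    for url in urls:
        filename = os.path.basename(urlparse(url).path)
        urlretrieve(url, os.path.join(dest, filename))


# Two entries copied from the output above.
srcs = [
    "/skins/20140425/images/logo.jpg",
    "/ueditor/asp/../../UpFile/editor/2014032002455418.jpg",
]
photos = pick_photos(srcs)
```

Calling `download_all(photos)` would then fetch the files one by one. Whether the server allows that is not something this sketch can promise.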
16
zoudm 2014-08-01 16:34:16 +08:00
@androidBrant
XPath:
/html/body/table/tbody/tr[1]/td/table/tbody/tr[5]/td/div/p[1]/img
... ...
/html/body/table/tbody/tr[1]/td/table/tbody/tr[5]/td/div/p[5]/img
|
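Absolute paths like these are typically copied from browser dev tools and can be brittle: browsers add `tbody` to tables when they build the DOM, so the raw HTML a script downloads may not contain it. A small sketch of the difference, using the standard library's limited XPath support on an invented, well-formed snippet (the markup below is an assumption, not the real page):

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in for the page; note there is no <tbody>.
doc = ET.fromstring(
    "<html><body><table><tr><td><div>"
    "<p><img src='/UpFile/editor/1.jpg'/></p>"
    "<p><img src='/UpFile/editor/2.jpg'/></p>"
    "</div></td></tr></table></body></html>"
)

# Browser-style path (relative to the <html> root) that expects <tbody>.
strict = doc.findall("body/table/tbody/tr[1]/td/div/p/img")

# Looser path that only depends on the <div>/<p>/<img> nesting.
loose = doc.findall(".//div/p/img")

strict_srcs = [img.get("src") for img in strict]
loose_srcs = [img.get("src") for img in loose]
```

Here the strict path matches nothing while the loose one finds both images. With lxml, as in the Python reply, a similarly loose expression such as `//div/p/img/@src` would be the rough analogue.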
17
muziyue 2014-08-01 17:05:27 +08:00
If there aren't too many pages, I usually just press Ctrl+S and then look in the saved folder.
|
18
decken 2014-08-01 17:54:01 +08:00
pyquery really is incredibly handy.
https://gist.github.com/28dea5a2553190223ca6.git |
19
decken 2014-08-01 17:55:25 +08:00
|
20
mopvhs 2014-08-02 10:20:49 +08:00
|
21
mopvhs 2014-08-02 10:37:14 +08:00 1
How come the gist isn't being rendered!
<script src="https://gist.github.com/mopvhs/4b93757c88b5fe558846.js"></script> https://gist.github.com/mopvhs/4b93757c88b5fe558846 |
23
BGLL 2014-08-02 17:30:11 +08:00
Chrome extension: Fatkun
|
24
androidBrant OP |
25
xiandao7997 2014-08-02 21:16:37 +08:00 via Android
Saving the page and then looking in the folder is the simplest way; the method in reply #21 is pretty cool too.
|
26
sobigfish 2014-08-03 00:24:55 +08:00
Firefox + DownThemAll
|