In a small project I have been working on recently,
there is one point I keep going back and forth on.
<div class="father">
    <div class="child">
        <p>
            <a href="#"></a>
        </p>
    </div>
    <div class="child2">
        <p>
            <a href="#"></a>
        </p>
    </div>
    <div class="child3">
        <p>
            <a href="#"></a>
        </p>
    </div>
</div>
Given a structure like this,
is it better to run several XPath queries that each match one block precisely,
# Assuming new_html is an lxml element, where attributes are read with .get()
ch_o = new_html.xpath('//*[@class="child"]//a')
ch_o = [e.get('href') for e in ch_o if e.text]
ch_t = new_html.xpath('//*[@class="child2"]//a')
ch_t = [e.get('href') for e in ch_t if e.text]
ch_h = new_html.xpath('//*[@class="child3"]//a')
ch_h = [e.get('href') for e in ch_h if e.text]
or to run a single broad XPath query and then filter the result with comprehensions?
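# Assuming lxml again: select every direct child of "father", then filter in Python
list_f = new_html.xpath('//*[@class="father"]/*')
list_filte = [[ine.get('href') for ine in oe.iter('a') if ine.text] for oe in list_f if oe.find('.//a') is not None]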
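To make the two options concrete, here is a minimal runnable sketch of both. It uses the standard library's xml.etree.ElementTree instead of the parser in my project (an assumption, chosen only so the block runs without extra packages; its XPath support is a limited subset). The link text, the href values, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample markup mirroring the structure above; link text and hrefs are invented.
SAMPLE = """
<div class="father">
  <div class="child"><p><a href="#a">first</a></p></div>
  <div class="child2"><p><a href="#b">second</a></p></div>
  <div class="child3"><p><a href="#c"></a></p></div>
</div>
"""

root = ET.fromstring(SAMPLE)


def hrefs_by_separate_queries(root, class_names):
    """Option 1: one precise query per class name (each call searches the tree again)."""
    result = {}
    for name in class_names:
        links = root.findall(f".//div[@class='{name}']//a")
        result[name] = [a.get("href") for a in links if a.text]
    return result


def hrefs_by_single_query(root):
    """Option 2: one broad query for the child blocks, then filter in a comprehension."""
    children = root.findall("./div")
    return {
        child.get("class"): [a.get("href") for a in child.iter("a") if a.text]
        for child in children
    }


separate = hrefs_by_separate_queries(root, ["child", "child2", "child3"])
single = hrefs_by_single_query(root)
print(separate == single)
```

Both functions return the same mapping from class name to hrefs, so the choice between them comes down to speed and readability rather than results.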
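In my tests, each additional XPath query makes the run nearly twice as slow.
In practice, though, the filtering comprehension at the end would be nested more deeply than the one shown here.
How should I weigh this trade-off?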
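For anyone who wants to reproduce the comparison, this is roughly how the two approaches could be timed. It is only a sketch under assumptions: it uses the standard library's xml.etree.ElementTree and timeit rather than my actual parser, the tree is synthetic, and the class names (child0, child1, ...), tree sizes, and function names are hypothetical. Timings will differ with another parser or document shape.

```python
import timeit
import xml.etree.ElementTree as ET


def build_tree(groups=3, links_per_group=200):
    """Build a synthetic father/child tree; the sizes are arbitrary assumptions."""
    father = ET.Element("div", {"class": "father"})
    for g in range(groups):
        child = ET.SubElement(father, "div", {"class": f"child{g}"})
        for i in range(links_per_group):
            p = ET.SubElement(child, "p")
            a = ET.SubElement(p, "a", {"href": f"#{g}-{i}"})
            # Only every other link gets text, so the text filter has work to do.
            a.text = "link" if i % 2 == 0 else None
    return father


def separate_queries(root, groups):
    """One precise query per child block."""
    return [
        [a.get("href") for a in root.findall(f".//div[@class='child{g}']//a") if a.text]
        for g in range(groups)
    ]


def single_query(root):
    """One broad query, then nested comprehension filtering."""
    return [
        [a.get("href") for a in child.iter("a") if a.text]
        for child in root.findall("./div")
    ]


def compare(groups=3, repeat=20):
    """Return (seconds for separate queries, seconds for single query)."""
    root = build_tree(groups)
    # Both approaches must agree before their timings mean anything.
    assert separate_queries(root, groups) == single_query(root)
    t_multi = timeit.timeit(lambda: separate_queries(root, groups), number=repeat)
    t_single = timeit.timeit(lambda: single_query(root), number=repeat)
    return t_multi, t_single


if __name__ == "__main__":
    t_multi, t_single = compare()
    print(f"separate queries: {t_multi:.4f}s, single query: {t_single:.4f}s")
```

Raising groups in compare() shows how the cost of the separate-query version changes as more queries are added, which is the effect I am asking about.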