Python Crawler Programming Ideas (41): XPath in Practice: Selecting DOM Nodes

2 years ago

source link: https://blog.csdn.net/nokiaguy/article/details/120678672

1. Selecting all nodes

2. Selecting child nodes

3. Selecting parent nodes

1. Selecting all nodes

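An XPath rule that begins with two slashes (//) selects every matching node anywhere in the document, regardless of how deeply it is nested. The rule '//*', for example, selects all element nodes in the entire HTML document.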
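The behaviour of the double-slash rule can be tried out with a minimal sketch like the one below. It is an illustration under several assumptions, not code from the article: the sample markup and variable names are made up, and it uses the standard library's xml.etree.ElementTree, which implements only a subset of XPath. In that subset the rule has to be written relative to an element as './/*', and the result does not include the starting (root) element itself; a full XPath engine evaluating '//*' against the document would also return the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not taken from the article).
HTML = """
<html>
  <head><title>Demo</title></head>
  <body>
    <ul>
      <li><a href="https://example.com/a">A</a></li>
      <li><a href="https://example.com/b">B</a></li>
    </ul>
  </body>
</html>
""".strip()

root = ET.fromstring(HTML)

# './/*' selects every element below the root, at any depth,
# in document order. The root <html> element itself is not included.
all_nodes = root.findall(".//*")
all_tags = [node.tag for node in all_nodes]
print(all_tags)

# Replacing '*' with a tag name keeps the "any depth" behaviour
# but restricts the result to elements with that name.
li_nodes = root.findall(".//li")
print(len(li_nodes))
```

Because ElementTree is an XML parser, this only works on well-formed markup; real-world HTML pages usually need a more forgiving parser before XPath rules can be applied.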
