Source code:
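# coding: UTF-8
from bs4 import BeautifulSoup

# Sample markup (the standard Beautiful Soup documentation example)
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# Find the first <p> tag whose class attribute is "story"
data = soup.find('p', {'class': 'story'})
print(data)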
Output:
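<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>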
Question:
I want to scrape the p tag whose class attribute is story, so why do a tags show up in the result?
The a tags are children of the p tag, so they are printed along with the rest of its content.
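Use soup.find("a", attrs={"class": "sister"}) to get a link on its own, or soup.find("p", class_="story") for the paragraph. Note that no p tag in this document has the class sister, so searching for a p with class_="sister" returns None.
For reference, see https://www.crummy.com/softwa...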
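A minimal sketch of how the class filter behaves, assuming a shortened stand-in for the markup in the question (the html_doc below and the variable names are made up for illustration):

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (an assumption for illustration).
html_doc = """
<p class="story">Once upon a time there were three little sisters:
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a> and
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a>.</p>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Both spellings filter on the class attribute and return the first match.
first_link_attrs = soup.find("a", attrs={"class": "sister"})
first_link_kw = soup.find("a", class_="sister")

# The filter only matches tags that actually carry the class:
# no <p> here has class "sister", so this lookup finds nothing.
missing = soup.find("p", class_="sister")

print(first_link_attrs is first_link_kw)
print(first_link_attrs.get_text())
print(missing)
```

The attrs dictionary and the class_ keyword are interchangeable here; class_ has the trailing underscore because class is a reserved word in Python.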
This is expected behavior: <a> is a child element of the <p> tag.
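If the goal is the paragraph's text without the nested markup, one possible approach is sketched below. It assumes a one-line stand-in for the question's markup, and the variable names are hypothetical:

```python
from bs4 import BeautifulSoup

# One-line stand-in for the question's markup (an assumption for illustration).
html_doc = (
    '<p class="story">Three sisters: '
    '<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a> and '
    '<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a>.</p>'
)

soup = BeautifulSoup(html_doc, "html.parser")
story = soup.find("p", class_="story")

# Printing the tag serializes it together with all of its descendants.
print(story)

# get_text() drops the markup and keeps the text of the tag and its children.
text_only = story.get_text()
print(text_only)

# The nested links are reachable from the paragraph itself.
link_names = [a.get_text() for a in story.find_all("a")]
print(link_names)

# Only the text that sits directly inside <p>, skipping text inside child tags.
direct_text = "".join(story.find_all(string=True, recursive=False))
print(direct_text)
```

find always returns the whole element, children included; choosing between get_text() and the non-recursive string search depends on whether the link text should be kept.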