Question

I can't get this HTML to parse

import requests
from bs4 import BeautifulSoup

headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}

def headrs_response(url):
    response = requests.get(url,headers=headers)
    soup = BeautifulSoup(response.text, 'lxml')
    print(soup)

def main(ad):
    headrs_response(ad)

url = 'https://some-domain.com/'

if __name__ == '__main__':
    main(URL)

Hello,

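It looks like on the last line you are passing URL in capital letters, but the variable is defined as url in lowercase. In Python, variable names are case sensitive, so URL and url are two different names and main(URL) fails with a NameError before any request is sent. Make sure the name in the call matches the definition: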

if __name__ == '__main__':
    main(url)

I’ve tested this and it works for me.
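If you want to see the case sensitivity on its own, without any network request, here is a minimal sketch. The helper lookup_upper and the variable error_name are made up for illustration and are not part of your script:

```python
# Only the lowercase name is defined, as in the original script.
url = 'https://some-domain.com/'

def lookup_upper():
    # Hypothetical helper: refers to URL (upper case), which was never defined.
    return URL

try:
    lookup_upper()
    error_name = None
except NameError as exc:
    # Python treats URL and url as different names, so the lookup fails.
    error_name = type(exc).__name__
    print(f'{error_name}: {exc}')

# The lowercase name resolves normally.
print(url)
```

The lookup of URL only fails when that line actually runs, which is why the error in your script shows up at the main(URL) call at the bottom.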

Best, Bobby