How to Set Request Headers in a Python Crawler: A Detailed Guide - 大盘站

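This article explains in detail how to set request headers in a Python crawler. If you write crawlers, it is worth reading through carefully. Let's skip the small talk and get straight to it!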


The sections below cover five setups in turn: requests, Selenium + Chrome, Selenium + PhantomJS, the Scrapy crawler framework, and asynchronous aiohttp.

(1) Setting request headers with requests:

import requests

url = "http://www.targetweb.com"

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Referer': 'http://www.baidu.com/',
    # The User-Agent must be one string; adjacent literals are joined by Python
    'User-Agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400'),
}

# Send the GET request with the custom headers
res = requests.get(url, headers=headers)

# Downloading images needs a byte stream; request it like this:
# res = requests.get(url, stream=True, headers=headers)
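
When many requests share the same headers, it can be convenient to set them once on a `requests.Session` and override single headers per request. The following is a minimal sketch under that assumption: the shortened User-Agent value and the per-request `Referer` are placeholders picked for illustration, and `prepare_request` is used here only to inspect the merged headers without sending anything over the network.

```python
import requests

url = "http://www.targetweb.com"

# Defaults shared by every request made through this session.
# The User-Agent is shortened here for illustration.
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36",
    "Referer": "http://www.baidu.com/",
})

# Per-request headers are merged over the session defaults.
request = requests.Request("GET", url, headers={"Referer": url})
prepared = session.prepare_request(request)

# Inspect what would be sent; no network traffic happens here.
print(prepared.headers["User-Agent"])
print(prepared.headers["Referer"])

# To actually send it: response = session.send(prepared, timeout=10)
```

With this setup, `session.get(url)` sends the session defaults, and any `headers=` argument passed to an individual call takes precedence for that call only.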

(2) Setting request headers with Selenium + Chrome:

from selenium import webdriver

user_agent = ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
              'Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400')

options = webdriver.ChromeOptions()
options.add_argument('--lang=zh-CN')  # use Chinese as the browser language
options.add_argument('--user-agent=' + user_agent)  # set the User-Agent header
# Pass options=; the older chrome_options= keyword is deprecated
browser = webdriver.Chrome(options=options)
url = "http://www.targetweb.com"
browser.get(url)
browser.quit()

(3) Setting request headers with Selenium + PhantomJS:

# Note: webdriver.PhantomJS is deprecated and was removed in Selenium 4,
# so this example needs a Selenium 3.x installation.
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

des_cap = dict(DesiredCapabilities.PHANTOMJS)
# PhantomJS reads the User-Agent from this capability key
des_cap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400")
browser = webdriver.PhantomJS(desired_capabilities=des_cap)
url = "http://www.targetweb.com"
browser.get(url)
browser.quit()

(4) Setting request headers in the Scrapy framework:

Add the following to settings.py:

DEFAULT_REQUEST_HEADERS = {
    'accept': 'image/webp,*/*;q=0.8',
    'accept-language': 'zh-CN,zh;q=0.8',
    'referer': 'https://www.baidu.com/',
    'user-agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400'),
}

(5) Setting request headers with asynchronous aiohttp:

import asyncio

import aiohttp

url = "http://www.targetweb.com"
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Referer': 'http://www.baidu.com/',
    'User-Agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400'),
}


async def main():
    # Headers passed to ClientSession are sent with every request from this session
    async with aiohttp.ClientSession(headers=headers) as session:
        async with session.get(url) as resp:
            print(resp.status)
            print(await resp.text())


asyncio.run(main())

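That covers the ways to set request headers in a Python crawler. I hope you got something out of the article! If you found it useful, feel free to share it with anyone who might need it.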

