How do you build an asynchronous crawler with Python aiohttp, limit concurrency, and issue GET requests?

2026-05-07 01:51 | 2 views | 0 comments | SEO Resources


This article is 1,012 characters long; estimated reading time is 5 minutes.

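When using `ClientSession`, call `session.get()` and similar methods directly on one shared session; there is no need to create a new `session` inside the loop. A common mistake is wrapping every request in its own `async with aiohttp.ClientSession()`. Each new session starts with an empty connection pool, so TCP connections are opened and closed for every request instead of being reused, and under heavy load this can exhaust local sockets or file descriptors.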

The correct approach is to create the session once for the whole lifetime of the crawler and pass it to every request coroutine:

    import asyncio

    import aiohttp


    async def fetch(session, url):
        # Reuse the shared session; leaving the block releases the connection.
        async with session.get(url) as resp:
            return await resp.text()


    async def main(urls):
        # One session for the whole crawl, shared by every request coroutine.
        async with aiohttp.ClientSession() as session:
            tasks = [fetch(session, url) for url in urls]
            return await asyncio.gather(*tasks)
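The title also asks how to limit concurrency. One common way to do that is an `asyncio.Semaphore` shared by all request coroutines. The following is a minimal sketch under that assumption, not necessarily the approach the full article takes; the names `crawl` and `limit` and the default of 5 are illustrative choices.

```python
import asyncio

import aiohttp


async def fetch(session, url, sem):
    # The semaphore caps how many requests are in flight at the same time.
    async with sem:
        async with session.get(url) as resp:
            return await resp.text()


async def crawl(session, urls, limit=5):
    # `limit` is a hypothetical default, not a recommended value.
    sem = asyncio.Semaphore(limit)
    tasks = [fetch(session, url, sem) for url in urls]
    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*tasks)


async def main(urls):
    # Still a single session for the whole crawl.
    async with aiohttp.ClientSession() as session:
        return await crawl(session, urls, limit=5)
```

To run it, call `asyncio.run(main(urls))` with your own list of URLs. All tasks are created up front, but at most `limit` of them hold the semaphore and talk to the network at once. The `limit` argument of `aiohttp.TCPConnector` is a complementary control that caps open connections at the session level.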

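Note: `session.get()` yields an `aiohttp.ClientResponse` object. Read the body with `await resp.text()` or `await resp.json()`; both are coroutine methods, not attributes. Calling `resp.text()` without `await` returns a coroutine object instead of the body, and Python reports `RuntimeWarning: coroutine 'ClientResponse.text' was never awaited`.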
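To make the point about awaiting the body concrete, here is a small sketch of two helpers, one for text and one for JSON. The function names `fetch_text` and `fetch_json` are hypothetical, and the `raise_for_status()` call is an optional addition that the original text does not mention.

```python
import aiohttp


async def fetch_text(session: aiohttp.ClientSession, url: str) -> str:
    async with session.get(url) as resp:
        # Optional: raise aiohttp.ClientResponseError on 4xx/5xx responses.
        resp.raise_for_status()
        # Wrong: `body = resp.text()` would store a coroutine, not a string.
        return await resp.text()


async def fetch_json(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as resp:
        resp.raise_for_status()
        # json() is also a coroutine method and must be awaited.
        return await resp.json()
```

Both helpers read the body inside the `async with` block, so the data is fully received before the connection is released back to the pool.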


Tags: Python crawler
