Different results between urllib and aiohttp

Question

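So basically I'm trying to get the currently playing track straight from an online radio stream link (example: http://air.radiorecord.ru:8101/rr_320).

First I found something online written with urllib, but my application is asynchronous, so I need to use aiohttp. With urllib it works fine, while with aiohttp it sometimes finds nothing. Please help :(

Before: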

import re
import struct
import urllib.request

def get_now(self, session):
    try:
        # Ask the server to interleave ICY metadata with the audio data
        request = urllib.request.Request(self.data.get('url'), headers={'Icy-MetaData': 1})
        response = urllib.request.urlopen(request)

        # Number of audio bytes between two metadata blocks
        metaint = int(response.headers['icy-metaint'])
        for _ in range(10):  # title may be empty initially, try several times
            response.read(metaint)  # skip to metadata
            metadata_length = struct.unpack('B', response.read(1))[0] * 16  # length byte
            metadata = response.read(metadata_length).rstrip(b'\0')

            # extract title from the metadata
            m = re.search(br"StreamTitle='([^']*)';", metadata)
            if m:
                title = m.group(1)
                if title:
                    break
        else:
            # the loop finished without finding a non-empty title
            return "No title found"
        return title.decode('utf8', errors='replace')
    except Exception:
        return "No title found"

After:

async def get_now(self, session):
    
    async with session.get(self.stream_url, headers={'Icy-MetaData': "1"}) as resp:
        
        content = resp.content

        metadata = resp.headers
        metaint = int(metadata['icy-metaint'])

        for _ in range(30):
            await content.read(metaint)
            metadata_length = struct.unpack('B', await content.read(1))[0] * 16  # length byte
            metadata = (await content.read(metadata_length)).rstrip(b'\0')

            m = re.search(br"StreamTitle='([^']*)';", metadata)
            if m:
                title = m.group(1)
                if title:
                    return title.decode('utf8', errors='replace')
                else:
                    return "No title found"
            

        return "Nothing found"

Answer 1

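The snippet below detects the current track every time (in about 400 ms). Instead of skipping ahead to the metadata block and parsing only that part, it searches each whole chunk as it is read: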

import aiohttp
import asyncio
import re


async def get_now(stream_url, session):
    headers={"Icy-MetaData": "1"}
    async with session.get(stream_url, headers=headers) as resp:
        for _ in range(10):
            data = await resp.content.read(8192)
            m = re.search(br"StreamTitle='([^']*)';", data.rstrip(b"\0"))
            if m:
                title = m.group(1)
                if title:
                    return title.decode("utf8", errors="replace")
                else:
                    return "No title found"
    return "Nothing found"


async def get_track():
    stream_url = "http://air.radiorecord.ru:8101/rr_320"
    # The context manager closes the session even if get_now raises
    async with aiohttp.ClientSession() as session:
        result = await get_now(stream_url, session)
    print(f"result: {result}")


asyncio.run(get_track())

Result on my machine (CPU usage is very low, even on a fairly old CPU: i7-3517U):

[ionut@ionut-pc ~]$ time python test.py 
result: Record Club - Nejtrino & Baur

real    0m0.401s
user    0m0.198s
sys     0m0.031s
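
If you would rather keep the skip-to-metadata logic from the urllib version, a likely reason the aiohttp port sometimes finds nothing is that `resp.content.read(n)` returns up to n bytes (whatever has already arrived), whereas `response.read(n)` on a urllib response keeps reading until it has n bytes. One short read shifts every later offset, so the "length byte" ends up being a random audio byte. Below is a minimal sketch, not tested against the stream above, that uses `readexactly` instead. `read_title` and `TITLE_RE` are names made up for this example, and `session` is assumed to be an `aiohttp.ClientSession`. Unlike scanning raw 8192-byte chunks, it also cannot miss a `StreamTitle` that happens to straddle two reads.

```python
import asyncio
import re

TITLE_RE = re.compile(rb"StreamTitle='([^']*)';")


async def read_title(reader, metaint, attempts=10):
    """Return the first non-empty StreamTitle, or None if none shows up.

    `reader` is any object with an awaitable readexactly(n),
    for example aiohttp's resp.content.
    """
    for _ in range(attempts):
        # Skip exactly one block of audio data
        await reader.readexactly(metaint)
        # The length byte counts 16-byte units; 0 means "no metadata this time"
        length = (await reader.readexactly(1))[0] * 16
        block = (await reader.readexactly(length)).rstrip(b"\0")
        m = TITLE_RE.search(block)
        if m and m.group(1):
            return m.group(1).decode("utf8", errors="replace")
    return None


async def get_now(stream_url, session):
    # session: an aiohttp.ClientSession created by the caller
    async with session.get(stream_url, headers={"Icy-MetaData": "1"}) as resp:
        metaint = int(resp.headers["icy-metaint"])
        try:
            title = await read_title(resp.content, metaint)
        except asyncio.IncompleteReadError:
            # The stream ended before a full block could be read
            title = None
    return title or "No title found"
```

`get_now` is called the same way as in the snippet above, for example `await get_now(stream_url, session)` inside `get_track`.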

