Parsing Watermark-Free Direct Video Links from Xiaohongshu
source link: https://iecho.cc/2024/03/03/decode-xiaohongshu-video-url/
Take this note as an example; its note ID is 65e2c4fb00000000030367bd.
Getting the watermarked video URL
Find the page's Open Graph video tag, <meta name="og:video" content="">. Its content attribute holds the watermarked video URL, in the following format:
https://sns-video-hw.xhscdn.com/stream/110/259/01e5e2b96c2b17eb010371038dfdd2b1c0_259.mp4
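As a minimal sketch of this step, the tag can be read with Python's standard html.parser instead of a regular expression. The names OgVideoParser and extract_og_video are hypothetical, chosen only for illustration, and the sample HTML is a made-up page fragment built from the URL format shown above.

```python
from html.parser import HTMLParser
from typing import Optional


class OgVideoParser(HTMLParser):
    """Collects the content attribute of the first og:video meta tag."""

    def __init__(self) -> None:
        super().__init__()
        self.video_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.video_url is not None:
            return
        attr_map = dict(attrs)
        # The article's page uses name="og:video". Accepting property="og:video"
        # as well is an assumption, since Open Graph tags commonly use it.
        if "og:video" in (attr_map.get("name"), attr_map.get("property")):
            self.video_url = attr_map.get("content")


def extract_og_video(html: str) -> Optional[str]:
    """Return the og:video URL found in html, or None if the tag is absent."""
    parser = OgVideoParser()
    parser.feed(html)
    return parser.video_url


# Hypothetical page fragment for demonstration only.
sample_html = (
    '<html><head><meta name="og:video" content="https://sns-video-hw.xhscdn.com/'
    'stream/110/259/01e5e2b96c2b17eb010371038dfdd2b1c0_259.mp4"></head></html>'
)
print(extract_og_video(sample_html))
```

A parser tolerates attribute reordering and extra whitespace that a fixed regular expression would miss, at the cost of a few more lines.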
Getting the watermark-free video URL
Search the page source for originVideoKey and find the following JSON field:
{
  "originVideoKey":"spectrum\u002F1040g35830vr3bg1860005o6qr60o57fr83a7isg"
}
The \u002F here is the Unicode escape for /, which you can decode with the jq command.
~ % echo '{"originVideoKey":"spectrum\u002F1040g35830vr3bg1860005o6qr60o57fr83a7isg"}' | jq
{
  "originVideoKey": "spectrum/1040g35830vr3bg1860005o6qr60o57fr83a7isg"
}
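If jq is not at hand, the same decoding can be sketched in Python, since any JSON parser resolves the \u002F escape. The variable names raw and decoded are illustrative only.

```python
import json

# A raw string keeps the backslash escape exactly as it appears in the page source.
raw = r'{"originVideoKey":"spectrum\u002F1040g35830vr3bg1860005o6qr60o57fr83a7isg"}'

# json.loads interprets \u002F as the "/" character while parsing the string value.
decoded = json.loads(raw)
print(decoded["originVideoKey"])
```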
Then append it to https://sns-video-bd.xhscdn.com/ to get the watermark-free video URL:
https://sns-video-bd.xhscdn.com/spectrum/1040g35830vr3bg1860005o6qr60o57fr83a7isg
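The search, decode, and append steps can be combined in one small function. This is a sketch under assumptions: the function name no_watermark_url and the pattern KEY_PATTERN are hypothetical, and the sample string wraps the field in a made-up "video" object only to show that the pattern does not depend on the surrounding JSON.

```python
import json
import re
from typing import Optional

VIDEO_HOST = "https://sns-video-bd.xhscdn.com/"

# Captures the quoted, still-escaped JSON string that follows "originVideoKey":
KEY_PATTERN = re.compile(r'"originVideoKey"\s*:\s*("(?:[^"\\]|\\.)*")')


def no_watermark_url(page_source: str) -> Optional[str]:
    """Build the watermark-free URL from page source, or return None if no key is found."""
    match = KEY_PATTERN.search(page_source)
    if match is None:
        return None
    # Parsing the captured JSON string literal decodes \u002F to "/".
    key = json.loads(match.group(1))
    return VIDEO_HOST + key.lstrip("/")


# Hypothetical source fragment; the outer "video" object is an assumption.
sample = r'{"video":{"originVideoKey":"spectrum\u002F1040g35830vr3bg1860005o6qr60o57fr83a7isg"}}'
print(no_watermark_url(sample))
```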
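Online Xiaohongshu parsing tools return a URL whose path contains 258. It differs from the URL above but is still valid, and I do not know how it is constructed. The two URLs compare as follows: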
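Differing characters: the last digit of the path segment (259 vs 258), the last four characters of the file name (b1c0 vs 39f3), and the last digit of the suffix (_259 vs _258)
Watermarked:    https://sns-video-hw.xhscdn.com/stream/110/259/01e5e2b96c2b17eb010371038dfdd2b1c0_259.mp4
Watermark-free: https://sns-video-hw.xhscdn.com/stream/110/258/01e5e2b96c2b17eb010371038dfdd239f3_258.mp4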
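The comparison can be reproduced with a short script that lists the character positions where the two URLs differ and prints an aligned marker line. The helper names diff_positions and marker_line are hypothetical, and the script only locates the differences; it does not explain how the 258 URL is derived.

```python
from typing import List

watermarked = "https://sns-video-hw.xhscdn.com/stream/110/259/01e5e2b96c2b17eb010371038dfdd2b1c0_259.mp4"
third_party = "https://sns-video-hw.xhscdn.com/stream/110/258/01e5e2b96c2b17eb010371038dfdd239f3_258.mp4"


def diff_positions(a: str, b: str) -> List[int]:
    """Return the zero-based indices at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def marker_line(a: str, b: str) -> str:
    """Return a line with '^' under every differing character."""
    return "".join("^" if x != y else " " for x, y in zip(a, b)).rstrip()


print(watermarked)
print(third_party)
print(marker_line(watermarked, third_party))
print(diff_positions(watermarked, third_party))
```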
A simple Python script
import json
import re

import requests

link = 'https://www.xiaohongshu.com/explore/65e2c4fb00000000030367bd'


def work(url: str) -> dict:
    """Return the watermarked and watermark-free video URLs for a note page."""
    r = requests.get(url)
    if r.status_code != 200:
        print(f"status code: {r.status_code}")
        return {
            "url_with_watermark": None,
            "url_without_watermark": None
        }

    # The og:video meta tag holds the watermarked URL.
    matches = re.findall(r'<meta name="og:video" content="(.*?)">', r.text)
    url_with_watermark = matches[0] if matches else None

    # json.loads decodes the \u002F escape inside originVideoKey.
    key = re.findall(r'\{"originVideoKey":".*?"\}', r.text)
    if key:
        url_without_watermark = "https://sns-video-bd.xhscdn.com/" + json.loads(key[0])["originVideoKey"]
    else:
        url_without_watermark = None

    return {
        "url_with_watermark": url_with_watermark,
        "url_without_watermark": url_without_watermark
    }


print(work(link))