Importing Cookies from a Firefox Profile in Python (Shallow Thoughts)

I wrote at length about my explorations into Selenium to fetch stories from the New York Times (as a subscriber). But I mentioned in Part III that there was a much easier way to fetch those stories, as long as the stories didn't need JavaScript.

That way is to use normal file fetching (using urllib or requests), but with a CookieJar object containing the cookies from a Firefox session where I'd logged in.

FeedMe was already using a CookieJar, since some sites die or go into infinite loops if they can't set cookies. The jar started out empty and just let each site write cookies as it saw fit.

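from http.cookiejar import CookieJar
import urllib.request, urllib.error, urllib.parse

# Start with an empty jar; each site adds whatever cookies it wants.
cookiejar = CookieJar()

# url is whatever page is being fetched.
request = urllib.request.Request(url)
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookiejar))
response = opener.open(request, timeout=100)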

FeedMe uses the built-in urllib rather than requests, because the code is old, and since urllib works fine, I've never gotten around to rewriting it. But it's even easier with requests:

response = requests.get(url, cookies=cookiejar)

That just left importing cookies from a Mozilla profile.

http.cookiejar includes a class called MozillaCookieJar. So it sounds like the functionality is already there, right?

Well, no. From the http.cookiejar documentation:

class http.cookiejar.MozillaCookieJar(filename, delayload=None, policy=None)
A FileCookieJar that can load from and save cookies to disk in the Mozilla cookies.txt file format (which is also used by the Lynx and Netscape browsers).
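
To make that concrete, here is a minimal sketch of what MozillaCookieJar does handle: a tab-separated cookies.txt file. The helper name load_cookies_txt, the example.com domain, and the cookie values are made up for illustration; only MozillaCookieJar and its load() method come from the standard library.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

# One cookie in the old cookies.txt layout. Fields are tab-separated:
# domain, domain-specified flag, path, secure flag, expiry, name, value.
# The domain and values are hypothetical; the expiry is 2100-01-01 UTC.
COOKIES_TXT = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t4102444800\tsession\tabc123\n"
)


def load_cookies_txt(text):
    """Write text to a temporary cookies.txt and load it into a jar."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "cookies.txt")
        with open(path, "w") as fp:
            fp.write(text)
        jar = MozillaCookieJar(path)
        jar.load()  # parses the file named in the constructor
    return jar


jar = load_cookies_txt(COOKIES_TXT)
for cookie in jar:
    print(cookie.domain, cookie.name, cookie.value)
```

The first line of the file matters: load() refuses anything that doesn't start with the cookie-file header comment.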

Firefox stopped using the cookies.txt format around 2008, as best I can determine, when they switched to using cookies.sqlite instead. There was a bug on MozillaCookieJar filed back then on the issue, with a patch, but the bug was rejected because the Python 2.6/3.0 release was about to happen, and the bug was closed at that time rather than merely being postponed. I filed a new bug hoping to re-raise the issue.

But meanwhile, the only way to use a MozillaCookieJar is to write code to read the sqlite file and translate it to the old cookies.txt format. The best code I've found for doing that comes from a 2009 blog post, "Reading Firefox 3.x cookies in Python", which I found via a StackOverflow thread, "Accessing Firefox 3 cookies in Python". The code is in both places, so I needn't repeat it here.

The method is a little squinchy, using a StringIO to emulate a cookies.txt file, but it works fine, at least until such time as someone sees fit to replace the almost 15 years out of date MozillaCookieJar code with something that actually works.
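
For readers who just want the shape of that approach, here is a rough Python 3 sketch of the same idea, not the code from either of those posts. It assumes the profile's cookies.sqlite has a moz_cookies table with host, path, isSecure, expiry, name and value columns, and it leans on _really_load(), a private MozillaCookieJar method that could change between Python versions. The function name firefox_cookiejar is my own. Firefox may keep the database locked while it's running, so pointing this at a copy of the file is probably the safer bet.

```python
import sqlite3
from io import StringIO
from http.cookiejar import MozillaCookieJar


def firefox_cookiejar(sqlite_path):
    """Return a MozillaCookieJar filled from a Firefox cookies.sqlite file.

    Assumes a moz_cookies table with the columns named in the query below.
    """
    con = sqlite3.connect(sqlite_path)
    try:
        rows = con.execute(
            "SELECT host, path, isSecure, expiry, name, value FROM moz_cookies"
        ).fetchall()
    finally:
        con.close()

    # Build an in-memory stand-in for a cookies.txt file.
    buf = StringIO()
    buf.write("# Netscape HTTP Cookie File\n")
    for host, path, is_secure, expiry, name, value in rows:
        fields = [
            host,
            # The parser expects this flag to match a leading dot on the host.
            "TRUE" if host.startswith(".") else "FALSE",
            path,
            "TRUE" if is_secure else "FALSE",
            str(expiry),
            name,
            value,
        ]
        buf.write("\t".join(fields) + "\n")
    buf.seek(0)

    jar = MozillaCookieJar()
    # Private method: (file object, filename for error messages,
    # ignore_discard, ignore_expires).
    jar._really_load(buf, sqlite_path, True, True)
    return jar
```

The returned jar can then be handed to HTTPCookieProcessor or to requests.get(url, cookies=jar), just like the empty CookieJar earlier.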

Tags: programming, python, cookies, firefox, scraping
[ 12:22 Dec 03, 2021 ]