
Python Compare Wikipedia Pages

 3 years ago
source link: https://deparkes.co.uk/2020/12/27/python-compare-wikipedia-pages/

Wikimedia has an API which lets you compare Wikipedia pages, and in some cases modify pages and information within the Wikimedia group. The main page for all Wikimedia API information is here:

In this post I am most interested in the Wikipedia compare API, and in showing how to use it to see the differences between versions of a Wikipedia page. The documentation for this part of the Wikipedia API is here:

Compare Wikipedia Pages

Following the example in the API documentation, it is fairly easy to use the API to return the diff between two different pages, in this case Gold and Silver.

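import requests

# Assumption: api_url was not defined in the snippet; this is the English Wikipedia API endpoint
api_url = "https://en.wikipedia.org/w/api.php"

session = requests.Session()
PARAMS = {
    'action': "compare",
    'format': "json",
    'fromtitle': 'Gold',
    'totitle': 'Silver'
}
response = session.get(url=api_url, params=PARAMS)
response_json = response.json()
# Print each field of the 'compare' result, one per line
for key, value in response_json['compare'].items():
    print(key, ' : ', value)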

This code prints every field of the ‘compare’ dictionary in the API's JSON response. The diff of the pages is stored under the ‘*’ key of that dictionary, as a fragment of HTML table rows.
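Because the diff comes back as HTML, it can be hard to read when printed directly. The following is a minimal sketch of one way to pull out just the added and deleted text using the standard library's html.parser. The class DiffTextExtractor and the function summarize_diff are hypothetical names introduced here, and the CSS class names diff-addedline and diff-deletedline are an assumption about the markup MediaWiki uses for diff cells. The sample input is hand-written for illustration, not real API output.

```python
from html.parser import HTMLParser


class DiffTextExtractor(HTMLParser):
    """Collect text from the added and deleted cells of a diff fragment."""

    # Assumed CSS class names on the <td> cells of a MediaWiki diff.
    CELL_KINDS = {"diff-addedline": "added", "diff-deletedline": "deleted"}

    def __init__(self):
        super().__init__()
        self.added = []
        self.deleted = []
        self._kind = None    # which list the current <td> feeds, if any
        self._buffer = []    # text pieces seen inside the current <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            classes = (dict(attrs).get("class") or "").split()
            for name in classes:
                if name in self.CELL_KINDS:
                    self._kind = self.CELL_KINDS[name]
                    self._buffer = []

    def handle_data(self, data):
        if self._kind is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._kind is not None:
            text = "".join(self._buffer).strip()
            if text:
                getattr(self, self._kind).append(text)
            self._kind = None


def summarize_diff(response_json):
    """Return (added, deleted) lists of text from a compare response dict."""
    parser = DiffTextExtractor()
    parser.feed(response_json["compare"]["*"])
    parser.close()
    return parser.added, parser.deleted


# Hand-written sample in the shape of a response, not real API output.
sample = {
    "compare": {
        "*": '<tr><td class="diff-deletedline"><div>Gold is <del>soft</del>.</div></td>'
             '<td class="diff-addedline"><div>Gold is <ins>dense</ins>.</div></td></tr>'
    }
}
added, deleted = summarize_diff(sample)
print("added:", added)
print("deleted:", deleted)
```

With a real request, summarize_diff would be called on the parsed JSON returned by response.json() instead of the sample dictionary.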

Comparing Page Versions

The API also makes it possible to compare different revisions of the same (or different) pages. Usage is similar to comparing separate Wikipedia pages, except that we specify individual revision IDs (which are unique across all pages) rather than page titles.

An example of the comparison viewed via the webpage is here.

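import requests

# Assumption: api_url was not defined in the snippet; this is the English Wikipedia API endpoint
api_url = "https://en.wikipedia.org/w/api.php"

session = requests.Session()
PARAMS = {
    'action': "compare",
    'format': "json",
    'fromrev': 989715533,  # revision ID to compare from
    'torev': 989941568     # revision ID to compare to
}
response = session.get(url=api_url, params=PARAMS)
response_json = response.json()
# Print each field of the 'compare' result, one per line
for key, value in response_json['compare'].items():
    print(key, ' : ', value)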
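The two examples differ only in which parameters they send, so the request setup can be shared. The sketch below shows one possible way to do that, and to fail with a clear message when the response has no ‘compare’ key. The helper names build_compare_params and get_compare_result are hypothetical, and the error handling assumes the API reports failures under an ‘error’ key with ‘code’ and ‘info’ fields. The block makes no network calls itself; it only prepares the parameters and checks a response dictionary.

```python
def build_compare_params(from_rev=None, to_rev=None, from_title=None, to_title=None):
    """Build query parameters for action=compare from revision IDs or titles."""
    params = {"action": "compare", "format": "json"}

    # "From" side: prefer a revision ID, fall back to a page title.
    if from_rev is not None:
        params["fromrev"] = from_rev
    elif from_title is not None:
        params["fromtitle"] = from_title
    else:
        raise ValueError("need from_rev or from_title")

    # "To" side: same rule.
    if to_rev is not None:
        params["torev"] = to_rev
    elif to_title is not None:
        params["totitle"] = to_title
    else:
        raise ValueError("need to_rev or to_title")

    return params


def get_compare_result(response_json):
    """Return the 'compare' dict, or raise if the API reported an error."""
    if "error" in response_json:
        err = response_json["error"]
        raise RuntimeError(f"{err.get('code')}: {err.get('info')}")
    return response_json["compare"]


# Parameters for the revision comparison used above.
params = build_compare_params(from_rev=989715533, to_rev=989941568)
print(params)
```

The resulting dictionary can be passed as params to session.get in place of PARAMS, and get_compare_result can be applied to response.json() before looping over the fields.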
