How to Download online web pages as PDF with Percollate
source link: https://computingforgeeks.com/how-to-download-online-web-pages-as-pdf-with-percollate/
Have you ever wondered how you can download web pages as PDF files from your Linux terminal? This guide shows how to use the Percollate command-line tool to download online web pages as beautifully formatted PDF files.
How Percollate works
Here is how Percollate works:
- Fetch the page(s) using got
- Enhance the DOM using jsdom
- Pass the DOM through Mozilla/readability to strip unnecessary elements
- Apply the HTML template and the print stylesheet to the resulting HTML
- Use puppeteer to generate a PDF from the page
How to install Percollate in Linux
Percollate needs Node.js version 8 or later installed on your local system, as it uses new(ish) JavaScript syntax. Install Node.js using our guide:
How to run multiple versions of Node.js on Linux
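Before installing, it can help to confirm that the Node.js on your PATH meets the version requirement. The following is a minimal sketch, not part of Percollate itself; the helper name node_version_ok is hypothetical, and it only compares the major version number against 8.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a Node.js version string (e.g. "v10.16.3")
# has a major version of at least 8, the minimum stated in this guide.
node_version_ok() {
    version="${1#v}"          # drop the leading "v" that `node --version` prints
    major="${version%%.*}"    # keep everything before the first dot
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # not a plain number: treat as unsupported
    esac
    [ "$major" -ge 8 ]
}

# Only query Node.js if it is actually installed.
if command -v node >/dev/null 2>&1; then
    if node_version_ok "$(node --version)"; then
        echo "Node.js $(node --version) is new enough for Percollate"
    else
        echo "Node.js $(node --version) is too old; version 8 or later is needed"
    fi
else
    echo "Node.js is not installed"
fi
```

If the check reports an older version, upgrade Node.js first and then continue with the installation steps.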
Once Node.js is installed, you can proceed to install Percollate globally using either yarn or npm.
For npm use:
npm install -g percollate
For yarn, use:
yarn global add percollate
Check the installed version by running:
$ percollate --version
0.2.10
To view the help page, use:
$ percollate --help
Usage: percollate [options] [command]

Options:
  -V, --version             output the version number
  -h, --help                output usage information

Commands:
  pdf [options] [urls...]   Bundle web pages as a PDF file
  epub [options] [urls...]  Bundle web pages as an EPUB file
  html [options] [urls...]  Bundle web pages as a HTML file
Updating Percollate
To keep the package up-to-date, you can run:
$ npm install -g percollate
or
$ yarn global upgrade --latest percollate
Using Percollate
The basic commands available are:
- percollate pdf: Bundles one or more web pages into a PDF
- percollate epub: Bundles one or more web pages into an EPUB file
- percollate html: Bundles one or more web pages into an HTML file
Available options are:
- -o, --output: The path of the resulting bundle; when omitted, the output file name is derived from the title of the web page.
- --individual: Export each web page as an individual file.
- --template: Path to a custom HTML template.
- --style: Path to a custom CSS stylesheet.
- --css: Additional CSS styles you can pass from the command line to override the default/custom stylesheet styles.
Examples
Transform a single web page to PDF:
percollate pdf --output filename.pdf https://example.com
To bundle several web pages into a single PDF, specify them as separate arguments to the command:
percollate pdf --output filename.pdf https://example.com/page1 https://example.com/page2
You can use common Unix commands and keep the list of URLs in a newline-delimited text file:
cat urls.txt | xargs percollate pdf --output filename.pdf
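If the list file contains blank lines or notes, xargs would pass them on to Percollate as arguments. A small sketch that filters them out first is shown here; clean_urls is a hypothetical helper name, and treating lines that start with "#" as comments is an assumed convention for urls.txt, not something Percollate defines.

```shell
# Hypothetical helper: read a URL list on stdin and print only the usable lines,
# dropping blank lines and lines starting with "#" (an assumed comment convention).
clean_urls() {
    grep -Ev '^[[:space:]]*(#|$)'
}

# Intended use (requires percollate to be installed):
#   clean_urls < urls.txt | xargs percollate pdf --output filename.pdf
```

The filter leaves the order of the URLs unchanged, so the pages should appear in the bundle in the same order as in the file.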
To transform several web pages into individual PDF files at once, use the --individual flag:
percollate pdf --individual --output some.pdf https://example.com/page1 https://example.com/page2
Set Custom page size / margins
The default page size is A5 (portrait), but you can use the --css option to override it with any supported CSS size:
percollate pdf --output some.pdf --css "@page { size: A3 landscape }" http://example.com
Similarly, you can define:
- Custom margins: @page { margin: 0 }
- The base font size: html { font-size: 10pt }
Or any other style defined in the default/custom stylesheet.
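Several of these overrides can be combined in a single --css value. The sketch below assembles such a string from a page size, a margin, and a base font size; build_css is a hypothetical helper name, and the values passed to it are only examples.

```shell
# Hypothetical helper: build a CSS string for the --css option from
# a page size ($1), a page margin ($2), and a base font size ($3).
build_css() {
    printf '@page { size: %s; margin: %s } html { font-size: %s }' "$1" "$2" "$3"
}

# Intended use (requires percollate to be installed):
#   percollate pdf --output some.pdf --css "$(build_css 'A4 portrait' '1cm' '10pt')" https://example.com
```

Quoting the command substitution keeps the whole stylesheet together as one argument to --css.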
Thanks for using our guide to download web pages as PDF files.