Since the early days of web scraping and web crawling, generating screenshots has been a challenge for developers. With the release of Google's V8 engine and Node.js, things became more developer-friendly. In recent years a few JavaScript tools have been popular for scraping and screenshot generation, e.g. CasperJS, PhantomJS and Cheerio. Puppeteer is a recent addition to this list.
Puppeteer is a Node.js library that provides a high-level API to control headless Chrome over the DevTools Protocol. It can also be configured to run full (non-headless) Chrome in desktop mode. In short, Puppeteer gives you remote programmatic control of Google Chrome and Chromium for content scraping, screenshot generation, HTML-to-PDF generation, automated testing and much more.
Today we will install Puppeteer on Ubuntu 18.04 LTS in a few easy steps. We assume you don't have Node.js installed on your system; if you already do, skip to Step 2.
Step 1: Install nodejs & npm
# install curl if not installed
$ sudo apt install curl
# install node 10.x repository to the system
$ curl -sL https://deb.nodesource.com/setup_10.x | sudo -E bash -
# download & install nodejs 10.x along with npm
$ sudo apt install nodejs
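To confirm that Step 1 worked, a small script can report the Node.js version it runs under. The following is only a minimal sketch: the file name check-node.js, the helper majorVersion and the minimum of 10 are illustrative assumptions, the last chosen simply because the step above installs Node.js 10.x.

```javascript
// check-node.js (hypothetical file name)
// Minimum major version assumed here only because Step 1 installs Node.js 10.x.
const MIN_MAJOR = 10;

// Extract the major version number from a string such as "10.16.0" or "v10.16.0".
function majorVersion(version) {
  return parseInt(version.replace(/^v/, '').split('.')[0], 10);
}

// process.versions.node holds the running Node.js version without the "v" prefix.
const current = process.versions.node;

if (majorVersion(current) >= MIN_MAJOR) {
  console.log(`Node.js ${current} is installed and looks recent enough.`);
} else {
  console.log(`Node.js ${current} is older than ${MIN_MAJOR}.x; consider re-running Step 1.`);
}
```

Save it anywhere and run it with: node check-node.js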
Step 2: Install puppeteer
$ mkdir your-project
$ cd your-project
$ npm install puppeteer
The above commands create a project directory and install Puppeteer into it. Once the installation is done, let's write our first program to generate a screenshot of a given page.