Web Scraping with Puppeteer and Nodejs

Puppeteer is a nodejs library used to scrape data from website, automate browser's task and also for testing your app . Super easy to use.

Installing Pupepeteer ->

Install Nodejs if not present. -Run the following command in terminal

npm i puppeteer 
 //or
yarn add puppeteer

Getting started-> -In this tutorial we will code an app for taking screenshot of the page!

-Create a basic template for nodejs.

npm init -y

Create index.js in the root directory.
Import puppeteer using const puppeteer = require('puppeteer-core'); in index.js file

Code for Index js

const puppeteer = require('puppeteer');

//async function as we are using await

(async () => {

 // launch your browser in background
  const browser = await puppeteer.launch();

 // load a new page
  const page = await browser.newPage();

// go to url -> https://example.com
  await page.goto('https://example.com');

//take screenshot of the page 
//image will be saved in root directory
// with filename example.png
//will take screenshot of the page that is visible in browser's window
  await page.screenshot({path: 'example.png'});

// use this instead of above statement for full page screenshot
await page.screenshot({path: 'example.png',fullpage:'true'});

//for closing the browser
// tip-> always close browser after finishing all tasks
  await browser.close();
})();

For avoiding any error always use try and catch inside the async function

try{
//code 
}
catch(error)
{
console.log(error);
}

And finally for running your code use node index in your terminal

Web Scraping with Puppeteer and Nodejs

Did you find this article valuable?