Get People Also Ask (PAA) questions from Google SERPs with Puppeteer (2019)

jpigla/PAAs-from-SERPs

Get PAAs from Google SERPs

⚠ Disclaimer

This software is not authorized by Google and doesn't follow Google's robots.txt. Scraping without Google's explicit written permission is a violation of their terms and conditions on scraping and can potentially lead to a lawsuit.

Requirements

Local Environment

NPM-Packages

Installation

  1. Download the latest project release, extract it and (if desired) move the folder to your home directory
  2. Check whether Node and NPM are already installed. Open Terminal and ...
  • type node -v in Terminal to check the NodeJS version number (and whether it is installed)
  • type npm -v in Terminal to check the NPM version number (and whether it is installed)
  • if not, install Homebrew (from https://brew.sh/index_de; Mac) and then NodeJS with brew update && brew install node
  3. In Terminal, move to the project folder (type cd folder/ if you named the project folder "folder")
  4. Install the required NPM packages: type npm install in Terminal

Usage

Type npm run scraper -- --help for help (or read on).

Run the script with arguments using one of the following commands:

  • npm run scraper -- --clicks=[0-2/max] --kw=[...] --lang=[de/en] (--output=csv)
  • node get_paas.js --clicks=[0-2/max] --kw=[...] --lang=[de/en] (--output=csv)

Arguments

  • --clicks=[0-2/max] : how many times to click on newly appearing questions [0-2/max] (be patient when using max, ~3min)
  • --kw=[...] : the keyword (search term), or "keywords" for batch mode (reads keywords line by line from keywords.txt)
  • --lang=[de/en] : choose the language of the Google search [de/en]
  • --output=csv : (optional) to export list of questions
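
To illustrate how arguments of the form --key=value can be read, here is a minimal sketch. It is not the code from get_paas.js: the function name parseArgs and the validation rules are assumptions for illustration, and only the default language (de) is taken from this README.

```javascript
// Hypothetical helper, not the parser used by get_paas.js.
const ALLOWED_CLICKS = ['0', '1', '2', 'max'];
const ALLOWED_LANGS = ['de', 'en'];

function parseArgs(argv) {
  // The README documents "de" as the default language.
  const options = { lang: 'de' };

  for (const arg of argv) {
    // Accept "--key=value" and bare flags such as "--help".
    const match = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (!match) continue; // ignore anything else
    const [, key, value] = match;
    options[key] = value === undefined ? true : value;
  }

  if (options.clicks !== undefined && !ALLOWED_CLICKS.includes(options.clicks)) {
    throw new Error(`--clicks must be one of ${ALLOWED_CLICKS.join(', ')}`);
  }
  if (!ALLOWED_LANGS.includes(options.lang)) {
    throw new Error(`--lang must be one of ${ALLOWED_LANGS.join(', ')}`);
  }
  return options;
}
```

In a script this would typically be called as parseArgs(process.argv.slice(2)), so that the node binary and the script path are skipped.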

Examples

  • npm run scraper
    • -- --clicks=max --kw=firefox --output=csv --lang=en
    • -- --clicks=0 --kw=angela+merkel --lang=de
    • -- --clicks=0 --output=csv --kw=keywords --lang=en (batch mode)
  • node get_paas.js
    • --clicks=max --kw=firefox --output=csv --lang=en
    • --clicks=0 --kw=angela+merkel --lang=de
    • --clicks=0 --output=csv --kw=keywords --lang=en (batch mode)
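
For batch mode, keywords.txt is read line by line. The sketch below shows one way to turn the file contents into a list of keywords. The function name parseKeywords is hypothetical, and skipping blank lines and trimming whitespace are assumptions rather than documented behavior of the script.

```javascript
// Hypothetical helper: turn the contents of keywords.txt into a keyword list.
function parseKeywords(text) {
  return text
    .split(/\r?\n/)                       // handle Unix and Windows line endings
    .map((line) => line.trim())           // drop surrounding whitespace
    .filter((line) => line.length > 0);   // assumption: blank lines are skipped
}

// Possible usage (reads the file synchronously):
//   const fs = require('node:fs');
//   const keywords = parseKeywords(fs.readFileSync('keywords.txt', 'utf8'));
```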

What happens here

  1. Browser goes to https://www.google.com/search?hl=de&gl=DE&ie=utf-8&oe=utf-8&no_sw_cr=1&pws=0&q=[KEYWORD] (default/de)
  2. If clicks is set to 0, the initially found questions are returned
  3. If clicks is greater than 0, the sets of questions that appear after each click are clicked, N times in total (the first set is the 4 initial questions)
  4. Extract all questions from the SERP after clicking is done
  5. Output to CLI and CSV file (if csv argument is given)
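
Steps 1 and 5 can be sketched without a browser. The following is an illustration under assumptions, not the script's actual code: buildSearchUrl and toCsv are hypothetical names, the gl=US value for English is a guess (only the de URL is documented above), and the one-question-per-line CSV layout is assumed.

```javascript
// Hypothetical locale table. Only the "de" values appear in this README;
// the "en" values are an assumption.
const LOCALES = {
  de: { hl: 'de', gl: 'DE' },
  en: { hl: 'en', gl: 'US' },
};

// Step 1: build the search URL for a keyword such as "angela+merkel".
function buildSearchUrl(keyword, lang = 'de') {
  const locale = LOCALES[lang];
  if (!locale) {
    throw new Error(`Unsupported language: ${lang}`);
  }
  // Keep "+" as the word separator and percent-encode everything else.
  const query = keyword
    .split('+')
    .map((part) => encodeURIComponent(part))
    .join('+');
  return (
    'https://www.google.com/search' +
    `?hl=${locale.hl}&gl=${locale.gl}` +
    `&ie=utf-8&oe=utf-8&no_sw_cr=1&pws=0&q=${query}`
  );
}

// Step 5: serialize questions as CSV, one quoted question per line
// (assumed layout). Embedded double quotes are doubled.
function toCsv(questions) {
  return questions
    .map((question) => `"${question.replace(/"/g, '""')}"\n`)
    .join('');
}
```

The clicking and extraction in steps 2 to 4 depend on the markup of the Google results page, which is not described here, so they are left out of this sketch.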

Help & Information

  • If something breaks or errors occur during runtime, please ask Philipp at [email protected].

Changelog

Version 1.1 (15.10.2019)

  • Add npm script
  • Optimize performance
  • Add --help argument
  • Add --lang (language) argument [de/en]
  • Edit readme

Version 1.0 (07.10.2019)

  • Initial upload
  • Working version

License

All assets and code are under the GPL v3 License unless specified otherwise.