nikhil-patil
#puppeteer #rpa #automation #node-js #backend

Building RPA Workflows with Puppeteer in Node.js

How I automated third-party website interactions using Puppeteer to eliminate hours of manual data entry

5 min read

What is RPA?

Robotic Process Automation (RPA) means automating repetitive tasks that a person would otherwise perform by hand in a UI: clicking buttons, filling in forms, extracting data.

Our Use Case

We needed to automatically create user accounts on a third-party website and extract transaction data from it. The site had no API. Puppeteer was the solution.

Account Creation Automation

const puppeteer = require('puppeteer');

async function createAccount(userData) {
  const browser = await puppeteer.launch({ headless: true });

  try {
    const page = await browser.newPage();
    await page.goto('https://third-party-site.com/register');

    // Fill in the form
    await page.type('#firstName', userData.firstName);
    await page.type('#lastName', userData.lastName);
    await page.type('#email', userData.email);
    await page.type('#password', userData.password);

    // Submit and wait for navigation. The wait is registered together with
    // the click so the navigation event cannot be missed.
    await Promise.all([
      page.waitForNavigation(),
      page.click('#submit-btn')
    ]);

    // Extract the new account ID from the success page
    const accountId = await page.$eval('#account-id', el => el.textContent.trim());
    return accountId;
  } finally {
    // Always close the browser, even if one of the steps above throws
    await browser.close();
  }
}

Handling Common Issues

1. Dynamic content — use waitForSelector instead of fixed delays
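
As a minimal sketch of this tip, the helper below waits for an element to appear before reading it, rather than sleeping for a fixed time. The name readWhenReady, the default timeout, and the selector in the usage comment are assumptions for illustration; only waitForSelector and $eval are Puppeteer calls.

```javascript
// Hypothetical helper: wait for a dynamically rendered element, then read its text.
// `page` is expected to be a Puppeteer Page; the 10 s default timeout is illustrative.
async function readWhenReady(page, selector, timeoutMs = 10000) {
  // Resolves once the element is present in the DOM and rejects if the
  // timeout elapses first, so slow pages fail loudly instead of silently.
  await page.waitForSelector(selector, { timeout: timeoutMs });
  return page.$eval(selector, el => el.textContent.trim());
}

// Usage (inside an async function):
//   const accountId = await readWhenReady(page, '#account-id');
```

Compared with a fixed delay, this returns as soon as the element exists and fails with a clear timeout error when it never shows up.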

2. Rate limiting — add random delays between requests

const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));
await delay(Math.random() * 2000 + 1000); // 1-3 second random delay
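
To show how the random delay might fit into a batch of requests, here is a small sketch that runs tasks one at a time and pauses between them. The name runWithJitter and its parameters are hypothetical; the 1 to 3 second defaults mirror the snippet above.

```javascript
const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper: run async tasks sequentially with a random pause between them.
// `tasks` is an array of functions that each return a promise.
async function runWithJitter(tasks, minMs = 1000, maxMs = 3000) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    results.push(await tasks[i]());
    // No pause is needed after the final task.
    if (i < tasks.length - 1) {
      await delay(minMs + Math.random() * (maxMs - minMs));
    }
  }
  return results;
}

// Usage (inside an async function), assuming createAccount from earlier:
//   const ids = await runWithJitter(users.map(u => () => createAccount(u)));
```

Running the tasks sequentially is a deliberate choice here: it keeps at most one request in flight, which is the gentlest option for the target site.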

3. Session management — save cookies to avoid re-logging in

const fs = require('fs/promises'); // promise-based API, so writeFile can be awaited

const cookies = await page.cookies();
await fs.writeFile('session.json', JSON.stringify(cookies));
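
Saving cookies only helps if a later run loads them again. The sketch below shows one way to do that, assuming the session.json format written above. The helper name restoreSession is hypothetical. Newer Puppeteer releases may mark the page-level cookie methods as deprecated in favor of browser-level ones, so check the documentation for your version.

```javascript
const fs = require('fs/promises');

// Hypothetical helper: load cookies saved by an earlier run and apply them to a page.
// Returns true if a session was found and applied, false otherwise.
async function restoreSession(page, sessionPath = 'session.json') {
  let raw;
  try {
    raw = await fs.readFile(sessionPath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return false; // no saved session yet
    throw err;
  }
  const cookies = JSON.parse(raw);
  if (!Array.isArray(cookies) || cookies.length === 0) return false;
  // setCookie takes cookies as separate arguments, so spread the array.
  await page.setCookie(...cookies);
  return true;
}

// Usage (inside an async function), before navigating:
//   const loggedIn = await restoreSession(page);
//   if (!loggedIn) { /* fall back to the normal login flow */ }
```

Restored cookies can still be expired or rejected by the site, so the login flow should remain available as a fallback.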

Lessons Learned

  • Run headless in production, but headed locally so you can watch the browser while debugging
  • Third-party sites change their UI, so build resilient selectors: prefer stable IDs and attributes over layout-dependent paths
  • Queue your automation jobs with SQS to avoid hammering the target site
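
One possible way to make selectors more resilient, sketched below, is to try several candidates in priority order and use the first one that matches. The helper name findFirst and the example selectors are assumptions for illustration, not the selectors from the original project.

```javascript
// Hypothetical helper: try selectors in priority order and return the first match.
// `page` is expected to be a Puppeteer Page.
async function findFirst(page, selectors) {
  for (const selector of selectors) {
    // page.$ resolves to null when nothing matches, so it is safe to probe with.
    const handle = await page.$(selector);
    if (handle) return { selector, handle };
  }
  throw new Error(`No element matched any of: ${selectors.join(', ')}`);
}

// Usage (inside an async function); the selector list is illustrative:
//   const { handle } = await findFirst(page, [
//     '[data-testid="submit"]',
//     '#submit-btn',
//     'button[type="submit"]'
//   ]);
//   await handle.click();
```

Returning the matched selector alongside the handle makes it easy to log which fallback was used, which is an early warning that the site's markup has changed.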