Playwright has become the go-to browser automation framework for production web scraping. It supports Chromium, Firefox, and WebKit with a single API, handles dynamic content natively, and has built-in proxy support. This guide covers everything from basic setup to scaling thousands of concurrent sessions with ZentisLabs proxies.
Why Playwright in 2026?
- Multi-browser: Chromium, Firefox, and WebKit from one codebase
- Auto-wait: Built-in intelligent waiting for elements, eliminating flaky selectors
- Network interception: Block images, fonts, and trackers to save bandwidth
- Stealth-ready: With plugins like playwright-extra, you can evade most bot detection
- Native proxy support: Per-context proxy configuration with authentication
Basic Setup with Proxies
# Install Playwright
pip install playwright
playwright install chromium

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        proxy={
            "server": "http://gate.zentislabs.com:7777",
            "username": "USER",
            "password": "PASS",
        }
    )
    page = browser.new_page()
    page.goto("https://httpbin.org/ip")
    print(page.text_content("body"))
    browser.close()

Stealth Configuration
Modern anti-bot systems like Cloudflare, DataDome, and PerimeterX check browser fingerprints. Here's how to configure Playwright to look like a real user:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        headless=False,  # Headed mode passes more checks
        proxy={"server": "http://gate.zentislabs.com:7777", "username": "USER", "password": "PASS"},
        args=[
            "--disable-blink-features=AutomationControlled",
            "--disable-dev-shm-usage",
        ]
    )
    context = browser.new_context(
        viewport={"width": 1920, "height": 1080},
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
        locale="en-US",
        timezone_id="America/New_York",
    )
    # Remove webdriver property
    context.add_init_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
    page = context.new_page()
    page.goto("https://target-site.com")
    print(page.title())
    browser.close()

Saving Bandwidth
Block unnecessary resources to reduce proxy bandwidth consumption by 40-70%:
# Block images, fonts, stylesheets, media, and trackers
def block_resources(route):
    blocked = ["image", "font", "stylesheet", "media"]
    if route.request.resource_type in blocked:
        route.abort()
    elif any(d in route.request.url for d in ["google-analytics", "facebook", "doubleclick"]):
        route.abort()
    else:
        route.continue_()

# Register the handler on an existing page before navigating
page.route("**/*", block_resources)
page.goto("https://target-site.com")

Scaling with Concurrency
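Capping how many pages are in flight at once is the core idea behind scaling a scraper. As a primer, here is a minimal sketch of that idea in plain asyncio with no browser involved. The names `bounded_gather`, `fake_fetch`, and `demo` are hypothetical and not part of Playwright, and `fake_fetch` only simulates a page load with a short sleep.

```python
import asyncio
from functools import partial


async def fake_fetch(url, stats):
    """Stand-in for a page load (assumption: real code would drive a browser)."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # simulate network latency
    stats["active"] -= 1
    return {"url": url, "title": "stub"}


async def bounded_gather(jobs, limit):
    """Run zero-argument coroutine functions with at most `limit` active at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(job):
        async with semaphore:
            return await job()

    # gather() returns results in the same order as the input jobs
    return await asyncio.gather(*(run_one(job) for job in jobs))


def demo():
    stats = {"active": 0, "peak": 0}
    urls = [f"https://example.com/page/{i}" for i in range(25)]
    jobs = [partial(fake_fetch, url, stats) for url in urls]
    results = asyncio.run(bounded_gather(jobs, limit=10))
    return results, stats["peak"]


if __name__ == "__main__":
    results, peak = demo()
    print(len(results), "pages fetched, peak concurrency:", peak)
```

Unlike fixed batches, a semaphore starts the next job as soon as any slot frees up, so one slow page does not hold back the other nine.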
Use async Playwright with multiple browser contexts for concurrent scraping:
import asyncio
from playwright.async_api import async_playwright

async def scrape_url(browser, url):
    context = await browser.new_context(
        proxy={"server": "http://gate.zentislabs.com:7777", "username": "USER", "password": "PASS"}
    )
    page = await context.new_page()
    await page.goto(url, wait_until="domcontentloaded")
    title = await page.title()
    await context.close()
    return {"url": url, "title": title}

async def main():
    urls = ["https://example.com/page/" + str(i) for i in range(100)]
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            proxy={"server": "http://gate.zentislabs.com:7777", "username": "USER", "password": "PASS"}
        )
        # Process 10 pages at a time
        for batch in [urls[i:i+10] for i in range(0, len(urls), 10)]:
            results = await asyncio.gather(*[scrape_url(browser, u) for u in batch])
            for r in results:
                print(r)
        await browser.close()

asyncio.run(main())

Production Error Handling
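Retry logic in production usually waits between attempts rather than retrying immediately, so a blocked or overloaded target is not hit again right away. A minimal sketch of exponential backoff with jitter follows. The helper name `backoff_delay` and the `base` and `cap` defaults are assumptions chosen for illustration, not values recommended by Playwright or ZentisLabs.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Seconds to wait before retrying after 0-based attempt number `attempt`.

    The ceiling doubles with each attempt (base, 2*base, 4*base, ...) up to
    `cap`, and the delay is drawn uniformly from [0, ceiling] ("full jitter").
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


if __name__ == "__main__":
    for attempt in range(5):
        print(f"attempt {attempt}: wait {backoff_delay(attempt):.2f}s")
```

Inside an async retry loop, this could be used as `await asyncio.sleep(backoff_delay(attempt))` before the next attempt.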
import asyncio
from playwright.async_api import async_playwright, TimeoutError

async def scrape_with_retry(browser, url, max_retries=3):
    for attempt in range(max_retries):
        context = await browser.new_context(
            proxy={"server": "http://gate.zentislabs.com:7777", "username": "USER", "password": "PASS"}
        )
        try:
            page = await context.new_page()
            response = await page.goto(url, timeout=30000)
            if response and response.status == 403:
                print(f"Blocked on attempt {attempt + 1}, retrying...")
                continue
            return await page.content()
        except TimeoutError:
            print(f"Timeout on attempt {attempt + 1}")
        finally:
            await context.close()
    return None

Running in Docker
FROM mcr.microsoft.com/playwright/python:v1.42.0-jammy

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

CMD ["python", "scraper.py"]

🚀 ZentisLabs residential proxies rotate IPs automatically: each new browser context gets a fresh IP. For sticky sessions (multi-page flows), append _session-xyz to your password.
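The sticky-session convention above can be wrapped in a small helper so that rotating and sticky contexts share one code path. This is a sketch under assumptions: `build_proxy` is a hypothetical name, and it treats the `xyz` in `_session-xyz` as an arbitrary session identifier that you choose, which should be confirmed against the provider's documentation.

```python
def build_proxy(username, password, session_id=None,
                server="http://gate.zentislabs.com:7777"):
    """Return a proxy dict in the shape Playwright's `proxy=` option expects.

    With `session_id` set, "_session-<id>" is appended to the password
    (assumed format, taken from the note above) to request a sticky IP.
    """
    if session_id is not None:
        password = f"{password}_session-{session_id}"
    return {"server": server, "username": username, "password": password}


if __name__ == "__main__":
    print(build_proxy("USER", "PASS"))              # rotating IP per context
    print(build_proxy("USER", "PASS", "checkout1"))  # sticky session
```

The returned dict can be passed as `proxy=build_proxy(...)` to `browser.new_context()`, reusing the same `session_id` for every context in a multi-page flow.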
