Browser-Use Tutorial: AI-Powered Browser Automation with LLM Agents

Introduction

Browser-Use is an open-source Python library that enables Large Language Models (LLMs) to autonomously control web browsers. With Browser-Use, you can build AI agents capable of navigating web pages, filling forms, extracting data, and executing complex browser tasks just like a human would.

This library bridges the gap between LLM reasoning and real-world interaction through browsers. Traditional web scraping depends on CSS selectors or XPath expressions that break when a page's markup changes. Browser-Use instead relies on the vision and reasoning capabilities of LLMs to understand web pages both visually and semantically.

Popular use cases for Browser-Use include:

Web Research Agent: Automatically search and gather information from multiple sources
Form Automation: Fill web forms automatically
Testing Agent: Perform automated UI testing
Data Extraction: Extract structured data from web pages
Workflow Automation: Automate multi-step workflows involving browser interactions

In this tutorial, we'll cover installation, basic usage, advanced techniques, and best practices for building reliable AI browser agents with Browser-Use.

Installation

Prerequisites

Before installing Browser-Use, make sure you have:

Python 3.11 or newer
pip or uv as your package manager
An API key from an LLM provider (OpenAI, Anthropic, or others)
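
The Python requirement can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and is not part of Browser-Use.

```python
import sys

MIN_PYTHON = (3, 11)  # minimum version listed in the prerequisites


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if meets_minimum() else "too old for Browser-Use"
    print(f"Python {current}: {status}")
```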

Installation with pip

pip install browser-use

Installation with uv (Recommended)

uv pip install browser-use

Install Playwright Browser

Browser-Use drives the browser through Playwright. After installing the package, download the Chromium build that Playwright needs:

playwright install chromium

Set Up Environment Variables

Create a .env file in your project root:

OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
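
Python does not read a .env file on its own, so the file has to be loaded first (for example with `load_dotenv()` from the python-dotenv package) or the variables exported in the shell. The sketch below then checks that the keys are visible to the process. The helper `missing_keys` is a hypothetical name used for illustration, not a Browser-Use function.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Only the key for the provider you actually use needs to be set.
    missing = missing_keys(["OPENAI_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required keys are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.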

Verify Installation

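from importlib.metadata import version

import browser_use  # raises ImportError if the package is not installed

# Read the version from the installed distribution's metadata
print(f"Browser-Use version: {version('browser-use')}")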

Basic Usage

Your First Agent

Here's the simplest example to create a browser agent:

import asyncio
from browser_use import Agent

# Provided by the langchain-openai package
from langchain_openai import ChatOpenAI

async def main():
    agent = Agent(
        task="Search for today's Bitcoin price on Google and provide the result",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)

asyncio.run(main())

The agent will open a browser, navigate to Google, search for the Bitcoin price, and return the result.
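
An agent run can take a long time if the model gets stuck on a page, so it may be worth bounding it with a timeout. The sketch below uses only `asyncio.wait_for` from the standard library. `run_with_timeout` and `fake_agent_run` are hypothetical names: the latter is a stand-in for `agent.run()`, so the example runs without a browser or an API key.

```python
import asyncio


async def run_with_timeout(coro, seconds):
    """Await `coro`; return None if it does not finish within `seconds`."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the pending coroutine before raising
        return None


async def fake_agent_run(delay):
    # Stand-in for `agent.run()`: sleeps instead of driving a browser.
    await asyncio.sleep(delay)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(fake_agent_run(0.01), seconds=1)))
    print(asyncio.run(run_with_timeout(fake_agent_run(1), seconds=0.01)))
```

In a real script, the call would presumably look like `await run_with_timeout(agent.run(), seconds=120)`, with the limit tuned to the task.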

Using Anthropic Claude

Browser-Use supports various LLM providers. Here's an example using Claude:

import asyncio
from browser_use import Agent

# Provided by the langchain-anthropic package
from langchain_anthropic import ChatAnthropic

async def main():
    agent = Agent(
        task="Open Wikipedia and search for information about Machine Learning",
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    )
    result = await agent.run()
    print(result)

asyncio.run(main())

Running with Visible Browser

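Whether the browser window is shown depends on the headless setting, whose default may differ between versions. To make sure you can see what the agent is doing, set it explicitly: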

import asyncio
from browser_use import Agent, Browser, BrowserConfig

# Provided by the langchain-openai package
from langchain_openai import ChatOpenAI

async def main():
    browser = Browser(
        config=BrowserConfig(
            headless=False,  # Browser is visible
        )
    )

    agent = Agent(
        task="Navigate to GitHub and search for the browser-use repository",
        llm=ChatOpenAI(model="gpt-4o"),
        browser=browser,
    )
    result = await agent.run()
    print(result)
    await browser.close()

asyncio.run(main())

Extracting Structured Data

Use Pydantic models to get structured output:

import asyncio
from pydantic import BaseModel
