Build, Break, Repeat

Web Search for Agents in 2026

I’m building Mercury - my own take on a claw-like system. It needs web search, one of those features you realize you need fast: someone asks about something recent, the model doesn’t know, and you’re stuck.

So I went looking for a search API. What I found was a market with 17+ services, contradictory benchmarks, and no clear winner. This post is everything I learned.

What I need

My agent runs tools inside a sandboxed subprocess. That means:

  • A CLI or a native pi tool - either works, but I want tight integration with the agent runtime
  • Content extraction - getting search results is half the job. For the other half I use markit (which I built) or pi’s built-in webfetch tool
  • Low latency - this is a chat interface, people are waiting
  • Reasonable cost - this thing runs 24/7 across multiple conversations
  • Reliability - if the API goes down, my agent goes blind

The landscape

There are four types of search APIs. The distinction matters more than any benchmark.

Own index - Companies that crawl the web and build their own search index. Independent from Google. This is Brave, Exa, Parallel, and You.com.

SERP scrapers - Companies that query Google/Bing and return structured results. You’re paying for someone else’s scraping infrastructure. SerpAPI, Serper, DataForSEO.

Provider built-ins - Search integrated directly into the model API. OpenAI’s web search tool, xAI/Grok’s web search, Perplexity Sonar. Convenient but opaque - you don’t control the search, the model does.

Real-time crawlers - No index at all. They fetch and parse pages on demand. Firecrawl does this. Useful for content extraction, less for discovery.

The own-index providers are the interesting ones. When SerpAPI goes down, it’s because Google changed their HTML. When Brave goes down, it’s their own infrastructure. One of these failure modes is in your control to route around. The other isn’t.
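Since different providers fail in different ways, the practical answer is to route around failures at the call site. Here’s a minimal sketch of that idea - the `SearchFn` type and provider wiring are my own, not any vendor’s SDK:

```typescript
// A search provider is just an async function from query to results.
// Wrap each vendor's SDK in this shape, then try them in order.
type SearchResult = { title: string; url: string; snippet: string };
type SearchFn = (query: string) => Promise<SearchResult[]>;

async function searchWithFallback(
  query: string,
  providers: SearchFn[],
): Promise<SearchResult[]> {
  let lastError: unknown = new Error("no search providers configured");
  for (const search of providers) {
    try {
      return await search(query);
    } catch (err) {
      lastError = err; // provider down or rate-limited: try the next one
    }
  }
  throw lastError;
}
```

With an own-index provider first and a SERP scraper second (or vice versa), a Google HTML change and an infrastructure outage are no longer the same failure.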

Every service I found

Own index

| Service | Pricing | Free Tier | CLI | SDK |
|---|---|---|---|---|
| Brave Search | $5/1k queries | 2,000/mo (non-commercial) | No | API, MCP |
| Exa | $5/1k searches | 1,000/mo | No | Python, TS, MCP |
| Parallel | $0.005/req | 16,000 free | Yes | Python, MCP |
| You.com | Enterprise | Unknown | No | API |

SERP scrapers

| Service | Pricing | Free Tier | Data Source |
|---|---|---|---|
| SerpAPI | $75/5k searches | 100/mo | 40+ engines |
| Serper | $0.30-1.00/1k | 2,500 queries (no CC) | Google |
| DataForSEO | $0.60/1k | No ($50 min) | Google |

Provider built-ins

| Service | Pricing | Notes |
|---|---|---|
| OpenAI Web Search | Part of model cost | Built into Responses API |
| xAI/Grok | Part of model cost | Includes X/Twitter search, image understanding |
| Perplexity Sonar | $5/1k queries | No free API tier (Pro users get $5 credit) |

Search + extraction

| Service | Pricing | Free Tier | Differentiator |
|---|---|---|---|
| Tavily | $0.008/credit | 1,000/mo (no CC) | Popular in framework ecosystems |
| Firecrawl | $19/mo (3k credits) | Yes | Search + full extraction + /agent endpoint |
| Linkup | Pay as you go | €5/mo free credits | Premium/paywalled sources |
| Valyu | Free trial | Yes | Academic/paywalled sources |

Content extraction

| Service | What it does |
|---|---|
| Jina AI Reader | URL to markdown via r.jina.ai prefix |
| Firecrawl | Crawl + extract + search in one |
| Parallel Extract | URL to compressed excerpts |
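The Jina Reader pattern is simple enough to show inline. The prefix scheme is Jina’s documented one; the helper names and error handling are my own sketch (real use needs retries, and an API key raises the rate limits):

```typescript
// Jina Reader turns any page into markdown by prefixing the URL with r.jina.ai.
function toReaderUrl(pageUrl: string): string {
  return `https://r.jina.ai/${pageUrl}`;
}

// Fetch a page as markdown via the Reader endpoint.
async function fetchAsMarkdown(pageUrl: string): Promise<string> {
  const res = await fetch(toReaderUrl(pageUrl));
  if (!res.ok) throw new Error(`Reader fetch failed: ${res.status}`);
  return res.text();
}
```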

Data sources matter

This is the part most comparisons skip. Where do the results actually come from?

Exa built their own neural search index. They’ve been crawling the web and encoding pages since they were called Metaphor. Their index covers 70M+ companies, 1B+ profiles, GitHub repos, Stack Overflow, docs. It’s trained on link prediction - given a prompt, predict what URL a human would share. That’s a fundamentally different approach from keyword matching.

Brave runs their own independent index. Not a Google wrapper. They power 22M answers/day in Brave Search and recently launched an LLM Context API specifically optimized for agent consumption.

Parallel has their own web-scale index and recently raised $100M to build it out. They offer search, extract, deep research, entity discovery, and web monitoring - all from their own infrastructure.

Then there’s Tavily. They claim to index the live web, but there are reports of issues with JS-rendered pages and cached results. One user on GitHub: “Tavily has issue with JS-rendered pages. It seems to be doing it offline and then caching it. It’s flaky.” Worth noting that Tavily was acquired by Nebius and raised $25M total.

SerpAPI, Serper, and DataForSEO are straightforward - they scrape Google. You get Google’s quality. You also get Google’s rate limits, CAPTCHAs, and the risk of your scraping proxy getting banned. At scale, this is a reliability problem.

The benchmarks

A note before we dive in: I take all of these with a massive grain of salt. Every vendor benchmark is designed to make that vendor look good. Even the “independent” ones are often published by companies with a dog in the fight, or use methodologies that favor certain architectures. I’m listing them because they’re the best data available, not because I trust any single one. The value is in the patterns across multiple benchmarks, not any individual result.

AIMultiple (Feb 2026) tested 8 APIs with 5 results per query. Their “Agent Score” (relevance × quality):

  • Brave Search: 14.89 (highest)
  • Firecrawl, Exa, Parallel Search Pro: essentially tied, within random variation
  • Tavily: notably below the top tier
  • Perplexity: also below the top tier

Valyu benchmark (Feb 2026) ran 5,000+ queries across 5 domains using Vercel AI SDK. FreshQA results (time-sensitive questions):

  • Valyu: 79%
  • Parallel: 52%
  • Google: 39%
  • Exa: 24%

That’s a 55-point gap between best and worst. Exa’s low score on freshness is notable - their neural index is great for semantic similarity but struggles with recency.

Proxyway (Mar 2026) benchmarked 15 providers across three categories. Their report is the most comprehensive I’ve found.

You.com published their own benchmarks claiming they outperform everyone on speed, accuracy, and freshness. Take vendor benchmarks with the appropriate grain of salt.

Brave claims their LLM Context API (powering Ask Brave with Qwen3) outperforms ChatGPT, Perplexity, and Google AI Mode.

The honest takeaway: Brave, Exa, Parallel, and Firecrawl are in the top tier. Tavily is popular (1M+ downloads) but consistently benchmarks below the leaders. Freshness is hard - even good providers fail on time-sensitive queries.

SDKs, CLIs, and integrations

For my use case: my agent harness uses pi as the runtime underneath. I plan to implement several of these as native pi tools and let users pick which search provider they want. A CLI is nice when one exists, but I prefer the TypeScript ecosystem, and for search specifically I want a native tool - tighter control over output formatting, error handling, and cost tracking. The agent shouldn’t have to shell out to Python for something this fundamental.

Here’s what each offers:

Parallel is the only one with a proper CLI (parallel-cli, installed via pip install parallel-web-tools). It does search, extract, research, enrich, entity discovery, and web monitoring from the terminal. For a sandboxed agent, this is a huge advantage.

Exa has solid Python and TypeScript SDKs, plus an MCP server. No CLI.

Brave has an MCP server and recently launched “Skills” for coding agents. API-first, no CLI.

Tavily has npm and Python SDKs and an MCP server. Popular in the framework ecosystem, lots of tutorials and examples out there.

Firecrawl has a CLI, Python/TS SDKs, and integrations with n8n and Zapier. 98.3K GitHub stars. If you need search + full page extraction in one call, this is compelling.

For MCP support (relevant if you’re using Claude or compatible tools): Exa, Parallel, Brave, Tavily, and Firecrawl all have MCP servers.

The content extraction gap

Here’s the thing most search API comparisons miss. Search gives you URLs and snippets. Your agent needs full text. The gap between “here are 10 relevant links” and “here’s the information you need” is where most agent workflows break down.

Some APIs handle this natively:

  • Firecrawl extracts full page content as markdown, supports custom schemas
  • Exa returns page summaries and highlights with search results
  • Parallel offers compressed excerpts and full extraction
  • Tavily has include_raw_content for inline extraction

Others just give you URLs, and you need a second tool:

  • Jina Reader - prepend r.jina.ai/ to any URL, get markdown back. Simple and effective
  • markit - local CLI that converts anything (PDF, DOCX, HTML, URLs) to markdown. No API dependency. I added this to Mercury for document processing

For agents that need to read full pages, the search + extraction pipeline matters more than search quality alone. A mediocre search with great extraction beats a great search that only returns snippets.
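That pipeline is worth sketching. Both function types below are assumptions of mine - plug in any provider’s SDK for `search` and any extractor (Jina Reader, Firecrawl, markit) for `extract`:

```typescript
// Search-then-extract: turn a query into full page content, not just links.
type Hit = { url: string; snippet: string };

async function searchAndRead(
  query: string,
  search: (q: string) => Promise<Hit[]>,
  extract: (url: string) => Promise<string>,
  topN = 3,
): Promise<{ url: string; content: string }[]> {
  const hits = await search(query);
  // Extract the top N pages in parallel; a failed extraction
  // degrades gracefully to the search snippet.
  return Promise.all(
    hits.slice(0, topN).map(async (h) => ({
      url: h.url,
      content: await extract(h.url).catch(() => h.snippet),
    })),
  );
}
```

The graceful degradation matters: a dead link shouldn’t kill the whole answer when the snippet still carries some signal.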

What I’m going with

I’m going to implement several of these as native pi tools and let users pick. Brave, Exa, and Parallel are the top contenders for the first batch - they all have their own index, solid APIs, and reasonable pricing.
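The “let users pick” part just means a provider registry behind one interface. This is a hypothetical shape, not pi’s actual tool API - it only shows the swappable-provider idea:

```typescript
// One interface, many backends; the user picks by name ("brave", "exa", ...).
interface SearchProvider {
  name: string;
  search(
    query: string,
    opts?: { maxResults?: number },
  ): Promise<{ title: string; url: string; snippet: string }[]>;
}

const providers = new Map<string, SearchProvider>();

function registerProvider(p: SearchProvider): void {
  providers.set(p.name, p);
}

function getProvider(name: string): SearchProvider {
  const p = providers.get(name);
  if (!p) throw new Error(`Unknown search provider: ${name}`);
  return p;
}
```

Swapping providers later then costs one config change instead of a rewrite - which matters in a market this unsettled.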

I’m starting with Parallel mostly because of the generous free tier (16,000 requests). That’s enough to build and test without worrying about cost. From there I’ll add Brave and Exa and compare in practice.

For content extraction I use markit locally - I’ll probably build a native pi tool for web fetching too. For browser automation when search isn’t enough, agent-browser gives the agent a headless Chrome it can drive directly.

There’s no universal right answer here. The market is still shaking out and the benchmarks contradict each other. Pick one, build with it, swap later if needed - most of these APIs have similar enough interfaces that switching isn’t painful.


All pricing and features as of March 2026. This market moves fast - verify current details at the linked pricing pages.