
I Tested 4 Web Scraping APIs for Retail Price Tracking


I tested 4 web scraping APIs in production. Here's what actually works for retail price tracking in 2026.

I run Ecom Circles, a SaaS for Amazon and Walmart sellers. One of the things our pipeline does is track competitor prices across the supplier sites our sellers source from. Lowes, Home Depot, Walmart, CVS, Sephora, plus 25+ other retail sites. To do that at any reasonable scale, you need a web scraping setup that actually works on sites with serious anti-bot protection.

Over the last few weeks I tested 8 different methods (four providers, most with two tiers, plus direct fetch) across about 30 retail suppliers, ran them in production, and figured out which providers handle which sites. Here's what I found.

Why this is harder than it used to be

Five years ago you could write a Python script with the requests library, set a User-Agent header, and scrape almost anything. That doesn't work anymore on the big retailers. Lowes, Home Depot, Macy's, Nordstrom, and Bass Pro all run Akamai or similar enterprise WAFs. They fingerprint your TLS handshake, run JavaScript challenges, and block anything that looks remotely like a bot.

So you have two options. Either pay a scraping provider that runs residential proxies and handles the anti-bot stuff for you, or accept that about 30% of retail sites are going to fail. We picked option one because the data is too useful to ignore.

The providers I tested

Four paid services plus direct fetch:

Direct fetch. Just a normal HTTPS request with a browser User-Agent. Free. No JavaScript rendering. 10-second timeout.

ScrapeOwl. Cheap and premium tiers, supports JavaScript rendering. More on this one in a minute.

ScraperAPI. Cheap and premium tiers, optional JavaScript rendering. About $0.005 per request at scale.

Scrape.do. Basic and "super" tiers. Super tier uses residential proxies. Basic is around $0.001 to $0.005 per request.

Firecrawl. Single-tier API, marketed as the AI-friendly scraper.

For each provider I ran 7 or more URLs per supplier and measured two things: did the request succeed, and did my parser actually extract a price from the HTML it returned. Both have to work. Either one failing kills the data point.
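Here's a minimal sketch of that harness logic, assuming each provider is wrapped as a plain fetch function and parse_price stands in for the real parser:

```python
import requests

def fetch_direct(url: str) -> str | None:
    """Plain HTTPS request with a browser User-Agent and a 10s timeout."""
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
    try:
        resp = requests.get(url, headers=headers, timeout=10)
        return resp.text if resp.status_code == 200 else None
    except requests.RequestException:
        return None

def score_provider(fetch, urls, parse_price):
    """Score one provider on one supplier: count fetch AND parse successes."""
    fetched = parsed = 0
    for url in urls:
        html = fetch(url)
        if html is None:
            continue  # fetch failed: blocked, timed out, or non-200
        fetched += 1
        if parse_price(html) is not None:
            parsed += 1  # only a parsed price counts as a usable data point
    return fetched, parsed
```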

Direct fetch wins on more sites than you'd expect

This was the biggest surprise. 9 of the 30 suppliers I tested work just fine with a normal HTTPS request, no provider needed. Amazon, Walmart, Walgreens, Sam's Club, Ulta, Big Lots, Office Depot, Ace Hardware, and Best Buy all expose price data in JSON-LD or og:price meta tags. You don't need JavaScript rendering. You don't need a proxy. You just need to actually look at the HTML and parse it correctly.

The lesson: before you pay for any scraping provider, try a plain fetch first. A surprising amount of retail still serves prices in clean HTML. Every paid request you can avoid is money you don't spend.
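For reference, a minimal sketch of that kind of parser, checking JSON-LD first and falling back to og:price meta tags (markup details vary by site; this uses BeautifulSoup):

```python
import json
from bs4 import BeautifulSoup

def parse_price(html: str) -> float | None:
    """Try schema.org JSON-LD first, then Open Graph price meta tags."""
    soup = BeautifulSoup(html, "html.parser")

    # 1. JSON-LD: <script type="application/ld+json"> with an offers.price field.
    for script in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(script.string or "")
        except json.JSONDecodeError:
            continue
        for item in data if isinstance(data, list) else [data]:
            offers = item.get("offers") if isinstance(item, dict) else None
            if isinstance(offers, list):
                offers = offers[0] if offers else None
            if isinstance(offers, dict) and offers.get("price") is not None:
                return float(offers["price"])

    # 2. Open Graph fallback: <meta property="og:price:amount" content="19.99">
    meta = soup.find("meta", property="og:price:amount")
    if meta and meta.get("content"):
        return float(meta["content"])
    return None
```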

When direct fails, it's almost always Akamai

Another ~30% of retailers (Lowes, Home Depot, CVS, Sephora, Macy's, Nordstrom, Wayfair, Grainger) block direct requests with 403s. In every case I traced it back to Akamai's bot protection or something similar. Same fingerprinting tech, same response.

This is where you actually need a paid provider, and not all of them get through.

Scrape.do handled every anti-bot site I threw at it

For my use case Scrape.do was the standout. The basic tier worked on Home Depot (7/7 pages parsed) and Sephora (6/6, around 1.8 second average latency). Super tier with residential proxies handled the harder ones: Lowes (7/7), CVS (5/7), and Macy's. Lowes specifically, which Akamai locks down hard, was the one site where I couldn't find any cheaper alternative.

For context, refreshing 551 Lowes product pages costs somewhere between $1 and $3 a day with Scrape.do super, which works out to roughly $0.002 to $0.005 per request. The economics work.

ScraperAPI premium did not beat Scrape.do basic

This one tripped me up. I assumed "premium" tiers across providers were roughly equivalent and that paying more would unlock the harder sites. Not true. ScraperAPI premium-residential, which is supposed to be their top tier, still got blocked on Lowes. Meanwhile Scrape.do basic (the cheaper tier) worked fine on Home Depot and Sephora.

What I ended up doing: Scrape.do as the primary for hard sites, ScraperAPI as fallback for a couple of edge cases (eBay, where it parses the marketplace offer JSON better than the others), and Firecrawl for Target and Dick's Sporting Goods where it had the highest parse rate.
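That setup boils down to a per-site routing table with ordered fallbacks. A sketch of the shape, with hypothetical provider keys standing in for real client wrappers:

```python
# Ordered fallback per supplier: first provider that fetches AND parses wins.
# These assignments mirror what worked in my tests; yours will differ.
WATERFALL = {
    "homedepot.com": ["scrape_do_basic"],
    "sephora.com":   ["scrape_do_basic"],
    "lowes.com":     ["scrape_do_super"],
    "ebay.com":      ["scrape_do_basic", "scraperapi_premium"],
    "target.com":    ["firecrawl"],
}
DEFAULT_CHAIN = ["direct", "scrape_do_basic", "scrape_do_super"]

def scrape(url: str, domain: str, fetchers: dict, parse_price) -> float | None:
    for provider in WATERFALL.get(domain, DEFAULT_CHAIN):
        html = fetchers[provider](url)
        if html is None:
            continue  # fetch failed; fall through to the next provider
        price = parse_price(html)
        if price is not None:
            return price
    return None  # every step of the waterfall failed
```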

Lesson: don't assume tier names mean the same thing across vendors. Test the actual sites you care about.

The trial key trap

I almost didn't notice this one. Our ScrapeOwl trial keys quietly expired three weeks before I caught it. The API was returning 403s with a "trial expired" message, the waterfall was failing through to ScraperAPI for everything ScrapeOwl used to handle, and our scraping bill went up while overall coverage stayed about the same. From the outside, nothing looked obviously broken.

If we'd had a real production account from day one, this wouldn't have happened. If we'd had per-provider success-rate alerting, we'd have caught it the first day.

Lesson: trial keys are fine for evaluation. Don't run production traffic through them. And whatever you build, monitor each provider's success rate independently so a silent failure doesn't sneak past you for a month.
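A rough sketch of the kind of per-provider check that would have caught it on day one (the window and threshold are arbitrary; wire the alert into whatever paging you already have):

```python
from collections import defaultdict, deque

class ProviderMonitor:
    """Track a rolling success rate per (provider, supplier) and flag collapses."""

    def __init__(self, window: int = 200, alert_below: float = 0.5):
        self.alert_below = alert_below
        self.results = defaultdict(lambda: deque(maxlen=window))

    def record(self, provider: str, supplier: str, ok: bool) -> None:
        key = (provider, supplier)
        self.results[key].append(ok)
        rate = sum(self.results[key]) / len(self.results[key])
        # A trial-key expiration looks like this: one provider's rate falls off
        # a cliff while overall coverage stays flat because the waterfall hides it.
        if len(self.results[key]) >= 50 and rate < self.alert_below:
            print(f"ALERT: {provider} on {supplier} down to {rate:.0%}")
```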

Render mode is not the bottleneck. Parsing is.

Half my early debugging time went into "should I enable JavaScript rendering on this provider." Almost none of it mattered. The actual issue on the sites that gave me trouble was that the HTML came back fine but my parser couldn't find the price.

Nordstrom is the clearest example. Scrape.do super returns the page, headers and all. But the page doesn't expose price in JSON-LD or og:price like most retailers. It's buried in some site-specific data structure. So I'd get 5/7 successful fetches and 0/7 parsed prices. Burlington, Bass Pro, Cabela's, and REI all had the same pattern. Fetch succeeded. Parser failed.

If you're building a scraper, write your parser first against a known-working site (Walmart is a great test case), then add provider-specific handling for the hard ones. Don't spend two days tuning render modes when you have a parsing problem.
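One way to enforce that ordering: pin the parser with fixture tests before touching any provider settings. A sketch using pytest, with illustrative inline HTML standing in for real saved fixtures:

```python
# test_parser.py: runnable with pytest.
# The inline HTML mimics the JSON-LD shape Walmart-style pages expose;
# a real suite would load saved pages like fixtures/walmart_item.html.
from pricing import parse_price  # wherever your parse_price lives

GOOD_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Product", "offers": {"@type": "Offer", "price": "24.97"}}
</script>
</head><body></body></html>
"""

def test_parses_jsonld_price():
    assert parse_price(GOOD_HTML) == 24.97

def test_returns_none_when_price_missing():
    # A fetch that succeeds but parses nothing should surface as None,
    # not as a crash. This is the Nordstrom failure mode.
    assert parse_price("<html><body>No structured data here</body></html>") is None
```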

What I'd do starting over

If I were building this from scratch today:

1. Try direct fetch first on every site. It works more than you'd think. Free.

2. For the sites where direct fails, default to Scrape.do. Basic tier for most, super tier for Lowes and the other Akamai-heavy ones.

3. Keep one fallback provider in the mix for the few sites where Scrape.do parsing is weak. ScraperAPI or Firecrawl, whichever has better parse rates on your specific URLs.

4. Skip the trial-key approach entirely. Pay for the cheapest production tier from day one. The cost of a silent expiration is way higher than the few dollars you save during evaluation.

5. Build per-provider monitoring on day one. Track success rate, parse rate, and cost per request, per provider, per supplier. The waterfall is only as good as your visibility into which step is failing.

The bigger point

Most "best scraping API" articles online are SEO content from the providers themselves or affiliate sites. They never run real production traffic, they never test enough sites to find the gaps, and they all conclude that whichever provider paid them the most is the best one.

The real answer is that it depends entirely on which sites you're scraping. The right setup for tracking Amazon prices is different from the right setup for tracking Lowes prices. Both are different from scraping news sites or government data.

This is the same pattern I see in ecommerce automation generally. Test on your actual targets. Build monitoring before you need it. Don't trust marketing pages. That's the playbook.

Matt Hall

Builder, Marketer, Automator. I run Scepter Marketing, Ecom Circles, Alfred, and Scepter Commerce. I write about what I'm building and what I'm learning.
