
docs(domain-skills/ebay): detect edge blocks, international pricing, fuzzy matches #362

Open
Lhy099 wants to merge 1 commit into browser-use:main from Lhy099:feat/ebay-skill-international-pricing-and-block-detection

@Lhy099 Lhy099 commented May 15, 2026

Summary

The eBay scraping skill claimed "Chrome is NOT required" and the extractor assumed USD, but field-testing from a CN/HK IP exposed three gaps that silently corrupt agent output. This PR documents the gaps and patches the extractor.

Field-tested gaps

1. Akamai edge block (HTTP 403/503) is upstream of "Pardon Our Interruption"

Low-reputation IPs (VPN, datacenter, non-US residential) get rejected by Akamai before any eBay HTML is served. The old is_blocked(html) only inspects response text, but http_get raises HTTPError before is_blocked() can run, so the agent sees a traceback instead of a clean block signal.
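A minimal sketch of how both block layers could be folded into one check. This is not the code from the diff: the `fetch` parameter is a hypothetical stand-in for the skill's `http_get`, and the status codes and the "Pardon Our Interruption" marker are taken from this description.

```python
import urllib.error

# Status codes treated as an edge-level block (assumption: 403/429/503,
# as listed under Changes).
EDGE_BLOCK_CODES = {403, 429, 503}


def is_blocked(result):
    """Return True for an edge-level HTTPError or an interstitial HTML page."""
    if isinstance(result, urllib.error.HTTPError):
        return result.code in EDGE_BLOCK_CODES
    if isinstance(result, str):
        return "Pardon Our Interruption" in result
    return False


def safe_get(url, fetch):
    """Call fetch(url) and return the HTML, or None if either block layer fired.

    `fetch` is a hypothetical callable standing in for http_get; it is
    expected to raise urllib.error.HTTPError on non-2xx responses.
    """
    try:
        html = fetch(url)
    except urllib.error.HTTPError as err:
        if is_blocked(err):
            return None
        raise  # unrelated HTTP errors (e.g. 404) still propagate
    return None if is_blocked(html) else html
```

With this shape, a caller only has to test for `None` to decide whether to fall back to the real-Chrome path, and errors that are not block signals still surface as exceptions.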

2. s-card__price returns localized strings, not just $xx.xx

A US IP returns "$74.99", a CN/HK IP returns "HKD 297.54", an EU IP returns "EUR 49.99". The old \$([0-9,\.]+) regex returned None for every non-US-IP price.
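A rough sketch of currency-preserving extraction, assuming the price sits as plain text directly inside an element whose class attribute contains `s-card__price`. The regexes and the `split_price` helper are illustrative, not the patterns used in the diff.

```python
import html
import re

# Capture the raw inner text of any element whose class list contains
# s-card__price (assumption: the price is a direct text child).
PRICE_RE = re.compile(r'class="[^"]*s-card__price[^"]*"[^>]*>\s*([^<]+?)\s*<')

# Optional second step: split "HKD 297.54" or "$74.99" into currency and amount.
# Assumes "." as the decimal separator, as in the examples above.
AMOUNT_RE = re.compile(r'^(?P<currency>[^\d\s]+)\s*(?P<amount>[\d,]+(?:\.\d+)?)$')


def extract_prices(page):
    """Return the raw localized price strings found in the page."""
    return [html.unescape(text) for text in PRICE_RE.findall(page)]


def split_price(raw):
    """Split a raw price into (currency, amount), or None if it does not fit."""
    match = AMOUNT_RE.match(raw.strip())
    if match is None:
        return None
    return match.group("currency"), float(match.group("amount").replace(",", ""))
```

Keeping the raw string as the primary value means an unexpected format (a price range, a different separator) degrades to an unparsed string rather than to `None`.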

3. "No exact matches" is structurally invisible

A query with no exact match still renders ~60 fuzzy/recommended cards. The only signal is a small "No exact matches found" hint in the HTML. Without checking it, agents silently treat substitutes as successful matches.
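One way the hint check might look, assuming the literal phrase "No exact matches found" appears in the page HTML as described. The `label_results` helper and its return shape are hypothetical and only show how a caller could carry the flag alongside the cards.

```python
# Marker text taken from the description above; matched case-insensitively.
NO_EXACT_MATCH_HINT = "no exact matches found"


def has_exact_match(page):
    """Return False when the page carries the 'No exact matches found' hint."""
    return NO_EXACT_MATCH_HINT not in page.lower()


def label_results(page, cards):
    """Attach the exact/fuzzy flag to already-extracted cards (hypothetical helper)."""
    return {"exact": has_exact_match(page), "cards": cards}
```

An agent consuming this result can then tell the user that the listings are fuzzy substitutes whenever `exact` is false.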

Changes

  • Add Prerequisites section: BROWSER_USE_API_KEY for fetch-use routing, real-Chrome fallback when the local IP is untrusted.
  • Extend is_blocked() to accept urllib.error.HTTPError(403/429/503); add safe_get() wrapper that returns None on either block layer.
  • Rewrite price/strikethrough extraction to capture the raw inner text of s-card__price instead of a $-prefixed numeric.
  • Add has_exact_match() + new section so agents surface fuzzy fallback results to the user instead of silently substituting.
  • Three new Gotcha entries (Akamai 403, currency localization, silent No-exact-match).

Field-test evidence (2026-05-15)

Query "dj posket4" from an HK IP via the harness Chrome path (because http_get received a 403 at the edge):

  • 0 exact matches, 60 fuzzy/recommended cards
  • 5 of 60 had posket in the title (Q posket); the other 55 were unrelated
  • All prices reported as HKD: HKD 297.54, HKD 273.98, HKD 446.32, HKD 422.75, HKD 297.54
  • US-IP behavior from the original 2026-04-18 test ($74.99 etc.) is preserved in the example output — both currency formats now parse correctly with the same extractor.

Diff

Single file: agent-workspace/domain-skills/ebay/scraping.md (+88 / -20).


Summary by cubic

Docs update to the eBay scraping skill to handle Akamai edge blocks, localized currency prices, and “No exact matches” pages to prevent bad results. Adds BROWSER_USE_API_KEY/Chrome fallback guidance, a safe_get() + improved is_blocked(), currency‑preserving price extraction, and has_exact_match() to flag fuzzy results.

Written for commit 77372b7. Summary will update on new commits.

…fuzzy matches

The skill claimed "Chrome is NOT required" and the extractor assumed USD,
but field-testing from a CN/HK IP exposed three gaps that silently corrupt
agent output:

- Akamai edge returns HTTP 403/503 for low-reputation IPs (VPN, datacenter,
  non-US residential) before any eBay HTML is served. The old text-only
  `is_blocked()` never runs because `http_get` raises `HTTPError` first;
  the agent sees a traceback, not a clean block signal.

- `s-card__price` ships raw localized strings ("HKD 297.54", "EUR 49.99"),
  not just "$74.99". The old `\$([0-9,\.]+)` regex returned None for every
  non-US-IP price.

- A query with no exact matches still renders ~60 fuzzy/recommended cards
  with no structural difference. The only signal is a "No exact matches
  found" hint; without checking it, agents silently treat substitutes as
  successful matches.

Changes:
- Document `BROWSER_USE_API_KEY` + Chrome fallback as the way to bypass
  the edge layer when the local IP is untrusted.
- Extend `is_blocked()` to accept HTTPError(403/429/503); add `safe_get()`
  wrapper that returns None on either block layer.
- Rewrite price/strikethrough extraction to capture the raw inner text of
  `s-card__price` instead of a `$`-prefixed numeric.
- Add `has_exact_match()` + a new section so agents surface fuzzy fallback
  results to the user instead of silently substituting.

Field-tested 2026-05-15: query "dj posket4" from a HK IP → 0 exact matches,
60 fuzzy cards (5 with "posket" in title), all prices reported as HKD.

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 1 file
