Web scraping prevention
Scraping is automated extraction of your content: bots and crawlers pulling pages far faster than a person would.
Step 1: Wire a working integration
Wire Rupt and run an access evaluation first (see Access protection). The policies below trigger on access, so they fire only once that evaluation is in place.
Step 2: Add the policies
A policy has a trigger (the event it runs on) and a verdict. Add these in your policies dashboard:
| Policy | Trigger | Conditions | Verdict |
|---|---|---|---|
| Block datacenter scrapers | access | ip_is_hosting or ip_is_proxy | Deny |
| Throttle high velocity | access | high event_count | Challenge |
The datacenter IP check alone catches a lot, since real users almost never browse from hosting infrastructure. The velocity check separates a heavy reader from an extractor pulling pages at machine speed.
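To make the logic of the two policies concrete, here is a minimal local sketch of how a trigger, a condition, and a verdict could fit together. It is an illustration only, not Rupt's API or evaluation engine. The `AccessSignals`, `Policy`, and `evaluate` names, the `HIGH_EVENT_COUNT` threshold of 100, the first-match ordering, and the default `Allow` verdict are all assumptions made for this example. Only the trigger name, the condition fields, and the `Deny` and `Challenge` verdicts come from the table above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AccessSignals:
    """Hypothetical container for the fields named in the policy table."""
    ip_is_hosting: bool
    ip_is_proxy: bool
    event_count: int


@dataclass(frozen=True)
class Policy:
    """A policy pairs a trigger and a condition with a verdict."""
    name: str
    trigger: str
    condition: Callable[[AccessSignals], bool]
    verdict: str


# Assumed threshold: the table only says "high event_count",
# so tune this to your own traffic.
HIGH_EVENT_COUNT = 100

POLICIES = [
    Policy(
        name="Block datacenter scrapers",
        trigger="access",
        condition=lambda s: s.ip_is_hosting or s.ip_is_proxy,
        verdict="Deny",
    ),
    Policy(
        name="Throttle high velocity",
        trigger="access",
        condition=lambda s: s.event_count > HIGH_EVENT_COUNT,
        verdict="Challenge",
    ),
]


def evaluate(trigger: str, signals: AccessSignals, default: str = "Allow") -> str:
    """Return the verdict of the first matching policy (assumed ordering)."""
    for policy in POLICIES:
        if policy.trigger == trigger and policy.condition(signals):
            return policy.verdict
    return default


if __name__ == "__main__":
    # A request from hosting infrastructure is denied regardless of volume.
    print(evaluate("access", AccessSignals(True, False, 1)))
    # A residential IP pulling pages at machine speed is challenged.
    print(evaluate("access", AccessSignals(False, False, 500)))
```

In this sketch the datacenter check runs before the velocity check, so a scraper on hosting infrastructure is denied outright rather than challenged. Whether the real policies are ordered this way depends on how you configure them in the dashboard.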