
Web scraping prevention

Scraping is automated extraction of your content: bots and crawlers pulling pages far faster than a person would.

Step 1: Wire a working integration

Wire Rupt and run an access evaluation first (see Access protection). The policies below trigger on access, so they only fire once that evaluation is in place.

Step 2: Add the policies

A policy has a trigger (the event it runs on), conditions, and a verdict (the action taken when the conditions match). Add these in your policies dashboard:

| Policy | Trigger | Conditions | Verdict |
| --- | --- | --- | --- |
| Block datacenter scrapers | access | ip_is_hosting or ip_is_proxy | Deny |
| Throttle high velocity | access | high event_count | Challenge |

The datacenter IP check alone catches a lot of scraping, since real users almost never browse from hosting infrastructure. The velocity check separates a heavy reader from an extractor pulling pages at machine speed.
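
To make the logic of the two policies concrete, here is a minimal local sketch in Python. It is not Rupt's API or implementation: the policies themselves are configured in the dashboard. The signal names ip_is_hosting, ip_is_proxy, and event_count come from the table above. The AccessSignals container, the evaluate function, the HIGH_EVENT_COUNT threshold of 100, the "allow" fallback, and the choice to check the deny policy before the challenge policy are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: the policy table only says "high event_count".
HIGH_EVENT_COUNT = 100


@dataclass
class AccessSignals:
    """Signals named in the policy table; this container is illustrative only."""
    ip_is_hosting: bool
    ip_is_proxy: bool
    event_count: int


def evaluate(signals: AccessSignals) -> str:
    """Return the verdict the two policies would produce for one access event."""
    # Block datacenter scrapers: either condition alone is enough to deny.
    if signals.ip_is_hosting or signals.ip_is_proxy:
        return "deny"
    # Throttle high velocity: challenge once the count exceeds the threshold.
    if signals.event_count > HIGH_EVENT_COUNT:
        return "challenge"
    # Assumed fallback when no policy matches.
    return "allow"


# A residential reader with heavy but human-scale usage passes through.
print(evaluate(AccessSignals(ip_is_hosting=False, ip_is_proxy=False, event_count=40)))
# The same network profile at machine speed gets challenged.
print(evaluate(AccessSignals(ip_is_hosting=False, ip_is_proxy=False, event_count=5000)))
# A hosting-provider IP is denied regardless of velocity.
print(evaluate(AccessSignals(ip_is_hosting=True, ip_is_proxy=False, event_count=1)))
```

Checking the deny condition first means a datacenter IP is never offered a challenge it might solve. Whether the real policy engine orders verdicts this way is not stated here, so treat the ordering as a design choice of the sketch.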