Bravebot & AbuseIPDB

Hi Everyone.

I use & enjoy Brave on desktop and mobile. I also operate my own server, which hosts my family’s ecommerce business along with all of our personal projects & WordPress blogs. We use AbuseIPDB to block malicious traffic, as we’re quite often targeted with DDoS and other denial-of-service abuse, not to mention form spam. It works really well, and I punch holes in the blocking for search engine crawlers and bots by allowlisting the user agents of common crawlers. AbuseIPDB also whitelists common crawlers, so Brave could even contact them and have their IP ranges added to that whitelist if they really don’t want to publish them.
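The "punch holes for crawlers" idea can be sketched roughly like this (a simplified illustration, not my actual server config; the crawler substrings and the `should_block` helper are just examples):

```python
# Rough sketch of user-agent-based allowlisting layered on top of IP blocking.
# The substrings below are illustrative examples, not an exhaustive list.
ALLOWED_CRAWLER_SUBSTRINGS = (
    "Googlebot",
    "Bingbot",
    "DuckDuckBot",
)

def is_allowed_crawler(user_agent: str) -> bool:
    """Return True if the request's User-Agent announces a known crawler."""
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in ALLOWED_CRAWLER_SUBSTRINGS)

def should_block(ip_is_listed: bool, user_agent: str) -> bool:
    """Block listed IPs unless the request identifies itself as a known crawler."""
    if is_allowed_crawler(user_agent):
        return False  # punch a hole in the blocking for crawlers
    return ip_is_listed
```

This is exactly why a generic user agent defeats the scheme: a crawler fetching with a standard browser UA from a listed IP still gets blocked, with no way for the site owner to tell it apart from abuse.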

The problem is, Brave says they just use a standard user agent, so I suspect they already get a lot of false-positive abuse reports because they crawl without announcing themselves. I can whitelist by IP, but Brave doesn’t make its IP ranges public either, so I have no way of knowing whether they’re being blocked and no way to give them access. I can’t be the only one dealing with this.

Can we have a published IP range for Bravebot so I can whitelist it and allow our sites & content to appear in Brave Search? I can’t imagine how many sites aren’t showing up in search simply because website owners have no way of identifying the crawler & letting it in. Seems counterproductive.

@jodsclass Brave uses the Web Discovery Project for crawling. This means users are the crawler, basically.

Because of the way it works, there’s no unique identifier or IP range for it. Essentially, any website that allows itself to be indexed by Google or other search engines can be indexed by Brave. To quote a portion of another reply I made earlier:

Brave doesn’t use a traditional, centralized crawler; its data is collected anonymously via scripts running inside users’ browsers. Because there isn’t one single “crawler” (like Googlebot) making requests, the issue of disclosing its identity or contacting every site owner doesn’t arise in the same way. Their “practical reasons” remark reflects the challenges of managing a decentralized, client‑side system rather than an intent to bypass sites that block crawlers.