Does Brave sell your copyrighted data to train AI?

Is it accurate that Brave browser sells copyrighted data to train AI, as mentioned in this article: “Brave Browser Under Fire For Alleged Sale Of Copyrighted Data”? I’m seeking your input to help determine the veracity of this claim. You can find the article here:

@Daniel19 The answer is both yes and no. The issue is being presented in a way that misleads people and creates drama. It’s basically a trick question. That’s why I call them trolls, especially the person who wrote the initial article.

If you look at (which is one of the original articles that’s quoted by the article you linked) you’ll at least see where the person included a quote from Brave:

The rights are to the output of the API request, which is a set of results to a query sent by the API user. Brave Search has the right to monetize and put terms of service on the output of its search-engine. The “content of web page” is always an excerpt that depends on the user’s query, always with attribution to the URI of the content. This is a standard and expected feature of all search engines.

I guess you can kind of think of this like Brave is a tour bus. They can drive you around and tell you about the houses outside or you can view things that are visible from the road. They aren’t selling you access to the property, copies of the art you might see, or any of that. You’d be charged for the use of their vehicle and knowledge of what routes to take.

That’s what the Search API is, just the vehicle and their basic notes of destinations and how to get there.

That aside, I’m also going to quote myself based on a different conversation I was having with someone in a semi-related topic:

Just an FYI, that’s not quite what’s happening. There’s a balance with what’s called Fair Use, and some governments even have Text and Data Mining (TDM) laws in place. None of those are seen as violations of copyright law.

Keep in mind they aren’t reselling data. They are selling their API. Otherwise Search is just helping to search the web to find resources and to have basic quotes. They don’t give full access to information or anything.

Training AI is the same as if we read something and talk about it. As long as it’s not duplicating the books and such to provide or sell them, it’s just foundational knowledge.

There were some links I saw shared earlier today. I’m not sure if any of those might be helpful to you or anyone else reading here. But it’s quite interesting to see the discussions going on in the matter.

  • Does Brave sell copyrighted data? Not really. This gets easily trolled and shared for drama and views because there’s never an absolute. While Brave isn’t stealing and selling, it is possible that a resource from a pirated site gets crawled/indexed and its info is then presented unknowingly via the API. In that case, would you say Brave is selling copyrighted data? I mean, technically they would have, but not by design.

  • Does Brave sell access to their search engine API that other places can use to gather data and train AI? Yes.

  • Is it possible for copyrighted information to end up on websites or other places without being marked as such, and therefore be included in data unintentionally? Yes.

  • Does Brave make an attempt to make sure they are following all laws and aren’t violating copyright? Absolutely.

  • Is it legal for Brave to do what it’s doing? Yes.

  • Is the amount of data Brave shares through Search API enough to violate copyright laws? No. As you saw in the quote, it’s only a small summary/excerpt of a website.

Keep in mind the API is the same as if you’re just going to Brave Search, Google, DuckDuckGo, Yahoo, Bing, or whichever search engine and you look for topics. That’s all the API would be, search results.
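If it helps to picture it, here’s a rough sketch in Python of the general idea of "excerpt plus attribution" that any search engine follows. To be clear, this is not Brave’s actual code or API; `SearchResult`, `make_excerpt`, and `search` are names I made up purely for illustration, and a real engine is far more involved.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    url: str      # attribution: where the original content lives
    title: str
    excerpt: str  # short, query-dependent snippet, not the full page


def make_excerpt(page_text: str, query: str, width: int = 80) -> str:
    """Return a short window of text around the first match of the query."""
    pos = page_text.lower().find(query.lower())
    if pos == -1:
        # No match: fall back to the opening of the page.
        return page_text[:width]
    start = max(0, pos - width // 2)
    end = min(len(page_text), start + width)
    return page_text[start:end]


def search(pages: dict, query: str) -> list:
    """`pages` maps url -> (title, full_text). Only excerpts are returned."""
    results = []
    for url, (title, text) in pages.items():
        if query.lower() in text.lower():
            results.append(SearchResult(url, title, make_excerpt(text, query)))
    return results
```

The point of the toy version is just that the caller gets back a pointer to the source and a small snippet chosen by the query, never the whole page.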


Hello Saoiray,

Thank you for your detailed response and clarification regarding Brave’s API and its use for AI training.

By the way, I’ve also noticed a minor typo on the page.


It seems that either a dot is missing or the word wasn’t completed properly.
