Facebook, Inc. executives have long said that artificial intelligence would address the company's chronic problems keeping what it deems hate speech and excessive violence, as well as underage users, off its platforms.

That future is farther away than those executives suggest, according to internal documents reviewed by The Wall Street Journal. Facebook's AI can't consistently identify first-person shooting videos, racist rants and even, in one notable episode that puzzled internal researchers for weeks, the difference between cockfighting and car crashes.

On hate speech, the documents show, Facebook employees have estimated the company removes only a sliver of the posts that violate its rules — a low-single-digit percent, they say. When Facebook's algorithms aren't certain enough that content violates the rules to delete it, the platform shows that material to users less often — but the accounts that posted the material go unpunished.

The employees were analyzing how well Facebook enforces its own content rules, which it spells out in detail internally and in public documents such as its community standards.

The documents reviewed by the Journal also show that Facebook two years ago cut the time human reviewers focused on hate-speech complaints from users and made other tweaks that reduced the overall number of complaints. That made the company more dependent on AI enforcement of its rules and inflated the apparent success of the technology in its public statistics.

According to the documents, those responsible for keeping the platform free from content Facebook deems offensive or dangerous acknowledge that the company is nowhere close to being able to reliably screen it.

A senior engineer and research scientist wrote in a mid-2019 note:
"The problem is that we do not and possibly never will have a model that captures even a majority of integrity harms, particularly in sensitive areas."
He estimated the company's automated systems removed posts that generated just 2% of the views of hate speech on the platform that violated its rules.
"Recent estimates suggest that unless there is a major change in strategy, it will be very difficult to improve this beyond 10-20% in the short-medium term," he wrote.
This March, another team of Facebook employees drew a similar conclusion, estimating that those systems were removing posts that generated 3% to 5% of the views of hate speech on the platform, and 0.6% of all content that violated Facebook's policies against violence and incitement.