Content Moderation Quality Assurance

Quality in trust and safety refers to the process used to evaluate the correctness of moderation decisions. Monitoring quality is critical to ensuring that moderation processes are consistent, fair, and accurate. This applies to decisions from both manual reviewers and automated systems.

Errors in content moderation decisions can be broadly grouped into four categories:

  1. False positives, where an action is taken when it should not be. These errors result in unfair restrictions and removals affecting non-violating users.
  2. False negatives, where an action should be taken and is not. These errors result in a failure to protect users from violating content.
  3. Wrong selection, where an action is taken, but it is the wrong action or based on the wrong policy. These errors result in inconsistency and confusion around how different content is treated.
  4. Technical errors, where the correct decision is made based on the information available, but some intervening factor means the action taken is not correct. Examples include missing or incomplete information affecting the review, or a bug causing the action to not be completed successfully.
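
As an illustration only, the four categories above can be expressed as a small classifier over re-reviewed decisions. The `ReReview` record and its field names are hypothetical and not taken from any particular tool, and treating a technical error as "right decision, action not applied" is a simplification that ignores cases such as missing information during review.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReReview:
    """One original decision paired with its quality re-review (hypothetical schema)."""
    original_action: Optional[str]   # None means no action was taken
    correct_action: Optional[str]    # None means no action was warranted
    original_policy: Optional[str] = None
    correct_policy: Optional[str] = None
    action_applied: bool = True      # False if a bug stopped the action completing


def classify(r: ReReview) -> str:
    """Map a re-reviewed decision onto one of the error categories."""
    if r.original_action is None and r.correct_action is None:
        return "correct"
    if r.correct_action is None:
        return "false_positive"      # action taken when none was warranted
    if r.original_action is None:
        return "false_negative"      # action warranted but none taken
    if (r.original_action, r.original_policy) != (r.correct_action, r.correct_policy):
        return "wrong_selection"     # wrong action or wrong policy
    if not r.action_applied:
        return "technical_error"     # right decision, but it never took effect
    return "correct"


def error_breakdown(reviews):
    """Count how many re-reviews fall into each category."""
    return Counter(classify(r) for r in reviews)
```

Counting categories separately matters because each one points to a different kind of fix, for example retraining for wrong selections versus engineering work for technical errors.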

The amount of harm an error can cause varies significantly from case to case. For example, accidentally banning a user who has been inactive on a platform for years is likely to be less harmful than accidentally banning a major company that relies on the platform to operate every day.

How to Evaluate Quality and Identify Errors

Quality Sampling

A common way of measuring quality is to take regular samples of decisions being made and have them reviewed again to see if the first decision was correct. In addition to measuring quality, these re-reviews also provide the opportunity to prevent or correct mistakes.

Quality reviews can be simple peer reviews, or reviews by a dedicated quality team. Large samples can be expensive in terms of reviewer time but are the simplest way to get statistically rigorous data on decision quality. Samples are often weighted to get more evidence on areas of interest or to reduce volatility.
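
A minimal sketch of how sampled re-reviews might be turned into an accuracy estimate follows. It assumes each sampled decision is simply marked correct or incorrect, uses the Wilson score interval as one reasonable choice of confidence interval, and reweights a stratified sample by each stratum's share of all decisions. The function names and the stratum layout are illustrative assumptions.

```python
import math


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = correct / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def stratified_accuracy(strata):
    """Estimate overall accuracy from a weighted (stratified) sample.

    strata maps a stratum name to a tuple of
    (decisions in population, decisions sampled, sampled decisions correct).
    Each stratum's sample accuracy is weighted by its share of all decisions,
    so oversampling an area of interest does not skew the overall figure.
    """
    total = sum(pop for pop, _, _ in strata.values())
    return sum(pop / total * ok / n for pop, n, ok in strata.values())


if __name__ == "__main__":
    low, high = wilson_interval(correct=90, n=100)
    print(f"accuracy 90%, interval {low:.3f} to {high:.3f}")
    print(stratified_accuracy({"spam": (9000, 100, 95), "hate": (1000, 100, 80)}))
```

The interval shows why sample size drives cost: narrowing it requires many more re-reviews, which is the reviewer-time trade-off described above.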

Reverse Quality Sampling

Reverse quality sampling involves taking a set of pre-reviewed examples and sending them through the regular review process to see if they are handled correctly. This method allows complete control over the examples sent to reviewers to provide information on specifically chosen issues and edge cases. However, selecting specific cases can introduce bias if the older samples do not accurately reflect the new situations reviewers are seeing.

Reverse quality can also pose technical challenges. Examples must look exactly like regular reviews to be effective; this can be a problem when looking at a user’s history is part of the review process, or in situations where older reviews involve imminent risk that is no longer imminent.
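
One way to score a reverse quality run is to compare reviewer decisions against the pre-reviewed answers. The sketch below assumes a hypothetical layout: a dictionary of expected labels keyed by case ID, and a list of `(reviewer, case_id, label)` tuples collected from the normal review queue.

```python
from collections import defaultdict


def score_golden_set(golden, responses):
    """Return each reviewer's agreement rate with the pre-reviewed answers.

    golden:    dict mapping case_id -> expected label
    responses: iterable of (reviewer, case_id, label) from the regular queue
    Responses for cases outside the golden set are ignored.
    """
    tally = defaultdict(lambda: [0, 0])  # reviewer -> [matches, seeded cases seen]
    for reviewer, case_id, label in responses:
        if case_id not in golden:
            continue
        tally[reviewer][1] += 1
        if label == golden[case_id]:
            tally[reviewer][0] += 1
    return {reviewer: hits / seen for reviewer, (hits, seen) in tally.items()}


if __name__ == "__main__":
    golden = {"c1": "remove", "c2": "keep"}
    responses = [("ana", "c1", "remove"), ("ana", "c2", "remove"), ("bo", "c1", "remove")]
    print(score_golden_set(golden, responses))
```

The same tally could be grouped by policy area or edge-case type instead of by reviewer, depending on which issue the seeded examples were chosen to probe.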


Appeals

Another common way to evaluate quality is through an appeals process, where users request that a moderation decision be reviewed. Because each appeal questions a specific decision and must be evaluated anyway, appeals provide information about quality in general, and false positives in particular, at a low cost. However, this approach can introduce bias, as the population of appealed decisions is often different from the population of all decisions. It also has a notable advantage: the issues that users appeal the most are often the issues they care about most. See Appeals for more details.
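
As a rough sketch, appeal data can be summarized with three ratios. The function and key names are illustrative. The lower bound on false positives assumes that appeal reviews are themselves correct and that the counts cover only enforcement actions; because most users never appeal, the true false positive rate may be considerably higher.

```python
def appeal_metrics(decisions, appeals, overturned):
    """Summarize appeals for one policy area over one time period.

    decisions:  enforcement actions taken
    appeals:    how many of those actions were appealed
    overturned: how many appeals reversed the original action
    """
    if decisions <= 0:
        raise ValueError("decisions must be positive")
    return {
        "appeal_rate": appeals / decisions,
        "overturn_rate": overturned / appeals if appeals else 0.0,
        # Every overturned action was a false positive, so this is a floor,
        # not an estimate, of the false positive rate.
        "min_false_positive_rate": overturned / decisions,
    }


if __name__ == "__main__":
    print(appeal_metrics(decisions=10_000, appeals=400, overturned=100))
```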

How to Assess Root Causes

The RCA Process

Root Cause Analysis (RCA) is an analytical process for identifying why a particular problem or change occurred. By identifying the underlying causes of such problems, RCA can help resolve these issues more quickly and reduce the likelihood of recurrence.

While there are some minor variations between different methodologies, most RCA processes consist of the following steps:

  1. Define the problem: Identify the core issue of concern and how it is affecting the user or platform.
  2. Gather data: Find information on the scale, scope, and impact of the issue. Use historical data to try to identify when the problem began. Also examine the process to find where the problem is detected.
  3. Identify possible causes: Generate a list of possible causes consistent with the problem and data. Identify any common factors such as specific teams, processes, or systems. Identify any known events, such as launches or population shifts, that may be responsible.
  4. Identify a root cause: Work through likely causes until a root cause is found. Confirm that a specific cause is present. Establish a causal link to the defined problem and ensure that this link is consistent with gathered data.
  5. Implement a solution: Decide on and implement a solution to address the underlying root cause. Afterwards, evaluate the affected areas and examples to confirm the issue is resolved.
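
The search for common factors in the "identify possible causes" step can be supported by breaking the error rate down along one dimension at a time. The sketch below is an assumption-laden illustration: records are dictionaries with a boolean `is_error` field plus whatever dimensions are available (office, queue, policy, tool version), and the 1.5x threshold and minimum segment size are arbitrary defaults. A flagged segment is a lead to investigate, not a confirmed root cause.

```python
from collections import defaultdict


def error_rate_by(records, key):
    """Return {segment: (error_rate, review_count)} for one dimension."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [errors, total]
    for rec in records:
        bucket = counts[rec[key]]
        bucket[0] += 1 if rec["is_error"] else 0
        bucket[1] += 1
    return {seg: (errors / total, total) for seg, (errors, total) in counts.items()}


def suspicious_segments(records, key, ratio=1.5, min_reviews=20):
    """List segments whose error rate exceeds `ratio` times the overall rate."""
    if not records:
        return []
    overall = sum(1 for r in records if r["is_error"]) / len(records)
    return sorted(
        seg
        for seg, (rate, total) in error_rate_by(records, key).items()
        if total >= min_reviews and rate > ratio * overall
    )


if __name__ == "__main__":
    data = [{"office": "A", "is_error": i < 2} for i in range(40)]
    data += [{"office": "B", "is_error": i < 12} for i in range(40)]
    print(error_rate_by(data, "office"))
    print(suspicious_segments(data, "office"))
```

Running the same breakdown by date can also help with the "gather data" step, since a jump in the error rate on a particular day narrows down when the problem began.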

These steps do not have to be followed rigidly. Often there will be an obvious and easily verifiable cause, in which case extensive analysis of other potential causes is a waste of resources. Similarly, if it is shown early in the process that the problem is trivial or appears to have stopped recently, then further analysis may be halted, especially if the cost of further investigation or of addressing the error is prohibitively high.

Another important point to note is that solving one error does not always resolve the issue entirely. It is possible that there are multiple causes for a given problem, either independently or interacting with each other. The solution itself may also cause large changes in processes or overall quality, so it is important to communicate such changes clearly to ensure that everyone affected is fully aware.

Common Impact Factors

There is a wide range of reasons why an error could be made in any individual moderation decision. However, when it comes to larger negative trends and disruptions in quality, a few general issues are particularly likely to be important factors. These issues occur in many different situations and often have widespread or sustained effects.

The first likely factor is policy that does not align with goals. No policy can perfectly capture all real-world scenarios. Misalignment between policy and the best possible decision can result in decisions that are objectionable or harmful, even when at first glance the policy seems reasonable. This is particularly common when policies are rigidly enforced or overly simplified. It is also common with policy that is technically correct but confusing or vague, or with policy that is rapidly changing to react to world events.

The second likely factor is reviewers not implementing policy or guidelines correctly. Reviewers are human beings and inevitably make mistakes at times, but when mistakes occur repeatedly this is more concerning. This can happen with individuals, or with groups based on some common factor. For example, if a training update was not delivered correctly to one office location, or if teams in a particular country introduce a source of cultural bias, then this might result in a group of reviewers making the same common mistake.

The third likely factor is system or automation errors. Automated systems that can make or adjust decisions can cause big changes in quality because they act at scale. Because these decisions are based on machine-readable signals rather than those a human would use, the mistakes here often appear the most extreme and unreasonable to outside observers. If a correct decision is made but the process for taking action fails, this can also affect quality. These kinds of failures may not be immediately obvious, as the correct review may make it appear as though everything is fine even while users are still being negatively affected.
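
Failures where the decision is right but the action never takes effect can be surfaced by reconciling the decision log against what is actually in force. This is a hypothetical sketch: both inputs are assumed to be dictionaries keyed by content ID, and real systems would need to account for delays between a decision and its enforcement.

```python
def find_unapplied_actions(decisions, live_state):
    """Return content IDs whose enforced state differs from the decision.

    decisions:  dict mapping content_id -> action decided in review
    live_state: dict mapping content_id -> action currently in effect
    A content ID missing from live_state counts as a mismatch.
    """
    return sorted(
        content_id
        for content_id, action in decisions.items()
        if live_state.get(content_id) != action
    )


if __name__ == "__main__":
    decided = {"p1": "remove", "p2": "remove", "p3": "keep"}
    enforced = {"p1": "remove", "p2": "keep", "p3": "keep"}
    print(find_unapplied_actions(decided, enforced))
```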

How to Improve Quality

While individual events and problems that affect quality can be resolved as they arise, creating sustainable high quality requires organized management. Most popular quality management systems were originally developed for physical manufacturing. However, there is a great deal of crossover with the challenges faced in “manufacturing” information and moderation decisions.

In particular, it is important to note that while quality inspections and appeals can detect problems with quality, they cannot by themselves solve those problems. Sustainable quality improvement requires the improvement of the underlying processes involved, whether through refining processes, better training, more resources, or the elimination of bugs and errors. 

Quality Management Systems (QMS) are sets of processes, tools, and methodologies used to evaluate and improve quality. Different QMS often attach different levels of focus and importance to different issues (e.g., statistical measurement, management behavior and philosophy, testing and inspection) and may also be specialized for specific problems and industries.

However, QMS tend to have some common features. In particular, most use a version of the PDSA or PDCA cycle (Plan, Do, Study/Check, Act), an iterative method for making improvements. Most also come with a set of core principles that encourage sustained improvement, covering elements such as management and leadership, engagement with both staff and customers, and evidence-based decisions and actions.

ISO 9000

The ISO 9000 series is a family of Quality Management Systems originally created by the International Organization for Standardization in 1987. ISO produces general guidelines that can be applied to a wide variety of problems and areas as well as more specialized standards for areas such as medicine. ISO also integrates many external quality management systems into the list of standards over time. ISO 9001 is the most common and widely applicable QMS within the series.

Six Sigma

Six Sigma is a QMS developed at Motorola in 1986, with a focus on eliminating defects and minimizing overall levels of variability and volatility in a process. Six Sigma is particularly well known for heavy use of statistical measures and for the DMAIC process (Define, Measure, Analyze, Improve, Control). Courses and third-party certifications for Six Sigma are widely available globally. Six Sigma is also often combined with Lean, a similar system focused more on eliminating waste and inefficiencies.
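
To give a sense of the statistical measures involved, the sketch below computes defects per million opportunities (DPMO) and the corresponding sigma level, using the conventional 1.5 sigma shift. Applying it to moderation, by treating each incorrect decision as a defect and each decision as one opportunity, is an assumption made here for illustration.

```python
from statistics import NormalDist


def dpmo(defects, units, opportunities_per_unit=1):
    """Defects per million opportunities."""
    if units <= 0 or opportunities_per_unit <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_level(dpmo_value, shift=1.5):
    """Convert DPMO to a sigma level using the conventional 1.5 sigma shift."""
    if not 0 < dpmo_value < 1_000_000:
        raise ValueError("DPMO must be strictly between 0 and 1,000,000")
    process_yield = 1 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(process_yield) + shift


if __name__ == "__main__":
    rate = dpmo(defects=50, units=10_000)   # 50 wrong decisions in 10,000
    print(rate, round(sigma_level(rate), 2))
```

Under this convention, a process at the six sigma level produces about 3.4 defects per million opportunities.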

Total Quality Management (TQM)

Total Quality Management, or TQM, is a broader process improvement system designed to cover all business functions that can affect users. TQM approaches quality with a principle of “right first time,” so it generally emphasizes higher up-front costs for design and system construction, as well as engagement across business functions, in the hope of ensuring better long-term outcomes.