The Evolution of Trust & Safety

As noted in the previous section, there is no single standard pattern of how trust and safety developed within technology companies. Rather, companies have taken different approaches depending on a range of factors, pressures, and experiences. Similarly, the emergence of trust and safety as a practice, profession, and business function has not followed a straight line. This section highlights key trends and moments in the field's beginnings and maturation, drawing on discussions with and writings from former and current industry participants, scholars, and journalists.

Elements of trust and safety date back several decades to early attempts to moderate content on web pages, chat rooms, and instant messaging services. Website owners and individual administrators often did their own bespoke content moderation, sometimes taking down harmful content from a comments section or blocking a user who was spamming or trolling a forum. Because few platforms reached a significant scale, companies did little on a systematic or coordinated basis to curb abusive content or behavior, and oftentimes lacked public-facing policies or community guidelines. Prior to the early 2000s, almost no technology company had a dedicated team focused on content moderation or trust and safety more broadly. As a result, enforcement decisions were often inconsistent and important trust-building elements were missing: for example, the ability to appeal moderation decisions and efforts to educate users about community policies were often not significant design considerations. Early forms of proactive content moderation focused on large-scale abuse disruptive to users' experience, such as the creation of bulk accounts for spamming purposes, rather than on other forms of abuse like extremist content, hate speech, harassment, and disinformation.

In many respects, the growth of trust and safety parallels the expansion of internet access, and thus the growing importance of internet-based companies in the lives of billions of individuals. More users signing up for email services and messaging apps created more economic incentives for the distribution of spam and phishing, as well as new opportunities to harass individuals or share child sexual abuse material (CSAM). The widespread use of search engines like AltaVista, Ask Jeeves, and Google led to the proliferation of search engine optimization (SEO) efforts that sought to boost websites in results pages, sometimes maliciously. The opportunity to monetize online through advertising led to a variety of abusive behavior and content, from misleading advertising to click fraud. The ability to transact securely online led to the expansion of e-commerce platforms like eBay and Amazon and presented new opportunities to sell harmful goods and services, target competitors, or exploit users with fake product listings.

Perhaps the most impactful development in the digital space from a trust and safety perspective was the rise of social media platforms like Friendster, Myspace, and later Facebook, YouTube, and Twitter. These platforms led to an exponential growth of public-facing user-generated content (UGC), which often made dedicated moderation teams a business imperative. As users began to spend an increasing amount of time on social media to connect and interact with friends, and also to access news and information, such services became further enmeshed in the fabric of society. Decisions to remove (or not remove) a post, ban a group, or terminate a channel became essential to preventing harm and began to attract public, media, and regulatory attention.

In response to that scrutiny, and sometimes proactively in recognition of their own gaps, major social media companies put significant resources behind content policy teams, moderation teams, expanded language capabilities, specialist teams for particular forms of abuse, and automation and artificial intelligence-based technologies to identify abusive content and behavior. They also introduced new ways for users, non-governmental organizations, and government bodies to flag abusive content. In parallel, and sometimes predating this work, other technology companies such as app stores and e-commerce marketplaces also invested in trust and safety. While the latter services often dealt with issues similar to those facing social media, such as impersonation attempts, the spread of malware, and the sale of illegal goods and services, they also faced challenges unique to their services.

The shift online and the safety issues that followed also spurred the development of new legislation and regulation. The earliest of these, the Communications Decency Act (CDA), passed by the U.S. Congress in 1996, was the first law of its kind to address online content. However, it pre-dated, and arguably enabled, the rise of large-scale social media. Section 230 of the CDA ensured that “providers of an interactive computer service” were not held liable for third-party content on their platforms, while also affording them the ability to moderate content. Other key legislation affecting trust and safety in this period included the Digital Millennium Copyright Act (DMCA) of 1998, which criminalized the production and dissemination of technology, devices, or services intended to circumvent measures that control access to copyrighted works, and the European Union's E-Commerce Directive of 2000, which established the basic rules of e-commerce. The directive distinguished among different types of online intermediaries, such as hosting services, and exempted them from liability for third-party content so long as they removed illegal content, or disabled access to it, as quickly as possible once aware of its illegal nature. Many trust and safety teams were established in part to ensure compliance with these various pieces of legislation, regulation, and directives. This helps explain why, in some companies, trust and safety emerged from, and remains closely linked to, the broader in-house legal team.

As technology companies have become increasingly intertwined in our daily lives, new or updated efforts to regulate them, often with a focus on safety issues, have followed these early forms of legislation and regulation. While these laws and regulations have been introduced at different times and differ in substance and in the services they cover, many share a common theme: companies must do more to tackle abusive content and behavior. One of the fundamental aims of the Digital Services Act (DSA) in the EU, for example, is “to create a safer digital space where the fundamental rights of users are protected” (EU Digital Strategy). The UK's Online Safety Bill, India's Intermediary Guidelines and Digital Media Ethics Code of 2021, and several bills in the U.S. Congress similarly seek to create a safer online environment. These new regulations and laws, in part, reflect the challenges associated with preventing and detecting harmful content, the unintended consequences of overly aggressive moderation efforts, and the differing perceptions and shifting expectations around what type of content is harmful. As companies have invested in developing policies and tools to detect violations of those policies, they have often been accused of bias in deciding what content and behavior violates the rules and should be moderated.

While much of the work of trust and safety teams has focused on developing and enforcing policies, in recent years there has been a newfound focus on education, prevention, and user empowerment. This includes more nuanced levers for platforms to employ beyond simply banning users or removing content. For example, warning interstitials, labels, feature blocking, visibility reduction, and other enforcement actions have been adopted by a range of platforms and services to improve the health of the ecosystem, respond proportionately, and inform users about potentially problematic or disputed content.
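To make the idea of a graduated enforcement ladder concrete, the minimal Python sketch below models a set of enforcement levers and a toy rule for choosing a proportionate one. The action names, the severity scale, and the choose_action thresholds are hypothetical illustrations for this chapter, not any platform's actual policy, system, or API.

```python
from enum import Enum, auto

class EnforcementAction(Enum):
    """Hypothetical ladder of enforcement levers, ordered roughly from
    least to most restrictive (names are illustrative only)."""
    NO_ACTION = auto()
    LABEL = auto()                 # attach context or an informational label
    WARNING_INTERSTITIAL = auto()  # require a click-through before viewing
    REDUCE_VISIBILITY = auto()     # down-rank in feeds and search results
    FEATURE_BLOCK = auto()         # disable sharing, replies, or monetization
    REMOVE_CONTENT = auto()
    SUSPEND_ACCOUNT = auto()

def choose_action(severity: int, repeat_offender: bool) -> EnforcementAction:
    """Toy policy: map a reviewer-assigned severity score (0-5) and prior
    enforcement history to a proportionate action. Real systems weigh far
    richer signals (context, reach, intent, local law, appeals history)."""
    if severity <= 0:
        return EnforcementAction.NO_ACTION
    if severity == 1:
        return EnforcementAction.LABEL
    if severity == 2:
        return EnforcementAction.WARNING_INTERSTITIAL
    if severity == 3:
        return (EnforcementAction.FEATURE_BLOCK if repeat_offender
                else EnforcementAction.REDUCE_VISIBILITY)
    if severity == 4:
        return EnforcementAction.REMOVE_CONTENT
    return (EnforcementAction.SUSPEND_ACCOUNT if repeat_offender
            else EnforcementAction.REMOVE_CONTENT)

# Example: a moderately severe violation by a first-time poster is
# down-ranked rather than removed outright.
print(choose_action(severity=3, repeat_offender=False))
```

The point of the sketch is the shape, not the thresholds: proportionate enforcement treats removal and suspension as the last rungs of a ladder rather than the default response.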

Safety by design, transparency, and accountability have also emerged as key areas of focus in the last several years, driven both by policymaker and regulatory pressure and by an effort to improve user experience and engender trust. Services have tested and deployed preventative features, such as adding friction to sign-up flows and alerting users that they may be posting potentially toxic or policy-violating content, as well as experiments like removing the “thumbs down” button.

Several companies, such as Twitter and Airbnb, have formed trust and safety advisory boards, and in 2020 Facebook established and funded an independent Oversight Board. Companies have used these bodies as accountability mechanisms, as methods of gathering input on policy development, and as a way to get expert opinions and advice on new product features. Several large technology companies began publishing transparency reports on government requests for user information, starting with Google in 2010, and transparency reporting has since grown in scope, frequency, and the number of participating companies. Most large social media companies and online marketplace services now regularly publish reports on content and account removals, as well as other safety-related reports.

This overview is intended to demonstrate how trust and safety has evolved from an afterthought into an important brand, user, and regulatory consideration that is now often top-of-mind for new online services and front-page news for the media. It's important to remember that each company has its own trust and safety story, and the overall narrative continues to unfold.