Why Privacy Matters
Strong privacy and security standards and practices are the cornerstone of user trust. A platform may hold millions of data points that identify an individual personally, or that can be aggregated to identify an individual, so it is critical to protect the privacy and security of a user's identity. Doing so protects users' human rights, shields them from criminal activity, complies with relevant regulations, and safeguards the platform's reputation.
When privacy considerations are included as a part of trust and safety investigations and processes, it is easier to manage potential risks to users and the business, create and enforce policies that keep people safe, and maintain user trust.
While privacy may be best known as a legal field with dedicated staff, from operators and policy professionals to program managers, engineers, and product managers, it intersects with trust and safety in meaningful ways. Platforms may, for a variety of reasons, choose privacy protections that go beyond the requirements imposed by the regulatory environments in which they operate. Privacy concerns are also inherent in many of the harms that platforms seek to address through their content policies and the enforcement of those policies, which further incentivizes proactive approaches to privacy. In other words, while privacy regulations tell platforms what they must do, a platform may create policies and approaches to privacy based on what it should do to serve its users.
What is Privacy
Privacy in trust and safety work largely relates to user data, particularly how it is collected, managed, accessed, deployed, and destroyed in developing safety enforcement tools and investigations. Data privacy involves protecting personal data so that only authorized individuals can access it and allowing individuals to determine who can access their personal information. It defines how data should be handled and by whom.
Terms & Definitions
What is considered personal data? If the information is public, is it still personal data? Privacy lingo can be both pervasive and vague. Establishing a shared understanding of common vocabulary is an important first step to enabling privacy risk management across all teams handling people’s personal information.
Personal Data or Personally Identifiable Information (PII)
Personal data refers to all information that directly or indirectly identifies an individual. In the industry, the term Personally Identifiable Information (PII) is commonly used to describe all personal data. This can include any unique identifier, such as a user ID, handle, or device ID, as well as IP addresses and other precise location information. PII isn't necessarily sensitive, but it includes any data that can be used to identify a specific individual. Notably, personal data can be public (including but not limited to names that appear on websites, user profiles, or publications) or used for research and marketing purposes by the platform and its partners. Relatedly, privacy rights aren't limited to confidential information.
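To make the category concrete, the snippet below is a minimal, hypothetical sketch of scanning free text for two common direct identifiers (email addresses and IPv4 addresses). The pattern names and coverage are illustrative only; production PII detection requires far broader coverage and validation.

```python
import re

# Illustrative patterns for two common direct identifiers; real PII detection
# needs much broader coverage (names, device IDs, precise locations, etc.).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def find_pii(text: str) -> dict:
    """Return any matches for each pattern found in free text."""
    return {name: pattern.findall(text) for name, pattern in PII_PATTERNS.items()}

print(find_pii("Contact jane@example.com from 203.0.113.7"))
# {'email': ['jane@example.com'], 'ipv4': ['203.0.113.7']}
```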
Sensitive Data
Sensitive personally identifiable information is a type of PII that could cause significant harm to an individual if it is lost, stolen, or disclosed without authorization. This typically includes demographic information (such as race or ethnicity), sexual orientation, genetic or biometric data (images, voice recordings, fingerprints), health data, religious beliefs, payment information (e.g., credit card numbers), and personal data related to vulnerable populations (such as minors). When it comes to sensitive data, the expectations and consequences of mishandling privacy are even more significant. This narrower category of personal data is also referred to as Sensitive Personal Information (SPI).
Inferred Data
Inferred data refers to characteristics of people predicted or determined via analytical processing of data, rather than data provided directly by the individual or indirectly by an external source. For example, a platform might infer that a user identifies as LGBTQ based on their online RSVP to a local Pride event or from the community/interest groups they’ve joined.
Processing
“Processing” is a catch-all verb for all of the ways that personal data can be handled (e.g., collected, used, shared, stored, secured, modified, deleted).
De-identified Data
De-identified data is data in which direct and indirect identifiers have been removed or manipulated to break the linkage to a real-world identity, such as replacing IP addresses with a home geographic region. It is important to note that de-identified data can potentially be re-identified with some effort; this happens when the remaining information can still be used to identify a specific individual, either on its own or when combined with other publicly available information. In de-identifying data, privacy professionals must also make sure to remove the data from global and regional datasets.
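As a rough illustration of the kind of transformation described above, the sketch below generalizes an IP address to a coarse region and replaces a user ID with a salted hash. The field names and region lookup are hypothetical, and salted hashing is pseudonymization rather than true anonymization, so re-identification risk remains.

```python
import hashlib

# Hypothetical prefix-to-region lookup; a real pipeline would use a geolocation
# service and a documented de-identification policy.
REGION_BY_PREFIX = {"203.0.113": "Oceania", "198.51.100": "North America"}

def deidentify_event(event: dict, salt: str) -> dict:
    """Replace direct identifiers with coarser or tokenized values."""
    prefix = ".".join(event["ip_address"].split(".")[:3])
    return {
        # Salted hash: pseudonymous, not anonymous; treat the salt as a secret.
        "user_token": hashlib.sha256((salt + event["user_id"]).encode()).hexdigest(),
        "region": REGION_BY_PREFIX.get(prefix, "unknown"),
        "action": event["action"],  # keep only fields needed for the analysis
    }

event = {"user_id": "u-1024", "ip_address": "203.0.113.7", "action": "report_filed"}
print(deidentify_event(event, salt="rotate-this-salt"))
```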
Anonymized Data
Anonymized data is data in which direct and indirect identifiers have been removed or manipulated and mathematical or technical guarantees have been put in place to prevent re-identification. Anonymizing personal information is a challenging problem, especially when trying to meet required standards.
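One commonly cited (though imperfect) technical approach is k-anonymity, which requires that every combination of quasi-identifiers appears at least k times in a released dataset. The check below is a minimal sketch with made-up field names; stronger guarantees such as differential privacy go further but are harder to implement.

```python
from collections import Counter

def satisfies_k_anonymity(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values appears at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

records = [
    {"age_band": "30-39", "region": "Oceania", "complaint": "harassment"},
    {"age_band": "30-39", "region": "Oceania", "complaint": "spam"},
    {"age_band": "40-49", "region": "Europe",  "complaint": "fraud"},
]
print(satisfies_k_anonymity(records, ["age_band", "region"], k=2))  # False: one group has size 1
```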
Incident
An incident refers to actual or potential unauthorized access, use, or modification of personal data, and can include scenarios where users are incorrectly identified. A data breach goes one step further, referring to a confirmed disclosure of data to an unauthorized party, often requiring notification to affected individuals and regulatory agencies.
Risk Management
A critical aspect of privacy work is the management of risks that organizations may face. The following risks are what companies may most commonly account for in building their approach to privacy governance.
Risk of real-world harm: How organizations handle personal data can pose real-world harm to people physically, mentally, and financially. For example, when private or personally identifiable information about a person is obtained and distributed widely without consent, it can result in doxxing, online and in-person harassment, and/or extortion. For vulnerable populations especially, such as children, ethnic or religious minorities, and political dissidents, real-world violence is also a risk that must be considered. T&S professionals have an obligation to think through these real and potential consequences in their work and to identify and address abusive behaviors quickly.
Legal risk: Privacy regulation is a dynamic environment. In recent decades, the expectations of governments and the public have evolved regarding what’s considered appropriate behavior when handling personal data. Organizations face increasing regulatory obligations worldwide, including state laws like the California Consumer Privacy Act (CCPA) in the US, the General Data Protection Regulation (GDPR) in the EU, and similar laws enacted in the UK, Brazil, and India. When privacy and data protection practices fail to meet expected compliance standards, the ramifications can be severe, including litigation, fines, and/or interruption of business operations. A potential ripple effect includes chilling technological innovation, as regulatory trends move to constrain the ability to build and implement new technologies.
Privacy and Legal Risk
Example: EU slaps Meta with record $1.3 billion fine for data privacy violations – Washington Post
Example: The No AI Fraud Act Creates Far More Problems Than It Solves – Electronic Frontier Foundation
These examples illustrate both the challenge of crafting enforceable regulations and how platforms navigate their enforcement.
Reputational risk: Privacy concerns can deeply influence user and public sentiment regarding the trustworthiness of a platform or organization. Negative press about personal information being collected, used, shared, or secured in ways that fail to meet people's expectations lowers trust in a platform or organization and increases the demand for privacy changes. This may result in users seeking greater transparency and control over their personal information, as well as increased security and accountability for the use of their personal data. While user trust isn't always predictive of user engagement or behavior, reputational risk can be existential for some platforms, especially where expectations of privacy are high and unmet, leading to serious consequences for people's safety or livelihood. Additionally, user sentiment regarding trustworthiness affects the adoption of new technologies.
Navigating Reputational Risk
Example: Zoom rethinks its approach to content moderation – Stanford Law School
Example: TikTok seen as just part of the data privacy problem – Bloomberg
These examples demonstrate how platforms respond to and mitigate privacy-related issues.
Privacy Risk Mitigations
Risk Identification
A major part of privacy work is risk identification, which involves identifying and assessing privacy risks associated with data processing, storage, sharing, and deletion. This proactive approach is vital for implementing appropriate controls and mitigations. Examples of risk mitigation include creating a personal information inventory to identify what PII the organization holds and where it is stored, or conducting privacy assessments guided by privacy frameworks and informed by the regulatory environment to which the product and platform are subject.
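As an illustration of what a personal information inventory might capture, here is a minimal sketch; the fields and values are hypothetical, and a real inventory would follow the organization's privacy framework and applicable regulations.

```python
from dataclasses import dataclass

@dataclass
class PIIInventoryEntry:
    data_element: str    # e.g. "email address"
    sensitivity: str     # e.g. "PII" or "SPI"
    storage_system: str  # where the data lives
    purpose: str         # why it is collected
    retention_days: int  # how long it is kept before deletion

inventory = [
    PIIInventoryEntry("email address", "PII", "accounts_db", "login and notifications", 365),
    PIIInventoryEntry("payment card number", "SPI", "billing_vault", "subscription billing", 90),
]
for entry in inventory:
    print(entry)
```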
Data Minimization
Another crucial component of privacy work is managing data collection, including implementing data minimization techniques. Data minimization means collecting only the data necessary for a specific, agreed-upon purpose and no more. While it might be tempting for companies to maximize data collection about their users for other or future uses, data minimization makes data more manageable internally, reduces the exposure of sensitive information, and minimizes privacy risks.
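In code, data minimization often reduces to an explicit allow-list of fields tied to a stated purpose. The sketch below assumes a hypothetical abuse-report triage pipeline; the field names are illustrative.

```python
# Hypothetical allow-list of fields needed for the stated purpose (abuse-report triage).
ALLOWED_FIELDS = {"report_id", "reported_content_id", "abuse_category", "timestamp"}

def minimize(record: dict) -> dict:
    """Drop any field not required for the agreed purpose before the record is stored."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

raw = {
    "report_id": "r-99",
    "reported_content_id": "c-123",
    "abuse_category": "harassment",
    "timestamp": "2024-05-01T12:00:00Z",
    "reporter_ip": "203.0.113.7",   # not needed for triage, so never stored
    "reporter_device_id": "d-55",   # not needed for triage, so never stored
}
print(minimize(raw))
```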
Employee Training
As a part of the general course of instilling privacy competency and awareness in tech organizations, it is standard practice for all employees who could potentially handle user data to receive comprehensive training on privacy policies and procedures. The reason for this is simple: educated employees are better equipped to protect user data and respond appropriately to potential incidents. This may also help fulfill some requirements from regulators, who may stipulate specific expectations surrounding privacy training for some or all employees in an organization.
Privacy by Design
Privacy by design means embedding privacy considerations into the design and development of new products and features, including the use of privacy-enhancing technologies (PETs), so that user privacy is prioritized from the beginning. This involves educating and working closely with product managers, product policy teams, legal departments, data scientists, and software engineers, all of whom must consider privacy risks, especially as they relate to user data used in the products they are building or supporting. The Safety by Design chapter covers these topics in more detail.
Restricting Data Access
By limiting access to sensitive data to only authorized personnel, organizations can significantly reduce the risk of data exposure. This includes instituting internal policies for employees and clearly communicating to end users their right to limit the use and disclosure of their sensitive personal information.
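A minimal, deny-by-default role check might look like the sketch below; the roles and data classes are hypothetical, and real systems typically combine such checks with audit logging and time-limited grants.

```python
# Hypothetical role-to-data-class grants; unknown roles get no access by default.
ROLE_PERMISSIONS = {
    "ts_investigator": {"pii"},
    "ts_investigator_senior": {"pii", "spi"},
    "analyst": set(),  # works only with de-identified data
}

def can_access(role: str, data_class: str) -> bool:
    """Return True only when the role is explicitly granted the data class."""
    return data_class in ROLE_PERMISSIONS.get(role, set())

print(can_access("analyst", "pii"))                 # False
print(can_access("ts_investigator_senior", "spi"))  # True
```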
Cybersecurity Tools, Techniques, and Procedures
The field of privacy is closely related to and interdependent with the field of cybersecurity. While cybersecurity is responsible for protecting all sensitive digital information, including sensitive company data, privacy concerns personal user information. In practice, safeguarding privacy rights depends on effective cybersecurity. It is essential to implement robust security measures, encryption protocols, and access controls to safeguard data from unauthorized access and breaches.
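As one small example of the security measures mentioned above, the sketch below encrypts a record at rest using the third-party cryptography package's Fernet recipe (symmetric encryption); key management, rotation, and access controls are the hard parts and are out of scope here.

```python
# Requires the third-party package: pip install cryptography
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, generated and stored in a key-management service
fernet = Fernet(key)

ciphertext = fernet.encrypt(b"user_id=u-1024;email=jane@example.com")
print(fernet.decrypt(ciphertext))  # b'user_id=u-1024;email=jane@example.com'
```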