Trust and Safety Threat Investigator

  • Individual Contributor
  • Visa Sponsorship
  • San Francisco, CA, US
  • This position has been filled

Website: Anthropic

This content was reproduced from the employer’s website on May 11, 2023. Please visit their website below for the most up-to-date information about this position.

As a threat investigator on the Trust and Safety team, you will be responsible for developing novel detection techniques to discover and mitigate abuse of Anthropic’s products and services. Through a combination of trend analysis, data visualization, and the study of individual policy abuse and enforcement cases, you will generate system-level insights into emerging types of harm and identify malicious actors. You’ll work closely with our product and engineering team to build scalable enforcement mechanisms and identify abusive behavior as our services expand in use and capability.

Your Responsibilities Will Include:

    • Analyze the deployment of our products and services and identify how these systems are being misused or abused
    • Study trends internally and in the broader ecosystem to anticipate how systems could be misused or manipulated for harm in the future
    • Work with engineering to build tools and levers to stop current forms of abuse and prevent future ones
    • Identify new types of policy violations, emerging risks and threat vectors related to the use of our products and services
    • Conduct deep dives into categories of harmful behavior to strengthen our understanding and defenses
    • Keep abreast of the latest industry risks, vulnerabilities and issues related to the use of language models and generative AI; identify opportunities for improvement to our policies, controls and enforcement mechanisms
    • Draft and implement new detection and response processes

You may be a good fit if you:

    • Have experience with prompt engineering or with building automated detection systems or enforcement mechanisms
    • Are familiar with abusive user behavior detection (a plus)
    • Know how to derive insights from large amounts of data to make key decisions and recommendations
    • Have experience on a trust and safety team and/or have worked closely with policy or content moderation
    • Love to think creatively about how to use technology in a way that is safe and beneficial, and ultimately furthers the goal of advancing safe AI systems

Hybrid policy & US visa sponsorship: Currently, we expect all staff to be in our office at least 25% of the time. We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate; operations roles are especially difficult to support. But if we make you an offer, we will make every effort to get you into the United States, and we retain an immigration lawyer to help with this.