Site Reliability Engineer – Trust & Safety

TikTok
  • Individual Contributor
  • TSPA Members
  • Mountain View, CA, US
  • Experience level: 3+ years

This content was reproduced from the employer’s website on May 10, 2023. Please visit their website below for the most up-to-date information about this position.

• Manage day-to-day operations of data services and real-time/batch data pipelines, including SLA management, system deployment, performance tuning, and troubleshooting
• Create tools and automation to improve system administration and operation efficiency
• Participate in regular on-call duties
• Engage in and improve the whole lifecycle of services from inception and design, throughout development, capacity planning, and launch reviews, to deployment, operation, and refinement
• Design and implement software platforms and monitoring frameworks for efficient, automated, and intelligent service-oriented architecture (SOA) governance
• Scale systems sustainably through mechanisms such as automation; evolve systems by pushing for changes that improve reliability, efficiency, and velocity
• Practice sustainable user support, incident response, and blameless postmortems
• Bachelor’s degree in Computer Science, with at least 3 years of related experience
• Demonstrated independent thinking capabilities and troubleshooting skills
• Experience programming in one of the following languages: Python, Go, C, C++, Java, or Rust
• Familiar with backend systems such as MySQL/Redis/Nginx/Kafka/Kubernetes/Docker and big data technologies such as Hadoop/Spark/Flink/Hive/OLAP/ClickHouse, etc.
• Familiar with Unix/Linux system internals, networking, and distributed systems
• Good communication and coordination skills
• Experience in Trust & Safety is a plus

To apply for this job please visit