Senior Site Reliability Engineer

ID: 642458
Type: Contingency
Location: Falcon Heights, MN
Contact Email:
Salary: Open
Our client is seeking a Senior Site Reliability Engineer to join their team! You love applying the knowledge of operational excellence and service design you have gained from working with complex systems: you dig into problems or processes, define the moving pieces, and automate a solution (particularly using Python!). You pair detailed analysis and problem-solving with strong communication skills, a sense of ownership, and drive. Your insatiable need to understand a system's inner workings motivates you to find flaws in implementation or design and to collaborate with others to improve understanding, performance, and reliability.

This job will have the following responsibilities:
  • Continually train to be prepared to identify and resolve a variety of system failures 
  • Conduct blameless after-incident retrospectives and drive the identified outcomes into our applications and systems to improve our client experience 
  • Develop deep insights and analysis into the quality of experience for our customers and the quality of service of our platform
  • Implement improvements to service resiliency 
  • Gracefully scale systems ahead of need using fact-based metrics combined with deep knowledge of our platform, and mature systems by pushing for changes that improve reliability and velocity
  • Design, create, and maintain production monitoring systems that cover our platform from end to end
  • Automate everything

Qualifications & Requirements:
  • 5-7 years of technical experience
  • Demonstrated proficiency in one of the following: Java, JavaScript, Python, Kotlin
  • Experience with a subset of these tools or their equivalents: Python, Kubernetes, Docker, Java, Spring Boot, Azure, Instana, AppDynamics, Splunk, Catchpoint, Grafana, Prometheus, Pivotal Cloud Foundry, Node.js, Angular, Jenkins