Job Description
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Sabre’s services, both internally critical and externally visible, have reliability and uptime appropriate to users’ needs, along with a fast rate of improvement. Site Reliability Engineers are responsible for maintaining the tools, systems and platforms behind Sabre services. This includes troubleshooting problems with systems and services, regularly deploying new versions of systems and their subcomponents, validating and testing deployments, monitoring services, and standing up new services and tools.
• Work under minimal supervision with few direct instructions
• Follow SRE principles and collaborate with other teams to spread SRE methodology
• Take the initiative to automate our processes and support our clients
• Develop release documentation covering all aspects of each new release
• Design and build environments based on requirements
• Design, build, maintain and utilize automation tools to reduce manual and repetitive tasks
• Actively participate in production incident response and on-call support
• Troubleshoot, diagnose, and take corrective action to improve the stability and performance of systems running in the production environment
• Demonstrate a strong commitment to teamwork, knowledge sharing and documentation
Job Requirements
• Ability to work independently and to collaborate within a team
• Ability to diagnose, troubleshoot, and repair system-related problems
• Experience in deploying, maintaining, and monitoring applications on Linux systems
• Experience installing, configuring and troubleshooting Java applications
• Knowledge of CI/CD and build tools (Jenkins, TeamCity, Maven) and version control systems (Git)
• Knowledge of internet protocols and web service technologies (TCP, HTTP, SOAP, REST)
• Experience with cloud platforms (GCP, AWS or Azure)
• Very good knowledge of English and Polish (both written and verbal)
• Experience with Infrastructure as Code technologies (e.g., CloudFormation, Terraform, Ansible)
• Knowledge of cloud technologies (GCP, Kubernetes, Apigee)