Site Reliability Engineer, IaaS

Company: Algolia
Location: Paris
Closing Date: 28/10/2024
Salary: £80 - £100 Per Annum
Type: Temporary
Job Requirements / Description
Algolia is set to enable every company to create world-class Search and Discovery experiences with an API-first approach. Performance and scalability are at the heart of our mission: we power trillion searches a year, for 10K+ customers all over the world. If you're a problem solver, able to think outside the box and eager to nurture others and learn from them, then this is your challenge!

The Team

The Infrastructure as a Service (IaaS) team aims to uphold the reliability and scalability we expect from Algolia’s infrastructure for its critical systems and products. Our focus is on enabling teams across Algolia to leverage this infrastructure while keeping it under control through an ever-increasing level of automation.

The Opportunity

The Site Reliability Engineer position within the Infrastructure as a Service team provides a dynamic opportunity for a professional with foundational experience in maintaining and optimizing scalable infrastructures. This role concentrates on three key areas: server and container hosting, cloud and network expertise, and flawless observability.

As a member of the Infrastructure as a Service team, you will play a key role in supporting the reliability and scalability of Algolia’s Search products and core internal services. Your responsibilities will include operating components or features, ensuring proper monitoring and alerting are in place, and assisting in the transition from legacy systems. You will work on planning and accountability for the next quarter, demonstrating independence in problem-solving and minimal reliance on managers and senior team members.

Your role will consist of:

  • Kubernetes and Cloud Services Management: Help maintain and optimize a fleet of Kubernetes-based architectures and cloud services, enhancing fault tolerance and resource utilization.
  • System Management and Configuration: Continuously improve and refine the infrastructure code and automation that manage our fleet of several thousand servers, keeping it safe, efficient, and reliable.
  • Maintain and Extend our Control Plane: Go beyond our current control plane and turn it into a platform that everyone at Algolia can leverage to build performant, reliable, and scalable products.
  • Observability Implementation: Support the development and deployment of observability solutions, providing your team and others with actionable insights to track and enhance system reliability.
  • Collaboration and Problem Solving: Work collaboratively with team members to identify and solve problems, reducing dependence on senior staff for guidance.
  • Process Improvement: Contribute to establishing engineering processes and best practices to ensure high-quality, reliable, and scalable systems.

You might be a fit if you have:

  • Programming Skills: Basic to intermediate knowledge of programming languages such as Python, Ruby, or Golang, with an understanding of software craftsmanship.
  • Experience with Linux and Kubernetes: Experience in setting up and managing fleets of Linux servers and Kubernetes-based architectures.
  • Knowledge of Distributed Systems: Exposure to operating distributed systems and understanding their challenges at a basic level.
  • Public Cloud Experience: Familiarity with public cloud providers such as Microsoft Azure, AWS, or GCP.
  • Problem-Solving Skills: Ability to independently identify and solve problems, demonstrating initiative and minimal reliance on senior team members.
  • Communication and Organization Skills: Strong communication and organizational skills to effectively collaborate with team members and stakeholders.
  • 3 years or more of related work experience.

We’re looking for someone who can live our values:

  • GRIT - Problem-solving and perseverance capability in an ever-changing and growing environment.
  • TRUST - Willingness to trust our co-workers and to take ownership.
  • CANDOR - Ability to receive and give constructive feedback.
  • CARE - Genuine care about other team members, our clients, and the decisions we make in the company.
  • HUMILITY - Aptitude for learning from others, putting ego aside.

REMOTE STRATEGY:

Algolia’s flexible workplace model is designed to empower all Algolians to fulfill our mission to power search and discovery with ease. We place an emphasis on an individual’s impact, contribution, and output, over their physical location. Algolia is a high-trust environment and our team members have the autonomy to choose where they want to work and when. We know community comes in many forms and strive to create opportunities for intentional in-person connection in our offices and virtually for our remote colleagues around the world. We have a global presence with physical offices in San Francisco, NYC, Paris, London, Sydney, and Bucharest.