Site Reliability Engineering Manager

7 days ago


Kuala Lumpur, Kuala Lumpur, Malaysia | Avensys Consulting | Full time

If you are passionate about playing a key role in the success of a top-notch company, we want to hear from you.

Our client is a well-established brand in the information technology industry and is now looking for a passionate and driven Site Reliability Engineering Lead. This is an exciting opportunity to expand your skill set while achieving job satisfaction and work-life balance.

ROLES & RESPONSIBILITIES:

  • Contribute to system design and deployment phases with a focus on scalability, reliability, and operability. Ensure that production readiness is considered at every stage of the software lifecycle.
  • Develop automation scripts, infrastructure as code, and tooling using industry best practices to improve system reliability, reduce manual effort, and enable self-service.
  • Review system architectures, deployment strategies, observability setups, and operational documentation to ensure reliability and operational excellence.
  • Analyze production issues, identify root causes, and implement long-term reliability improvements through automation, monitoring, and architectural enhancements.
  • Work collaboratively with other team members and provide guidance to more junior team members.
  • Organize efficient handovers through high-quality documentation and training.
  • Automate the deployment and operation of multi-tenant infrastructure, handling tasks that ensure system resilience and availability.
  • Develop and maintain monitoring tools, dashboards, and self-healing mechanisms.
  • Participate in on-call rotations, conduct blameless postmortems, and drive continuous learning.
  • Work closely with developers, product teams, and engineering stakeholders to troubleshoot issues, improve systems, and integrate reliability improvements.

REQUIREMENTS

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Minimum 6 years of experience in Site Reliability Engineering or software development within an international company.
  • Must have hands-on experience with ELK, Linux, Kafka, and build automation solutions.
  • Hands-on experience with CI/CD and deployment tools such as Ansible, Jenkins, Maven, Nexus, Git, and Docker.
  • Proficiency in Linux OS.
  • Proficiency in scripting and automation (e.g. Python, PowerShell, YAML) with the ability to develop tools and infrastructure as code.
  • Familiarity with Java-based systems with the ability to understand code for root cause analysis.
  • Understanding of distributed systems and microservices architectures, including REST and SOAP APIs.
  • Experience with databases, including NoSQL platforms.
  • Familiarity with performance and reliability testing tools such as JMeter or Postman.
  • Exposure to observability and analytics technologies; experience with Elasticsearch or reporting tools like Power BI is a plus.
  • Practical experience working in Agile-driven teams.
  • Strong interpersonal and communication skills, with a customer-centric mindset and the ability to work effectively across cultures.
  • Demonstrated ability to collaborate with distributed teams across multiple time zones.

WHAT'S ON OFFER

You will be remunerated with an excellent base salary and entitled to attractive company benefits. You will also enjoy a fun and collaborative work environment, along with strong career progression.

To submit your application, please apply online or email your UPDATED CV in Microsoft Word format to - . Your interest will be treated with strict confidentiality.

CONSULTANT DETAILS

Consultant Name : Emimal Joshua

Reg No :

Avensys Consulting Sdn Bhd

Privacy Statement: Data collected will be used for recruitment purposes only. Personal data provided will be used strictly in accordance with the relevant data protection law and Avensys' privacy policy.


