Senior Site Reliability Engineer

4 days ago


Mid Valley City, Federal Territory of Kuala Lumpur, Malaysia Horizontal Talent Full time

About Horizontal:
Established in the US in 2003, Horizontal solves complex challenges across two distinct businesses: Horizontal Digital and Horizontal Talent. We are consistently recognized as a top workplace and one of the fastest-growing private companies. Horizontal Talent specializes in staffing for the IT, Digital & Creative, and Business & Strategy markets. We have offices in the US, the UAE, India, Malaysia, and Australia.

About The Role
As a Senior SRE, you'll drive the development and execution of strategies for DevSecOps practices and platforms. Your work will ensure seamless collaboration between technology teams, enabling fast, reliable, high-quality software delivery.

You'll work with a team responsible for implementing and managing Infrastructure as Code (IaC), CI/CD pipelines, cloud-native and microservice architectures, automation frameworks, and release management processes, ensuring they align with organizational objectives.

What You'll Do

  • Lead the design and implementation of highly available, secure, and scalable banking infrastructure using infrastructure as code (IaC) principles
  • Establish and maintain SLOs/SLIs that define our reliability standards and drive accountability across engineering teams
  • Serve as an incident commander during critical service disruptions, leading cross-functional response teams with calm expertise
  • Build and enhance our observability platform, enabling real-time monitoring of our golden signals (uptime, latency, saturation, error rate)
  • Develop automation solutions for incident response, disaster recovery, and business continuity
  • Drive our DevSecOps platform to enable safe, rapid deployments through CI/CD, GitOps, and self-service capabilities
  • Lead FinOps initiatives that bring visibility and drive ownership among tech teams, optimizing infrastructure utilization while maintaining performance and reliability
  • Mentor junior engineers and contribute to a culture of operational excellence

What We're Seeking

  • At least 5 years of demonstrated experience in Site Reliability Engineering, DevOps, or equivalent roles
  • Strong understanding of cloud technologies (AWS, Azure, GCP, Alibaba Cloud)
  • Experience implementing CI/CD pipelines and GitOps workflows
  • Deep expertise with infrastructure as code tools (HashiCorp Terraform, OpenTofu, CloudFormation, or similar)
  • Proven ability to design and implement observability solutions using modern monitoring stacks
  • Experience leading incident response and building post-mortem processes
  • Strong understanding of Java or another object-oriented programming (OOP) language
  • Strong understanding of containerization and orchestration
  • Experience with messaging systems such as Kafka is an added advantage
  • Familiarity with relational and non-relational databases is a plus
  • Ability to balance hands-on technical expertise with strategic decision-making
  • Strong problem-solving skills and the ability to make sound decisions under pressure
  • A passion for continuous learning, innovation, and professional development
  • High ownership of responsibilities, with a focus on delivering results and meeting deadlines
  • Financial services experience is a plus but not required


  • Bangsar South, Federal Territory of Kuala Lumpur, Malaysia Razer Inc. Full time 80,000 - 120,000 per year

    Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working across a global team located across 5 continents. Razer is also a great place to work, providing you the unique, gamer-centric experience that will put you in an...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Guidewire Software Full time $100,000 - $200,000 per year

    Summary: We are searching for a Senior Site Reliability Engineer hungry for a rare chance to transform insurance with the industry's leading cloud platform. Job Description: Senior Site Reliability Engineer - Cloud Application. The Opportunity: At Guidewire, growth isn't just financial, it's technical. With Q3 FY2025 revenue up 22% year-over-year and a rapidly...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Guidewire Software Full time

    Summary: We are searching for a Senior Site Reliability Engineer hungry for a rare chance to transform insurance with the industry's leading cloud platform. Job Description: Senior Site Reliability Engineer - Cloud Application. The Opportunity: At Guidewire, growth isn't just financial, it's technical. With Q3 FY2025 revenue up 22% year-over-year and a rapidly growing...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Kneat Full time

    Site Reliability Engineer – Kuala Lumpur, Malaysia. Kneat enables regulated organizations to move from paper-based validation to intelligent, digitized, paperless solutions. And we do it through the ongoing development of a powerful, purpose-built software platform. In 2014, after eight years of intensive software development, we launched Kneat Gx, the...




  • Kuala Lumpur, Kuala Lumpur, Malaysia PERSOL APAC Full time

    About the company: We have partnered with a renowned global leader in information and communications technology (ICT) infrastructure and smart devices. They are providing full-stack, all-scenario solution for products and services carriers, enterprises, governments, and individual consumers worldwide. Our client is looking for enthusiastic Site Reliability...

  • Senior Engineer, I&C

    2 weeks ago


    KL Eco City, Federal Territory of Kuala Lumpur, Malaysia MODEC Offshore Production Systems (Singapore) Offshore Frontier Solutions Full time

    OFS Malaysia is a subsidiary of Offshore Frontier Solutions Pte. Ltd. (OFS), a MODEC Group company in Malaysia. Being part of MODEC means being the protagonist of a challenging career and being in touch with the latest deep-water production systems, knowing that your career begins in Malaysia, but your talent can take you anywhere in the world. If you want to...

  • Engineer, Piping

    1 week ago


    KL Eco City, Federal Territory of Kuala Lumpur, Malaysia MODEC Offshore Production Systems (Singapore) Offshore Frontier Solutions Full time

    OFS Malaysia is a subsidiary of Offshore Frontier Solutions Pte. Ltd. (OFS), a MODEC Group company in Malaysia. Being part of MODEC means being the protagonist of a challenging career and being in touch with the latest deep-water production systems, knowing that your career begins in Malaysia, but your talent can take you anywhere in the world. If you want to...


  • Kuala Lumpur, Kuala Lumpur, Malaysia FPT Software Malaysia Sdn. Bhd. Full time

    Key Responsibilities: Disaster Recovery Planning (DRP): Design and maintain scalable failover systems, backup strategies, and redundancy mechanisms across cloud and on-prem environments. Develop and update DR documentation, runbooks, and recovery playbooks for infrastructure and application layers. Business Continuity Testing: Plan, coordinate, and execute...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Encora Full time

    About the Role: We are seeking a Site Reliability Engineer (SRE) to support and manage our Infrastructure-as-a-Service (IaaS) platform built on the VMware Cloud Foundation (VCF) stack. You will be responsible for maintaining, automating, and optimizing the VCF environment to ensure high availability, scalability, and operational efficiency. This role is ideal...