Site Reliability Engineer, Principal

3 days ago


Kuala Lumpur MYAIA Malaysia AIA Group Full time $90,000 - $120,000 per year

At AIA we've started an exciting movement to create a healthier, more sustainable future for everyone.

As pioneering innovators for over 100 years, we're now transforming our organisation to be faster, simpler and more connected, so that we are even better equipped to develop digital solutions and experiences that help more people live Healthier, Longer, Better Lives.

To get there, we need people with tech/digital/analytics expertise and passion to help develop positive, sustainable change through digitally enhanced experiences that will impact the lives of millions of people and create a healthier future for everyone.

If you believe in developing a better tomorrow, read on. 

About the Role

The Site Reliability Engineer (SRE) is responsible for ensuring that our cloud application systems are reliable and available to users. The SRE will supervise application systems, establish automated detection, perform root cause analysis, and formulate preventive actions. They will gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding. They will partner with development teams to improve services.

Functional Duties:

  • Set up and maintain monitoring of infrastructure and applications

  • Build alerts and automated recovery for various operational issues

  • Capture and analyze metrics from operating systems as well as applications

  • Advise in performance tuning and fault finding

  • Partner with development teams to improve services

  • Assist in formulating preventive actions where possible, lead studies of potential failure scenarios, and formulate automated recovery methods

  • Work comfortably with new tools, e.g., Azure DevOps, Grafana, ELK, Dynatrace

People Management Duties:

  • Train and mentor other consultants or teammates on your specialties

  • Act as an advisor to application teams and assist them in establishing recovery processes

Requirements:

  • Tertiary qualification in Computer Science or another relevant field

  • Programming Languages: Java 8 or above (must have)

  • Experience in developing and optimizing stored procedures for MySQL and MSSQL databases

  • OS: Linux (RHEL or SUSE) or Windows Server

  • Scripting (at least one is a must): Shell, Bash, PowerShell

  • Knowledge of Git, the open-source distributed version control system

  • Sound knowledge of how REST APIs work

  • Experience in Atlassian tools (e.g., Jira, Bitbucket, Confluence)

  • Familiarity with Azure Cloud services

  • Working experience with ITIL in an Agile environment

Good to have:

  • Experience with Python programming language

  • Experience with containerization (Docker, AKS, ACR, EKS, ECS)

  • Experience in CI/CD with Azure DevOps

  • Experience in dashboard development with Grafana, Azure Monitor, or Dynatrace

  • Experience in infrastructure management with Terraform or Ansible

  • Azure or AWS cloud certification would be an added advantage

Build a career with us as we help our customers and the community live Healthier, Longer, Better Lives.

You must provide all requested information, including Personal Data, to be considered for this career opportunity. Failure to provide such information may influence the processing and outcome of your application. You are responsible for ensuring that the information you submit is accurate and up-to-date.



  • Kuala Lumpur Office, Malaysia Guidewire Full time 100,000 - 120,000 per year

    Summary: We are searching for a Senior Site Reliability Engineer hungry for a rare chance to transform insurance with the industry's leading cloud platform. Job Description: Senior Site Reliability Engineer - Cloud Application. The Opportunity: At Guidewire, growth isn't just financial, it's technical. With Q3 FY2025 revenue up 22% year-over-year and a rapidly growing...


  • Kuala Lumpur, Kuala Lumpur, Malaysia VCB Malaysia Berhad Full time 144,000 - 156,000 per year

    Overview: As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on...


  • Kuala Lumpur, Kuala Lumpur, Malaysia PeopleScope Full time 60,000 - 120,000 per year

    Site Reliability Engineer. Job Description: Ability to debug scripts and automate routine tasks in OS, network, database or application servers; coding experience beyond simple scripts; experience in DevOps processes; programming knowledge in at least one of the following languages: Java, Python, or Go; scripting skills in at least one of the following:...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Abhidi Solution Private Limited Full time 120,000 - 180,000 per year

    Job Title: Site Reliability Engineer (SRE). Job Type: Permanent position. Work Location: Kuala Lumpur. Responsibilities: Strong hands-on experience with VMware solutions; strong experience with patch management for OS & middleware; experience in VMware server templating/blueprints (RedHat & Windows); experience with Infrastructure-as-Code, orchestration, configuration...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Hunters International Full time 19,000 per year

    Overview: As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Unison Consulting Full time 120,000 - 240,000 per year

    As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on key SRE...


  • Malaysia HCLTech Full time 90,000 - 120,000 per year

    Job Purpose: The Platform Reliability Engineer (PRE) is responsible for engineering, operating, and maintaining GELs internal container platform and its supporting infrastructure, with a strong focus on reliability, resiliency, and security. As a Senior PRE within GELs Infrastructure team, you will play a pivotal role in designing, building, and operating...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Unison Group Full time 120,000 - 240,000 per year

    As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on key SRE...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Career Wise Full time 120,000 - 240,000 per year

    As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on key SRE...


  • Kuala Lumpur, Kuala Lumpur, Malaysia Aisling Group Full time 90,000 - 120,000 per year

    COMPANY PROFILE: Our client is a Tech Ecommerce Scale-Up that provides a single platform for customers to shop for the best price online. They also provide customers with data and insights on the latest trends and the e-commerce sector. They are looking for Site Reliability Engineers (SREs) who are responsible for keeping all services and...