Platform Reliability Engineer

2 weeks ago


Greater Kuala Lumpur, Malaysia Abhidi Solution Full time 120,000 - 240,000 per year

Job Purpose

The Platform Reliability Engineer (PRE) is responsible for engineering, operating, and maintaining GEL's internal container platform and its supporting infrastructure, with a strong focus on reliability, resiliency, and security. As a Senior PRE within GEL's Infrastructure team, you will play a pivotal role in designing, building, and operating distributed container hosting solutions using Broadcom's Tanzu product.

Roles & Responsibilities

  • As a Senior Platform Reliability Engineer, you will play a key role in maintaining the stability, reliability, and efficiency of GEL's internal container platform and its supporting infrastructure. Your responsibilities will include core operational tasks such as resource provisioning and management, responding to platform and application outages, capacity planning, monitoring, and driving reliability enhancements.
  • You will continuously evaluate the platform's technical architecture to ensure it scales effectively with evolving application demands.
  • This includes proactively identifying and resolving reliability issues, analyzing product dependencies, pinpointing performance bottlenecks, and implementing optimization strategies to enhance platform availability and cost efficiency.
  • In this role, you will participate in a 24/7 on-call rotation, promptly addressing alerts from the global monitoring team and resolving production incidents to maintain platform and application uptime. Additionally, you will regularly review team workflows to identify manual processes and implement automation solutions that reduce effort and minimize human error.
  • Regularly review the security advisories issued by Broadcom for the Tanzu suite of products and deploy product updates as required to keep the platform free of known vulnerabilities.
  • Work with open-source technologies, CI/CD and SCM tools as necessary, and source control such as Bitbucket, and implement the organization's containers (e.g. Docker and Kubernetes). Stay current with industry trends and propose new ways for our business to improve.
  • Take accountability for considering business and regulatory compliance risks, and take appropriate steps to mitigate them.
  • Maintain awareness of industry trends in regulatory compliance, emerging threats, and technologies in order to understand the risks and better safeguard the company.
  • Highlight any potential concerns or risks and proactively share risk management best practices.

Our Requirements

  • Working experience as a Platform Reliability Engineer or strong working experience as a Site Reliability Engineer in a cloud operating environment. Candidates with excellent DevOps experience will be considered.
  • Strong experience in managing Tanzu Application Service and Kubernetes clusters.
  • Good working knowledge of DevOps pipeline and automation tools (e.g. Selenium, SoapUI, Bamboo, Jenkins, Ansible, Maven, GitHub, Bitbucket, Nexus, Jira, Confluence).
  • Strong technical and business acumen with the ability to lead a small technical team.
  • Experience with infrastructure-as-code, server templating, orchestration, configuration management, and provisioning tools (e.g. Terraform, Chef, Docker, Packer, Kubernetes) is advantageous.
  • Must be able to write, debug, and optimize code and automate repetitive tasks.
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive.
  • Experience in one or more of the following: C, C++, Java, Python, Go, Perl, or Ruby.
  • Strong experience in a Continuous Integration/Continuous Delivery (CI/CD) environment, with a strong appreciation of change and version control processes and methodologies.
  • Strong experience in dealing with platform upgrades, patching, and buildpack management.
  • Strong experience in troubleshooting network-related issues.
  • Good working knowledge of the NSX-T solution and its integration with the Tanzu suite of products.
  • Candidates should be open to taking on-call support on a rotation basis.
  • Candidates should be willing to work in shifts.