Senior Site Reliability Engineer

4 days ago


Kuala Lumpur, Kuala Lumpur, Malaysia | MindPec Solutions | Full time

Role: Senior Linux Site Reliability Engineer

Client: IT Services

Working Mode: On-site (Monday to Friday, 9 AM to 6 PM)

Job Type: Permanent

Experience: More than 3 years of experience as a Site Reliability Engineer or DevOps Engineer.

Applicants: Open to local candidates who can speak English and Chinese.

JOB DESCRIPTION
  1. Develop, deploy, and maintain reliable, high-performance site infrastructure.
  2. Implement and manage comprehensive monitoring solutions (metrics, logs, and traces) using tools such as Grafana, Prometheus, or other relevant platforms.
  3. Automate processes for deployment, system health monitoring, and incident response.
  4. Troubleshoot and resolve complex technical issues across development, testing, and production environments to minimize downtime.
  5. Collaborate with development teams to enhance system performance and reliability.
  6. Manage containerized applications using Kubernetes.
  7. Participate in on-call rotations, responding to emergencies to maintain system uptime.
  8. Optimize infrastructure for scalability and fault tolerance.

JOB REQUIREMENTS
  1. Kubernetes & Docker: Experience managing containerized applications and clusters.
  2. Nginx: Knowledge of web server and reverse proxy configuration.
  3. Ansible: Automation of system configuration and deployment.
  4. Zabbix & Grafana: Experience with monitoring and visualizing system metrics.
  5. AWS: Hands-on experience with cloud infrastructure management (EC2, S3, etc.).

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Engineering and Information Technology

