Site Reliability Engineer
7 days ago
As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on key SRE practices such as Service Level Objectives (SLOs), Service Level Indicators (SLIs), and the reduction of operational toil. You will collaborate closely with diverse teams to drive reliability improvements and foster a culture of continuous learning and accountability.
Key Responsibilities:
- Design and implement resilient system architectures that support high availability and scalability.
- Develop automation tools and scripts to enhance operational efficiency and reduce manual effort.
- Define, track, and analyze SLOs and SLIs to ensure reliability and performance meet business needs.
- Conduct thorough post-mortem analyses following incidents, driving continuous improvement through root cause identification and solution implementation.
- Collaborate with development and operations teams to establish best practices in system reliability and incident management.
- Troubleshoot and resolve issues related to database performance, network connectivity, and deployment failures, including diagnosing problems at the underlying platform level (e.g., Kubernetes, virtual machines).
- Ensure that issues are resolved within the stipulated Service Level Agreements (SLAs), maintaining high standards of service delivery.
- Identify and troubleshoot performance bottlenecks across systems, providing actionable recommendations for enhancements.
- Maintain detailed documentation of processes and incident responses to support knowledge sharing and compliance.
Qualifications:
- Proficiency in programming languages such as Python, Golang, Java, or similar, focusing on operational efficiency.
- Demonstrated experience in system architecture and design, prioritizing reliability and scalability.
- Strong understanding of SRE principles, including SLOs, SLIs, toil reduction, and incident post-mortems.
- Experience with cloud environments (e.g., AWS, Azure, Google Cloud) and their operational management.
- Strong expertise in Linux system administration.
- Proven experience in troubleshooting application support issues with a focus on performance and connectivity.
- Familiarity with networking concepts and effective troubleshooting techniques.
- Excellent problem-solving abilities and a proactive approach to operational challenges.
- Ability to work independently while effectively collaborating within a team environment.
Preferred Skills:
- Familiarity with monitoring tools and performance optimization techniques.
- Experience in scripting or automation for system administration tasks.
- Knowledge of networking concepts and troubleshooting methodologies.
- Hands-on knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and their services.
- Familiarity with DevOps practices and frameworks, including CI/CD, infrastructure as code, and containerization.
Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Human Resources Services
-
Site Reliability Engineer
2 weeks ago
Kuala Lumpur, Kuala Lumpur, Malaysia Chinasoft International (CSI) Full time. Role Description: This is a full-time on-site role for a Site Reliability Engineer based in WP. Kuala Lumpur. The Site Reliability Engineer will be responsible for maintaining system reliability and availability. Daily tasks will include troubleshooting issues, ensuring proper infrastructure setup, and...
-
Site Reliability Engineer
1 week ago
Kuala Lumpur, Kuala Lumpur, Malaysia Teknowiz Full time. Site Reliability Engineer (DevOps Consultant). We are urgently hiring for one of our Big 4 clients in Malaysia. Job Title: Site Reliability Engineer (DevOps). Location: KL/Johor/Penang (Onsite). Job Overview: As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help...
-
Senior Site Reliability Engineer
2 weeks ago
Kuala Lumpur, Kuala Lumpur, Malaysia MindPec Solutions Full time. Role: Senior Linux Site Reliability Engineer. Client: IT Services. Working Mode: On Site (Monday to Friday, 9 AM to 6 PM). Job Type: Permanent. Experience: More than 3 years of experience as a Site Reliability Engineer or DevOps Engineer. Applicants: Open to local candidates who can speak English and Chinese. Job Description: Develop, deploy, and maintain reliable,...
-
Site Reliability Engineer
2 weeks ago
Kuala Lumpur, Kuala Lumpur, Malaysia PERSOLKELLY Malaysia Full time. This job is for a Site Reliability Engineer focusing on network systems. You might like this job because you'll automate tasks, ensure services run smoothly with minimal downtime, and work with coding languages like Java or Python in a dynamic tech environment. Job Description: Ability to debug scripts and automate routine tasks in OS, network, database or...
-
Senior Site Reliability Engineer
3 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Ant International Full time. Senior Site Reliability Engineer (DevOps). Ant International powers the future of global commerce with digital innovation for everyone and every business to thrive. In close collaboration with partners, we support merchants of all sizes worldwide to realize their growth aspirations through a comprehensive range of tech-driven digital payment and financial...
-
Site Reliability Engineer
2 weeks ago
Kuala Lumpur, Kuala Lumpur, Malaysia Glints Full time. Ready to elevate your career with a globally recognized professional services firm? We are seeking a skilled DevOps / SRE Specialist to join our team. You'll be at the forefront of transforming business challenges into cutting-edge technology solutions, working alongside diverse...
-
Site Reliability Engineer
4 hours ago
Kuala Lumpur, Kuala Lumpur, Malaysia iSoftStone Full time. Ensure the stability of Alibaba Apsara Stack and the cloud services running on it. Carry out health checks, operation and maintenance, and troubleshooting tasks....
-
Site Reliability Engineer, Principal
3 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia AIA Hong Kong and Macau Full time. Site Reliability Engineer, Principal. Locations: Kuala Lumpur, MY-AIA Malaysia. Time type: Full time. Posted on: Posted 30+ Days Ago. Job requisition id: JR-45612. At AIA we've started an exciting movement to create a healthier, more sustainable future for everyone. As pioneering innovators for over 100 years,...
-
Site Reliability Engineer
4 hours ago
Kuala Lumpur, Kuala Lumpur, Malaysia WCC Full time. Join us in providing work that matters. WCC has changed lives since 1996. We are a group of highly ambitious professionals who believe in the greater story. WCC is more than just a software organization; we are a community that strives for improving human life. We provide software that matters. Our product is an advanced Search and Match engine used in...
-
Senior DevOps Engineer
7 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Razer Inc. Full time. About Razer Inc.: Razer is a leading gaming company that empowers gamers and tech enthusiasts around the world. We strive to create innovative products and experiences that inspire creativity and self-expression. As a Senior Site Reliability Engineer, you will be part of our team responsible for ensuring the reliability and scalability of our infrastructure. You...
-
Senior Site Reliability Engineer
7 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Razer Inc. Full time. Senior Site Reliability Engineer. Locations: Bangsar South. Time type: Full time. Posted on: Posted 30+ Days Ago. Job requisition id: JR2024004634. Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working across a global team...
-
Site Reliability Specialist
7 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Unison Consulting Full time. We are seeking a skilled Site Reliability Engineer (SRE) to join our team at Unison Consulting. As an SRE, you will be responsible for ensuring the reliability and performance of our services. Responsibilities: Design and implement automated operational processes; manage and monitor Linux-based systems for performance, availability, and security; collaborate with...
-
Reliability Engineering
6 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Jones Lang LaSalle Incorporated Full time. Reliability Engineering & Operations Director. Remote type: On-site. Locations: Kuala Lumpur, Malaysia. Time type: Full time. Posted on: Posted yesterday. Job requisition id: REQ420122. JLL empowers you to shape a brighter way. Our people at JLL and JLL Technologies are shaping the future of real estate for a better world by combining world class...
-
Site Reliability Engineer
2 weeks ago
Kuala Lumpur, Kuala Lumpur, Malaysia DUG Technology Full time. We are a technology company at the forefront of high-performance computing (HPC) with a strong foundation in applied physics. Our innovative hardware and software solutions for the global technology and resource sectors enable our clients to leverage large and complex data sets. We operate three world-class green supercomputer clusters, running a large suite...
-
Reliability Engineering Manager
6 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Jones Lang LaSalle Incorporated Full time. Job Overview: Solicit applications for a Reliability Engineering & Operations Director position at Jones Lang LaSalle Incorporated. This role is responsible for developing and implementing reliability-centered maintenance strategies across the region, overseeing preventive and predictive maintenance programs for all facilities and assets, and supporting the...
-
Reliability Engineering Specialist
21 hours ago
Kuala Lumpur, Kuala Lumpur, Malaysia PEOPLE PROFILERS Full time. Job Description: We are seeking an experienced Reliability Engineering Specialist to join our team. In this role, you will be responsible for the management and delivery of a system(s) within a platform leveraging agile practices. The ideal candidate will have 3 to 6 years of relevant experience in DevOps, SRE, and a full understanding of Site...
-
Reliability Engineering
3 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia JLL Full time. JLL empowers you to shape a brighter way. Our people at JLL and JLL Technologies are shaping the future of real estate for a better world by combining world class services, advisory and technology for our clients. We are committed to hiring the best, most talented people and empowering them to thrive, grow meaningful careers and to find a place where they...
-
Reliability Engineer
7 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Teknowiz Full time. About Teknowiz: We are a forward-thinking company that delivers innovative IT solutions to our clients. Our mission is to provide top-notch services that meet the evolving needs of modern businesses. As a Site Reliability Engineer at Teknowiz, your primary focus will be on maintaining the uptime and performance of critical systems. This involves designing and...
-
Reliability Engineer
2 weeks ago
Kuala Lumpur, Kuala Lumpur, Malaysia Exxon Mobil Full time. At ExxonMobil, our vision is to lead in energy innovations that advance modern living and a net-zero future. As one of the world's largest publicly traded energy and chemical companies, we are powered by a unique and diverse workforce fueled by the pride in what we do and what we stand for. The success of our Upstream, Product Solutions and Low Carbon...
-
Reliability Engineer
7 days ago
Kuala Lumpur, Kuala Lumpur, Malaysia Exxon Mobil Full time. At ExxonMobil, our vision is to lead in energy innovations that advance modern living and a net-zero future. As one of the world's largest publicly traded energy and chemical companies, we are powered by a unique and diverse workforce fueled by the pride in what we do and what we stand for. The success of our Upstream, Product Solutions and Low Carbon...