Site Reliability Engineer

2 weeks ago


Kuala Lumpur, Kuala Lumpur, Malaysia WCC Full time

Join us in providing work that matters

WCC has changed lives since 1996. We are a group of highly ambitious professionals who believe in the greater story. WCC is more than just a software organization; we are a community that strives to improve human life. We provide software that matters.

Our product is an advanced Search and Match engine used in solutions for the private and public sectors.

We specialize in:

  • ID & Security Solutions - WCC enables governments to manage large volumes of Identity and Security data, protecting borders and citizens while providing legal identity for all
  • Employment Solutions - WCC enables Public and Private Employment Services to match people quickly and expertly with suitable and sustainable jobs

What's in It for You?

Our team - Our people believe unity is one of our strengths. So, if teamwork is important to you, we trust you will enjoy working in a team where people feel welcome, valued, and respected.

Work environment - We focus on talent and possibilities, not limitations. We love challenges and exploring new creative horizons. WCC has a diverse environment that gives every person the freedom to express their ideas.

We want to give you the conditions to do your best work, so here are the Perks and Benefits we provide:

  • Competitive salary
  • Indefinite contract
  • Health insurance
  • Travel allowance
  • 21 vacation days
  • 13th salary
  • Personal development opportunities
  • Hybrid policy: working from home and from the office
  • Home office budget
  • An opportunity to create an international and diverse network

Role

As a Site Reliability Engineer, you have a unique role in our organization, working at the intersection of software development, system administration, and IT operations. You support our product owners and DevOps team in determining which new features can be launched, and when, using service-level agreements (SLAs) to define the required reliability of the system through service-level indicators (SLIs) and service-level objectives (SLOs).

Responsibilities

  • Ensure the availability and efficient working of the services in compliance with the non-functional expectations
  • Plan and implement continuous improvements and changes in the ecosystem through automation
  • Drive service interruptions to resolution within the defined SLAs, with a mindset of continuous improvement
  • React to events (monitoring alerts, support escalations, internal incidents) that hit the application or the underlying infrastructure. Troubleshoot and resolve the service interruption, either hands-on or by guiding third parties through incident resolution with clear instructions
  • Provide information for root cause analysis and/or conduct postmortems and provide reports
  • Provide recommendations/workarounds for identified problems
  • Liaise and act with others (vendors, internal teams) for incident and problem management
  • Provide and implement improvements in proactive actions: extend monitoring, tune alerting and alert thresholds, increase observability of the services and log management
  • Documentation: Create documentation tuned for the intended audience, including runbooks, Knowledge Base articles, how-to articles
  • Communication: Communicate with different stakeholders and vendors at a technical level, and translate technical issues and concepts for non-technical users so they can assess the impact
  • Increase observability and manageability by:
    • Building and configuring logging, monitoring, and alerting
    • Providing information about what needs to be monitored, how, and the recommended thresholds
    • Participating in tuning and extending the monitoring implementation
  • Provide the mechanisms and preparation for possible system failures and outages and increase the robustness of the system
  • Participate in performance and capacity planning
  • Standby/on-call roster participation

In order to succeed in this role you must have:

  • Strong AWS knowledge and experience
  • At least 5 years of experience in running distributed production loads on a variety of technical stacks, with the ability to dive deep into complex problems
  • Proven experience in using software tools to automate IT operation tasks, including production system management, change management, application monitoring, etc. For example:
    • Knowledge in maintaining continuous integration (CI) and continuous deployment/delivery (CD) systems for complex, distributed applications, using tools like GitLab, Jenkins, etc.
    • Automating all aspects of deployment with CI/CD pipelines and Infrastructure as Code (IaC)
  • Proven ability to triage problems quickly, assess the problem's impact and severity, and provide an appropriate response. Ability to provide workarounds that keep the system working without neglecting root cause troubleshooting
  • Good working knowledge of ITIL processes and procedures (e.g. incident, problem, emergency change)
  • Broad technology experience with Automation Software
  • Fluent in English (written and spoken)

Bonus points for:

  • Experience in software development
  • Containers and container orchestration experience
  • Terraform, Ansible
  • Experience with CouchBase, Keycloak, Nginx, MySQL/MariaDB, PostgreSQL, ActiveMQ, ELK Stack
  • Good sense of humor

Sounds good?

Upload your motivation and CV in English via the "Apply" button. You will hear back from us within the blink of an eye.


