XOC Analyst

7 months ago


Johor Bahru, Malaysia | BYTEBRIDGE TECHNOLOGY SDN. BHD | Full time

**Scope of Work (SOW)**

**Incident & Problem Management**
- Investigate and respond to alerts, participate in incident response (war rooms, remote bridges) and reporting, and carry out ongoing maintenance, tuning, and improvement of detection signals.
- Respond to incidents and critical situations in a calm, problem-solving manner, and conduct in-depth investigation of alerts
- Serve as the first line of defense, responsible for quick detection and incident response using various monitoring and automation tools, including thorough investigation, classification, and triage of alerts.
- Provide resolver groups with a clear understanding of, and intelligence on, the criticality and impact of incidents.
- Maintain detailed records of alarm handling activities, including actions taken and resolutions, in ticketing tools, and file incident reports.
- Be available to coordinate as an incident commander in the event of an issue.
- Support program managers, facilitate project deliverables, and improve overall operational and engineering initiatives.
- Conduct root cause analysis (RCA) to trace recurring problems to their source.
- Employ in-depth questioning and analysis techniques, such as the Five Whys, to determine the underlying cause of an incident or problem.
- Perform duties in compliance with SOP.

**Server, DCIM, Network and Traffic Alarms Operations**
- Continuously monitor alarm dashboards and systems.
- Investigate and respond to alarms, including but not limited to Network, DC Environment, Server Health, and Facility Security and Safety alarms.
- Identify and acknowledge incidents associated with alarms.
- Assess incidents to determine their criticality and impact on operations.
- Engage the resolver group responsible for the incident and escalate to higher tiers or management when necessary, following established escalation paths.
- Follow documented procedures to resolve incidents promptly and effectively.
- Maintain detailed records of alarm handling activities, including actions taken and resolutions, in ticketing tools.
- Perform duties in compliance with SOP.

**Threat Intelligence & Critical Event Management**
- Monitor directed tools or queries for specific requests from stakeholders.
- Issue notifications about violence, inclement weather, and threats to life, property, assets, etc.
- Coordinate emergency response efforts, including liaising with law enforcement if needed.
- Conduct research to verify the accuracy and relevance of the information through additional sources.
- Create heatmaps of affected areas to highlight the impact of a specific event or series of events.
- Collaborate with other security and operational teams for a coordinated response.
- Implement incident containment and mitigation strategies.
- Document incident details, response actions, and lessons learned.
- Perform duties in compliance with SOP.

**Physical Security and Safety**
- Perform basic monitoring of Closed-Circuit Television (CCTV) systems and Access Control Systems (ACS).
- Monitor safety alarms and communication channels for events that pose a risk to the safety of personnel or the data center infrastructure, including but not limited to electrical incidents, fire and environmental hazards, equipment failure, chemical exposure, and water leaks.
- Conduct audits of camera footage to ensure proper functioning, video quality, and coverage of critical areas.
- Respond to access control incidents and anomalies.
- Promptly report findings to security and safety engineers and relevant stakeholders.
- Perform duties in compliance with SOP.

**Badge Management**
- Perform badge enrolment, ensure that all requests go through the proper approval process, and assess the accuracy and completeness of each request in compliance with SOP.
- Generate access log reports.
- Conduct access log audits.

**Continuous Service Improvement**
- Identify areas of improvement within current service delivery processes.
- Implement changes that lead to measurable enhancements in service quality, efficiency, and customer satisfaction.
- Establish a culture of continuous improvement within the organization.
- Establish mechanisms for ongoing feedback collection from customers and employees.
- Integrate feedback into future continuous improvement efforts.

**Basic Qualifications**
- 2+ years of experience in a command center, service center, or similar 24x7 operations center environment
- Ability to quickly triage multiple incidents and assign the right priority based on risk and confidence levels
- Knowledge of technical elements associated with systems such as IP Networks, DC Environment and Server Health.
- Outstanding verbal and written communication skills; ability to work with minimal direction and meet goals, with attention to detail and an eye for continuous improvement
- Ability to interact successfully at all levels of the organization, including with clients, while functioning as a team player.
- Basic working knowledge of data protection policies such as GDPR and the need to keep sensitive information secure.