Lead Data Engineer
2 days ago
About Us
At YTL AI Labs, we build sovereign AI models that perform on par with the world's best, while staying grounded in local needs, values, and context. Our flagship model, Ilmu, is designed to be culturally aware, contextually intelligent, and fluent in Bahasa Melayu, delivering cutting-edge solutions that empower Malaysian businesses with intelligence that truly understands the market and the people they serve.
As pioneers of sovereign AI, we believe every nation should have the power to shape its own intelligence, guided by its people, priorities, and principles.
About the Role
As the Lead Data Engineer on the Data Team, you will lead the design and implementation of scalable, high-fidelity data acquisition systems that power cutting-edge AI research and deployment. You will architect end-to-end data pipelines that span diverse modalities, such as text, code, and multimedia, ensuring efficiency, reliability, and quality at web-scale. In this role, you will mentor engineers, shape technical direction, and collaborate across research, infrastructure, and product teams to ensure our data ecosystem meets the evolving needs of large-scale AI systems.
You'll be responsible for
1. Strategic Web-Scale Data Acquisition
- Architect and oversee the development of robust, distributed pipelines for crawling and ingesting structured and unstructured data from various sources (e.g., forums).
- Lead the design of scraping systems using Scrapy, Playwright, Selenium, etc., optimized for dynamic content.
- Define and enforce standards for data normalization and harmonization to ensure semantic consistency across datasets.
- Drive initiatives to expand multilingual and multimodal data coverage.
2. Metadata Strategy and Dataset Lineage
- Define metadata schemas and enrichment strategies (e.g., licensing, timestamps, source reliability).
- Build metadata validation and tracing systems to ensure transparency, reproducibility, and ethical dataset use.
3. Scalable, Secure, Resilient Web Interaction
- Develop web interaction strategies.
- Deploy and monitor scraping infrastructure for high-throughput, fault-tolerant operations.
4. Data Infrastructure and Governance
- Design modular data lake architectures and metadata-aware repositories (e.g., Elasticsearch, HBase, Snowflake, MongoDB).
- Standardize dataset versioning and lineage using tools like DVC, Dolt, or HuggingFace Datasets.
- Establish IaC-based deployment and monitoring using Terraform, Helm, Kubernetes, Airflow, etc.
5. Team Leadership and Cross-Functional Collaboration
- Lead and mentor data engineers.
- Partner with AI research teams to align data priorities with model development goals.
- Work with infrastructure teams to optimize scalability, performance, observability, and cost.
- Ensure compliance with legal and responsible data sourcing standards.
What We're Looking For
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or related field.
- 3+ years of experience building and scaling complex data pipelines or backend systems, with leadership responsibility.
- Deep expertise in web crawling systems, distributed systems, and data normalization workflows.
- Strong understanding of metadata governance and dataset traceability in AI/ML workflows.
- Proficient in Python, SQL, and distributed compute frameworks (e.g., Spark, Dask); experienced with orchestrators like Airflow or Prefect.
- Experience with dataset versioning tools (DVC, Dolt, HuggingFace Datasets) and data stores (Elasticsearch, MongoDB, Snowflake, object storage).
- Strong communication and mentoring capability.
- Familiarity with NLP, LLMs, data ethics, or large-scale dataset development is preferred.
If you're looking to do meaningful work with people who care about how we get there, we'd love to meet you.