Data Engineer
6 days ago
About the Institute of Foundation Models
We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next generation of AI builders, and drive transformative contributions to a knowledge-driven economy.
As part of our team, you'll have the opportunity to work on the core of cutting-edge foundation model training, alongside world-class researchers, data scientists, and engineers, tackling the most fundamental and impactful challenges in AI development. You will participate in the development of groundbreaking AI solutions that have the potential to reshape entire industries. Strategic and innovative problem‐solving skills will be instrumental in establishing MBZUAI as a global hub for high‐performance computing in deep learning, driving impactful discoveries that inspire the next generation of AI pioneers.

The Role
As a Data Engineer specializing in Natural Language Processing (NLP) and large‐scale data processing, you will quickly and effectively gather, curate, and prepare high‐quality datasets to support cutting‐edge NLP research. Your role will be instrumental in enabling researchers by delivering essential data through efficient and scalable engineering practices, including web crawling, LLM‐generated content refinement, and robust data pipelines, primarily leveraging Python and related technologies.

Key Responsibilities
- Rapidly collect, curate, and preprocess datasets based on detailed specifications provided by NLP researchers, delivering data within tight timelines (typically within 1-2 days).
- Develop and maintain efficient web crawling solutions, APIs, and automated workflows to continuously improve data collection processes.
- Refine and evaluate outputs from Large Language Models (LLMs) to generate structured datasets suitable for model training and benchmarking.
- Implement scalable data pipelines, ensuring efficient data processing, storage, retrieval, and distribution to research teams.
- Collaborate closely with researchers and engineers to ensure collected data meets specified quality and relevance criteria.
- Document data collection methodologies, dataset characteristics, and pipeline architecture clearly and effectively.
- Engage with peer teams and participate in technical reviews to uphold best practices and data quality standards.
- Represent MBZUAI at industry and research forums, showcasing technical capabilities in large‐scale data processing and AI data infrastructure.
- Perform all other duties as reasonably directed by the line manager commensurate with these functional objectives.

Academic Qualifications
- Bachelor's degree in Computer Science, Data Science, Engineering, or a related technical field required.
- Master's degree or equivalent experience in Computer Science, Data Engineering, or related technical fields preferred.

Professional Experience - Required
- Extensive experience in data engineering, data processing, and automation using Python.
- Demonstrated proficiency in designing and deploying web crawling solutions, automated data extraction, and processing pipelines.
- Strong understanding of data structures, algorithms, databases, SQL, and performance optimization.
- Experience working with cloud infrastructure and distributed data processing frameworks (e.g., AWS, Spark, Kafka, Kubernetes).
- Excellent problem‐solving abilities, attention to detail, and the capability to rapidly address technical challenges.
- Strong communication and collaboration skills with cross‐functional teams.

Professional Experience - Preferred
- Proven track record of supporting NLP or AI research teams with rapid and reliable data delivery.
- Experience with refining outputs from large‐scale AI models, such as LLM‐generated data.
- Contributions to open‐source projects, coding competitions, or high visibility in coding communities (e.g., GitHub, Stack Overflow).
- Familiarity with the latest advancements in NLP data processing and large language model technologies.
-
Senior Network Engineer for Juniper
2 weeks ago
Abu Dhabi, United Arab Emirates Alpha Data Full time Senior Network Engineer for Juniper - May 9, 2013 Full‑time Alpha Data is one of the largest multi‑disciplined systems integrators in the United Arab Emirates. Founded in 1981, Alpha Data has grown from two employees to a 700‑strong workforce building ICT infrastructure solutions for thousands of organizations. Alpha Data works with its clients through...
-
Data Engineer
1 week ago
Abu Dhabi, United Arab Emirates Contango Full time About the Role We are seeking a motivated and technically versatile Data Engineer to join our team. You will play a key role in delivering data platforms, pipelines, and ML enablement within a Databricks on Azure environment. As part of a stream-aligned delivery team, you'll work closely with Data Scientists, Architects, and Product Managers to build scalable,...
-
Abu Dhabi, United Arab Emirates Amazon Data Services Emirates LLC Full time Experience with computer hardware troubleshooting and repair - Experience in networking - Bachelor's degree in information technology or computer science AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data...
-
Data Engineer
16 hours ago
Abu Dhabi, United Arab Emirates Durlston Partners Full time A leading global investment organization is seeking a Data Engineer to build and maintain robust data platforms supporting trading, analytics, and investment decision-making. The role involves working closely with trading and technology teams on complex financial datasets and time‑critical systems. Key Responsibilities - Develop and maintain referential...
-
Data Engineer
2 weeks ago
Abu Dhabi, United Arab Emirates SmartChoice International GCC Full time Data Pipeline Development for AI – Design, implement, and maintain pipelines that feed AI models with clinical and non-clinical data from Epic and other hospital systems. Real-Time Integration – Configure Rhapsody and BizTalk to process high-volume HL7/FHIR events for triggering AI model scoring and monitoring. Feature Engineering for AI – Collaborate...
-
Data Engineer
2 weeks ago
Abu Dhabi, United Arab Emirates Le Chene Full time About the Opportunity This role supports the delivery of enterprise data platforms and analytics pipelines within a cloud-based environment for a government enterprise. Key Responsibilities - Design and build scalable data pipelines - Work across SQL and NoSQL data platforms - Support batch and streaming data processing - Collaborate with analytics and data...
-
Data Engineer
6 days ago
Abu Dhabi, United Arab Emirates Boskalis Full time Imagine automating the real-time sensor data from our fleet and supporting NOx reduction for sustainability. In this role you will have the chance to contribute to our operations by utilizing your skills in data movement and creating data insights. You will have the opportunity to work with cloud technology platforms like Azure Databricks and Azure Data...
-
Senior Data Engineer
1 week ago
Abu Dhabi, United Arab Emirates NorthBay Solutions Full time Job Title: Senior Data Engineer Location: Abu Dhabi UAE (Onsite) Employment Type: Full-Time Permanent About the Role We are seeking a highly skilled Senior Data Engineer with 7-10 years of experience to join our team in Abu Dhabi. The ideal candidate will have deep expertise in building and optimizing scalable data pipelines, working with large datasets and...
-
Azure Data Engineer
7 days ago
Abu Dhabi, United Arab Emirates Mind Stream Full time We are seeking a highly skilled Azure Data Engineer with extensive experience in building, optimizing, and managing scalable data solutions on Microsoft Azure. The ideal candidate will have hands-on expertise in Azure Databricks, Data Factory, Data Lake Storage, and Event Hub, along with strong programming and data engineering skills. Required Skills and...