We are seeking a skilled AI Engineer to join our dynamic data team, supporting cutting-edge AI initiatives. You will design and maintain scalable data pipelines that fuel machine learning models and analytics, using tools such as Databricks, Azure, Fivetran, and SQL. As part of a collaborative team, you will ensure high-quality, accessible data to drive impactful AI solutions.
What you will do
- Build and optimize ETL/ELT pipelines using Databricks and Fivetran to ingest, transform, and prepare data for AI model training and inference.
- Design and manage data storage and processing solutions on Azure (e.g., Azure Data Lake, Azure Data Factory, Synapse Analytics) to support large-scale AI workloads.
- Ensure data integrity, consistency, and reliability through robust validation and monitoring processes, leveraging SQL for querying and transformations.
- Collaborate with Data Scientists to preprocess and structure datasets for machine learning, including feature engineering and real-time data feeds.
- Automate data workflows to streamline ingestion, transformation, and delivery processes, ensuring efficiency and scalability.
- Monitor and optimize pipeline performance on Databricks and Azure to handle large datasets and meet AI system requirements.
- Work closely with Data Analysts and business stakeholders to align data solutions with analytical and AI-driven business needs.
- Stay updated on AI and data engineering trends, recommending new tools or approaches to enhance pipeline efficiency.
Candidate Profile: Skills & Qualifications
- Bachelor’s degree in Computer Science, Data Science, Business Analytics, or a related field. Master’s degree is a plus.
- 3+ years of experience in data engineering, with a focus on AI or machine learning projects.
- Expertise in building and optimizing ETL/ELT pipelines for AI applications.
- Hands-on expertise with Databricks for data processing and pipeline development.
- Experience with Fivetran for data integration and ETL processes.
- Strong SQL skills for querying, data modeling, and transformations.
- Knowledge of Python or Scala for scripting and automation within Databricks.
- Experience with data warehousing and lakehouse architectures.
- Familiarity with distributed systems and big data tools (e.g., Apache Spark, Hadoop).
- Familiarity with version control (e.g., Git) and CI/CD pipelines.
- Databricks Certified Data Scientist or Data Engineer certification is a plus.
- Strong problem-solving skills and attention to detail.
- Effective communication skills to collaborate with technical and non-technical stakeholders.
- Ability to work in a fast-paced, team-oriented environment.
Work Environment
- Occasional travel required for global and regional meetings, workshops, or training.
- Ability to occasionally join calls before or after traditional office hours.