GenAI Data Engineer
1 Day Old
Location: London or Edinburgh (Hybrid - 2 days a week in the office)
6-month contract position
Your responsibilities:
* Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
* Architect and optimize AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
* Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimization.
* Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
* Build reusable frameworks for prompt management, evaluation, and GenAI operations.
* Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.
Your Profile
Essential skills/knowledge/experience:
* Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
* Strong SQL expertise including star/snowflake schema design, indexing strategies, writing optimized queries, and implementing CDC, SCD Type 1/2/3 patterns for reliable data warehousing.
* Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
* Hands-on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
* Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
* Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
* Experience working with structured and unstructured datasets (documents, logs, text, images).
* Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
* Understanding of model optimization techniques (quantization, distillation, inference optimization).
* Strong capability to debug, tune, and optimize distributed systems and AI pipelines.
Desirable skills/knowledge/experience:
* PySpark, Python, SQL, AWS, GenAI
- Location: London
- Job Type: Full-Time
- Category: Accounting/Financial/Insurance