Overview

Data Engineer – Collibra | Insurance Domain – London 

Reference Code: 335291-en_GB 

Contract Type: Permanent 

Professional Communities: Data & AI 

Get The Future You Want!
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the
way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around
the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading
organizations unlock the value of technology and build a more sustainable, more inclusive world.

Your Role:

  • Design and implement scalable data pipelines using Azure Data Factory, Databricks, Synapse Analytics, and Azure Data Lake.
  • Develop and optimize data transformation workflows using Python, R, or Scala on Azure Databricks or Apache Spark.
  • Integrate and manage metadata using Collibra for effective data governance and cataloging.
  • Handle structured, semi-structured, and unstructured data to extract insights and identify linkages across datasets.
  • Lead technical delivery and mentor junior engineers on data engineering best practices.
  • Optimize Spark jobs and debug performance issues using tools like Ganglia UI.
  • Design efficient data structures for storage and querying, including formats like Parquet and Delta Lake.
  • Work across multiple database technologies: RDBMS (MS SQL Server, Oracle), MPP (Teradata, Netezza), and NoSQL (MongoDB, Cassandra, Neo4j, Cosmos DB, Gremlin).
  • Ensure secure and compliant data handling aligned with Information Security principles.
  • Collaborate in Agile teams and use Git-based workflows for version control and code management.

Your Profile:

  • Minimum 7 years of hands-on experience in Azure Data Engineering.
  • Strong working knowledge of Collibra for data governance and metadata management.
  • Proven experience in the insurance domain is highly desirable.
  • Proficient in Python, R, or Scala for data transformation and analysis.
  • Deep understanding of NoSQL databases and distributed data processing.
  • Experience with traditional ETL tools such as Informatica, IBM DataStage, or Microsoft SSIS.
  • Skilled in working with large and complex codebases using GitHub and Gitflow.
  • Effective communicator with strong stakeholder management capabilities.
  • Familiarity with Agile methodologies including Scrum, XP, and Kanban.
  • Preferred certifications: Microsoft Certified: Azure Data Engineer Associate and Collibra Certified Ranger (or equivalent).

About Capgemini
Capgemini is a global business and technology transformation partner, helping organizations to
accelerate their dual transition to a digital and sustainable world while creating tangible impact for
enterprises and society. It is a responsible and diverse group of 350,000 team members in more
than 50 countries. With a strong heritage of over 55 years, Capgemini is trusted by its clients to unlock
the value of technology to address the entire breadth of their business needs. It delivers end-to-end
services and solutions leveraging strengths from strategy and design to engineering, all fueled by
its market-leading capabilities in AI, cloud, and data, combined with its deep industry expertise and
partner ecosystem. The Group reported 2023 global revenues of €22.5 billion.
Get The Future You Want | www.capgemini.co

 
