Since Booking.com started facilitating bookings in 1996, the amount of data produced and consumed has grown to proportions (20 TB/day) that would have been unimaginable to our founders.
Over the last decade, open source data tools (Hive, Spark, Cassandra and Kafka) running on large internal server parks have enabled hundreds of colleagues working closely with data to produce various data products, e.g. in Machine Learning and Analytics.
As the community has grown, so has the number of challenges around working with data.
The need for flexible compute resources has brought cloud platforms in alongside a heavily utilized on-premises environment.
Governments have introduced standards for personal data protection. A growing, physically dispersed employee base is less able to share tribal knowledge about the finer points of our data.
Hence the establishment of 'Enterprise Data Management', a group that governs the production and consumption of data so that it can be trusted and understood.
The Lead Data Engineer is a technical leader who drives broad data engineering strategies and delivery across a business area.
You will lead solution envisioning, technical design and hands-on implementation, and provide operational support across multiple data domains.
You need to influence, differentiate, and guide the business and technology strategies in your area, as they relate to data, through constant interaction with various teams.
You ask the right people the right questions in order to align data strategy with commercial strategy, demonstrating deep technical expertise and broad business knowledge.
You play an active role in identifying data engineering skill gaps within your area and support development of tools, materials, and training to bridge these gaps.
Support the data requirements of new and existing solutions by developing scalable and extensible physical data models that can be operationalised within the company’s workflows and infrastructure
Drive efficiency by mapping data flows between systems / workflows across the company
Ensure standardisation by following patterns in line with data governance requirements
Manage and automate the entire life-cycle of data processing systems and platforms
Support varied business requirements by building extensible data pipelines spanning different data encodings, protocols and unstructured data across different systems
Support real-time internal and customer-facing actions by developing real-time event-based streaming data pipelines
Enable rapid data-driven decisions by developing efficient and scalable data ingestion
Drive high-value data by connecting disparate datasets from different systems into a well-managed unified solution
Own end-to-end data applications by defining, monitoring and adjusting SLIs and SLOs
Handle, mitigate and learn from incidents in a manner that improves the overall system health
Ensure accuracy by developing criteria, automation, and processes for data production, transport, transformation, and storage
Drive data validity by defining criteria and ongoing validation strategies
Ensure ongoing reliability of data pipelines by developing and implementing standards for end-to-end testing
Improve failure detection by evolving the maturity of monitoring systems and processes
Ensure compliance with data-related requirements by building solutions in line with all applicable standards and regulations
Ensure ongoing resilience of data processes by monitoring system performance and proactively identifying bottlenecks, potential risks, and failure points that degrade quality
Build software applications by using the relevant development languages and applying in-depth knowledge of the systems, services and tools used by the specific business area
Write readable and reusable code by applying standard patterns and standard libraries
Choose the right technology by researching and understanding requirements
Continuously evolve your craft by keeping up to date with the latest developments in data engineering and related technologies, introducing them to the community and promoting their application in areas where they can generate impact
Actively contribute to Data Engineering at Booking.com through training, exploration of new technologies, interviewing, onboarding and mentoring colleagues
Push to improve, scale and extend data engineering tooling and infrastructure in collaboration with central teams
Level of Education
Bachelor's degree in Computer Science or related field
Master's degree in Computer Science or related field
Relevant Job Knowledge
7+ years of experience in data engineering or a related field using server-side programming languages, preferably Scala, Java, Python, or Perl
5+ years of experience building data pipelines and transformations at scale, with technologies such as Hadoop, Cassandra, Kafka, Spark, HBase, MySQL
3+ years of experience in data modelling
3+ years of experience handling data streaming
Requirements of special knowledge / skills
Strong knowledge of data modelling methods, e.g. Relational, Data Vault, Dimensional
Strong understanding of data architecture best practices, e.g. DAMA, TOGAF
Strong knowledge of data warehouses
Intermediate knowledge of data governance requirements based on best practices, e.g. DAMA, and tooling for continuous automated data governance activities
Strong knowledge of data quality requirements and implementation methodologies
Excellent English communication skills, both written and verbal