Principal Data Engineer
Job Description:
As a Principal Data Engineer at Everstream Analytics, you will play a critical role in building and maintaining our data infrastructure. You will work with a team of talented engineers to design, develop, and optimize data pipelines and data products that support our multi-tenant, cloud-native data platform, leveraging AWS services such as Lambda, EMR, S3, Glue, and Redshift, as well as helping drive our future toolset. Your expertise in distributed system design, data warehousing, and stream processing will be essential to ensuring the scalability, reliability, and efficiency of our data infrastructure.
Key Responsibilities:
Architect, Develop, and “Own” Data and Data Pipelines: Design, implement, and maintain data pipelines that handle large volumes of data from various sources, ensuring data quality, integrity, and availability.
Agile Delivery: Manage team planning, priorities, and deliverables using Agile methodologies.
AWS Expertise: Use AWS services such as Lambda, EMR, S3, Glue, and others to build scalable, cost-effective data solutions (a pipeline-trigger sketch follows this list).
Relational Database Experience: Utilize PostgreSQL on RDS or similar database technologies, where applicable.
Graph Database Experience: Utilize Neo4j or other graph databases for specialized data processing and analysis, where applicable.
Stream Processing: Build real-time data processing and stream analytics with Apache Kafka, Apache Spark, or similar technologies (see the streaming sketch after this list).
Python Development: Primarily use Python for data engineering tasks, data transformation, and ETL processes.
Data Warehousing: Implement and manage data warehousing and/or data lake solutions for efficient data storage and retrieval, supporting engineering, data science, applications, and other groups across the organization.
Collaboration: Work closely with Product Management, Data Science, and the leadership team to understand data requirements and deliver data solutions that meet business needs.
Monitoring and Optimization: Continuously monitor the performance of data pipelines and optimize for scalability and efficiency.
Documentation: Maintain comprehensive documentation for data engineering processes, ensuring knowledge transfer within the team.
Leadership: Lead by example within the data engineering team, taking pride in your team’s deliverables, and performing as technical lead for a scrum team or on various projects, where applicable.
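To make the stream-processing responsibility concrete, here is a minimal sketch of the kind of pipeline this role would own: a PySpark Structured Streaming job that consumes events from Kafka and lands them in S3 as Parquet. The broker address, topic, bucket, and message schema are hypothetical placeholders, not details from this posting.

```python
# Minimal sketch: consume shipment events from Kafka and land them in S3 as Parquet.
# Requires the spark-sql-kafka connector package on the cluster.
# Broker, topic, bucket, and schema below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("shipment-events-stream").getOrCreate()

# Schema of the incoming JSON messages (assumed for illustration).
schema = StructType([
    StructField("shipment_id", StringType()),
    StructField("status", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "shipment-events")            # placeholder topic
    .load()
)

# Kafka values arrive as bytes; parse the JSON payload into typed columns.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://example-bucket/shipment-events/")        # placeholder bucket
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/")
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```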
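The AWS bullet similarly tends to translate into small pieces of glue code between managed services. The sketch below assumes an S3-triggered Lambda handler (written in Python, per the posting) that starts a Glue job for each newly landed object; the Glue job name and argument key are invented for illustration.

```python
# Minimal sketch: an S3-triggered AWS Lambda handler that starts a Glue job
# for each newly landed object. The Glue job name and argument key are
# hypothetical; error handling is reduced to the essentials.
import boto3

glue = boto3.client("glue")

def handler(event, context):
    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Pass the new object's location to the Glue job as a job argument.
        response = glue.start_job_run(
            JobName="transform-shipment-events",  # placeholder job name
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        run_ids.append(response["JobRunId"])
    return {"started_runs": run_ids}
```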
Qualifications:
7+ years of experience leading large-scale data engineering projects.
Proven experience in designing and building multi-tenant cloud-native data platforms in a SaaS or PaaS environment.
Experience with both relational and graph database technologies in a production environment, specifically PostgreSQL and Neo4j (a brief sync sketch follows this list).
Strong expertise in AWS services, including Lambda, EMR, S3, and Glue, plus orchestration tools such as Apache Airflow or similar.
Proficiency in distributed system design, data warehousing, data lakes, and stream processing using Spark or similar.
Strong programming skills in Python.
Excellent problem-solving and troubleshooting skills.
Ability to work collaboratively with cross-functional teams and convey complex technical concepts to non-technical stakeholders.
Bachelor's or Master's degree in Computer Science, Data Engineering, a related field, or equivalent experience.
100% remote position, to be based in a European country (UK or Germany preferred).
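As a hedged illustration of the relational-plus-graph requirement above, the sketch below copies supplier relationships from PostgreSQL into Neo4j using the psycopg2 and neo4j Python drivers. The connection strings, table, credentials, and node labels are all invented placeholders.

```python
# Minimal sketch: copy supplier relationships from PostgreSQL into Neo4j.
# Connection details, table name, credentials, and labels are hypothetical.
import psycopg2
from neo4j import GraphDatabase

PG_DSN = "dbname=analytics user=etl host=localhost"  # placeholder DSN
NEO4J_URI = "bolt://localhost:7687"                  # placeholder URI

def sync_supplier_edges():
    # Read (supplier, customer) pairs from the relational store.
    with psycopg2.connect(PG_DSN) as conn, conn.cursor() as cur:
        cur.execute("SELECT supplier_id, customer_id FROM supply_edges")
        rows = cur.fetchall()

    # Upsert each pair as a SUPPLIES relationship in the graph store.
    driver = GraphDatabase.driver(NEO4J_URI, auth=("neo4j", "password"))
    with driver.session() as session:
        for supplier_id, customer_id in rows:
            session.run(
                "MERGE (s:Company {id: $sid}) "
                "MERGE (c:Company {id: $cid}) "
                "MERGE (s)-[:SUPPLIES]->(c)",
                sid=supplier_id, cid=customer_id,
            )
    driver.close()

if __name__ == "__main__":
    sync_supplier_edges()
```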
#LI-TC1