Data Engineer (Boston)
QuantumBlack helps companies use data to drive decisions. We combine business experience, expertise in large-scale data analysis and visualisation, and advanced software engineering know-how to deliver results. From aerospace to finance to Formula One, we help companies prototype, develop, and deploy bespoke data science and data visualisation solutions that support better decisions.
Who You'll Work With
Our Consultant Data Engineers work closely with our clients and our Data Scientists to curate and transform data and to construct features that feed directly into our modelling approach.
This is a hybrid client-facing and technical role: you will use cutting-edge technologies while also communicating complex, intractable ideas to non-technical audiences. Gathering clear requirements is a key part of this role and will define the technical strategy the team employs on the study.
Our projects cover a wide range of industries and may expose you to problem areas such as disease epidemiology, athlete injury prediction, salesforce effectiveness optimisation and many more. To gain insight from previously ignored and unconnected data, you will need to extract information from a vast array of data sources, such as data warehouses, SQL databases, legacy applications, unstructured data, documents, emails, APIs, Kafka endpoints and graph databases.
What You'll Do
- Work with our clients to model their data landscape, obtain data extracts and define secure data exchange approaches
- Acquire, ingest, and process data from multiple sources and systems into Big Data platforms
- Understand, assess and map the data landscape
- Maintain our Information Security standards on the engagement
- Collaborate with our data scientists to map data fields to hypotheses and curate, wrangle, and prepare data for use in their advanced analytical models
- Define the technology stack to be provisioned by our infrastructure team
- Build modular pipelines to construct features and modelling tables
- Use new and innovative techniques to deliver impact for our clients as well as internal R&D projects
- Mentor and develop junior Data Engineers on engagements
Qualifications
- Strong experience with at least two of the following technologies: Python, Scala, SQL, Java
- Commercial client-facing project experience is beneficial, including working in close-knit teams
- The ability to work across structured, semi-structured, and unstructured data, extracting information and identifying linkages across disparate data sets
- Good experience in multiple database technologies such as:
  - Distributed Processing (Spark, Hadoop, EMR)
  - Traditional RDBMS (MS SQL Server, Oracle, MySQL, PostgreSQL)
  - MPP (AWS Redshift, Teradata)
  - NoSQL (MongoDB, DynamoDB, Cassandra, Neo4j, Titan)
- A proven ability in clearly communicating complex solutions
- A strong understanding of Information Security principles to ensure compliant handling and management of client data
- Experience with and interest in Cloud platforms such as AWS, Azure, Google Cloud Platform or Databricks
- Strong experience in traditional data warehousing / ETL tools (Informatica, Talend, Pentaho, DataStage)
- Exceptional attention to detail