The senior data engineer will make data accessible and usable to data scientists, business analytics teams and decision makers for the purposes of downstream scientific analysis. They will work on building, operating, and scaling data solutions and software tools to meet business needs. The data engineer must also fully understand, and have experience in generating and handling, contextual data about data (so-called metadata). This aspect is key to developing a sustainable life science data strategy. Ideally the data engineer would have experience of both relational (SQL) and non-relational database technologies.
This is a hands-on role that blends software development, data engineering and bioinformatics. They will work closely with scientific and IM experts, business analytics teams, and decision makers to enable integrated data reuse and vastly improve time-to-solution for data and analytics initiatives. Scope includes:
- Understand data, metadata and analytic requirements in a scientific environment; select the right software tools, ways of working and overall solutions for rapidly delivering business needs at an optimum price/performance level.
- Develop a scientific data and metadata framework, using appropriate metadata standards in life sciences.
- Develop data query and visualization applications for large-scale data processing, from data source ingestion to end-user consumption
- Apply data governance and data security requirements to solutions
The IM senior data engineer will support fast-paced ad-hoc data analysis, individual projects and longer-term enterprise-wide solutions.
- Understanding of life science data and metadata requirements, with the ability to select and develop solutions which maximise business benefit, yielding a good price/performance ratio and an optimum route to implementation, deployment and support.
- Hands-on development of IM data and metadata solutions.
- Data source ingestion to end user visualization
- Assemble large, complex data sets to support business needs.
- Understand and quantify performance of potential database technologies.
3. Work with a diverse range of scientists and IM experts to gather business needs and use to evaluate
4. Other duties as assigned by management in support of a rapidly growing company
Qualifications & Experience
- BSc or MSc degree in a relevant field, such as computer science, statistics, applied mathematics, computational biology, data management, data science, information systems, bioinformatics etc.
- Combination of experience in IT software engineering, data management and integration, and data visualization skills with data science or big data background.
- Experience of partnering with business users and speaking the language of data with the business
- Open source and commercial scientific software package experience.
- Extensive background in using Linux operating system tools and practices.
- Experience of installing and deploying data and metadata technologies on a Debian-related Linux platform.
- Experience of developing a corporate metadata capture, storage and query framework.
- Knowledge of relational (SQL) and non-relational database technologies.
- High performance computing and data storage using cloud technologies.
- Prior experience in the complex biotechnology and/or pharmaceutical industry
- GxP experience
- Relevant experience in the biotechnology field
- Expertise in Next Generation Sequencing-RNA sequencing data and other bioinformatics tools
- Understanding of computational methods, scripting and programming languages, and relevant concepts in cancer biology, biotechnology, immunology and/or genetics.
- Prior experience as a bioinformatician, biotech software programmer or data architect is a plus
- Scientific software engineering experience using computational programming languages (such as R, Python, Java, C++) and data pipeline tools.
Skills & Competencies
- Hands-on skills with pipeline, data visualization and analysis tools.
- An understanding of the principles of oncology / immuno-oncology
- Outstanding communication, collaboration and partnering skills
- Good knowledge of in-process manufacture, research and clinical trial data
- Demonstrated ability to work across multiple deployment environments including cloud, on-premises and hybrid, multiple operating systems and through tools such as Docker, AWS, etc.
- Demonstrable experience of pipeline workflow tools as used in a life science environment.
- Understanding and experience of Common Workflow Language (CWL)
- Knowledge of semantic web (RDF triple) database technologies and query methods.
- Demonstrable knowledge of one or more other non-relational (NoSQL) database solutions.
Travel to Adaptimmune sites and vendor locations, as needed