Data Engineer

Immunai

Software Engineering, Data Science
Prague, Czechia
Posted on Nov 16, 2024

Description

About Immunai:

Immunai is an engineering-first platform company aiming to improve therapeutic decision-making throughout the drug discovery and development process. We are mapping the immune system at unprecedented scale and granularity and applying machine learning to this massive clinico-immune database in order to generate novel insights into disease pathology for our partners: pharma companies and research institutes. We provide a comprehensive, end-to-end solution, from data generation and curation to therapeutics development, that continuously supports and validates the capabilities of our platform.

As drug development is becoming increasingly inefficient, our ultimate goal is to help bring breakthrough medicines to patients as quickly and successfully as possible.

Immunai is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

About the role:

The Metadata Developers Team, part of the Immunai Software group, focuses on advanced tools and solutions to retrieve, store, handle, and analyze complex descriptive biological data, such as laboratory and clinical metadata, with extensive use of domain ontologies. In close collaboration with Immunai’s biocurators and biology experts, we are continuously advancing our technology and data models to improve the consistency and descriptive power of our vast clinical database, as well as the usability and accessibility of data that feed the company’s most advanced algorithms for therapeutic decision-making and discovery.

As a Data Engineer at Immunai, you will specialize in designing, building, and maintaining top-notch, scalable data pipelines and software solutions for biological data and clinical metadata, ensuring robust and resilient data flows and processes, as well as a smooth integration of different data sources.

You will act as a partner for our Prague-based biocuration team to maximize the exchange of knowledge that will drive advances and breakthroughs in our metadata curation and delivery infrastructure. You will understand their operational and data governance needs and dive deep into the characteristics of our highly specialized data models. You will design and create lean and modular solutions that will leverage a larger ecosystem of integrated data tools and will empower our internal analysts and most valuable customers in their research.

Location: Prague

What will you do?

  • Collaborate with other Metadata Dev team members in an agile setting to build and maintain new metadata pipelines and infrastructure, or to enhance existing ones
  • Interact with biologists and bioinformaticians to understand their data needs and tooling requirements, propose and discuss solutions, and bring new ideas
  • Strengthen the liaison between the Prague site and the other Metadata Dev team members located in Zurich through collaborative work and continuous knowledge sharing
  • Provide support, training, and guidance to internal users and other stakeholders
  • Collaborate with the Metadata Operations team to ensure smooth data processing, ingestion, and delivery, striving for the resolution of any time-critical issues
  • Develop and maintain documentation for our tools and products

Requirements

Required qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Software Engineering, Bioinformatics, or a related field
  • 3+ years of experience as a Data Engineer, ideally with a strong track record in handling complex datasets and mastering sophisticated data processes
  • Good programming skills (5+ years of experience) with Python, building modular and reusable code by leveraging standard data libraries (e.g. Pandas)
  • Proficiency in SQL and experience with relational databases (PostgreSQL preferred)
  • Familiarity with other data storage technologies (data warehousing technologies, ideally BigQuery, and NoSQL databases, especially MongoDB)
  • Some knowledge of data orchestration tools (e.g. Apache Airflow, Dagster)
  • An analytical mindset with attention to detail
  • Very good English communication skills (Czech/Slovak is an advantage)

Preferred qualifications:

  • Experience with cloud environments (Google preferred) is highly desirable
  • Coding experience in other programming languages (Java / R) is a big plus
  • Experience with ETL tools (e.g. Apache Beam, Apache Spark) is also a big plus
  • Familiarity with biotech or healthcare data is a plus

Desired personal traits:

  • You want to make an impact on humankind
  • You prioritize “We” over “I”
  • You enjoy getting things done and striving for excellence
  • You collaborate effectively with people of diverse backgrounds and cultures
  • You have a growth mindset
  • You are candid, authentic, and transparent

*Please note that when you apply for a position at Immunai, your application will be processed via our recruitment platform Comeet. You can read more about how we process personal data here: https://www.immunai.com/privacy-policy/