Modern biomedical research is driven not only by scientific discoveries in the laboratory, but also by secure digital infrastructure, intelligent data environments, and innovative computational tools. These systems enable researchers and clinicians to work more efficiently, collaboratively, and effectively. At CRC1709, one of the people helping to build this foundation is Marcel Kleinmann, Clinical AI & Data Systems Architect within the INF project.

Marcel holds a Master of Science in Data Science and a Bachelor of Science in Business Informatics from the University of Mannheim. Before joining CRC1709, he gained industry experience as a Working Student in Software Engineering at VECTOR Informatik in Karlsruhe.
Within CRC1709, Marcel works closely with Prof. Martin Dugas and Sarah Richter. He plays a central role in providing digital infrastructure and coordinates data management across the consortium. He also collaborates with Prof. Jakob Kather’s lab and clinical partners, particularly on the development of innovative AI-based applications.
Importantly, Marcel’s contribution goes far beyond programming alone. In addition to developing technical solutions, he advises Data Management Officers across the consortium and supports project teams in improving the organization, quality, and handling of research data. By working closely with multiple project teams, Marcel also helps strengthen communication and interaction between projects, creating a more connected and collaborative data environment across CRC1709.
One crucial step towards this collaboration is a monthly online meeting, complemented by regular in-person workshops, to which Data Management Officers from all projects are invited. These sessions provide a platform to exchange experiences, discuss challenges, present new tools and workflows, and identify shared infrastructure needs. As of today, Marcel has co-coordinated one in-person workshop and numerous online meetings.
From early on, Marcel had a strong interest in medical informatics. While computer science is often imagined as a field focused on isolated programming behind a screen, he sees medical informatics differently. For him, it creates a meaningful bridge between technology and people. Through his work, Marcel can apply his technical expertise to support physicians, researchers, and ultimately patients. Even without direct patient contact, he values the sense of connection that comes from developing systems that improve patient care and research outcomes.
“For me, medical informatics uses technology in a way that directly supports clinical operations and treatments. Even if I do not meet patients myself, I know that my work can still make a difference.”
Marcel is currently working on three distinct projects:
1) Medical Extraction Tool for Unstructured Data
2) Clinical Chatbot
3) Pseudonymization of Clinical Documents
The first of these projects focuses on extracting information from unstructured clinical letters. These documents often contain valuable details such as treatment history, complex diagnoses, and genetic mutations. However, much of this information is written as free text and is therefore not directly usable within existing clinical data systems or the Data Warehouse.
The aim is therefore to automatically identify and extract relevant information from these letters and convert it into standardized data formats such as FHIR, ODM, and JSON. To achieve this, Marcel and his colleagues are developing a custom-built medical extraction pipeline based on self-hosted Large Language Models (LLMs).
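To give a rough idea of what "converting into a standardized format" can mean, the short Python sketch below wraps already-extracted diagnoses in minimal FHIR-style Condition resources. It is an illustration only and not the CRC1709 pipeline: the function name `to_fhir_condition`, the `extracted` record, and all values in it are hypothetical placeholders, and the LLM-based extraction step itself is assumed to have happened beforehand.

```python
import json


def to_fhir_condition(patient_id: str, diagnosis_text: str) -> dict:
    """Wrap one extracted diagnosis in a minimal FHIR-style Condition resource."""
    return {
        "resourceType": "Condition",
        # Reference to the patient the diagnosis belongs to
        "subject": {"reference": f"Patient/{patient_id}"},
        # Free-text diagnosis; a real system would usually add a coded entry as well
        "code": {"text": diagnosis_text},
    }


# Hypothetical result of the extraction step for a single clinical letter
extracted = {
    "patient_id": "example-001",
    "diagnoses": ["Example diagnosis A", "Example diagnosis B"],
}

# One Condition resource per extracted diagnosis
resources = [
    to_fhir_condition(extracted["patient_id"], diagnosis)
    for diagnosis in extracted["diagnoses"]
]

print(json.dumps(resources, indent=2))
```

The point of such a step is that every letter, however it was written, ends up in the same machine-readable shape, which is what makes the information usable in downstream systems such as a data warehouse.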

Marcel mentioned that the chatbot for physicians, which is already in the testing phase, is another highlight of his work. The system is designed to help doctors safely retrieve patient information, structure clinical records, prepare medical letters, and support evidence-based treatment searches.
The chatbot uses an LLM to understand user input and to generate contextually relevant, natural-sounding responses that draw on our proprietary clinical data as well as general CRC1709 information and publications.

At the same time, the young computer scientist emphasized that this work is exciting—but far from simple.
Working with clinical data requires extensive knowledge of data protection and privacy regulations, as patient information is particularly sensitive. This sets especially high standards for technical design and implementation. Moreover, rapid technological progress means that infrastructure that was sufficient only a short time ago may already need further development today. Hardware limitations such as CPU and GPU capacity can create additional challenges. Within a large collaborative research center, enabling efficient and secure data sharing between multiple projects is another important challenge.
Looking ahead, the INF team member is excited to continue developing these initiatives over the coming years.
When asked whether he could imagine pursuing a PhD here one day, Marcel smiled and said: “Definitely!”
He added that it may be something he seriously considers in the future—especially in an environment like CRC1709, where innovative digital projects can create impact far beyond a single consortium.
For CRC1709, talents like Marcel represent the next generation of interdisciplinary innovation—where data science, medicine, and research excellence meet.
As part of our CRC1709 News series, we will continue to feature interviews with members from across the consortium on a regular basis. Through their personal perspectives, we aim to share insights into ongoing project developments, interdisciplinary collaboration, and their visions for the future of CRC1709.
