Dr. Daniel Lehmann

Greifswald, Germany

Contact

GitHub

Kaggle

Google Scholar

Welcome!

My name is Daniel Lehmann. I am a software engineer and researcher in the field of machine learning. Information about my work experience and my skill set is listed below. If you would like to get in touch with me, feel free to send me an email.

Education: I obtained my PhD from the University of Greifswald (Germany), where my doctoral work was supervised by Prof. Dr. Marc Ebner. Prior to my doctoral studies, I received a Diplom (German degree, comparable to a Master's degree) in Business Computer Science from the University of Rostock (Germany). You can find more details in my CV.

Research: Machine learning-based models face various challenges in real-world applications. For instance, the training data of such a model could be imbalanced. Furthermore, after a trained model is deployed to a production server, it could be faced with out-of-distribution data. These challenges negatively impact a machine learning-based model. I am interested in techniques that address these challenges. During my doctoral studies, I focused on making image classification with models based on a Convolutional Neural Network (CNN) more robust to various challenges of real-world applications. My main work resulted in a method to detect out-of-distribution data. I also proposed a method to balance imbalanced training data and a method to initialize the weights of a CNN-based model. For more information, please see my publications and my blog.

Software Engineering: Prior to my doctoral studies, I worked at the IT company Gini GmbH in Munich (Germany) as a machine learning engineer in the field of natural language processing, and at IBM in San José (California, USA) as a software engineering intern. During my time in industry, I gained initial experience in software engineering, web development, and databases. You can find more details about my professional activities in my CV. Beyond my professional work, I aim to deepen my knowledge of software development in my spare time. You will find more information about my learning progress on my blog.

Further Interests: In addition to my professional activities, I am interested in IT security and low-level programming. I have gained only limited experience in these areas so far, but I hope to learn more soon. I will publish my learning progress in these areas on my blog as well. In my free time, I enjoy bouldering, running, traveling, and language learning.

Curriculum Vitae

Education

Doctoral Studies - Computer Science

2019 - 2024

University of Greifswald (Germany)

Thesis: Improving Convolutional Neural Network-based Image Classification by Exploiting Network Layer Information.

I examined whether the layer activations of a Convolutional Neural Network-based model can be used to improve image classification. Based on my research, I developed a method to detect out-of-distribution data, a method to balance imbalanced training data, and a method to initialize the model weights. To implement these methods in Python, I mainly used PyTorch, fastai, scikit-learn, NumPy, and OpenCV.

Diplom Studies - Business Computer Science

2006 - 2012

University of Rostock (Germany)

Thesis: Comparison of Learning and Classification Approaches for Medical 3D Image Data.

My studies included classes in computer science, operations research, and business administration. For my thesis, I used Python to compare different classification approaches for detecting Alzheimer's disease in three-dimensional MRI images of the human brain.

(comparable to a Master's degree)

Languages

German

native

English

fluent

Spanish

advanced

Chinese

beginner

Professional Experience

Research Assistant

2017 - 2023

University of Greifswald (Germany)

My work included research and teaching. The research-related tasks focused on machine learning for object recognition in images using Python and C++. For the implementations, I mainly used PyTorch, fastai, scikit-learn, NumPy, and OpenCV. The teaching-related tasks focused on giving tutorials and supporting students in the following classes: Introduction to Computer Science (Linux, C++, HTML, SQL), Computer Graphics (C++, OpenSceneGraph), and Evolutionary Algorithms (Java, ECJ). I also supported students with a software engineering project (C++, OpenCV).

Machine Learning Engineer

2013 - 2017

Gini GmbH (Munich, Germany)

The company offered an API for extracting information from business documents (e.g., invoices). As part of the information extraction team, I worked on the core functionality of this API. We organized our software development process using the Kanban methodology. My responsibilities in the team included researching and implementing methods in Python, Scala, and Java for extracting information from text. For the implementations, I used regular expressions, SVMs, and sequence classification techniques.

Software Engineering Intern

2010 - 2011

IBM (San José, USA)

During my internship, I worked in the DB2 database department as part of the DB2-TextSearch and DB2-Infrastructure teams. My responsibilities included supporting the senior software developers by fixing minor bugs, writing tests, evaluating features, and managing the DB2 software builds for Unix and Linux.

Publications

Below you can find my publications, which are also listed on Google Scholar. If you use one of my publications as a reference for your own work, please cite it according to its corresponding BibTeX.

Improving Convolutional Neural Network-based Image Classification by Exploiting Network Layer Information

Daniel Lehmann (2024) - Doctoral Thesis (University of Greifswald)

[Abstract] [Thesis] [BibTeX] [GitHub]

Convolutional Neural Network-based image classification models are the current state-of-the-art for solving image classification problems. However, obtaining and using such a model to solve a specific image classification problem presents several challenges in practice. To train the model, we need to find good hyperparameter values for training, such as initial model weights or learning rate. However, finding these values is usually a non-trivial process. Another problem is that the training data used for model training is often class-imbalanced in practice. This usually has a negative impact on model training. However, not only is it challenging to obtain a Convolutional Neural Network-based model, but also to use the model after model training. After training, the model might be applied to images that were drawn from a data distribution that is different from the data distribution the training data was drawn from. These images are typically referred to as out-of-distribution samples. Unfortunately, Convolutional Neural Network-based image classification models typically fail to predict the correct class for out-of-distribution samples without warning, which is problematic when such a model is used for safety-critical applications. In my work, I examined whether information from the layers of a Convolutional Neural Network-based image classification model (pixels and activations) can be used to address all of these issues. As a result, I suggest a method for initializing the model weights based on image patches, a method for balancing a class-imbalanced dataset based on layer activations, and a method for detecting out-of-distribution samples, which is also based on layer activations. To test the proposed methods, I conducted extensive experiments using different datasets. 
My experiments showed that layer information (pixels and activations) can indeed be used to address all of the aforementioned challenges when training and using Convolutional Neural Network-based image classification models.