We integrate public datasets, often sequencing data, into our research projects and reuse them in new ways (DataReuse/DataUpcycling). This allows us to address questions not just for a single plant but across all species sequenced to date. A common problem, however, is missing metadata, that is, information about the origin of the samples and how the data were generated. We are therefore developing methods to infer missing metadata from the data themselves.
A significant medium-term challenge is elucidating gene functions. While advances in sequencing technology have made it easier to decode genomes, the functional annotation of individual genes remains labor-intensive. Because a gene's function cannot always be determined by knocking out that gene, we are developing methods to transfer functional information automatically between plant species. The long-term goal is to integrate all available knowledge on gene functions in plants. Our particular focus is on genes that are active in specialized metabolism.