Data Management and central database for CRC TRR 341 projects
The CRC TRR 341 "Plant Ecological Genetics" generates and/or relies on genomic and sequencing data, combined with detailed ecological data such as leaf traits or complex competition traits. The information management (INF) project Z3 develops the data management platform for the TRR 341.
Datasets produced within the CRC are stored and made accessible in a FAIR way, relying on structures defined by DataPLANT, the NFDI consortium for plant research. In particular, data management relies on the Annotated Research Context (ARC), which bundles multiple standardized components (such as ISA-XLS) into one container that provides a suitable environment not only for rich data storage but also for reproducible workflows.
The INF project Z3 builds on hands-on experience gathered by the data management team of the CEPLAS II cluster (University of Cologne, UoC, and HHU) and on their adaptation of the ARC concept, particularly in the omics domain. The TRR 341 works closely with the CEPLAS II cluster and the research data management program of the UoC (C3RDM) to develop a specialized data management platform, stewardship support, and training.
Thus, this INF project enables FAIR data management and sharing, as well as the development of novel web-based tools specifically tailored to answer the CRC’s questions.
This central INF project, a close cooperation between the Institute for Biological Data Science of HHU and the Regional Computing Centre of the UoC (RRZK), has two main purposes:
- Providing comprehensive research data management support to the entire CRC TRR 341, from infrastructure and tools to data-handling policies and individual assistance with documentation, preservation, and publication (of data and processes), in collaboration with existing RDM networks such as DataPLANT.
- Designing and building a curated data platform that combines, visualizes, and analyzes the molecular, genomic, and ecological information produced within the CRC TRR 341 and collected from other sources. The platform aims to serve as a tool for discovering previously hidden relations between genes and ecology across plant species.