
Making sense of complex data

Most plant and animal features arise from complex interactions of genes, proteins and metabolites. The identification and analysis of these complex genetic traits is very challenging, especially when the sequenced genomes are fragmented. When he was a doctoral student in Nathaniel Street’s group, Bastian Schiffthaler improved the genome information from European aspen and developed bioinformatic tools that help to analyse complex genetic traits in plants.

Published: 2025-11-25 Text: Anne Honsel

Photo of Bastian Schiffthaler standing outdoors, with a lake, forest and snow-covered mountain tops in the background

Bastian Schiffthaler

Image: Alena Aliashkevich

To sequence a genome, the DNA is normally cut into small pieces and the sequence of each piece is read. Bioinformatic software then assembles the whole sequence by using overlapping regions of these small pieces in an iterative process that ideally yields full-length chromosomes. Trees often have very complex genomes, and most available tree genome assemblies are therefore not very contiguous. Bastian Schiffthaler worked on improving the contiguity of such genomes, focussing on European aspen.
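The overlap-based assembly described above can be illustrated with a toy example. The following is a minimal sketch, assuming error-free reads and a simple greedy merge strategy; the function names `overlap` and `greedy_assemble` are made up for this illustration, and real assemblers use far more sophisticated methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences that share the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no overlaps left: whatever remains stays fragmented
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping toy reads from one short made-up sequence.
reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(reads))
```

If a read shares no sufficiently long overlap with the others, it is returned as a separate piece, which is the toy equivalent of a fragmented assembly.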

When Bastian Schiffthaler started, the genome sequence of European aspen was already quite good compared to, for example, that of Norway spruce. However, it was still fragmented, which made it difficult to carry out analyses that depend on a highly contiguous assembly. Examples are the detection of DNA signatures that relate to traits via genome-wide association, or the study of evolutionary history by looking at large-scale genomic rearrangements.

Overwhelming mass of information

“Our strategy included modern long-read sequencing, polished with highly accurate short-read data and combined with an optical and a genetic map to further link the initially assembled scaffolds into fully assembled chromosomes. At close to 20,000 genetic markers, the genetic map is one of the most comprehensive ones created for any organism to date. This was an overwhelming mass of information that most of the commonly used free software programmes were not able to handle.”

Ordering markers on a genetic map is a classic application of the travelling salesman problem, which aims to find the shortest route between a set of points or locations. Deriving the perfect order for only sixty markers would take more calculations than there are atoms in the universe. All software therefore relies on approximations, but even those were too slow for a dataset of this size. To overcome this problem, Bastian Schiffthaler developed “BatchMap”, a software package that speeds up the computations required to find the order of genetic markers with the highest likelihood given their inheritance patterns.
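The scale of the problem can be checked with a few lines of arithmetic. The sketch below counts the possible orders of sixty markers, treating a map read forwards or backwards as the same order, and compares the result with 10^80, a commonly quoted rough estimate for the number of atoms in the observable universe. That estimate is an assumption of this example and is not taken from the article.

```python
import math

n_markers = 60

# A linear map read forwards or backwards is the same order, so halve n!.
n_orders = math.factorial(n_markers) // 2

# Assumed rough estimate for the number of atoms in the observable universe.
ATOMS_ESTIMATE = 10**80

print(f"possible marker orders: {n_orders:.2e}")
print("more orders than atoms:", n_orders > ATOMS_ESTIMATE)
```

An exhaustive search would have to evaluate every one of these orders, which is why approximate methods are used in practice.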

Parallel computations

“BatchMap” divides the calculations into small batches, which are easy to compute and can run in parallel. This drastically decreased the calculation time, and Bastian Schiffthaler could produce a dense map of genetic signatures on the European aspen chromosomes. Since its creation, BatchMap has been adopted by other genome projects, such as those assembling the Norway spruce genome or a strawberry genome, which comprises eight chromosome sets.
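The general idea of splitting a large ordering problem into small, independent batches can be sketched as follows. This is a toy illustration and not BatchMap's actual algorithm: it assumes that the markers arrive roughly pre-ordered, that a pairwise distance table `dist` is available, and that each batch is small enough to be ordered exhaustively. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations


def path_length(order, dist):
    """Total distance travelled when visiting the markers in this order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def order_batch(batch, dist):
    """Exhaustive search, affordable only because the batch is small."""
    return list(min(permutations(batch), key=lambda o: path_length(o, dist)))


def order_in_batches(markers, dist, batch_size=4, workers=4):
    batches = [markers[i:i + batch_size] for i in range(0, len(markers), batch_size)]
    # The batches are independent of each other, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        ordered = list(pool.map(lambda b: order_batch(b, dist), batches))
    # Orient the first batch so that its end points towards the second batch.
    result = ordered[0]
    if len(ordered) > 1 and dist[result[0]][ordered[1][0]] < dist[result[-1]][ordered[1][0]]:
        result = result[::-1]
    # Stitch the rest sequentially, flipping a batch if its far end is closer.
    for batch in ordered[1:]:
        if dist[result[-1]][batch[-1]] < dist[result[-1]][batch[0]]:
            batch = batch[::-1]
        result = result + batch
    return result


# Toy data: 12 markers at positions 0..11, shuffled only within each batch.
dist = [[abs(i - j) for j in range(12)] for i in range(12)]
markers = [2, 0, 3, 1, 5, 7, 4, 6, 9, 8, 11, 10]
print(order_in_batches(markers, dist))
```

Threads are used here only to keep the sketch short. For CPU-bound work in standard Python, a process pool would be the more usual choice.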

“We wanted to evaluate our improved assembly in the context of genome-wide association studies to look for genes that are involved in the salicinoid metabolism. These metabolites are only available in Populus and Salix species and help to protect the plant against herbivores,” explains Bastian Schiffthaler. “When compared to previous attempts using the more fragmented assembly, we could see that our new genome version improved the analysis of this complex trait a lot and we were able to gain new insights into the evolution of the different Populus species.”

Created new tool

Identifying genes that control complex traits is very challenging. Bastian Schiffthaler and his colleagues studied leaf shape variation in European aspen, a complex trait that is inherited from the parents but still highly diverse between individuals. Their results show that leaf shape is controlled by a complex network of many different genes, but that each individual gene often exerts only a minor influence on the final leaf shape.

Bastian Schiffthaler believes that, in order to better understand the workings of traits like leaf shape, an integrative approach is needed, in which traits are analysed at all stages that contribute to their emergence. He therefore developed “Seidr”, a toolkit to study the interactions of genes that are actively being made into protein within an organism. He hopes that integrating “Seidr” with other layers of data will enable scientists to better predict complex traits in the future.

Bastian Schiffthaler successfully defended his thesis at Umeå University in June 2025.