Ancestral sequences of a large promiscuous enzyme family correspond to bridges in sequence space in a network representation

Patrick C. F. Buchholz, Bert van Loo, Bernard D.G. Eenink, Erich Bornberg-Bauer, Jürgen Pleiss*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Evolutionary relationships of protein families can be characterized either by networks or by trees. Whereas trees allow for hierarchical grouping and reconstruction of the most likely ancestral sequences, networks lack a time axis but allow for thresholds of pairwise sequence identity to be chosen and, therefore, the clustering of family members with presumably more similar functions. Here, we use the large family of arylsulfatases and phosphonate monoester hydrolases to investigate the similarities, strengths and weaknesses of tree and network representations. For varying thresholds of pairwise sequence identity, values of betweenness centrality and clustering coefficients were derived for nodes of the reconstructed ancestors to measure their propensity to act as a bridge in a network. Based on these properties, ancestral protein sequences emerge as bridges in protein sequence networks. Interestingly, many ancestral protein sequences appear close to extant sequences. Therefore, reconstructed ancestor sequences might also be interpreted as yet-to-be-identified homologues. Ancestor reconstruction is also compared with the derivation of consensus sequences. It was found that hub sequences in a network, e.g. reconstructed ancestral sequences that are connected to many neighbouring sequences, share closer similarity with derived consensus sequences. Therefore, some reconstructed ancestor sequences can also be interpreted as consensus sequences.
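The network measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the five toy sequences, the node name `anc`, the helper function names and the two thresholds are all hypothetical and chosen only so that one sequence sits between two clusters. The sketch builds a network by linking sequences whose pairwise identity reaches a threshold, then computes the local clustering coefficient and an unnormalised betweenness centrality (Brandes' algorithm) using only the Python standard library.

```python
from collections import deque
from itertools import combinations

# Hypothetical, pre-aligned toy sequences (NOT data from the article).
seqs = {
    "a1": "AAAAAAAAAA",
    "a2": "AAAAAAAAAT",
    "anc": "AAAAAAAGGG",  # stand-in for a reconstructed ancestor
    "b1": "AAAAAGGGGG",
    "b2": "AAAAAGGGGT",
}


def identity(a, b):
    """Fraction of identical positions between two equal-length aligned sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_network(seqs, threshold):
    """Link two sequences when their pairwise identity reaches the threshold."""
    graph = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def clustering(graph, node):
    """Local clustering coefficient: share of neighbour pairs that are linked."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes, unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                       # nodes in order of BFS discovery
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):        # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {v: c / 2 for v, c in score.items()}


if __name__ == "__main__":
    for threshold in (0.65, 0.85):
        net = build_network(seqs, threshold)
        bc = betweenness(net)
        print(threshold, "anc betweenness:", bc["anc"],
              "anc clustering:", round(clustering(net, "anc"), 3))
```

In this toy set, at the lower threshold the `anc` node lies on every shortest path between the two clusters, so it combines high betweenness with a low clustering coefficient, which is the signature of a bridge. At the higher threshold it loses all its edges and the signal disappears, which is why the measures are evaluated across a range of thresholds.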
Original language: English
Article number: 20210389
Number of pages: 9
Journal: Journal of the Royal Society Interface
Issue number: 184
Early online date: 3 Nov 2021
Publication status: Published - 3 Nov 2021

