OverviewSignaling pathways are cornerstones of molecular and cellular biology. Hypergraphs and hyperpaths provide a precise model for signaling pathways but have been underutilized, due to their theoretical intractability and the lack of practical algorithms. The best previously-available method to find a shortest hyperpath, which is NP-complete, can only find acyclic hyperpaths, even though cycles are abundant in real signaling pathways.
MethodsWe give here source code for a new heuristic of finding shortest hyperpaths which for the first time may contain cycles. We also provide source code for our hyperpath enumeration algorithm, which exhaustively explores all possible s,t-hyperpaths.
Our fast heuristic often finds a provably optimal hyperpath on real signaling pathways. Based on Dijkstra's algorithm for shortest paths in ordinary graphs, our hyperpath heuristic finds a possibly-cyclic hyperpath in under a minute, even on large cell-signaling hypergraphs with nearly 10,000 hyperedges.
Pathway enumeration algorithm
Our hyperpath enumeration algorithm maintains a queue of hyperpath problem instances consisting of insets and outsets. An outset for a problem instance is a set of hyperedges that are not allowed to be in the solution. An inset is a set of hyperedges that cannot be placed in the outset of child problem instances spawned from the current instance. Together, these sets ensure that the solution to a problem instance is ever computed twice.
On datasets from two common pathway databases, Reactome and NCI-PID, our hyperpath heuristic (in less than a minute of computation) finds a demonstrably optimal solution on 99.5% of all possible source, target instances. Furthermore, these experiments reveal the best prior methods, which were limited to acyclic hyperpaths, failed to find any solution for many instances where all source-target hyperpaths are in fact cyclic. For all such cyclic biological instances, our heuristic hyperpath is optimal.
Source codeThe source code is available on GitHub.
The hyperpaths source code is free for noncommercial use, and comes with neither warranty nor guarantee. The hyperpaths source code cannot be redistributed in any form without consent of the authors. If you wish to use the hyperpaths source code for commercial purpose, you must first obtain the permission from all authors. All noteworthy uses of the hyperpaths source code should cite the related paper.
Noteworthy uses of the hyperpaths source code should cite the following publication: Spencer Krieger, and John Kececioglu, “Fast approximate shortest hyperpaths for inferring cell signaling pathways”, Under review for the Proceedings of the Workshop on Algorithms in Bioinformatics (WABI 2021).
Research supported by the US National Science Foundation through grants IIS-2041613 and CCF-1617192.