MOTIVATION: Network-based gene function inference methods have proliferated in recent years, but measurable progress remains elusive. We wished to better explore performance trends by controlling data and algorithm implementation, with a particular focus on the performance of aggregate predictions. RESULTS: Hypothesizing that popular methods would perform well without hand-tuning, we used well-characterized algorithms to produce verifiably ‘untweaked’ results. We find that most state-of-the-art machine learning methods obtain ‘gold standard’ performance as measured in critical assessments in defined tasks. Across a broad range of tests, we see close alignment in algorithm performances after controlling for the underlying data being used. We find that algorithm aggregation provides only modest benefits, with a 17% increase in area under the ROC (AUROC) above the mean AUROC. In contrast, data aggregation gains are enormous with an 88% improvement in mean AUROC. Altogether, we find substantial evidence to support the view that additional algorithm development has little to offer for gene function prediction. AVAILABILITY AND IMPLEMENTATION: The supplementary information contains a description of the algorithms, the network data parsed from different biological data resources and a guide to the source code (available at: http://gillislab.cshl.edu/supplements/).