Commit a2fa62f

Merge pull request #180 from graphcore-research/october2025-papers
October2025 papers
2 parents 6d35f70 + 11bac62 commit a2fa62f

1 file changed: +17 -9 lines changed

_data/publications.yaml

Lines changed: 17 additions & 9 deletions
@@ -16,7 +16,7 @@ areas:
  low-precision: "Low Precision"
  sparsity: "Sparsity"
  efficient-ml: "Efficient ML"
- gnns: "Graph Neural Networks"
+ gnns: "Graph Learning"
  physics: "Physics"
  graphics: "Graphics"

@@ -34,6 +34,14 @@ papers:
  2025:
  conference:

+ - title: "The Role of Graph Topology in the Performance of Biomedical Knowledge Graph Completion Models"
+ url: https://www.arxiv.org/abs/2409.04103
+ date: 2025-10-07
+ area: [gnns]
+ authors: "Alberto Cattaneo, Stephen Bonner, Thomas Martynec, Edward Morrissey, Carlo Luschi, Ian P Barrett, Daniel Justus"
+ abstract: "Knowledge Graph Completion has been increasingly adopted as a useful method for several tasks in biomedical research, like drug repurposing or drug-target identification. To that end, a variety of datasets and Knowledge Graph Embedding models has been proposed over the years. However, little is known about the properties that render a dataset useful for a given task and, even though theoretical properties of Knowledge Graph Embedding models are well understood, their practical utility in this field remains controversial. We conduct a comprehensive investigation into the topological properties of publicly available biomedical Knowledge Graphs and establish links to the accuracy observed in real-world applications. By releasing all model predictions and a new suite of analysis tools we invite the community to build upon our work and continue improving the understanding of these crucial applications."
+ published: "Bioinformatics, Volume 41, Issue 10, October 2025"
+
  - title: "On Stochastic Rounding with Few Random Bits"
  url: https://arxiv.org/abs/2504.20634
  date: 2025-05-07
@@ -52,6 +60,14 @@ papers:

  workshop:

+ - title: "Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs"
+ url: https://arxiv.org/abs/2511.04473
+ date: 2025-11-06
+ area: [gnns]
+ authors: "Alberto Cattaneo, Carlo Luschi, Daniel Justus"
+ abstract: "Retrieval of information from graph-structured knowledge bases represents a promising direction for improving the factuality of LLMs. While various solutions have been proposed, a comparison of methods is difficult due to the lack of challenging QA datasets with ground-truth targets for graph retrieval. We present SynthKGQA, a framework for generating high-quality synthetic Knowledge Graph Question Answering datasets from any Knowledge Graph, providing the full set of ground-truth facts in the KG to reason over each question. We show how, in addition to enabling more informative benchmarking of KG retrievers, the data produced with SynthKGQA also allows us to train better models. We apply SynthKGQA to Wikidata to generate GTSQA, a new dataset designed to test zero-shot generalization abilities of KG retrievers with respect to unseen graph structures and relation types, and benchmark popular solutions for KG-augmented LLMs on it."
+ published: "arXiv Preprint"
+
  - title: "Elucidating the Design Space of FP4 training"
  url: https://arxiv.org/abs/2509.17791
  date: 2025-09-22
@@ -105,14 +121,6 @@ papers:
  abstract: "The nearest neighbour search problem underlies many important machine learning applications, including efficient long-context generation, retrieval-augmented generation, and knowledge graph completion. However, computing top-k exactly suffers from limited parallelism, making it inefficient for highly parallel machine learning accelerators. By relaxing the requirement that the top-k is exact, bucketed algorithms can dramatically increase parallelism by independently computing many smaller top-k operations. We explore the design choices for this class of algorithms using both theoretical analysis and empirical evaluation on downstream tasks. Our motivating examples are sparsity algorithms for language models, which often use top-k to select the most important parameters or activations. We also release a fast bucketed top-k implementation for PyTorch."
  published: "NeurIPS'24 Workshop on Adaptive Foundation Models"

- - title: "The Role of Graph Topology in the Performance of Biomedical Knowledge Graph Completion Models"
- url: https://www.arxiv.org/abs/2409.04103
- date: 2024-09-06
- area: [gnns]
- authors: "Alberto Cattaneo, Stephen Bonner, Thomas Martynec, Carlo Luschi, Ian P Barrett, Daniel Justus"
- abstract: "Knowledge Graph Completion has been increasingly adopted as a useful method for several tasks in biomedical research, like drug repurposing or drug-target identification. To that end, a variety of datasets and Knowledge Graph Embedding models has been proposed over the years. However, little is known about the properties that render a dataset useful for a given task and, even though theoretical properties of Knowledge Graph Embedding models are well understood, their practical utility in this field remains controversial. We conduct a comprehensive investigation into the topological properties of publicly available biomedical Knowledge Graphs and establish links to the accuracy observed in real-world applications. By releasing all model predictions and a new suite of analysis tools we invite the community to build upon our work and continue improving the understanding of these crucial applications."
- published: "ICML'24 Workshop on Machine Learning for Life and Material Science: From Theory to Industry applications"
-
  - title: "Scalify: scale propagation for efficient low-precision LLM training"
  url: https://arxiv.org/abs/2407.17353
  date: 2024-07-24
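
Because this change relabels the `gnns` area and moves an entry between year and venue groups, a small consistency check over the parsed data could catch a misspelled area key or a dropped field. The following is a minimal sketch and not part of the repository: `REQUIRED_FIELDS` and `check_entry` are hypothetical names, the field list is only inferred from the entries visible in this diff, and the entry is assumed to be already parsed into Python objects, with unquoted dates loaded as `datetime.date`.

```python
from datetime import date

# Assumption: these are the fields every paper entry carries,
# inferred only from the entries shown in this diff.
REQUIRED_FIELDS = ("title", "url", "date", "area", "authors", "abstract", "published")


def check_entry(entry, known_areas):
    """Return a list of problems found in one paper entry (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in entry]
    # Every key listed under `area` should exist in the top-level `areas` map.
    for key in entry.get("area", []):
        if key not in known_areas:
            problems.append(f"unknown area key: {key}")
    # Assumption: the loader turns unquoted YAML dates into date objects.
    if "date" in entry and not isinstance(entry["date"], date):
        problems.append("date is not a date object")
    return problems


# Sample data mirroring the values in this diff (abstract truncated here).
areas = {
    "low-precision": "Low Precision",
    "sparsity": "Sparsity",
    "efficient-ml": "Efficient ML",
    "gnns": "Graph Learning",
    "physics": "Physics",
    "graphics": "Graphics",
}
entry = {
    "title": "The Role of Graph Topology in the Performance of Biomedical Knowledge Graph Completion Models",
    "url": "https://www.arxiv.org/abs/2409.04103",
    "date": date(2025, 10, 7),
    "area": ["gnns"],
    "authors": "Alberto Cattaneo, Stephen Bonner, Thomas Martynec, Edward Morrissey, Carlo Luschi, Ian P Barrett, Daniel Justus",
    "abstract": "Knowledge Graph Completion has been increasingly adopted ...",
    "published": "Bioinformatics, Volume 41, Issue 10, October 2025",
}

print(check_entry(entry, areas))
```

Only the display label of `gnns` changes in this commit, not the key, so existing `area: [gnns]` entries would still pass a check of this kind.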
