Using a Bayesian approach to reconstruct graph statistics after edge sampling
Volume
8
DOI
10.1007/s41109-023-00574-3
Journal
Applied Network Science
Abstract
Often, because a network is prohibitively large or because data collection APIs impose limits, it is not possible to work with a complete network dataset and sampling is required. One type of sampling that is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for recovering two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate both the total for the network and the count associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior that does not require assumptions about the original network. Our approach is tested on two synthetic and three real datasets with diverse sizes, degree distributions, degree-degree correlations and triangle count distributions.
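As an illustration of the setting described in the abstract, the following is a minimal sketch, not the paper's implementation. It assumes each edge is kept independently with a known probability p, so a node with original degree k has a sampled degree distributed as Binomial(k, p). The function names `sample_edges` and `degree_posterior`, and the idea of passing the prior as a dictionary, are hypothetical choices made for this example; the paper's own prior constructions are not reproduced here.

```python
import math
import random


def sample_edges(edges, p, rng):
    """Uniform edge sampling: keep each edge independently with probability p."""
    return [edge for edge in edges if rng.random() < p]


def degree_posterior(observed_degree, p, prior):
    """Posterior over a node's original degree k, given its degree in the sample.

    `prior` maps each candidate degree k to a prior probability. It is an
    assumed input here; how to build it is the subject of the paper.
    Likelihood: observed_degree ~ Binomial(k, p), because each of the k
    original edges survives sampling independently with probability p.
    """
    weights = {}
    for k, prior_k in prior.items():
        if k < observed_degree:
            continue  # a node cannot keep more edges than it originally had
        likelihood = (
            math.comb(k, observed_degree)
            * p ** observed_degree
            * (1 - p) ** (k - observed_degree)
        )
        weights[k] = likelihood * prior_k
    total = sum(weights.values())
    if total == 0:
        raise ValueError("prior has no mass on degrees >= observed_degree")
    # Bayes' rule: normalise likelihood * prior over the candidate degrees.
    return {k: w / total for k, w in weights.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
    print(sample_edges(edges, 0.5, rng))

    # Flat prior over degrees 0..20, chosen only for illustration.
    flat_prior = {k: 1 / 21 for k in range(21)}
    posterior = degree_posterior(observed_degree=3, p=0.5, prior=flat_prior)
    print(max(posterior, key=posterior.get))
```

The flat prior in the usage example is a placeholder; any prior over candidate degrees can be passed in the same form.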
Authors
Arnold, NA; Mondragón, RJ; Clegg, RG