dc.contributor.author | Chen, Z | |
dc.contributor.author | Yi, W | |
dc.contributor.author | Nallanathan, A | |
dc.date.accessioned | 2024-07-12T07:36:48Z | |
dc.date.available | 2024-07-12T07:36:48Z | |
dc.date.issued | 2023-06-06 | |
dc.identifier.citation | Z. Chen, W. Yi and A. Nallanathan, "Exploring Representativity in Device Scheduling for Wireless Federated Learning," in IEEE Transactions on Wireless Communications, vol. 23, no. 1, pp. 720-735, Jan. 2024, doi: 10.1109/TWC.2023.3281765. keywords: {Wireless communication;Training;Performance evaluation;Convergence;Servers;Scheduling algorithms;Job shop scheduling;Device scheduling;wireless federated Learning;resource allocation;submodular optimization}, | en_US |
dc.identifier.issn | 1536-1276 | |
dc.identifier.uri | https://qmro.qmul.ac.uk/xmlui/handle/123456789/98043 | |
dc.description.abstract | Existing device scheduling works in wireless federated learning (FL) have mainly focused on selecting the devices with the maximum gradient norm or loss function and require all devices to perform local training in each round. This may produce extra training costs and schedule devices with similar data statistics, thus degrading learning performance. To mitigate these problems, we first theoretically characterize the convergence behaviour of the considered FL system, finding that the learning performance is degraded by the difference between the aggregated gradient of scheduled devices and the full participation gradient. Inspired by this, we propose to find a subset of representative devices and the corresponding per-device stepsizes to approximate the full participation aggregated gradient. Considering the limited wireless bandwidth, we formulate a problem to capture the trade-off between representativity and latency by optimizing device scheduling and bandwidth allocation policies. Our analysis reveals that the optimal bandwidth allocation is achieved when all scheduled devices have the same latency. Then, by proving the non-monotone submodularity of the problem, we develop a double greedy algorithm to solve for the device scheduling policy. To avoid the local training of unscheduled devices, we utilize the historical gradient information of devices to estimate the current gradient for device scheduling design. Compared to existing scheduling algorithms, the proposed representativity-aware device scheduling algorithm improves accuracy by 6.7% and 4.02% on two typical datasets under heterogeneous local data distributions, MNIST and CIFAR-10, respectively. In addition, the proposed latency- and representativity-aware scheduling algorithm saves over 16% and 12% of training time on the MNIST and CIFAR-10 datasets, respectively, compared to scheduling algorithms based on either latency or representativity alone. | en_US |
dc.format.extent | 720 - 735 | |
dc.publisher | IEEE | en_US |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | |
dc.rights | © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
dc.title | Exploring Representativity in Device Scheduling for Wireless Federated Learning | en_US |
dc.type | Article | en_US |
dc.identifier.doi | 10.1109/TWC.2023.3281765 | |
pubs.issue | 1 | en_US |
pubs.notes | Not known | en_US |
pubs.publication-status | Published | en_US |
pubs.volume | 23 | en_US |
rioxxterms.funder | Default funder | en_US |
rioxxterms.identifier.project | Default project | en_US |
rioxxterms.funder.project | b215eee3-195d-4c4f-a85d-169a4331c138 | en_US |