Show simple item record

dc.contributor.author: Chen, Z
dc.contributor.author: Yi, W
dc.contributor.author: Nallanathan, A
dc.date.accessioned: 2024-07-12T07:36:48Z
dc.date.available: 2024-07-12T07:36:48Z
dc.date.issued: 2023-06-06
dc.identifier.citation: Z. Chen, W. Yi and A. Nallanathan, "Exploring Representativity in Device Scheduling for Wireless Federated Learning," in IEEE Transactions on Wireless Communications, vol. 23, no. 1, pp. 720-735, Jan. 2024, doi: 10.1109/TWC.2023.3281765. Keywords: wireless communication; training; performance evaluation; convergence; servers; scheduling algorithms; job shop scheduling; device scheduling; wireless federated learning; resource allocation; submodular optimization
dc.identifier.issn: 1536-1276
dc.identifier.uri: https://qmro.qmul.ac.uk/xmlui/handle/123456789/98043
dc.description.abstract: Existing device scheduling works in wireless federated learning (FL) have mainly focused on selecting the devices with the maximum gradient norm or loss function, and they require all devices to perform local training in each round. This may produce extra training costs and schedule devices with similar data statistics, thus degrading learning performance. To mitigate these problems, we first theoretically characterize the convergence behaviour of the considered FL system, finding that the learning performance is degraded by the difference between the aggregated gradient of the scheduled devices and the full-participation gradient. Inspired by this, we propose to find a subset of representative devices and the corresponding per-device stepsizes to approximate the full-participation aggregated gradient. Considering the limited wireless bandwidth, we formulate a problem to capture the trade-off between representativity and latency by optimizing the device scheduling and bandwidth allocation policies. Our analysis reveals that the optimal bandwidth allocation is achieved when all scheduled devices have the same latency. Then, by proving the non-monotone submodularity of the problem, we develop a double greedy algorithm to solve for the device scheduling policy. To avoid the local training of unscheduled devices, we utilize the historical gradient information of devices to estimate the current gradient for the device scheduling design. Compared to existing scheduling algorithms, the proposed representativity-aware device scheduling algorithm improves accuracy by 6.7% and 4.02% on two typical datasets, MNIST and CIFAR-10, respectively, under heterogeneous local data distributions. In addition, the proposed latency- and representativity-aware scheduling algorithm saves over 16% and 12% of training time on the MNIST and CIFAR-10 datasets, respectively, compared with scheduling algorithms based on either latency or representativity alone.
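The double greedy algorithm mentioned in the abstract is a standard routine for unconstrained non-monotone submodular maximization (it carries a 1/3-approximation guarantee in its deterministic form). Below is a minimal generic sketch, not the paper's implementation: the `double_greedy` name, the toy coverage-minus-cost objective, and the per-element cost of 0.5 are illustrative assumptions.

```python
def double_greedy(ground_set, f):
    """Deterministic double greedy for unconstrained non-monotone
    submodular maximization. X grows from the empty set, Y shrinks
    from the full ground set; at termination X == Y."""
    elements = list(ground_set)
    X, Y = set(), set(elements)
    for e in elements:
        a = f(X | {e}) - f(X)   # marginal gain of adding e to X
        b = f(Y - {e}) - f(Y)   # marginal gain of removing e from Y
        if a >= b:
            X.add(e)            # keep e
        else:
            Y.discard(e)        # drop e
    return X

# Toy non-monotone submodular objective: set coverage minus a
# per-element cost (hypothetical data, not from the paper).
coverage = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d"}}

def f(S):
    covered = set().union(*(coverage[e] for e in S)) if S else set()
    return len(covered) - 0.5 * len(S)

best = double_greedy(coverage.keys(), f)
# best == {1, 2, 4}: element 3 is dropped because its coverage {"c"}
# is already provided by element 2, so its marginal gain is negative.
```

In the paper's setting the role of `f` is played by a representativity objective over candidate device subsets, which is why proving its non-monotone submodularity is what licenses this style of algorithm.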
dc.format.extent: 720 - 735
dc.publisher: IEEE
dc.relation.ispartof: IEEE Transactions on Wireless Communications
dc.rights: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title: Exploring Representativity in Device Scheduling for Wireless Federated Learning
dc.type: Article
dc.identifier.doi: 10.1109/TWC.2023.3281765
pubs.issue: 1
pubs.notes: Not known
pubs.publication-status: Published
pubs.volume: 23
rioxxterms.funder: Default funder
rioxxterms.identifier.project: Default project
rioxxterms.funder.project: b215eee3-195d-4c4f-a85d-169a4331c138

