dc.contributor.author | Chen, Z | |
dc.contributor.author | Yi, W | |
dc.contributor.author | Nallanathan, A | |
dc.date.accessioned | 2024-07-12T07:36:48Z | |
dc.date.available | 2024-07-12T07:36:48Z | |
dc.date.issued | 2023-06-06 | |
dc.identifier.citation | Z. Chen, W. Yi and A. Nallanathan, "Exploring Representativity in Device Scheduling for Wireless Federated Learning," in IEEE Transactions on Wireless Communications, vol. 23, no. 1, pp. 720-735, Jan. 2024, doi: 10.1109/TWC.2023.3281765. keywords: {Wireless communication;Training;Performance evaluation;Convergence;Servers;Scheduling algorithms;Job shop scheduling;Device scheduling;wireless federated Learning;resource allocation;submodular optimization}, | en_US |
dc.identifier.issn | 1536-1276 | |
dc.identifier.uri | https://qmro.qmul.ac.uk/xmlui/handle/123456789/98043 | |
dc.description.abstract | Existing device scheduling works in wireless federated learning (FL) have mainly focused on selecting the devices with the maximum gradient norm or loss function and require all devices to perform local training in each round. This may produce extra training costs and schedule devices with similar data statistics, thus degrading learning performance. To mitigate these problems, we first theoretically characterize the convergence behaviour of the considered FL system, finding that the learning performance is degraded by the difference between the aggregated gradient of scheduled devices and the full participation gradient. Inspired by this, we propose to find a subset of representative devices and the corresponding per-device stepsizes to approximate the full participation aggregated gradient. Considering the limited wireless bandwidth, we formulate a problem to capture the trade-off between representativity and latency by optimizing device scheduling and bandwidth allocation policies. Our analysis reveals that the optimal bandwidth allocation is achieved when all scheduled devices have the same latency. Then, by proving the non-monotone submodularity of the problem, we develop a double greedy algorithm to solve for the device scheduling policy. To avoid the local training of unscheduled devices, we utilize the historical gradient information of devices to estimate the current gradient for device scheduling design. Compared to existing scheduling algorithms, the proposed representativity-aware device scheduling algorithm improves accuracy by 6.7% and 4.02% on two typical datasets under heterogeneous local data distributions, MNIST and CIFAR-10, respectively. In addition, the proposed latency- and representativity-aware scheduling algorithm saves over 16% and 12% of training time on the MNIST and CIFAR-10 datasets, respectively, compared to scheduling algorithms based on either latency or representativity alone. | en_US |
dc.format.extent | 720 - 735 | |
dc.publisher | IEEE | en_US |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | |
dc.rights | © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
dc.title | Exploring Representativity in Device Scheduling for Wireless Federated Learning | en_US |
dc.type | Article | en_US |
dc.identifier.doi | 10.1109/TWC.2023.3281765 | |
pubs.issue | 1 | en_US |
pubs.notes | Not known | en_US |
pubs.publication-status | Published | en_US |
pubs.volume | 23 | en_US |
rioxxterms.funder | Default funder | en_US |
rioxxterms.identifier.project | Default project | en_US |
rioxxterms.funder.project | b215eee3-195d-4c4f-a85d-169a4331c138 | en_US |