Show simple item record

dc.contributor.author	Cui, Y
dc.contributor.author	Mao, Y
dc.contributor.author	Liu, Z
dc.contributor.author	Li, Q
dc.contributor.author	Chan, AB
dc.contributor.author	Liu, X
dc.contributor.author	Kuo, T-W
dc.contributor.author	Xue, CJ
dc.date.accessioned	2024-07-22T09:33:22Z
dc.date.available	2024-07-22T09:33:22Z
dc.date.issued	2023-02-20
dc.identifier.citation	Y. Cui et al., "Variational Nested Dropout," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 10519-10534, Aug. 2023, doi: 10.1109/TPAMI.2023.3241945. Keywords: Training; Bayes methods; Uncertainty; Indexes; Costs; Computational modeling; Representation learning; Bayesian neural network; dropout; model compression; slimmable neural network; uncertainty estimation; variational autoencoder.	en_US
dc.identifier.uri	https://qmro.qmul.ac.uk/xmlui/handle/123456789/98284
dc.description.abstract	Nested dropout is a variant of the dropout operation that orders network parameters or features according to a pre-defined importance during training. It has been explored for: I. Constructing nested nets [Cui et al. 2020; Cui et al. 2021]: nested nets are neural networks whose architectures can be adjusted instantly at test time, e.g., based on computational constraints. Nested dropout implicitly ranks the network parameters, generating a set of sub-networks such that any smaller sub-network forms the basis of a larger one. II. Learning ordered representations [Rippel et al. 2014]: nested dropout applied to the latent representation of a generative model (e.g., an auto-encoder) ranks the features, enforcing an explicit order over the dimensions of the dense representation. However, the dropout rate is fixed as a hyper-parameter throughout training. For nested nets, when network parameters are removed, performance decays along a human-specified trajectory rather than a trajectory learned from data. For generative models, the importance of features is specified as a constant vector, restricting the flexibility of representation learning. To address these problems, we focus on the probabilistic counterpart of nested dropout. We propose a variational nested dropout (VND) operation that draws samples of multi-dimensional ordered masks at low cost, providing useful gradients to the parameters of nested dropout. Based on this approach, we design a Bayesian nested neural network that learns the order knowledge of the parameter distributions. We further exploit VND under different generative models to learn ordered latent distributions. In experiments, we show that the proposed approach outperforms the nested network in terms of accuracy, calibration, and out-of-domain detection on classification tasks. It also outperforms related generative models on data generation tasks.	en_US
dc.format.extent	10519 - 10534
dc.language	eng
dc.publisher	IEEE	en_US
dc.relation.ispartof	IEEE Trans Pattern Anal Mach Intell
dc.rights	© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title	Variational Nested Dropout	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1109/TPAMI.2023.3241945
pubs.author-url	https://www.ncbi.nlm.nih.gov/pubmed/37027650	en_US
pubs.issue	8	en_US
pubs.notes	Not known	en_US
pubs.publication-status	Published	en_US
pubs.volume	45	en_US
rioxxterms.funder	Default funder	en_US
rioxxterms.identifier.project	Default project	en_US
rioxxterms.funder.project	b215eee3-195d-4c4f-a85d-169a4331c138	en_US
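
The fixed-rate nested dropout that the abstract contrasts with VND can be sketched in a few lines. Below is a minimal, illustrative sample of the classic ordered-mask scheme of Rippel et al. 2014 (truncation index drawn from a geometric prior, units kept in importance order), not the paper's variational version; the function name, the geometric prior, and the rate value are assumptions chosen for illustration.

```python
import numpy as np

def sample_nested_mask(dim, rate=0.1, rng=None):
    """Sample a prefix mask: keep units 1..b and drop the rest, with b ~ Geometric(rate).

    Because the kept units always form a prefix, any smaller mask is nested
    inside every larger one, which is what orders the dimensions.
    """
    rng = rng if rng is not None else np.random.default_rng()
    b = min(int(rng.geometric(rate)), dim)  # truncation index, capped at dim
    mask = np.zeros(dim)
    mask[:b] = 1.0  # earlier (more important) units survive whenever later ones do
    return mask

# Example: three masks over an 8-dimensional latent representation
rng = np.random.default_rng(0)
for _ in range(3):
    print(sample_nested_mask(8, rng=rng))
```

Here `rate` plays the role of the fixed hyper-parameter criticized in the abstract; VND replaces this fixed geometric sampling with a learnable distribution over ordered masks so that gradients can flow to its parameters.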


