Pakistan Science Abstracts
CS-1792: Data-Efficient Transformers for Vision: Enhancing Accuracy under Resource Limitations
Author(s):
1. Aiman Sumaya: Dept. of CS, COMSATS University Islamabad, Wah Campus, Pakistan
Abstract:
In recent years, transformer-based architectures have revolutionized computer vision, achieving state-of-the-art results across diverse tasks. Nevertheless, conventional Vision Transformers (ViTs) demand massive datasets and substantial computational resources, which restricts their applicability in resource-constrained or data-scarce scenarios. To overcome this limitation, Data-Efficient Image Transformers (DEiTs) have been introduced, leveraging three key strategies: (i) knowledge distillation, where a compact student transformer benefits from the guidance of a powerful CNN teacher; (ii) optimized training protocols, including regularization and augmentation tailored for small-data regimes; and (iii) efficient architectural modifications that reduce redundancy while preserving representational power. By integrating these mechanisms, DEiTs not only narrow the gap between transformers and CNNs under limited-data conditions but also establish a scalable framework for future vision applications requiring both efficiency and accuracy.
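Strategy (i) can be illustrated with a minimal sketch of a soft knowledge-distillation loss. The abstract does not state which loss is used, so this assumes the generic formulation: a hard-label cross-entropy term combined with a KL-divergence term between temperature-softened teacher and student distributions. The function names and the `alpha` and `temperature` parameters are hypothetical choices for illustration, not values from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=3.0):
    """Weighted sum of hard-label cross-entropy and soft-label KL divergence.

    alpha and temperature are illustrative defaults (assumptions), not
    settings reported in the abstract.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    ce = -log_softmax(student_logits)[label]

    # Soft-label term: KL(teacher || student) on softened distributions.
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # Scaling by temperature**2 is a common convention that keeps the
    # soft term's gradient magnitude comparable to the hard term's.
    return (1 - alpha) * ce + alpha * temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical CNN teacher logits
    student = [1.0, 0.8, -0.2]   # hypothetical student transformer logits
    print(distillation_loss(student, teacher, label=0))
```

With `alpha=1.0` the loss reduces to the pure teacher-matching term, which is zero when the student reproduces the teacher's logits exactly. Lower values of `alpha` give more weight to the ground-truth label.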
Page(s): 102
DOI: not available
Published: Journal: 4th International Conference of Sciences “Revamped Scientific Outlook of 21st Century, 2025”, November 12, 2025, Volume: 1, Issue: 1, Year: 2025
Keywords:
Data augmentation, Computer vision, ViT, Knowledge distillation