Abstract
The loss function plays a crucial role in the construction of machine learning algorithms. Employing a teacher model to set loss functions dynamically for student models has attracted attention. In existing works, (1) the characterization of the dynamic loss suffers from some inherent limitations, ie, the computational cost of loss networks and the restricted similarity measurement handcrafted loss functions; and (2) the states of the student model are provided to the teacher model directly without integration, causing the teacher model to underperform when trained on insufficient amounts of data. To alleviate the above-mentioned issues, in this paper, we select and weigh a set of similarity metrics by a confidence-based selection algorithm and a temporal teacher model to enhance the dynamic loss functions. Subsequently, to integrate the states of the student model, we employ statistics to quantify the information loss of the student model. Extensive experiments demonstrate that our approach can enhance student learning and improve the performance of various deep models on real-world tasks, including classification, object detection, and semantic segmentation scenarios.
Original language | English |
---|---|
Article number | 111124 |
Journal | Pattern Recognition |
Volume | 159 |
DOIs | |
Publication status | Published - Mar 2025 |
Keywords
- Dynamic loss function
- Learning to teach
- Optimization