Open Access

| Issue | ITM Web Conf., Volume 78, 2025: International Conference on Computer Science and Electronic Information Technology (CSEIT 2025) |
|---|---|
| Article Number | 04033 |
| Number of page(s) | 13 |
| Section | Foundations and Frontiers in Multimodal AI, Large Models, and Generative Technologies |
| DOI | https://doi.org/10.1051/itmconf/20257804033 |
| Published online | 08 September 2025 |
- T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al. “Language models are few-shot learners”, Advances in Neural Information Processing Systems, vol. 33, Vancouver, Canada, 2020, pp. 1877–1901.
- Y. Zhuang, Y. Yu, K. Wang, H. Sun, and C. Zhang “ToolQA: A dataset for LLM question answering with external tools”, arXiv preprint, arXiv:2306.13304, 2023.
- W. Zhu, H. Liu, Q. Dong, J. Xu, L. Kong, J. Chen, L. Li, and S. Huang “Multilingual machine translation with large language models: Empirical results and analysis”, arXiv preprint, arXiv:2304.04675, 2023.
- G. Li, H. A. A. K. Hammoud, H. Itani, D. Khizbullin, and B. Ghanem “Camel: Communicative agents for mind exploration of large language model society”, Thirty-seventh Conference on Neural Information Processing Systems, New Orleans, USA, 2023.
- Z. Han, et al. “Parameter-efficient fine-tuning for large models: A comprehensive survey”, arXiv preprint, arXiv:2403.14608, 2024.
- Y. Li, et al. “Tokens-to-token ViT: Training vision transformers from scratch on ImageNet”, IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 2021.
- K. Zhou, et al. “Learning to prompt for vision-language models”, International Journal of Computer Vision, vol. 130, no. 9, 2022, pp. 2337–2348.
- N. Ding, et al. “Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models”, arXiv preprint, arXiv:2203.06904, 2022.
- A. Vaswani, et al. “Attention is all you need”, Advances in Neural Information Processing Systems, vol. 30, Long Beach, USA, 2017.
- K. He, et al. “Deep residual learning for image recognition”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016.
- L. Xu, H. Xie, Z. J. Qin, et al. “Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment”, arXiv preprint, arXiv:2312.12148, 2023.
- N. Houlsby, et al. “Parameter-efficient transfer learning for NLP”, International Conference on Machine Learning (ICML), Long Beach, USA, 2019.
- Y.-L. Sung, J. Cho, and M. Bansal “LST: Ladder side-tuning for parameter and memory efficient transfer learning”, Advances in Neural Information Processing Systems, vol. 35, New Orleans, USA, 2022, pp. 12991–13005.
- X. L. Li and P. Liang “Prefix-tuning: Optimizing continuous prompts for generation”, arXiv preprint, arXiv:2101.00190, 2021.
- B. Lester, R. Al-Rfou, and N. Constant “The power of scale for parameter-efficient prompt tuning”, arXiv preprint, arXiv:2104.08691, 2021.
- J. Lee, R. Tang, and J. Lin “What would Elsa do? Freezing layers during transformer fine-tuning”, arXiv preprint, arXiv:1911.03090, 2019.
- E. Zaken, S. Ravfogel, and Y. Goldberg “BitFit: Simple parameter-efficient fine-tuning for transformer-based masked language models”, arXiv preprint, arXiv:2106.10199, 2021.
- M. Raghu, et al. “Transfusion: Understanding transfer learning for medical imaging”, Advances in Neural Information Processing Systems, vol. 32, Vancouver, Canada, 2019.
- A. Aghajanyan, L. Zettlemoyer, and S. Gupta “Intrinsic dimensionality explains the effectiveness of language model fine-tuning”, arXiv preprint, arXiv:2012.13255, 2020.
- E. J. Hu, et al. “LoRA: Low-rank adaptation of large language models”, International Conference on Learning Representations (ICLR), Virtual Conference, 2022, pp. 1–3.

