Natural Language Processing Techniques for Information Retrieval: Enhancing Search Engines with Semantic Understanding

Subi S; Shanthini B; SilpaRaj M; Shekar K; Keerthana G; Anitha R

doi:10.1051/itmconf/20257605013

Open Access

Issue		ITM Web Conf. Volume 76, 2025 Harnessing Innovation for Sustainability in Computing and Engineering Solutions (ICSICE-2025)


Article Number		05013
Number of page(s)		11
Section		Emerging Technologies & Computing
DOI		https://doi.org/10.1051/itmconf/20257605013
Published online		25 March 2025

ITM Web of Conferences 76, 05013 (2025)

Subi S¹, Shanthini B², SilpaRaj M³, Shekar K⁴, Keerthana G⁵ and Anitha R⁶

¹ Assistant Professor, Department of Artificial Intelligence and Data Science, R.M.K. College of Engineering and Technology (RMKCET), Thiruvallur, Tamil Nadu, India
² Professor & Head, Dept of Computer Science and Engineering, St. Peter's Institute of Higher Education and Research, Chennai, Tamil Nadu, India
³ Assistant Professor, Department of Computer Science and Engineering (Cyber Security), CVR College of Engineering, Hyderabad, Telangana, India
⁴ Department of Computer Science and Engineering, MLR Institute of Technology, Hyderabad, Telangana, India
⁵ Assistant Professor, Department of Computer Science and Engineering, J.J. College of Engineering and Technology, Tiruchirappalli, Tamil Nadu, India
⁶ Assistant Professor, Department of IT, New Prince Shri Bhavani College of Engineering and Technology, Chennai, Tamil Nadu, India

Abstract

This paper investigates Natural Language Processing (NLP) methods that use semantic understanding to improve information retrieval systems, with a focus on search engines. The proposed ideas address three problems: reducing model size, which is one of the biggest problems with large models; training on domain-specific knowledge, since real applications depend on having the right knowledge; and handling unstructured data efficiently, which remains a key difficulty for NLP frameworks. The study highlights the need for hybrid models that combine generalization and specificity, fast algorithms for large data sets, and automated knowledge extraction. The directions considered also include cross-lingual approaches, rapid learning in out-of-distribution domains, and human-centered design of AI systems. The end objective of this work is a semantic search engine that is adaptive, scalable, and flexible, that is intent-aware and tolerant of ambiguous queries, and that returns semantically richer results for datasets of varying size, pointing to complementary applications of NLP in information retrieval.

Key words: Natural Language Processing / Information Retrieval / Semantic Understanding / Search Engines / Large Models / Domain-Specific Knowledge / Hybrid Models

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
