Evaluation of Filtering Methods Applied to the Unstructured Datasets in the Predictive Learning Services
1 Moscow Technological Institute, 119334, Moscow, Russia
2 Moscow Technological University MIREA, 107996, Moscow, Russia
3 Moscow Institute of Physics and Technology, 141700, Dolgoprudny, Moscow Region, Russia
* Corresponding author: firstname.lastname@example.org
Predictive learning services aggregate and homogenize open data from public sources, in particular from online recruitment agencies. However, a sample of vacancies may contain a varying percentage of noise because homonyms occur frequently. This article considers two approaches to noise reduction: the first is based on cosine similarity, and the second on contextual words.
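As an illustration only, the two filtering ideas named above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the bag-of-words representation, the function names (`tokenize`, `cosine_similarity`, `filter_by_cosine`, `filter_by_context`), the threshold values, and the sample vacancies are all hypothetical and chosen for demonstration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the term-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def filter_by_cosine(vacancies, reference, threshold=0.5):
    """Keep vacancies that are similar enough to a reference description.

    The threshold is an assumed value and would need tuning on real data.
    """
    return [v for v in vacancies if cosine_similarity(v, reference) >= threshold]


def filter_by_context(vacancies, context_words, min_hits=1):
    """Keep vacancies that contain at least min_hits of the contextual words."""
    context = {w.lower() for w in context_words}
    return [v for v in vacancies if len(context & set(tokenize(v))) >= min_hits]


# Hypothetical sample: "architect" is a homonym across two professions.
vacancies = [
    "Software architect: design microservices and cloud software systems",
    "Architect: design residential buildings and prepare construction drawings",
]
reference = "software architect design software systems microservices cloud"

kept_by_cosine = filter_by_cosine(vacancies, reference, threshold=0.5)
kept_by_context = filter_by_context(vacancies, {"software", "cloud"})
```

In this toy sample both filters keep only the software vacancy. The cosine filter needs a representative reference text and a tuned threshold, while the contextual-word filter needs a curated word list for each profession.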
© The Authors, published by EDP Sciences, 2017
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.