Institutional Repository of School of Information Engineering and Artificial Intelligence
Fast semi-supervised self-training algorithm based on data editing | |
Li, Bing1,2; Wang, Jikui2; Yang, Zhengguo2; Yi, Jihai2; Nie, Feiping3,4 | |
2023-05 | |
Journal | INFORMATION SCIENCES |
Volume | 626; Pages: 293-314 |
Abstract | Self-training is a commonly used semi-supervised learning framework. Selecting high-confidence samples is a crucial step for algorithms based on the self-training framework. To alleviate the impact of noisy data, researchers have proposed many data editing methods to improve the selection quality of high-confidence samples. However, state-of-the-art data editing methods have high time complexity, no less than O(n^2), where n denotes the number of samples. To improve training speed while ensuring the quality of the selected high-confidence samples, and inspired by the Ball-k-means algorithm, we propose a fast semi-supervised self-training algorithm based on data editing (EBSA), which defines ball-cluster partition and editing to improve the quality of high-confidence samples. The time complexity of the proposed EBSA is O(t(2kn + n log n + n + k^2)), where k denotes the number of centers and t denotes the number of iterations. Because k is far less than n, EBSA has linear time complexity with respect to n. A large number of experiments on 20 benchmark data sets show that the proposed algorithm not only runs faster but also achieves better classification performance than the comparison algorithms. © 2023 Elsevier Inc. All rights reserved. |
Keywords | Semi-supervised learning Self-training classification Ball-k-means Data editing |
DOI | 10.1016/j.ins.2023.01.029 |
Indexed by | SCIE ; EI |
ISSN | 0020-0255 |
Language | English |
WOS Research Area | Computer Science |
WOS Category | Computer Science, Information Systems |
WOS Accession Number | WOS:000925337600001 |
Publisher | ELSEVIER SCIENCE INC |
EI Accession Number | 20230413427395 |
EI Main Heading | Classification (of information) |
EI Classification Code | 716.1 Information Theory and Signal Processing ; 723.4.2 Machine Learning ; 903.1 Information Sources and Analysis |
Original Document Type | Article |
EISSN | 1872-6291 |
Document Type | Journal article |
Item Identifier | http://ir.lzufe.edu.cn/handle/39EH0E1M/33449 |
Collection | School of Information Engineering and Artificial Intelligence |
Author Affiliations | 1.Guizhou Univ, State Key Lab Publ Big Data, Guiyang 550025, Guizhou, Peoples R China; 2.Lanzhou Univ Finance & Econ, Coll Informat Engn, Lanzhou 730020, Gansu, Peoples R China; 3.Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China; 4.Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian 710072, Shaanxi, Peoples R China |
First Author Affiliation | Lanzhou University of Finance and Economics |
Recommended Citation GB/T 7714 | Li, Bing, Wang, Jikui, Yang, Zhengguo, et al. Fast semi-supervised self-training algorithm based on data editing[J]. INFORMATION SCIENCES, 2023, 626: 293-314. |
APA | Li, Bing, Wang, Jikui, Yang, Zhengguo, Yi, Jihai, & Nie, Feiping. (2023). Fast semi-supervised self-training algorithm based on data editing. INFORMATION SCIENCES, 626, 293-314. |
MLA | Li, Bing, et al. "Fast semi-supervised self-training algorithm based on data editing". INFORMATION SCIENCES 626 (2023): 293-314. |
Files in This Item | There are no files associated with this item. |