[1] ZHOU Aohui, WENG Zhiyuan, ZHOU Siyuan, et al. A Service Discovery Method Based on Topic Filtering and Semantic Matching[J]. Journal of Zhengzhou University (Engineering Science), 2022, 43(06): 36-41. [doi:10.13705/j.issn.1671-6833.2022.06.003]

A Service Discovery Method Based on Topic Filtering and Semantic Matching

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
43
Issue:
No. 06, 2022
Pages:
36-41
Column:
Publication date:
2022-09-02

Article Info

Title:
A Service Discovery Method Based on Topic Filtering and Semantic Matching
Author(s):
ZHOU Aohui1, WENG Zhiyuan2, ZHOU Siyuan1, HUANG Qiao1, WANG Ye1, ZHANG Hua1
1. School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China;
2. Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln 68508, U.S.
Keywords:
service computing; business targets; service matching; recurrent neural networks; natural language processing
CLC number:
TP311;O244
DOI:
10.13705/j.issn.1671-6833.2022.06.003
Document code:
A
Abstract:
Given the large number of services available on the Internet, efficiently matching suitable services to a specific business target remains a major research challenge. To address this problem, a method based on topic filtering and semantic matching is proposed for discovery over massive service collections. First, Word2Vec is used to compare the similarity between topic description texts and the business target description text, yielding the topic of the business target. Second, TextRank is used to extract key sentences from the service description texts, and these key sentences are filtered by the extracted business target topic to narrow the comparison range. Third, word vectors are extracted from the business target and service description texts, and a BiLSTM model with an attention mechanism computes their similarity and returns the top N services most similar to the business target description for business developers to choose from. Data crawled from Programmable Web were annotated to build the business target-service sentence dataset required for the experiments and to evaluate the effectiveness of the method. Finally, comparison with models such as TextCNN shows that the MAP of the proposed method is 1.41, 4.61, and 4.95 percentage points higher than that of the BiLSTM model without attention, the TextCNN model, and the Word2VecSD model, respectively, and that the method has potential for further improvement in future work.
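The last two stages described in the abstract, returning the top N services and scoring the result with MAP, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: averaged word vectors with cosine similarity stand in for the attention-based BiLSTM scorer, and the embeddings, service names, and helper function names are hypothetical toy values. Average precision here divides by the total number of relevant items, which is one common convention; the abstract does not say which variant the paper uses.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_vector(text, embeddings, dim):
    """Average the vectors of the known words in text; unknown words are skipped."""
    vecs = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def top_n_services(target, services, embeddings, dim, n):
    """Rank service descriptions by similarity to the business target; keep the first n.

    Assumption: cosine over averaged word vectors replaces the BiLSTM scorer.
    """
    t = sentence_vector(target, embeddings, dim)
    scored = [
        (cosine(t, sentence_vector(desc, embeddings, dim)), name)
        for name, desc in services.items()
    ]
    # Highest score first; ties are broken by service name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:n]]


def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevance):
    """Average the per-query AP over all business targets (queries)."""
    aps = [average_precision(rankings[q], relevance[q]) for q in rankings]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical 2-dimensional embeddings and service descriptions.
toy_embeddings = {
    "send": [1.0, 0.0], "email": [0.9, 0.1], "message": [0.8, 0.2],
    "map": [0.0, 1.0], "route": [0.1, 0.9],
}
toy_services = {"MailAPI": "send email message", "GeoAPI": "map route"}

ranking = top_n_services("send message", toy_services, toy_embeddings, dim=2, n=2)
print(ranking)
print(mean_average_precision({"q1": ranking}, {"q1": {"MailAPI"}}))
```

In a fuller pipeline the scoring function passed over the candidates would be the trained similarity model, and the candidate set would first be narrowed by the topic filter; the ranking and MAP code would stay the same.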


Last Update: 2022-10-03