This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.