This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.