Volume 51 (2021): Issue 1 (March 2021)
Journal Details
Format: Journal
eISSN: 2083-4608
First Published: 26 Feb 2008
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

K-Graph: Knowledgeable Graph for Text Documents

Published Online: 06 Apr 2021
Volume & Issue: Volume 51 (2021) - Issue 1 (March 2021)
Page range: 73 - 89
Abstract

Graph databases are used in many scientific and business applications because of their low complexity, low overhead, and low time complexity. Graph-based storage captures semantic and structural information rather than relying solely on the Bag-of-Words technique. An approach called Knowledgeable Graph (K-Graph) is proposed to capture semantic knowledge. Documents are stored as graph nodes. Using weighted subgraphs, the frequent subgraphs are extracted and stored in the Fast Embedding Referral Table (FERT). The table is maintained at different levels according to the headings and subheadings of the documents, which reduces the memory overhead and the retrieval and access time for the required subgraph. The authors propose an approach that reduces data redundancy to a large extent. On real-world datasets, K-Graph’s performance and power usage are reported to be threefold greater than those of current methods. An accuracy of 99 per cent demonstrates the robustness of the proposed algorithm.
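The abstract does not specify how the FERT is laid out, so the following Python sketch only illustrates the general idea of keeping frequent substructures in a table organised by heading level. It makes several simplifying assumptions that are not taken from the paper: a single edge between adjacent words stands in for a subgraph, plain occurrence counts stand in for weights, and the names `edges_of`, `build_table`, and `min_support` are hypothetical.

```python
from collections import Counter, defaultdict


def edges_of(text):
    """Return the set of undirected edges between adjacent words in `text`."""
    words = text.lower().split()
    return {
        tuple(sorted(pair))
        for pair in zip(words, words[1:])
        if pair[0] != pair[1]  # skip self-loops from repeated words
    }


def build_table(sections, min_support=2):
    """Map heading level -> {edge: support} for sufficiently frequent edges.

    `sections` is a list of (heading_level, text) pairs. The support of an
    edge is the number of sections at that level that contain it.
    This is an illustrative stand-in, not the paper's FERT structure.
    """
    support = defaultdict(Counter)
    for level, text in sections:
        # Each section contributes at most 1 to the support of an edge.
        support[level].update(edges_of(text))
    return {
        level: {edge: n for edge, n in counts.items() if n >= min_support}
        for level, counts in support.items()
    }


# Hypothetical input: two sections under level-1 headings, one under level 2.
sections = [
    (1, "graph mining finds frequent subgraphs"),
    (1, "frequent subgraphs support graph mining"),
    (2, "weighted subgraphs rank frequent patterns"),
]
table = build_table(sections)
```

Under these assumptions, a lookup such as `table[1]` returns only the edges shared by at least two level-1 sections, so a query scoped to one heading level does not need to scan the stored substructures of the other levels.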


