Research Article

Cube Index: A Text Index Model for Retrieval and Mining

by  B. Janet, A. V. Reddy
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Issue 9
Published: February 2010
Authors: B. Janet, A. V. Reddy
10.5120/192-330

B. Janet, A. V. Reddy. Cube Index: A Text Index Model for Retrieval and Mining. International Journal of Computer Applications. 1, 9 (February 2010), 88-92. DOI=10.5120/192-330

                        @article{ 10.5120/192-330,
                        author  = { B. Janet and A. V. Reddy },
                        title   = { Cube Index: A Text Index Model for Retrieval and Mining },
                        journal = { International Journal of Computer Applications },
                        year    = { 2010 },
                        volume  = { 1 },
                        number  = { 9 },
                        pages   = { 88-92 },
                        doi     = { 10.5120/192-330 },
                        publisher = { Foundation of Computer Science (FCS), NY, USA }
                        }
                        %0 Journal Article
                        %D 2010
                        %A B. Janet
                        %A A. V. Reddy
                        %T Cube Index: A Text Index Model for Retrieval and Mining
                        %J International Journal of Computer Applications
                        %V 1
                        %N 9
                        %P 88-92
                        %R 10.5120/192-330
                        %I Foundation of Computer Science (FCS), NY, USA
Abstract

Text retrieval, analysis, mining, and knowledge management have gained importance at a time when we drown in information but are starved for knowledge. In this paper, we propose a novel index that uses a Text Cube model to store text information, similar to a data cube in data mining. This model combines a direct index, a next-word index, and an inverted index in a single Cube Index that is three-dimensional. The dimensions are first word, next word, and document. The measure of the cube is the frequency of occurrence of the word and next-word pair. The Cube Index has been tested by modifying the source code of the open-source Terrier 2.1 platform.
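
The three-dimensional structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Terrier-based implementation: the dictionary layout, the whitespace tokenizer, and the function names (`build_cube`, `direct_view`, `next_word_view`, `inverted_view`, `phrase_docs`) are assumptions made for illustration only. Each cell is keyed by (first word, next word, document) and holds the frequency of that pair; the three classical indexes then appear as slices of the same cube.

```python
from collections import defaultdict


def build_cube(docs):
    """Build cube[(first_word, next_word, doc_id)] = pair frequency.

    `docs` maps a document id to its text. Tokenization is a plain
    lowercase whitespace split (an assumption, not the paper's method).
    """
    cube = defaultdict(int)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for first, nxt in zip(tokens, tokens[1:]):
            cube[(first, nxt, doc_id)] += 1
    return cube


def direct_view(cube, doc_id):
    """Direct-index style slice: the word pairs of one document."""
    return {(f, n): c for (f, n, d), c in cube.items() if d == doc_id}


def next_word_view(cube, first):
    """Next-word style slice: followers of `first`, summed over documents."""
    followers = defaultdict(int)
    for (f, n, _), count in cube.items():
        if f == first:
            followers[n] += count
    return dict(followers)


def inverted_view(cube, word):
    """Inverted-index style slice: documents where `word` starts a pair.

    Note: in this sketch a word that occurs only as the last token of a
    document never starts a pair, so it is not found by this view.
    """
    return sorted({d for (f, _, d) in cube if f == word})


def phrase_docs(cube, first, nxt):
    """Documents that contain the two-word phrase `first nxt`."""
    return sorted(d for (f, n, d) in cube if f == first and n == nxt)


# Hypothetical toy collection, used only to exercise the sketch.
docs = {
    "d1": "text cube index for text mining",
    "d2": "cube index model",
}
cube = build_cube(docs)
print(next_word_view(cube, "text"))
print(phrase_docs(cube, "cube", "index"))
```

Storing the cube as a sparse dictionary keeps only the cells that actually occur, which matters because most (first word, next word, document) combinations are empty.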

References
  • Frakes W. B. and Baeza-Yates R., 2008, Information Retrieval: Data Structures and Algorithms, Pearson.
  • Han Jiawei, Kamber M., 2006, Data Mining: Concepts and Techniques, Elsevier, Morgan Kaufmann Publishers.
  • http://ir.dcs.gla.ac.uk/terrier/
  • Lin C. X., Ding B., Han J., Zhu F., Zhao B., Text Cube: Computing IR Measures for Multidimensional Text Database Analysis, Proceedings of the 8th IEEE International Conference on Data Mining, 2008, http://doi.acm.org/10.1109/ICDM.2008.135
  • Salton G. and McGill, M. J., 1983, Introduction to Modern Information Retrieval, McGraw Hill Company.
  • Williams H. E., Zobel J., and Anderson P., What's next? Index structures for efficient phrase querying, In Proceedings Australasian Database Conference, M. Orlowska, Ed., Springer-Verlag, Auckland, New Zealand, 1999, pp. 141-152.
  • Bahle D., Williams H. E., and Zobel J., Efficient Phrase Querying with an Auxiliary Index, In Proc. ACM-SIGIR Conf. on Research and Development in Information Retrieval, Tampere, Finland, August 2002, pp. 215-221.
Index Terms
Computer Science
Information Sciences
Keywords

Cube Index, Information Retrieval, Inverted index, Next-word index, Associative index, Direct Index, Text Data cube
