Cube Index: A Text Index Model for Retrieval and Mining

B. Janet; A. V. Reddy

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 1 - Issue 9

Published: February 2010

10.5120/192-330

B. Janet, A. V. Reddy. Cube Index: A Text Index Model for Retrieval and Mining. International Journal of Computer Applications. 1, 9 (February 2010), 88-92. DOI=10.5120/192-330

@article{10.5120/192-330,
  author    = {B. Janet and A. V. Reddy},
  title     = {Cube Index: A Text Index Model for Retrieval and Mining},
  journal   = {International Journal of Computer Applications},
  year      = {2010},
  volume    = {1},
  number    = {9},
  pages     = {88-92},
  doi       = {10.5120/192-330},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2010
%A B. Janet
%A A. V. Reddy
%T Cube Index: A Text Index Model for Retrieval and Mining
%J International Journal of Computer Applications
%V 1
%N 9
%P 88-92
%R 10.5120/192-330
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Text retrieval, analysis, mining, and knowledge management have gained importance at a time when we are drowning in information but starved for knowledge. In this paper, we propose a novel index that uses a text cube model to store text information, similar to a data cube in data mining. The model combines a direct index, a next-word index, and an inverted index in a single Cube Index, which is three-dimensional. The dimensions are the first word, the next word, and the document. The measure of the cube is the frequency of occurrence of each (word, next-word) pair. The Cube Index has been tested by modifying the open-source code of Terrier 2.1.
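
The three-dimensional structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Terrier-based implementation. It assumes lowercase whitespace tokenisation, a sparse dictionary keyed by (first word, next word, document), and a hypothetical `END` sentinel so that the last word of each document also receives a cell. The function names `build_cube`, `inverted_slice`, `direct_slice`, and `next_word_slice` are illustrative only.

```python
from collections import defaultdict

END = "<end>"  # hypothetical sentinel: gives the final word of a document a next-word


def build_cube(docs):
    """Build a sparse cube: (first word, next word, doc id) -> pair frequency."""
    cube = defaultdict(int)
    for doc_id, text in docs.items():
        tokens = text.lower().split()  # assumed tokenisation, not from the paper
        for first, nxt in zip(tokens, tokens[1:] + [END]):
            cube[(first, nxt, doc_id)] += 1
    return dict(cube)


def inverted_slice(cube, word):
    """Inverted-index view: doc id -> frequency of `word`, summed over next words."""
    postings = defaultdict(int)
    for (first, _nxt, doc_id), freq in cube.items():
        if first == word:
            postings[doc_id] += freq
    return dict(postings)


def direct_slice(cube, doc_id):
    """Direct-index view: word -> frequency within one document."""
    terms = defaultdict(int)
    for (first, _nxt, d), freq in cube.items():
        if d == doc_id:
            terms[first] += freq
    return dict(terms)


def next_word_slice(cube, word):
    """Next-word view: next word -> {doc id: pair frequency} for a given first word."""
    out = defaultdict(dict)
    for (first, nxt, doc_id), freq in cube.items():
        if first == word:
            out[nxt][doc_id] = freq
    return dict(out)


docs = {
    "d1": "text cube index for text retrieval",
    "d2": "text retrieval and text mining",
}
cube = build_cube(docs)
print(inverted_slice(cube, "text"))
print(direct_slice(cube, "d2"))
print(next_word_slice(cube, "text"))
```

Each view here is a linear scan over the cells, which keeps the sketch short. A practical index would organise the cells so that a slice along any one dimension can be read without touching the rest.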

Index Terms

Computer Science

Information Sciences

Keywords

Cube Index, Information Retrieval, Inverted index, Next-word index, Associative index, Direct Index, Text Data cube