N. Maheswari a,* and M. Revathi b

a School of Computing Science and Engineering, VIT University, Chennai, India
b Department of Computer Science and Engineering, Hindusthan College of Engineering and Technology, Coimbatore, India




ABSTRACT


Protecting private data from unauthorized access is a primary concern in data use, from national security to business transactions. This concern has given rise to a branch of data mining known as Privacy Preserving Data Mining (PPDM), which addresses the application of data mining techniques to datasets containing personal, sensitive, or confidential information. Data distortion is a critical component of privacy preservation in security-related data mining applications, and we propose a QR Decomposition method for data distortion. We focus primarily on privacy preserving data clustering. Because the distorted data occupies only a small amount of storage space, the memory requirement is low. We evaluate the effectiveness of the method in terms of misclassification error rate. Our experiments on several data sets show that the misclassification error rate varies with the level of security applied. However, the method has a much lower computational cost, especially when new data items are inserted dynamically.
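The abstract does not spell out how the QR factors are used, so the following is only a minimal sketch of one plausible reading: factor the data matrix as A = QR and publish a rank-limited product of the leading factors. The function name `qr_distort`, the `rank` parameter, the truncation rule, and the random test matrix are all assumptions made for illustration, not the authors' stated procedure.

```python
import numpy as np


def qr_distort(data: np.ndarray, rank: int) -> np.ndarray:
    """Return a rank-limited approximation of `data` built from its QR factors.

    Assumed truncation rule (not taken from the paper): keep the first
    `rank` columns of Q and the first `rank` rows of R.
    """
    # Reduced QR: data (n x m) = q (n x k0) @ r (k0 x m), with k0 = min(n, m).
    q, r = np.linalg.qr(data, mode="reduced")
    return q[:, :rank] @ r[:rank, :]


# Hypothetical data: 100 records with 5 numeric attributes.
rng = np.random.default_rng(0)
original = rng.normal(size=(100, 5))

distorted = qr_distort(original, rank=3)

# Relative change introduced by the distortion (Frobenius norm).
rel_change = np.linalg.norm(original - distorted) / np.linalg.norm(original)

# Values to store if only the truncated factors are kept: n*k + k*m.
n, m = original.shape
stored_values = n * 3 + 3 * m
```

Under this assumed rule, storing the two truncated factors takes n*k + k*m values rather than n*m, which is one way the reduced storage mentioned above could arise. Note that plain QR without column pivoting reproduces the first `rank` columns of the data exactly, so this sketch by itself should not be read as a complete privacy mechanism.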


Keywords: Privacy preserving; QR Decomposition; clustering; data distortion; data mining.






ARTICLE INFORMATION


Received: 2013-07-28
Revised: 2014-07-05
Accepted: 2014-08-18
Available Online: 2014-12-01


Cite this article:

Maheswari, N., Revathi, M. 2014. Data security using decomposition. International Journal of Applied Science and Engineering, 12, 303–312. https://doi.org/10.6703/IJASE.2014.12(4).303