Data Security Using Decomposition

N. Maheswari; M. Revathi

doi:10.6703/IJASE.2014.12(4).303

N. Maheswari^a* and M. Revathi^b

^a School of Computing Science and Engineering, VIT University, Chennai, India
^b Department of Computer Science and Engineering, Hindusthan College of Engineering and Technology, Coimbatore, India

ABSTRACT

Protecting privacy against unauthorized access is a primary concern wherever data is used, from national security to business transactions. This concern has given rise to a branch of data mining known as Privacy Preserving Data Mining (PPDM), which addresses the application of data mining techniques to datasets containing personal, sensitive, or confidential information. Data distortion is a critical component of privacy preservation in security-related data mining applications, and we propose a QR decomposition method for data distortion. We focus primarily on privacy-preserving data clustering. Because the distorted data occupies little storage space, the memory requirement is low. We evaluate the effectiveness of the method in terms of misclassification error rate. Our experiments on several data sets show that the classification error rate varies as a result of the distortion applied for security. However, the method has a much lower computational cost, especially when new data items are inserted dynamically.

Keywords: Privacy preserving; QR Decomposition; clustering; data distortion; data mining.
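
The abstract names QR decomposition as the distortion tool but does not describe the procedure, so the following is only an illustrative sketch and not the authors' method. It assumes the data matrix A (rows are records, columns are attributes, full column rank) is factored as A = QR and that the distorted copy keeps only the first k columns of Q and the first k rows of R. The function names `qr_gram_schmidt` and `qr_distort`, the parameter `k`, and the sample matrix are all hypothetical.

```python
import math


def qr_gram_schmidt(a):
    """Reduced QR factorization of an n x m matrix given as a list of rows.

    Assumes n >= m and full column rank (otherwise a division by zero occurs).
    Returns (q, r): q is n x m with orthonormal columns, r is m x m upper
    triangular, and a = q * r up to rounding.
    """
    n, m = len(a), len(a[0])
    cols = [[a[i][j] for i in range(n)] for j in range(m)]  # column-wise view
    q_cols = []
    r = [[0.0] * m for _ in range(m)]
    for j in range(m):
        v = cols[j][:]
        # Modified Gram-Schmidt: remove the component along each earlier q_i.
        for i, q_i in enumerate(q_cols):
            r[i][j] = sum(qk * vk for qk, vk in zip(q_i, v))
            v = [vk - r[i][j] * qk for qk, vk in zip(q_i, v)]
        r[j][j] = math.sqrt(sum(vk * vk for vk in v))
        q_cols.append([vk / r[j][j] for vk in v])
    q = [[q_cols[j][i] for j in range(m)] for i in range(n)]
    return q, r


def qr_distort(a, k):
    """Assumed distortion step: rebuild `a` from the first k columns of Q
    and the first k rows of R, giving a matrix of rank at most k."""
    q, r = qr_gram_schmidt(a)
    m = len(r)
    return [[sum(row[t] * r[t][j] for t in range(k)) for j in range(m)]
            for row in q]


# Hypothetical 4-record, 3-attribute data set.
data = [
    [1.0, 2.0, 0.0],
    [0.0, 1.0, 3.0],
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
distorted = qr_distort(data, k=2)
for row in distorted:
    print([round(x, 3) for x in row])
```

Under this assumption, storing the two truncated factors takes k(n + m) numbers instead of nm for an n x m data set, which is one possible reading of the storage remark in the abstract. Because R is upper triangular, this plain truncation reproduces the first k columns exactly and perturbs only the remaining ones, so a practical scheme would need an additional step (for example, reordering or mixing the attributes) before it could be considered protective.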

ARTICLE INFORMATION

Received: 2013-07-28
Revised: 2014-07-05
Accepted: 2014-08-18
Available Online: 2014-12-01

Cite this article:

Maheswari, N., Revathi, M. 2014. Data security using decomposition. International Journal of Applied Science and Engineering, 12, 303–312. https://doi.org/10.6703/IJASE.2014.12(4).303
