PERFORMANCE EVALUATION OF THE ID3, C4.5, AND CART DECISION TREE CLASSIFICATION ALGORITHMS ON A DIABETES PATIENT READMISSION DATASET

Mochammad Yusa, Ema Utami, Emha Taufiq Luthfi

Abstract


Classification is one of the techniques used in data mining. Its objective is to predict the target class accurately from the related variables. Many classification algorithms are available; their performance values differ and depend heavily on the number of attributes and records in the dataset. The dataset used in this study concerns the readmission of diabetes patients. Because the dataset still contains missing values, a data preprocessing stage is carried out, after which the dataset consists of 47 attributes and 49,735 records. This study also explores the performance of several Decision Tree classification algorithms on the dataset. The algorithms evaluated are ID3, C4.5, and CART, and the validation technique used is 10-fold Cross Validation. The results show that the C4.5 classification model performs best, with an accuracy of 54.13% and an execution time of 6 seconds.
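The evaluation procedure named in the abstract, 10-fold Cross Validation reporting accuracy and execution time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the synthetic labels, and the majority-class placeholder model are all hypothetical, and a real ID3, C4.5, or CART learner would be plugged in through the `fit` and `predict` arguments.

```python
import random
import time
from collections import Counter


def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and split them into k nearly equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_fit(train_records, train_labels):
    """Hypothetical placeholder learner: remember the most common label."""
    return Counter(train_labels).most_common(1)[0][0]


def majority_predict(model, record):
    """Placeholder predictor: always answer with the remembered label."""
    return model


def cross_validate(records, labels, fit, predict, k=10, seed=0):
    """Return (mean accuracy over k folds, elapsed seconds)."""
    folds = k_fold_indices(len(labels), k, seed)
    start = time.perf_counter()
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # Train on the other k-1 folds, then score on the held-out fold.
        model = fit([records[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(1 for i in test_idx
                      if predict(model, records[i]) == labels[i])
        accuracies.append(correct / len(test_idx))
    elapsed = time.perf_counter() - start
    return sum(accuracies) / k, elapsed


if __name__ == "__main__":
    # Synthetic stand-in data (assumption): 100 records, two made-up classes.
    labels = ["not_readmitted"] * 60 + ["readmitted"] * 40
    records = [{"id": i} for i in range(len(labels))]
    accuracy, seconds = cross_validate(records, labels,
                                       majority_fit, majority_predict)
    print(f"accuracy={accuracy:.4f} execution_time={seconds:.4f}s")
```

Passing the learner in as `fit` and `predict` keeps the fold logic and the timing identical across algorithms, which is what makes accuracy and execution time comparable between them.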

Keywords


Data Mining; Classification; ID3; C4.5; CART; Machine Learning; Performance; Accuracy; Execution Time; Decision Tree






DOI: http://dx.doi.org/10.22303/infosys.4.1.2016.23-34


Editorial Office of InfoSys Journal. LPPM Building, 2nd Floor, Universitas Potensi Utama Campus. Jl. K.L. Yos Sudarso Km 6,5 No.3-A, Tanjung Mulia, Medan 20241. Tel. (061) 6640525 Ext. 214

 


 

Creative Commons License
This work is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.