Betul Districts Primary School Performance Prediction Model Using Data Mining

Manmohan Singh, Anjali Sant



Because academic performance is influenced by many factors, a predictive data mining model of student performance is needed to identify slow learners and to study how the dominant factors influence their academic performance. In the present investigation, a survey-cum-experimental methodology was adopted to generate a database, which was constructed from a primary and a secondary source. The primary data was collected from regular and irregular students, while the secondary data was gathered from the schools for classes 3, 4 and 5. In total, 1000 records for the year 2014 were collected from five different schools in three different districts of Betul, in the state of Madhya Pradesh. The raw data was preprocessed by filling in missing values, transforming values from one form into another, and selecting the relevant attributes (variables). As a result, 700 student records remained, which were used to construct the primary school prediction model. A set of prediction rules was extracted from the model, and the efficiency of the generated student prediction model was evaluated. The accuracy of the present model was compared with that of another model and was found to be satisfactory.


Keywords: student performance, decision tree, data mining, WEKA
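The workflow summarized in the abstract (filling in missing values, transforming values, building a decision tree, and extracting prediction rules) can be illustrated with a minimal sketch. This is not the authors' WEKA model: it is a small ID3-style tree written in plain Python. The attribute names (`attendance`, `tuition`), the ten toy records, and the pass mark of 40 are all assumptions made for illustration and are not taken from the survey data.

```python
from collections import Counter
from math import log2

# Hypothetical toy records: the attribute names, values, and marks are
# illustrative only, not the survey's actual variables. None = missing value.
RAW = [
    ("regular", "yes", 72), ("regular", "no", 65), ("regular", None, 58),
    ("irregular", "yes", 61), ("irregular", "no", 31), (None, "no", 28),
    ("irregular", "no", 35), ("irregular", "yes", 55), ("regular", "no", 49),
    ("irregular", "no", 22),
]
ATTRS = ["attendance", "tuition"]
PASS_MARK = 40  # assumed cutoff, not taken from the paper


def preprocess(raw):
    """Discretize marks into pass/fail and fill missing values with the mode."""
    rows = [dict(zip(ATTRS, r[:2]), result="pass" if r[2] >= PASS_MARK else "fail")
            for r in raw]
    for attr in ATTRS:
        known = Counter(r[attr] for r in rows if r[attr] is not None)
        mode = known.most_common(1)[0][0]
        for r in rows:
            if r[attr] is None:
                r[attr] = mode
    return rows


def entropy(rows):
    """Shannon entropy of the class label over the given rows."""
    counts = Counter(r["result"] for r in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def build_tree(rows, attrs):
    """ID3-style tree: a leaf is a class label, a node is (attr, {value: subtree})."""
    labels = Counter(r["result"] for r in rows)
    if len(labels) == 1 or not attrs:
        return labels.most_common(1)[0][0]

    def remainder(attr):
        # Weighted entropy left after splitting on attr (lower = higher gain).
        groups = Counter(r[attr] for r in rows)
        return sum(n / len(rows) * entropy([r for r in rows if r[attr] == v])
                   for v, n in groups.items())

    best = min(attrs, key=remainder)
    rest = [a for a in attrs if a != best]
    return best, {v: build_tree([r for r in rows if r[best] == v], rest)
                  for v in sorted({r[best] for r in rows})}


def extract_rules(tree, conditions=()):
    """Flatten the tree into IF-THEN rules, one per leaf."""
    if isinstance(tree, str):
        lhs = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        return [f"IF {lhs} THEN result = {tree}"]
    attr, children = tree
    return [rule for v, sub in children.items()
            for rule in extract_rules(sub, conditions + ((attr, v),))]


def predict(tree, row):
    """Walk the tree for one row (raises KeyError on an unseen attribute value)."""
    while not isinstance(tree, str):
        attr, children = tree
        tree = children[row[attr]]
    return tree


rows = preprocess(RAW)
tree = build_tree(rows, ATTRS)
rules = extract_rules(tree)
accuracy = sum(predict(tree, r) == r["result"] for r in rows) / len(rows)
for rule in rules:
    print(rule)
print(f"training accuracy: {accuracy:.2f}")
```

The accuracy printed here is measured on the same records used to build the tree, so it is optimistic. A comparison between models, as the abstract describes, would normally rely on a held-out test set or cross-validation.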




Advanced Research Journals

18K, Street 1st, Gaytri Vihar, Pinto Park, Gwalior, M.P. India (Design) 2009-2016

