Knowledge Discovery from Academic Data
using Association Rule Mining
SUBMITTED BY
Rajshakhar Paul
Student ID: 0805020
Shibbir Ahmed
Student ID: 0805097
Summary of the Thesis
Department of Computer Science and Engineering
BANGLADESH UNIVERSITY OF ENGINEERING AND TECHNOLOGY
i. Introduction
Students are one of the fundamental elements of any academic institution. Indeed, the prime concern of
an educational institution is to ensure a sound technical foundation, scholarly guidance and a high
standard of education for all of its students. A large educational institute such as a public university
generates large volumes of data, so it needs an efficient way to apply data mining techniques to obtain
knowledge for the development and performance improvement of academic activities. The knowledge
acquired from the institutional database can help answer questions such as: Which
factors determine better or worse academic performance of students? What are the causes behind
student retention, i.e., the extended continuation of studies in the university? Why do students drop
out before graduation, i.e., abandon an educational institute? Concepts and techniques
of data mining are essential to discover the hidden knowledge in large datasets.
Bangladesh University of Engineering and Technology (BUET) is the topmost technological university of
Bangladesh. It enrolls the 1000 most brilliant students, selected by a competitive examination
from among one million students completing higher secondary education. Among these 1000 students, the top
ranked can gain admission into the different departments under different faculties. Although this
university has most of the brightest students of Bangladesh, statistics demonstrate that the
performance of some students degrades noticeably. Some students perform
outstandingly at the initial stage of their undergraduate studies but cannot sustain the same level
of excellence until graduation. Others do not perform well initially but
finish with a good academic record. Again, there are some students in
this university who continue their studies year after year and take a very long time to
graduate. Unfortunately, there are also some meritorious students who drop out
before graduation. Statistical analysis alone is not sufficient for finding the reasons behind these
problems in any academic institution. The hidden knowledge inside the institutional academic and
personal data of students is necessary to find the possible causes of these problems and to take
suitable precautions. That is why knowledge discovery and data mining from academic data is
essential for an educational institution like BUET to improve the academic performance of students,
refine the standard of teaching methodologies and reshape decision making for the betterment of the
institution.
Discovering the hidden knowledge in educational data and applying it properly for decision making is
essential for ensuring high quality education in any academic institution. Data mining techniques
are very effective for this. However, not all data mining techniques can be applied directly to academic
data because of its complex structure, so rigorous preprocessing is required. The choice of support and
confidence thresholds and the selection of important association rules from the huge number of generated
rules are other significant problems of knowledge discovery from academic data.
ii. Motivation
In a developing country like Bangladesh, many students from rural areas come to the city for higher
education. They usually leave their families behind and have to adjust to a completely
new environment. They start their new educational life in the institution's hall, with a new living place,
new types of food, new companions and a new atmosphere. They usually need some time to cope both
physically and mentally with all of these new things, which may hamper their educational activities at the
very beginning. The situation is somewhat more difficult for girls than for boys. So they sometimes lag
behind at the beginning of their higher studies, which may have an adverse effect on them in the long run.
On the other hand, city students are more likely to be familiar with the environment, live with their
families and have more educational, technological and psychological opportunities,
which may give them some advantages in higher education. The scenario can be
different, though: these additional opportunities may also distract them and demoralize them in their studies.
In a higher education system like that of BUET, the performance in a course depends on different aspects
such as class attendance, class tests, quizzes, assignments and term final examinations, some of which
start from the very beginning of the class. So if a student gets poor marks in any of these, it may affect
the final result. Later courses are sometimes dependent on previous courses, so a poor
result in one course may affect the performance in other related courses too.
It is therefore important to discover all possible knowledge from academic data to learn the relevant rules
behind students' performance, whether they are doing well or badly. If they cannot perform well,
the reason behind it can also be discovered.
iii. Goal and Objectives
The Department of Computer Science and Engineering (CSE) is one of the prestigious departments of
BUET. Although this department has most of the brightest students of Bangladesh, statistical data
demonstrate that the performance of some students degrades noticeably. Moreover, the problems of retention
and abandonment are also prevalent among the students. The main objectives of this research study are:
- To discover knowledge of students' academic progress from academic performance and personal
  data, through the impact of different course assessments, e.g., class tests, attendance, term final
  examination etc.
- To find the reasons behind the degradation of students' merit, i.e., decay in their potential
- To discover the causes behind extended continuation of studies, i.e., retention of students
- To find out why some meritorious students drop out before graduation, i.e., abandonment of students
iv. Key Techniques used to achieve the Goal
A. Data Analysis
1) Personal and Academic Data
In this research, we have considered the academic data structure of BUET. The student data of the BIIS
(BUET Institutional Information System) contain personal and academic information about each
student. We have collected them anonymously for data preprocessing and data analysis. We
have considered the personal and academic data listed in Table 4.1 for knowledge discovery
regarding the academic performance, abandonment and retention of students, as illustrated in Figure 4.1.
Table 4.1: Selected Data from BIIS database
Academic Information:
  Department
  Admission Year / Batch
  Overall CGPA
  Marks of Class Test, Attendance, Two Answer Scripts, Total Marks and Grades of all Theory Courses
  Total Marks and Grades of all Sessional Courses
  Total Completed Credit Hour
Personal Information:
  Gender
  Hall Resident/Non-resident
Figure 4.1: Factors related to Academic Performance, Abandonment and Retention of students
2) Course and Curriculum
Since we have experimented with the student data of the Department of Computer Science and Engineering
(CSE) in BUET, we have analyzed all the courses in the curriculum that must be taken to complete the
BSc degree. A student has to take 68 departmental and non-departmental courses in total. All the
courses, along with their credit hours, are shown in Table 4.2.
Among them there are 40 theory courses (25 departmental and 15 non-departmental) and 28 sessional
courses (20 departmental and 7 non-departmental, plus the thesis). We determine academic performance
and the impact of other factors on the basis of the final grades of these courses and the marks for
attendance, class tests, term final answer scripts, total marks etc.
Table 4.2: All Undergraduate Courses for department of CSE
Course Type | Credit Hour | Course Number
Departmental Theory Courses | 4.0 | CSE307, CSE321
Departmental Theory Courses | 3.0 | CSE103, CSE105, CSE201, CSE203, CSE205, CSE207, CSE209, CSE303, CSE305, CSE309, CSE311, CSE301, CSE313, CSE315, CSE317, CSE401, CSE403, CSE423, CSE409, CSE461, CSE463
Departmental Theory Courses | 2.0 | CSE100, CSE211
Departmental Sessional Courses | 1.5 | CSE106, CSE202, CSE206, CSE210, CSE214, CSE304, CSE308, CSE314, CSE316, CSE404
Departmental Sessional Courses | 0.75 | CSE204, CSE208, CSE300, CSE310, CSE322, CSE324, CSE402, CSE410, CSE462, CSE464
Non-Departmental Theory Courses | 4.0 | PHY109, MATH143, EEE263, MATH243
Non-Departmental Theory Courses | 3.0 | EEE163, MATH141, ME165, CHEM101, HUM175, MATH241, EEE269, IPE493
Non-Departmental Theory Courses | 2.0 | HUM211, HUM275, HUM371
Non-Departmental Sessional Courses | 1.5 | PHY102, EEE164, ME160, HUM272, CHEM114, EEE264, EEE270
Thesis | 6.0 | CSE400
[Figure 4.1 labels: Academic Performance, Student Retention and Student Abandonment, related to Residence, Gender, Records of all Continuous Assessments, Records of Departmental Courses and Records of Non-Departmental Courses]
B. Preprocessing for Mining Academic Database
1) Relational Database
Students register for courses through their BIIS accounts. The relational database illustrated in Figure
4.2 stores all the personal information of a student as well as the results of the courses taken. From
it we can obtain a relational table containing a student's gender, hall status, performance in all
courses, CGPA etc.
Figure 4.2: Relational database
2) Universal Database
A universal database is created in which the records of all courses taken by a student, along with personal
information such as gender and hall status, are stored in a single row of the table. For
a specific course, the row holds the grade, attendance, marks of class tests, marks of each section
(Section A and Section B) of the term final answer scripts and total marks. The same fields are stored for
every other course taken, following the Gender and Hall Status of the student, and the records of other
students are stored row by row in the same way. Another attribute, Student Type, records whether the
student is regular, retentive or abandoned. To apply the Apriori algorithm of Association Rule Mining,
attribute values must be in discrete form, so identifying fields such as student id have been omitted
from the universal table.
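The flattening step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the BIIS schema is not given in the text, so the record layout, the student keys `s1` and `s2`, and the function name `build_universal_rows` are all assumptions. The sample values are the two rows shown for Table 4.3.

```python
# Hypothetical relational data: personal info per student, and one
# result record per (student, course). Field names are illustrative.
students = {
    "s1": {"Gender": "Male", "Hall_Status": "Resident", "Student_Type": "Regular"},
    "s2": {"Gender": "Female", "Hall_Status": "Non-Resident", "Student_Type": "Regular"},
}
results = [
    ("s1", "CSE103", {"Grade": "A+", "Attendance": 30, "CT": 55,
                      "SectionA": 90, "SectionB": 75, "Total": 250}),
    ("s2", "CSE103", {"Grade": "A", "Attendance": 25, "CT": 45,
                      "SectionA": 85, "SectionB": 70, "Total": 225}),
]


def build_universal_rows(students, results):
    """Flatten per-course records into one wide row per student."""
    rows = {sid: dict(info) for sid, info in students.items()}
    for sid, course, fields in results:
        for name, value in fields.items():
            # Column names follow the Course_Field pattern of Table 4.3.
            rows[sid][f"{course}_{name}"] = value
    # The student id is used only as a join key and is not kept as a column.
    return list(rows.values())


universal = build_universal_rows(students, results)
```

Each additional course simply contributes six more columns to the same row, which is why the real table grows very wide for 68 courses.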
Table 4.3: Partial portion of universal database
3) Data Transformation
The universal database of Table 4.3 has been converted into an equivalent transformed table by
replacing each continuous valued attribute with a discrete valued attribute, so that the
Apriori algorithm of Association Rule Mining can be applied. For example, CGPA is a
continuous attribute and it has been transformed into five classes: excellent, very good, good,
average and poor. We have used one algorithm to transform all continuous marks for attendance,
class tests, both sections of the term final answer scripts and the total marks of a course. We have used
another algorithm to transform all grades or grade points of courses and the overall CGPA into those five
classes.
Table 4.3 (contents):
Gender | Hall_Status | Student_Type | CSE103_Grade | CSE103_Attendance | CSE103_CT | CSE103_SectionA | CSE103_SectionB | CSE103_Total | …
Male | Resident | Regular | A+ | 30 | 55 | 90 | 75 | 250 | …
Female | Non-Resident | Regular | A | 25 | 45 | 85 | 70 | 225 | …
… | … | … | … | … | … | … | … | … | …
[Figure 4.2 labels: entities Student, Grade Sheet and Course; relationships "achieves" and "represents"]
To transform the numeric fields of the universal table, i.e., attendance, class tests, Section A, Section B
and total marks of each course, Algorithm 1 has been developed. It populates the transformed table so
that no entry holds a continuous value.
Similarly, the grades of the universal table are transformed by Algorithm 2. As the
real data set stores CGPA in grade points, we likewise transform the continuous CGPA values
into these five classes.
As there are theory courses of 4.0, 3.0 and 2.0 credit hours and sessional courses of 1.5 and 0.75 credit
hours, we need different transformation rule tables for these different courses. The transformation rules
for 3.0 credit hour (Table 4.4), 4.0 credit hour (Table 4.5) and 2.0 credit hour (Table 4.6) theory courses
and for all sessional courses (Table 4.7) are given below.
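The ranges in Tables 4.4 to 4.7 appear to follow from percentage cut-offs applied to the full marks of each component. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes lower bounds are rounded down to whole marks, takes the 80/75/60/50 cut-offs from Algorithm 1, and infers 90/80/70/60 for attendance from the attendance column of the tables. The names `lower_bounds`, `MARK_CUTOFFS` and `ATTENDANCE_CUTOFFS` are hypothetical.

```python
LEVELS = ["Excellent", "Very Good", "Good", "Average"]

# Percentage cut-offs from Algorithm 1 (class test, sections, total).
MARK_CUTOFFS = [80, 75, 60, 50]
# Assumption: inferred from the attendance column of Tables 4.4 to 4.6.
ATTENDANCE_CUTOFFS = [90, 80, 70, 60]


def lower_bounds(max_marks, cutoffs=MARK_CUTOFFS):
    """Return the smallest whole mark for each level, rounding down."""
    bounds = {
        level: (pct * max_marks) // 100
        for level, pct in zip(LEVELS, cutoffs)
    }
    bounds["Poor"] = 0  # everything below the "Average" bound
    return bounds


total_3_credit = lower_bounds(300)                      # Total, 3.0 credit
section_3_credit = lower_bounds(105)                    # SecA/SecB, 3.0 credit
attendance_3_credit = lower_bounds(30, ATTENDANCE_CUTOFFS)
```

For full marks of 300 this gives lower bounds of 240, 225, 180 and 150, which matches the Total column of Table 4.4; for 105 it gives 84, 78, 63 and 52, matching the SecA/SecB column.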
Algorithm 1: Marks_Transformation ( )
Input: marks of Attendance, CT, Section A, Section B, Total Marks of each course from Universal
Table of Studentlist
Output: discrete level of marks for the Transformation Table
for i=1 to | Studentlist |
    // marks: each marks field of student i, as a percentage of full marks
    if (marks>=80%)
        level = "Excellent"
    else if (marks>=75%)
        level = "Very Good"
    else if (marks>=60%)
        level = "Good"
    else if (marks>=50%)
        level = "Average"
    else
        level = "Poor"
end for
Algorithm 2: Grade_Transformation ( )
Input: all acquired grades of each course in the Courselist of the universal table
Output: transformed_grade for the Transformation Table
for i=1 to | Courselist |
    if grade = A+
        transformed_grade = "Excellent"
    else if grade = A
        transformed_grade = "Very Good"
    else if grade = A- or B+
        transformed_grade = "Good"
    else if grade = B
        transformed_grade = "Average"
    else if grade = B- or C+ or C or D
        transformed_grade = "Poor"
end for
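The two algorithms can be written as small functions. This is a sketch rather than the authors' code: `marks_level` follows the percentage thresholds of Algorithm 1 literally, so a borderline mark can land in a different class than in the range tables where bounds were rounded to whole marks, and it does not apply the separate attendance cut-offs visible in those tables. Passing unlisted grades such as F through unchanged is an assumption, based on the rules in the results that refer to a grade of F. The function and constant names are hypothetical.

```python
def marks_level(marks, max_marks):
    """Algorithm 1: map a raw mark to one of five levels by percentage."""
    pct = 100 * marks / max_marks
    if pct >= 80:
        return "Excellent"
    if pct >= 75:
        return "Very Good"
    if pct >= 60:
        return "Good"
    if pct >= 50:
        return "Average"
    return "Poor"


# Algorithm 2: letter grade to level.
GRADE_LEVEL = {
    "A+": "Excellent",
    "A": "Very Good",
    "A-": "Good", "B+": "Good",
    "B": "Average",
    "B-": "Poor", "C+": "Poor", "C": "Poor", "D": "Poor",
}


def grade_level(grade):
    """Return the level for a grade; unlisted grades (e.g. F) pass through."""
    return GRADE_LEVEL.get(grade, grade)
```

Applied to the first sample row of the universal table, `marks_level(250, 300)` gives "Excellent" and `marks_level(75, 105)` gives "Good", which agrees with the corresponding entries of the transformed table.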
Table 4.4: Transformation rule table for 3.0 credit theory course
Range of Marks (M)
Classified Name | Attendance | Class Test | SecA/SecB | Total
Excellent | 27≤M≤30 | 48≤M≤60 | 84≤M≤105 | 240≤M≤300
Very Good | 24≤M≤26 | 45≤M≤47 | 78≤M≤83 | 225≤M≤239
Good | 21≤M≤23 | 36≤M≤44 | 63≤M≤77 | 180≤M≤224
Average | 18≤M≤20 | 30≤M≤35 | 52≤M≤62 | 150≤M≤179
Poor | 0≤M≤17 | 0≤M≤29 | 0≤M≤51 | 0≤M≤149

Table 4.5: Transformation rule table for 4.0 credit theory course
Range of Marks (M)
Classified Name | Attendance | Class Test | SecA/SecB | Total
Excellent | 36≤M≤40 | 64≤M≤80 | 112≤M≤140 | 320≤M≤400
Very Good | 32≤M≤35 | 60≤M≤63 | 105≤M≤111 | 300≤M≤319
Good | 28≤M≤31 | 48≤M≤59 | 84≤M≤104 | 240≤M≤299
Average | 24≤M≤27 | 40≤M≤47 | 70≤M≤83 | 200≤M≤239
Poor | 0≤M≤23 | 0≤M≤39 | 0≤M≤69 | 0≤M≤199

Table 4.6: Transformation rule table for 2.0 credit theory course
Range of Marks (M)
Classified Name | Attendance | Class Test | SecA/SecB | Total
Excellent | 18≤M≤20 | 32≤M≤40 | 56≤M≤70 | 160≤M≤200
Very Good | 16≤M≤17 | 30≤M≤31 | 52≤M≤55 | 150≤M≤159
Good | 14≤M≤15 | 24≤M≤29 | 42≤M≤51 | 120≤M≤149
Average | 12≤M≤13 | 20≤M≤23 | 35≤M≤41 | 100≤M≤119
Poor | 0≤M≤11 | 0≤M≤19 | 0≤M≤34 | 0≤M≤99

Table 4.7: Transformation rule table for all sessional courses
Range of Marks (M)
Classified Name | Sessional Credit Hour=1.5 | Sessional Credit Hour=0.75
Excellent | 120≤M≤150 | 60≤M≤75
Very Good | 112≤M≤119 | 56≤M≤59
Good | 90≤M≤111 | 45≤M≤55
Average | 75≤M≤89 | 37≤M≤44
Poor | 0≤M≤74 | 0≤M≤36

To construct the entire transformed table given in Table 4.8, we have used the universal table and
the above transformation rules.

Table 4.8: Transformed table from universal table
Gender | Hall_Status | Student_Type | CSE103_Grade | CSE103_Attendance | CSE103_CT | CSE103_SectionA | CSE103_SectionB | CSE103_Total | …
Male | Resident | Regular | Excellent | Excellent | Excellent | Excellent | Good | Excellent | …
Female | Non-resident | Regular | Very Good | Very Good | Very Good | Excellent | Good | Very Good | …
… | … | … | … | … | … | … | … | … | …
4) Dataset and Application Environment
In this experiment, we have considered the data up to the last five graduated batches of the Department of
CSE, BUET. The institutional dataset of BUET consists of academic and personal data of 9210 students over
the last 10 years. We have extracted the relevant academic and personal information of those students,
namely gender, hall status, admission year, completed credit hours, all records of theory and sessional
courses, overall CGPA etc., from the relational BIIS database and converted it into the universal table
structure. Finally we converted it into the transformed table structure for applying association rule
mining. The entire experimental setup is illustrated in Figure 4.3.
Figure 4.3: Experimental Setup for applying Apriori Algorithm using Weka Explorer to generate
Association Rules
After the preprocessing step, we have obtained a transformed table of 582 students of the Department of CSE
who have already graduated. The universal table also contains one additional attribute, student type, which
is retentive, regular or abandoned. Student type is obtained by analyzing completed credit hours and
admission year. The transformation table contains all continuous data transformed
into five discrete values: Excellent, Very Good, Good, Average and Poor. Finally we have loaded the
transformation table (in .csv file format) into Weka Explorer to generate interesting Association Rules.
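To make the rule generation step concrete, the following is a plain Python sketch of Apriori over rows of discrete attribute values. It is not Weka's implementation and not the authors' code; the function names `apriori` and `rules` and the four toy rows are made up for illustration and are not taken from the BUET data.

```python
from itertools import combinations


def apriori(rows, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    rows: list of dicts mapping an attribute name to a discrete value.
    """
    transactions = [frozenset(f"{k}={v}" for k, v in row.items()) for row in rows]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, support, confidence) tuples."""
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                conf = supp / frequent[ante]
                if conf >= min_confidence:
                    yield ante, itemset - ante, supp, conf


# Toy rows, invented for illustration only.
rows = [
    {"Gender": "male", "CGPA": "Poor"},
    {"Gender": "male", "CGPA": "Poor"},
    {"Gender": "male", "CGPA": "Good"},
    {"Gender": "female", "CGPA": "Good"},
]
frequent = apriori(rows, min_support=0.5)
found = list(rules(frequent, min_confidence=0.6))
```

Support is the fraction of rows containing all items of a rule, and confidence is that support divided by the support of the antecedent, which is how the two columns in the result tables are read.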
[Figure content (experimental setup): the BUET institutional dataset of 9210 students of all departments in the last 10 years (Gender, Hall Status, Admission Year, Completed Credit Hour, all records of theory and sessional courses, Overall CGPA) is converted to the universal table structure and then to the transformation table structure. Both tables cover 582 students: Student Type (Regular 552, Retentive 26, Abandoned 4), Gender (Male 473, Female 109) and Hall Status (Resident 348, Non-Resident 234). The universal table holds Attendance, Class Test, Section A, Section B, Total and Grade for 40 theory courses and Total Marks and Grade for 28 sessional courses. The transformation table holds all marks and grades of the 68 courses and the overall CGPA as Poor, Average, Good, Very Good or Excellent.]
v. Main Results and Discussions
1) Impact of Gender
We have found an impact of gender on overall academic performance. This indication is very
important in terms of the socio-economic condition of the country. In BUET the majority of the students
are male and live in the university dormitories. There are multiple factors that affect the academic
environment and students' academic performance. The results in Table 5.1 show that the rule linking poor
CGPA to male students has a very high confidence level. A possible reason is that male students are
generally affected by various societal problems of a third world country like Bangladesh. The other rules
support that the academic performance of female students is better than that of male students.
Table 5.1: Impact of Gender
No. | Generated Interesting Rules | Minimum Support | Confidence
01 | CGPA=Poor ⇒ Gender=male | 10% | 87%
02 | CGPA=Average ⇒ Gender=male | 10% | 79%
03 | CGPA=Very Good ⇒ Gender=male | 10% | 83%
04 | Gender=male ⇒ CGPA=Good | 10% | 26%
05 | Gender=male ⇒ CGPA=Average | 10% | 21%
06 | CGPA=Good ⇒ Gender=female | 5% | 22%
07 | CGPA=Average ⇒ Gender=female | 5% | 21%
08 | CGPA=Excellent ⇒ Gender=female | 5% | 20%
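One way to read these confidence values is against the overall share of the consequent in the data. The sketch below is an added illustration, not part of the thesis's analysis: it uses lift, a standard association rule measure that the text does not report, together with the gender counts given with the experimental setup (473 male out of 582 students). The function name `lift` is hypothetical.

```python
TOTAL_STUDENTS = 582
MALE_STUDENTS = 473  # gender counts reported with the experimental setup


def lift(confidence, consequent_count, total=TOTAL_STUDENTS):
    """Lift = confidence / P(consequent); a value of 1.0 means no association."""
    return confidence / (consequent_count / total)


base_rate_male = MALE_STUDENTS / TOTAL_STUDENTS   # share of male students overall
lift_rule_01 = lift(0.87, MALE_STUDENTS)          # CGPA=Poor => Gender=male
lift_rule_02 = lift(0.79, MALE_STUDENTS)          # CGPA=Average => Gender=male
```

Since about 81% of all students are male, a confidence of 87% for rule 01 corresponds to a lift only slightly above 1, while rule 02 falls slightly below 1. A comparison of this kind can help when judging how strongly a rule departs from the overall composition of the dataset.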
2) Impact of Residence
In BUET, most of the students live in the institution's halls, but the number of students living at home is
also significant. Analyzing the rules, we have found that both hall residents and students residing
at home get good CGPA with decent minimum support and confidence (Table 5.2). So a student who
wants to do well academically can do so from anywhere.
Table 5.2: Impact of Hall Status
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CGPA=Average ⇒ Hall_Status=Resident | 10% | 65%
02 | CGPA=Very Good ⇒ Hall_Status=Resident | 10% | 63%
03 | CGPA=Good ⇒ Hall_Status=Non-Resident | 10% | 43%
04 | CGPA=Good, Hall_Status=Resident ⇒ Gender=male | 10% | 82%
However, the percentage of students getting poor CGPA is high in the halls. In a hall there is very little
restriction, and sometimes there is no one to take care of a student as family members do. So a student can
become demoralized and get a very poor grade due to lack of study. As shown in rule number 1 of Table
5.3, the percentage of male resident students is higher in this regard. In most cases,
the poor CGPA holders are hall residents (rules number 1 and 5 of Table 5.3).
Table 5.3: Impact of Hall Status and Gender
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CGPA=Poor, Gender=male ⇒ Hall_Status=Resident | 5% | 51%
02 | CGPA=Very Good, Gender=male ⇒ Hall_Status=Non-Resident | 5% | 40%
03 | Hall_Status=Non-Resident, Gender=female ⇒ CGPA=Average | 5% | 24%
04 | Hall_Status=Resident, Gender=female ⇒ CGPA=Good | 5% | 21%
05 | CGPA=Poor ⇒ Hall_Status=Resident | 5% | 52%
3) Correlation between Courses
The analyzed Association Rules show that the grade of one course may depend on prerequisite courses. In
rule number 1 we find that a student who gets an excellent grade in CSE105 also gets an excellent grade in
CSE201 with a confidence of 0.48, where CSE105 is the Structured Programming Language
course and CSE201 is the Object Oriented Programming Language course. We also discover the
interrelation of courses CSE311 (Data Communication-I) and CSE321 (Networking) in rules number 6, 7
and 8. We also find the impact of courses CSE205 (Digital Logic Design) and CSE209 (Digital Electronics
and Pulse Technique) on course CSE403 (Digital System Design) in rule number 9 of Table 5.4.
Table 5.4: Correlation between Courses
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CSE105_Grade=Excellent ⇒ CSE201_Grade=Excellent | 10% | 48%
02 | CSE201_Grade=Very Good ⇒ CSE105_Grade=Very Good | 5% | 30%
03 | EEE163_Grade=Excellent ⇒ EEE263_Grade=Very Good | 5% | 27%
04 | CSE205_Grade=Excellent ⇒ CSE403_Grade=Excellent | 10% | 50%
05 | CSE403_Grade=Poor ⇒ CSE205_Grade=Average | 5% | 28%
06 | CSE321_Grade=Average ⇒ CSE311_Grade=Average | 5% | 36%
07 | CSE321_Grade=F ⇒ CSE311_Grade=Poor | 3% | 13%
08 | CSE321_Grade=Poor ⇒ CSE311_Grade=Poor | 3% | 16%
09 | CSE205_Grade=Very Good, CSE209_Grade=Excellent ⇒ CSE403_Grade=Excellent | 5% | 53%
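Rules of this kind have to be picked out of a much larger generated set. The sketch below shows one possible way to keep only rules that relate course grades to each other and to rank them by confidence. It is an illustration only: the text does not describe how the authors selected their rules, and the helper names `is_course_grade` and `course_rules` are hypothetical. The sample rules are copied from Tables 5.1 and 5.4.

```python
# (antecedent items, consequent items, confidence), from Tables 5.1 and 5.4.
RULES = [
    ({"CGPA=Poor"}, {"Gender=male"}, 0.87),
    ({"CSE105_Grade=Excellent"}, {"CSE201_Grade=Excellent"}, 0.48),
    ({"CSE321_Grade=F"}, {"CSE311_Grade=Poor"}, 0.13),
    ({"CSE205_Grade=Very Good", "CSE209_Grade=Excellent"},
     {"CSE403_Grade=Excellent"}, 0.53),
]


def is_course_grade(item):
    """True for items such as 'CSE105_Grade=Excellent'."""
    attribute = item.split("=", 1)[0]
    return attribute.endswith("_Grade")


def course_rules(rules, min_confidence=0.0):
    """Keep rules that involve course grades only, strongest first."""
    kept = [
        r for r in rules
        if all(is_course_grade(item) for item in r[0] | r[1])
        and r[2] >= min_confidence
    ]
    return sorted(kept, key=lambda r: r[2], reverse=True)


selected = course_rules(RULES, min_confidence=0.4)
```

With a confidence threshold of 0.4, only the CSE205/CSE209 ⇒ CSE403 rule and the CSE105 ⇒ CSE201 rule remain, in that order.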
4) Impact on Retention
If a student fails any course, he or she becomes retentive because the course must be taken
again later to complete graduation. We find that retentive students usually struggle with their grades
(rules number 2, 3, 4, 5 and 6). If a student has not passed CSE100, which is the first fundamental course
of CSE, he or she is often retentive, i.e., has not passed later departmental courses either. This is
illustrated by rule number 1 of Table 5.5. Moreover, we have discovered that most
retentive students are hall residents and male, as illustrated with high confidence by rules number 7 and 8
respectively in Table 5.5.
Table 5.5: Impact on Retention
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CSE100_Grade=F ⇒ Student Type=Retentive | 5% | 42%
02 | Student Type=Retentive ⇒ MATH243_Grade=Poor | 5% | 35%
03 | Student Type=Retentive ⇒ CSE205_Grade=Average | 5% | 35%
04 | Student Type=Retentive ⇒ CSE311_Grade=Average | 5% | 27%
05 | Student Type=Retentive ⇒ EEE263_Grade=Poor | 5% | 33%
06 | Student Type=Retentive ⇒ CSE409_Grade=Average | 5% | 43%
07 | Student Type=Retentive ⇒ Hall_Status=Resident | 5% | 65%
08 | Student Type=Retentive ⇒ Gender=male | 5% | 81%
5) Impact on Abandonment
Students who have given up their academic studies without completing all the required courses are
typed as 'abandoned'. Analyzing the rules in Table 5.6, we discover with high
confidence that the abandoned students are male and hall residents. However, the minimum support is
very low, which shows that the rate of abandonment is very low in the CSE department of this
university.
Table 5.6: Impact on Abandonment
No | Generated Interesting Rules | Minimum Support | Confidence
01 | Student Type=Abandoned ⇒ Gender=male | 0.5% | 100%
02 | Student Type=Abandoned ⇒ Hall_Status=Resident | 0.5% | 75%
03 | Student Type=Abandoned ⇒ Gender=male, Hall_Status=Resident | 0.5% | 75%
6) Impact of Continuous Assessment
The grade of a course depends on various aspects such as marks for attendance, class tests and both
sections of the term final examination. Rule number 7, which has the maximum confidence value of 1.00,
associates an excellent grade in a course with excellent performance in the other components
of continuous assessment. Again, the performance in class tests depends on attendance, as illustrated
by rule number 5 of Table 5.7 with a very high confidence of 0.95.
Table 5.7: Impact of Continuous Assessment
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CSE103_Attendance=Excellent, CSE103_SectionB=Poor ⇒ CSE103_Grade=Average | 10% | 63%
02 | CSE103_Grade=Very Good, CSE103_CT=Good ⇒ CSE103_Attendance=Excellent | 10% | 97%
03 | EEE163_Grade=Average ⇒ EEE163_SectionB=Poor | 10% | 57%
04 | EEE163_Grade=Very Good ⇒ EEE163_Attendance=Excellent, EEE163_CT=Excellent | 10% | 67%
05 | HUM275_CT=Excellent ⇒ HUM275_Attendance=Excellent | 10% | 95%
06 | HUM275_CT=Excellent, HUM275_SectionA=Good ⇒ HUM275_Grade=Very Good, HUM275_Attendance=Excellent | 10% | 75%
07 | CSE401_Grade=Excellent, CSE401_CT=Excellent, CSE401_SectionA=Excellent ⇒ CSE401_Attendance=Excellent | 10% | 100%
08 | CSE401_SectionB=Excellent ⇒ CSE401_Grade=Good | 10% | 75%
7) Impact of Non Departmental Courses
After analyzing the generated Association Rules (Table 5.8), we observed various impacts of
non-departmental courses on academic performance. According to the curriculum, students need to take some
non-departmental courses, and the performance in them is added to the final result. So it may happen that
some students get poor grades in those non-departmental courses. According to the generated rules, good
performance in non-departmental courses brings a good grade, but poor grades in
non-departmental courses cause less harm to the final CGPA, because those courses are fewer in number
and most of them are studied at the beginning of the undergraduate level. So students get enough
opportunities to improve their CGPA later.
Table 5.8: Impact of Non Departmental Courses
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CGPA=Very Good ⇒ HUM272_Grade=Very Good | 10% | 73%
02 | CGPA=Very Good ⇒ MATH143_Grade=Average | 5% | 37%
03 | CGPA=Good ⇒ EEE163_Grade=Average | 5% | 36%
04 | CGPA=Very Good ⇒ CHEM101_Grade=Average | 10% | 52%
05 | CGPA=Average ⇒ IPE493_Grade=Very Good | 5% | 29%
06 | CGPA=Good ⇒ ME165_Grade=Average | 10% | 43%
07 | CGPA=Average ⇒ MATH243_Grade=Poor | 5% | 27%
8) Impact of Departmental Courses
As many departmental courses are studied and some of them are interconnected through
prerequisites, the results of departmental courses affect the final CGPA very
much. From the analyzed rules, it is found that good grades in departmental courses bring good
CGPA. On the other hand, poor grades in departmental courses result in poor overall CGPA. This
significant knowledge is discovered from the rules on the impact of departmental courses in
Table 5.9.
Table 5.9: Impact of Departmental Courses
No | Generated Interesting Rules | Minimum Support | Confidence
01 | CGPA=Very Good ⇒ CSE100_Grade=Very Good | 5% | 42%
02 | CGPA=Very Good ⇒ CSE105_Grade=Average | 5% | 31%
03 | CGPA=Very Good ⇒ CSE206_Grade=Very Good | 10% | 44%
04 | CGPA=Good ⇒ CSE303_Grade=Average | 5% | 31%
05 | CGPA=Poor ⇒ CSE321_Grade=Poor | 5% | 29%
06 | CGPA=Excellent ⇒ CSE401_Grade=Excellent | 5% | 50%
07 | CGPA=Average ⇒ CSE401_Grade=Average | 5% | 29%
08 | CGPA=Average ⇒ CSE409_Grade=Average | 5% | 42%
vi. Conclusions
Knowledge discovery from academic data is very important for improving the academic performance of any
higher educational institution. In this research, we study the academic system, the existing problems and
the performance data of the most renowned engineering university of Bangladesh. We have found
problems such as abandonment, retention and decay of potential among the most brilliant students. We have
applied the Association Rule Mining technique to explore the root causes of these problems.
Before applying the data mining algorithm, the existing academic data have been preprocessed to make them
suitable for data mining. We have developed a data transformation technique that transforms the
relational database into an equivalent universal relational format. In this format, we have also transformed
the continuous data into discrete valued qualitative data. We have found interesting Association Rules
by applying the Apriori Association Rule generator of the WEKA tool to the transformed data. From the large
number of association rules, we have extracted the interesting rules regarding the impacts of gender,
residence and continuous assessment on academic performance. We have also found associations
among the courses, retention and abandonment. The obtained results are significant
for decision makers seeking to improve the overall academic condition of the institution.
According to the results found, 10% of the 582 students of the CSE department who have already graduated
are male and have CGPA below 3.00, and the probability of being male among poor CGPA holders
is 0.87. Again, we have discovered that 5% of all students have poor CGPA and are hall residents,
and the probability of being a hall resident among poor CGPA holders is 0.52. We have also discovered
significant correlation between courses. For example, more than 58 students have excellent grades in both
CSE105 (Structured Programming Language) and CSE201 (Object Oriented Programming Language).
The probability of having an excellent grade in CSE201 among students having an excellent grade in CSE105
is 0.48. We have found that there are about 30 students who have to retake the MATH243 course. We found
that 5% of total male students are both retentive and hall resident and 65% of all retentive students are
hall residents. The abandonment rate is very low in the CSE department of BUET: only 4 students, all male,
dropped out before completing graduation, and 75% of them (3 students) were hall residents.
We have also determined the impact of several non-departmental courses. For example, more than 60
students have a very good grade in HUM272 as well as CGPA over 3.50. We have also determined
the impact of several departmental courses. For example, 5% of the 582 students have CGPA over 3.75 and
have got A+ in CSE401, and 50% of students having CGPA over 3.75 have obtained A+ in CSE401.
We hope these quantitative findings will be helpful to decision makers for improving the quality of
education provided in this department. We have applied the technique only to the CSE department of
BUET, but it is applicable to any department of any higher educational institute.

Thesis summary knowledge discovery from academic data using association rule mining

  • 1. Knowledge Discovery from Academic Data using Association Rule Mining SUBMITTED BY Rajshakhar Paul Student ID: 0805020 Shibbir Ahmed Student ID: 0805097 Summary of the Thesis Department of Computer Science and Engineering BANGLADESH UNIVERSITY OF ENGINEERING AND TECHNOLOGY
  • 2. Page | 1 i. Introduction Students are one of the fundamental elements of any academic institution. Indeed, the prime concern for an educational institution is to ensure qualified technical foundation, scholarly guidance and high standard education to all of its students. For a large educational institute like public university which generates large volumes of data, it requires an efficient way to apply data mining techniques for obtaining knowledge on the development and performance improvement of academic activities. The knowledge acquired from the institutional database will be sufficient to look for answers to such questions as: Which factors determine better or worse academic performance of students? What are the causes behind the students' retention i.e., the extended continuation of the studies in the university? Why do students drop out before graduation i.e., students‟ abandonment from an educational institute. Concepts and techniques of data mining are essential to discover the hidden knowledge from large datasets. Bangladesh University of Engineering and Technology (BUET) is the topmost technological university of Bangladesh and it enrolls the top most brilliant 1000 students selected by a competitive examination among one million students competing higher secondary education. Among these 1000 students, top ranked students can get admission into the different departments under different faculties. Although, this university possesses most of the brightest students of Bangladesh, statistics demonstrates that performance of some students degrades noticeably. On the other hand, some students perform outstandingly at the initial stage of the undergraduate studies but they can not demonstrate the same level of excellence till the completion of their graduation. Some students can not perform well initially but at the end of their graduation they possess pretty good academic career. 
Again, there are some students in this university who have to continue their studies year after year and take a very long time for the completion of their graduation. Unfortunately there are also some meritorious students who drop out before the graduation. Only statistical analysis is not sufficient for finding the reasons of all the above problems in any academic institution. The hidden knowledge inside the institutional academic and personal data of students is necessary to find out the possible causes of all these problems and take suitable precaution for them. That is why knowledge discovery and data mining form academic data is essential for educational institution like BUET to improve academic performance of students as well as refine the standard of teaching methodologies and reshape the decision makings for the betterment of the institution. Discovering the hidden knowledge from educational data and applying it properly for decision making is essential for ensuring high quality education in any academic institution. For this, data mining techniques are very effective. But all the data mining techniques can not be applied directly on academic data because of complex structure. This requires rigorous preprocessing. The choice of support and confidence, selection of important association rules from huge number of generated rules are other significant problems of knowledge discovery from academic data. ii. Motivation In a developing country like Bangladesh, too many students from rural area come to city for higher education. They usually come to city leaving their family and have to accommodate with a completely new environment. They start their new educational life at institution‟s hall. New living place, new types of foods, new companions, new atmosphere. It is seen that they usually need some time to cope up both physically and mentally with all of these new things which may hamper their educational activities at the very beginning. 
The scenario is a bit more difficult for girls than for boys. As a result, these students sometimes lag behind at the start of their higher studies, which may have an adverse effect on them in the long run. City students, on the other hand, are more likely to be familiar with the environment, live with their families and have more educational, technological and psychological opportunities, which may give them some advantages in higher education. The scenario can also be the opposite: more opportunities may drive them off track and demoralize them in their studies.
  • 3. In a higher education system like BUET's, performance in a course depends on different aspects such as class attendance, class tests, quizzes, assignments and term final examinations, some of which start from the very beginning of the class. If a student gets poor marks in any of these, it may affect the final result. Later courses sometimes depend on previous courses, so a poor result in one course may affect performance in related courses too. It is therefore important to discover all possible knowledge from academic data in order to know the relevant rules behind students' performance, whether they are doing well or badly. If they cannot perform well, the reason behind it can also be discovered.
iii. Goal and Objectives
The department of Computer Science and Engineering (CSE) is one of the prestigious departments of BUET. Although this department has most of the brightest students of Bangladesh, statistical data show that the performance of some students degrades noticeably. Moreover, the problems of retention and abandonment are also prevalent among the students. The main objectives of this research study are:
- To discover knowledge about students' academic progress from academic performance and personal statistics, through the impact of the different assessments of courses, e.g., class tests, attendance and term final examinations
- To find the reasons behind the degradation of students' merit, i.e., the decay of their potential
- To discover the causes of extended continuation before graduation, i.e., retention of students
- To find out why some meritorious students drop out before graduation, i.e., abandonment of students
iv. Key Techniques used to achieve the Goal
A. Data Analysis
1) Personal and Academic Data
In this research, we have considered the academic data structure of BUET. The student data of the BIIS (BUET Institutional Information System) contains personal and academic information about each student. We have collected these data anonymously for preprocessing and analysis. We have considered the personal and academic data listed in Table 4.1 for knowledge discovery regarding academic performance, abandonment and retention of students, as illustrated in Figure 4.1.
Table 4.1: Selected Data from BIIS database
| Category | Selected data |
| Academic Information | Department |
| Academic Information | Admission Year / Batch |
| Academic Information | Overall CGPA |
| Academic Information | Marks of Class Test, Attendance, Two Answer Scripts, Total Marks and Grades of all Theory Courses |
| Academic Information | Total Marks and Grades of all Sessional Courses |
| Academic Information | Total Completed Credit Hour |
| Personal Information | Gender |
| Personal Information | Hall Resident/Non-resident |
  • 4. Figure 4.1: Factors related to Academic Performance, Abandonment and Retention of students (factors shown: residence, gender, records of all continuous assessments, records of departmental courses, records of non-departmental courses)
2) Course and Curriculum
As we have experimented with the data of students of the department of Computer Science and Engineering (CSE) in BUET, we have analyzed all the courses in the curriculum that must be taken to complete the BSc degree. A student has to take 68 departmental and non-departmental courses in total. All the courses along with their credit hours are shown in Table 4.2. Among them there are 40 theory courses (25 departmental and 15 non-departmental) and 28 sessional courses (20 departmental, 7 non-departmental and the thesis). We determine academic performance and the impact of other factors on the basis of these courses' final grades and the marks for attendance, class tests, term final answer scripts, total marks etc.
Table 4.2: All Undergraduate Courses for department of CSE
| Course Type | Credit Hour | Course Number |
| Departmental Theory Courses | 4.0 | CSE307, CSE321 |
| Departmental Theory Courses | 3.0 | CSE103, CSE105, CSE201, CSE203, CSE205, CSE207, CSE209, CSE303, CSE305, CSE309, CSE311, CSE301, CSE313, CSE315, CSE317, CSE401, CSE403, CSE423, CSE409, CSE461, CSE463 |
| Departmental Theory Courses | 2.0 | CSE100, CSE211 |
| Departmental Sessional Courses | 1.5 | CSE106, CSE202, CSE206, CSE210, CSE214, CSE304, CSE308, CSE314, CSE316, CSE404 |
| Departmental Sessional Courses | 0.75 | CSE204, CSE208, CSE300, CSE310, CSE322, CSE324, CSE402, CSE410, CSE462, CSE464 |
| Non-Departmental Theory Courses | 4.0 | PHY109, MATH143, EEE263, MATH243 |
| Non-Departmental Theory Courses | 3.0 | EEE163, MATH141, ME165, CHEM101, HUM175, MATH241, EEE269, IPE493 |
| Non-Departmental Theory Courses | 2.0 | HUM211, HUM275, HUM371 |
| Non-Departmental Sessional Courses | 1.5 | PHY102, EEE164, ME160, HUM272, CHEM114, EEE264, EEE270 |
| Thesis | 6.0 | CSE400 |
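The course counts quoted above (68 courses, 40 theory, 28 sessional including the thesis) can be cross-checked against Table 4.2 with a few lines of arithmetic. The sketch below is only a sanity check: the per-row counts were tallied by hand from the table, the dictionary layout and the helper `count_courses` are illustrative names, and the credit total it derives is not a figure quoted in the thesis summary.

```python
# (course type, credit hours) -> number of courses, tallied by hand from Table 4.2.
COURSE_COUNTS = {
    ("departmental theory", 4.0): 2,
    ("departmental theory", 3.0): 21,
    ("departmental theory", 2.0): 2,
    ("departmental sessional", 1.5): 10,
    ("departmental sessional", 0.75): 10,
    ("non-departmental theory", 4.0): 4,
    ("non-departmental theory", 3.0): 8,
    ("non-departmental theory", 2.0): 3,
    ("non-departmental sessional", 1.5): 7,
    ("thesis", 6.0): 1,
}


def count_courses(keyword):
    """Count courses whose type label contains the given keyword."""
    return sum(n for (kind, _), n in COURSE_COUNTS.items() if keyword in kind)


theory = count_courses("theory")
# The summary counts the thesis among the sessional courses.
sessional = count_courses("sessional") + count_courses("thesis")
# Derived here for illustration; not stated in the summary.
total_credits = sum(credit * n for (_, credit), n in COURSE_COUNTS.items())

print(theory, sessional, theory + sessional, total_credits)
```

The three course counts agree with the text; the credit total is a by-product of the same table and should be treated as a derived value.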
  • 5. B. Preprocessing for Mining Academic Database
1) Relational Database
Students register for courses through their BIIS accounts. The relational database illustrated in Figure 4.2 stores all the personal information of a student as well as the results of the courses taken. From it we can obtain a relational table containing a student's gender, hall status, performance in all courses, CGPA etc.
Figure 4.2: Relational database (diagram labels: Student, Grade Sheet, Course, achieves, represents)
2) Universal Database
A universal database is created in which the records of all courses taken, along with personal information such as gender and hall status of the corresponding student ID, are stored in a single row of the table. For a specific course, the row stores the grade, attendance, class test marks, marks of each section (Section A and Section B) of the term final answer scripts, and total marks. The same records for all other courses taken are stored in that row under the corresponding student ID. The records of the other students are stored in the same way, one row per student, each row beginning with that student's Gender and Hall Status. Another attribute, Student Type, records whether the student is regular, retentive or abandoned. Because the Apriori algorithm of Association Rule Mining requires attribute values in discrete form, fields such as student ID have been omitted from the universal table.
Table 4.3: Partial portion of universal database
| Gender | Hall_Status | Student_Type | CSE103_Grade | CSE103_Attendance | CSE103_CT | CSE103_SectionA | CSE103_SectionB | CSE103_Total | … |
| Male | Resident | Regular | A+ | 30 | 55 | 90 | 75 | 250 | … |
| Female | Non-Resident | Regular | A | 25 | 45 | 85 | 70 | 225 | … |
| … | … | … | … | … | … | … | … | … | … |
3) Data Transformation
The universal database of Table 4.3 has been transformed into an equivalent transformation table by converting each continuous-valued attribute into a discrete-valued attribute, so that the Apriori algorithm of Association Rule Mining can be applied. For example, CGPA is a continuous attribute and has been transformed into five classes: excellent, very good, good, average and poor. We have used one algorithm for transforming all continuous marks for attendance, class tests, both sections of the term final answer scripts and the total marks of a course. We have used another algorithm for transforming all grades or grade points of courses, and the overall CGPA, into those five classes.
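The construction of the universal table described above, one row per student with every course's fields flattened into columns, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the input shapes (`students`, `results`), the function name `build_universal_rows` and the field tuple `COURSE_FIELDS` are assumptions, and the sample values are the first row of Table 4.3.

```python
# Per-course fields, named after the column suffixes shown in Table 4.3.
COURSE_FIELDS = ("Grade", "Attendance", "CT", "SectionA", "SectionB", "Total")


def build_universal_rows(students, results):
    """Flatten per-course result records into one wide row per student.

    students: {student_id: {"Gender": ..., "Hall_Status": ..., "Student_Type": ...}}
    results:  list of dicts with "student_id", "course" and any of COURSE_FIELDS
    """
    rows = {sid: dict(info) for sid, info in students.items()}
    for record in results:
        row = rows[record["student_id"]]
        for field in COURSE_FIELDS:
            if field in record:
                row[f"{record['course']}_{field}"] = record[field]
    # The student ID is used only for grouping and is dropped from the output,
    # since it is not a useful discrete attribute for rule mining.
    return list(rows.values())


students = {
    1: {"Gender": "Male", "Hall_Status": "Resident", "Student_Type": "Regular"},
}
results = [
    {"student_id": 1, "course": "CSE103", "Grade": "A+", "Attendance": 30,
     "CT": 55, "SectionA": 90, "SectionB": 75, "Total": 250},
]
universal = build_universal_rows(students, results)
print(universal[0])
```

Each additional course a student has taken simply adds more `<course>_<field>` columns to that student's row.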
  • 6. To transform the numbers of the universal table, i.e., attendance, class tests, Section A, Section B and total marks of each course, Algorithm 1 has been developed. It populates the transformed table in such a way that no entry holds a continuous value. Similarly, the grades of the universal table are transformed by Algorithm 2. As the real dataset contains CGPA in grade points, we likewise treat the grade point as a variable and transform the continuous value of CGPA into the same five classes. Because there are theory courses of 4.0, 3.0 and 2.0 credits and sessional courses of 1.5 and 0.75 credit hours, we need different transformation rule tables for these different courses. The transformation rules are given for 3.0 credit theory courses in Table 4.4, for 4.0 credit theory courses in Table 4.5, for 2.0 credit theory courses in Table 4.6 and for all sessional courses in Table 4.7.
Algorithm 1: Marks_Transformation ( )
Input: marks of Attendance, CT, Section A, Section B and Total Marks of each course from the Universal Table of Studentlist
Output: discrete level of marks for the Transformation Table
    for i = 1 to |Studentlist|
        if (marks >= 80%) level = "Excellent"
        else if (marks < 80% && marks >= 75%) level = "Very Good"
        else if (marks < 75% && marks >= 60%) level = "Good"
        else if (marks < 60% && marks >= 50%) level = "Average"
        else if (marks < 50%) level = "Poor"
    end for
Algorithm 2: Grade_Transformation ( )
Input: all acquired grades of each course in the Courselist of the universal table
Output: transformed_grade for the Transformation Table
    for i = 1 to |Courselist|
        if grade = A+ transformed_grade = 'Excellent'
        else if grade = A transformed_grade = 'Very Good'
        else if grade = A- or B+ transformed_grade = 'Good'
        else if grade = B transformed_grade = 'Average'
        else if grade = B- or C+ or C or D transformed_grade = 'Poor'
    end for
  • 7. Table 4.4: Transformation rule table for 3.0 credit theory course (range of marks M)
| Classified Name | Attendance | Class Test | SecA/SecB | Total |
| Excellent | 27 ≤ M ≤ 30 | 48 ≤ M ≤ 60 | 84 ≤ M ≤ 105 | 240 ≤ M ≤ 300 |
| Very Good | 24 ≤ M ≤ 26 | 45 ≤ M ≤ 47 | 78 ≤ M ≤ 83 | 225 ≤ M ≤ 239 |
| Good | 21 ≤ M ≤ 23 | 36 ≤ M ≤ 44 | 63 ≤ M ≤ 77 | 180 ≤ M ≤ 224 |
| Average | 18 ≤ M ≤ 20 | 30 ≤ M ≤ 35 | 52 ≤ M ≤ 62 | 150 ≤ M ≤ 179 |
| Poor | 0 ≤ M ≤ 17 | 0 ≤ M ≤ 29 | 0 ≤ M ≤ 51 | 0 ≤ M ≤ 149 |
Table 4.5: Transformation rule table for 4.0 credit theory course (range of marks M)
| Classified Name | Attendance | Class Test | SecA/SecB | Total |
| Excellent | 36 ≤ M ≤ 40 | 64 ≤ M ≤ 80 | 112 ≤ M ≤ 140 | 320 ≤ M ≤ 400 |
| Very Good | 32 ≤ M ≤ 35 | 60 ≤ M ≤ 63 | 105 ≤ M ≤ 111 | 300 ≤ M ≤ 319 |
| Good | 28 ≤ M ≤ 31 | 48 ≤ M ≤ 59 | 84 ≤ M ≤ 104 | 240 ≤ M ≤ 299 |
| Average | 24 ≤ M ≤ 27 | 40 ≤ M ≤ 47 | 70 ≤ M ≤ 83 | 200 ≤ M ≤ 239 |
| Poor | 0 ≤ M ≤ 23 | 0 ≤ M ≤ 39 | 0 ≤ M ≤ 69 | 0 ≤ M ≤ 199 |
Table 4.6: Transformation rule table for 2.0 credit theory course (range of marks M)
| Classified Name | Attendance | Class Test | SecA/SecB | Total |
| Excellent | 18 ≤ M ≤ 20 | 32 ≤ M ≤ 40 | 56 ≤ M ≤ 70 | 160 ≤ M ≤ 200 |
| Very Good | 16 ≤ M ≤ 17 | 30 ≤ M ≤ 31 | 52 ≤ M ≤ 55 | 150 ≤ M ≤ 159 |
| Good | 14 ≤ M ≤ 15 | 24 ≤ M ≤ 29 | 42 ≤ M ≤ 51 | 120 ≤ M ≤ 149 |
| Average | 12 ≤ M ≤ 13 | 20 ≤ M ≤ 23 | 35 ≤ M ≤ 41 | 100 ≤ M ≤ 119 |
| Poor | 0 ≤ M ≤ 11 | 0 ≤ M ≤ 19 | 0 ≤ M ≤ 34 | 0 ≤ M ≤ 99 |
Table 4.7: Transformation rule table for all sessional courses (range of marks M)
| Classified Name | Sessional Credit Hour = 1.5 | Sessional Credit Hour = 0.75 |
| Excellent | 120 ≤ M ≤ 150 | 60 ≤ M ≤ 75 |
| Very Good | 112 ≤ M ≤ 119 | 56 ≤ M ≤ 59 |
| Good | 90 ≤ M ≤ 111 | 45 ≤ M ≤ 55 |
| Average | 75 ≤ M ≤ 89 | 37 ≤ M ≤ 44 |
| Poor | 0 ≤ M ≤ 74 | 0 ≤ M ≤ 36 |
To construct the entire transformed table, as given in Table 4.8, we have used the universal table and the above transformation rules.
Table 4.8: Transformed table from universal table
| Gender | Hall_Status | Student_Type | CSE103_Grade | CSE103_Attendance | CSE103_CT | CSE103_SectionA | CSE103_SectionB | CSE103_Total | … |
| Male | Resident | Regular | Excellent | Excellent | Excellent | Excellent | Good | Excellent | … |
| Female | Non-resident | Regular | Very Good | Very Good | Very Good | Excellent | Good | Very Good | … |
| … | … | … | … | … | … | … | … | … | … |
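Algorithm 1 and Algorithm 2 can be sketched in Python as below. This is an illustrative reading, not the authors' code. Two points are assumptions made here: the attendance columns of Tables 4.4 to 4.6 correspond to cut-offs of 90/80/70/60 percent rather than the 80/75/60/50 percent of Algorithm 1, so the cut-offs are passed as a parameter; and grades not listed in Algorithm 2 (such as F, which appears in the generated rules) are passed through unchanged. Marks that sit exactly on a boundary may be classified slightly differently from the integer ranges in the tables.

```python
# Percentage cut-offs stated in Algorithm 1 (class test, sections, total).
DEFAULT_CUTOFFS = ((80, "Excellent"), (75, "Very Good"), (60, "Good"), (50, "Average"))
# Assumption: cut-offs inferred from the attendance columns of Tables 4.4-4.6.
ATTENDANCE_CUTOFFS = ((90, "Excellent"), (80, "Very Good"), (70, "Good"), (60, "Average"))

# Mapping stated in Algorithm 2.
GRADE_LEVELS = {
    "A+": "Excellent",
    "A": "Very Good",
    "A-": "Good", "B+": "Good",
    "B": "Average",
    "B-": "Poor", "C+": "Poor", "C": "Poor", "D": "Poor",
}


def marks_to_level(marks, full_marks, cutoffs=DEFAULT_CUTOFFS):
    """Map raw marks to one of the five discrete levels by percentage."""
    percent = 100.0 * marks / full_marks
    for minimum, level in cutoffs:
        if percent >= minimum:
            return level
    return "Poor"


def grade_to_level(grade):
    """Map a letter grade to a discrete level; unlisted grades pass through."""
    return GRADE_LEVELS.get(grade, grade)


# First row of Table 4.3, a 3.0 credit course (full marks 30 / 60 / 105 / 300).
print(marks_to_level(30, 30, ATTENDANCE_CUTOFFS))  # attendance
print(marks_to_level(55, 60))                      # class test
print(marks_to_level(75, 105))                     # Section B
print(grade_to_level("A+"))
```

Applied to the two sample rows of Table 4.3, these functions reproduce the corresponding entries of Table 4.8.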
  • 8. 4) Dataset and Application Environment
In this experiment, we have considered the data up to the last five graduated batches of the department of CSE, BUET. The institutional dataset of BUET consists of academic and personal data of 9210 students from the last 10 years. We have extracted the relevant academic and personal information of those students (gender, hall status, admission year, completed credit hours, all records of theory and sessional courses, overall CGPA etc.) from the relational BIIS database and transformed it into the universal table structure. Finally, we have converted it into the transformed table structure for applying association rule mining. The entire experimental setup is illustrated in Figure 5.1.
Figure 4.1: Experimental Setup for applying Apriori Algorithm using Weka Explorer to generate Association Rules
After the preprocessing step, we have obtained a transformed table of 582 students of the department of CSE who have already graduated. The universal table also contains one additional attribute, the student type: retentive, regular or abandoned. The student type is obtained by analyzing the completed credit hours and the admission year. The transformation table holds all continuous data converted into five discrete values: Excellent, Very Good, Good, Average and Poor. Finally, we have loaded the transformation table (in .csv file format) into Weka Explorer to generate interesting Association Rules.
[Figure contents: BUET institutional dataset of 9210 students of all departments in the last 10 years (gender, hall status, admission year, completed credit hours, all records of theory and sessional courses, overall CGPA) → universal table structure → transformation table structure.
Both tables cover 582 students. Student Type: Regular 552, Retentive 26, Abandoned 4. Gender: Male 473, Female 109. Hall Status: Resident 348, Non-Resident 234.
Universal table: 40 theory courses (attendance, class test, Section A, Section B, total, grade) and 28 sessional courses (total marks, grade).
Transformation table: all marks and grades of the 68 theory and sessional courses, including overall CGPA, expressed as Excellent, Very Good, Good, Average or Poor.]
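The two measures used to filter association rules, support and confidence, can be computed directly from a transformed table. The sketch below is a brute-force illustration on a tiny made-up dataset; it is not Weka's Apriori implementation (it enumerates only one-to-one rules and does no candidate pruning), and the function names and the four sample rows are hypothetical.

```python
from itertools import permutations


def support(rows, items):
    """Fraction of rows that contain every (attribute, value) pair in items."""
    hits = sum(all(row.get(attr) == value for attr, value in items) for row in rows)
    return hits / len(rows)


def confidence(rows, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(rows, antecedent)
    return support(rows, antecedent + consequent) / base if base else 0.0


def single_item_rules(rows, min_support, min_confidence):
    """Enumerate rules 'A=a => B=b' that meet both thresholds (brute force)."""
    items = sorted({(attr, value) for row in rows for attr, value in row.items()})
    rules = []
    for left, right in permutations(items, 2):
        if left[0] == right[0]:
            continue  # skip pairs drawn from the same attribute
        s = support(rows, [left, right])
        c = confidence(rows, [left], [right])
        if s >= min_support and c >= min_confidence:
            rules.append((left, right, s, c))
    return rules


# Hypothetical transformed rows, for illustration only.
rows = [
    {"Gender": "male", "CGPA": "Poor"},
    {"Gender": "male", "CGPA": "Poor"},
    {"Gender": "female", "CGPA": "Good"},
    {"Gender": "male", "CGPA": "Good"},
]
rules = single_item_rules(rows, min_support=0.5, min_confidence=0.9)
for left, right, s, c in rules:
    print(f"{left[0]}={left[1]} => {right[0]}={right[1]}  support={s:.2f} confidence={c:.2f}")
```

Note that confidence is directional: on this toy data `CGPA=Poor => Gender=male` has confidence 1.0, while the reverse rule has confidence 2/3, which is why both directions of a rule can carry very different values.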
  • 9. v. Main Results and Discussions
1) Impact of Gender
We have found an impact of gender on overall academic performance. This indication is important in terms of the socio-economic condition of the country. In BUET the majority of the students are male and live in the university dormitories. Multiple factors affect the academic environment and students' academic performance. The results in Table 5.1 show that poor CGPA is associated with male students at a very high confidence level. The reason, in our view, is that male students are generally affected by various societal problems of a third-world country like Bangladesh. All other rules support the finding that the academic performance of female students is better than that of male students.
Table 5.1: Impact of Gender
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CGPA=Poor ⇒ Gender=male | 10% | 87% |
| 02 | CGPA=Average ⇒ Gender=male | 10% | 79% |
| 03 | CGPA=Very Good ⇒ Gender=male | 10% | 83% |
| 04 | Gender=male ⇒ CGPA=Good | 10% | 26% |
| 05 | Gender=male ⇒ CGPA=Average | 10% | 21% |
| 06 | CGPA=Good ⇒ Gender=female | 5% | 22% |
| 07 | CGPA=Average ⇒ Gender=female | 5% | 21% |
| 08 | CGPA=Excellent ⇒ Gender=female | 5% | 20% |
2) Impact of Residence
In BUET, most of the students live in the institution's halls, but a significant number of students live at home. Analyzing the rules, we have found that both hall residents and students living at home obtain good CGPA with decent minimum support and confidence (Table 5.2). So a student who wants to do well academically can do so from anywhere.
Table 5.2: Impact of Hall Status
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CGPA=Average ⇒ Hall_Status=Resident | 10% | 65% |
| 02 | CGPA=Very Good ⇒ Hall_Status=Resident | 10% | 63% |
| 03 | CGPA=Good ⇒ Hall_Status=Non-Resident | 10% | 43% |
| 04 | CGPA=Good, Hall_Status=Resident ⇒ Gender=male | 10% | 82% |
However, the percentage of students with poor CGPA is high among hall residents.
A likely reason is that the halls impose very little restriction, and sometimes there is no one to take care of a student as family members would. A student can therefore become demoralized and get very poor grades due to lack of study. As shown by rule number 1 in Table 5.3, the percentage of male resident students is higher in this regard. In most cases, the poor CGPA holders are hall residents (rule numbers 1 and 5 of Table 5.3).
  • 10. Table 5.3: Impact of Hall Status and Gender
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CGPA=Poor, Gender=male ⇒ Hall_Status=Resident | 5% | 51% |
| 02 | CGPA=Very Good, Gender=male ⇒ Hall_Status=Non-Resident | 5% | 40% |
| 03 | Hall_Status=Non-Resident, Gender=female ⇒ CGPA=Average | 5% | 24% |
| 04 | Hall_Status=Resident, Gender=female ⇒ CGPA=Good | 5% | 21% |
| 05 | CGPA=Poor ⇒ Hall_Status=Resident | 5% | 52% |
3) Correlation between Courses
The analyzed Association Rules show that the grade of one course may depend on its prerequisite courses. Rule number 1 shows that a student who gets an excellent grade in CSE105 also gets an excellent grade in CSE201 with a confidence of 0.48, where CSE105 is the Structured Programming Language course and CSE201 is the Object Oriented Programming Language course. Rule numbers 6, 7 and 8 show the interrelation between CSE311 (Data Communication-I) and CSE321 (Networking). Rule number 9 in Table 5.4 shows the impact of CSE205 (Digital Logic Design) and CSE209 (Digital Electronics and Pulse Technique) on CSE403 (Digital System Design).
Table 5.4: Correlation between Courses
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CSE105_Grade=Excellent ⇒ CSE201_Grade=Excellent | 10% | 48% |
| 02 | CSE201_Grade=Very Good ⇒ CSE105_Grade=Very Good | 5% | 30% |
| 03 | EEE163_Grade=Excellent ⇒ EEE263_Grade=Very Good | 5% | 27% |
| 04 | CSE205_Grade=Excellent ⇒ CSE403_Grade=Excellent | 10% | 50% |
| 05 | CSE403_Grade=Poor ⇒ CSE205_Grade=Average | 5% | 28% |
| 06 | CSE321_Grade=Average ⇒ CSE311_Grade=Average | 5% | 36% |
| 07 | CSE321_Grade=F ⇒ CSE311_Grade=Poor | 3% | 13% |
| 08 | CSE321_Grade=Poor ⇒ CSE311_Grade=Poor | 3% | 16% |
| 09 | CSE205_Grade=Very Good, CSE209_Grade=Excellent ⇒ CSE403_Grade=Excellent | 5% | 53% |
  • 11. 4) Impact on Retention
A student who fails a course becomes retentive, because he or she needs to retake that course later to graduate. Rule numbers 2, 3, 4, 5 and 6 show that retentive students usually struggle with their grades. If a student has not passed CSE100, the first fundamental course of CSE, he or she is retentive, i.e., he or she has not passed the later departmental courses either. This is illustrated by rule number 1 in Table 5.5. Moreover, we have discovered that most retentive students are hall residents and male, as illustrated with high confidence by rule numbers 7 and 8 respectively in Table 5.5.
Table 5.5: Impact on Retention
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CSE100_Grade=F ⇒ Student Type=Retentive | 5% | 42% |
| 02 | Student Type=Retentive ⇒ MATH243_Grade=Poor | 5% | 35% |
| 03 | Student Type=Retentive ⇒ CSE205_Grade=Average | 5% | 35% |
| 04 | Student Type=Retentive ⇒ CSE311_Grade=Average | 5% | 27% |
| 05 | Student Type=Retentive ⇒ EEE263_Grade=Poor | 5% | 33% |
| 06 | Student Type=Retentive ⇒ CSE409_Grade=Average | 5% | 43% |
| 07 | Student Type=Retentive ⇒ Hall_Status=Resident | 5% | 65% |
| 08 | Student Type=Retentive ⇒ Gender=male | 5% | 81% |
5) Impact on Abandonment
Students who have given up their academic studies without completing all the required courses are typed as 'abandoned'. Analyzing the rules in Table 5.6, we find with high confidence that the abandoned students are male and hall residents. However, the minimum support is very low, which shows that the rate of abandonment is very low in the CSE department of this university.
Table 5.6: Impact on Abandonment
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | Student Type=Abandoned ⇒ Gender=male | 0.5% | 100% |
| 02 | Student Type=Abandoned ⇒ Hall_Status=Resident | 0.5% | 75% |
| 03 | Student Type=Abandoned ⇒ Gender=male, Hall_Status=Resident | 0.5% | 75% |
6) Impact of Continuous Assessment
The grade of a course depends on various aspects such as the marks for attendance, class tests and both sections of the term final examination. Rule number 7, which has the maximum confidence value of 1.00, shows that an excellent grade in a course goes together with excellent performance in the other aspects of continuous assessment. Again, performance in class tests depends on attendance, as illustrated by rule number 5 in Table 5.7 with a very high confidence of 0.95.
  • 12. Table 5.7: Impact of Continuous Assessment
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CSE103_Attendance=Excellent, CSE103_SectionB=Poor ⇒ CSE103_Grade=Average | 10% | 63% |
| 02 | CSE103_Grade=Very Good, CSE103_CT=Good ⇒ CSE103_Attendance=Excellent | 10% | 97% |
| 03 | EEE163_Grade=Average ⇒ EEE163_SectionB=Poor | 10% | 57% |
| 04 | EEE163_Grade=Very Good ⇒ EEE163_Attendance=Excellent, EEE163_CT=Excellent | 10% | 67% |
| 05 | HUM275_CT=Excellent ⇒ HUM275_Attendance=Excellent | 10% | 95% |
| 06 | HUM275_CT=Excellent, HUM275_SectionA=Good ⇒ HUM275_Grade=Very Good, HUM275_Attendance=Excellent | 10% | 75% |
| 07 | CSE401_Grade=Excellent, CSE401_CT=Excellent, CSE401_SectionA=Excellent ⇒ CSE401_Attendance=Excellent | 10% | 100% |
| 08 | CSE401_SectionB=Excellent ⇒ CSE401_Grade=Good | 10% | 75% |
7) Impact of Non Departmental Courses
After analyzing the generated Association Rules (in Table 5.8), we observed various impacts of non-departmental courses on academic performance. According to the curriculum, students must take some non-departmental courses, and performance in them counts toward the final result. Some students get poor grades in those courses. According to the generated rules, good performance in non-departmental courses brings a good grade, but a poor grade in them does less harm to the final CGPA, because those courses are few in number and most of them are taken at the beginning of the undergraduate level. Students therefore get enough opportunities to improve their CGPA later.
Table 5.8: Impact of Non Departmental Courses
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CGPA=Very Good ⇒ HUM272_Grade=Very Good | 10% | 73% |
| 02 | CGPA=Very Good ⇒ MATH143_Grade=Average | 5% | 37% |
| 03 | CGPA=Good ⇒ EEE163_Grade=Average | 5% | 36% |
| 04 | CGPA=Very Good ⇒ CHEM101_Grade=Average | 10% | 52% |
| 05 | CGPA=Average ⇒ IPE493_Grade=Very Good | 5% | 29% |
| 06 | CGPA=Good ⇒ ME165_Grade=Average | 10% | 43% |
| 07 | CGPA=Average ⇒ MATH243_Grade=Poor | 5% | 27% |
8) Impact of Departmental Courses
Because many departmental courses are studied, and some of them are interconnected through prerequisites, the results of departmental courses affect the final CGPA very much. From the analyzed rules, we find that good grades in departmental courses bring good CGPA, while poor grades in departmental courses result in poor overall CGPA. This significant knowledge is discovered from the rules on the impact of departmental courses in Table 5.9.
  • 13. Table 5.9: Impact of Departmental Courses
| No. | Generated Interesting Rules | Minimum Support | Confidence |
| 01 | CGPA=Very Good ⇒ CSE100_Grade=Very Good | 5% | 42% |
| 02 | CGPA=Very Good ⇒ CSE105_Grade=Average | 5% | 31% |
| 03 | CGPA=Very Good ⇒ CSE206_Grade=Very Good | 10% | 44% |
| 04 | CGPA=Good ⇒ CSE303_Grade=Average | 5% | 31% |
| 05 | CGPA=Poor ⇒ CSE321_Grade=Poor | 5% | 29% |
| 06 | CGPA=Excellent ⇒ CSE401_Grade=Excellent | 5% | 50% |
| 07 | CGPA=Average ⇒ CSE401_Grade=Average | 5% | 29% |
| 08 | CGPA=Average ⇒ CSE409_Grade=Average | 5% | 42% |
vi. Conclusions
Knowledge discovery from academic data is very important for improving the academic performance of any higher educational institution. In this research, we study the academic system, the existing problems and the performance data of the most renowned engineering university of Bangladesh. We have found problems such as abandonment, retention and decay of potential among the most brilliant students. We have applied association rule mining to explore the root causes of these problems. Before applying the data mining algorithm, the existing academic data has been preprocessed to make it suitable for mining. We have developed a data transformation technique that transforms the relational database into an equivalent universal relational format. In this format, we have also transformed the continuous data into discrete-valued qualitative data. We have found interesting Association Rules by applying the Apriori Association Rule generator of the WEKA tool to the transformed data. From the large number of association rules, we have extracted the interesting rules regarding the impacts of gender, residence and continuous assessment on academic performance. We have also found associations among the courses, retention and abandonment. The results obtained are significant for decision makers seeking to improve the overall academic condition of the institution.
According to the results, 10% of the 582 students of the CSE department who have already graduated are male and have a CGPA below 3.00, and the probability of being male among poor CGPA holders is 0.87. Again, we have discovered that 5% of all students have poor CGPA and are hall residents, and the probability of being a hall resident among poor CGPA holders is 0.52. We have also discovered significant correlations between courses. For example, more than 58 students have excellent grades in both CSE105 (Structured Programming Language) and CSE201 (Object Oriented Programming Language). The probability of having an excellent grade in CSE201 among students with an excellent grade in CSE105 is 0.48. We have found that about 30 students have to retake MATH243. We found that 5% of total male students are both retentive and hall residents, and 65% of all retentive students are hall residents. The abandonment rate is very low in the CSE department of BUET: we found that only 3 male students dropped out before graduating, and 75% of abandoned students were hall residents. We have also determined the impact of several non-departmental courses. For example, more than 60 students have a very good grade in HUM272 as well as a CGPA over 3.50. We have also determined the impact of several departmental courses. For example, 5% of the 582 students have a CGPA over 3.75 and obtained A+ in CSE401, and 50% of students with a CGPA over 3.75 obtained A+ in CSE401. We hope all these quantitative findings will help decision makers improve the quality of education provided in this department. We have applied the technique only to the CSE department of BUET, but it is applicable to any department of any higher educational institute.
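The head counts quoted in the conclusions follow from the minimum support thresholds. The sketch below converts a support percentage into the smallest number of students that satisfies it, assuming (as the text above suggests) that support is measured against the 582 graduated CSE students; the function name `min_count` is illustrative.

```python
import math

TOTAL_STUDENTS = 582  # graduated CSE students in the transformed table


def min_count(support_percent, total=TOTAL_STUDENTS):
    """Smallest whole number of students that meets a minimum support in percent."""
    return math.ceil(total * support_percent / 100)


# 10% of 582 is 58.2, so at least 59 students ("more than 58").
print(min_count(10))
# 5% of 582 is 29.1, so at least 30 students ("about 30").
print(min_count(5))
# 0.5% of 582 is 2.91, so at least 3 students.
print(min_count(0.5))
```

These are lower bounds implied by the thresholds rather than exact counts; the actual number of students matching a rule can be higher.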