Tran 1
Karen Tran
Dr. Jacob R. Grohs
ENGE 4994
8 May 2016
The Effects of Statics Credit on Future Mechanics Courses
Objective
The goal for this course (ENGE 4994) during the Spring 2016 semester was to code raw data,
provided by Virginia Polytechnic Institute and State University, covering a large subset of
students who took statics and mechanics courses over a span of a few years. The data consisted
of 165,543 rows of de-identified transcripts, with each student identified by a unique number.
The raw data was then converted and coded into a Statistical Package for the Social Sciences
(SPSS) file by Dr. Jacob R. Grohs. The true objective was to run statistical tests on the data
to observe and quantify how transferring statics credit may affect a student's performance in
future mechanics courses. However, because the data set was quite large, SPSS crashed multiple
times while running certain tests. As a result, the data had to be restructured in a more
powerful program that could handle these statistical tests: R.
Restructuring the data in another programming language (R) became the initial and most
prominent objective of the course. The end goal remained to investigate and understand how
transferring statics credit affected performance in future mechanics courses such as deforms
and dynamics. Other factors to be considered included the number of times a student took a
course, how many other credits the student was taking during the same semester, GPA, major,
and much more. There was an endless number of questions to be proposed and answered using
statistical analysis.
Overarching Challenge
The real overarching challenge was learning how to code in R. For a typical engineering
student at Virginia Polytechnic Institute and State University, the programming knowledge
gained through degree courses is limited to standard MATLAB or Mathematica and basic Java.
This puts students outside the Computer Science and Computer Engineering majors at a slight
disadvantage. Restructuring the data in R became a task that constantly required research:
Google searches, manuals, online guides, and YouTube tutorials. The goal was to format the
data so that each row corresponded to a unique student number, as follows:
{student_id, student_admit_type, degree1, degree2, degree3, degree4, class1, class2, etc.}[1]
Ideally, it would be possible to simply call out a unique student or a specific category and
retrieve all of the information needed for statistical analysis. The desired statistical
analyses, such as comparisons, hypothesis tests (p-values), and t-tests, were going to be the
stepping stones to understanding performance in future courses after taking statics (at
Virginia Tech or elsewhere).
																																																								
[1] There were 26 different classes.
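The kind of comparison named above (a t-test with a p-value) can be sketched in R as follows. This is only an illustration: the two grade vectors are made-up numbers, not values from the transcript data, and the names transfer_credit and statics_at_vt are hypothetical.

```r
# Hypothetical grade points in a later mechanics course.
# These numbers are invented for illustration; they are NOT from the transcript data.
transfer_credit <- c(2.0, 2.7, 3.0, 2.3, 3.3, 2.7)  # students who transferred statics credit
statics_at_vt   <- c(3.0, 3.3, 2.7, 3.7, 3.0, 3.3)  # students who took statics at Virginia Tech

# Two-sample t-test. By default t.test() does not assume equal variances (Welch's test).
result <- t.test(transfer_credit, statics_at_vt)

result$estimate  # the two group means
result$p.value   # p-value for the null hypothesis of equal means
```

With the real data, the two vectors would be filled from the restructured data frame rather than typed in by hand.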
Key Progress
While there were many roadblocks and struggles, there were also many moments of progress
during the semester. I was able to sort and understand the raw data in a short amount of time
(a few weeks) and to organize and restructure it cleanly. The original data frame (mydata), at
165,543 rows, was condensed to one row per unique student identifier in a new data frame
(myfinaldata) at 23,364 rows. The next few columns of myfinaldata were built from left to
right under the headers student_admit_type, degree1, degree2, degree3, and degree4,
respectively. The student_admit_type column contained a list of three elements (freshman or
transfer, first term attended, last term attended). The degree columns contained lists of
three as well (major, GPA, graduating term). Creating lists within lists was necessary to be
able to call out a certain element from a specific header for a distinct student.
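The condensing step described above can be sketched on a toy data frame. The rows below are hypothetical stand-ins for the raw transcript data (the course column in particular is an assumed name), and the list column is one possible way to hold "lists within lists," not necessarily the layout used in myfinaldata.

```r
# Toy stand-in for the raw transcript data: one row per course taken (hypothetical values).
mydata <- data.frame(
  new_id             = c(1, 1, 2, 2, 2),
  student_admit_type = c("Freshman", "Freshman", "Transfer", "Transfer", "Transfer"),
  first_term_attend  = c(199909, 199909, 200301, 200301, 200301),
  last_term_attend   = c(200405, 200405, 200705, 200705, 200705),
  course             = c("STATICS", "DYNAMICS", "STATICS", "DEFORMS", "DYNAMICS"),
  stringsAsFactors   = FALSE
)

# Keep only the per-student columns, then drop duplicate rows: one row per student.
cols <- c("new_id", "student_admit_type", "first_term_attend", "last_term_attend")
students <- unique(mydata[, cols])

# Store the three admission fields as a list inside a single column (a list column).
students$admit <- lapply(seq_len(nrow(students)), function(i) {
  list(admit_type = students$student_admit_type[i],
       first_term = students$first_term_attend[i],
       last_term  = students$last_term_attend[i])
})

nrow(students)                   # number of unique students
students$admit[[2]]$admit_type   # admission type of the second student
```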
R: Programming Language
Although the raw data was not restructured enough to run statistical tests, I learned a great
deal about coding in R. I was able to code and restructure a good amount of the raw data using
built-in functions and logic. I found R similar to MATLAB in many respects; however, the
syntax was very different, and building a data frame was much more involved than solving a
mathematics problem. By learning how to name and assign variables and how to use functions
such as “unique,” “str,” and “as.list,” I was able to make lists within each column and call
out pieces (elements) of data for a certain student. This made it possible to isolate certain
variables (students, degrees, freshman/transfer) and manipulate them for future analyses.[2]
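A short sketch of the functions mentioned above, using made-up student numbers and one made-up admission record rather than rows from the data set:

```r
# Hypothetical student numbers with repeats, as in a transcript with one row per course.
ids <- c(101, 101, 102, 103, 103)

# unique() drops the repeats; length() then counts distinct students.
unique_ids <- unique(ids)
length(unique_ids)

# paste() joins fields into one comma-separated string per student.
record <- paste("Freshman", 199909, 200405, sep = ",")

# str() shows the type and a preview of an object.
str(record)

# strsplit() breaks the string back into fields; as.list() turns them into list elements.
fields <- as.list(strsplit(record, ",")[[1]])
fields[[2]]   # the second field (first term attended)
```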
Remaining Challenges
Still, there are many hurdles to overcome. For students who had only one degree, the
placeholders for degree2, degree3, and degree4 were filled with “NA” for each element (nine
NAs). This created an immense amount of unnecessary space in the data frame, which could have
been avoided by creating a list that can hold any number of elements within the same column
(appending to a list).
For example, two students taken from myfinaldata (Student 1 and Student 17) have different
numbers of degrees. The data currently looks like this:
• [Student1], [Freshman,199909,2015012], [MATH,2.18174603,199907,NA,NA,NA,NA,NA,NA,NA,NA,NA]
• [Student17], [Freshman,199809,201401], [CE,2.89947644,200301,ME,2.89947644,201201,NA,NA,NA,NA,NA,NA]
Ideally, the rows should display like this:
• [Student1], [Freshman,199909,2015012], [MATH,2.18174603,199907]
• [Student17], [Freshman,199809,201401], [CE,2.89947644,200301,ME,2.89947644,201201]
The list should end when no more information applies to the student. If a student had only
one degree, the third header should contain a single list of three. If a student had two
degrees, the third header should contain two lists of three (a total of six elements in the
degree category). The pattern should continue for students who had three and four degrees
(nine and 12 elements, respectively).
[2] The code for myfinaldata and a screenshot demonstrating how to call out a unique student
are provided at the end of this document.
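One possible way to get the trimmed layout is to drop the NA placeholders and regroup what remains into lists of three. The sketch below uses the degree fields of the two example students; the helper name trim_degrees is hypothetical, and the approach is an assumption about how the fix could be done, not code that was run on the full data.

```r
# Degree fields for the two example students in the padded layout (12 slots each).
padded <- list(
  c("MATH", "2.18174603", "199907", NA, NA, NA, NA, NA, NA, NA, NA, NA),
  c("CE", "2.89947644", "200301", "ME", "2.89947644", "201201", NA, NA, NA, NA, NA, NA)
)

# Hypothetical helper: remove NA placeholders, then split the remaining
# values into consecutive groups of three (major, GPA, term).
trim_degrees <- function(v) {
  v <- v[!is.na(v)]
  split(v, ceiling(seq_along(v) / 3))
}

degrees <- lapply(padded, trim_degrees)

lengths(degrees)    # number of degrees kept for each student
degrees[[2]][[2]]   # second degree of the second student
```

Each student now carries only as many lists of three as degrees earned, so no space is spent on placeholders.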
Another major challenge that needs to be tackled is finding a way to read the data into R
line by line. There needs to be a method (a while or for loop) that enters information for
each unique student only for the classes the student has taken. Because there are 26 classes,
it would be redundant to store NA for 25 classes if a student took only one class.
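A for loop of the kind described above might look like the following sketch. The data frame, its column names (course, grade), and its values are all hypothetical; the point is only that each student's list grows by one entry per class actually taken, so no NA placeholders are needed.

```r
# Toy transcript rows, one per class taken (hypothetical ids, courses, and grades).
transcript <- data.frame(
  new_id = c(1, 1, 2, 3, 3, 3),
  course = c("STATICS", "DYNAMICS", "STATICS", "STATICS", "DEFORMS", "DYNAMICS"),
  grade  = c("B", "C+", "A-", "B+", "B", "A"),
  stringsAsFactors = FALSE
)

# Read the rows one at a time and append each class to its student's list.
classes_by_student <- list()
for (i in seq_len(nrow(transcript))) {
  id    <- as.character(transcript$new_id[i])
  entry <- list(course = transcript$course[i], grade = transcript$grade[i])
  # c(NULL, list(entry)) starts a new list the first time a student appears.
  classes_by_student[[id]] <- c(classes_by_student[[id]], list(entry))
}

lengths(classes_by_student)             # classes recorded per student
classes_by_student[["3"]][[2]]$course   # second class recorded for student 3
```

For a data set of this size, split(transcript, transcript$new_id) would group the rows by student without an explicit loop, which may be worth considering.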
Code[3]:
# Confirm the working directory, then load the raw transcript export.
getwd()
mydata <- read.csv("UG Mechanics Raw Transcript Data from IR.csv", header = TRUE)
View(mydata)
str(mydata)
# Count the unique students in the raw data.
length(unique(mydata$new_id))
# Keep one row per student with the admission fields.
myfinaldata <-
unique(mydata[,c("new_id","student_admit_type","first_term_attend","last_term_attend")])
View(myfinaldata)
column2_vector1 <- as.vector(myfinaldata$student_admit_type)
column2_vector2 <- as.vector(myfinaldata$first_term_attend)
column2_vector3 <- as.vector(myfinaldata$last_term_attend)
# Combine the three admission fields into one comma-separated string per student.
newvariable = NULL
newvariable$student_admit_type <- paste(column2_vector1,column2_vector2,column2_vector3,
sep = ",")
myfinaldata <- data.frame(myfinaldata,newvariable)
# Drop the separate source columns so only the id and the combined column remain.
myfinaldata$student_admit_type <- NULL
myfinaldata$first_term_attend <- NULL
myfinaldata$last_term_attend <- NULL
names(myfinaldata)[1] <- "student_id"
names(myfinaldata)[2] <- "student_admit_type"
myfinaldata$student_admit_type <- as.character(myfinaldata$student_admit_type)
str(myfinaldata$student_admit_type)
# Preview the first student's fields, then split every student's string into a list.
as.list(strsplit(myfinaldata$student_admit_type, ",")[[1]])
x <- as.list(strsplit(myfinaldata$student_admit_type, ","))
#CAN CALL THE ELEMENTS OF COLUMN 2 (STUDENT ADMIT TYPE)
# Treat empty strings as missing values before building the degree columns.
mydata[mydata==""] <- NA
View(mydata)
# One row per student with the first-degree fields. Binding by position assumes
# testnull has the same students, in the same order, as myfinaldata.
testnull <- unique(mydata[,c("new_id","degree1_major","degree1_gpa","degree1_term")])
myfinaldata <- data.frame(myfinaldata,testnull)
myfinaldata$new_id <- NULL
column3_vector1 <- as.character(myfinaldata$degree1_major)
column3_vector2 <- as.character(myfinaldata$degree1_gpa)
column3_vector3 <- as.character(myfinaldata$degree1_term)
# Combine major, GPA, and term into one comma-separated string per student.
newvariable3 = NULL
newvariable3$degrees <- paste(column3_vector1,column3_vector2,column3_vector3,sep=",")
myfinaldata <- data.frame(myfinaldata,newvariable3)
myfinaldata$degree1_major <- NULL
myfinaldata$degree1_gpa <- NULL
myfinaldata$degree1_term <- NULL
																																																								
# [3] This code was run in RStudio.
names(myfinaldata)[3] = "degree1"
myfinaldata$degree1 <- as.character(myfinaldata$degree1)
str(myfinaldata$degree1)
as.list(strsplit(myfinaldata$degree1, ",")[[1]])
x2 <- as.list(strsplit(myfinaldata$degree1, ","))
#CAN CALL COLUMN 3(1ST DEGREE)
# Repeat the same steps for the second degree (column 4).
col4 <- unique(mydata[,c("new_id","degree2_major","degree2_gpa","degree2_term")])
myfinaldata <- data.frame(myfinaldata,col4)
myfinaldata$new_id <- NULL
column4_vector1 <- as.character(myfinaldata$degree2_major)
column4_vector2 <- as.character(myfinaldata$degree2_gpa)
column4_vector3 <- as.character(myfinaldata$degree2_term)
newvariable4 = NULL
# sep="," (no space) so that strsplit(..., ",") returns fields without leading spaces.
newvariable4$degree2 <- paste(column4_vector1,column4_vector2,column4_vector3,sep=",")
myfinaldata <- data.frame(myfinaldata,newvariable4)
myfinaldata$degree2_major <- NULL
myfinaldata$degree2_gpa <- NULL
myfinaldata$degree2_term <- NULL
myfinaldata$degree2 <- as.character(myfinaldata$degree2)
str(myfinaldata$degree2)
as.list(strsplit(myfinaldata$degree2, ",")[[1]])
x3 <- as.list(strsplit(myfinaldata$degree2, ","))
#CAN CALL COLUMN 4 (2nd DEGREE)
# Repeat the same steps for the third degree (column 5).
col5 <- unique(mydata[,c("new_id","degree3_major","degree3_gpa","degree3_term")])
myfinaldata <- data.frame(myfinaldata,col5)
myfinaldata$new_id <- NULL
column5_vector1 <- as.character(myfinaldata$degree3_major)
column5_vector2 <- as.character(myfinaldata$degree3_gpa)
column5_vector3 <- as.character(myfinaldata$degree3_term)
newvariable5 = NULL
# sep="," (no space) so that strsplit(..., ",") returns fields without leading spaces.
newvariable5$degree3 <- paste(column5_vector1,column5_vector2,column5_vector3,sep=",")
myfinaldata <- data.frame(myfinaldata,newvariable5)
myfinaldata$degree3_major <- NULL
myfinaldata$degree3_gpa <- NULL
myfinaldata$degree3_term <- NULL
myfinaldata$degree3 <- as.character(myfinaldata$degree3)
str(myfinaldata$degree3)
as.list(strsplit(myfinaldata$degree3, ",")[[1]])
x4 <- as.list(strsplit(myfinaldata$degree3, ","))
#CAN CALL COLUMN 5 (3rd DEGREE)
# Repeat the same steps for the fourth degree (column 6).
col6 <- unique(mydata[,c("new_id","degree4_major","degree4_gpa","degree4_term")])
myfinaldata <- data.frame(myfinaldata,col6)
myfinaldata$new_id <- NULL
column6_vector1 <- as.character(myfinaldata$degree4_major)
column6_vector2 <- as.character(myfinaldata$degree4_gpa)
column6_vector3 <- as.character(myfinaldata$degree4_term)
newvariable6 = NULL
# sep="," (no space) so that strsplit(..., ",") returns fields without leading spaces.
newvariable6$degree4 <- paste(column6_vector1,column6_vector2,column6_vector3,sep=",")
myfinaldata <- data.frame(myfinaldata,newvariable6)
myfinaldata$degree4_major <- NULL
myfinaldata$degree4_gpa <- NULL
myfinaldata$degree4_term <- NULL
myfinaldata$degree4 <- as.character(myfinaldata$degree4)
str(myfinaldata$degree4)
as.list(strsplit(myfinaldata$degree4, ",")[[1]])
x5 <- as.list(strsplit(myfinaldata$degree4, ","))
#CAN CALL COLUMN 6 (4th DEGREE)
Screenshots:
Figure 1. mydata, myfinaldata - Running Code with Two Outputs
Figure 2. mydata - Raw Data Output
Figure 3. myfinaldata - Manipulated Data Output
Figure 4. myfinaldata - Calling Out Select Students and Categories
Analysis of Figure 4
• x = student_admit_type
• x2 = degree1
• x3 = degree2
• x4 = degree3
• x5 = degree4
Interpretation:
• x2[[2]] = retrieve all information of degree1 for student number two
• x2[[2]][[1]] = retrieve the first element of degree1 for student number two
• x2[[3]] = retrieve all information of degree1 for student number three
• x2[[3]][[2]] = retrieve the second element of degree1 for student number three
• x3[[3]] = retrieve all information of degree2 for student number three
• x[[5]] = retrieve all information of the student_admit_type for student number five
• x[[5]][[1]] = retrieve the first element of the student_admit_type for student number five
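The indexing pattern interpreted above can be reproduced on a small example. The three degree strings below are hypothetical, not rows from myfinaldata. Note that strsplit() already returns a list with one element per student, so the as.list() wrapper used in the code is not strictly needed.

```r
# Hypothetical degree1 strings for three students (major, GPA, term).
degree1 <- c("MATH,2.18,199907", "CE,2.90,200301", "ME,3.10,200305")

# One list element per student, each holding that student's three fields.
x2 <- strsplit(degree1, ",")

x2[[2]]        # all degree1 information for student number two
x2[[2]][[1]]   # first element (major) for student number two
x2[[3]][[2]]   # second element (GPA) for student number three

# Pull the same element for every student at once, e.g. all majors.
majors <- sapply(x2, `[[`, 1)
majors
```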

Contenu connexe

Tendances

Higher Order Learning
Higher Order LearningHigher Order Learning
Higher Order Learningbutest
 
The D-basis Algorithm for Association Rules of High Confidence
The D-basis Algorithm for Association Rules of High ConfidenceThe D-basis Algorithm for Association Rules of High Confidence
The D-basis Algorithm for Association Rules of High ConfidenceITIIIndustries
 
QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE cscpconf
 
Audit report[rollno 49]
Audit report[rollno 49]Audit report[rollno 49]
Audit report[rollno 49]RAHULROHAM2
 
Effective Data Retrieval in XML using TreeMatch Algorithm
Effective Data Retrieval in XML using TreeMatch AlgorithmEffective Data Retrieval in XML using TreeMatch Algorithm
Effective Data Retrieval in XML using TreeMatch AlgorithmIRJET Journal
 

Tendances (8)

Higher Order Learning
Higher Order LearningHigher Order Learning
Higher Order Learning
 
Bo4301369372
Bo4301369372Bo4301369372
Bo4301369372
 
The D-basis Algorithm for Association Rules of High Confidence
The D-basis Algorithm for Association Rules of High ConfidenceThe D-basis Algorithm for Association Rules of High Confidence
The D-basis Algorithm for Association Rules of High Confidence
 
QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE
 
Data structure
Data structureData structure
Data structure
 
Audit report[rollno 49]
Audit report[rollno 49]Audit report[rollno 49]
Audit report[rollno 49]
 
Effective Data Retrieval in XML using TreeMatch Algorithm
Effective Data Retrieval in XML using TreeMatch AlgorithmEffective Data Retrieval in XML using TreeMatch Algorithm
Effective Data Retrieval in XML using TreeMatch Algorithm
 
Infos2014
Infos2014Infos2014
Infos2014
 

Similaire à Karen Tran - ENGE 4994 Paper

Data Analysis and Result Computation (DARC) Algorithm for Tertiary Institutions
Data Analysis and Result Computation (DARC) Algorithm for Tertiary InstitutionsData Analysis and Result Computation (DARC) Algorithm for Tertiary Institutions
Data Analysis and Result Computation (DARC) Algorithm for Tertiary InstitutionsIOSR Journals
 
Rd1 r17a19 datawarehousing and mining_cap617t_cap617
Rd1 r17a19 datawarehousing and mining_cap617t_cap617Rd1 r17a19 datawarehousing and mining_cap617t_cap617
Rd1 r17a19 datawarehousing and mining_cap617t_cap617Ravi Kumar
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATESA SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATESijcseit
 
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...IRJET Journal
 
IRJET- Using Data Mining to Predict Students Performance
IRJET-  	  Using Data Mining to Predict Students PerformanceIRJET-  	  Using Data Mining to Predict Students Performance
IRJET- Using Data Mining to Predict Students PerformanceIRJET Journal
 
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODELADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODELijcsit
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for StudentsIRJET Journal
 
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple ViewsA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Viewscollwe
 
Empirical Study on Classification Algorithm For Evaluation of Students Academ...
Empirical Study on Classification Algorithm For Evaluation of Students Academ...Empirical Study on Classification Algorithm For Evaluation of Students Academ...
Empirical Study on Classification Algorithm For Evaluation of Students Academ...iosrjce
 
Educational data mining using jmp
Educational data mining using jmpEducational data mining using jmp
Educational data mining using jmpijcsit
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES ijcseit
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES ijcseit
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES ijcseit
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES ijcseit
 
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...Editor IJCATR
 
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...Editor IJCATR
 
Analyzing undergraduate students’ performance in various perspectives using d...
Analyzing undergraduate students’ performance in various perspectives using d...Analyzing undergraduate students’ performance in various perspectives using d...
Analyzing undergraduate students’ performance in various perspectives using d...Alexander Decker
 

Similaire à Karen Tran - ENGE 4994 Paper (20)

Data Analysis and Result Computation (DARC) Algorithm for Tertiary Institutions
Data Analysis and Result Computation (DARC) Algorithm for Tertiary InstitutionsData Analysis and Result Computation (DARC) Algorithm for Tertiary Institutions
Data Analysis and Result Computation (DARC) Algorithm for Tertiary Institutions
 
A Mini Research
A Mini ResearchA Mini Research
A Mini Research
 
Rd1 r17a19 datawarehousing and mining_cap617t_cap617
Rd1 r17a19 datawarehousing and mining_cap617t_cap617Rd1 r17a19 datawarehousing and mining_cap617t_cap617
Rd1 r17a19 datawarehousing and mining_cap617t_cap617
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATESA SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
 
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
 
IRJET- Using Data Mining to Predict Students Performance
IRJET-  	  Using Data Mining to Predict Students PerformanceIRJET-  	  Using Data Mining to Predict Students Performance
IRJET- Using Data Mining to Predict Students Performance
 
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODELADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for Students
 
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple ViewsA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views
 
Empirical Study on Classification Algorithm For Evaluation of Students Academ...
Empirical Study on Classification Algorithm For Evaluation of Students Academ...Empirical Study on Classification Algorithm For Evaluation of Students Academ...
Empirical Study on Classification Algorithm For Evaluation of Students Academ...
 
K017626773
K017626773K017626773
K017626773
 
Educational data mining using jmp
Educational data mining using jmpEducational data mining using jmp
Educational data mining using jmp
 
RESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATION
RESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATIONRESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATION
RESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATION
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
 
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
A SURVEY OF EMPLOYERS’ NEEDS FOR TECHNICAL AND SOFT SKILLS AMONG NEW GRADUATES
 
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
 
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
 
Analyzing undergraduate students’ performance in various perspectives using d...
Analyzing undergraduate students’ performance in various perspectives using d...Analyzing undergraduate students’ performance in various perspectives using d...
Analyzing undergraduate students’ performance in various perspectives using d...
 

Karen Tran - ENGE 4994 Paper

  • 1. Tran 1 Karen Tran Dr. Jacob R. Grohs ENGE 4994 8 May 2016 The Effects of Statics Credit on Future Mechanics Courses Objective The goal for this course (ENGE 4994) during the Spring 2016 semester was to hard-code raw data, provided by Virginia Polytechnic Institute and State University, containing a large subset of students taking statics and mechanics courses for a select portion of time (a few years). The data obtained was 165,543 rows of de-identified transcripts, with each student identified by a unique number. The raw data was then converted and coded to a Statistical Package for the Social Sciences (SPSS) file by Dr. Jacob R. Grohs. The true objective was to run different statistical analysis tests on the data to observe and quantify how transferring statics credit may affect the performance of a student in future mechanics courses. However, because the data was quite large, SPSS crashed multiple times trying to run certain tests. This resulted into restructuring the data into a much more powerful and stronger program that could handle these statistical tests: R. Restructuring the data in another programming language (R) became the initial and prominent objective of the course. Of course, the end goal remained to investigate and understand how transferring statics credit affected future mechanics courses such as deforms and dynamics. Other factors that were to be considered were the amount of times a student took a course, how many other credits were they taking during the same semester, their GPA, their
  • 2. Tran 2 major, and much more. There was an endless amount of questions to be proposed and answered using statistical analysis. Overarching Challenge The real overarching challenge was learning how to code in R. Throughout the experience of a typical engineering student at Virginia Polytechnic Institute and State University, the most programming and coding knowledge through degree courses is standard MATLAB (Mathematica) and basic Java. Unless the student is a Computer Science or Computer Engineering major, it poses a slight disadvantage for students of other majors. Restructuring the data into R became task that constantly needed research, Google, manuals, online guides, and YouTube tutorials. The goal was to format each row of data to a unique student number identified as the following: {student_id, student_admit_type, degree1, degree2, degree3, degree4, class1, class2, etc.1 }. Ideally, the goal was to be able to simply call out a unique student or specific category and retrieve all of the information necessary and needed for statistical analysis. The desired statistical analysis tests such as comparisons, hypothesis testing (p-values), and t-tests were going to be the stepping stones to understanding performance in future courses after taking statics (at Virginia Tech or somewhere else). 1 There were 26 different classes.
  • 3. Tran 3 Key Progress While there were many roadblocks and struggles, there were also many movements of progression during the semester. I was able to sort and understand the raw data in a short amount of time (a few weeks) as well as organize and manipulate the data to structure it aesthetically. The original data frame (mydata), at 165,543 rows, was condensed to unique student identifiers in a new data frame (myfinaldata) at 23,364 rows. The next few columns of myfinaldata were built from left to right and contained headers labeled: student_admit_type, degree1, degree2, degree3, and degree4, respectfully. With the header “student_admit_type,” it contained a list of three (freshman or transfer, first term attended, last term attended). The degree headers contained lists of three as well (major, GPA, graduating year). Creating lists within lists was necessary to be able to call out a certain element from a specific header for a distinct student. R: Programming Language Although the raw data was not completely restructured enough to run statistical tests, I learned a great deal about R coding. I was able to code and restructure a good amount of raw data using built in functions and logic. I found it extremely similar to MATLAB, however the syntax was very different and building a data frame was much more extensive than solving a mathematics problem. By learning how to use functions such as naming and assigning variables, “unique,” “str,” and “as.list,” I was able to make lists within each column and could call out pieces (elements) of data for a certain student. This was helpful in a sense that it was possible to
  • 4. Tran 4 isolate certain variables (students, degrees, freshman/transfer) and manipulate them for future analyses.2 Remaining Challenges Still, there are many hurdles to overcome. For students who only had one degree, the placeholders for degree2, degree3, and degree4 were replaced with “NA” for each element (9 NA’s). This created an immense amount of unnecessary space in the data frame which could have been easily fixed by creating a list that could have an unlimited amount of elements within the same column (appending to a list). For example, taken from myfinaldata, there are two students (Student 1 and Student 17) who have a different number of degrees. The data currently looks like this: • [Student1], [Freshman,199909,2015012], [MATH,2.18174603,199907,NA,NA,NA,NA,NA,NA,NA,NA,NA] • [Student17], [Freshman,199809,201401], [CE,2.89947644,200301,ME,2.89947644,201201, NA,NA,NA,NA,NA,NA]. Ideally the rows should project and display like this: • [Student1], [Freshman,199909,2015012], [MATH,2.18174603,199907] • [Student17], [Freshman,199809,201401], [CE,2.89947644,200301,ME,2.89947644,201201]. The list should close and end if there is not anymore information applicable to the unique student. If a student only had one degree, the third header should only contain a single list of three. If a student only had two degrees, the third header should contain two lists of three (a total 2 The code for myfinaldata and screenshot demonstrating how to call out a unique student is provided at the end of this document.
  • 5. Tran 5 of six elements in the degree category). And the pattern should continue for students who had three and four degrees (nine and 12 elements, respectively). Another major challenge that needs to be tackled is a way to enter the data into R so that it can read it line by line. There needs to be a method (while or for loop) to only enter information for each unique student if they have taken a certain class. Because there are 26 classes, it would be redundant to have NA for 25 classes if a student only took one class.
  • 6. Tran 6 Code3 : getwd() mydata <- read.csv("UG Mechanics Raw Transcript Data from IR.csv", header = TRUE) View(mydata) str(mydata) length(unique(mydata$new_id)) myfinaldata = NULL myfinaldata <- unique(mydata[,c("new_id","student_admit_type","first_term_attend","last_term_attend")]) View(myfinaldata) column2_vector1 <- as.vector(myfinaldata$student_admit_type) column2_vector2 <- as.vector(myfinaldata$first_term_attend) column2_vector3 <- as.vector(myfinaldata$last_term_attend) newvariable = NULL newvariable$student_admit_type <- paste(column2_vector1,column2_vector2,column2_vector3, sep = ",") myfinaldata <- data.frame(myfinaldata,newvariable) myfinaldata$student_admit_type <- NULL myfinaldata$first_term_attend <- NULL myfinaldata$last_term_attend <- NULL names(myfinaldata)[1] = "student_id" names(myfinaldata)[2] = "student_admit_type" myfinaldata$student_admit_type <- as.character(myfinaldata$student_admit_type) str(myfinaldata$student_admit_type) as.list(strsplit(myfinaldata$student_admit_type, ",")[[1]]) x <- as.list(strsplit(myfinaldata$student_admit_type, ",")) #CAN CALL THE ELEMENTS OF COLUMN 2 (STUDENT ADMIT TYPE) mydata[mydata==""] <- NA View(mydata) testnull = NULL testnull = unique(mydata[,c("new_id","degree1_major","degree1_gpa","degree1_term")]) myfinaldata <- data.frame(myfinaldata,testnull) myfinaldata$new_id <- NULL column3_vector1 <- as.character(myfinaldata$degree1_major) column3_vector2 <- as.character(myfinaldata$degree1_gpa) column3_vector3 <- as.character(myfinaldata$degree1_term) newvariable3 = NULL newvariable3$degrees <- paste(column3_vector1,column3_vector2,column3_vector3,sep=",") myfinaldata <- data.frame(myfinaldata,newvariable3) myfinaldata$degree1_major <- NULL myfinaldata$degree1_gpa <- NULL myfinaldata$degree1_term <- NULL 3 This code was run in RStudio.
  • 7.
      names(myfinaldata)[3] = "degree1"
      myfinaldata$degree1 <- as.character(myfinaldata$degree1)
      str(myfinaldata$degree1)
      as.list(strsplit(myfinaldata$degree1, ",")[[1]])
      x2 <- as.list(strsplit(myfinaldata$degree1, ",")) #CAN CALL COLUMN 3 (1ST DEGREE)
      # Column 4: combine major, GPA, and term of the second degree
      col4 = NULL
      col4 = unique(mydata[,c("new_id","degree2_major","degree2_gpa","degree2_term")])
      myfinaldata <- data.frame(myfinaldata,col4)
      myfinaldata$new_id <- NULL
      column4_vector1 <- as.character(myfinaldata$degree2_major)
      column4_vector2 <- as.character(myfinaldata$degree2_gpa)
      column4_vector3 <- as.character(myfinaldata$degree2_term)
      newvariable4 = NULL
      newvariable4$degree2 <- paste(column4_vector1,column4_vector2,column4_vector3,sep=", ")
      myfinaldata <- data.frame(myfinaldata,newvariable4)
      myfinaldata$degree2_major <- NULL
      myfinaldata$degree2_gpa <- NULL
      myfinaldata$degree2_term <- NULL
      myfinaldata$degree2 <- as.character(myfinaldata$degree2)
      str(myfinaldata$degree2)
      as.list(strsplit(myfinaldata$degree2, ",")[[1]])
      x3 <- as.list(strsplit(myfinaldata$degree2, ",")) #CAN CALL COLUMN 4 (2ND DEGREE)
      # Column 5: combine major, GPA, and term of the third degree
      col5 = NULL
      col5 = unique(mydata[,c("new_id","degree3_major","degree3_gpa","degree3_term")])
      myfinaldata <- data.frame(myfinaldata,col5)
      myfinaldata$new_id <- NULL
      column5_vector1 <- as.character(myfinaldata$degree3_major)
      column5_vector2 <- as.character(myfinaldata$degree3_gpa)
      column5_vector3 <- as.character(myfinaldata$degree3_term)
      newvariable5 = NULL
      newvariable5$degree3 <- paste(column5_vector1,column5_vector2,column5_vector3,sep=", ")
      myfinaldata <- data.frame(myfinaldata,newvariable5)
      myfinaldata$degree3_major <- NULL
      myfinaldata$degree3_gpa <- NULL
      myfinaldata$degree3_term <- NULL
      myfinaldata$degree3 <- as.character(myfinaldata$degree3)
      str(myfinaldata$degree3)
      as.list(strsplit(myfinaldata$degree3, ",")[[1]])
      x4 <- as.list(strsplit(myfinaldata$degree3, ",")) #CAN CALL COLUMN 5 (3RD DEGREE)
      col6 = NULL
  • 8.
      # Column 6: combine major, GPA, and term of the fourth degree
      col6 = unique(mydata[,c("new_id","degree4_major","degree4_gpa","degree4_term")])
      myfinaldata <- data.frame(myfinaldata,col6)
      myfinaldata$new_id <- NULL
      column6_vector1 <- as.character(myfinaldata$degree4_major)
      column6_vector2 <- as.character(myfinaldata$degree4_gpa)
      column6_vector3 <- as.character(myfinaldata$degree4_term)
      newvariable6 = NULL
      newvariable6$degree4 <- paste(column6_vector1,column6_vector2,column6_vector3,sep=", ")
      myfinaldata <- data.frame(myfinaldata,newvariable6)
      myfinaldata$degree4_major <- NULL
      myfinaldata$degree4_gpa <- NULL
      myfinaldata$degree4_term <- NULL
      myfinaldata$degree4 <- as.character(myfinaldata$degree4)
      str(myfinaldata$degree4)
      as.list(strsplit(myfinaldata$degree4, ",")[[1]])
      x5 <- as.list(strsplit(myfinaldata$degree4, ",")) #CAN CALL COLUMN 6 (4TH DEGREE)
    Screenshots:
    Figure 1. mydata, myfinaldata - Running Code with Two Outputs
  • 9. Figure 2. mydata - Raw Data Output
    Figure 3. myfinaldata - Manipulated Data Output
  • 10. Figure 4. myfinaldata - Calling Out Select Students and Categories
    Analysis of Figure 4
      • x = student_admit_type
      • x2 = degree1
      • x3 = degree2
      • x4 = degree3
      • x5 = degree4
    Interpretation:
      • x2[[2]] = retrieve all information of degree1 for student number two
      • x2[[2]][[1]] = retrieve the first element of degree1 for student number two
      • x2[[3]] = retrieve all information of degree1 for student number three
      • x2[[3]][[2]] = retrieve the second element of degree1 for student number three
      • x3[[3]] = retrieve all information of degree2 for student number three
      • x[[5]] = retrieve all information of the student_admit_type for student number five
      • x[[5]][[1]] = retrieve the first element of the student_admit_type for student number five
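The indexing pattern in the interpretation above can be reproduced on a small example. This is a minimal sketch on invented toy values, not the report's data: it pastes three degree fields into one comma-separated string per student, splits that string again with strsplit(), and then retrieves elements with double-bracket indexing.

```r
# Toy data; the IDs, majors, GPAs, and terms are invented for illustration only.
toy <- data.frame(
  student_id    = c(101, 102, 103),
  degree1_major = c("ME", "CE", "AE"),
  degree1_gpa   = c(3.2, 3.5, 2.9),
  degree1_term  = c("201501", "201601", "201509"),
  stringsAsFactors = FALSE
)

# Combine the three degree fields into one comma-separated string per student.
toy$degree1 <- paste(toy$degree1_major, toy$degree1_gpa, toy$degree1_term, sep = ",")

# strsplit() returns a list with one character vector per student.
x2 <- strsplit(toy$degree1, ",")

x2[[2]]        # all degree1 fields for the second student
x2[[2]][[1]]   # first field (major) for the second student
x2[[3]][[2]]   # second field (GPA, stored as text) for the third student
```

Two details are worth keeping in mind with this approach. The first index selects the row position in the data frame, not the student's ID number. Also, paste() converts every field to text, so a missing value becomes the string "NA", and GPAs must be converted back with as.numeric() before any arithmetic.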