SlideShare a Scribd company logo
1 of 4
Download to read offline
et al.

The Journal of Animal & Plant Sciences, 21(2): 2011, Page: J. Anim. Plant Sci. 21(2):2011
182-185
ISSN: 1018-7081

USING OF FACTOR ANALYSIS SCORES IN MULTIPLE LINEAR REGRESSION
MODEL FOR PREDICTION OF KERNEL WEIGHT IN ANKARA WALNUTS
E. Sakar, S. Keskin* and H. Unver**
Deptt. of Horticulture, Faculty of Agriculture Sanluirfa, Turkey, *Deptt. of Biostatistics,
Faculty of Medicine Yuzuncu Yil University, Van, Turkey, ** Kalecik Vocational School, Ankara, Turkey
Corresponding author e-mail: ebru.sakar09@gmail.com

ABSTRACT
Kernel weight is important for plant breeders to select high productive plants. The determination of relationships
between kernel weight and some fruit-kernel characteristics may provide necessary information for plant breeders in
selection programs. In the present study, the relationships between kernel weight (KW) and 7 fruit-kernel characteristics:
Fruit Length, (FL,), Fruit Width (FW) Fruit Height (FH) Fruit Weight (FWe) Shell Thickness (ST), Kernel Ratio (KR)
and Filled-firm Kernel Raito (FKR,), were examined by the combination of factor and multiple linear regression
analyses. Firstly, factor analysis was used to reduce large number of explanatory variables, to remove multicolinearty
problems and to simplify the complex relationships among fruit-kernel characteristics. Then, 3 factors having Eigen
values greater than 1 were selected as independent or explanatory variables and 3 factor scores coefficients were used for
multiple linear regression analysis. As a result, it was found that three factors formed by original variables had
significant effects on kernel weight and these factors together have accounted for 85.9 % of variation in kernel weight.
Key words: Walnut, communality, eigenvalues, varimax rotation, determination coefficient

INTRODUCTION

MATERIALS AND METHODS

The determination of the relationships between
kernel weight and fruit characteristics plays an important
role in plant breeding research. Multiple regression and
factor analyses have been used to interpret the
multivariate relationships between kernel weight and
fruit-kernel characteristics. This statistical tool is useful
for predicting assumed dependent variable. Factor
analysis is applied to a single set of variables to discover
which variables are relatively independent of one another.
It reduces many variables to a few factors. It also
produces several linear combinations of observed
variables which are called as factors. The factors
summarize the pattern of correlations in the observed
data. Because there are normally fewer factors than
observed variables and because factor scores are nearly
uncorrelated, use of factor scores in other analyses may
be very helpful (Tabachnick and Fidell 2001).
The main objective of the present study was;
using 3 a multivariate statistical approach, factor analysis,
to
classify predictor variables according to
interrelationships and to predict kernel weight in Ankara
walnuts. For this purpose, factor analysis scores of 7
fruit-kernel characteristics were used as independent
variables in multiple linear regression model for
prediction of kernel weight.

Data on characteristics: Fruit Length, (FL), Fruit
Width (FW) Fruit Height (FH) Fruit Weight (FWe) Shell
Thickness (ST), Kernel Weight (KW), Kernel Ratio (KR)
and Filled-firm Kernel Raito (FKR), were collected from
365 Ankara walnut samples.
Statistical analysis: Kolmogorov-Simirnov normality
test was applied for all variables. After normality test, it
was determined that all variables were normally
distributed. Factor analysis was performed on 7 fruitkernel characteristics to rank their relative significance
and to describe their interrelation patterns to kernel
weight.
Factor analysis: For the factor analysis, basic factor
analysis equation can be represented in matrix form as:
Zpx1 = pxmFmx1 + px1: where Z is a px1 vector of
variables,  is a pxm matrix of factor loadings F is a mx1
vector of factors and  is a px1 vector of error (unique or
specific) factors (Sharma, 1996). It will be assumed that
factors were not correlated with the error components.
Because of differences in were the units of each variables
used in factor analysis, variables were standardized and
correlation matrix of variables was used to obtain Eigen
values. Loadings were correlation coefficients between
variables and factors. Varimax rotation was used to
facilitate interpretation of factor loadings (Lik).
Coefficients (Cik), were used to obtain factor scores for
selected factors. Factors with Eigen values greater than 1

182
et al.

J. Anim. Plant Sci. 21(2):2011

out of 8 factors were employed in multiple regression
analysis (Sharma, 1996; Tabachnick and Fidell, 2001;
Johnson and Wichern, 2002).

variables and corresponding factors. The greater loading,
the more the variables is pure measure of factor. For
instance, FL, FW, FH and FWe which showed the highest
correlation with Factor 1 were considered as a group.
Similarly, factor 3 showed highest correlation with only
FKR.
The highest values of communalities indicated
that the variances of variables were efficiently reflected
in multiple regression analysis. All the 7 variables were
included in the three selected factors. But only some of
variables possessed high loads within each factor. FL,
FW, FH and FWe possessed the highest loads in Factor 1,
KR and ST in Factor 2, while FKR in Factor 3. Factors
were interpreted from the variables that werehighly
correlated with them. Thus, the first factor primarily has
Fruit measurements; the second factor has primarily
Kernel and Shell measurements while the third factor had
only FKR. For factor 1, samples which scored high fruit
measurements tended to assign high value to FL, FW, FH
and FWe.
Factor score coefficients in Table 3 were used to
obtain Factor score values. Factor score values for
selected three factors were used as independent variables
in multiple linear regression analysis to determine
significant factors for kernel weight. All of the selected
factors (Factor 1, 2 and 3) were found to have significant
linear relationships with kernel weight (p≤0.01). The 85.9
% of variance in kernel weight was explained by Factor
1, 2 and 3 (Table-4).
As seen from Table 4, all three factors had
positive and significant effect on kernel weight. Thus,
kernel weight would to increase when the values of
Factors scores increase. Increase in significant variables
in Factor 1, namely, FL, FW, FH and FWe increase in
kernel weight. Similarly, increasing in Factor results in 3
increases kernel weight. On the other hand, an increase in
significant variables in factor 2 scores, indirectly
increases of KR and decreases ST would bring an
increase in kernel weight.
There is no study included the same variable and
statistical model as in our study. Therefore the results
were discussed with indirectly related studies. Sen (1983)
pointed out that correlation coefficients between FW and
KR as well as between FW and KR were very high in
walnut. In addition, Sen (1985) stated that there is high
significant correlation between KW and FL. Akça and
Sen (1992) underlined that there were statistically
significant correlations between KW and FL, FW, KR.
Firouz and Bayazid (2003) indicated that the
correlation between average small diameter and average
seed weight, kernel weight and percentage, shell weight
was positive and significant. The correlation between
average seed weight, and average kernel weight and
percentage, shell weight and thickness was also positive
and significant. There were significant and positive
correlations between average kernel weight and average

Multiple linear regression analysis: Score values of
selected factors were considered as independent variables
for predicting kernel weight.
The regression equation is presented as;
KW = a + b1FS1 + b2FS2 + b3FS3 + e
where ‘a’ is regression constant (its value is zero), ‘b 1’,
‘b2’ and b3 are regression coefficients of Factor Scores
(FS). FS is factor scores and e is the error term of the
regression model. Regression coefficients were tested by
using t test. Determination coefficient (R2) was used as
predictive success criteria for regression model (Draper
and Smith 1998). All data were analyzed using
MINITAB (ver:14) statistical package (Anonymous,
2000).

RESULTS AND DISCUSSION
Descriptive statistics and Pearson correlation
coefficients for all characteristics are presented in Table 1
and Table 2, respectively. Since most of the correlation
coefficients among variables were significant (P≤0.01 or
P≤0.05). Correlation coefficients may be factorable.
Factor analysis revealed that 3 of the 7 factors
have Eigen values greater than one and were selected as
independent variables for multiple regression model
(Tabachnick and Fidell, 2001; Johnson and Wichern,
2002).
Three selected factors explained 84.84 %
(5.939/7) of the total variation of variables in factor
analysis. Furthermore, communality values for variables
were high. For example, communality for FWe was
94.0%, indicating that 94.0% of the variance in FWe is
accounted for by Factor 1, 2 and 3. The proportion of
variance in the set of variables accounted for by a factor
is the sum of square loading for the factor (SSL)
(variance of factor) divided by the number of variables (if
rotation is orthogonal).
The proportion of variance is (3.187/7) = 0.4552
i.e for the first factor.45.52 % of the variance in the
variables is accounted for by the first factor. The second
and third factor accounts for 23.63 and 15.69 % of total
variation, respectively. Because rotation is orthogonal,
the three factors together accounted for 85.84 % of the
variances in variables. The proportion of variance in the
solution accounted for by a factor, in other words the
proportion of covariance, is the Sum of Loadings for the
factor divided by the sum of communalities.
For the selected three Factors, factor loading and
factor score coefficients are presented in Table 3. After
orthogonal rotation, the values of loading are correlations
between variables and corresponding factors. The bold
marked loads indicate the highest correlations between

183
et al.

J. Anim. Plant Sci. 21(2):2011

shell weight and kernel percentage. The correlation
between average shell weight and average shell thickness
was significant and positive, whereas its correlation with
average kernel percentage was significant and negative.
The results of this study are largely in agreement
with those of previous studies. Yang et al. (2001)
determined nut width, average nut weight, nut shell
thickness, nut kernel percent age, per kernel weight, total
fat content of kernel, total protein content and kernel
yield per m2 tree-crown projection area by principal
component analysis. According to more than 86.29% of
the cumulative variance proportion, the results proposed
four principal components and its function equations
which reflected the main economic characters of walnut.
In the present study three factors together have accounted
for 84.84 % of variation in kernel weight.

Table

1. Descriptive
characteristics

Fruit Length ( FL. mm)
Fruit width (FW. mm)
Fruit height (FH. mm)
Fruit weight (FWe. g)
Kernel weight (KW. g)
Kernel Ratio (KR. %)
Shell Thickness (ST.
mm)
Filled-firm Kernel
Raito (FKR. %)

statistics

for

fruit-kernel

Mean
35.679
29.549
30.975
10.032
4.630
46.450

SE
0.226
0.151
0.173
0.136
0.063
0.341

Mini. Maxi.
26.52 49.85
22.39 39.29
23.17 44.40
4.3
20.2
1.9
8.6
2.86 70.16

1.407

0.013

0.69

2.17

89.950

0.916

0

100

Table 2. Pearson correlation coefficients among all characteristics
FL
FW
FH
FWe
KW
KR
ST
FKR

FL
1
0.661**
0.575**
0.705**
0.654**
-0.106*
0.146**
0.155**

*: p<0.05. **. P<0.01

FW

FH

FWe

KW

KR

ST

FKR

1
0.879**
0.805**
0.793**
-0.046
0.062
0.140**

1
0.823**
0.796**
-0.043
0.120*
0.169**

1
0.867**
-0.218**
0.420**
0.382**

1
0.246**
0.080
0.368**

1
-0.576**
0.030

1
0.132*

1

Table 3. Results of Factor analysis
Variables
FL
FW
FH
FWe
KR
ST
FKR
Variance

Factor Score Coefficients (cik)
Factor1
Factor2
Factor3
0.264
-0.007
0.074
0.331
-0.092
0.129
0.312
-0.072
0.079
0.237
0.099
-0.154
0.037
-0.561
-0.170
-0.057
0.538
-0.105
-0.102
-0.049
-0.944

complex relationships among variables related to kernel
weight. Multivariate models or combining multivariate
and univariate models may be capable to determine
relationships among large number variables. In the
present study, the relationships between kernel weight
and some fruit-kernel characteristics have been examined
using both multivariate and univariate approaches. Thus
it was found that kernel weight can be predicted 85.6 %
successfully by using of some fruit-kernel characteristics.
The success of this approach has not been compared with
other models due to lack of the modeling studies on
kernel weight by using similar variables and approaches.
Furthermore the present model may be useful to eliminate
multicolinearity problems among large number of

Table 4. Results of multiple regression analysis
Predictor
F1
F2
F3

Coef
SE
t
0.856
0.020
43.29
0.161
0.020
-8.17
0.318
0.020
-16.10
S = 0.377
R2-Sq = 85.9%
2
R -Sq(adj) = 85.8% (P<0.01)

Rotated Factor Loadings (lik) and Communalities
Factor1
Factor2
Factor3
Communality
0.091
-0.049
0.651
0.800
-0.028
-0.008
0.903
0.950
0.005
-0.057
0.856
0.923
0.291
-0.302
0.940
0.874
-0.040
-0.120
0.801
0.886
0.104
-0.169
0.814
-0.881
0.122
0.017
0.974
0.979
3.187
1.654
1.098
5.939

p
0.001
0.001
0.001

Conclusion: Various univariate models and approaches
can be used to determine relationships between kernel
weight and some fruit-kernel characteristics for selecting
high productive species in plant breeding programs. But
sometimes these models or approaches may fail to define

184
et al.

J. Anim. Plant Sci. 21(2):2011

variables. In addition, this approach gets easy doing
multiple regression models by reducing large number of
variables and interpretation of multiple regression model
results by removing indirect effect of related explanatory
variables. The present study is one of the pioneer and
studies and hopefully, will be useful for future studies of
similar nature.

Sen, S. M., (1983). Cevizlerde Önemli Meyve Kalite
Faktörleri Arasındaki _liskiler. II. Meyve
Agırlıgı ile Kabuk Kalınlıgı ve Kabuk Kırılması
Arasındaki _liskiler. Atatürk Üniv. Ziraat Fak.
Derg., 14(3): 17-28,
Sen, S. M., (1985). Cevizlerde Kabuk Kalınlıgı, Kabuk
Kırılma Direnci, Kabukta Yapısma ve Kabuk
Dikine Kırılma Direnci ile Diger Bazı Kalite
Faktörleri Arasındaki _liskiler. Doga Bilim
Derg., 9(1): 10-24.
Sharma, S. (1996). Applied Multivariate Techniques,
John Wiley and Sons, Inc., New York, (USA).
493 pp
Tabachnick, B. G. and L. S Fidell (2001). Using
Multivariate Statistics. Allyn and Bacon A
Pearson Education Company Boston, (USA).
966 pp.
Yang, J., B. W. Guo and G, Q. Zhang (2001). The studies
of principal component analysis on the main
economic character and superior variety
selection of walnut. J. Agri. University of Hebei.
DOI: cnki: ISSN:1000-1573.0.2001-04-012.

REFERENCES
Akça, Y. and S. M., Sen (1992). Cevizlerde (Juglans
regia L.) Onemli Seleksiyon Kriterleri arasındaki
Iliskiler. YY Ü. ZF Derg., 2(2): 67- 75.
Anonymous (2000). MINITAB Statistical software,
Minitab Inc. USA.
Draper, N. R. and H. Smith (1998). Applied Regression
Analysis, John Wiley and Sons, Inc., New
York, (USA). 706 pp
Firouz, M., and Y. Bayazid (2003) Evaluatıon of
Walnut's Seed Propertıes in Kurdestan Province.
Iranian J. Forest and Poplar Res. 11(4): 565-584.
Johnson, R. A. and D.W Wichern (2002). Applied
Multivariate Statistical Analysis. Prentice Hall,
upper Saddle River, New Jersey, (USA). 766 pp

185

More Related Content

What's hot

Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modeling
Salford Systems
 
Predicting the age of abalone
Predicting the age of abalonePredicting the age of abalone
Predicting the age of abalone
hyperak
 
Odds ratio dystocia_holstein_cows_iraq
Odds ratio dystocia_holstein_cows_iraqOdds ratio dystocia_holstein_cows_iraq
Odds ratio dystocia_holstein_cows_iraq
Firas Abdulateef
 
Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08
Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08
Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08
Lloyd Morgan
 

What's hot (19)

v115n06p523
v115n06p523v115n06p523
v115n06p523
 
Marginal Regression for a Bi-variate Response with Diabetes Mellitus Study
Marginal Regression for a Bi-variate Response with Diabetes Mellitus StudyMarginal Regression for a Bi-variate Response with Diabetes Mellitus Study
Marginal Regression for a Bi-variate Response with Diabetes Mellitus Study
 
Methylation Subtypes of EAC/BE
Methylation Subtypes of EAC/BE Methylation Subtypes of EAC/BE
Methylation Subtypes of EAC/BE
 
Orlu_R N Poster
Orlu_R N PosterOrlu_R N Poster
Orlu_R N Poster
 
Fahim Yasin
Fahim YasinFahim Yasin
Fahim Yasin
 
14; allometry in chelonians
14; allometry in chelonians14; allometry in chelonians
14; allometry in chelonians
 
Development and Validation of prediction for estimating resting energy expend...
Development and Validation of prediction for estimating resting energy expend...Development and Validation of prediction for estimating resting energy expend...
Development and Validation of prediction for estimating resting energy expend...
 
A meta analysis of the use of genetically modified cotton and its conventiona...
A meta analysis of the use of genetically modified cotton and its conventiona...A meta analysis of the use of genetically modified cotton and its conventiona...
A meta analysis of the use of genetically modified cotton and its conventiona...
 
Spatial_final
Spatial_finalSpatial_final
Spatial_final
 
Aijrfans14 285
Aijrfans14 285Aijrfans14 285
Aijrfans14 285
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modeling
 
final paper
final paperfinal paper
final paper
 
09535
0953509535
09535
 
Predicting the age of abalone
Predicting the age of abalonePredicting the age of abalone
Predicting the age of abalone
 
Odds ratio dystocia_holstein_cows_iraq
Odds ratio dystocia_holstein_cows_iraqOdds ratio dystocia_holstein_cows_iraq
Odds ratio dystocia_holstein_cows_iraq
 
Theorist
TheoristTheorist
Theorist
 
Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08
Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08
Published Version MILHAM & MORGAN NEW EMF EXPOSURE METRIC 5-29-08
 
Comparing the CR-3 Injury Severity Categories to Injury Severity Metrics
Comparing the CR-3 Injury Severity Categories to Injury Severity MetricsComparing the CR-3 Injury Severity Categories to Injury Severity Metrics
Comparing the CR-3 Injury Severity Categories to Injury Severity Metrics
 
Markrecapture versus-lengthfrequency-based-methods-evaluation-using-a-marine-...
Markrecapture versus-lengthfrequency-based-methods-evaluation-using-a-marine-...Markrecapture versus-lengthfrequency-based-methods-evaluation-using-a-marine-...
Markrecapture versus-lengthfrequency-based-methods-evaluation-using-a-marine-...
 

Viewers also liked

CC Portfolio-small
CC Portfolio-smallCC Portfolio-small
CC Portfolio-small
Joey Lam
 
Sw homework #9
Sw homework #9Sw homework #9
Sw homework #9
s1200016
 
Responding to common patterns of student thinking in mathematics
Responding to common patterns of student thinking in mathematicsResponding to common patterns of student thinking in mathematics
Responding to common patterns of student thinking in mathematics
xsmseo
 
CC Portfolio-small
CC Portfolio-smallCC Portfolio-small
CC Portfolio-small
Joey Lam
 
Child friendlysummaries-090419003610-phpapp01
Child friendlysummaries-090419003610-phpapp01Child friendlysummaries-090419003610-phpapp01
Child friendlysummaries-090419003610-phpapp01
Hemant Raghuwanshi
 
Comparative Analysis Of Mutual Funds
Comparative Analysis Of Mutual Funds Comparative Analysis Of Mutual Funds
Comparative Analysis Of Mutual Funds
Nikesh Gupta
 

Viewers also liked (20)

Introduction to doe
Introduction to doeIntroduction to doe
Introduction to doe
 
Management Strategies to Facilitate Continual Quality Improvement
Management Strategies to Facilitate Continual Quality ImprovementManagement Strategies to Facilitate Continual Quality Improvement
Management Strategies to Facilitate Continual Quality Improvement
 
CC Portfolio-small
CC Portfolio-smallCC Portfolio-small
CC Portfolio-small
 
A christmas wish list
A christmas wish listA christmas wish list
A christmas wish list
 
Ssari
SsariSsari
Ssari
 
Affects of
Affects ofAffects of
Affects of
 
Competència lectora 1
Competència  lectora 1Competència  lectora 1
Competència lectora 1
 
Las reformas-estructurales-durante-el-gobierno-militar (1)
Las reformas-estructurales-durante-el-gobierno-militar (1)Las reformas-estructurales-durante-el-gobierno-militar (1)
Las reformas-estructurales-durante-el-gobierno-militar (1)
 
Yourprezi1
Yourprezi1Yourprezi1
Yourprezi1
 
Sw homework #9
Sw homework #9Sw homework #9
Sw homework #9
 
Responding to common patterns of student thinking in mathematics
Responding to common patterns of student thinking in mathematicsResponding to common patterns of student thinking in mathematics
Responding to common patterns of student thinking in mathematics
 
Process Validation Guidances: FDA and Global
Process Validation Guidances: FDA and GlobalProcess Validation Guidances: FDA and Global
Process Validation Guidances: FDA and Global
 
CC Portfolio-small
CC Portfolio-smallCC Portfolio-small
CC Portfolio-small
 
Change control and configuration
Change control and configurationChange control and configuration
Change control and configuration
 
Child friendlysummaries-090419003610-phpapp01
Child friendlysummaries-090419003610-phpapp01Child friendlysummaries-090419003610-phpapp01
Child friendlysummaries-090419003610-phpapp01
 
Instalasi Bluetooth Headseat
Instalasi Bluetooth HeadseatInstalasi Bluetooth Headseat
Instalasi Bluetooth Headseat
 
Fda validation inspections
Fda validation inspectionsFda validation inspections
Fda validation inspections
 
Notification Tactics for Improved Notification Tactics For Improved Field Act...
Notification Tactics for Improved Notification Tactics For Improved Field Act...Notification Tactics for Improved Notification Tactics For Improved Field Act...
Notification Tactics for Improved Notification Tactics For Improved Field Act...
 
Comparative Analysis Of Mutual Funds
Comparative Analysis Of Mutual Funds Comparative Analysis Of Mutual Funds
Comparative Analysis Of Mutual Funds
 
Apply Risk Management to Computerized and Automated Systems
Apply Risk Management to Computerized and Automated SystemsApply Risk Management to Computerized and Automated Systems
Apply Risk Management to Computerized and Automated Systems
 

Similar to factore score in reg

applied multivariate statistical techniques in agriculture and plant science 2
applied multivariate statistical techniques in agriculture and plant science 2applied multivariate statistical techniques in agriculture and plant science 2
applied multivariate statistical techniques in agriculture and plant science 2
amir rahmani
 
Statistical analysis of correlated data using generalized estimating equation...
Statistical analysis of correlated data using generalized estimating equation...Statistical analysis of correlated data using generalized estimating equation...
Statistical analysis of correlated data using generalized estimating equation...
Angelina Lessa
 
DIA2016_Comfort_Power Law Scaling AEPs-T43
DIA2016_Comfort_Power Law Scaling AEPs-T43DIA2016_Comfort_Power Law Scaling AEPs-T43
DIA2016_Comfort_Power Law Scaling AEPs-T43
Shaun Comfort
 
Exercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docx
Exercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docxExercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docx
Exercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docx
gitagrimston
 
Relationship between energy consumption and gdp in iran
Relationship between energy consumption and gdp in iranRelationship between energy consumption and gdp in iran
Relationship between energy consumption and gdp in iran
Alexander Decker
 

Similar to factore score in reg (20)

Study of relationship between oil quality traits with agromorphological trait...
Study of relationship between oil quality traits with agromorphological trait...Study of relationship between oil quality traits with agromorphological trait...
Study of relationship between oil quality traits with agromorphological trait...
 
Application of multivariate principal component analysis on dimensional reduc...
Application of multivariate principal component analysis on dimensional reduc...Application of multivariate principal component analysis on dimensional reduc...
Application of multivariate principal component analysis on dimensional reduc...
 
Assessment of Water Quality in Imo River Estuary Using Multivariate Statistic...
Assessment of Water Quality in Imo River Estuary Using Multivariate Statistic...Assessment of Water Quality in Imo River Estuary Using Multivariate Statistic...
Assessment of Water Quality in Imo River Estuary Using Multivariate Statistic...
 
applied multivariate statistical techniques in agriculture and plant science 2
applied multivariate statistical techniques in agriculture and plant science 2applied multivariate statistical techniques in agriculture and plant science 2
applied multivariate statistical techniques in agriculture and plant science 2
 
Statistical analysis of correlated data using generalized estimating equation...
Statistical analysis of correlated data using generalized estimating equation...Statistical analysis of correlated data using generalized estimating equation...
Statistical analysis of correlated data using generalized estimating equation...
 
Advanced biometrical and quantitative genetics akshay
Advanced biometrical and quantitative genetics akshayAdvanced biometrical and quantitative genetics akshay
Advanced biometrical and quantitative genetics akshay
 
Overview of Multivariate Statistical Methods
Overview of Multivariate Statistical MethodsOverview of Multivariate Statistical Methods
Overview of Multivariate Statistical Methods
 
DIA2016_Comfort_Power Law Scaling AEPs-T43
DIA2016_Comfort_Power Law Scaling AEPs-T43DIA2016_Comfort_Power Law Scaling AEPs-T43
DIA2016_Comfort_Power Law Scaling AEPs-T43
 
Estimation of genetic parameter for reproduction traits of Dairy cattle in Et...
Estimation of genetic parameter for reproduction traits of Dairy cattle in Et...Estimation of genetic parameter for reproduction traits of Dairy cattle in Et...
Estimation of genetic parameter for reproduction traits of Dairy cattle in Et...
 
Stat214_Chapter 3.pdf
Stat214_Chapter 3.pdfStat214_Chapter 3.pdf
Stat214_Chapter 3.pdf
 
Jcb 2005-12-1103
Jcb 2005-12-1103Jcb 2005-12-1103
Jcb 2005-12-1103
 
quantitative structure activity relationship studies of anti proliferative ac...
quantitative structure activity relationship studies of anti proliferative ac...quantitative structure activity relationship studies of anti proliferative ac...
quantitative structure activity relationship studies of anti proliferative ac...
 
A Bifactor and Itam Response Theory Analysis of the Eating Disorder Inventory-3
A Bifactor and Itam Response Theory Analysis of the Eating Disorder Inventory-3A Bifactor and Itam Response Theory Analysis of the Eating Disorder Inventory-3
A Bifactor and Itam Response Theory Analysis of the Eating Disorder Inventory-3
 
MemoryPDFJNT
MemoryPDFJNTMemoryPDFJNT
MemoryPDFJNT
 
Exercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docx
Exercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docxExercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docx
Exercises for Chapter 21RESEARCH ARTICLESource Rakel, B., &.docx
 
BIOMERTICAL TECHNIQUE FOR STABILITY ANALYSIS SHIV SHANKAR LONIYA 03.pptx
BIOMERTICAL TECHNIQUE FOR STABILITY ANALYSIS SHIV SHANKAR LONIYA 03.pptxBIOMERTICAL TECHNIQUE FOR STABILITY ANALYSIS SHIV SHANKAR LONIYA 03.pptx
BIOMERTICAL TECHNIQUE FOR STABILITY ANALYSIS SHIV SHANKAR LONIYA 03.pptx
 
Final analysis & Discussion_Volen
Final analysis & Discussion_VolenFinal analysis & Discussion_Volen
Final analysis & Discussion_Volen
 
EFA
EFAEFA
EFA
 
Assessment of sustainable food security
Assessment of sustainable food securityAssessment of sustainable food security
Assessment of sustainable food security
 
Relationship between energy consumption and gdp in iran
Relationship between energy consumption and gdp in iranRelationship between energy consumption and gdp in iran
Relationship between energy consumption and gdp in iran
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 

factore score in reg

  • 1. et al. The Journal of Animal & Plant Sciences, 21(2): 2011, Page: J. Anim. Plant Sci. 21(2):2011 182-185 ISSN: 1018-7081 USING OF FACTOR ANALYSIS SCORES IN MULTIPLE LINEAR REGRESSION MODEL FOR PREDICTION OF KERNEL WEIGHT IN ANKARA WALNUTS E. Sakar, S. Keskin* and H. Unver** Deptt. of Horticulture, Faculty of Agriculture Sanluirfa, Turkey, *Deptt. of Biostatistics, Faculty of Medicine Yuzuncu Yil University, Van, Turkey, ** Kalecik Vocational School, Ankara, Turkey Corresponding author e-mail: ebru.sakar09@gmail.com ABSTRACT Kernel weight is important for plant breeders to select high productive plants. The determination of relationships between kernel weight and some fruit-kernel characteristics may provide necessary information for plant breeders in selection programs. In the present study, the relationships between kernel weight (KW) and 7 fruit-kernel characteristics: Fruit Length, (FL,), Fruit Width (FW) Fruit Height (FH) Fruit Weight (FWe) Shell Thickness (ST), Kernel Ratio (KR) and Filled-firm Kernel Raito (FKR,), were examined by the combination of factor and multiple linear regression analyses. Firstly, factor analysis was used to reduce large number of explanatory variables, to remove multicolinearty problems and to simplify the complex relationships among fruit-kernel characteristics. Then, 3 factors having Eigen values greater than 1 were selected as independent or explanatory variables and 3 factor scores coefficients were used for multiple linear regression analysis. As a result, it was found that three factors formed by original variables had significant effects on kernel weight and these factors together have accounted for 85.9 % of variation in kernel weight. Key words: Walnut, communality, eigenvalues, varimax rotation, determination coefficient INTRODUCTION MATERIALS AND METHODS The determination of the relationships between kernel weight and fruit characteristics plays an important role in plant breeding research. Multiple regression and factor analyses have been used to interpret the multivariate relationships between kernel weight and fruit-kernel characteristics. This statistical tool is useful for predicting assumed dependent variable. Factor analysis is applied to a single set of variables to discover which variables are relatively independent of one another. It reduces many variables to a few factors. It also produces several linear combinations of observed variables which are called as factors. The factors summarize the pattern of correlations in the observed data. Because there are normally fewer factors than observed variables and because factor scores are nearly uncorrelated, use of factor scores in other analyses may be very helpful (Tabachnick and Fidell 2001). The main objective of the present study was; using 3 a multivariate statistical approach, factor analysis, to classify predictor variables according to interrelationships and to predict kernel weight in Ankara walnuts. For this purpose, factor analysis scores of 7 fruit-kernel characteristics were used as independent variables in multiple linear regression model for prediction of kernel weight. Data on characteristics: Fruit Length, (FL), Fruit Width (FW) Fruit Height (FH) Fruit Weight (FWe) Shell Thickness (ST), Kernel Weight (KW), Kernel Ratio (KR) and Filled-firm Kernel Raito (FKR), were collected from 365 Ankara walnut samples. Statistical analysis: Kolmogorov-Simirnov normality test was applied for all variables. After normality test, it was determined that all variables were normally distributed. Factor analysis was performed on 7 fruitkernel characteristics to rank their relative significance and to describe their interrelation patterns to kernel weight. Factor analysis: For the factor analysis, basic factor analysis equation can be represented in matrix form as: Zpx1 = pxmFmx1 + px1: where Z is a px1 vector of variables,  is a pxm matrix of factor loadings F is a mx1 vector of factors and  is a px1 vector of error (unique or specific) factors (Sharma, 1996). It will be assumed that factors were not correlated with the error components. Because of differences in were the units of each variables used in factor analysis, variables were standardized and correlation matrix of variables was used to obtain Eigen values. Loadings were correlation coefficients between variables and factors. Varimax rotation was used to facilitate interpretation of factor loadings (Lik). Coefficients (Cik), were used to obtain factor scores for selected factors. Factors with Eigen values greater than 1 182
  • 2. et al. J. Anim. Plant Sci. 21(2):2011 out of 8 factors were employed in multiple regression analysis (Sharma, 1996; Tabachnick and Fidell, 2001; Johnson and Wichern, 2002). variables and corresponding factors. The greater loading, the more the variables is pure measure of factor. For instance, FL, FW, FH and FWe which showed the highest correlation with Factor 1 were considered as a group. Similarly, factor 3 showed highest correlation with only FKR. The highest values of communalities indicated that the variances of variables were efficiently reflected in multiple regression analysis. All the 7 variables were included in the three selected factors. But only some of variables possessed high loads within each factor. FL, FW, FH and FWe possessed the highest loads in Factor 1, KR and ST in Factor 2, while FKR in Factor 3. Factors were interpreted from the variables that werehighly correlated with them. Thus, the first factor primarily has Fruit measurements; the second factor has primarily Kernel and Shell measurements while the third factor had only FKR. For factor 1, samples which scored high fruit measurements tended to assign high value to FL, FW, FH and FWe. Factor score coefficients in Table 3 were used to obtain Factor score values. Factor score values for selected three factors were used as independent variables in multiple linear regression analysis to determine significant factors for kernel weight. All of the selected factors (Factor 1, 2 and 3) were found to have significant linear relationships with kernel weight (p≤0.01). The 85.9 % of variance in kernel weight was explained by Factor 1, 2 and 3 (Table-4). As seen from Table 4, all three factors had positive and significant effect on kernel weight. Thus, kernel weight would to increase when the values of Factors scores increase. Increase in significant variables in Factor 1, namely, FL, FW, FH and FWe increase in kernel weight. Similarly, increasing in Factor results in 3 increases kernel weight. On the other hand, an increase in significant variables in factor 2 scores, indirectly increases of KR and decreases ST would bring an increase in kernel weight. There is no study included the same variable and statistical model as in our study. Therefore the results were discussed with indirectly related studies. Sen (1983) pointed out that correlation coefficients between FW and KR as well as between FW and KR were very high in walnut. In addition, Sen (1985) stated that there is high significant correlation between KW and FL. Akça and Sen (1992) underlined that there were statistically significant correlations between KW and FL, FW, KR. Firouz and Bayazid (2003) indicated that the correlation between average small diameter and average seed weight, kernel weight and percentage, shell weight was positive and significant. The correlation between average seed weight, and average kernel weight and percentage, shell weight and thickness was also positive and significant. There were significant and positive correlations between average kernel weight and average Multiple linear regression analysis: Score values of selected factors were considered as independent variables for predicting kernel weight. The regression equation is presented as; KW = a + b1FS1 + b2FS2 + b3FS3 + e where ‘a’ is regression constant (its value is zero), ‘b 1’, ‘b2’ and b3 are regression coefficients of Factor Scores (FS). FS is factor scores and e is the error term of the regression model. Regression coefficients were tested by using t test. Determination coefficient (R2) was used as predictive success criteria for regression model (Draper and Smith 1998). All data were analyzed using MINITAB (ver:14) statistical package (Anonymous, 2000). RESULTS AND DISCUSSION Descriptive statistics and Pearson correlation coefficients for all characteristics are presented in Table 1 and Table 2, respectively. Since most of the correlation coefficients among variables were significant (P≤0.01 or P≤0.05). Correlation coefficients may be factorable. Factor analysis revealed that 3 of the 7 factors have Eigen values greater than one and were selected as independent variables for multiple regression model (Tabachnick and Fidell, 2001; Johnson and Wichern, 2002). Three selected factors explained 84.84 % (5.939/7) of the total variation of variables in factor analysis. Furthermore, communality values for variables were high. For example, communality for FWe was 94.0%, indicating that 94.0% of the variance in FWe is accounted for by Factor 1, 2 and 3. The proportion of variance in the set of variables accounted for by a factor is the sum of square loading for the factor (SSL) (variance of factor) divided by the number of variables (if rotation is orthogonal). The proportion of variance is (3.187/7) = 0.4552 i.e for the first factor.45.52 % of the variance in the variables is accounted for by the first factor. The second and third factor accounts for 23.63 and 15.69 % of total variation, respectively. Because rotation is orthogonal, the three factors together accounted for 85.84 % of the variances in variables. The proportion of variance in the solution accounted for by a factor, in other words the proportion of covariance, is the Sum of Loadings for the factor divided by the sum of communalities. For the selected three Factors, factor loading and factor score coefficients are presented in Table 3. After orthogonal rotation, the values of loading are correlations between variables and corresponding factors. The bold marked loads indicate the highest correlations between 183
  • 3. et al. J. Anim. Plant Sci. 21(2):2011 shell weight and kernel percentage. The correlation between average shell weight and average shell thickness was significant and positive, whereas its correlation with average kernel percentage was significant and negative. The results of this study are largely in agreement with those of previous studies. Yang et al. (2001) determined nut width, average nut weight, nut shell thickness, nut kernel percent age, per kernel weight, total fat content of kernel, total protein content and kernel yield per m2 tree-crown projection area by principal component analysis. According to more than 86.29% of the cumulative variance proportion, the results proposed four principal components and its function equations which reflected the main economic characters of walnut. In the present study three factors together have accounted for 84.84 % of variation in kernel weight. Table 1. Descriptive characteristics Fruit Length ( FL. mm) Fruit width (FW. mm) Fruit height (FH. mm) Fruit weight (FWe. g) Kernel weight (KW. g) Kernel Ratio (KR. %) Shell Thickness (ST. mm) Filled-firm Kernel Raito (FKR. %) statistics for fruit-kernel Mean 35.679 29.549 30.975 10.032 4.630 46.450 SE 0.226 0.151 0.173 0.136 0.063 0.341 Mini. Maxi. 26.52 49.85 22.39 39.29 23.17 44.40 4.3 20.2 1.9 8.6 2.86 70.16 1.407 0.013 0.69 2.17 89.950 0.916 0 100 Table 2. Pearson correlation coefficients among all characteristics FL FW FH FWe KW KR ST FKR FL 1 0.661** 0.575** 0.705** 0.654** -0.106* 0.146** 0.155** *: p<0.05. **. P<0.01 FW FH FWe KW KR ST FKR 1 0.879** 0.805** 0.793** -0.046 0.062 0.140** 1 0.823** 0.796** -0.043 0.120* 0.169** 1 0.867** -0.218** 0.420** 0.382** 1 0.246** 0.080 0.368** 1 -0.576** 0.030 1 0.132* 1 Table 3. Results of Factor analysis Variables FL FW FH FWe KR ST FKR Variance Factor Score Coefficients (cik) Factor1 Factor2 Factor3 0.264 -0.007 0.074 0.331 -0.092 0.129 0.312 -0.072 0.079 0.237 0.099 -0.154 0.037 -0.561 -0.170 -0.057 0.538 -0.105 -0.102 -0.049 -0.944 complex relationships among variables related to kernel weight. Multivariate models or combining multivariate and univariate models may be capable to determine relationships among large number variables. In the present study, the relationships between kernel weight and some fruit-kernel characteristics have been examined using both multivariate and univariate approaches. Thus it was found that kernel weight can be predicted 85.6 % successfully by using of some fruit-kernel characteristics. The success of this approach has not been compared with other models due to lack of the modeling studies on kernel weight by using similar variables and approaches. Furthermore the present model may be useful to eliminate multicolinearity problems among large number of Table 4. Results of multiple regression analysis Predictor F1 F2 F3 Coef SE t 0.856 0.020 43.29 0.161 0.020 -8.17 0.318 0.020 -16.10 S = 0.377 R2-Sq = 85.9% 2 R -Sq(adj) = 85.8% (P<0.01) Rotated Factor Loadings (lik) and Communalities Factor1 Factor2 Factor3 Communality 0.091 -0.049 0.651 0.800 -0.028 -0.008 0.903 0.950 0.005 -0.057 0.856 0.923 0.291 -0.302 0.940 0.874 -0.040 -0.120 0.801 0.886 0.104 -0.169 0.814 -0.881 0.122 0.017 0.974 0.979 3.187 1.654 1.098 5.939 p 0.001 0.001 0.001 Conclusion: Various univariate models and approaches can be used to determine relationships between kernel weight and some fruit-kernel characteristics for selecting high productive species in plant breeding programs. But sometimes these models or approaches may fail to define 184
  • 4. et al. J. Anim. Plant Sci. 21(2):2011 variables. In addition, this approach gets easy doing multiple regression models by reducing large number of variables and interpretation of multiple regression model results by removing indirect effect of related explanatory variables. The present study is one of the pioneer and studies and hopefully, will be useful for future studies of similar nature. Sen, S. M., (1983). Cevizlerde Önemli Meyve Kalite Faktörleri Arasındaki _liskiler. II. Meyve Agırlıgı ile Kabuk Kalınlıgı ve Kabuk Kırılması Arasındaki _liskiler. Atatürk Üniv. Ziraat Fak. Derg., 14(3): 17-28, Sen, S. M., (1985). Cevizlerde Kabuk Kalınlıgı, Kabuk Kırılma Direnci, Kabukta Yapısma ve Kabuk Dikine Kırılma Direnci ile Diger Bazı Kalite Faktörleri Arasındaki _liskiler. Doga Bilim Derg., 9(1): 10-24. Sharma, S. (1996). Applied Multivariate Techniques, John Wiley and Sons, Inc., New York, (USA). 493 pp Tabachnick, B. G. and L. S Fidell (2001). Using Multivariate Statistics. Allyn and Bacon A Pearson Education Company Boston, (USA). 966 pp. Yang, J., B. W. Guo and G, Q. Zhang (2001). The studies of principal component analysis on the main economic character and superior variety selection of walnut. J. Agri. University of Hebei. DOI: cnki: ISSN:1000-1573.0.2001-04-012. REFERENCES Akça, Y. and S. M., Sen (1992). Cevizlerde (Juglans regia L.) Onemli Seleksiyon Kriterleri arasındaki Iliskiler. YY Ü. ZF Derg., 2(2): 67- 75. Anonymous (2000). MINITAB Statistical software, Minitab Inc. USA. Draper, N. R. and H. Smith (1998). Applied Regression Analysis, John Wiley and Sons, Inc., New York, (USA). 706 pp Firouz, M., and Y. Bayazid (2003) Evaluatıon of Walnut's Seed Propertıes in Kurdestan Province. Iranian J. Forest and Poplar Res. 11(4): 565-584. Johnson, R. A. and D.W Wichern (2002). Applied Multivariate Statistical Analysis. Prentice Hall, upper Saddle River, New Jersey, (USA). 766 pp 185