SlideShare a Scribd company logo
1 of 3
Download to read offline
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 264
OPTIMIZATION OF WORKLOAD PREDICTION BASED ON MAP
REDUCE FRAME WORK IN A CLOUD SYSTEM
V.Sivaranjani1
, R.Jayamala2
1
Student, Pervasive Computing Technology, Bharathidasan Institute of Technology,Tamil Nadu, India
2
Assistant Professor, Computer Science and Engineering, Bharathidasan Institute of Technology, Tamil Nadu, India
Abstract
Nowadays cloud computing is emerging Technology. It is used to access anytime and anywhere through the internet. Hadoop is
an open-source Cloud computing environment that implements the Googletm MapReduce framework. Hadoop is a framework for
distributed processing of large datasets across large clusters of computers. This paper proposes the workload of jobs in clusters
mode using Hadoop. MapReduce is a programming model in hadoop used for maintaining the workload of the jobs. Depend on
the job analysis statistics the future workload of the cluster is predicted for potential performance optimization by using genetic
algorithm.
Key Words: Cloud computing, Hadoop Framework, MapReduce Analysis, Workload
--------------------------------------------------------------------***----------------------------------------------------------------------
1. INTRODUCTION
The large scale data processing is very important aspects of
the multimode cluster setup. It is very challenging problem.
The MapReduce framework [1] is proposed by Google
provides an efficient and scalable solution for working
large-scale data. The basic concept of MapReduce
framework is used to distribute the data among many nodes
and process them in parallel manner. Hadoop is a open-
source implementation of MapReduce framework. Hadoop
use the Yahoo, Facebook, Twitter etc.
The MapReduce consists of the two Phases. 1) Map and 2)
Reduce. The Map is used to split the job into several
independent chunks and each chunks assigned to different
computing data node. In the reduce phase, the data is
aggregated, summarized, filtered or combining the given
data. The result is stored in a Distributed File System.
Hadoop[2] is an open-source implementation of a
MapReduce framework. The components of the MapReduce
framework are 1) Job Tracker, 2) Task Tracker, 3) Name
Node 4) Data Node.
The Name Node stores the file system metadata. Which file
are maps to what block locations and which blocks are
stored on which data node. The data node is where the
actual data resides. All data nodes send the heartbeat
messages to name node every 3 seconds to say data nodes
are alive. If name node does not receive the heartbeat
message from data node for 10 minutes, that data node is
dead. All data node talks each other to rebalance the data,
move and copy. The Job Tracker is used to managing the
Task tracker and resource management that is tracking
resource availability and time management of each job. The
Task tracker is pre-configured a number of tasks and accept
of each task. The Job Tracker consists of Job History. Get
the required information from Job History to predict the
future workload.
This paper describe about work load prediction on map
reduce framework. The chapter 2 describes about System
Architecture Design. Chapter 3 describes about Load
prediction. Chapter 4 describes optimization process.
Chapter 5 describes about Implementation and analysis.
Chapter 6 describes Conclusion and Future work.
2. SYSTEM ARCHITECTURE DESIGN
The Job executes in cluster setup to get the job history
information from the job tracker. The architecture design of
the optimization of workload prediction based on the map
reduce framework in a cloud system.
Fig- 1. Represents the MapReduce framework consists of
different components are Name Node, Job Tracker and Task
Tracker. The Name node stores the file in a distribute file
system. The Job Tracker monitoring the resource
availability and resource management of MapReduce
framework.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 265
Fig -1: System Architecture Design
The Job Tracker consists of two phases. 1) Logs and Job
History. Job History maintaining the past job description
and provides different parameters like number of nodes in a
cluster, number of the jobs, job Id, execution time and
memory usage of each job etc. The Task Tracker is where
the data is store resides and maintains data node
information.
This paper proposes the prediction tracker component in
MapReduce framework. The prediction tracker consists of
two components 1) Analysis 2) GA (Genetic
Algorithm).The analysis component get the job history
related information from the Job Tracker. The GA is used to
predict the future workload in optimized manner.
3. LOAD PREDICTION PROCESS
The load prediction mainly focuses the prediction tracker.
The Analysis components of prediction tracker acquire the
require job history information from the Job Tracker. The
genetic algorithm is used to get the optimized solution for
workload prediction based on the historical data.
The description of paper is listed as follows.
 Collect the workload of each job from the
Hadoop cluster.
 Analysis the workload of each job
 Based the results, optimization performance is
evaluated.
The trace file [3] of the job tracker data are JobID (a unique
job identifier), job status (successful, failed or killed), job
submission time, job launch time, job finish time, the
number of map tasks, the number of reduce tasks, total
duration of map tasks, total duration of reduce tasks,
read/write bytes on HDFS (Hadoop Distributed File
System), read/write bytes on local disks.
4. OPTIMIZATION PROCESS
Hadoop framework gives the trace file of the job tracker to
get the job submission time. Prediction process [3][4] is
based on the job submission time, duration of job
completion time.
Forecast (Prediction) is an essential aspect of managing any
organization is planning for the future. It is used to
determine future inventory, costs, capacities and interest rate
changes. There the two basic approaches of forecasting:
qualitative approach, quantitative approach [6]. Qualitative
approach is subjective, they are appropriate when past data
are not available. Quantitative approach is used to forecast
future data when past data are available.
This paper focuses on quantitative approach, based on an
analysis of historical data which consider time series. A time
series is set of observations measured at successive points in
time. Time series is used to predict future values based on
previously observed value [7].
Genetic algorithm is used to find the predicted value using
historical data[8]. First step of the algorithm, select the
population depends upon the original data element. Each
element converted to the binary number to make a binary
string or chromosome. The crossover point is selected and
performs the crossover process and mutation process.
Binary strings are converted to the real value. All actual
value is converted to the binary strings or chromosomes.
Operators of the genetic algorithm are three type’s selection,
crossover and mutation.
The genetic algorithm [9] is used to
1. Initialize the population with random individuals.
2. Evaluate the fitness value of the individuals.
3. Select good solutions by using s-wise tournament
selection without replacement
4. Create new individuals by recombining the selected
population using single point crossover
5. Evaluate the fitness value of all offspring.
6. Repeat steps 3–5 until some convergence criteria are met.
Calculate the error rate using mean absolute percentage
error. The mean absolute percentage error (MAPE) is also
known as mean absolute percentage deviation (MAPD). It is
a measure the accurate method for constructing acceptable
time series values in statistics. The formula of MAPE
Prediction Tracker
Analysis GA
MAPREDUCE FRAMEWORK
Name
Node
Job Tracker
Log History
Task
Tracker
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 266
𝑀 =
1
𝑛
⃒
At − Ft
At
𝑛
𝑡=0
⃒
At - Actual value
Ft- Forecast value
n – Number of absolute value.
M – Mean Absolute percentage Error
5. IMPLEMENTATION AND ANALYSIS
In this paper, hadoop framework is installed in ubuntu
operating system. Job history detail inferred from the job
tracker with time series based. Table -1 represents the error
rate of workload prediction.
Table -1: Example of Error value calculation
SI.NO Predicted Value Actual Value Error Rate
1 12 15 0.2
2 15 14 0.07142
3 4 5 0.2
MAPE error rate(%) 9.04733
6. CONCLUSION AND FUTURE WORK
In this paper, we have presented the analysis of Hadoop
trace derived from a single-node production Hadoop cluster.
The trace covers the jobs execution files. In the future, we
plan to work on the implications derived from this work and
integrate them into the multi node cluster in real time.
REFERENCES
[1]. J. Dean and S. Ghemawat, “Mapreduce: Simplified
data processing on large clusters,” in OSDI, 2004,
pp. 137–150.
[2]. T. White, Hadoop - The Definitive Guide. O’Reilly,
2009.
[3]. Zujie Ren, Xianghua Xu, Jian Wan et.al “Workload
Characterization on a Production Hadoop Cluster:
A Case Study on Taobao” Proceedings of the 2012
IEEE International Symposium on Workload
Characterization, 2012.
[4]. Sheng Di, Cho-Li Wang, “Error-Tolerant Resource
Allocation and Payment Minimization for Cloud
System” Proc. IEEE Transactions on parallel and
distributed systems, VOL. 24, NO. 6, 2013, pp-
1097-1106.
[5]. Zhen Xiao, Weijia Song, and Qi Chen ”Dynamic
Resource Allocation Using VirtualMachines for
Cloud Computing Environment” proc. IEEE
Transactions on parallel and distributed systems,
VOL. 24, NO. 6, JUNE 2013, pp. 1107-1117.
[6]. http://www.wikipwedia.com/wiki/Time_series.
[7]. Sam Mahfound and Ganesh Mani “Financial
Forecasting Using Genetic Algorithms”
http://citeseerx.ist.psu.edu/viewdoc/download?doi=
10.1.1.86.9698&rep=rep1&type=pdf.
[8]. Satyendra, ArghyaGhosh, Subhojit Roy, J. Pal
Choudhury, S. R. Bhadra Chaudhuri “A Novel
Approach of Genetic Algorithm in Prediction of
Time Series Data” in Proc of Special issues of
international journal of computer application
(ACCTHPCA), June 2012.
[9]. Abhishek Verma, Xavier Llora, David E. Goldberg
and Roy H. Campbell,“Scaling GeneticAlgorithms
using MapReduce” Proceedings of journal of
cluster computing, special issue, 2011.
BIOGRAPHIES
V.Sivaranjani is a student,of M.E in
Pervasive Computing Technology at
Bharathidasan Institute of
Technology. Her current research
focuses on the cloud computing and
parallel computing.
Mrs.R.Jayamala, Asst. Professor
under the Department of Computer
Science and Engineering at
Bharathidasan Institute of
Technology. Her research focuses
on the cloud computing and
Networks.

More Related Content

What's hot

Heuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environmentHeuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environment
eSAT Journals
 
Fusion method used to tolerate the faults occurred in disrtibuted system
Fusion method used to tolerate the faults occurred in disrtibuted systemFusion method used to tolerate the faults occurred in disrtibuted system
Fusion method used to tolerate the faults occurred in disrtibuted system
eSAT Publishing House
 
Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915
Editor IJARCET
 

What's hot (19)

A Survey of Job Scheduling Algorithms Whit Hierarchical Structure to Load Ba...
A Survey of Job Scheduling Algorithms Whit  Hierarchical Structure to Load Ba...A Survey of Job Scheduling Algorithms Whit  Hierarchical Structure to Load Ba...
A Survey of Job Scheduling Algorithms Whit Hierarchical Structure to Load Ba...
 
Optimized Access Strategies for a Distributed Database Design
Optimized Access Strategies for a Distributed Database DesignOptimized Access Strategies for a Distributed Database Design
Optimized Access Strategies for a Distributed Database Design
 
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
Cache mechanism to avoid dulpication of same thing in hadoop system to speed ...
 
Heuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environmentHeuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environment
 
Comparative Analysis of Various Grid Based Scheduling Algorithms
Comparative Analysis of Various Grid Based Scheduling AlgorithmsComparative Analysis of Various Grid Based Scheduling Algorithms
Comparative Analysis of Various Grid Based Scheduling Algorithms
 
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
 
Dynamically Partitioning Big Data Using Virtual Machine Mapping
Dynamically Partitioning Big Data Using Virtual Machine MappingDynamically Partitioning Big Data Using Virtual Machine Mapping
Dynamically Partitioning Big Data Using Virtual Machine Mapping
 
Job Resource Ratio Based Priority Driven Scheduling in Cloud Computing
Job Resource Ratio Based Priority Driven Scheduling in Cloud ComputingJob Resource Ratio Based Priority Driven Scheduling in Cloud Computing
Job Resource Ratio Based Priority Driven Scheduling in Cloud Computing
 
DGBSA : A BATCH JOB SCHEDULINGALGORITHM WITH GA WITH REGARD TO THE THRESHOLD ...
DGBSA : A BATCH JOB SCHEDULINGALGORITHM WITH GA WITH REGARD TO THE THRESHOLD ...DGBSA : A BATCH JOB SCHEDULINGALGORITHM WITH GA WITH REGARD TO THE THRESHOLD ...
DGBSA : A BATCH JOB SCHEDULINGALGORITHM WITH GA WITH REGARD TO THE THRESHOLD ...
 
J0210053057
J0210053057J0210053057
J0210053057
 
Proposing a New Job Scheduling Algorithm in Grid Environment Using a Combinat...
Proposing a New Job Scheduling Algorithm in Grid Environment Using a Combinat...Proposing a New Job Scheduling Algorithm in Grid Environment Using a Combinat...
Proposing a New Job Scheduling Algorithm in Grid Environment Using a Combinat...
 
Fusion method used to tolerate the faults occurred in disrtibuted system
Fusion method used to tolerate the faults occurred in disrtibuted systemFusion method used to tolerate the faults occurred in disrtibuted system
Fusion method used to tolerate the faults occurred in disrtibuted system
 
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job first (DHSJF): A Task Schedu...
 
Document retrieval using clustering
Document retrieval using clusteringDocument retrieval using clustering
Document retrieval using clustering
 
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
 
Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915
 
Distributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisDistributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data Analysis
 
Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...
Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...
Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...
 
Energy efficient task scheduling algorithms for cloud data centers
Energy efficient task scheduling algorithms for cloud data centersEnergy efficient task scheduling algorithms for cloud data centers
Energy efficient task scheduling algorithms for cloud data centers
 

Viewers also liked

Design and characterization of various shapes of microcantilever for human im...
Design and characterization of various shapes of microcantilever for human im...Design and characterization of various shapes of microcantilever for human im...
Design and characterization of various shapes of microcantilever for human im...
eSAT Publishing House
 
A challenge for security and service level agreement in cloud computinge
A challenge for security and service level agreement in cloud computingeA challenge for security and service level agreement in cloud computinge
A challenge for security and service level agreement in cloud computinge
eSAT Publishing House
 
A comparative study on road traffic management systems
A comparative study on road traffic management systemsA comparative study on road traffic management systems
A comparative study on road traffic management systems
eSAT Publishing House
 
Automated water head controller for domestic application
Automated water head controller for domestic applicationAutomated water head controller for domestic application
Automated water head controller for domestic application
eSAT Publishing House
 
Mac protocols for cooperative diversity in wlan
Mac protocols for cooperative diversity in wlanMac protocols for cooperative diversity in wlan
Mac protocols for cooperative diversity in wlan
eSAT Publishing House
 
Dsp based implementation of field oriented control of
Dsp based implementation of field oriented control ofDsp based implementation of field oriented control of
Dsp based implementation of field oriented control of
eSAT Publishing House
 

Viewers also liked (20)

Mobility management in heterogeneous wireless networks
Mobility management in heterogeneous wireless networksMobility management in heterogeneous wireless networks
Mobility management in heterogeneous wireless networks
 
Design and characterization of various shapes of microcantilever for human im...
Design and characterization of various shapes of microcantilever for human im...Design and characterization of various shapes of microcantilever for human im...
Design and characterization of various shapes of microcantilever for human im...
 
A challenge for security and service level agreement in cloud computinge
A challenge for security and service level agreement in cloud computingeA challenge for security and service level agreement in cloud computinge
A challenge for security and service level agreement in cloud computinge
 
Optimization of energy use intensity in a design build framework
Optimization of energy use intensity in a design build frameworkOptimization of energy use intensity in a design build framework
Optimization of energy use intensity in a design build framework
 
A comparative study on road traffic management systems
A comparative study on road traffic management systemsA comparative study on road traffic management systems
A comparative study on road traffic management systems
 
Adaptive transmit diversity selection (atds) based on stbc and sfbc fir 2 x1 ...
Adaptive transmit diversity selection (atds) based on stbc and sfbc fir 2 x1 ...Adaptive transmit diversity selection (atds) based on stbc and sfbc fir 2 x1 ...
Adaptive transmit diversity selection (atds) based on stbc and sfbc fir 2 x1 ...
 
Automated water head controller for domestic application
Automated water head controller for domestic applicationAutomated water head controller for domestic application
Automated water head controller for domestic application
 
On generating functions of biorthogonal polynomials
On generating functions of biorthogonal polynomialsOn generating functions of biorthogonal polynomials
On generating functions of biorthogonal polynomials
 
Comparison of data security in grid and cloud
Comparison of data security in grid and cloudComparison of data security in grid and cloud
Comparison of data security in grid and cloud
 
Importance of post processing for improved binarization of text documents
Importance of post processing for improved binarization of text documentsImportance of post processing for improved binarization of text documents
Importance of post processing for improved binarization of text documents
 
Novel model for rural housing development
Novel model for rural housing developmentNovel model for rural housing development
Novel model for rural housing development
 
Mac protocols for cooperative diversity in wlan
Mac protocols for cooperative diversity in wlanMac protocols for cooperative diversity in wlan
Mac protocols for cooperative diversity in wlan
 
Impact of power electronics on global warming
Impact of power electronics on global warmingImpact of power electronics on global warming
Impact of power electronics on global warming
 
Wear behaviour of si c reinforced al6061 alloy metal matrix composites by usi...
Wear behaviour of si c reinforced al6061 alloy metal matrix composites by usi...Wear behaviour of si c reinforced al6061 alloy metal matrix composites by usi...
Wear behaviour of si c reinforced al6061 alloy metal matrix composites by usi...
 
Intelligent location tracking scheme for handling user’s mobility
Intelligent location tracking scheme for handling user’s mobilityIntelligent location tracking scheme for handling user’s mobility
Intelligent location tracking scheme for handling user’s mobility
 
Td ams processing for vlsi implementation of ldpc decoder
Td ams processing for vlsi implementation of ldpc decoderTd ams processing for vlsi implementation of ldpc decoder
Td ams processing for vlsi implementation of ldpc decoder
 
An extended database reverse engineering – a key for database forensic invest...
An extended database reverse engineering – a key for database forensic invest...An extended database reverse engineering – a key for database forensic invest...
An extended database reverse engineering – a key for database forensic invest...
 
Degradation of mono azo dye in aqueous solution using
Degradation of mono azo dye in aqueous solution usingDegradation of mono azo dye in aqueous solution using
Degradation of mono azo dye in aqueous solution using
 
Hybrid aco iwd optimization algorithm for minimizing weighted flowtime in clo...
Hybrid aco iwd optimization algorithm for minimizing weighted flowtime in clo...Hybrid aco iwd optimization algorithm for minimizing weighted flowtime in clo...
Hybrid aco iwd optimization algorithm for minimizing weighted flowtime in clo...
 
Dsp based implementation of field oriented control of
Dsp based implementation of field oriented control ofDsp based implementation of field oriented control of
Dsp based implementation of field oriented control of
 

Similar to Optimization of workload prediction based on map reduce frame work in a cloud system

Anomaly detection in the services provided by multi cloud architectures a survey
Anomaly detection in the services provided by multi cloud architectures a surveyAnomaly detection in the services provided by multi cloud architectures a survey
Anomaly detection in the services provided by multi cloud architectures a survey
eSAT Publishing House
 
Parametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmParametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithm
IAEME Publication
 
GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...
GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...
GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...
ijgca
 
cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02
PRIYANKA MEHTA
 

Similar to Optimization of workload prediction based on map reduce frame work in a cloud system (20)

Survey of streaming data warehouse update scheduling
Survey of streaming data warehouse update schedulingSurvey of streaming data warehouse update scheduling
Survey of streaming data warehouse update scheduling
 
Bragged Regression Tree Algorithm for Dynamic Distribution and Scheduling of ...
Bragged Regression Tree Algorithm for Dynamic Distribution and Scheduling of ...Bragged Regression Tree Algorithm for Dynamic Distribution and Scheduling of ...
Bragged Regression Tree Algorithm for Dynamic Distribution and Scheduling of ...
 
Use of genetic algorithm for
Use of genetic algorithm forUse of genetic algorithm for
Use of genetic algorithm for
 
Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big data
 
Anomaly detection in the services provided by multi cloud architectures a survey
Anomaly detection in the services provided by multi cloud architectures a surveyAnomaly detection in the services provided by multi cloud architectures a survey
Anomaly detection in the services provided by multi cloud architectures a survey
 
A survey of various scheduling algorithm in cloud computing environment
A survey of various scheduling algorithm in cloud computing environmentA survey of various scheduling algorithm in cloud computing environment
A survey of various scheduling algorithm in cloud computing environment
 
A customized task scheduling in cloud using genetic algorithm
A customized task scheduling in cloud using genetic algorithmA customized task scheduling in cloud using genetic algorithm
A customized task scheduling in cloud using genetic algorithm
 
Data mining model for the data retrieval from central server configuration
Data mining model for the data retrieval from central server configurationData mining model for the data retrieval from central server configuration
Data mining model for the data retrieval from central server configuration
 
Data Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological DataData Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological Data
 
Survey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in Cloud
 
Parametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmParametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithm
 
C017241316
C017241316C017241316
C017241316
 
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
 
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data Analysis
 
GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...
GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...
GROUPING BASED JOB SCHEDULING ALGORITHM USING PRIORITY QUEUE AND HYBRID ALGOR...
 
A Survey on Heuristic Based Techniques in Cloud Computing
A Survey on Heuristic Based Techniques in Cloud ComputingA Survey on Heuristic Based Techniques in Cloud Computing
A Survey on Heuristic Based Techniques in Cloud Computing
 
cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02
 
Enhancing Performance and Fault Tolerance of Hadoop Cluster
Enhancing Performance and Fault Tolerance of Hadoop ClusterEnhancing Performance and Fault Tolerance of Hadoop Cluster
Enhancing Performance and Fault Tolerance of Hadoop Cluster
 
H04564550
H04564550H04564550
H04564550
 

More from eSAT Publishing House

Likely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnamLikely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnam
eSAT Publishing House
 
Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...
eSAT Publishing House
 
Hudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnamHudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnam
eSAT Publishing House
 
Groundwater investigation using geophysical methods a case study of pydibhim...
Groundwater investigation using geophysical methods  a case study of pydibhim...Groundwater investigation using geophysical methods  a case study of pydibhim...
Groundwater investigation using geophysical methods a case study of pydibhim...
eSAT Publishing House
 
Flood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaFlood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, india
eSAT Publishing House
 
Enhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingEnhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity building
eSAT Publishing House
 
Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...
eSAT Publishing House
 
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
eSAT Publishing House
 
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...
eSAT Publishing House
 
Shear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a reviewShear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a review
eSAT Publishing House
 
Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...
eSAT Publishing House
 
Risk analysis and environmental hazard management
Risk analysis and environmental hazard managementRisk analysis and environmental hazard management
Risk analysis and environmental hazard management
eSAT Publishing House
 
Review study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallsReview study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear walls
eSAT Publishing House
 
Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...
eSAT Publishing House
 
Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...
eSAT Publishing House
 
Coastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaCoastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of india
eSAT Publishing House
 
Can fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structuresCan fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structures
eSAT Publishing House
 
Assessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingsAssessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildings
eSAT Publishing House
 
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
eSAT Publishing House
 
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
eSAT Publishing House
 

More from eSAT Publishing House (20)

Likely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnamLikely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnam
 
Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...
 
Hudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnamHudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnam
 
Groundwater investigation using geophysical methods a case study of pydibhim...
Groundwater investigation using geophysical methods  a case study of pydibhim...Groundwater investigation using geophysical methods  a case study of pydibhim...
Groundwater investigation using geophysical methods a case study of pydibhim...
 
Flood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaFlood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, india
 
Enhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingEnhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity building
 
Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...
 
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
 
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...
 
Shear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a reviewShear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a review
 
Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...
 
Risk analysis and environmental hazard management
Risk analysis and environmental hazard managementRisk analysis and environmental hazard management
Risk analysis and environmental hazard management
 
Review study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallsReview study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear walls
 
Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...
 
Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...
 
Coastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaCoastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of india
 
Can fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structuresCan fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structures
 
Assessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingsAssessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildings
 
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
 
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
 

Recently uploaded

Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
jaanualu31
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptx
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 

Optimization of workload prediction based on map reduce frame work in a cloud system

  • 1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 264 OPTIMIZATION OF WORKLOAD PREDICTION BASED ON MAP REDUCE FRAME WORK IN A CLOUD SYSTEM V.Sivaranjani1 , R.Jayamala2 1 Student, Pervasive Computing Technology, Bharathidasan Institute of Technology,Tamil Nadu, India 2 Assistant Professor, Computer Science and Engineering, Bharathidasan Institute of Technology, Tamil Nadu, India Abstract Nowadays cloud computing is emerging Technology. It is used to access anytime and anywhere through the internet. Hadoop is an open-source Cloud computing environment that implements the Googletm MapReduce framework. Hadoop is a framework for distributed processing of large datasets across large clusters of computers. This paper proposes the workload of jobs in clusters mode using Hadoop. MapReduce is a programming model in hadoop used for maintaining the workload of the jobs. Depend on the job analysis statistics the future workload of the cluster is predicted for potential performance optimization by using genetic algorithm. Key Words: Cloud computing, Hadoop Framework, MapReduce Analysis, Workload --------------------------------------------------------------------***---------------------------------------------------------------------- 1. INTRODUCTION The large scale data processing is very important aspects of the multimode cluster setup. It is very challenging problem. The MapReduce framework [1] is proposed by Google provides an efficient and scalable solution for working large-scale data. The basic concept of MapReduce framework is used to distribute the data among many nodes and process them in parallel manner. Hadoop is a open- source implementation of MapReduce framework. Hadoop use the Yahoo, Facebook, Twitter etc. The MapReduce consists of the two Phases. 1) Map and 2) Reduce. The Map is used to split the job into several independent chunks and each chunks assigned to different computing data node. In the reduce phase, the data is aggregated, summarized, filtered or combining the given data. The result is stored in a Distributed File System. Hadoop[2] is an open-source implementation of a MapReduce framework. The components of the MapReduce framework are 1) Job Tracker, 2) Task Tracker, 3) Name Node 4) Data Node. The Name Node stores the file system metadata. Which file are maps to what block locations and which blocks are stored on which data node. The data node is where the actual data resides. All data nodes send the heartbeat messages to name node every 3 seconds to say data nodes are alive. If name node does not receive the heartbeat message from data node for 10 minutes, that data node is dead. All data node talks each other to rebalance the data, move and copy. The Job Tracker is used to managing the Task tracker and resource management that is tracking resource availability and time management of each job. The Task tracker is pre-configured a number of tasks and accept of each task. The Job Tracker consists of Job History. Get the required information from Job History to predict the future workload. This paper describe about work load prediction on map reduce framework. The chapter 2 describes about System Architecture Design. Chapter 3 describes about Load prediction. Chapter 4 describes optimization process. Chapter 5 describes about Implementation and analysis. Chapter 6 describes Conclusion and Future work. 2. SYSTEM ARCHITECTURE DESIGN The Job executes in cluster setup to get the job history information from the job tracker. The architecture design of the optimization of workload prediction based on the map reduce framework in a cloud system. Fig- 1. Represents the MapReduce framework consists of different components are Name Node, Job Tracker and Task Tracker. The Name node stores the file in a distribute file system. The Job Tracker monitoring the resource availability and resource management of MapReduce framework.
  • 2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 265 Fig -1: System Architecture Design The Job Tracker consists of two phases. 1) Logs and Job History. Job History maintaining the past job description and provides different parameters like number of nodes in a cluster, number of the jobs, job Id, execution time and memory usage of each job etc. The Task Tracker is where the data is store resides and maintains data node information. This paper proposes the prediction tracker component in MapReduce framework. The prediction tracker consists of two components 1) Analysis 2) GA (Genetic Algorithm).The analysis component get the job history related information from the Job Tracker. The GA is used to predict the future workload in optimized manner. 3. LOAD PREDICTION PROCESS The load prediction mainly focuses the prediction tracker. The Analysis components of prediction tracker acquire the require job history information from the Job Tracker. The genetic algorithm is used to get the optimized solution for workload prediction based on the historical data. The description of paper is listed as follows.  Collect the workload of each job from the Hadoop cluster.  Analysis the workload of each job  Based the results, optimization performance is evaluated. The trace file [3] of the job tracker data are JobID (a unique job identifier), job status (successful, failed or killed), job submission time, job launch time, job finish time, the number of map tasks, the number of reduce tasks, total duration of map tasks, total duration of reduce tasks, read/write bytes on HDFS (Hadoop Distributed File System), read/write bytes on local disks. 4. OPTIMIZATION PROCESS Hadoop framework gives the trace file of the job tracker to get the job submission time. Prediction process [3][4] is based on the job submission time, duration of job completion time. Forecast (Prediction) is an essential aspect of managing any organization is planning for the future. It is used to determine future inventory, costs, capacities and interest rate changes. There the two basic approaches of forecasting: qualitative approach, quantitative approach [6]. Qualitative approach is subjective, they are appropriate when past data are not available. Quantitative approach is used to forecast future data when past data are available. This paper focuses on quantitative approach, based on an analysis of historical data which consider time series. A time series is set of observations measured at successive points in time. Time series is used to predict future values based on previously observed value [7]. Genetic algorithm is used to find the predicted value using historical data[8]. First step of the algorithm, select the population depends upon the original data element. Each element converted to the binary number to make a binary string or chromosome. The crossover point is selected and performs the crossover process and mutation process. Binary strings are converted to the real value. All actual value is converted to the binary strings or chromosomes. Operators of the genetic algorithm are three type’s selection, crossover and mutation. The genetic algorithm [9] is used to 1. Initialize the population with random individuals. 2. Evaluate the fitness value of the individuals. 3. Select good solutions by using s-wise tournament selection without replacement 4. Create new individuals by recombining the selected population using single point crossover 5. Evaluate the fitness value of all offspring. 6. Repeat steps 3–5 until some convergence criteria are met. Calculate the error rate using mean absolute percentage error. The mean absolute percentage error (MAPE) is also known as mean absolute percentage deviation (MAPD). It is a measure the accurate method for constructing acceptable time series values in statistics. The formula of MAPE Prediction Tracker Analysis GA MAPREDUCE FRAMEWORK Name Node Job Tracker Log History Task Tracker
  • 3. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 266 𝑀 = 1 𝑛 ⃒ At − Ft At 𝑛 𝑡=0 ⃒ At - Actual value Ft- Forecast value n – Number of absolute value. M – Mean Absolute percentage Error 5. IMPLEMENTATION AND ANALYSIS In this paper, hadoop framework is installed in ubuntu operating system. Job history detail inferred from the job tracker with time series based. Table -1 represents the error rate of workload prediction. Table -1: Example of Error value calculation SI.NO Predicted Value Actual Value Error Rate 1 12 15 0.2 2 15 14 0.07142 3 4 5 0.2 MAPE error rate(%) 9.04733 6. CONCLUSION AND FUTURE WORK In this paper, we have presented the analysis of Hadoop trace derived from a single-node production Hadoop cluster. The trace covers the jobs execution files. In the future, we plan to work on the implications derived from this work and integrate them into the multi node cluster in real time. REFERENCES [1]. J. Dean and S. Ghemawat, “Mapreduce: Simplified data processing on large clusters,” in OSDI, 2004, pp. 137–150. [2]. T. White, Hadoop - The Definitive Guide. O’Reilly, 2009. [3]. Zujie Ren, Xianghua Xu, Jian Wan et.al “Workload Characterization on a Production Hadoop Cluster: A Case Study on Taobao” Proceedings of the 2012 IEEE International Symposium on Workload Characterization, 2012. [4]. Sheng Di, Cho-Li Wang, “Error-Tolerant Resource Allocation and Payment Minimization for Cloud System” Proc. IEEE Transactions on parallel and distributed systems, VOL. 24, NO. 6, 2013, pp- 1097-1106. [5]. Zhen Xiao, Weijia Song, and Qi Chen ”Dynamic Resource Allocation Using VirtualMachines for Cloud Computing Environment” proc. IEEE Transactions on parallel and distributed systems, VOL. 24, NO. 6, JUNE 2013, pp. 1107-1117. [6]. http://www.wikipwedia.com/wiki/Time_series. [7]. Sam Mahfound and Ganesh Mani “Financial Forecasting Using Genetic Algorithms” http://citeseerx.ist.psu.edu/viewdoc/download?doi= 10.1.1.86.9698&rep=rep1&type=pdf. [8]. Satyendra, ArghyaGhosh, Subhojit Roy, J. Pal Choudhury, S. R. Bhadra Chaudhuri “A Novel Approach of Genetic Algorithm in Prediction of Time Series Data” in Proc of Special issues of international journal of computer application (ACCTHPCA), June 2012. [9]. Abhishek Verma, Xavier Llora, David E. Goldberg and Roy H. Campbell,“Scaling GeneticAlgorithms using MapReduce” Proceedings of journal of cluster computing, special issue, 2011. BIOGRAPHIES V.Sivaranjani is a student,of M.E in Pervasive Computing Technology at Bharathidasan Institute of Technology. Her current research focuses on the cloud computing and parallel computing. Mrs.R.Jayamala, Asst. Professor under the Department of Computer Science and Engineering at Bharathidasan Institute of Technology. Her research focuses on the cloud computing and Networks.