SlideShare une entreprise Scribd logo
1  sur  23
Hadoop Project

Stock Analyzer
(Mapreduce and Hive Implementation)
Presented by
Punit Kishore(A13011)
Debayan Datta(A13006)
Sunil Kumar P(A13020)
Maruthi Nataraj K(A13009)
Ashish Ranjan(A13004)
Praxis Business School
AGENDA
 Understanding of the problem
 Technical Architecture
 Basic Structure
 Pseudo Code
 Final Result
 Business Implications

Electronics Template
UNDERSTANDING OF THE PROBLEM
 Objective : To find the adjusted closing price for each

day that a stock not reported a dividend.

 Data Sources :
 NYSE daily prices dataset with the below schema
exchange

stock_symbol

date

stock_price
_open

stock_
price_high

stock_price
_low

stock_price
_close

stock_volume

stock_pric
e_adj_close

 NYSE dividends dataset with the below schema
exchange

stock_symbol

date

dividends

 Isolation of dividend data from total data will give better
picture of the company because sometimes firms avoid
cutting dividends even when earnings drop.
Framework– Mapreduce/Hive
Electronics Template
TECHNICAL ARCHITECTURE

Eclipse Indigo 3.7.2
Hadoop 1.2.1 plugin

Electronics Template
TECHNICAL ARCHITECTURE

Electronics Template
TECHNICAL ARCHITECTURE

Electronics Template
TECHNICAL ARCHITECTURE
WinSCP

Electronics Template
TECHNICAL ARCHITECTURE

Electronics Template
Putty

Electronics Template

TECHNICAL ARCHITECTURE
TECHNICAL ARCHITECTURE

Unix Environment /Amazon AWS EC2 Praxis Hadoop Cluster

Electronics Template
TECHNICAL ARCHITECTURE

Sample data - NYSE_daily_prices_AT.csv (Testing is done on sample data only due to
load and time constraints).

Electronics Template
TECHNICAL ARCHITECTURE

Sample data - NSE_daily_prices_BT.csv

Electronics Template
TECHNICAL ARCHITECTURE
Sample data - dividendstest.csv

Electronics Template
BASIC STRUCTURE
Input Key Value Pair <Memory Pointer,NYSE,AIT,
12-11-2009,X,X,X,X,X,20.69>

Intermediary Key Value Pair<AIT12-11-2009,1~20.69~0>
<AIT12-11-2009,1~Null~1>

Output/Result Key Value Pair
AIT
12-11-2009
20.69

Electronics Template
PSEUDO CODE
import java and hadoop packages

Mapper
Mapper

public static class StockAnalysisMapper extends MapReduceBase implements
Mapper<LongWritable, Text, Text, Text>
{
// declaration of Mapkey and Mapvalue
@Override
public void map(LongWritable key, Text value,OutputCollector<Text, Text> output,
Reporter reporter) throws IOException
{
// declaration of private variables
// switch case to parse the input lines and store the data
// check for null values in the key
// check the header and send the key value to output collector
}

}

Electronics Template
PSEUDO CODE
public static class StockAnalysisReducer extends MapReduceBase
implements Reducer<Text, Text, Text, Text>

Reducer
Reducer

{
//Declaration of required private variables
@Override
public void reduce(Text key, Iterator<Text> values,OutputCollector<Text, Text> output, Reporter
reporter) throws IOException
{
//Declaration of sum and flag variables
while (values.hasNext())
{
// Parse the inputs which are count,stock adjusted closing price and check
// Store them as required after parsing
//check for null values of stock adjusted closing price
}
}
}

//Increment the sum
// write to output if sum is 1

Electronics Template
PSEUDO CODE
public static void main(String [] arguments) throws Exception
{
JobConf conf = new JobConf(StockAnalyzer.class);
conf.setJobName("Stock Analysis");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);
conf.setMapperClass(StockAnalysisMapper.class);
conf.setReducerClass(StockAnalysisReducer.class);
Path MapperInputPath = new Path(arguments[0]);
Path OutputPath = new Path(arguments[1]);
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf, MapperInputPath);
FileOutputFormat.setOutputPath(conf, OutputPath);
JobClient.runJob(conf);
}

Electronics Template

Driver
Driver
FINAL RESULT
• NYSE Daily A
– 14 inclusive of
1 header
• NYSE Daily B
– 39 inclusive of
1 header
• Dividends file
– 22 inclusive of
1 header
Total – 75

Electronics Template
FINAL RESULT
• Total – 75
• Matching
records – 7
• Headers – 3
• Dividend
records – 21
• Final Output
– 44 records

Electronics Template
FINAL RESULT

Electronics Template
HIVE
FINAL RESULT HIVE

Electronics Template
BUSINESS IMPLICATIONS
 The daily close stock prices are adjusted for dividend distributions/stock
splits because they are a part of total return and affect the historical volatility
estimates .
 The primary use for the adjusted closing price is as a means to develop an
accurate track record of a stock's performance. The comparison of a stock's
historical adjusted closing price to its current price shows the true rate of
return.
 Graphing the volatility history of the target firm simultaneously with that of its
competitors and Market Index can provide unique insights into risk and
comparative advantages(frequency distribution of returns can also be used).
 Historic stock price volatility might have implications to business valuators.

Electronics Template
Electronics Template

Contenu connexe

Tendances

Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...Spark Summit
 
Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...
Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...
Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...Accumulo Summit
 
Use r tutorial part1, introduction to sparkr
Use r tutorial part1, introduction to sparkrUse r tutorial part1, introduction to sparkr
Use r tutorial part1, introduction to sparkrDatabricks
 
Kapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EngineKapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EnginePrashant Vats
 
Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky
Spark Summit EU talk by Yaroslav Nedashkovsky and Andy StarzhinskySpark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky
Spark Summit EU talk by Yaroslav Nedashkovsky and Andy StarzhinskySpark Summit
 
Everyday Probabilistic Data Structures for Humans
Everyday Probabilistic Data Structures for HumansEveryday Probabilistic Data Structures for Humans
Everyday Probabilistic Data Structures for HumansDatabricks
 
Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)Modern Data Stack France
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataDataWorks Summit/Hadoop Summit
 
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
A Rusty introduction to Apache Arrow and how it applies to a  time series dat...A Rusty introduction to Apache Arrow and how it applies to a  time series dat...
A Rusty introduction to Apache Arrow and how it applies to a time series dat...Andrew Lamb
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Julian Hyde
 
Enhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuEnhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuSpark Summit
 
Apache Pulsar: The Next Generation Messaging and Queuing System
Apache Pulsar: The Next Generation Messaging and Queuing SystemApache Pulsar: The Next Generation Messaging and Queuing System
Apache Pulsar: The Next Generation Messaging and Queuing SystemDatabricks
 
Scaling graphite for application metrics
Scaling graphite for application metricsScaling graphite for application metrics
Scaling graphite for application metricsJim Plush
 
Evolution is Continuous, and so are Big Data and Streaming Pipelines
Evolution is Continuous, and so are Big Data and Streaming PipelinesEvolution is Continuous, and so are Big Data and Streaming Pipelines
Evolution is Continuous, and so are Big Data and Streaming PipelinesDatabricks
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineeringJulian Hyde
 
Spark Summit San Francisco 2016 - Ali Ghodsi Keynote
Spark Summit San Francisco 2016 - Ali Ghodsi KeynoteSpark Summit San Francisco 2016 - Ali Ghodsi Keynote
Spark Summit San Francisco 2016 - Ali Ghodsi KeynoteDatabricks
 
Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...Databricks
 
Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...
Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...
Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...Spark Summit
 

Tendances (20)

Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
 
Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...
Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...
Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Ac...
 
Use r tutorial part1, introduction to sparkr
Use r tutorial part1, introduction to sparkrUse r tutorial part1, introduction to sparkr
Use r tutorial part1, introduction to sparkr
 
Kapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EngineKapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing Engine
 
Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky
Spark Summit EU talk by Yaroslav Nedashkovsky and Andy StarzhinskySpark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky
Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky
 
Highly Available Graphite
Highly Available GraphiteHighly Available Graphite
Highly Available Graphite
 
Everyday Probabilistic Data Structures for Humans
Everyday Probabilistic Data Structures for HumansEveryday Probabilistic Data Structures for Humans
Everyday Probabilistic Data Structures for Humans
 
Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)
 
Producing Spark on YARN for ETL
Producing Spark on YARN for ETLProducing Spark on YARN for ETL
Producing Spark on YARN for ETL
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
A Rusty introduction to Apache Arrow and how it applies to a  time series dat...A Rusty introduction to Apache Arrow and how it applies to a  time series dat...
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
 
Enhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuEnhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min Qiu
 
Apache Pulsar: The Next Generation Messaging and Queuing System
Apache Pulsar: The Next Generation Messaging and Queuing SystemApache Pulsar: The Next Generation Messaging and Queuing System
Apache Pulsar: The Next Generation Messaging and Queuing System
 
Scaling graphite for application metrics
Scaling graphite for application metricsScaling graphite for application metrics
Scaling graphite for application metrics
 
Evolution is Continuous, and so are Big Data and Streaming Pipelines
Evolution is Continuous, and so are Big Data and Streaming PipelinesEvolution is Continuous, and so are Big Data and Streaming Pipelines
Evolution is Continuous, and so are Big Data and Streaming Pipelines
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineering
 
Spark Summit San Francisco 2016 - Ali Ghodsi Keynote
Spark Summit San Francisco 2016 - Ali Ghodsi KeynoteSpark Summit San Francisco 2016 - Ali Ghodsi Keynote
Spark Summit San Francisco 2016 - Ali Ghodsi Keynote
 
Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...Efficiently Building Machine Learning Models for Predictive Maintenance in th...
Efficiently Building Machine Learning Models for Predictive Maintenance in th...
 
Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...
Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...
Parallelizing Large Simulations with Apache SparkR with Daniel Jeavons and Wa...
 

En vedette

Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...
Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...
Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...DataWorks Summit
 
Stock analyzer.ppt review
Stock analyzer.ppt reviewStock analyzer.ppt review
Stock analyzer.ppt reviewSree Chinni
 
Introduction to MapReduce & hadoop
Introduction to MapReduce & hadoopIntroduction to MapReduce & hadoop
Introduction to MapReduce & hadoopColin Su
 
Apache Hadoop Big Data Technology
Apache Hadoop Big Data TechnologyApache Hadoop Big Data Technology
Apache Hadoop Big Data TechnologyJay Nagar
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecasesudhakara st
 
Fresher resume-sample10 by Babasab Patil
Fresher resume-sample10 by Babasab PatilFresher resume-sample10 by Babasab Patil
Fresher resume-sample10 by Babasab PatilBabasab Patil
 
Hotel inspection data set analysis copy
Hotel inspection data set analysis   copyHotel inspection data set analysis   copy
Hotel inspection data set analysis copySharon Moses
 
Hadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceHadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceUwe Printz
 
Hadoop hbase mapreduce
Hadoop hbase mapreduceHadoop hbase mapreduce
Hadoop hbase mapreduceFARUK BERKSÖZ
 
Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...
Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...
Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...Chris Schalk
 
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hari Shankar Sreekumar
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicDataWorks Summit
 
An Introduction to MapReduce
An Introduction to MapReduceAn Introduction to MapReduce
An Introduction to MapReduceFrane Bandov
 
Proof of Concept for Hadoop: storage and analytics of electrical time-series
Proof of Concept for Hadoop: storage and analytics of electrical time-seriesProof of Concept for Hadoop: storage and analytics of electrical time-series
Proof of Concept for Hadoop: storage and analytics of electrical time-seriesDataWorks Summit
 
MapReduce Design Patterns
MapReduce Design PatternsMapReduce Design Patterns
MapReduce Design PatternsDonald Miner
 
Map reduce: beyond word count
Map reduce: beyond word countMap reduce: beyond word count
Map reduce: beyond word countJeff Patti
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce AlgorithmsAmund Tveit
 
Intro to HDFS and MapReduce
Intro to HDFS and MapReduceIntro to HDFS and MapReduce
Intro to HDFS and MapReduceRyan Tabora
 

En vedette (20)

Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...
Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...
Hadoop -- Enabling Expanded Financial Market Analysis Techniques while Improv...
 
Stock analyzer.ppt review
Stock analyzer.ppt reviewStock analyzer.ppt review
Stock analyzer.ppt review
 
Introduction to MapReduce & hadoop
Introduction to MapReduce & hadoopIntroduction to MapReduce & hadoop
Introduction to MapReduce & hadoop
 
Apache Hadoop Big Data Technology
Apache Hadoop Big Data TechnologyApache Hadoop Big Data Technology
Apache Hadoop Big Data Technology
 
Introduction to MapReduce
Introduction to MapReduceIntroduction to MapReduce
Introduction to MapReduce
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
 
Fresher resume-sample10 by Babasab Patil
Fresher resume-sample10 by Babasab PatilFresher resume-sample10 by Babasab Patil
Fresher resume-sample10 by Babasab Patil
 
Hotel inspection data set analysis copy
Hotel inspection data set analysis   copyHotel inspection data set analysis   copy
Hotel inspection data set analysis copy
 
BIGDATA & HADOOP PROJECT
BIGDATA & HADOOP PROJECTBIGDATA & HADOOP PROJECT
BIGDATA & HADOOP PROJECT
 
Hadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceHadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduce
 
Hadoop hbase mapreduce
Hadoop hbase mapreduceHadoop hbase mapreduce
Hadoop hbase mapreduce
 
Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...
Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...
Building Enterprise Applications on Google Cloud Platform Cloud Computing Exp...
 
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
 
An Introduction to MapReduce
An Introduction to MapReduceAn Introduction to MapReduce
An Introduction to MapReduce
 
Proof of Concept for Hadoop: storage and analytics of electrical time-series
Proof of Concept for Hadoop: storage and analytics of electrical time-seriesProof of Concept for Hadoop: storage and analytics of electrical time-series
Proof of Concept for Hadoop: storage and analytics of electrical time-series
 
MapReduce Design Patterns
MapReduce Design PatternsMapReduce Design Patterns
MapReduce Design Patterns
 
Map reduce: beyond word count
Map reduce: beyond word countMap reduce: beyond word count
Map reduce: beyond word count
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
 
Intro to HDFS and MapReduce
Intro to HDFS and MapReduceIntro to HDFS and MapReduce
Intro to HDFS and MapReduce
 

Similaire à Hadoop Stock Analyzer Project Using MapReduce and Hive

Apache Spark, the Next Generation Cluster Computing
Apache Spark, the Next Generation Cluster ComputingApache Spark, the Next Generation Cluster Computing
Apache Spark, the Next Generation Cluster ComputingGerger
 
Spark what's new what's coming
Spark what's new what's comingSpark what's new what's coming
Spark what's new what's comingDatabricks
 
Spark Sql and DataFrame
Spark Sql and DataFrameSpark Sql and DataFrame
Spark Sql and DataFramePrashant Gupta
 
A New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDKA New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDKShu-Jeng Hsieh
 
GDG Jakarta Meetup - Streaming Analytics With Apache Beam
GDG Jakarta Meetup - Streaming Analytics With Apache BeamGDG Jakarta Meetup - Streaming Analytics With Apache Beam
GDG Jakarta Meetup - Streaming Analytics With Apache BeamImre Nagi
 
.NET Portfolio
.NET Portfolio.NET Portfolio
.NET Portfoliomwillmer
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Soham Kulkarni
 
Cs267 hadoop programming
Cs267 hadoop programmingCs267 hadoop programming
Cs267 hadoop programmingKuldeep Dhole
 
Accelerated data access
Accelerated data accessAccelerated data access
Accelerated data accessgordonyorke
 
Madeo - a CAD Tool for reconfigurable Hardware
Madeo - a CAD Tool for reconfigurable HardwareMadeo - a CAD Tool for reconfigurable Hardware
Madeo - a CAD Tool for reconfigurable HardwareESUG
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...Michael Rys
 
Educational Objectives After successfully completing this assignmen.pdf
Educational Objectives After successfully completing this assignmen.pdfEducational Objectives After successfully completing this assignmen.pdf
Educational Objectives After successfully completing this assignmen.pdfrajeshjangid1865
 
An introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduceAn introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduceAnanth PackkilDurai
 
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...Intel® Software
 
Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...
Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...
Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...João Pascoal Faria
 
Hadoop_Pennonsoft
Hadoop_PennonsoftHadoop_Pennonsoft
Hadoop_PennonsoftPennonSoft
 
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docxIn Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docxbradburgess22840
 

Similaire à Hadoop Stock Analyzer Project Using MapReduce and Hive (20)

Apache Spark, the Next Generation Cluster Computing
Apache Spark, the Next Generation Cluster ComputingApache Spark, the Next Generation Cluster Computing
Apache Spark, the Next Generation Cluster Computing
 
Spark what's new what's coming
Spark what's new what's comingSpark what's new what's coming
Spark what's new what's coming
 
Spark Sql and DataFrame
Spark Sql and DataFrameSpark Sql and DataFrame
Spark Sql and DataFrame
 
A New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDKA New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDK
 
GDG Jakarta Meetup - Streaming Analytics With Apache Beam
GDG Jakarta Meetup - Streaming Analytics With Apache BeamGDG Jakarta Meetup - Streaming Analytics With Apache Beam
GDG Jakarta Meetup - Streaming Analytics With Apache Beam
 
.NET Portfolio
.NET Portfolio.NET Portfolio
.NET Portfolio
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor
 
Cs267 hadoop programming
Cs267 hadoop programmingCs267 hadoop programming
Cs267 hadoop programming
 
Google cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache FlinkGoogle cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache Flink
 
Refactoring
RefactoringRefactoring
Refactoring
 
Accelerated data access
Accelerated data accessAccelerated data access
Accelerated data access
 
Madeo - a CAD Tool for reconfigurable Hardware
Madeo - a CAD Tool for reconfigurable HardwareMadeo - a CAD Tool for reconfigurable Hardware
Madeo - a CAD Tool for reconfigurable Hardware
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
 
Educational Objectives After successfully completing this assignmen.pdf
Educational Objectives After successfully completing this assignmen.pdfEducational Objectives After successfully completing this assignmen.pdf
Educational Objectives After successfully completing this assignmen.pdf
 
An introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduceAn introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduce
 
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
 
Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...
Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...
Automating Interaction Testing with UML Sequence Diagrams: Where TDD and UML ...
 
Hadoop_Pennonsoft
Hadoop_PennonsoftHadoop_Pennonsoft
Hadoop_Pennonsoft
 
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docxIn Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
 
Refactoring
RefactoringRefactoring
Refactoring
 

Plus de Maruthi Nataraj K

Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingMaruthi Nataraj K
 
Text Mining of Movie Reviews
Text Mining of Movie ReviewsText Mining of Movie Reviews
Text Mining of Movie ReviewsMaruthi Nataraj K
 
How To Find Needles In Haystacks
How To Find Needles In HaystacksHow To Find Needles In Haystacks
How To Find Needles In HaystacksMaruthi Nataraj K
 
Social Media Marketing - Daily Deals
Social Media Marketing - Daily DealsSocial Media Marketing - Daily Deals
Social Media Marketing - Daily DealsMaruthi Nataraj K
 
Customer Profiling For Rural Financial Services
Customer Profiling For Rural Financial ServicesCustomer Profiling For Rural Financial Services
Customer Profiling For Rural Financial ServicesMaruthi Nataraj K
 
Telecom Fraud Detection - Naive Bayes Classification
Telecom Fraud Detection - Naive Bayes ClassificationTelecom Fraud Detection - Naive Bayes Classification
Telecom Fraud Detection - Naive Bayes ClassificationMaruthi Nataraj K
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingMaruthi Nataraj K
 
Elementary School Performance (SAS Regression Analysis)
Elementary School Performance (SAS Regression Analysis)Elementary School Performance (SAS Regression Analysis)
Elementary School Performance (SAS Regression Analysis)Maruthi Nataraj K
 
Hospital Market Segmentation using Cluster Analysis
Hospital Market Segmentation using Cluster AnalysisHospital Market Segmentation using Cluster Analysis
Hospital Market Segmentation using Cluster AnalysisMaruthi Nataraj K
 
SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...
SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...
SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...Maruthi Nataraj K
 
Maruti Suzuki India Ltd Financial Statement Analysis
Maruti Suzuki India Ltd Financial Statement AnalysisMaruti Suzuki India Ltd Financial Statement Analysis
Maruti Suzuki India Ltd Financial Statement AnalysisMaruthi Nataraj K
 
SBI Home Loan Customer Perception Survey
SBI Home Loan Customer Perception SurveySBI Home Loan Customer Perception Survey
SBI Home Loan Customer Perception SurveyMaruthi Nataraj K
 
Basketball League Sponsorship Proposal
Basketball League Sponsorship ProposalBasketball League Sponsorship Proposal
Basketball League Sponsorship ProposalMaruthi Nataraj K
 

Plus de Maruthi Nataraj K (15)

Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and Forecasting
 
Text Mining of Movie Reviews
Text Mining of Movie ReviewsText Mining of Movie Reviews
Text Mining of Movie Reviews
 
How To Find Needles In Haystacks
How To Find Needles In HaystacksHow To Find Needles In Haystacks
How To Find Needles In Haystacks
 
Social Media Marketing - Daily Deals
Social Media Marketing - Daily DealsSocial Media Marketing - Daily Deals
Social Media Marketing - Daily Deals
 
Customer Profiling For Rural Financial Services
Customer Profiling For Rural Financial ServicesCustomer Profiling For Rural Financial Services
Customer Profiling For Rural Financial Services
 
Telecom Fraud Detection - Naive Bayes Classification
Telecom Fraud Detection - Naive Bayes ClassificationTelecom Fraud Detection - Naive Bayes Classification
Telecom Fraud Detection - Naive Bayes Classification
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and Forecasting
 
Linear Regression using R
Linear Regression using RLinear Regression using R
Linear Regression using R
 
Elementary School Performance (SAS Regression Analysis)
Elementary School Performance (SAS Regression Analysis)Elementary School Performance (SAS Regression Analysis)
Elementary School Performance (SAS Regression Analysis)
 
Hospital Market Segmentation using Cluster Analysis
Hospital Market Segmentation using Cluster AnalysisHospital Market Segmentation using Cluster Analysis
Hospital Market Segmentation using Cluster Analysis
 
SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...
SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...
SAS Medical Case Study - A Comparison Between Ketamine,Clonidine and combinat...
 
Maruti Suzuki India Ltd Financial Statement Analysis
Maruti Suzuki India Ltd Financial Statement AnalysisMaruti Suzuki India Ltd Financial Statement Analysis
Maruti Suzuki India Ltd Financial Statement Analysis
 
SBI Home Loan Customer Perception Survey
SBI Home Loan Customer Perception SurveySBI Home Loan Customer Perception Survey
SBI Home Loan Customer Perception Survey
 
Basketball League Sponsorship Proposal
Basketball League Sponsorship ProposalBasketball League Sponsorship Proposal
Basketball League Sponsorship Proposal
 
Bank market classification
Bank market classificationBank market classification
Bank market classification
 

Dernier

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Dernier (20)

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Hadoop Stock Analyzer Project Using MapReduce and Hive

  • 1. Hadoop Project Stock Analyzer (Mapreduce and Hive Implementation) Presented by Punit Kishore(A13011) Debayan Datta(A13006) Sunil Kumar P(A13020) Maruthi Nataraj K(A13009) Ashish Ranjan(A13004) Praxis Business School
  • 2. AGENDA  Understanding of the problem  Technical Architecture  Basic Structure  Pseudo Code  Final Result  Business Implications Electronics Template
  • 3. UNDERSTANDING OF THE PROBLEM  Objective : To find the adjusted closing price for each day that a stock not reported a dividend.  Data Sources :  NYSE daily prices dataset with the below schema exchange stock_symbol date stock_price _open stock_ price_high stock_price _low stock_price _close stock_volume stock_pric e_adj_close  NYSE dividends dataset with the below schema exchange stock_symbol date dividends  Isolation of dividend data from total data will give better picture of the company because sometimes firms avoid cutting dividends even when earnings drop. Framework– Mapreduce/Hive Electronics Template
  • 4. TECHNICAL ARCHITECTURE Eclipse Indigo 3.7.2 Hadoop 1.2.1 plugin Electronics Template
  • 10. TECHNICAL ARCHITECTURE Unix Environment /Amazon AWS EC2 Praxis Hadoop Cluster Electronics Template
  • 11. TECHNICAL ARCHITECTURE Sample data - NYSE_daily_prices_AT.csv (Testing is done on sample data only due to load and time constraints). Electronics Template
  • 12. TECHNICAL ARCHITECTURE Sample data - NSE_daily_prices_BT.csv Electronics Template
  • 13. TECHNICAL ARCHITECTURE Sample data - dividendstest.csv Electronics Template
  • 14. BASIC STRUCTURE Input Key Value Pair <Memory Pointer,NYSE,AIT, 12-11-2009,X,X,X,X,X,20.69> Intermediary Key Value Pair<AIT12-11-2009,1~20.69~0> <AIT12-11-2009,1~Null~1> Output/Result Key Value Pair AIT 12-11-2009 20.69 Electronics Template
  • 15. PSEUDO CODE import java and hadoop packages Mapper Mapper public static class StockAnalysisMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> { // declaration of Mapkey and Mapvalue @Override public void map(LongWritable key, Text value,OutputCollector<Text, Text> output, Reporter reporter) throws IOException { // declaration of private variables // switch case to parse the input lines and store the data // check for null values in the key // check the header and send the key value to output collector } } Electronics Template
  • 16. PSEUDO CODE public static class StockAnalysisReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> Reducer Reducer { //Declaration of required private variables @Override public void reduce(Text key, Iterator<Text> values,OutputCollector<Text, Text> output, Reporter reporter) throws IOException { //Declaration of sum and flag variables while (values.hasNext()) { // Parse the inputs which are count,stock adjusted closing price and check // Store them as required after parsing //check for null values of stock adjusted closing price } } } //Increment the sum // write to output if sum is 1 Electronics Template
  • 17. PSEUDO CODE public static void main(String [] arguments) throws Exception { JobConf conf = new JobConf(StockAnalyzer.class); conf.setJobName("Stock Analysis"); conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(Text.class); conf.setMapperClass(StockAnalysisMapper.class); conf.setReducerClass(StockAnalysisReducer.class); Path MapperInputPath = new Path(arguments[0]); Path OutputPath = new Path(arguments[1]); conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(TextOutputFormat.class); FileInputFormat.setInputPaths(conf, MapperInputPath); FileOutputFormat.setOutputPath(conf, OutputPath); JobClient.runJob(conf); } Electronics Template Driver Driver
  • 18. FINAL RESULT • NYSE Daily A – 14 inclusive of 1 header • NYSE Daily B – 39 inclusive of 1 header • Dividends file – 22 inclusive of 1 header Total – 75 Electronics Template
  • 19. FINAL RESULT • Total – 75 • Matching records – 7 • Headers – 3 • Dividend records – 21 • Final Output – 44 records Electronics Template
  • 22. BUSINESS IMPLICATIONS  The daily close stock prices are adjusted for dividend distributions/stock splits because they are a part of total return and affect the historical volatility estimates .  The primary use for the adjusted closing price is as a means to develop an accurate track record of a stock's performance. The comparison of a stock's historical adjusted closing price to its current price shows the true rate of return.  Graphing the volatility history of the target firm simultaneously with that of its competitors and Market Index can provide unique insights into risk and comparative advantages(frequency distribution of returns can also be used).  Historic stock price volatility might have implications to business valuators. Electronics Template