SlideShare une entreprise Scribd logo
1  sur  5
Télécharger pour lire hors ligne
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1320
SOCIAL MEDIA MARKET TRENDER WITH DACHE MANAGER USING
HADOOP AND VISUALIZATION TOOLS
Raveena Rana [1], Tejas Shah[2], Jitanksha Shrimanker[3]
1-3BE Student, Information Technology Department
K. J. Somaiya Institute of Engineering & Information Technology, Mumbai, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract- The modern business and advertising strategies
all involve market analysis and study of the trends going on
in the world. Social media is the main advertising and
analysis market in today’s world. The current Analytics tools
and models that are available in the market are very costly,
unable to handle Big Data and less secure. The traditional
Analytics systems takes a long time to come up with results,
so it is not beneficial to use for Real Time Analytics. So, the
proposed work resolves all these problems by combining the
Apache Open Source platform which solves the issues of Real
Time Analytics using HADOOP. It also provides scalability
and reduced cost over analytics by using open Source
Software. [1]
Key Words: Map Reduce (MR), Hadoop, Java , Flume ,
Sqoop, Caching.
I. INTRODUCTION
The world is getting smaller and smaller day by day
through internet and usage of social sites. Data is growing
exponentially as the number of users and activity over the
web is increasing rapidly. The consumer market is
becoming increasingly competitive and the need to keep a
tab on trends and consumers’ opinions is a must. [3]
Social Media is being extensively used by people
belonging to all age groups’ for expressing their reviews
and suggestions whenever they buy a new product or a
new product is launched. Also it is used for
recommendations to friends and family of a particular
product or service. Social media sites like Facebook and
Twitter generate huge amounts of data on a daily basis.
This data is directly from the end user or consumer and as
a result is of great value to the organizations and
companies. [1]
This data is still scattered and not in a suitable
format that may yield any results. Also the data is so huge
that the buzz-word Big Data is the only one that can
represent such an amount of data. Our Project is based on
Hadoop framework that can
manage and handle Big Data efficiently. Main reasons for
choosing Hadoop are:
1. The data generated by Social sites are too large
for traditional databases and analysis tools.
2. Hadoop being a framework gives more flexibility
in implementing ideas.
3. Scalability is a major advantage of using Hadoop
over other platforms.
Application developers specify the computation in
terms of a map and a reduce function, and the underlying
MR Job scheduling system automatically parallelizes the
computation across a cluster of machines. MR gains
popularity for its simple programming interface and
excellent performance when implementing a large
spectrum of applications. Since most such applications
take a large amount of input data, they are nicknamed
“Big-data applications”. Hadoop is an open-source
implementation of the Google MR programming model.
Hadoop consists of the Hadoop Common, which provides
access to the file systems supported by Hadoop. MR
provides a standardized framework for implementing
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1321
large-scale distributed computation, namely, the big-data
applications. [5][6]
The work proposes to combine the Apache Open
Source Modules and configure them to get the required
result. Hadoop is flexible and scalable architecture. Google
MR is a programming model and a software framework for
large-scale distributed computing on large amounts of
data [7][8].
. We propose Dache, a data-aware cache framework
for big-data applications. In Dache, tasks submit their
intermediate results to the cache manager. A task queries
the cache manager before executing the actual computing
work. We implement Dache by extending Hadoop. [2]
In the proposed work, we would be grabbing data from
Twitter and analyzing it through visualization techniques.
[4]
The project aims at finding out the trends of the
market according to the social sites such as Twitter and
visualizing and presenting it in a simple format such as a
graph or chart that can be understood by anyone and a
clear conclusion can be drawn out of it. It also aims at
tackling data size issues by using Hadoop and MR
frameworks that are open-source. It also tries to improve
the processing speed by introducing a cache concept for
distributed computing named Dache.[9]
II. DRAWBACKS OF EXISTING SYSTEM
The current Analytics tools and models that are
available in the market are very costly, unable to handle
Big Data and less secure. The traditional Analytics systems
takes a long time to come up with results, so it is not
beneficial to use for Real Time Analytics.
Google’s MR and Apache’s Hadoop are the defacto
software systems for big-data applications. An observation
of the MR framework is that the framework generates a
large amount of intermediate data. Such abundant
information is thrown away after the tasks finish, because
MR is unable to utilize them. Also the MR operates on huge
amount of data and as a result the processing takes usually
a longer period of time.
A drawback of Hadoop operation is that MR, its
processing part creates a lot of intermediate data during
operations such as word-count on a particular file. But in
the end we get only the desired output file and the rest of
the intermediate data generated during the processing is
thrown away. This data can be stored separately and can
be used in similar future operations to speed up the
process. Dache, a concept of cache will be introduced in
the project so as to boost the operation speed of MR.
The current Analytics tools and models that are
available in the market are very costly (SocialBro,
TweetStats, Twentyfeet,) unable to handle Big Data and
less secure. Hence the existing system lags during
functioning. The traditional Analytics systems takes a long
time to come up with results, so it is not beneficial to use
for Real Time Analytics. Real time analytics require high
speed and less time for performing it.
Querying Twitter data in a traditional RDBMS is
inefficient. There are many Twitter API which provide
streaming of twitter data. Streamed data is not preferred.
Traditional Hadoop MR does not use the concept of Cache.
Dache is a caching technique for Big Data. It boosts the
speed of the analysis. This will help to cut short the time
required for the entire process.[10]
III. PROPOSED METHODOLGY
We propose to grab data from Twitter and store
in HDFS. The data will be converted into formats such as
CSV and will be visualized via different tools. Dache is a
data-aware cache framework for big-data applications. In
Dache, tasks submit their intermediate results to the cache
manager. A task queries the cache manager before
executing the actual computing work. We implement
Dache by extending Hadoop. Advantages of proposed
System are as follows:
 Efficiency and Reliability of handling Big Data
increases with Hadoop. The system can be scaled
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1322
as per the user requirement. Storing of grabbed
data in HDFS will increase the speed, reliability
and efficiency of data & results.
 The CSV used is better for understanding. The CSV
will be exported to database for backup purposes.
Graphs and charts will provide a clear picture of
the latest trends.
 The operation speed will be boosted by Dache.
There would be no cost of data grabbing and
handling as all the tools used are open source
softwares. Dache will make the MR operations
faster.
The Proposed System will show us the trends in
market that are most talked about on social media. In the
end, we can compare various brands, topics and also the
traditional Hadoop MR operations vs Hadoop MR
operations using Dache.
A front end will be created wherein a user can sign up
for an account. After sign up, he can login the account.
There will be various options for grabbing data, analysis of
the data, graphs of analysis etc. At the backend, the
grabbed data will be stored in HDFS. It will be in the form
of JSON. This data will be converted into CSV. Using
Visualization tools, reports will be generated in the form of
graphs.
Dache helps in speeding up the Big Data whenever the
same data is accessed again, just like cache functions in
other applications. This will increase the speed of
processes and decrease the time required for accessing the
same data otherwise.
Figure 1: Proposed system architecture
Figure 1 shows the block diagram of the proposed system.
The project aims to provide the end user recent Market
Trends. The proposed system is very cost effective as most
of the work is done on free open source softwares. The
drawback of cost in existing system is overcome.
The proposed system architecture involves the following
modules:
1. Module 1: Grab Data From Social Media
Description: In this module, the data will be
grabbed from social site on the basis of the
keywords entered by the user. The grabbed data
will be stored in Hadoop distributed file system
(HDFS).
Input: The user will enter the four token keys and
keywords for grabbing the data. On
the basis of keywords the data will be grabbed.
Output: Output will be the grabbed data files on
the basis of keywords. The format of the files will
be JSON and will be stored in the HDFS or other
specified location.
2. Module 2: Process Data using Hadoop
Description: The Grabbed Data which is in the
form of JSON will be converted into
Comma Separated Value (CSV) format with
MapReduce operations. In MapReduce there will
be five stages viz Input Phase, Mapper Phase, Sort
and Shuffle Phase, Reducer Phase and Output
Phase. Different Functions will be carried out on
the CSV data and visualized graphically.
Input: Grabbed data in the form of JSON
Output: The data in CSV form, Graphs of the
analyzed data.
3. Module 3: Display Result
Description: This module will be used to give the
results to the user in the form of graphs and
keywords entered.
Input: Request for results.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1323
Output: Graphs of the analyzed data.
IV. IMPLEMENTATION
Initially, softwares like Apache Flume, Sqoop,
Twitter, Hadoop, VM Ware, Tableau, Pentaho, and Putty
have to be downloaded. Along with these, Flume settings
for Twitter are also required. Setup of Flume has to be
done first. All the services have to be started. A Twitter
account has to be made. In the Flume settings downloaded
for Twitter, desired keywords are entered. The keywords
are the topics for analysis. The data will be grabbed as per
the keywords entered. The data will be in millions. Hence
it is known as Big Data. The analyzed data will be stored in
HDFS in the form of JSON. This data will be converted to
CSV with the help of Sqoop. Dache impleme1ntation is the
next step. A Dache manager is created for speeding up the
process. The data will be processed by the visualization
tools which have already been downloaded. The output of
the analysis will be in the form of graphs. Studying of
graphical data is much simpler as compared to Big Data.
[1]
Hadoop framework is used to facilitate operations
on BigData. Hadoop allows us to install various softwares
and work with them. By installing Flume and running it,
we will be grabbing data from social sites and storing it
directly in Hadoop Distributed File System (HDFS). HDFS
is the storage part of Hadoop.
MapReduce will be used to perform operations such
as Conversion of Data, Word-count, etc and Dache will be
used to optimise and increase the speed of operations.
MapReduce is the processing part of Hadoop. Visualization
tools will be used to show the data graphically. Java will be
used to write MapReduce operations.
Technique for Dache Implementation:
Input Map Intermediate Reducer Output
Phase phase phase phase phase
Fig 4: High level description of architecture of Dache
We propose a novel cache description scheme. A
high-level description is presented in Fig. 4.3. This scheme
identifies the source input from which a cache item is
obtained, and the operations applied on the input, so that
a cache item produced by the workers in the map phase is
indexed properly. In the reduce phase, we devise a
mechanism to take into consideration the partition
operations applied on the output in the map phase. We
also present a method for reducers to utilize the cached
results in the map phase to accelerate the execution of the
MapReduce job. We implement Dache in the Hadoop
project by extending the relevant components.
Algorithm for MapReduce operations (Wordcount)
 Map()
- Input <filename, file text>
- Parses file and emits <word, count> pairs
 Reduce()
- Sums Values for the same keys and emits
<word, TotalCount>
The mapper emits an intermediate key-value pair for each
word in a document. The reducer sums up all counts for
each word.
class Mapper
method Map(docid a, doc d)
for all term t ∈ doc d do
Emit(term t, count 1)
class Reducer
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1324
method Reduce(term t, counts [c1, c2, . . .])
sum ← 0
for all count c ∈ counts [c1, c2, . . .] do
sum ← sum + c 6: Emit(term t, count sum)
Figure 2 shows a screenshot of grabbed data in
the form of CSV.[1]
Figure 2: Screenshot of Grabbed Data
V. CONCLUSION
As compared to the existing system, our proposed
system is cost effective. Requirements of this proposed
system are easily available. All the softwares required are
free open source softwares which cost minimal. Time
saved in the entire process is cut down. Great speed is
provided as compared to the existing system. We have
already grabbed data for some keywords. We are working
on the Dache implementation. We have stated the
advantages of the existing system. Finally, we have
explored a cost effective way to analyze social media
market trends and generate a graphical view of the
analysis.
REFERENCES
[1] Raveena Rana, Tejas Shah and Jitanksha Shrimanker
“Social Media Market Trender with Dache Manager using
Hadoop and Visualization Tools”, “International Research
Journal of Engineering and Technology”, Volume: 03,
Issue: 02, February 2016.
[2]Hadoop: The Definitive Guide by Tom White.
[3] Yaxiong Zhao, Jie Wu, Cong Liu, “Dache: A Data Aware
Caching for Big-Data Applications Using the MapReduce
Framework” ,TSINGHUA SCIENCE AND TECHNOLOGY
ISSNl 1007-0214l l05/10l lpp39-50 Volume 19, Number 1,
February 2014.
[4] Kaveh Ketabchi Khonsari, Zahra Amin Nayeri, Ali
Fathalian, Leila Fathalian, “Social Network Analysis of
Iran’s Green Movement Opposition Groups using Twitter”,
2010 International Conference on Advances in Social
Networks Analysis and Mining.
[5] Karen Stepanyan, Kerstin Borau, Carsten Ullrich, “A
Social Network Analysis Perspective on Student
Interaction within the Twitter Microblogging
Environment”, 2010 10th IEEE International Conference
on Advanced Learning Technologies.
[6] Guojun Liu, Ming Zhang, Fei Yan, “Large-Scale Social
Network Analysis based on MapReduce”,2010
International Conference on Computational Aspects of
Social Networks568.
[7] Kai Shuang, Yin Yang, Bin Cai, Zhe Xiang, “ X-RIME:
Hadoop-based Large-Scale Social Network Analysis”,
Proceedings of IC-BNMT20 10.
[8] Hyeokju Lee, Joon Her, Sung-Ryul Kim,
“Implementation of Large-Scalable Social Data Analysis
System based on MapReduce”, 2011 First ACIS/JNU
International Conference on Computers, Networks,
Systems, and Industrial Engineering.
[9] Ge Song, Zide Meng, Fabrice Huet, Frederick Magoules,
Lei Yu, Xuelian Lin, “A Hadoop MapReduce Performance
Prediction Method”, 2013 IEEE International Conference
on High Performance Computing and Communications &
2013 IEEE International Conference on Embedded and
Ubiquitous Computing.
[10] Gaurav D Rajurkar, Rajeshwari M Goudar, “ A speedy
data uploading approach for Twitter Trend and Sentiment
Analysis using Hadoop”, 2015 International Conference on
Computing Communication Control and Automation.

Contenu connexe

Tendances

A Survey on Data Mapping Strategy for data stored in the storage cloud 111
A Survey on Data Mapping Strategy for data stored in the storage cloud  111A Survey on Data Mapping Strategy for data stored in the storage cloud  111
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
NavNeet KuMar
 
Big_SQL_3.0_Whitepaper
Big_SQL_3.0_WhitepaperBig_SQL_3.0_Whitepaper
Big_SQL_3.0_Whitepaper
Scott Gray
 
Hortonworks.HadoopPatternsOfUse.201304
Hortonworks.HadoopPatternsOfUse.201304Hortonworks.HadoopPatternsOfUse.201304
Hortonworks.HadoopPatternsOfUse.201304
James Kenney
 

Tendances (20)

Paper ijert
Paper ijertPaper ijert
Paper ijert
 
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
A Survey on Data Mapping Strategy for data stored in the storage cloud  111A Survey on Data Mapping Strategy for data stored in the storage cloud  111
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training report
 
IJARCCE_49
IJARCCE_49IJARCCE_49
IJARCCE_49
 
Big data analysis concepts and references by Cloud Security Alliance
Big data analysis concepts and references by Cloud Security AllianceBig data analysis concepts and references by Cloud Security Alliance
Big data analysis concepts and references by Cloud Security Alliance
 
Big data analysis model for transformation and effectiveness in the e governa...
Big data analysis model for transformation and effectiveness in the e governa...Big data analysis model for transformation and effectiveness in the e governa...
Big data analysis model for transformation and effectiveness in the e governa...
 
IRJET- Survey of Big Data with Hadoop
IRJET-  	  Survey of Big Data with HadoopIRJET-  	  Survey of Big Data with Hadoop
IRJET- Survey of Big Data with Hadoop
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
IRJET- A Scenario on Big Data
IRJET- A Scenario on Big DataIRJET- A Scenario on Big Data
IRJET- A Scenario on Big Data
 
Enabling Governed Data Access with Tableau Data Server
Enabling Governed Data Access with Tableau Data Server Enabling Governed Data Access with Tableau Data Server
Enabling Governed Data Access with Tableau Data Server
 
introduction to big data frameworks
introduction to big data frameworksintroduction to big data frameworks
introduction to big data frameworks
 
SURVEY ON BIG DATA PROCESSING USING HADOOP, MAP REDUCE
SURVEY ON BIG DATA PROCESSING USING HADOOP, MAP REDUCESURVEY ON BIG DATA PROCESSING USING HADOOP, MAP REDUCE
SURVEY ON BIG DATA PROCESSING USING HADOOP, MAP REDUCE
 
Big data frameworks
Big data frameworksBig data frameworks
Big data frameworks
 
Big_SQL_3.0_Whitepaper
Big_SQL_3.0_WhitepaperBig_SQL_3.0_Whitepaper
Big_SQL_3.0_Whitepaper
 
Accelerating Time to Research Using CloudBank
Accelerating Time to Research Using CloudBankAccelerating Time to Research Using CloudBank
Accelerating Time to Research Using CloudBank
 
IRJET- Systematic Review: Progression Study on BIG DATA articles
IRJET- Systematic Review: Progression Study on BIG DATA articlesIRJET- Systematic Review: Progression Study on BIG DATA articles
IRJET- Systematic Review: Progression Study on BIG DATA articles
 
Ss eb29
Ss eb29Ss eb29
Ss eb29
 
Trends in Computer Science and Information Technology
Trends in Computer Science and Information TechnologyTrends in Computer Science and Information Technology
Trends in Computer Science and Information Technology
 
Hortonworks.HadoopPatternsOfUse.201304
Hortonworks.HadoopPatternsOfUse.201304Hortonworks.HadoopPatternsOfUse.201304
Hortonworks.HadoopPatternsOfUse.201304
 
The Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their UsageThe Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their Usage
 

Similaire à Social Media Market Trender with Dache Manager Using Hadoop and Visualization Tools

The “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedInThe “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedIn
Kun Le
 

Similaire à Social Media Market Trender with Dache Manager Using Hadoop and Visualization Tools (20)

IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOPIRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
 
Big Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewBig Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A Review
 
Big Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and StoringBig Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and Storing
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Big Data Testing Using Hadoop Platform
Big Data Testing Using Hadoop PlatformBig Data Testing Using Hadoop Platform
Big Data Testing Using Hadoop Platform
 
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock ExchangeIRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
 
IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED-V2I3P43
IJSRED-V2I3P43
 
IJET-V2I6P25
IJET-V2I6P25IJET-V2I6P25
IJET-V2I6P25
 
IRJET- Performing Load Balancing between Namenodes in HDFS
IRJET- Performing Load Balancing between Namenodes in HDFSIRJET- Performing Load Balancing between Namenodes in HDFS
IRJET- Performing Load Balancing between Namenodes in HDFS
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
 
Big Data & Hadoop
Big Data & HadoopBig Data & Hadoop
Big Data & Hadoop
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
 
Cloud Computing & Big Data
Cloud Computing & Big DataCloud Computing & Big Data
Cloud Computing & Big Data
 
Hadoop
HadoopHadoop
Hadoop
 
Sycamore Quantum Computer 2019 developed.pptx
Sycamore Quantum Computer 2019 developed.pptxSycamore Quantum Computer 2019 developed.pptx
Sycamore Quantum Computer 2019 developed.pptx
 
G017143640
G017143640G017143640
G017143640
 
The “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedInThe “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedIn
 
The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedIn
 
Survey Paper on Big Data and Hadoop
Survey Paper on Big Data and HadoopSurvey Paper on Big Data and Hadoop
Survey Paper on Big Data and Hadoop
 
DOCUMENT SELECTION USING MAPREDUCE Yenumula B Reddy and Desmond Hill
DOCUMENT SELECTION USING MAPREDUCE Yenumula B Reddy and Desmond HillDOCUMENT SELECTION USING MAPREDUCE Yenumula B Reddy and Desmond Hill
DOCUMENT SELECTION USING MAPREDUCE Yenumula B Reddy and Desmond Hill
 

Plus de IRJET Journal

Plus de IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Dernier

Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
jaanualu31
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 

Dernier (20)

PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 

Social Media Market Trender with Dache Manager Using Hadoop and Visualization Tools

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072 © 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1320 SOCIAL MEDIA MARKET TRENDER WITH DACHE MANAGER USING HADOOP AND VISUALIZATION TOOLS Raveena Rana [1], Tejas Shah[2], Jitanksha Shrimanker[3] 1-3BE Student, Information Technology Department K. J. Somaiya Institute of Engineering & Information Technology, Mumbai, Maharashtra, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract- The modern business and advertising strategies all involve market analysis and study of the trends going on in the world. Social media is the main advertising and analysis market in today’s world. The current Analytics tools and models that are available in the market are very costly, unable to handle Big Data and less secure. The traditional Analytics systems takes a long time to come up with results, so it is not beneficial to use for Real Time Analytics. So, the proposed work resolves all these problems by combining the Apache Open Source platform which solves the issues of Real Time Analytics using HADOOP. It also provides scalability and reduced cost over analytics by using open Source Software. [1] Key Words: Map Reduce (MR), Hadoop, Java , Flume , Sqoop, Caching. I. INTRODUCTION The world is getting smaller and smaller day by day through internet and usage of social sites. Data is growing exponentially as the number of users and activity over the web is increasing rapidly. The consumer market is becoming increasingly competitive and the need to keep a tab on trends and consumers’ opinions is a must. [3] Social Media is being extensively used by people belonging to all age groups’ for expressing their reviews and suggestions whenever they buy a new product or a new product is launched. Also it is used for recommendations to friends and family of a particular product or service. Social media sites like Facebook and Twitter generate huge amounts of data on a daily basis. This data is directly from the end user or consumer and as a result is of great value to the organizations and companies. [1] This data is still scattered and not in a suitable format that may yield any results. Also the data is so huge that the buzz-word Big Data is the only one that can represent such an amount of data. Our Project is based on Hadoop framework that can manage and handle Big Data efficiently. Main reasons for choosing Hadoop are: 1. The data generated by Social sites are too large for traditional databases and analysis tools. 2. Hadoop being a framework gives more flexibility in implementing ideas. 3. Scalability is a major advantage of using Hadoop over other platforms. Application developers specify the computation in terms of a map and a reduce function, and the underlying MR Job scheduling system automatically parallelizes the computation across a cluster of machines. MR gains popularity for its simple programming interface and excellent performance when implementing a large spectrum of applications. Since most such applications take a large amount of input data, they are nicknamed “Big-data applications”. Hadoop is an open-source implementation of the Google MR programming model. Hadoop consists of the Hadoop Common, which provides access to the file systems supported by Hadoop. MR provides a standardized framework for implementing
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072 © 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1321 large-scale distributed computation, namely, the big-data applications. [5][6] The work proposes to combine the Apache Open Source Modules and configure them to get the required result. Hadoop is flexible and scalable architecture. Google MR is a programming model and a software framework for large-scale distributed computing on large amounts of data [7][8]. . We propose Dache, a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager. A task queries the cache manager before executing the actual computing work. We implement Dache by extending Hadoop. [2] In the proposed work, we would be grabbing data from Twitter and analyzing it through visualization techniques. [4] The project aims at finding out the trends of the market according to the social sites such as Twitter and visualizing and presenting it in a simple format such as a graph or chart that can be understood by anyone and a clear conclusion can be drawn out of it. It also aims at tackling data size issues by using Hadoop and MR frameworks that are open-source. It also tries to improve the processing speed by introducing a cache concept for distributed computing named Dache.[9] II. DRAWBACKS OF EXISTING SYSTEM The current Analytics tools and models that are available in the market are very costly, unable to handle Big Data and less secure. The traditional Analytics systems takes a long time to come up with results, so it is not beneficial to use for Real Time Analytics. Google’s MR and Apache’s Hadoop are the defacto software systems for big-data applications. An observation of the MR framework is that the framework generates a large amount of intermediate data. Such abundant information is thrown away after the tasks finish, because MR is unable to utilize them. Also the MR operates on huge amount of data and as a result the processing takes usually a longer period of time. A drawback of Hadoop operation is that MR, its processing part creates a lot of intermediate data during operations such as word-count on a particular file. But in the end we get only the desired output file and the rest of the intermediate data generated during the processing is thrown away. This data can be stored separately and can be used in similar future operations to speed up the process. Dache, a concept of cache will be introduced in the project so as to boost the operation speed of MR. The current Analytics tools and models that are available in the market are very costly (SocialBro, TweetStats, Twentyfeet,) unable to handle Big Data and less secure. Hence the existing system lags during functioning. The traditional Analytics systems takes a long time to come up with results, so it is not beneficial to use for Real Time Analytics. Real time analytics require high speed and less time for performing it. Querying Twitter data in a traditional RDBMS is inefficient. There are many Twitter API which provide streaming of twitter data. Streamed data is not preferred. Traditional Hadoop MR does not use the concept of Cache. Dache is a caching technique for Big Data. It boosts the speed of the analysis. This will help to cut short the time required for the entire process.[10] III. PROPOSED METHODOLGY We propose to grab data from Twitter and store in HDFS. The data will be converted into formats such as CSV and will be visualized via different tools. Dache is a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager. A task queries the cache manager before executing the actual computing work. We implement Dache by extending Hadoop. Advantages of proposed System are as follows:  Efficiency and Reliability of handling Big Data increases with Hadoop. The system can be scaled
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072 © 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1322 as per the user requirement. Storing of grabbed data in HDFS will increase the speed, reliability and efficiency of data & results.  The CSV used is better for understanding. The CSV will be exported to database for backup purposes. Graphs and charts will provide a clear picture of the latest trends.  The operation speed will be boosted by Dache. There would be no cost of data grabbing and handling as all the tools used are open source softwares. Dache will make the MR operations faster. The Proposed System will show us the trends in market that are most talked about on social media. In the end, we can compare various brands, topics and also the traditional Hadoop MR operations vs Hadoop MR operations using Dache. A front end will be created wherein a user can sign up for an account. After sign up, he can login the account. There will be various options for grabbing data, analysis of the data, graphs of analysis etc. At the backend, the grabbed data will be stored in HDFS. It will be in the form of JSON. This data will be converted into CSV. Using Visualization tools, reports will be generated in the form of graphs. Dache helps in speeding up the Big Data whenever the same data is accessed again, just like cache functions in other applications. This will increase the speed of processes and decrease the time required for accessing the same data otherwise. Figure 1: Proposed system architecture Figure 1 shows the block diagram of the proposed system. The project aims to provide the end user recent Market Trends. The proposed system is very cost effective as most of the work is done on free open source softwares. The drawback of cost in existing system is overcome. The proposed system architecture involves the following modules: 1. Module 1: Grab Data From Social Media Description: In this module, the data will be grabbed from social site on the basis of the keywords entered by the user. The grabbed data will be stored in Hadoop distributed file system (HDFS). Input: The user will enter the four token keys and keywords for grabbing the data. On the basis of keywords the data will be grabbed. Output: Output will be the grabbed data files on the basis of keywords. The format of the files will be JSON and will be stored in the HDFS or other specified location. 2. Module 2: Process Data using Hadoop Description: The Grabbed Data which is in the form of JSON will be converted into Comma Separated Value (CSV) format with MapReduce operations. In MapReduce there will be five stages viz Input Phase, Mapper Phase, Sort and Shuffle Phase, Reducer Phase and Output Phase. Different Functions will be carried out on the CSV data and visualized graphically. Input: Grabbed data in the form of JSON Output: The data in CSV form, Graphs of the analyzed data. 3. Module 3: Display Result Description: This module will be used to give the results to the user in the form of graphs and keywords entered. Input: Request for results.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072 © 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1323 Output: Graphs of the analyzed data. IV. IMPLEMENTATION Initially, softwares like Apache Flume, Sqoop, Twitter, Hadoop, VM Ware, Tableau, Pentaho, and Putty have to be downloaded. Along with these, Flume settings for Twitter are also required. Setup of Flume has to be done first. All the services have to be started. A Twitter account has to be made. In the Flume settings downloaded for Twitter, desired keywords are entered. The keywords are the topics for analysis. The data will be grabbed as per the keywords entered. The data will be in millions. Hence it is known as Big Data. The analyzed data will be stored in HDFS in the form of JSON. This data will be converted to CSV with the help of Sqoop. Dache impleme1ntation is the next step. A Dache manager is created for speeding up the process. The data will be processed by the visualization tools which have already been downloaded. The output of the analysis will be in the form of graphs. Studying of graphical data is much simpler as compared to Big Data. [1] Hadoop framework is used to facilitate operations on BigData. Hadoop allows us to install various softwares and work with them. By installing Flume and running it, we will be grabbing data from social sites and storing it directly in Hadoop Distributed File System (HDFS). HDFS is the storage part of Hadoop. MapReduce will be used to perform operations such as Conversion of Data, Word-count, etc and Dache will be used to optimise and increase the speed of operations. MapReduce is the processing part of Hadoop. Visualization tools will be used to show the data graphically. Java will be used to write MapReduce operations. Technique for Dache Implementation: Input Map Intermediate Reducer Output Phase phase phase phase phase Fig 4: High level description of architecture of Dache We propose a novel cache description scheme. A high-level description is presented in Fig. 4.3. This scheme identifies the source input from which a cache item is obtained, and the operations applied on the input, so that a cache item produced by the workers in the map phase is indexed properly. In the reduce phase, we devise a mechanism to take into consideration the partition operations applied on the output in the map phase. We also present a method for reducers to utilize the cached results in the map phase to accelerate the execution of the MapReduce job. We implement Dache in the Hadoop project by extending the relevant components. Algorithm for MapReduce operations (Wordcount)  Map() - Input <filename, file text> - Parses file and emits <word, count> pairs  Reduce() - Sums Values for the same keys and emits <word, TotalCount> The mapper emits an intermediate key-value pair for each word in a document. The reducer sums up all counts for each word. class Mapper method Map(docid a, doc d) for all term t ∈ doc d do Emit(term t, count 1) class Reducer
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072 © 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 1324 method Reduce(term t, counts [c1, c2, . . .]) sum ← 0 for all count c ∈ counts [c1, c2, . . .] do sum ← sum + c 6: Emit(term t, count sum) Figure 2 shows a screenshot of grabbed data in the form of CSV.[1] Figure 2: Screenshot of Grabbed Data V. CONCLUSION As compared to the existing system, our proposed system is cost effective. Requirements of this proposed system are easily available. All the softwares required are free open source softwares which cost minimal. Time saved in the entire process is cut down. Great speed is provided as compared to the existing system. We have already grabbed data for some keywords. We are working on the Dache implementation. We have stated the advantages of the existing system. Finally, we have explored a cost effective way to analyze social media market trends and generate a graphical view of the analysis. REFERENCES [1] Raveena Rana, Tejas Shah and Jitanksha Shrimanker “Social Media Market Trender with Dache Manager using Hadoop and Visualization Tools”, “International Research Journal of Engineering and Technology”, Volume: 03, Issue: 02, February 2016. [2]Hadoop: The Definitive Guide by Tom White. [3] Yaxiong Zhao, Jie Wu, Cong Liu, “Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Framework” ,TSINGHUA SCIENCE AND TECHNOLOGY ISSNl 1007-0214l l05/10l lpp39-50 Volume 19, Number 1, February 2014. [4] Kaveh Ketabchi Khonsari, Zahra Amin Nayeri, Ali Fathalian, Leila Fathalian, “Social Network Analysis of Iran’s Green Movement Opposition Groups using Twitter”, 2010 International Conference on Advances in Social Networks Analysis and Mining. [5] Karen Stepanyan, Kerstin Borau, Carsten Ullrich, “A Social Network Analysis Perspective on Student Interaction within the Twitter Microblogging Environment”, 2010 10th IEEE International Conference on Advanced Learning Technologies. [6] Guojun Liu, Ming Zhang, Fei Yan, “Large-Scale Social Network Analysis based on MapReduce”,2010 International Conference on Computational Aspects of Social Networks568. [7] Kai Shuang, Yin Yang, Bin Cai, Zhe Xiang, “ X-RIME: Hadoop-based Large-Scale Social Network Analysis”, Proceedings of IC-BNMT20 10. [8] Hyeokju Lee, Joon Her, Sung-Ryul Kim, “Implementation of Large-Scalable Social Data Analysis System based on MapReduce”, 2011 First ACIS/JNU International Conference on Computers, Networks, Systems, and Industrial Engineering. [9] Ge Song, Zide Meng, Fabrice Huet, Frederick Magoules, Lei Yu, Xuelian Lin, “A Hadoop MapReduce Performance Prediction Method”, 2013 IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing. [10] Gaurav D Rajurkar, Rajeshwari M Goudar, “ A speedy data uploading approach for Twitter Trend and Sentiment Analysis using Hadoop”, 2015 International Conference on Computing Communication Control and Automation.