SlideShare une entreprise Scribd logo
1  sur  46
1
DEPARTMENT OF STUDIES IN COMPUTER SCIENCE
TOPIC:BIGDATA ANALYTICS IN AGRICULURE
SUBMITTED TO,
MANASA K N
DOS IN COMPUTER SCINCE
SUBMITTED BY,
TEJASHREE K
YAMUNA
RESHAD
SMITHA
DOS IN COMPUTER SCIENCE
2
3
Contents
1. INTRODUCTION
2. HOW BIGDATA APPLICABLE IN AGRICULTURE
3. ANALYSIS OF BIGDATA
4. TOOLS
5. CONCLUSION
4
INTRODUCTION 5
BIG DATA :
Big data is an extensive collection of both structured and unstructured data that
can be mined for information and analyzed to build predictive systems for better
decision making.
Big data is consider as a large collection of dataset which is having high volume,
velocity, variety.
Volume: The amount of data generated
Velocity: How speed data is generated and processed
Variety : Variation of data with respect to the time
6
BIG DATA ANALYTICS IN AGRICULTURE
Big data analysis is successfully adopted to industries
like banking, insurance etc. Allthough agriculture did
not adopted big data analysis for past few years,
recently it is used to use it .
Big data analytics in agriculture can be studied under
two major areas: Smart farming
And precision agriculture.
Farmers use big data to get information on changing
weather, rainfall, fertilizer usage, and other factors
that impact the crop yield.
All of the information assists farmers in
making accurate and dependable decision that
maximize their productivity and cultivating the land.
7
The role of Big Data in the Agriculture industry
Big data in the agriculture industry is completely based
on using technology, information, and analytics to bring
useful information to farmers. Big data can be utilized for
grabbing information about the agriculture industry or it
can prove beneficial for any specific segment or area to
improve its efficiency. Data mining processes are utilized
by Big Data to create such vital information. With this
methodology, you can find the important patterns in a
huge set of data and condense this information into
useful forms. There are different modern systems, such
as artificial intelligence, machine learning statistics, and
more, that are used in the big data mechanism.
8
How is big data analytics transforming agriculture?
 Boosting productivity – Data collected from GPS-
equipped tractors, soil sensors, and other external
sources has helped in better management of seeds,
pesticides, and fertilizers while increasing productivity to
feed the ever-increasing global population.
 Access to plant genome information – This has allowed
the development of useful agronomic traits.
 Predicting yields – Mathematical models and machine
learning are used to collate and analyze data obtained
from yield, chemicals, weather, and biomass index. The
use of sensors for data collection reduces erroneous
manual work and provides useful insights on yield
prediction.
 Risk management– Data-driven farming has mitigated
crop failures arising due to changing weather patterns.
 Food safety – Collection of data relating to temperature,
humidity, and chemicals, lowers the risk of food spoilage
by early detection of microbes and other contaminants.
9
HOW BIGDATA CAN HELP AGRICULTURE ?
10
HOW TO USE BIGDATA ANALYTICS IN
AGRICULTURE
 To counter pressures of increasing food demand and climate changes,
policymakers and industry leaders are seeking assistants from technology
forces such as Iot,bigdata analytics and cloud computing.
 IoT, devices helping first phase of this process –data collection ,sensor plugged
in tractors and trucks as well as in fields, soil and plants aid in the collection of
real time data directly from the ground.
 Second, analysts integrate the large amount of data collected with other
information available in the cloud such as weather data pricing with models to
determine the patterns.
 Finally, these patterns and insights assist in controlling the problem. They
helped to pinpoint existence issues, like operational inefficiencies and
problems.
11
 The adoption of analytics in agriculture has been increasing consistently
;its market size is expected to grow from USD 585million in 2018 to USD
1236 million by 2023 at a compound annual rate (CAGR)of 16.2%.
12
USE OF BIG DATA IN AGRICULTURE
SECTOR
Top four use cases for big data on the form:
1.Feeding a growing population.
2.Using pesticides ethical.
3.Optimizing form equipment.
4.Managing supply chain.
13
1. Feeding a growing population:
This is one of the key challenges that even governments are putting their
heads together to solve. One way to achieve this is to increase the yield from
existing farmlands.
Big data provides formers granular data on rainfall patterns, water cycles,
fertilizer requirements, and more. This enables them to make smart decisions,
such as what crops to plant for better profitability and when to harvest. The
right decisions ultimately improve farm yields.
14
2. Using pesticides ethnically
 Administration of pesticides has been a contentious issue due to its side effects
on the ecosystem. Big data allows farmers to manage this better by
recommending what pesticides to apply, when ,and by how much. By
monitoring it closely farmers can adhere to government regulations and avoid
overuse of chemicals in food production. Moreover, this leads to increased
profitability because crops don't get destroyed by weeds and insects
15
3. Optimizing farm equipment
 Companies like John Deere have integrated sensors in their farming
equipment and deployed big data applications that will help better
manage their fleet. For large farms, this level of monitoring can be a life
saver as it lets users know of tractor availability, service due dates, and
fuel refill alerts . In essence this optimizes usage and ensure the long -
term health of farm equipment.
16
4. Managing supply chain issues:
 McKinney reports that a third of food produced for human consumption
is lost or wasted every year. A devastating fact since the industry
struggles to bridge the gap between supply and demand. To address
this, food delivery cycle from producer to the market need to be reduced.
Big data can help achieve supply chain efficiencies by tracking and
optimizing delivery truck routes.
17
Analysis of agriculture data using data mining
techniques: application of big data
In agriculture sector where farmers and agribusinesses have to make
innumerable decisions every day and intricate complexities involves the
various factors influencing them. An essential issue for agricultural planning
intention is the accurate yield estimation for the numerous crops involved in
the planning. Data mining techniques are necessary approach for
accomplishing practical and effective solutions for this problem. Agriculture
has been an obvious target for big data. Environmental conditions, variability
in soil, input levels, combinations and commodity prices have made it all the
more relevant for farmers to use information and get help to make critical
farming decisions. This focuses on the analysis of the agriculture data and
finding optimal parameters to maximize the crop production using data
mining techniques like PAM, CLARA, DBSCAN and Multiple Linear Regression.
Mining the large amount of existing crop, soil and climatic data, and analyzing
new, non-experimental data optimizes the production and makes agriculture
more resilient to climatic change.
Big Data, PAM, CLARA
18
 Input dataset consist of 6 year data with following parameters namely: year, State-
Karnataka (28 districts), District, crop (cotton, groundnut, jowar, rice and wheat.),
season (kharif, rabi, summer), area (in hectares), production (in tonnes), average
temperature (°C), average rainfall (mm), soil, PH value, soil type, major fertilizers,
nitrogen (kg/Ha), phosphorus (Kg/Ha),Potassium(Kg/Ha), minimum rainfall required,
minimum temperature required.
 In proposed work, modified approach of DBSCAN method is used to cluster the data
based on districts which are having similar temperature, rain fall and soil type. PAM
and CLARA are used to cluster the data based on the districts which are producing
maximum crop production (In proposed work wheat crop is considered as example).
Based on these analyses we are obtaining the optimal parameters to produce the
maximum crop production. Multiple linear regression method is used to forecast the
annual crop yield
19
Partition around medoids (PAM)
 It is a partitioning based algorithm. It breaks the input data into number of
groups. It finds a set of objects called medoids that are centrally located.
With the medoids, nearest data points can be calculated and made it as
clusters. The algorithm has two phases:
20
21
 1. BUILD phase, a collection of k objects are selected for an initial set S.
• Arbitrarily choose k objects as the initial medoids.
• Until no change, do.
–– (Re) assign each object to the cluster with the nearest medoid.
– Improve the quality of the k-medoids .
 2. SWAP phase, one tries to improve the quality of the clustering by exchanging
selected objects with unselected objects. Choose the minimum swapping cost.
Example: For each medoid m1, for each non-medoid data point d; Swap m1 and d,
recomputed the cost (sum of distances of points to their medoid), if total cost of
the configuration increased in the previous step, undo the swap Fig. 2 depicts the
steps involved the PAM algorithms.
22
CLARA (clustering large applications)
 CLARA (clustering large applications) It is designed by Kaufman and Rousseeuw
to handle large datasets, CLARA (clustering large applications) relies on sampling
. Instead of finding representative objects for the entire data set, CLARA draws a
sample of the data set, applies PAM on the sample, and finds the medoids of the
sample. To come up with better approximations, CLARA draws multiple samples
and gives the best clustering as the output. Here, for accuracy, the quality of the
clustering is measured based on the average dissimilarity of all objects in the
entire data set.
23
Multiple linear regression
to forecast the crop yield
Multiple linear regression is a variant of
“linear regression” analysis. Tis model is
built to establish the relationship that
exists between one dependent variable
and two or more independent variables
.For a given dataset where x1… xk are
independent variables and Y is a
dependent variable, the multiple linear
regression fts the dataset to the model:
yi = β0 + β1x1i + β2x2i +···+ βkxki + ε
24
 is the y-intercept and β1, β2, ... , βk parameters are called the partial
coeffient.In matrix form
Y = XB + E Y ,
Before applying the multiple linear regression to forecast the crop yield,
it’s necessary to know the significant attributes from the database. All the
attributes used in the database will not be significant or changing the
value of these attributes will not affect anything on the dependent
variables. Such attributes can be neglected. P value test is performed on
the database to find the significant attributes and multiple linear
regression is applied only on the significant values to forecast the crop
yield
25
Evaluation methods
26
depicts the different districts of Karnataka which are
having similar temperature range, rain fall range and soil
types respectively..
27
PAM
 PAM To apply the PAM algorithm on the dataset, initially user need to give
k (Number of clusters), where k is given as 3 in current experiment. Crop
yield is categorised into LOW, MODERATE and HIGH production. Total
districts are clustered into 3 clusters using PAM clustering method.
 As a result of the analysis, North Karnataka districts such as Bijapur,
Dharwad, Bagalkot, Belgaum, Raichur, Bellary, Chitradurga and Davangere
are the districts which have maximum wheat crop production
28
RESULTS/Clusters: 29
30
31
Study and analysis of temperature and wheat crop
production in different districts of Karnataka as shown in
Fig. 12. From the Fig. 12, we can analyze that the optimal
temperature for Wheat crop
32
33
34
35
Conclusion
 Various data mining techniques are implemented on the input data to
assess the best performance yielding method. The present work used data
mining techniques PAM, CLARA and DBSCAN to obtain the optimal climate
requirement of wheat like optimal range of best temperature, worst
temperature and rain fall to achieve higher production of wheat crop.
Clustering methods are compared using quality metrics. According to the
analyses of clustering quality metrics, DBSCAN gives the better clustering
quality than PAM and CLARA, CLARA gives the better clustering quality
than the PAM. Te proposed work can also be extended to analyze the soil
and other factors for the crop and to increase the crop production under
the different climatic conditions
36
Image Processing IM toolkit, VTK toolkit, OpenCv library
Machine Learning R, Google tenserflow, Apache Mahout, Weka, Mlpack
Cloud based platforms
for large scale
information storing
EMC corporation, MapR converged data platforms, Apache Pig,
Big Databases Hive, HadoopDB, Mango DB, Google big table, Cassandra, PostGIS
Statistical tools Norsys Netica, R, Weka
37
Image Processing Tool:
VTK toolkit:
The Visualization Toolkit (VTK) is open-source software for manipulating and
displaying scientific data. It comes with state-of-the-art tools for 3D
rendering, a suite of widgets for 3D interaction, and extensive 2D plotting
capability.
OpenCV:
OpenCV is a huge open-source library the computer vision, machine learning,
and image processing and now it plays a major role in real-time operation
which is very important in today’s systems.
38
Machine Learning tools:
R:
R analytics is data analytics using R programming language, an open-source
language used for statistical computing or graphics. This programming
language is often used in statistical analysis and data mining. It can be used
for analytics to identify patterns and build practical models.
Google TensorFlow:
TensorFlow is a free and open-source software library for machine learning
and artificial intelligence. It can be used across a range of tasks but has a
particular focus on training and inference of deep neural networks.
39
Machine Learning tools:
R:
R analytics is data analytics using R programming language, an open-source language used for statistical computing or graphics.
This programming language is often used in statistical analysis and data mining. It can be used for analytics to identify patterns and
build practical models.
Google Tensor Flow:
Tensor Flow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range
of tasks but has a particular focus on training and inference of deep neural networks.
Apache Mahout:
Apache Mahout is an open-source project to create scalable, machine-learning algorithms. Mahout operates in addition to Hadoop,
which allows you to apply the concept of machine learning via a selection of Mahout algorithms to distributed computing via
Hadoop.
40
Cloud-based platforms for large-scale information storing:
EMC Corporation:
EMC is a multinational provider of products and services related to cloud computing,
storage, big data, data analytics, information security, content management, and converged
infrastructure. EMC was acquired by Dell in September 2016 and the company was
renamed to Dell EMC.
MapR converged data platforms: A platform for all the data and applications. With MapR,
users have a single platform (on a single codebase!) that delivers data-wide convergence. It
is the only platform that has a distributed file system that supports storage and analytics of
data streams, files, and NoSQL tables in the same converged
Apache Pig:
Apache Pig is an abstraction over Map Reduce. It is a tool/platform which is used to analyze
larger sets of data representing them as data flows. Pig is generally used with Hadoop; we
can perform all the data manipulation operations in Hadoop using Apache Pig
41
 Big Databases
Hive:
Hive allows users to read, write, and manage petabytes of data using SQL. Hive is built on top of Apache Hadoop, which
is an open-source framework used to efficiently store and process large datasets. As a result, Hive is closely integrated
with Hadoop, and is designed to work quickly on petabytes of data.
Hadoop DB:
Hadoop handles larger data sets but only writes data once. SQL is easier to use but more difficult to scale. Apache
Hadoop is an open-source framework that is used to efficiently store and process large datasets ranging in size from
gigabytes to petabytes of data. Instead of using one large computer to store and process the data, Hadoop allows
clustering multiple computers to analyze massive datasets in parallel more quickly.
Mango DB:
MongoDB is an open-source document-oriented database that is designed to store a large scale of data and also allows
you to work with that data very efficiently. It is categorized under the NoSQL (Not only SQL) database because the storage
and retrieval of data in the MongoDB are not in the form of tables.
Google big table:
Big table is ideal for storing large amounts of single-keyed data with low latency. It supports high read and writes
throughput at low latency, and it's an ideal data source for Map Reduce operations.
42
 Statistical tools:
Norsys Netica: Netica is a powerful, easy-to-use, complete program for working with belief networks
and influence diagrams. It has an intuitive and smooth user interface for drawing the networks, and
the relationships between variables may be entered as individual probabilities, in the form of
equations, or learned from data files (which may be in ordinary tab-delimited form and have "missing
data").
Weka:
Weka is a collection of machine-learning algorithms for data mining tasks. The algorithms can either
be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-
processing, classification, regression, clustering, association rules, and visualization.
43
References:
 The objective of proposed work is to analyses the agriculture data using
data mining In proposed work, agriculture data has been collected from
following sources: Dataset in agricultural sector [https://data.gov.in/,
http://raitamitra.kar.nic.in/ statistics].
 Crop wise agriculture data [html://CROPWISE_NORMAL_AREA],
Agriculture data of different districts
[http://14.139.94.101/fertimeter/Distkar.aspx],http://raitamitra.kar.nic.in/EN
G/statistics.asp],
 Agriculture data based on weather, temperature, and relative humidity
[http://dmc. kar.nic.in/trg.pdftechniques].
44
Thank you
45
46

Contenu connexe

Similaire à big data.pptx

An Overview of Crop Yield Prediction using Machine Learning Approach
An Overview of Crop Yield Prediction using Machine Learning ApproachAn Overview of Crop Yield Prediction using Machine Learning Approach
An Overview of Crop Yield Prediction using Machine Learning ApproachIRJET Journal
 
Big Data and AI Revolution in Precision Agriculture
Big Data and AI Revolution in Precision AgricultureBig Data and AI Revolution in Precision Agriculture
Big Data and AI Revolution in Precision AgricultureSaleAnish
 
Crop Prediction using IoT & Machine Learning Algorithm
Crop Prediction using IoT & Machine Learning AlgorithmCrop Prediction using IoT & Machine Learning Algorithm
Crop Prediction using IoT & Machine Learning AlgorithmIRJET Journal
 
IRJET- Survey of Estimation of Crop Yield using Agriculture Data
IRJET- Survey of Estimation of Crop Yield using Agriculture DataIRJET- Survey of Estimation of Crop Yield using Agriculture Data
IRJET- Survey of Estimation of Crop Yield using Agriculture DataIRJET Journal
 
IRJET- Crop Yield Prediction and Disease Detection using IoT Approach
IRJET- Crop Yield Prediction and Disease Detection using IoT ApproachIRJET- Crop Yield Prediction and Disease Detection using IoT Approach
IRJET- Crop Yield Prediction and Disease Detection using IoT ApproachIRJET Journal
 
Selection of crop varieties and yield prediction based on phenotype applying ...
Selection of crop varieties and yield prediction based on phenotype applying ...Selection of crop varieties and yield prediction based on phenotype applying ...
Selection of crop varieties and yield prediction based on phenotype applying ...IJECEIAES
 
IRJET - Agricultural Analysis using Data Mining Techniques
IRJET - Agricultural Analysis using Data Mining TechniquesIRJET - Agricultural Analysis using Data Mining Techniques
IRJET - Agricultural Analysis using Data Mining TechniquesIRJET Journal
 
An intensive analtics for farmer using big data
An intensive analtics for farmer using big dataAn intensive analtics for farmer using big data
An intensive analtics for farmer using big dataDr. C.V. Suresh Babu
 
IRJET- Crop Prediction System using Machine Learning Algorithms
IRJET- Crop Prediction System using Machine Learning AlgorithmsIRJET- Crop Prediction System using Machine Learning Algorithms
IRJET- Crop Prediction System using Machine Learning AlgorithmsIRJET Journal
 
An Intensive Analtics for farmer using Big Data by vitul chauhan.pdf
An Intensive Analtics for farmer using Big Data by vitul chauhan.pdfAn Intensive Analtics for farmer using Big Data by vitul chauhan.pdf
An Intensive Analtics for farmer using Big Data by vitul chauhan.pdfVitulChauhan
 
BHARATH KISAN HELPLINE
BHARATH KISAN HELPLINEBHARATH KISAN HELPLINE
BHARATH KISAN HELPLINEIRJET Journal
 
Automated Machine Learning based Agricultural Suggestion System
Automated Machine Learning based Agricultural Suggestion SystemAutomated Machine Learning based Agricultural Suggestion System
Automated Machine Learning based Agricultural Suggestion SystemIRJET Journal
 
A COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEM
A COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEMA COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEM
A COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEMIRJET Journal
 
IMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEM
IMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEMIMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEM
IMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEMIRJET Journal
 
IRJET- Crop Prediction and Disease Detection
IRJET-  	  Crop Prediction and Disease DetectionIRJET-  	  Crop Prediction and Disease Detection
IRJET- Crop Prediction and Disease DetectionIRJET Journal
 
Revolution in Farming with Big Data
Revolution in Farming with Big DataRevolution in Farming with Big Data
Revolution in Farming with Big DataSauravJaiswal17
 
Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...
Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...
Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...IRJET Journal
 
Big Data Solution in Agriculture.pdf
Big Data Solution in Agriculture.pdfBig Data Solution in Agriculture.pdf
Big Data Solution in Agriculture.pdfLily Williams
 

Similaire à big data.pptx (20)

An Overview of Crop Yield Prediction using Machine Learning Approach
An Overview of Crop Yield Prediction using Machine Learning ApproachAn Overview of Crop Yield Prediction using Machine Learning Approach
An Overview of Crop Yield Prediction using Machine Learning Approach
 
Big Data and AI Revolution in Precision Agriculture
Big Data and AI Revolution in Precision AgricultureBig Data and AI Revolution in Precision Agriculture
Big Data and AI Revolution in Precision Agriculture
 
Datamining 4
Datamining 4Datamining 4
Datamining 4
 
Crop Prediction using IoT & Machine Learning Algorithm
Crop Prediction using IoT & Machine Learning AlgorithmCrop Prediction using IoT & Machine Learning Algorithm
Crop Prediction using IoT & Machine Learning Algorithm
 
IRJET- Survey of Estimation of Crop Yield using Agriculture Data
IRJET- Survey of Estimation of Crop Yield using Agriculture DataIRJET- Survey of Estimation of Crop Yield using Agriculture Data
IRJET- Survey of Estimation of Crop Yield using Agriculture Data
 
IRJET- Crop Yield Prediction and Disease Detection using IoT Approach
IRJET- Crop Yield Prediction and Disease Detection using IoT ApproachIRJET- Crop Yield Prediction and Disease Detection using IoT Approach
IRJET- Crop Yield Prediction and Disease Detection using IoT Approach
 
Selection of crop varieties and yield prediction based on phenotype applying ...
Selection of crop varieties and yield prediction based on phenotype applying ...Selection of crop varieties and yield prediction based on phenotype applying ...
Selection of crop varieties and yield prediction based on phenotype applying ...
 
IRJET - Agricultural Analysis using Data Mining Techniques
IRJET - Agricultural Analysis using Data Mining TechniquesIRJET - Agricultural Analysis using Data Mining Techniques
IRJET - Agricultural Analysis using Data Mining Techniques
 
An intensive analtics for farmer using big data
An intensive analtics for farmer using big dataAn intensive analtics for farmer using big data
An intensive analtics for farmer using big data
 
IRJET- Crop Prediction System using Machine Learning Algorithms
IRJET- Crop Prediction System using Machine Learning AlgorithmsIRJET- Crop Prediction System using Machine Learning Algorithms
IRJET- Crop Prediction System using Machine Learning Algorithms
 
An Intensive Analtics for farmer using Big Data by vitul chauhan.pdf
An Intensive Analtics for farmer using Big Data by vitul chauhan.pdfAn Intensive Analtics for farmer using Big Data by vitul chauhan.pdf
An Intensive Analtics for farmer using Big Data by vitul chauhan.pdf
 
BHARATH KISAN HELPLINE
BHARATH KISAN HELPLINEBHARATH KISAN HELPLINE
BHARATH KISAN HELPLINE
 
Automated Machine Learning based Agricultural Suggestion System
Automated Machine Learning based Agricultural Suggestion SystemAutomated Machine Learning based Agricultural Suggestion System
Automated Machine Learning based Agricultural Suggestion System
 
A COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEM
A COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEMA COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEM
A COMPREHENSIVE SURVEY ON AGRICULTURE ADVISORY SYSTEM
 
IMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEM
IMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEMIMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEM
IMPLEMENTATION PAPER ON AGRICULTURE ADVISORY SYSTEM
 
IRJET- Crop Prediction and Disease Detection
IRJET-  	  Crop Prediction and Disease DetectionIRJET-  	  Crop Prediction and Disease Detection
IRJET- Crop Prediction and Disease Detection
 
Revolution in Farming with Big Data
Revolution in Farming with Big DataRevolution in Farming with Big Data
Revolution in Farming with Big Data
 
Publisher in research
Publisher in researchPublisher in research
Publisher in research
 
Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...
Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...
Supervise Machine Learning Approach for Crop Yield Prediction in Agriculture ...
 
Big Data Solution in Agriculture.pdf
Big Data Solution in Agriculture.pdfBig Data Solution in Agriculture.pdf
Big Data Solution in Agriculture.pdf
 

Dernier

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

big data.pptx

  • 1. 1
  • 2. DEPARTMENT OF STUDIES IN COMPUTER SCIENCE TOPIC:BIGDATA ANALYTICS IN AGRICULURE SUBMITTED TO, MANASA K N DOS IN COMPUTER SCINCE SUBMITTED BY, TEJASHREE K YAMUNA RESHAD SMITHA DOS IN COMPUTER SCIENCE 2
  • 3. 3
  • 4. Contents 1. INTRODUCTION 2. HOW BIGDATA APPLICABLE IN AGRICULTURE 3. ANALYSIS OF BIGDATA 4. TOOLS 5. CONCLUSION 4
  • 6. BIG DATA : Big data is an extensive collection of both structured and unstructured data that can be mined for information and analyzed to build predictive systems for better decision making. Big data is consider as a large collection of dataset which is having high volume, velocity, variety. Volume: The amount of data generated Velocity: How speed data is generated and processed Variety : Variation of data with respect to the time 6
  • 7. BIG DATA ANALYTICS IN AGRICULTURE Big data analysis is successfully adopted to industries like banking, insurance etc. Allthough agriculture did not adopted big data analysis for past few years, recently it is used to use it . Big data analytics in agriculture can be studied under two major areas: Smart farming And precision agriculture. Farmers use big data to get information on changing weather, rainfall, fertilizer usage, and other factors that impact the crop yield. All of the information assists farmers in making accurate and dependable decision that maximize their productivity and cultivating the land. 7
  • 8. The role of Big Data in the Agriculture industry Big data in the agriculture industry is completely based on using technology, information, and analytics to bring useful information to farmers. Big data can be utilized for grabbing information about the agriculture industry or it can prove beneficial for any specific segment or area to improve its efficiency. Data mining processes are utilized by Big Data to create such vital information. With this methodology, you can find the important patterns in a huge set of data and condense this information into useful forms. There are different modern systems, such as artificial intelligence, machine learning statistics, and more, that are used in the big data mechanism. 8
  • 9. How is big data analytics transforming agriculture?  Boosting productivity – Data collected from GPS- equipped tractors, soil sensors, and other external sources has helped in better management of seeds, pesticides, and fertilizers while increasing productivity to feed the ever-increasing global population.  Access to plant genome information – This has allowed the development of useful agronomic traits.  Predicting yields – Mathematical models and machine learning are used to collate and analyze data obtained from yield, chemicals, weather, and biomass index. The use of sensors for data collection reduces erroneous manual work and provides useful insights on yield prediction.  Risk management– Data-driven farming has mitigated crop failures arising due to changing weather patterns.  Food safety – Collection of data relating to temperature, humidity, and chemicals, lowers the risk of food spoilage by early detection of microbes and other contaminants. 9
  • 10. HOW BIGDATA CAN HELP AGRICULTURE ? 10
  • 11. HOW TO USE BIGDATA ANALYTICS IN AGRICULTURE  To counter pressures of increasing food demand and climate changes, policymakers and industry leaders are seeking assistants from technology forces such as Iot,bigdata analytics and cloud computing.  IoT, devices helping first phase of this process –data collection ,sensor plugged in tractors and trucks as well as in fields, soil and plants aid in the collection of real time data directly from the ground.  Second, analysts integrate the large amount of data collected with other information available in the cloud such as weather data pricing with models to determine the patterns.  Finally, these patterns and insights assist in controlling the problem. They helped to pinpoint existence issues, like operational inefficiencies and problems. 11
  • 12.  The adoption of analytics in agriculture has been increasing consistently ;its market size is expected to grow from USD 585million in 2018 to USD 1236 million by 2023 at a compound annual rate (CAGR)of 16.2%. 12
  • 13. USE OF BIG DATA IN AGRICULTURE SECTOR Top four use cases for big data on the form: 1.Feeding a growing population. 2.Using pesticides ethical. 3.Optimizing form equipment. 4.Managing supply chain. 13
  • 14. 1. Feeding a growing population: This is one of the key challenges that even governments are putting their heads together to solve. One way to achieve this is to increase the yield from existing farmlands. Big data provides formers granular data on rainfall patterns, water cycles, fertilizer requirements, and more. This enables them to make smart decisions, such as what crops to plant for better profitability and when to harvest. The right decisions ultimately improve farm yields. 14
  • 15. 2. Using pesticides ethnically  Administration of pesticides has been a contentious issue due to its side effects on the ecosystem. Big data allows farmers to manage this better by recommending what pesticides to apply, when ,and by how much. By monitoring it closely farmers can adhere to government regulations and avoid overuse of chemicals in food production. Moreover, this leads to increased profitability because crops don't get destroyed by weeds and insects 15
  • 16. 3. Optimizing farm equipment  Companies like John Deere have integrated sensors in their farming equipment and deployed big data applications that will help better manage their fleet. For large farms, this level of monitoring can be a life saver as it lets users know of tractor availability, service due dates, and fuel refill alerts . In essence this optimizes usage and ensure the long - term health of farm equipment. 16
  • 17. 4. Managing supply chain issues:  McKinney reports that a third of food produced for human consumption is lost or wasted every year. A devastating fact since the industry struggles to bridge the gap between supply and demand. To address this, food delivery cycle from producer to the market need to be reduced. Big data can help achieve supply chain efficiencies by tracking and optimizing delivery truck routes. 17
  • 18. Analysis of agriculture data using data mining techniques: application of big data In agriculture sector where farmers and agribusinesses have to make innumerable decisions every day and intricate complexities involves the various factors influencing them. An essential issue for agricultural planning intention is the accurate yield estimation for the numerous crops involved in the planning. Data mining techniques are necessary approach for accomplishing practical and effective solutions for this problem. Agriculture has been an obvious target for big data. Environmental conditions, variability in soil, input levels, combinations and commodity prices have made it all the more relevant for farmers to use information and get help to make critical farming decisions. This focuses on the analysis of the agriculture data and finding optimal parameters to maximize the crop production using data mining techniques like PAM, CLARA, DBSCAN and Multiple Linear Regression. Mining the large amount of existing crop, soil and climatic data, and analyzing new, non-experimental data optimizes the production and makes agriculture more resilient to climatic change. Big Data, PAM, CLARA 18
  • 19.  Input dataset consist of 6 year data with following parameters namely: year, State- Karnataka (28 districts), District, crop (cotton, groundnut, jowar, rice and wheat.), season (kharif, rabi, summer), area (in hectares), production (in tonnes), average temperature (°C), average rainfall (mm), soil, PH value, soil type, major fertilizers, nitrogen (kg/Ha), phosphorus (Kg/Ha),Potassium(Kg/Ha), minimum rainfall required, minimum temperature required.  In proposed work, modified approach of DBSCAN method is used to cluster the data based on districts which are having similar temperature, rain fall and soil type. PAM and CLARA are used to cluster the data based on the districts which are producing maximum crop production (In proposed work wheat crop is considered as example). Based on these analyses we are obtaining the optimal parameters to produce the maximum crop production. Multiple linear regression method is used to forecast the annual crop yield 19
  • 20. Partition around medoids (PAM)  It is a partitioning based algorithm. It breaks the input data into number of groups. It finds a set of objects called medoids that are centrally located. With the medoids, nearest data points can be calculated and made it as clusters. The algorithm has two phases: 20
  • 21. 21
  • 22.  1. BUILD phase, a collection of k objects are selected for an initial set S. • Arbitrarily choose k objects as the initial medoids. • Until no change, do. –– (Re) assign each object to the cluster with the nearest medoid. – Improve the quality of the k-medoids .  2. SWAP phase, one tries to improve the quality of the clustering by exchanging selected objects with unselected objects. Choose the minimum swapping cost. Example: For each medoid m1, for each non-medoid data point d; Swap m1 and d, recomputed the cost (sum of distances of points to their medoid), if total cost of the configuration increased in the previous step, undo the swap Fig. 2 depicts the steps involved the PAM algorithms. 22
  • 23. CLARA (clustering large applications)  CLARA (clustering large applications) It is designed by Kaufman and Rousseeuw to handle large datasets, CLARA (clustering large applications) relies on sampling . Instead of finding representative objects for the entire data set, CLARA draws a sample of the data set, applies PAM on the sample, and finds the medoids of the sample. To come up with better approximations, CLARA draws multiple samples and gives the best clustering as the output. Here, for accuracy, the quality of the clustering is measured based on the average dissimilarity of all objects in the entire data set. 23
  • 24. Multiple linear regression to forecast the crop yield Multiple linear regression is a variant of “linear regression” analysis. Tis model is built to establish the relationship that exists between one dependent variable and two or more independent variables .For a given dataset where x1… xk are independent variables and Y is a dependent variable, the multiple linear regression fts the dataset to the model: yi = β0 + β1x1i + β2x2i +···+ βkxki + ε 24
  • 25.  is the y-intercept and β1, β2, ... , βk parameters are called the partial coeffient.In matrix form Y = XB + E Y , Before applying the multiple linear regression to forecast the crop yield, it’s necessary to know the significant attributes from the database. All the attributes used in the database will not be significant or changing the value of these attributes will not affect anything on the dependent variables. Such attributes can be neglected. P value test is performed on the database to find the significant attributes and multiple linear regression is applied only on the significant values to forecast the crop yield 25
  • 27. depicts the different districts of Karnataka which are having similar temperature range, rain fall range and soil types respectively.. 27
  • 28. PAM  PAM To apply the PAM algorithm on the dataset, initially user need to give k (Number of clusters), where k is given as 3 in current experiment. Crop yield is categorised into LOW, MODERATE and HIGH production. Total districts are clustered into 3 clusters using PAM clustering method.  As a result of the analysis, North Karnataka districts such as Bijapur, Dharwad, Bagalkot, Belgaum, Raichur, Bellary, Chitradurga and Davangere are the districts which have maximum wheat crop production 28
  • 30. 30
  • 31. 31
  • 32. Study and analysis of temperature and wheat crop production in different districts of Karnataka as shown in Fig. 12. From the Fig. 12, we can analyze that the optimal temperature for Wheat crop 32
  • 33. 33
  • 34. 34
  • 35. 35
  • 36. Conclusion  Various data mining techniques are implemented on the input data to assess the best performance yielding method. The present work used data mining techniques PAM, CLARA and DBSCAN to obtain the optimal climate requirement of wheat like optimal range of best temperature, worst temperature and rain fall to achieve higher production of wheat crop. Clustering methods are compared using quality metrics. According to the analyses of clustering quality metrics, DBSCAN gives the better clustering quality than PAM and CLARA, CLARA gives the better clustering quality than the PAM. Te proposed work can also be extended to analyze the soil and other factors for the crop and to increase the crop production under the different climatic conditions 36
  • 37. Image Processing IM toolkit, VTK toolkit, OpenCv library Machine Learning R, Google tenserflow, Apache Mahout, Weka, Mlpack Cloud based platforms for large scale information storing EMC corporation, MapR converged data platforms, Apache Pig, Big Databases Hive, HadoopDB, Mango DB, Google big table, Cassandra, PostGIS Statistical tools Norsys Netica, R, Weka 37
  • 38. Image Processing Tool: VTK toolkit: The Visualization Toolkit (VTK) is open-source software for manipulating and displaying scientific data. It comes with state-of-the-art tools for 3D rendering, a suite of widgets for 3D interaction, and extensive 2D plotting capability. OpenCV: OpenCV is a huge open-source library the computer vision, machine learning, and image processing and now it plays a major role in real-time operation which is very important in today’s systems. 38
  • 39. Machine Learning tools: R: R analytics is data analytics using R programming language, an open-source language used for statistical computing or graphics. This programming language is often used in statistical analysis and data mining. It can be used for analytics to identify patterns and build practical models. Google TensorFlow: TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. 39
  • 40. Machine Learning tools: R: R analytics is data analytics using R programming language, an open-source language used for statistical computing or graphics. This programming language is often used in statistical analysis and data mining. It can be used for analytics to identify patterns and build practical models. Google Tensor Flow: Tensor Flow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. Apache Mahout: Apache Mahout is an open-source project to create scalable, machine-learning algorithms. Mahout operates in addition to Hadoop, which allows you to apply the concept of machine learning via a selection of Mahout algorithms to distributed computing via Hadoop. 40
  • 41. Cloud-based platforms for large-scale information storing: EMC Corporation: EMC is a multinational provider of products and services related to cloud computing, storage, big data, data analytics, information security, content management, and converged infrastructure. EMC was acquired by Dell in September 2016 and the company was renamed to Dell EMC. MapR converged data platforms: A platform for all the data and applications. With MapR, users have a single platform (on a single codebase!) that delivers data-wide convergence. It is the only platform that has a distributed file system that supports storage and analytics of data streams, files, and NoSQL tables in the same converged Apache Pig: Apache Pig is an abstraction over Map Reduce. It is a tool/platform which is used to analyze larger sets of data representing them as data flows. Pig is generally used with Hadoop; we can perform all the data manipulation operations in Hadoop using Apache Pig 41
  • 42.  Big Databases Hive: Hive allows users to read, write, and manage petabytes of data using SQL. Hive is built on top of Apache Hadoop, which is an open-source framework used to efficiently store and process large datasets. As a result, Hive is closely integrated with Hadoop, and is designed to work quickly on petabytes of data. Hadoop DB: Hadoop handles larger data sets but only writes data once. SQL is easier to use but more difficult to scale. Apache Hadoop is an open-source framework that is used to efficiently store and process large datasets ranging in size from gigabytes to petabytes of data. Instead of using one large computer to store and process the data, Hadoop allows clustering multiple computers to analyze massive datasets in parallel more quickly. Mango DB: MongoDB is an open-source document-oriented database that is designed to store a large scale of data and also allows you to work with that data very efficiently. It is categorized under the NoSQL (Not only SQL) database because the storage and retrieval of data in the MongoDB are not in the form of tables. Google big table: Big table is ideal for storing large amounts of single-keyed data with low latency. It supports high read and writes throughput at low latency, and it's an ideal data source for Map Reduce operations. 42
  • 43.  Statistical tools: Norsys Netica: Netica is a powerful, easy-to-use, complete program for working with belief networks and influence diagrams. It has an intuitive and smooth user interface for drawing the networks, and the relationships between variables may be entered as individual probabilities, in the form of equations, or learned from data files (which may be in ordinary tab-delimited form and have "missing data"). Weka: Weka is a collection of machine-learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre- processing, classification, regression, clustering, association rules, and visualization. 43
  • 44. References:  The objective of proposed work is to analyses the agriculture data using data mining In proposed work, agriculture data has been collected from following sources: Dataset in agricultural sector [https://data.gov.in/, http://raitamitra.kar.nic.in/ statistics].  Crop wise agriculture data [html://CROPWISE_NORMAL_AREA], Agriculture data of different districts [http://14.139.94.101/fertimeter/Distkar.aspx],http://raitamitra.kar.nic.in/EN G/statistics.asp],  Agriculture data based on weather, temperature, and relative humidity [http://dmc. kar.nic.in/trg.pdftechniques]. 44
  • 46. 46