Presenters: Daniel Jiménez (Leader of the Big Data expert group, DAPA) & Edgar Torres (Leader, Rice Program, AGROBIODIVERSITY)
Title: BIG DATA: BIG DATA ANALYSIS: is it a solution to understand Big Problems? The case of yield variation of rice in Colombia
------------------
Cukier and Mayer-Schönberger (2013) stated: “As the telescope enabled us to comprehend the universe and the microscope allowed us to understand germs, the new techniques for collecting and analyzing information will help us to make sense of our world in ways we are just starting to appreciate.” We subscribe to this view: in agriculture today we have the capacity to capture, analyze, store and share information in ways that 10 years ago were considered science fiction. The amount and variety of agricultural data generated by multiple individuals and organizations, using a huge range of techniques and technologies, is growing exponentially. We believe that the next agricultural (r)evolution will come from the development of innovation systems that harness agricultural data from multiple sources to generate new knowledge that will increase agricultural productivity, moving beyond blanket technological solutions towards a system of dynamic site-specific management that is sensitive and responsive to climate, soil and local socio-economic conditions.
In this seminar, CIAT's researchers will share how several databases, collected for different purposes and shared by FEDEARROZ (the country-wide association of rice growers in Colombia), have been used to obtain important insights that help FEDEARROZ manage rice more efficiently at the site-specific level.
http://marafris.ciat.cgiar.org:8080/Webinars/Bluejeans/2014-24-04%20-%20Daniel_Jimenez_Edgar_Torres.mp4
Mayer-Schönberger, V., Cukier, K., 2013. Big Data: A Revolution That Will Transform How We Live, Work and Think.
Big data ciat april_2014_dj_et_slideshare
1. BIG DATA: BIG DATA ANALYSIS: is it a solution
to understand big problems?
Rice program (Agrobioversity) & Big Data expert group (DAPA)
2. computational models are tailored to the analysis of the data rather than data to a
particular
methodology, as researchers have done for over a century
Applying the principles of Big Data to research in
agriculture
• Big Data refers to things that one can do at a large scale
that cannot be done at a smaller one to extract new
insights
• Sometimes to inform is better than explain – Looking for
patterns or associations
• Approaching “N=All”
• Adding value to secondary databases
Big Data (Foreign Affairs magazine / McKinsey's High Tech)… Cukier and Mayer-Schönberger (2013)
3. How?
• Including the use of ICTs to collect (Android apps),
analyze (traditional and machine learning techniques), and
share (in a way that facilitates decision making at
different levels and for different users)
• Analytical approaches tailored to the analysis of the
data rather than data to a particular methodology,
as researchers have done for over a century
• Development of tools as part of a close dialogue with
end-users
4. How?
Climate + Soil + Crop management (including varieties) = productivity/ha
% ? + % ? + % ? = To explain (100 %)
Maximizing productivity in agricultural systems. Working with
secondary databases
• To Identify the combination of factors that lead to high and low
productivities (empirical approaches – machine learning)
• Within the framework “Convenio MADR-CIAT” climate change project –
Adaptation strategy
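The question on this slide, what share of productivity variation each factor group explains, can be illustrated with a coefficient of determination. What follows is a minimal sketch, not the analysis used in the seminar: the data values and the function name `explained_share` are hypothetical, and a single-predictor least-squares fit stands in for the multivariate and machine learning models mentioned in the slides.

```python
from statistics import mean


def explained_share(x, y):
    """Return R^2 of a one-predictor least-squares fit of y on x.

    R^2 is the fraction of the variation in y that the fitted line accounts for.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1 - ss_res / syy


# Hypothetical values for illustration only (not from the FEDEARROZ databases).
rainfall_mm = [310, 420, 380, 510, 290, 450]
yield_kg_ha = [4200, 5100, 4600, 5600, 4300, 4900]

share = explained_share(rainfall_mm, yield_kg_ha)
print(f"Share of yield variation associated with rainfall: {share:.0%}")
```

With several predictors the same quantity would come from a multiple regression or from a model's held-out predictions; the single-predictor form is used here only to keep the arithmetic visible.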
6. And what are the causes for this yield reduction?
We can see similar problems in Central America, Ecuador, Peru and Venezuela.
Reductions in yield are causing heavy losses to rice farmers
No single factor is involved: drought, high minimum temperatures, low light, high
humidity, bacteria, mites, fungi, lack of adaptation, etc.
low yields are caused by Burkholderia glumae!
7. Misdiagnosis, wrong treatments and excessive
pesticide applications causing other
problems (Hoja Blanca)
Not eco-efficient
And to worsen the problem, the
farmers want a “magical cure”
8. How can we manage this problem?
• Reducing stress because of lack of water
• Water harvest
• Better agronomy
• Key points, crop rotation and regulations
• Improved cultivars
• Increasing yield potential
• Protecting yield
• Adding value
Is there something missing here?
9. AMTEC: Massive Adoption of Technology
OBJECTIVES
• To transfer jointly the technology available for crop management
• To increase productivity and reduce production costs, with the least environmental impact, in a context of social responsibility
• To aim for competitiveness and profitability of rice farmers in Colombia
TECHNOLOGY TRANSFER
• Field days
• Planning and good management practices
• Visits to research centers
• Demonstration trials
• Cost reduction
10. AMTEC vs Farmer
County           AMTEC                Farmer               AMTEC vs Farmer
                 Yield     Cost       Yield     Cost       Yield     Cost
                 (Ton/ha)  (US$/Ton)  (Ton/ha)  (US$/Ton)  (Ton/ha)  (US$/Ton)
El Juncal        6.50      417        5.30      614        1.20      -197
Ibagué           7.96      338        6.90      456        1.06      -118
Norte Tolima     7.48      366        6.29      485        1.19      -119
Montería         6.38      323        4.68      470        1.70      -147
Zulia            6.56      328        5.79      370        0.77      -42
Pompeya          5.70      309        4.30      503        1.40      -194
María La Baja    8.75      248        6.13      333        2.62      -85
Pompeya          4.30      475        3.36      600        0.94      -125
Ibagué           8.66      322        7.23      406        1.43      -84
Fundación        6.53      299        5.60      384        0.93      -85
Casanare         5.90      319        5.20      434        0.70      -115
Average          6.79      340.4      5.52      459.5      1.27      -119.1
AMTEC results from 2012 and 2013. Source: Fedearroz
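The "AMTEC vs Farmer" columns are simple differences, and the bottom row is a column average. A short sketch of that arithmetic follows, using the figures transcribed from the table; the variable names are illustrative, and small rounding differences from the published averages are possible.

```python
from statistics import mean

# (location, AMTEC yield t/ha, AMTEC cost US$/t, farmer yield t/ha, farmer cost US$/t)
# Values transcribed from the AMTEC table (2012 and 2013, source: Fedearroz).
rows = [
    ("El Juncal", 6.50, 417, 5.30, 614),
    ("Ibagué", 7.96, 338, 6.90, 456),
    ("Norte Tolima", 7.48, 366, 6.29, 485),
    ("Montería", 6.38, 323, 4.68, 470),
    ("Zulia", 6.56, 328, 5.79, 370),
    ("Pompeya", 5.70, 309, 4.30, 503),
    ("María La Baja", 8.75, 248, 6.13, 333),
    ("Pompeya", 4.30, 475, 3.36, 600),
    ("Ibagué", 8.66, 322, 7.23, 406),
    ("Fundación", 6.53, 299, 5.60, 384),
    ("Casanare", 5.90, 319, 5.20, 434),
]

# Per-location difference: AMTEC minus conventional farmer practice.
yield_gain = [a_yield - f_yield for _, a_yield, _, f_yield, _ in rows]
cost_change = [a_cost - f_cost for _, _, a_cost, _, f_cost in rows]

avg_yield_gain = mean(yield_gain)
avg_cost_change = mean(cost_change)

for (name, *_), gain, change in zip(rows, yield_gain, cost_change):
    print(f"{name:<14} {gain:+.2f} t/ha  {change:+d} US$/t")
print(f"{'Average':<14} {avg_yield_gain:+.2f} t/ha  {avg_cost_change:+.1f} US$/t")
```

Every location shows a positive yield difference and a negative cost difference, which is the pattern the slide summarizes as "Agronomy helps a lot!".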
Agronomy helps a lot!
2012
2013
11. Gene discovery
Emerging pathogen: Burkholderia glumae, producing grain sterility
• Sources of tolerance identified
• Tolerant genotype showing 60% less damage than susceptible genotypes
• Molecular markers are being developed to speed up the transfer of this trait into elite germplasm
Susceptible / Tolerant (field evaluation)
14.
National Survey
• Purpose: Keep the crop sector updated
• N= 738 cropping events
Harvesting records
• Purpose: Technical research (crop
management, soils, breeding, biotechnology, physiology)
• N= 3193 cropping events
“Data is no longer regarded as static, whose usefulness is finished once the purpose
for which it was collected is achieved”
Information on: planting and harvesting date, productivity, grain
humidity, variety, cropping system
Zones: Caribbean, Andean (Tolima), Plains (Llanos)
Databases:
Databases…. plenty of information
15. Adding value to secondary databases. The
case of information on cropping events of rice
in Colombia
Planting date experiments (field trials)
• Purpose: Technical research on the best sowing date
• N= 272 cropping events
Adding value to secondary databases…but first, merging databases:
Challenging task!!!
Climate
• About 27 weather stations
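Merging is challenging largely because each database was collected for a different purpose and names its fields differently. One possible first step is to map every source onto a shared schema and tag each record with its origin. The sketch below assumes hypothetical field names and records; the real FEDEARROZ schemas are not described in the slides.

```python
def harmonize(record, source, field_map):
    """Rename source-specific fields to the shared schema and tag the origin.

    field_map maps shared field name -> field name used in this source.
    """
    out = {common: record[original] for common, original in field_map.items()}
    out["source"] = source
    return out


# Hypothetical field names for two of the databases (assumptions, not real schemas).
SURVEY_MAP = {"variety": "variedad", "sowing_date": "fecha_siembra", "yield_kg_ha": "rendimiento"}
HARVEST_MAP = {"variety": "cultivar", "sowing_date": "planting", "yield_kg_ha": "kg_ha"}

# Hypothetical records for illustration only.
survey = [{"variedad": "F174", "fecha_siembra": "2012-04-02", "rendimiento": 4500}]
harvest = [
    {"cultivar": "LAGUNAS", "planting": "2012-05-10", "kg_ha": 5100},
    {"cultivar": "F2000", "planting": "2012-05-18", "kg_ha": 4900},
]

merged = [harmonize(r, "national_survey", SURVEY_MAP) for r in survey]
merged += [harmonize(r, "harvesting_records", HARVEST_MAP) for r in harvest]

for row in merged:
    print(row)
```

Keeping the `source` tag lets later analyses check whether a pattern holds across databases or comes from only one of them.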
16. Letting the data speak
“Before Big Data our analysis was usually limited to testing
a small number of hypotheses that we defined well before
we even collected the data. When we let the data speak we
can make connections that we had never thought existed”
Cukier and Mayer-Schönberger (2013)
17. Sowing to harvest: a cropping event in rice = 120 days
Climate series for all variables over crop time
Hypothesis: Yield variation is associated with climate
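To test this hypothesis, each cropping event needs climate variables summarized over its own 120-day cycle. A minimal sketch of that step follows. The 120-day length comes from the slide; the daily series, the variable names (`tmin_c`, `rain_mm`) and the function `summarize_cycle` are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean

CYCLE_DAYS = 120  # from the slide: a cropping event in rice = 120 days


def summarize_cycle(daily, sowing, cycle_days=CYCLE_DAYS):
    """Summarize daily climate records from sowing over one crop cycle.

    daily maps a date to a dict of climate variables; days without a
    record are skipped rather than imputed.
    """
    days = [sowing + timedelta(days=i) for i in range(cycle_days)]
    observed = [daily[d] for d in days if d in daily]
    if not observed:
        raise ValueError("no climate records inside the crop cycle")
    return {
        "n_days": len(observed),
        "mean_tmin_c": mean(r["tmin_c"] for r in observed),
        "total_rain_mm": sum(r["rain_mm"] for r in observed),
    }


# Synthetic daily series for illustration: constant minimum temperature,
# 1 mm of rain every second day.
start = date(2012, 1, 1)
daily = {
    start + timedelta(days=i): {"tmin_c": 22.0, "rain_mm": 1.0 if i % 2 == 0 else 0.0}
    for i in range(200)
}

summary = summarize_cycle(daily, sowing=date(2012, 1, 11))
print(summary)
```

The same function could be called on shorter windows to produce per-stage summaries, assuming the stage boundaries are known for each event.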
18. Letting the data speak
Multivariate analysis for Saldaña (research station, Andean zone): cropping
events (2007 to 2012)
• FEDEARROZ 733: 27 % of productivity variation explained
• Lagunas: 47 % of productivity variation explained
FEDEARROZ 733
N = 189
N = 63
Cimarrón Barinas
19. Letting the data speak
Climate and analysis based on phenological stages in Saldaña (research station ) Andean zone
2007 – 2012 (N= about 800 cropping events – irrigated rice)
• The crop sector can suggest to farmers the best planting date
• By assessing the same approach in other stations (environments): new insights
for future breeding
• Adaptation strategy for climate change
Climate accounts for 30% to 40% of
production variability in irrigated rice
20. Letting the data speak
Climate and analysis based on phenological stages in Zone: Colombian Plains- 2007 – 2012
(N= about 500 cropping events – Upland rice)
• Rainfall is a critical driving factor for upland rice during grain filling and panicle
initiation
• Machine learning (MLP)
Again! - climate accounts for 30% to 40% of
production variability in upland rice
21. Letting the data speak
Climate and analysis based on phenological stages in Zone Plains-Colombia 2007 – 2012
N= about 200 (cropping events – Upland rice.. variety F174)
• Temperature is a critical driving factor for variety F174 (upland rice) during grain filling
• Machine learning (MLP)
This time climate explained more than 40% of
production variability in upland rice, variety F174!
22. Case study: working with secondary databases: seasonal
forecast, El Niño/La Niña & Big Data. Rice in Colombia (Pompeya, Llanos)
What is likely to happen in March-April-May
2014?
We generated 24 clusters based on more than 500 cropping events
• Seasonal forecast + (data) best
technologies + Big Data analysis = better
adaptive responses to climate change (CC) and climate variability (CV)
Cluster 7
Rice variety    Productivity (kg/ha)    Cropping events
F174            4,564                   31
FORTALEZA       3,543                   17
F2000           4,977                   8
LAGUNAS         5,052                   6
MOCARI          4,604                   6
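Once cropping events are grouped into clusters, a table like the one for Cluster 7 can be read as a ranking of varieties under similar conditions. The sketch below uses the Cluster 7 figures from the slide. The function `best_variety` and its `min_events` threshold are illustrative assumptions, added because averages based on few events are less reliable.

```python
# (variety, mean productivity kg/ha, number of cropping events), from the Cluster 7 table.
cluster_7 = [
    ("F174", 4564, 31),
    ("FORTALEZA", 3543, 17),
    ("F2000", 4977, 8),
    ("LAGUNAS", 5052, 6),
    ("MOCARI", 4604, 6),
]


def best_variety(rows, min_events=1):
    """Return the variety with the highest mean productivity among those
    backed by at least min_events cropping events (hypothetical threshold)."""
    eligible = [r for r in rows if r[2] >= min_events]
    if not eligible:
        raise ValueError("no variety meets the minimum number of events")
    return max(eligible, key=lambda r: r[1])[0]


def weighted_mean_productivity(rows):
    """Cluster-level mean productivity, weighting each variety by its events."""
    total_events = sum(n for _, _, n in rows)
    return sum(kg * n for _, kg, n in rows) / total_events


top_any = best_variety(cluster_7)
top_well_sampled = best_variety(cluster_7, min_events=10)
cluster_mean = weighted_mean_productivity(cluster_7)

print(top_any, top_well_sampled, round(cluster_mean))
```

The two rankings can differ: the highest average in the table rests on only 6 events, so a threshold changes which variety comes out on top.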
23. What can we do with these results?
FLAR and CIAT Rice Breeders
• Better understanding of yield and its formation under
changing, complex, and extremely variable conditions.
• New breeding objectives like low light tolerance, pattern of biomass
accumulation etc.
• Better definition of environments
FEDEARROZ
• Reduce pesticide applications, since it has been demonstrated that
other factors lie behind the yield variation
• Establish planting dates and new crop systems based on crop rotation
• Establish a dynamic system for crop management based on short term
prediction to manage the risk associated with the changing conditions
CGIAR
• Expand this experience to other crops and areas
• Understand the importance of FARMERS ORGANIZATIONS to have impact
• Interesting concept for CCAFS, GRiSP, MAIZE and others
24. • The analytical approach used demonstrated that variation of rice
productivity can be associated with climate (30-45%)
• Internal cooperation between research areas within CIAT and external
cooperation with FEDEARROZ is a powerful combination. Multidisciplinary work is
also key!
• As long as the information is available, the approach can be applied to any other
region or crop
• CCAFS is keen to integrate – CN selected CSMS (CIAT-FLAR-IRRI)
• Start collaborations with the yield gap taskforce
• Encourage other partners in LAC to collect information and be part of
this idea (e.g. the strategy of FLAR), and add value to information that has
already been collected.
Concluding remarks and perspectives
25. Modern information technology, Big Data, Site-specific
Management/Agriculture, digital soil mapping, Terra
I, Bio-informatics are already here…
A new Ageekulture can be regarded as
complementary to CIAT’s traditional research in order
to fulfill the center's mission
Concluding remarks and perspectives