1. Citizen Science #101
'if I have seen further it is by
standing on the shoulders of
giants'.
Scott Edmunds & Mendel Wong
NGOs and Governance
MPA, HKU 18/10/17
4. • Formerly Beijing Genomics Institute
• Founded in 1999 (1% of HGP)
• China’s 1st citizen-managed not-for-profit research
institute funded by commercial sequencing-as-a-service
(BGI Tech)
• Now largest genomic organization in the world
• HQ in Shenzhen, international data production in BGI HK
(Tai Po)
About my employer:
5. Open Data Hong Kong
ExCom member
for Open Science
Open Science
Working Group
10. Why closed data sucks?
https://commons.wikimedia.org/wiki/File:Inner_door_in_forbidden_city.jpg
11. Hong Kong Edition
https://data.gov.hk
Gov't spend on open data platform =
$1.2M
Gov't spend on 20 rubbish apps =
$20M
https://www.hongkongfp.com/2015/09/14/public-finance-concern-
group-raps-10-rubbish-govt-apps-one-has-only-10-downloads/
Why closed data sucks?
12. What the Gov't builds for $20M vs. what open data can build for free
http://gazetteer.hk/
Hong Kong Edition
Why closed data sucks?
13. Open data as a revenue stream means conservation data can't be shared...
Why closed data kills spoonbills?
14. Climate change, global hunger, pollution, cancer,
disease outbreaks…
http://www.nature.com/news/data-sharing-make-outbreak-research-open-access-1.16966
Why closed data kills people?
16. Genomics Open Data = Mandate
Bermuda Accords 1996/1997/1998:
1. Automatic release of sequence assemblies within 24 hours.
2. Immediate publication of finished annotated sequences.
3. Aim to make the entire sequence freely available in the public domain for
both research and development in order to maximise benefits to society.
Fort Lauderdale Agreement, 2003:
1. Sequence traces from whole genome shotgun projects are to be
deposited in a trace archive within one week of production.
2. Whole genome assemblies are to be deposited in a public nucleotide
sequence database as soon as possible after the assembled sequence
has met a set of quality evaluation criteria.
Toronto International data release workshop, 2009:
The goal was to reaffirm and refine, where needed, the policies related to
the early release of genomic data, and to extend, if possible, similar data
release policies to other types of large biological datasets – whether from
proteomics, biobanking or metabolite research.
17. The skills needed for the 21st century
“by 2025, it is possible that as many as 25% of the population in developed nations
and half of that in less-developed nations will have their genomes sequenced”
Big Data: Astronomical or Genomical?
http://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.1002195
22. The Solution: Open Access
“By “open access” to [peer-reviewed research literature], we mean its
free availability on the public internet, permitting any users to read,
download, copy, distribute, print, search, or link to the full texts of
these articles, crawl them for indexing, pass them as data to software,
or use them for any other lawful purpose, without financial, legal, or
technical barriers other than those inseparable from gaining access to
the internet itself. The only constraint on reproduction and
distribution, and the only role for copyright in this domain, should be
to give authors control over the integrity of their work and the right to
be properly acknowledged and cited.”
Budapest Open Access Initiative:
• Maximizes reuse and access
• Gives authors control over the integrity of their work and the right
to be properly acknowledged and cited.
• “Real” OA asks for no restrictions/limitations = CC-BY
23. The Solution: OER
Open Education Resources: democratising education
https://www.oercommons.org/
25. Data platforms: easy to build
https://ckan.org/
Open source, from OKI. Used by Governments (inc. HK),
Universities (Bristol), even hospital registries.
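As a rough illustration of why a CKAN-backed platform is easy to build on: CKAN exposes an Action API, and its `package_search` action returns matching datasets as JSON. The sketch below only builds the request URL and summarises a reply; the portal address `https://ckan.example.org` and the sample reply are placeholders made up for this example, not a real portal or real data.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=5):
    """Build a CKAN Action API URL for the package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def summarise_datasets(response_text):
    """Return (title, sorted resource formats) for each dataset in a reply."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    summary = []
    for dataset in payload["result"]["results"]:
        formats = sorted(
            {res["format"] for res in dataset.get("resources", []) if res.get("format")}
        )
        summary.append((dataset.get("title") or dataset.get("name", ""), formats))
    return summary


# Hypothetical portal address (assumption, not a real site).
PORTAL = "https://ckan.example.org"

# Hand-written sample in the shape of a package_search reply (not real data).
SAMPLE_REPLY = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{
            "name": "air-quality-readings",
            "title": "Air quality readings",
            "resources": [
                {"url": "https://ckan.example.org/a.csv", "format": "CSV"},
                {"url": "https://ckan.example.org/a.json", "format": "JSON"},
            ],
        }],
    },
})

if __name__ == "__main__":
    print(build_search_url(PORTAL, "air quality", rows=2))
    print(summarise_datasets(SAMPLE_REPLY))
```

Against a live portal, the URL from `build_search_url` would be fetched with any HTTP client and the response body passed to `summarise_datasets`.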
32. The “People’s Parrot”
Puerto Rican Parrot Genome Project (Amazona vittata)
Rarest parrot, national bird of Puerto Rico
Community funded from artworks, fashion shows, beer brands, crowdfunding…
Genome annotated by students in community college as part of bioinformatics education
Paper and Data published in GigaScience and GigaDB
Oleksyk TK, et al. (2012): A Locally Funded Puerto Rican Parrot (Amazona vittata) Genome Sequencing Project Increases Avian Data and Advances Young
Researcher Education. GigaScience 2012, 1:14
O’Brien SJ (2012): Genome empowerment for the Puerto Rican parrot – Amazona vittata. GigaScience 2012, 1:13
Oleksyk TK, et al. (2012): Genomic data of the Puerto Rican Parrot (Amazona vittata) from a locally funded project. GigaScience.
http://dx.doi.org/10.5524/100039
38. To maximize its utility to the research community and aid those fighting
the current epidemic, genomic data is released here into the public domain
under a CC0 license. Until the publication of research papers on the
assembly and whole-genome analysis of this isolate we would ask you to
cite this dataset as:
Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang,
Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun,
Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y;
Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482
isolate genome sequencing consortium (2011)
Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen.
doi:10.5524/100001
http://dx.doi.org/10.5524/100001
Our crowdsourcing example:
To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to
Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
41. “The way that the genetic data of the 2011 E. coli strain were disseminated
globally suggests a more effective approach for tackling public health
problems. Both groups put their sequencing data on the Internet, so scientists
the world over could immediately begin their own analysis of the bug's
makeup. BGI scientists also are using Twitter to communicate their latest
findings.”
“German scientists and their colleagues at the Beijing Genomics Institute in China have
been working on uncovering secrets of the outbreak. BGI scientists revised their draft
genetic sequence of the E. coli strain and have been sharing their data with dozens of
scientists around the world as a way to "crowdsource" this data. By publishing their data
publicly and freely, these other scientists can have a look at the genetic structure, and try
to sort it out for themselves.”
43. Downstream consequences:
“Last summer, biologist Andrew Kasarskis was eager to help decipher the genetic origin of the Escherichia coli
strain that infected roughly 4,000 people in Germany between May and July. But he knew that it might take days
for the lawyers at his company — Pacific Biosciences — to parse the agreements governing how his team could
use data collected on the strain. Luckily, one team had released its data under a Creative Commons licence that
allowed free use of the data, allowing Kasarskis and his colleagues to join the international research effort and
publish their work without wasting time on legal wrangling.”
1. Citations 2. Therapeutics (primers, antimicrobials) 3. Platform Comparisons
4. Example for faster & more open science
44. 1.3 The power of intelligently open data
The benefits of intelligently open data were powerfully
illustrated by events following an outbreak of a severe gastro-
intestinal infection in Hamburg in Germany in May 2011. This
spread through several European countries and the US,
affecting about 4000 people and resulting in over 50 deaths. All
tested positive for an unusual and little-known Shiga-toxin–
producing E. coli bacterium. The strain was initially analysed by
scientists at BGI-Shenzhen in China, working together with
those in Hamburg, and three days later a draft genome was
released under an open data licence. This generated interest
from bioinformaticians on four continents. 24 hours after the
release of the genome it had been assembled. Within a week
two dozen reports had been filed on an open-source site
dedicated to the analysis of the strain. These analyses
provided crucial information about the strain’s virulence and
resistance genes – how it spreads and which antibiotics are
effective against it. They produced results in time to help
contain the outbreak. By July 2011, scientists published papers
based on this work. By opening up their early sequencing
results to international collaboration, researchers in Hamburg
produced results that were quickly tested by a wide range of
experts, used to produce new knowledge and ultimately to
control a public health emergency.
48. The Bauhinia Mystery?
HK Botanical & Afforestation Dept., 1903:
"The mysterious origin of the tree & its magnificent flowers at once arrest the interest.
So far, all efforts to identify them with any foreign species have failed"
56. Formation
2016 Feb
• Started as part of an International Zika Hackathon
• Collaborate around information dissemination and solutions for
early detection and increased data capture of mosquito vector
presence
• Involve community-at-large
• Established initial goals for HK:
• Leverage mobile app for crowdsourcing mosquito and breeding
ground locations via gamification
• Expand survey area coverage beyond that provided by
government
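One way to make "coverage beyond that provided by government" concrete is to flag crowdsourced reports that lie farther than some radius from every known survey trap. The sketch below uses the standard haversine great-circle distance; the coordinates, the 1 km radius, and the function names are illustrative assumptions, not project data or the app's actual method.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def reports_outside_coverage(reports, traps, radius_km):
    """Return the reports that are more than radius_km from every trap."""
    return [
        report
        for report in reports
        if all(haversine_km(report, trap) > radius_km for trap in traps)
    ]


# Made-up coordinates for illustration only (not real trap or report sites).
TRAPS = [(22.280, 114.160)]
REPORTS = [(22.281, 114.161), (22.450, 114.170)]

if __name__ == "__main__":
    # Reports listed here would extend the surveyed area.
    print(reports_outside_coverage(REPORTS, TRAPS, radius_km=1.0))
```

Only the second report is more than 1 km from the single trap, so it is the one that would count as new coverage in this toy example.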
57. Zika: a “data gap” issue.
https://www.washingtonpost.com/world/the_americas/brazil-considers-reforming-biosecurity-law-amid-
criticism/2016/02/05/ba2108ba-cc80-11e5-b9ab-26591104bb19_story.html
60. Genomics approaches?
Get more data with portable genome labs? Can we get nanopores?
http://www.nature.com/nature/journal/v530/n7589/full/nature16996.html
66. Discovery
2016 Mar
•Reached out to the AtrapaelTigre (CREAF/CEAB-CSIC) team regarding the usage
of their mobile app (now Mosquito Alert) for HK
•Showcased the initiative on well-established (28 years) public television show
The Pearl Report on TVB Pearl channel during an episode highlighting the
concerns over Zika outbreak
2016 Apr
•Localised first iteration of Mosquito Alert Android app for Cantonese (Traditional
Chinese) as proof of concept
•Found resonance with local health experts on how/where our efforts can be
leveraged
•Encountered roadblock in getting government to disclose Ovitrap
locations, needed to assess effectiveness of established survey areas
68. Citizens to the rescue: Mosquito Alert
http://www.mosquitoalert.com/en/
69. Interviewed by The Pearl Report as part of Zika exposé
https://www.youtube.com/watch?v=-tJuWP_z2Bo&feature=youtu.be
70. Partnership
2016 Sept
• Deployed v2 of Mosquito Alert app
• Assisted CREAF in establishing appropriate open data policies to make
captured data available to community at large
• Connected with educators at The Chinese Foundation Secondary School
(CFSS) and medical health experts at The University of Hong Kong (HKU)
2016 Oct – 2017 Mar
• Piloted Mosquito Alert app as part of CFSS science educational programme
to raise awareness of public health issues such as Zika disease
• Evaluated process of sharing of Mosquito Alert with global audience
74. Global Consortium
2017 Apr
• Invited by UN Environment to join a workshop to address the worldwide concern of mosquito-borne
diseases co-hosted by European Citizen Science Association (ECSA) and The Wilson Center
• Showcased what Hong Kong has managed to do as a group of concerned citizens volunteering their
own time to work towards a common cause
• Established a mission statement:
• The Global Mosquito Alert Consortium is a new citizen science initiative that aims to leverage
networks of scientists and volunteers for the global surveillance and control of mosquito species
known to carry the following diseases: Zika, yellow fever, chikungunya, dengue, malaria, and the
West Nile Virus.
• The Global Mosquito Alert will be an open, common set of protocols and toolkit that is augmented
with modular components created to meet both global and local research and management needs.
79. Regional Community
2017 May
• Formed CitizenScience.Asia as a community to
facilitate dialogues amongst citizen science
practitioners and projects across Asia and with
the rest of the world.
2017 August
• Held first citizen science faire in Hong Kong to
81. Next Steps
Promotion
• Incentivize community to use the app
• Improve photo capture for better identification
Analytics
• Reduce time to reflect capture locations while considering accuracy
• Increase accessibility to data for different parties to better model
regional patterns
Application
• Motivate conversations with public and governmental organisations
to leverage citizen scientists’ effort
82. How can you help?
Hunt Tiger Mosquitoes
not Pokémon
Report breeding sites
Remove stagnant water
Download the app from
Google Play & App Store
Find us entomologists
Mandarin translation
https://play.google.com/store/apps/details?id=ceab.movelab.tigatrapp
https://itunes.apple.com/app/id890635644