QFAB at a glance


Foundation Partners

Collaborators
IMB – Institute for Molecular Bioscience
EMBL Australia Bioinformatics Resource
Genomics Virtual Laboratory – Nectar
Australian e-Health Research Centre
ARC Centre of Excellence in Bioinformatics
NCI-Specialised Facility in Bioinformatics
Intersect Australia Ltd
QCIF – Queensland Cyber Infrastructure Foundation
Wound Management Innovation CRC
The University of Auckland
Université de Toulouse
PROOF Centre of Excellence, Canada

Contents
2 About QFAB
3 QFAB Solutions
4 QFAB Governance
6 Testimonials
8 QFAB Team
10 QFAB Services
- Next Generation Sequencing
- Genotyping Data
- Microarray Data
- Integrated Systems Biology
- Mass Spectrometry Data
- Cheminformatics
- Biostatistics
- Software and Web Development
- High Performance Computing
- Consulting
- Training
17 Tools & Platforms
- mixOmics
- Systems Biology Platform
- Chemi-Biology Platform
- Genomics Virtual Laboratory
- NCI-SF
- EMBL Australia
18 Case Studies
- Featured project: The embedded bioinformatician - QFAB research collaboration
- Southern Cross University - Plant Science Bioinformatics Support
- Glycoselect statistical pipeline
- ArachnoServer, the world’s first manually curatable resource for protein spider toxins
- EBI – Making Australian Data Discoverable
28 Publications


About QFAB

What we do
However you define bioinformatics, as broadly or narrowly as you wish, it fundamentally enables the modern-day researcher to handle and analyse large and complex data – the big data challenge - an essential part of understanding biology.
Established in 2007, QFAB focuses on delivering robust, high quality and relevant bioinformatics services for life science researchers to analyse and manage large-scale datasets.
We offer our clients a range of professional services – from contract research, project design and analysis, and scientific software development, to designing, deploying and refining the high performance IT infrastructure that is required to support life science informatics. Our support ranges from experimental design, data capture and mining through to NGS, proteomic and metabolomic analyses. We are also expert in cross-domain integration with clinical data.
QFAB’s training division provides integrated workshops through to customised solutions in all facets of bioinformatics.
We have a strong track record of delivering results which are grounded in the team’s commitment to continuous learning and innovation, ensuring that our clients can trust in the knowledge and integrity of our support.

Who we do it for
Life scientists are our clients. With a focus on client success, our skilled and expert team understands the biological question. We can help you define the appropriate bioinformatics and biostatistics approach and apply the correct analyses to deliver on your project.
By unlocking the full value of your data, QFAB has become a leading provider of bioinformatics services to industry, clinical, applied and academic life science researchers, including the biotechnology, pharmaceutical, clinical and research communities.
We operate nationally and internationally with a commitment to continuous engagement throughout the project.

What we stand for
Our highly experienced and skilled team is the foundation of our success. As a team, we are committed to operating ethically, with integrity and respect at all times. We believe that open communication, flexibility and honesty are the keys to successful customer engagement.
Early engagement means we can take the time to understand your goals and help derive the most efficient and cost-effective bioinformatics solutions, while the international experience brought to projects by our team members ensures the application of the relevant tools to the right data to answer your particular questions.
We are the research data specialists: responsive, professional, secure and quality focused.

QFAB Solutions
Mr Jeremy Barker, Chief Executive Officer
QFAB provides scalable and open solutions which enable scientists to extract the greatest value from their research data, across a variety of applications, technologies, and industries.
QFAB can work on specific projects, provide time against an ongoing need or become embedded in your organisation. Our solution will draw on the entire team’s experience and our broad suite of open source and commercial software and computational systems. Flexible, rapid and researcher oriented, QFAB has built a reputation for supporting the production of high quality outcomes which deliver high impact publications or patents faster.

Technologies
We support evolving technologies used by life science researchers including:
- Pathology/clinical data
- Next generation sequencing: Roche 454 GS-FLX, Illumina HiSeq, LifeTechnologies SOLiD, Ion Torrent data
- Genotyping: ABI-LifeTechnologies, Illumina, Affymetrix, Agilent
- Microarrays
- Mass spectrometry: SELDI, MALDI, iTRAQ

Applications
We design applications to support:
- Molecular biology research
- Biomarker discovery
- Crop science and livestock
- Biodiversity
- Molecular diagnosis
- Drug discovery
- Translational research
- Epidemiology
- Pathway analysis
- Clinical trials

Industries
We work with researchers across a number of industries:
- Academic research
- Healthcare
- Pharma R&D
- Clinical research
- Biotechnologies
- Food industry
- Environment
- Biodiversity
- Energy
- Agricultural R&D


QFAB Specialist Advisory Group
Dr Rhys Francis, Executive Director, Australian eResearch Infrastructure Council
Dr Gregory Harper, Director, External Engagement, CSIRO
Dr Tim Littlejohn, New Employee Learning Lead, diversity, life sciences, IBM
Warren Parker, CEO, Scion, New Zealand Forest Research Institute, New Zealand
Dr Nadia Rosenthal, Director, Australian Regenerative Medicine Institute

QFAB Governance

Mr Peter Turnbull
Chairman
Peter is a company director and corporate lawyer. Peter’s business interests span many East Asian markets, Europe and the United States. He is currently the Chairman or a Director of various private and unlisted public companies and is a former Director of the Securities and Futures Commission of Hong Kong. Peter has particular expertise in the commercialisation of new technologies, strategic planning, and complex legal, commercial and corporate governance transactions and alternative dispute resolution strategies. He is a recent past-President and Chairman of Chartered Secretaries Australia, and is a Fellow of the Australian Institute of Company Directors.

Professor Mark Ragan
IMB
Mark is the founding head of the Division of Genomics and Computational Biology at the Institute for Molecular Bioscience of The University of Queensland, and Director of the Australian Research Council Centre of Excellence in Bioinformatics. Mark is a Fellow of the Linnean Society of London, and serves on six editorial boards and seven management and advisory boards in molecular bioscience, informatics and e-Research. He has more than 160 peer-reviewed publications and research interests that include comparative, computational and evolutionary genomics, high-throughput bioinformatics, high-performance computing, and technologies for management and integration of large datasets.

Dr David Hansen
CSIRO
David is CEO of the Australian e-Health Research Centre. David leads the research programs of the AeHRC, a joint venture between CSIRO and the Queensland Government. Prior to joining CSIRO, David led development of the Sequence Retrieval System (SRS) at LION Bioscience Ltd in Cambridge, UK. SRS is the leading genomic data and tool integration software, and is used by pharmaceutical and biotechnology companies, such as GlaxoSmithKline, Celera and Affymetrix, as well as academic institutes, including The European Bioinformatics Institute. David undertook a Bachelor of Science at the University of Queensland in Chemistry and Computer Science and a PhD in Theoretical Chemistry at the Australian National University.

Dr Wayne Jorgensen
Queensland Government
Wayne has over 25 years of research experience in parasitology. Wayne has published over 70 research papers in refereed, international journals and over 70 conference papers at national and international conferences. He has written chapters on tick-borne diseases for the Office International des Epizooties, International Livestock Research Institute, Food and Agriculture Organization of the United Nations, Parasitology Journal, and Merck. He is co-editor for Babesiosis in the International Consortium for Tick and Tick-borne Diseases Journal and developed two international patents on poultry coccidiosis vaccines. In 2008 Wayne received the Australia Day award for contribution to Queensland’s Primary Industries.

Professor Ken Beagley
IHBI
Ken is Deputy Director, Institute of Health and Biomedical Innovation (IHBI). He is a Professor of Immunology at QUT and the leader of the Infectious Disease Research Program at IHBI, which forms part of IHBI’s Cells and Tissue Domain. Ken was appointed to the role of IHBI Deputy Director in 2010. He has worked in the area of mucosal immunology for the past 25 years at the University of Alabama at Birmingham (UAB) and the University of Newcastle prior to moving to QUT. Ken’s current research interests focus on immunity to sexually transmitted infections, in particular Chlamydia trachomatis. His research aims to define and differentiate the immune parameters of immune-mediated inflammatory pathology caused by Chlamydia infection from the immune mechanisms that can protect against chlamydial infection, with a view to developing effective chlamydial vaccines.

Associate Professor James Hogan
QUT
Jim is an Associate Professor in the Electrical Engineering and Computer Science Faculty at QUT. His research interests include machine learning and its application to cognitive science, bioinformatics problems and software engineering. Jim’s work is concerned with learning in visual domains with associated text, in the tradition of the Berkeley L0 project, where he contributed a connectionist synthesis of bottom-up and top-down attentional systems as a model of spatial relations acquisition and processing. Jim also leads development of software internationalisation work within Australia.

Mr Jeremy Barker
QFAB
Jeremy is the founding CEO of QFAB (Queensland Facility for Advanced Bioinformatics), which he joined full time in June 2007. Jeremy is responsible for the overall management of QFAB. He holds postgraduate qualifications in science, commerce and governance and has over 20 years of experience in the management of life science organisations, including a number of Board positions for biotechnology companies. In 2007, Jeremy was awarded a Churchill Fellowship to undertake an international study on ‘Management Best Practice in the Delivery of Bioinformatics to Researchers’. He is a member of the Australian Institute of Company Directors.

Dr Andrew Lewis
Griffith University
Andrew is a Senior Research Specialist in eResearch Services at Griffith University. Andrew’s research expertise includes parallel optimisation algorithms for large numerical simulations, including bio-inspired algorithms (particle swarm, ant colony, spatial social networks, evolutionary programming and extremal optimisation), direct search algorithms (simplex and pattern search methods) and gradient descent (quasi-Newton with BFGS update). He specialises in multi-objective optimisation techniques for engineering design, including the application of a variety of algorithms to complex problems, methods of interactive optimisation, automated preference selection in Pareto-optimal sets, extension to many-objective and dynamic environments, and the development of automated, problem-solving frameworks. He also has expertise in parallel, distributed and grid computing methods, including cluster-based, ad hoc grid, distributed, peer-to-peer systems and cloud computing; and advanced visualisation techniques for scientific data analysis.


Testimonials
QFAB’s expertise in experimental design and analysis can be applied at any scale, from a simple microarray comparison of differential gene expression, to a full-blown systems biology analysis integrating genomic, proteomic, metabolomic and clinical data.

Professor Glenn King
Institute of Molecular Bioscience, The University of Queensland
“QFAB has collaborated with us over a number of years to build a world-first spider toxin database that allows us to combine data from publicly available databases including mRNA and protein sequences, 3D structures, and functional information for hundreds of venom proteins. In our most recent project, QFAB worked with us to add a number of new features to the database, including a toxin mass calculator, taxonomic target search, and mechanisms to control the privacy of records. QFAB’s innovative approach and responsiveness has been essential to the success of this project. I am happy to endorse QFAB as a provider of highly effective research data management solutions.”

Dr Roslyn Brandon
Co-Founder, President and CEO of Immunexpress (IXP)
“Since an initial collaborative project in 2010, IXP has significantly increased its work with QFAB as a contract supplier of bioinformatics services. QFAB undertakes IXP’s genomics multivariate data analytical work – a critical foundation for the development of our novel clinical diagnostic products. IXP is leading the world in applying its patented immune system blood biomarkers to improving the management of sepsis patients and those at risk of sepsis. Sepsis is a life-threatening immune response to infection and is a leading cause of death in Intensive Care Units. QFAB works very effectively with our team to produce results time effectively and cost efficiently. The QFAB project team is a mainstay for our business as our product development processes rely heavily on world-class informatics. We have been very pleased with the quality and responsiveness of QFAB, and I am delighted to provide this testimonial for their client services.”

The embedded bioinformatician
Professor Zee Upton
Wound Management Innovation Cooperative Research Centre
“QFAB has assigned a succession of its research students and staff to work with our biologists and protein chemists, helping to design experiments and to interpret the data. As we were doing our research, QFAB was building up a pool of expertise in wound-healing bioinformatics.”
The Co-Program Leader of the Enabling Technologies Program in the Wound CRC, Professor Zee Upton, now Assistant Dean of Science at Queensland University of Technology, says it was clear from the outset that the CRC was going to need high-order capabilities in bioinformatics to make sense of the huge volumes of DNA, SNP and proteomics data flowing from its research.
In previous projects the bioinformaticians were not physically co-located with the biological scientist generating the data.
“That didn’t work as well as we hoped,” Professor Upton said.
“We found we needed a bioinformatician with specific expertise in handling wound-healing data, so QFAB placed one of its researchers in the CRC for two days a week.”

QFAB Team

Senior Management Team
Mr Jeremy Barker, Chief Executive Officer
Dr Dominique Gorse, General Manager
Ms Mathilde Desselle, Business Development & Marketing Manager
Dr Mark Crowe, Training & Outreach Manager
Dr Stephen Rudd, Head of Computational Biology

“QFAB is a vibrant and dynamic workplace, reflecting the variety of projects that attracted me to QFAB initially. I enjoy the challenge of working for an organisation that is at the cutting edge of bioinformatics and the satisfaction gained from liberating more knowledge or productivity for the customer by applying advanced bioinformatic solutions.”
Dr Leo McHugh

Computational Biology and Data Specialists
Dr Kim-Anh Lê Cao, Biostatistician
Ms Roxane Legaie, Bioinformatician
Dr Jeremy Parsons, Bioinformatician
Ms Sarah Williams, Bioinformatician
Dr Xin-Yi Chua, Senior Bioinformatician
Ms Amanda Miotto, eResearch Support Specialist
Dr Leo McHugh, Senior Computational Biologist
Mr Pierre Chaumeil, Bioinformatician
Ms Anne Kunert, Bioinformatician
Mr Nick Rhodes, Database and Systems Administrator


QFAB Services
QFAB is the research data specialist. We support the bioinformatics requirements of research-intensive universities, institutions and companies and provide secure access to very large databases, dedicated software, high-performance computing, terabyte storage, data integration technologies and advanced bioinformatics services.
We pride ourselves on developing tailored, relevant bioinformatics solutions to meet research needs to:
> Provide scalable infrastructure to meet future growth
> Enable collaboration between scientists within disparate research areas
> Enable reasoning across data sources
> Develop workflows for easy and efficient data processing, analysis and visualisation.

Our services include:

Next Generation Sequencing
Unlock the full value of your next-generation sequencing (NGS) data sets from Illumina HiSeq, Roche 454 GS-FLX and LifeTechnologies SOLiD and Ion Torrent platforms.
QFAB provides tailored bioinformatics services to biologists across the spectrum of computational techniques and services applicable to molecular biology and next generation sequencing. QFAB researchers design and implement custom bioinformatics approaches that are developed in consultation with researchers for specific questions in molecular biology. We can also integrate your genomics data with other omics and clinical datasets.
Our NGS data analysis services include:

De novo or reference genome assembly and annotation
> Assembly and mapping
> Structural and functional annotation of the genome (ORF detection, assignment of genes,…)
> Sample comparison

Genomic capture
> Mutation and variant detection
> Identification and positioning of known and unknown mutations
> Transcription: influence of the mutation on the protein functionality

ChipSeq
> Mapping and gene identification
> Comparison of samples regulated by the same transcription factor

Exome
> Listing of identified mutations between the study samples and the reference: SNPs, InDels
> Identification of known mutations by comparing the results to international databases
> Influence of the mutation on the protein functionality

Metagenomics
> Creation of reference database and alignment for organism identification
> Adjustments of interpretation settings
> Taxonomic trees
> Sample comparison

RNAseq and microRNA RNAseq
> Changes in expression
> Detection of novel splice isoforms or transcripts
> De novo transcriptome assembly
> Mutation detection

Annotation pipeline
> Prediction of coding regions
> Annotation of coding genes, non-coding RNA, tRNA, rRNA, miRNA…
> Repeat identification
> Statistics of annotation and analysis of GC content


Integrated Systems Biology
QFAB supports systems biology research across
the spectrum of interdisciplinary techniques.
QFAB researchers work with biologists to define the nature
of the biological systems under investigation in specific
research projects. We develop or customise approaches for
understanding these systems, or inferring the properties of
such systems from high-throughput data sets:
> Integration and simultaneous analysis of multiple omics
or clinical datasets
> Pathway and network analysis
> Enrichment analysis
> Gene regulation studies
> Visualisation (correlations, networks, genome browsers).

Genotyping Data
We can provide data filtering and genotype calling, as well as integration of your genotyping datasets into data management, visualisation, pathway and network-based analysis or development of a molecular diagnostic chip/IVD device.
QFAB researchers design and implement custom bioinformatics approaches that are developed in consultation with researchers for specific questions in molecular biology:
> SNP genotyping data
> Microsatellites / minisatellites genotyping data
> Genotyping by Sequencing data
We support data from most of the existing genotyping and sequencing platforms.

Microarrays Data
QFAB offers a wide range of statistical and bioinformatic analyses of microarray data, ranging from design and quality control to sample comparison.
> Design
> Quality control
> Normalisation
> Quantification: gene expression, exon level, copy number
> Differential sample comparison, and comparison with public datasets
> Combination and integration with other data.

Mass Spectrometry Data
QFAB provides mass spectrometry data analysis, from design to protein identification, annotation of secondary modifications, and determination of the absolute or relative abundance of individual proteins.
> Design
> Peak detection and protein identification
> Annotation
> Quantitative analysis
> Interpretation
We support data from most of the existing mass spectrometry platforms.

Cheminformatics
QFAB supports drug discovery projects with SAR analysis and SAR development using cheminformatics approaches.
> Prioritisation of compounds and assays (Efficacy, Toxicity, ADME)
> Provide databases and tools for compound registration and inventory
> HTS data handling, mining and reporting
> Design of compound libraries (diverse, focused or targeted)
> Combinatorial library design and library enumeration
> Undesirable molecule elimination
> Hit-to-lead optimisation
> ADME and toxicity profiling.


Biostatistics
QFAB provides expert biostatistical services to support the design of biological experiments, biomarker identification, collection and analysis of data from experiments, and the interpretation of that data.
QFAB provides tailored data mining services, creating customised solutions that help researchers get more from their existing datasets. By consulting with QFAB early, researchers can also obtain detailed advice from QFAB research staff on the best methods for data capture, annotation and storage to maximise the benefits likely to result from the application of data mining techniques.
Our biostatistical services include:
> Combination and integration of multiple dataset types
> Univariate and multivariate analysis
> Power calculation
> Classification
We undertake collaborations in clinical research informatics and research IT services with a growing number of clinical and translational research investigators.

Software and Web Development
Research computing resources offered by QFAB enable the visualisation and analysis of large biological data sets and help address complex biological problems.
QFAB develops custom software tools, data management systems and web interfaces for life sciences researchers. We support large collaborative research projects where the member groups are based in different locations. We develop online tools for project participants to securely store their own data, access other members’ data and allow real-time communication between collaborative group members.
Our ICT services include:
> Database development
> Web interface to biological databases
> Data analysis web applications.

High Performance Computing
Mr Nick Rhodes
Database and Systems Administrator
QFAB’s HPC services include:
> Hardware access / infrastructure
> Cluster computing
> Cloud computing
> High memory servers.

Consulting
Dr Dominique Gorse
General Manager
Discuss with us the design of your research project, and evaluate the bioinformatics analysis and support to be incorporated in your grant proposal. QFAB answers your consulting need on a project basis, on a regular basis (number of days per week, month or year), or through an embedded staff model.
QFAB can help scientists with access to the bioinformatics tools and the computational capability needed to meet the challenges of visualising disparate datasets which may use different or no ontologies and, at the social and regulatory levels, the need to maintain patient confidentiality through compliance with clinical records privacy laws. Building on our strengths in understanding and translating the needs of the researcher into the language of the bioinformatician and IT professional, QFAB can work closely with your research team to address both the data management and bioinformatic aspects of your translational research project.

Training
Dr Mark Crowe
Training & Outreach Manager
QFAB understands the biological question and can help you unlock the answers from your data through our specialised training programs. We provide a range of flexible modules, courses and workshops designed to suit your needs – ranging from introductory project-focused courses to advanced bioinformatics. Our goal is to improve knowledge diffusion between experimental biologists and bioinformaticians, that is, increase the interaction between developers and users.
Contact QFAB for details of how QFAB can meet your training needs.

For more information on QFAB services, contact us:
www.qfab.org


Tools & Platforms
QFAB has a core team of software engineers and biologists who work together to help describe, formulate and build applications to analyse and manage your research data.
Our unique tools and platforms include:

MixOmics
The integration of multiple ‘omics’ analyses enables a better understanding of a biological system as a whole. QFAB’s mixOmics product is an R package dedicated to the exploration and the integration of highly dimensional data sets. MixOmics provides a strong focus on graphical representation, to better understand the relationships between the different types of data and to visualise the correlation structure between different measured entities.

Systems Biology Platform
Analysis of complex molecular networks which control biological processes requires an integrated, highly capable bioinformatics platform. QFAB has developed a scalable computational environment to analyse, model and infer biomolecular networks. Our platform is unique in Australia, with high-performance hardware and an integrated suite of commercial software linked to curated datasets.

Chemi-Biology Platform
QFAB’s chemi-biology computational platform brings together complementary expertise in infectious disease research and advanced computational methods to accelerate the drug discovery process. The platform aims at increasing the likelihood of discovering successful lead compounds for anti-infective medicine. Unique in Australia, the platform consists of high-performance hardware and an integrated suite of open source and commercial software linked to curated chemical and biological datasets.

Industry Collaboration
QFAB provides technical collaboration for:

EMBL Australia Bioinformatics Resource
An online data resource replicating important components of the data and data services provided by the European Bioinformatics Institute (EBI) to provide Australian scientists with faster access to EBI datasets and expanded collections of Australian data.
http://www.ebi.edu.au/

Genomics Virtual Laboratory (GVL)
The GVL provides Australian researchers with a community of accessible infrastructure, expertise and advocacy that connects genome researchers with massive datasets, sophisticated analysis tools, and large-scale computational infrastructure.
https://www.nectar.org.au/genomics-virtual-laboratory

NCI-SF in Bioinformatics
The National Computational Infrastructure, Specialised Facility (NCI-SF) portal provides information on the hardware, software and services available.
https://ncisf.org/


Case Studies
QFAB collaborates with life scientists to help solve important challenges relating to the
analysis, management and visualisation of their research data. Read about some of our
projects and collaborations.
Index
p. 18 Featured project: The embedded bioinformatician - QFAB research collaboration
p. 21 Southern Cross University - Plant Science Bioinformatics Support
p. 23 Glycoselect statistical pipeline
p. 24 ArachnoServer, the online data repository for spider toxin research
p. 26 EBI – Making Australian Data Discoverable

Featured project:
Wound Management Innovation CRC
Dr Kim-Anh Lê Cao, Biostatistician
Mr Jeremy Barker, CEO, QFAB
Professor Zee Upton, QUT

The embedded bioinformatician - QFAB research collaboration
By Graeme O’Neill
A wound is a complex ecosystem, seething with a diverse population of the patient’s cells, and a complex community of microbes contending to colonise the wound. There is constant communication between them via a molecular intranet.
The Wound Management Innovation Cooperative Research Centre in Brisbane was established in 2010 to explore these processes, and to apply its findings to develop novel treatments to accelerate healing, reduce scarring, and prevent infection.
Every patient is genetically unique, and the microbes vying to colonise their wound will vary according to how and where the wound occurred – it might have been in an operating theatre, a car accident, a farm accident, or the consequence of necrosis associated with chronic diabetes or a lifetime smoking habit. Over time, the wound’s microbial flora may vary through succession processes.
Fibroblasts and other specialised tissue-repair cells go about their business as the innate and adaptive immune systems’ armies of natural killer cells, macrophages, B cells and T-cells swarm in to mount a coordinated defence of the breach against microbial invaders.
The high-speed DNA sequencers and various microarray technologies required to study the dynamics of these processes over time generate enormous amounts of data.
The Co-Program Leader of the Enabling Technologies Program in the Wound CRC, Professor Zee Upton, now Assistant Dean of Science at Queensland University of Technology, says it was clear from the outset that the CRC was going to need high-order capabilities in bioinformatics to make sense of the huge volumes of DNA, SNP and proteomics data flowing from its research.
In previous projects the bioinformaticians were not physically co-located with the biological scientist generating the data. “That didn’t work as well as we hoped,” Professor Upton said. “We found we needed a bioinformatician with specific expertise in handling wound-healing data, so QFAB placed one of its researchers in the CRC for two days a week.
“Since then, QFAB has assigned a succession of its research students and staff to work with our biologists and protein chemists, helping to design experiments and to interpret the data. As we were doing our research, QFAB was building up a pool of expertise in wound-healing bioinformatics.”
And that’s the model QFAB is offering to potential customers for its expertise and facilities. QFAB Chief Executive Officer, Jeremy Barker, says the success of biological, medical and pharmaceutical research will depend increasingly on the expertise of bioinformaticians who not only understand the research, but can help biologists to design experiments and analyse the data to answer the questions posed.
As the cost of purchasing and operating high-speed DNA sequencers has fallen, more research groups have been successful in obtaining ARC infrastructure grants to buy these machines. “Some set out to use the new equipment without really understanding how it can be used,” he said.
“They may have bought the machine after reading a paper in the scientific literature that used the same device. It might describe the methods used to produce the data, and the conclusions, but reveals very little about the assumptions underlying the methodology, which tend to be project-specific.
“So they bring us this huge data set, saying they want to do this type of analysis, hoping to get a certain result, without fully understanding what they’ve got from the experiment, or how to extract what they want from the dataset.
“That’s where we can help. First, we can explain what their data will allow them to do, or how it might allow them to do more than they expected.
“Secondly, we can help by showing them how to do what they really wanted to do, and help them design experiments to produce the type of data they need to answer particular questions. Early engagement saves the costs involved in repeating experiments.”


Barker says QFAB’s expertise in experimental design and bioinformatics analysis can be applied at any scale, from a simple microarray comparison of differential gene expression, to a full-blown systems biology analysis integrating genomic, proteomic, metabolomic and clinical data.

“We engage with a prospective client, discuss what they want, then submit it to our team for a think-tank session on how it might be done with the expertise and resources that we have in-house, and at the client’s end.

“We then go back to the client with a proposal detailing where and how we believe we can help.”

Barker says QFAB has served a variety of clients over the past seven years including CSIRO, Australia’s universities, medical research institutes, state primary industry, fisheries and forestry departments, and private biotech and pharmaceutical companies.

Barker says bioinformatics is moving “at a mile a minute” so keeping abreast of the changes requires a commitment to ongoing learning as part of the job and an investment in the professional development of QFAB’s team.

Barker says he has seen an example of a research institute’s embedded bioinformatician being rapidly overwhelmed by data. “They have no reserve time to learn on the job. That is the advantage of having a larger team to draw on.

“Around 60 per cent of QFAB’s recruits are PhDs. Everyone else is at least postgraduate trained, and we have a core of experienced bioinformaticians in senior positions to bring the necessary rigour to project planning.

“You need the depth of experience provided by staff who have worked on other projects before QFAB recruited them.”

“A bioinformatician joining the team may have had experience working on a particular class of cancers, or on viral infections of mangoes, so they’re good with that type of data and have that specialist knowledge as well as the more general bioinformatics skills.

“Collectively we can call upon a broad range of expertise within our teams, and ask them what they think about a particular question.

“We apply the team approach to every project, but we like to broaden the discussion on the projects we have on the boil at any point in time, because someone from another team may have an insight into a particularly tricky question.”

Barker says clients can be certain that QFAB will have the best available expertise to do the job: “We stay up to date with the technologies out there, and the specialised methodologies that go with them, so we will have a pretty good idea which technology is most appropriate for each client’s project.

“The technology changes rapidly, and all bioinformaticians think they can produce a better algorithm than the one described in the latest journal.

“We learn and adapt as need dictates. You don’t always have to reinvent the wheel; often, there is already an algorithm that can do the job. The trick is knowing that the algorithm exists, and how to apply it.”

As for data-crunching power, Barker says the facility has just updated its computer cluster. It has seven nodes built around a 64-core processor, each with 256 gigabits of RAM, and terabyte storage capacities.

“That’s enough for most jobs,” he said. “But if it isn’t, we have access to a supercomputer facility at the University of Queensland that runs a terabyte of RAM, which can handle whole genome comparisons.”

“With these facilities, our broad expertise, and our data-integration abilities, we believe we offer a service beyond the in-house bioinformatics capabilities of any single institution or company in Australia and the Asia Pacific region.”

Dr Jeremy Parsons
Bioinformatician

Ms Roxane Legaie
Bioinformatician

Southern Cross University - Plant Science Bioinformatics Support

Brief description of the project
The client is an Australian university research team with broad genomic experience and a special interest in genetics of Australian plants. QFAB is providing a full genomic assembly and analysis service on multiple organisms with direction and focus determined by the client.

Background
Recent dramatic cost reductions in shotgun genomic sequencing now enable biologists to cheaply and quickly sample the DNA sequence of an interesting species, or selected individuals of special interest to their research interests. This “de novo” approach to whole-genome assembly from short shotgun reads is the most economical and fastest path to partial genome reconstruction for novel organisms but it is problematic when attempted without traditional guidance from physical clones or genetic maps. The “de novo” shotgun assembly method leads to fragmented and isolated plant nuclear chromosomal contigs which can be difficult to analyse. QFAB and the client assembled multiple genomes using different DNA assembly programs and then compared all assemblies against each other and to related species to identify the best product for subsequent genomic analyses.

What were the outcomes?
QFAB fully reconstructed two organelle genomes and partially assembled multiple plant nuclear genomes. Organelle genomes have been shared privately and nuclear genomes for two Australian native species have been assembled and analysed for genes, repeats, microsatellite SSRs, and inter-species similarities. We have also prepared genome size estimates.
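Comparing assemblies produced by different programs, as in the Southern Cross University project, needs some summary statistic to rank them. The sketch below is illustrative only: it uses N50, a widely used contiguity metric, and the brochure does not say which metrics QFAB actually applied. The assembler names and contig lengths are invented for the example.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("assembly has no contigs")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def rank_assemblies(assemblies):
    """Rank assemblies (name -> list of contig lengths) by N50, highest first."""
    return sorted(assemblies, key=lambda name: n50(assemblies[name]), reverse=True)


# Hypothetical contig lengths (base pairs) from two hypothetical assemblers.
example = {
    "assembler_a": [100, 200, 300, 400, 500],
    "assembler_b": [50, 50, 700, 700],
}
ranking = rank_assemblies(example)
```

A higher N50 only indicates longer contigs, not a more correct assembly, which is one reason a comparison against related species (as described in the case study) is also useful.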


[Figure: Boxplot samples. x-axis: samples; y-axis: total intensity; legend: Outcome (cancer, normal). One of the outlier detection steps: boxplots of the biological samples coloured according to the outcome.]

Dr Kim-Anh Lê Cao
Biostatistician

Glycoselect statistical pipeline
Brief description of the project
Implementation of a statistical pipeline in the Glycoselect database for the identification of glycoprotein biomarker signatures.

Background
A new high-throughput glycoproteomics technology is being developed by the client to uncover potential glycosylation changes in a complex mix of proteins present in biological fluids such as serum. The input data consist of protein identifications as determined by tandem mass spectrometry, together with their binding affinities to a panel of lectins, which indicate the glycan structure. A statistical pipeline needed to be deployed in the existing Glycoselect database to identify biomarker signatures.

One of the challenges was to propose an appropriate methodology to deal with data which include many zero values; many classical statistical approaches (t-test, non-parametric test) do not apply in this case.

What were the outcomes?
The statistical methodology developed by Lê Cao et al., 2011* was proposed for this project and produced very satisfying results. The pipeline was developed using this methodology and implemented in the R statistical programming language.

An outlier detection step was also proposed to process the data beforehand and remove potential outliers. The resulting process clearly identified the outliers in the data, enabling the researchers to remove these prior to selection of the biomarker panel.

The R script was handed over to the Glycoselect developer to be implemented into the Glycoselect analytical pipeline. The outlier detection methodology, including visualisation of the analyses, was also provided to the client.

*Lê Cao K.-A., Boitard, S. and Besse, P. (2011). Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics, 12:253
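To make the idea of a boxplot-based outlier step concrete, here is a minimal sketch. It is not the Glycoselect R script: it assumes each sample has been reduced to one total intensity value and flags samples beyond 1.5 times the interquartile range, the conventional boxplot whisker rule. The sample names, values and the function `iqr_outliers` are hypothetical.

```python
import statistics


def iqr_outliers(totals, k=1.5):
    """Return the names of samples whose value lies more than k * IQR
    below the first quartile or above the third quartile."""
    values = list(totals.values())
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return sorted(name for name, v in totals.items() if v < low or v > high)


# Hypothetical per-sample total intensities; "s6" is deliberately extreme.
totals = {"s1": 1.0, "s2": 1.2, "s3": 0.9, "s4": 1.1, "s5": 1.0, "s6": 4.0}
flagged = iqr_outliers(totals)
```

In practice a flagged sample would be inspected before removal, since an extreme value can reflect real biology rather than a technical problem.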


[Figure: Cumulative Toxins Deposited by Year. x-axis: Year; y-axis: Number of Toxins.]

[Figure: Toxin Distribution by Family. Theraphosidae (201), Lycosidae (175), Sicariidae (167), Ctenidae (99), Hexathelidae (91), Other (86), Agelenidae (73), Zodariidae (28).]

ArachnoServer, the world’s first manually curated resource
for protein spider toxins
Brief description of the project
QFAB has worked in collaboration with Professor Glenn King to create ArachnoServer, the online data repository for spider toxin research.

Background
High impact research requires the wide and clear distribution of results. This can be hampered by bioinformatics web applications, which often suffer from a lack of maintenance, leading to a decline in data currency and subsequent accuracy. These databases also require that their structure, data, and utility be appropriately modified over time with new discoveries and changes in information priority.

The aim of this project was to develop a robust, extensible and maintainable software architecture that would make ArachnoServer the world’s gold standard online data repository for spider toxin research for years to come.

What were the outcomes?
A web application was developed to enable two key functions:
1) Allowing secure manual curation of the toxin records by the expert team led by Professor Glenn King using available literature and patent information.
2) Providing easy and powerful advanced search, browse and view capabilities. ArachnoServer enables neuroscientists, pharmacologists, and toxinologists to explore high quality toxin information and rapidly answer their research questions. Each toxin record is displayed in a single page and, where available, a toxin’s structure can be dynamically displayed.

The solution chosen for ArachnoServer was to develop a Java Spring Model View Controller (MVC) application that uses a Hibernate Object Relational Mapping (ORM) layer to a MySQL database. Using this architecture, the application and data model can be easily extended or modified, as changes to the data model do not require SQL changes.

Two powerful bioinformatics tools have been made available through the web interface. First, similarity searches can be made using BLAST. Secondly, signal peptide and propeptide regions in spider-toxin precursors can be predicted using SpiderP, a new tool developed for ArachnoServer.

ArachnoServer has become an international resource that is cross-referenced by the European Bioinformatics Institute’s UniProt knowledge base (UniProtKB).

Volker Herzig, David L. A. Wood, Felicity Newell, Pierre-Alain Chaumeil, Quentin Kaas, Greta J. Binford, Graham M. Nicholson, Dominique Gorse, Glenn F. King (2011) ArachnoServer 2.0, an updated online resource for spider toxin sequences and structures. Nucleic Acids Research 39, D653-D657

David LA Wood, Tomas Miljenović, Shuzhi Cai, Robert J Raven, Quentin Kaas, Pierre Escoubas, Volker Herzig, David Wilson and Glenn F King (2009) BMC Genomics 10:375

http://www.arachnoserver.org/


Dr Dominique Gorse
General Manager

EBI – Making Australian Data Discoverable
Brief description of the project
Populating Research Data Australia with collection descriptions of data held in the European Bioinformatics Institute databanks.

Background
The European Bioinformatics Institute (EBI, part of the European Molecular Biology Laboratory, EMBL) provides international access to data in molecular bioscience generated by researchers worldwide, including Australia. In its present state, Australian specific data is difficult to isolate within the EBI databases, particularly for the non-domain user. The establishment of the EMBL Australia Bioinformatics Resource at the University of Queensland has provided the opportunity for linking data of Australian interest deposited at the EBI to Research Data Australia (RDA), a cohesive repository of research data collections enabling Australian researchers to easily publish, discover, access and use research data.

The aim of this project was to develop a set of software to allow nucleotide and protein sequence data of Australian interest to be discoverable through RDA in the form of collections. The project was funded by the Australian National Data Service through the DIISRTE Education Infrastructure Fund.

What were the outcomes?
In this project, more than 13,000 collection records describing Australian-related content of the EBI nucleotide and protein sequence databases were created. A large effort was made to divide and describe the content of large databases into many smaller datasets that are of potential interest to a wide and varied range of researchers. The collections encompass two types of Australian data: a) data submitted from Australian-based researchers; b) data associated with sets (and subsets thereof) of Australian species.

The link between RDA and the EBI is provided through the use of landing pages that are simple to use and contain structured information useful to non-domain specialists who are unfamiliar with the content of the EBI databases (http://rda.ebi.edu.au). Molecular data of Australian interest held at the EBI are now more easily found, accessible and re-usable through RDA (http://researchdata.ands.org.au).

The technical solutions developed for this project were:
> Identification of Australian research institutions: A list of relevant Australian research institutions conducting biological research was compiled. This list includes institutions identified through ARC and NHMRC grant information and having a National Library of Australia (NLA) Party Persistent Identifier. Research institutions were then grouped by states and territories.
> Identification of Australian species: A list of Australian species was sourced from the Atlas of Living Australia through the IBIS taxonomy web services (http://www.ala.org.au/tools-services/species-name-services/). These species were assigned to approximately 800 higher level taxonomic ranking groups (eg genus, class, order) using the NCBI taxonomy (http://www.ncbi.nlm.nih.gov/taxonomy). The higher order groupings were selected in consultation with ANDS whilst the NCBI taxonomy was used for species assignment as this taxonomy is used in the EBI databases.
> Extraction of data from EBI databases: Australian species or research institutions were used as query items to interrogate the EBI databases: UniProt (http://www.ebi.ac.uk/uniprot/) for protein sequences and ENA (http://www.ebi.ac.uk/ena/) for nucleotide sequences. A Java library was implemented which used EBI hosted web services (http://www.ebi.ac.uk/Tools/webservices/) to query these databases. This library then inserted the extracted data into a MySQL database. Other EBI databases were not interrogated as they either did not contain data that could be definitively identified as Australian, or were not able to be queried using the web services.
> Automatic generation of collections and submission to RDA: Data stored in the MySQL database was converted into ANDS compliant RIF-CS xml (using an ANDS supplied RIF-CS Java library) and made accessible to a RDA harvest data source. More than 13,000 collections were generated.
> Landing page for collections: The landing page is a webpage that is accessible from RDA and acts as a link between RDA and the primary data housed at the EBI. The webpage lists basic metadata for the collection (eg a short description, synonyms for the collection) as well as displaying a list of records (eg records of DNA or protein sequences) relevant to that collection. It also allows for navigation back to the primary source at EBI and navigation to related collections. The webpage was developed using Java servlets, JSP and JavaScript. The web interface is deployed on an Apache Tomcat web server on an ESX server with RedHat enterprise Linux 5.4.

The software is freely available for download from SourceForge under the GNU General Public Licence.
http://sourceforge.net/projects/ebi-rda-linkage/?source=directory

QFAB staff contact:
Dominique Gorse, Project Manager
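The step of assigning species to higher level taxonomic groups and turning each group into a collection can be sketched roughly as follows. This is an illustration only, not the project's Java library: the lineage table, accession strings and the function `build_collections` are hypothetical stand-ins for data that the project obtained from the NCBI taxonomy and the EBI web services.

```python
from collections import defaultdict

# Hypothetical lineage table: species -> {rank: group name}.
LINEAGES = {
    "Macropus giganteus": {"genus": "Macropus", "class": "Mammalia"},
    "Macropus rufus": {"genus": "Macropus", "class": "Mammalia"},
    "Eucalyptus grandis": {"genus": "Eucalyptus", "class": "Magnoliopsida"},
}


def build_collections(records, lineages, ranks=("genus", "class")):
    """Group sequence records into collections keyed by (rank, group name).

    records: iterable of (accession, species) pairs.
    Species missing from the lineage table are skipped, so only records
    that can be placed in a known group end up in a collection.
    """
    collections = defaultdict(list)
    for accession, species in records:
        lineage = lineages.get(species)
        if lineage is None:
            continue
        for rank in ranks:
            if rank in lineage:
                collections[(rank, lineage[rank])].append(accession)
    return dict(collections)


# Hypothetical accessions, for illustration only.
records = [
    ("ACC0001", "Macropus giganteus"),
    ("ACC0002", "Macropus rufus"),
    ("ACC0003", "Eucalyptus grandis"),
    ("ACC0004", "Unknown species"),
]
collections = build_collections(records, LINEAGES)
```

Because one record is filed under every rank in its lineage, the same sequence can appear in several collections (here a genus and a class), which matches the description of collections built from sets and subsets of Australian species.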


Publications

We have supported our clients’ successful grant applications worth over $56 million.
The intellectual input of the QFAB Team provided clear bioinformatics strategies in their
experimental design and valuable outcomes in the analyses.

Recent publications in which QFAB has collaborated include:

Muscat GE, Eriksson NA, Byth K, Loi S, Graham D, Jindal S, Davis MJ, Clyne C, Funder JW, Simpson ER, Ragan MA, Kuczek E, Fuller PJ, Tilley WD, Leedman PJ, Clarke CL. (2013). Research Resource: Nuclear Receptors as Transcriptome: Discriminant and Prognostic Value in Breast Cancer. Mol Endocrinol 27(2), 350-365.

Donald M. Gardiner, Megan C. McDonald, Lorenzo Covarelli, Peter S. Solomon, Anca G. Rusu, Mhairi Marshall, Kemal Kazan, Sukumar Chakraborty, Bruce A. McDonald, John M. Manners (2012). Comparative Pathogenomics Reveals Horizontally Acquired Novel Virulence Genes in Fungi Infecting Cereal Hosts. PLoS Pathog Sep, 8(9).

Schroder, K., Irvine, K.M., Taylor, M.S., Bokil, N.J., Lê Cao, K.-A., Masterman, K-A., Labzin, L.I., Semple, C.A., Kapetanovic, R.A., Fairbairn, L., Akalin, A., Faulkner, G.J., Baillie, J.K., Gongora, M., Daub, C.O., Kawaji, H., McLachlan, G.J., Goldman, N., Grimmond, S.M., Carninci, P., Suzuki, H., Hayashizaki, Y., Lenhard, B., Hume, D.A., Sweet, M.J (2012). Conservation and Divergence in Toll-like Receptor 4-regulated gene expression in primary human versus mouse macrophages. Proc Natl Acad Sci 109(16), E944-E953.

Yao, F., Coquery, J., & Lê Cao, K.-A. (2012). Independent Principal Component Analysis for biologically meaningful dimension reduction of large biological data sets. BMC bioinformatics, 13(1), 24.

Bauer, D. C., Willadsen, K., Buske, F. A., Lê Cao, K. A., Bailey, T. L., Dellaire, G., & Boden, M. (2011). Sorting the nuclear proteome. Bioinformatics, 27(13), i7-14.

Herzig, V., Wood, D. L., Newell, F., Chaumeil, P. A., Kaas, Q., Binford, G. J., Nicholson, G. M, Gorse, D., King, G. F. (2011). ArachnoServer 2.0, an updated online resource for spider toxin sequences and structures. Nucleic acids research, 39(Database issue), D653-D657.

Lê Cao, K.-A., & LeGall, C. (2011). Integration and variable selection of ‘omics’ data sets with PLS: a survey. Journal de la Société Francaise de Statistique, 152(2).

Lê Cao, K. A., Boitard, S., & Besse, P. (2011). Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC bioinformatics, 12, 253.

Lingwood, B. E., Henry, A. M., d’Emden, M. C., Fullerton, A. M., Mortimer, R. H., Colditz, P. B., Lê Cao, K. A., Callaway, L. K. (2011). Determinants of body fat in infants of women with gestational diabetes mellitus differ with fetal sex. Diabetes Care, 34(12), 2581-2585.

Choi, J., Davis, M. J., Newman, A. F., & Ragan, M. A. (2010). A semantic web ontology for small molecules and their biological targets. J Chem Inf Model, 50(5), 732-741.

Davis, M. J., Sehgal, M. S., & Ragan, M. A. (2010). Automatic, context-specific generation of Gene Ontology slims. BMC bioinformatics, 11, 498.

Degrelle, S. A., Lê Cao, K. A., Heyman, Y., Everts, R. E., Campion, E., Richard, C., Ducroix-Crépy, C., Tian, X. C., Lewin, H. A., Renard, J. P., Robert-Granié, C., Hue, I. (2010). A small set of extra-embryonic genes defines a new landmark for bovine embryo staging. Reproduction 141(1), 79-89.

Lê Cao, K. A., Meugnier, E., & McLachlan, G. J. (2010). Integrative mixture of experts to combine clinical factors and gene markers. Bioinformatics, 26(9), 1192-1198.

Mercer, T. R., Wilhelm, D., Dinger, M.E., Soldà, G., Korbie, D. J., Glazov, E. A., Truong, V., Schwenke, M., Simons, C., Matthaei, K.I., Saint, R., Koopman, P., Mattick, J. S. (2010). Expression of distinct RNAs from 3’ untranslated regions. Nucleic Acids Res, 39(6), 2393-2403.

Taft, R. J, Simons, C., Nahkuri, S., Oey, H., Korbie, D. J., Mercer, T. R., Holst, J., Ritchie, W., Wong, J. J., Rasko, J. E., Rokhsar, D. S., Degnan, B. M., Mattick, J. S. (2010). Nuclear-localized tiny RNAs are associated with transcription initiation and splice sites in metazoans. Nature Structural & Molecular Biology, 17(8), 1030-U1146.

Shin, C. J., Davis, M. J., & Ragan, M. A. (2009). Towards the mammalian interactome: Inference of a core mammalian interaction set in mouse. Proteomics, 9(23), 5256-5266.

Shin, C. J., Wong, S., Davis, M. J., & Ragan, M. A. (2009). Protein-protein interaction as a predictor of subcellular location. BMC Syst Biol, 3, 28.


www.qfab.org

QFAB Bioinformatics
Level 6, Queensland Bioscience Precinct
The University of Queensland
306 Carmody Road
St Lucia QLD 4067
Australia

T +61 (0)7 3346 2604
F +61 (0)7 3346 2101
E contact@qfab.org
