Managing Experiment Data Using
Excel and Friends: Digging Out
from Under the Avalanche
Yannick Pouliot, PhD
Bioresearch Informationist
Lane Medical Library & Knowledge Management Center
6/1/2006
© 2006 The Board of Trustees of The Leland Stanford Junior University

Lane Medical Library & Knowledge Management Center
http://lane.stanford.edu
Course Expectations
• Objectives
  • Demonstrate
    • … good practices
    • … useful features
    • … the value of querying via Excel
• Windows vs. Mac
• Structure
  • Examples, use cases
  • Exercises
  • Resources
• Class evaluation questionnaire: http://www.surveymk.com/s.asp?u=915602161402

Contents (in order of increasing complexity)
• Excel good practices
• Excel handy functions
• Querying Web sites & databases using Excel

So Why Are We Here?
• Lots of data
  → Need for better management of these data
  • Need exceeds Excel
  • Excel never really meant for data management anyway
• Applying common tools to ameliorate the problem
  • “In IT, there’s no problem that enough money can’t solve” → not the philosophy here…
  • Instead: invest yourself and you’ll get a handsome return

Essential Tip
Clippy: not as dorky as you might think

How To Help Clippy Give You Better Answers
• Read a (good) Excel manual cover to cover
  • Don’t try to understand everything
  • Just flip pages and let it imprint on your brain
• Not fun, but it will give you the requisite vocabulary
  • Increases your odds of getting the right answer
  • Gives you an idea of what Excel can do

Part I: Essential Excel Functions

Essential Excel Functions
1. Conditional Formatting
2. Named ranges & Input validation
3. Custom Toolbar
4. PivotTable
5. Web Querying
6. MS Query

Excel Functions 1: Conditional Formatting
• Definition: Formatting (e.g., cell shading or font color) that Excel applies automatically to cells when a specified condition is true
  • Example: applying green cell color to a cell if a test result exceeds a threshold value
  • In: Format/Conditional Formatting
  • See Spreadsheet1.xls/ConditionalExample1 - try
• Reference

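
The rule in the example above (green fill when a test result exceeds a threshold) can be sketched outside Excel as a small function. This is only an illustration of the logic, not how Excel implements it; the threshold value, sample names, and the `cell_color` helper are assumptions made up for the sketch.

```python
# Hypothetical stand-in for an Excel conditional-formatting rule:
# fill a cell green when its value exceeds a threshold.
THRESHOLD = 50.0  # assumed cutoff, not taken from the slides


def cell_color(value, threshold=THRESHOLD):
    """Return the fill a 'cell value > threshold' rule would apply, or None."""
    return "green" if value > threshold else None


# Invented test results, keyed by sample name.
results = {"sample_A": 12.5, "sample_B": 73.0, "sample_C": 50.0}
colors = {name: cell_color(value) for name, value in results.items()}

for name, color in colors.items():
    print(name, results[name], color or "no fill")
```

Note that a value exactly equal to the threshold is left unformatted here, because the rule is "exceeds", not "is at least".
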
Excel Functions 2: Named Ranges and Validation
• Named ranges are ranges of cells that are…named!
• Named ranges can be used for validating input data
  • Important for ensuring data consistency
    • Essential for queryability
  • Also useful to avoid repetitive typing by using drop-down menu
  • See: Spreadsheet1.xls/InputValidation - try
• How to: here
• Other references

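
To see why validated input matters for queryability, here is a minimal sketch of the same idea in Python: entries are accepted only if they match a fixed list, which plays the role of a named range. The list contents, the `validate` helper, and the choice to normalize case and whitespace are all assumptions for illustration.

```python
# Hypothetical equivalent of a named range called "Tissues" used for validation.
ALLOWED_TISSUES = ("liver", "kidney", "brain")


def validate(entry, allowed=ALLOWED_TISSUES):
    """Return the normalized entry, or raise ValueError if it is not allowed."""
    cleaned = entry.strip().lower()  # normalization is a design choice of this sketch
    if cleaned not in allowed:
        raise ValueError(f"{entry!r} is not one of {allowed}")
    return cleaned


entries = ["Liver", " kidney ", "livr"]  # invented user input, including one typo
accepted, rejected = [], []
for entry in entries:
    try:
        accepted.append(validate(entry))
    except ValueError:
        rejected.append(entry)

print("accepted:", accepted)
print("rejected:", rejected)
```

Without the check, "livr" would be stored as a separate category, and a later query for "liver" would silently miss that row.
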
Excel Functions 3: Custom Toolbar
• Why? Bring often used functions together for faster access
• DEMO
• How to? 50 min online tutorial
  • Section on custom toolbars here

Excel Functions 4: PivotTables
• Automatic summarization of data
  • Converting same category data into summarized values
  • Tall/skinny → wide/fat
• See: Spreadsheet3.xls/Summary1 - try
• Underlying data can always be accessed by double-clicking on a summary cell
• Online demo (5 min)
• How to? 30 min online tutorial

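
The tall/skinny to wide/fat conversion can be sketched in a few lines of Python. This is an illustration of the idea rather than of Excel's implementation; the gene and tissue names, the values, and the `pivot_mean` helper are invented for the example, and the mean is just one possible summary function.

```python
from collections import defaultdict

# Tall/skinny input: one measurement per row (gene, tissue, expression).
rows = [
    ("geneA", "liver", 2.0),
    ("geneA", "liver", 4.0),
    ("geneA", "brain", 1.0),
    ("geneB", "liver", 5.0),
    ("geneB", "brain", 3.0),
    ("geneB", "brain", 7.0),
]


def pivot_mean(rows):
    """Group rows by (row key, column key) and average each group."""
    groups = defaultdict(list)
    for row_key, col_key, value in rows:
        groups[(row_key, col_key)].append(value)
    table = defaultdict(dict)
    for (row_key, col_key), values in groups.items():
        table[row_key][col_key] = sum(values) / len(values)
    # Return the wide/fat summary plus the underlying detail per summary cell.
    return {key: dict(cols) for key, cols in table.items()}, dict(groups)


summary, detail = pivot_mean(rows)
print(summary)
print(detail[("geneA", "liver")])  # the rows behind one summary cell
```

The `detail` mapping mirrors the point above that the data behind any summary cell remain accessible.
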
Excel Functions 5: Web Querying
• Why Query the Web Using Excel?
  • Data in a Web page = first step
  • Need data stored in tool used for daily work → Excel
  • E.g., with a list I can:
    • Sort
    • Annotate
    • Edit

Excel Functions 5: Web Querying Options
1. Copy/paste Web page into Excel - try
2. Run Web query from within Excel → more control - try
   • Going one step further: creating a refreshable Web query
• Excel Web querying is not perfect…
  • Still limited to how data are formatted on Web page → requires editing
  • Some Web pages don’t work
  • No arbitrary querying capability (limited by Web interface)
→ The answer: direct querying using e.g. SQL

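
As a rough picture of what a Web query has to do, the sketch below pulls the rows of an HTML table into a list of lists using only Python's standard library. It is not how Excel does it; the `TableParser` class and the sample `PAGE` are assumptions for illustration, and the page is an inline string so that no network access is needed.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Sample page content, invented for this sketch.
PAGE = """<html><body><table>
<tr><th>Gene</th><th>Chromosome</th></tr>
<tr><td>BRCA2</td><td>13</td></tr>
<tr><td>TP53</td><td>17</td></tr>
</table></body></html>"""

parser = TableParser()
parser.feed(PAGE)
parser.close()
table = parser.rows
for row in table:
    print(row)
```

The sketch shares the limitations listed above: it can only return what the page's markup exposes, and every cell comes back as text that may still need editing.
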
BREAK

Part II: Querying Databases Using Excel

Putting MSQuery to Work
• MSQuery, an unknown hero
  • Free
  • Facilitates writing a SQL query → graphical
  • What is SQL?
• First, need to find it!
  • Search for “MSQRY32.EXE” using “Search for Files or Folders”
    • Search hidden files and folders
  • On my disk, it is located in C:\Program Files\Microsoft Office\OFFICE11
  • Once you find it, create a shortcut to it and rename it e.g. MSQuery
    • Move the shortcut to a desired location

Example: Network Querying of Ensembl Database Using MS Query
• Big database, lots of data to return from far away…
• What happens when you use MS Query
  • DEMO
  • May take some time
[Diagram: the query travels to the remote DB; the results travel back]

FYI - Bioinformatics Databases: Who Supports Direct Querying?
Direct Queryability of Selected Bioinformatics Databases

| Database | Internet SQL querying? | How? | Modality | DB Engine |
|---|---|---|---|---|
| ArrayExpress | Eventually | - | SOAP-based | - |
| Ensembl | Yes | http://www.ensembl.org/info/data/download.html | SQL | MySQL |
| Mouse Genome Database | Yes | ask for account | SQL | Sybase |
| NCBI Entrez | Yes | http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html | SOAP-based | SQL Server |
| PharmGKB | Yes | http://www.pharmgkb.org/home/projects/webservices/ | SOAP-based | Oracle |
| Saccharomyces Genome Database | Eventually/Maybe | - | - | Oracle |
| Stanford Microarray Database | No | - | - | Oracle |

How to Query Using MSQuery
Steps
1. Make sure you have the requisite driver
2. Create a Data Source Name
3. Write your SQL query
4. Get the results back into Excel!
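
Steps 3 and 4 can be sketched in Python with an in-memory SQLite database standing in for the remote database, so that no driver or Data Source Name is needed to run it. The `gene` table, its columns, and its rows are invented for illustration; a real session would go through MS Query against the provider's own schema.

```python
import csv
import io
import sqlite3

# Stand-in for steps 1-2: an in-memory SQLite database replaces the remote
# database that would normally be reached through a driver and a DSN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (name TEXT, chromosome TEXT, length_bp INTEGER)")
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [("geneA", "1", 1200), ("geneB", "1", 5400), ("geneC", "2", 800)],
)

# Step 3: write the SQL query (hypothetical table and columns).
QUERY = """
SELECT name, length_bp
FROM gene
WHERE chromosome = ?
ORDER BY length_bp DESC
"""
cursor = conn.execute(QUERY, ("1",))
header = [column[0] for column in cursor.description]
rows = cursor.fetchall()
conn.close()

# Step 4: hand the results to Excel, here as CSV text that Excel can open.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Unlike a Web query, the `WHERE` and `ORDER BY` clauses can be changed freely, which is the arbitrary querying capability that direct SQL access provides.
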

Step 1: Getting Drivers Essential for Querying
• A driver is a piece of software that lets your operating system talk to a database
• Each database engine (Oracle, MySQL, etc.) requires its own driver
  • Generally must be installed by user
• Drivers are needed by Data Source Name tool and querying programs
• Require (simple) installation

MySQL Driver: Needed to Query MySQL Databases
• Windows: Download MySQL Connector/ODBC 3.51 here
• Must be installed for direct querying using e.g. Excel
  • Not necessary if you are using the MySQL Query Browser

Oracle Driver: Needed to Query Oracle Databases
• Installing “client” software will install driver
  • Windows: Download 10g Client here
  • Mac: Download 10g Client here
• Must be installed if you are querying using e.g. Excel

Step 2: Creating a Data Source Name
• A Data Source Name (DSN) tells programs on your PC where and how to query a database
• Populating the fields:
  • Data Source Name: Unique name of your choice
  • Description: anything
  • Server: exactly as given by the database provider
  • Port number: as specified by database provider
    • Defaults: MySQL: 3306; Oracle: 1521; MS Access: N/A
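
The fields above amount to a small record with one defaulted value. The sketch below models that record in Python to show how the port falls back to the engine default when the provider does not specify one. It does not create a real DSN; the `make_dsn` helper, the DSN name, and the server `db.example.org` are hypothetical.

```python
# Default ports from the slide; MS Access has none (N/A), so it is not listed.
DEFAULT_PORTS = {"mysql": 3306, "oracle": 1521}


def make_dsn(name, engine, server, port=None, description=""):
    """Return the DSN fields as a dict, filling in the engine's default port."""
    engine = engine.lower()
    if engine not in DEFAULT_PORTS:
        raise ValueError(f"no default port known for engine {engine!r}")
    if not name or not server:
        raise ValueError("Data Source Name and Server are required")
    return {
        "Data Source Name": name,       # unique name of your choice
        "Description": description,     # anything
        "Server": server,               # exactly as given by the provider
        "Port": port if port is not None else DEFAULT_PORTS[engine],
    }


# Hypothetical example: the server name is a placeholder, not a real host.
dsn = make_dsn("my_mysql_db", "mysql", "db.example.org")
for field, value in dsn.items():
    print(f"{field}: {value}")
```

If the provider publishes a non-default port, pass it explicitly, e.g. `make_dsn("my_db", "mysql", "db.example.org", port=3307)`.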

Step 3: Building a Query


DEMO

Resources – Excel: Summarizing Numerical Data
• Data summarization (text): http://office.microsoft.com/en-us/assistance/HA011864391033.aspx

Resources – MS Access: Free Online Training Resources
• Using an Access database to store and information (2 min): http://office.microsoft.com/en-us/assistance/HA011709681033.aspx
• Creating a database from Excel (5 min): http://office.microsoft.com/en-us/assistance/HA012013211033.aspx
• Creating tables in Access (50 min): http://office.microsoft.com/training/training.aspx?AssetID=RC061183261033
• Writing queries (50 min): http://office.microsoft.com/training/training.aspx?AssetID=RC010776611033

Resources - Excel

[Book covers not reproduced: one title accessible from Lane Library; two titles available via Safari]

Resources - Excel

[Book cover not reproduced: title available from Lane Library]

MS Query Resources
• Excellent tutorial: http://office.microsoft.com/training/Training.aspx?AssetID=RP011856321033&CTT=6&Origin=RC011856161033

Resources – SQL
• SQL = Structured Query Language
  • The Language to Query Relational Databases
• Beginning SQL, Wilton P & Colby JW: E http://jenson.stanford.edu/uhtbin/cgisirsi/5AGuKeptoD/GREEN/59960102/9#holdings
• Oracle SQL*Plus, Gennick, J.
• Beginning MySQL: E http://site.ebrary.com/lib/stanford/Doc?id=10114227

Resources – MS Access

[Book covers not reproduced: one title accessible from Lane Library; one not in SU catalog, on order by Lane; one with 1st edition available from SU and 2nd edition available via Safari]


Managing experiment data using Excel and Friends

  • 1. Managing Experiment Data Using Excel and Friends: Digging Out from Under the Avalanche Yannick Pouliot, PhD Bioresearch Informationist Lane Medical Library & Knowledge Management Center 6/1/2006 © 2006 The Board of Trustees of The Leland Stanford Junior University Lane Medical Library & Knowledge Management Center http://lane.stanford.edu
  • 2. Course Expectations Objectives  Demonstrate     Windows vs. Mac Structure       … good practices … useful features … the value of querying via Excel Examples, use cases Exercises Resources Class evaluation questionnaire: http://www.surveymk.com/s.asp?u=915602161402 Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 2
  • 3. Contents Complexity + Querying Web sites & databases using Excel Excel handy functions Excel good practices Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 3
  • 4. So Why Are We Here?  Lots of data   Need for better management of these data      Need exceeds Excel Excel never really meant for data management anyway Applying common tools to ameliorate the problem “In IT, there’s no problem that enough money can’t solve”  not the philosophy here… Instead: invest yourself and you’ll get a handsome return  Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 4
  • 5. Essential Tip Clippy: not as dorky as you might think Lane Medical Library & Knowledge Management Center http://lane.stanford.edu
  • 6. How To Help Clippy Give You Better Answers  Read a (good) Excel manual cover to cover Don’t try to understand everything   Just flip pages and let it impress into your brain Not fun, but it will give you the requisite vocabulary    Increases your odds of getting the right answer Gives you an idea of what Excel can do Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 6
  • 7. Part I: Essential Excel Functions Lane Medical Library & Knowledge Management Center http://lane.stanford.edu
  • 8. Essential Excel Functions 1. 2. 3. 4. 5. 6. Conditional Formatting Named ranges & Input validation Custom Toolbar PivotTable Web Querying MS Query Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 8
  • 9. Excel Functions 1: Conditional Formatting  Definition: A formatting (e.g., cell shading or font color) applied automatically by Excel to cells if a specified condition is true.     Example: applying green cell color to the cell if a test result exceeds a threshold value In: Format/Conditional Formatting See Spreadsheet1.xls/ConditionalExample1 - try Reference Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 9
  • 10. Excel Functions 2: Named Ranges and Validation   Named ranges are ranges of cells that are…named! Named ranges can be used for validating input data  Important for ensuring data consistency      Essential for queryability Also useful to avoid repetitive typing by using drop-down menu See: Spreadsheet1.xls/InputValidation - try How to: here Other references Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 10
  • 11. Excel Functions 3: Custom Toolbar    Why? Bring often used functions together for faster access DEMO How to? 50 min online tutorial  Section on custom toolbars here Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 11
  • 12. Excel Functions 4: PivotTables  Automatic summarization of data    See: Spreadsheet3.xls/Summary1 - try    Converting same category data into summarized values Tall/skinny  wide/fat Underlying data can always be accessed by clicking on a summary cell Online demo (5 min) How to? 30 min online tutorial Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 12
  • 13. Excel Functions 5: Web Querying  Why Query the Web Using Excel?  Data in a Web page = first step  Need data stored in tool used for daily work  Excel  E.g., with a list I can:  Sort  Annotate  Edit Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 13
  • 14. Excel Functions 5: Web Querying Options  Copy/paste Web page into Excel - try Run Web query from within Excel  more control try 1. 2.   Going one step further: creating a refreshable Web query Excel Web querying is not perfect…    Still limited to how data are formatted on Web page  requires editing Some Web pages don’t work No arbitrary querying capability (limited by Web interface)  The answer: direct querying using e.g. SQL Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 14
  • 15. BREAK Lane Medical Library & Knowledge Management Center http://lane.stanford.edu
  • 16. Part II: Querying Databases Using Excel Lane Medical Library & Knowledge Management Center http://lane.stanford.edu
  • 17. Putting MSQuery to Work  MSQuery, an unknown hero     Free Facilitates writing a SQL query  graphical What is SQL? First, need to find it!  Search for “MSQRY32.EXE” using “Search for Files or Folders”    Search hidden files and folders On my disk, it is located in C:Program FilesMicrosoft OfficeOFFICE11 Once you find it, create a shortcut to it and rename it e.g. MSQuery  move the shortcut to a desired location Lane Medical Library & Knowledge Management Center http://lane.stanford.edu 17
  • 18. Example: Network Querying of Ensembl Database Using MS Query. Remote DB: big database, lots of data to return from far away… What happens when you use MS Query (diagram: query → remote DB → results). May take some time. DEMO
  • 19. FYI - Bioinformatics Databases: Who Supports Direct Querying? Direct queryability of selected bioinformatics databases (database: supports direct Internet querying?; how; modality; DB engine). ArrayExpress: eventually; SOAP-based. Ensembl: yes; http://www.ensembl.org/info/data/download.html; SQL; MySQL. Mouse Genome Database: yes; ask for account; SQL; Sybase. NCBI Entrez: yes; http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html; SOAP-based; SQL Server. PharmGKB: yes; http://www.pharmgkb.org/home/projects/webservices/; SOAP-based; Oracle. Saccharomyces Genome Database: eventually/maybe; Oracle. Stanford Microarray Database: no; Oracle.
  • 20. How to Query Using MSQuery. Steps: 1. Make sure you have the requisite driver. 2. Create a Data Source Name. 3. Write your SQL query. 4. Get the results back into Excel!
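The four steps can be tried end to end without MSQuery. The sketch below is a stand-in, not the MSQuery workflow itself: it assumes Python's built-in `sqlite3` module in place of a remote database (so steps 1 and 2 need no separate driver or DSN), and the table `gene` and its contents are invented.

```python
import csv
import io
import sqlite3

# Steps 1-2 (stand-in): sqlite3 ships with Python, so no driver or DSN is needed here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (name TEXT, chromosome TEXT, length_bp INTEGER)")
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [("geneA", "1", 1200), ("geneB", "2", 800), ("geneC", "1", 3000)],
)

# Step 3: write the SQL query (arbitrary filtering and sorting, unlike a Web form).
query = "SELECT name, length_bp FROM gene WHERE chromosome = ? ORDER BY length_bp DESC"
cursor = conn.execute(query, ("1",))
header = [column[0] for column in cursor.description]
results = cursor.fetchall()

# Step 4: write the results as CSV text, a format Excel can open.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(results)
csv_text = buffer.getvalue()
conn.close()
```

With a real MySQL or Oracle database, the query in step 3 stays much the same; what changes is the connection, which is where the driver and the Data Source Name come in.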
  • 21. Step 1: Getting Drivers, Essential for Querying. A driver is a piece of software that lets your operating system talk to a database. Each database engine (Oracle, MySQL, etc.) requires its own driver, which generally must be installed by the user. Drivers are needed by the Data Source Name tool and by querying programs. They require (simple) installation.
  • 22. MySQL Driver: Needed to Query MySQL Databases. Windows: download MySQL Connector/ODBC 3.51 here. Must be installed for direct querying using e.g. Excel. Not necessary if you are using the MySQL Query Browser.
  • 23. Oracle Driver: Needed to Query Oracle Databases. Installing the “client” software will install the driver. Windows: download 10g Client here. Mac: download 10g Client here. Must be installed if you are querying using e.g. Excel.
  • 24. Step 2: Creating a Data Source Name. A Data Source Name (DSN) tells programs on your PC where and how to query a database. Populating the fields: Data Source Name: unique name of your choice. Description: anything. Server: exactly as given by the database provider. Port number: as specified by the database provider (defaults: MySQL: 3306; Oracle: 1521; MS Access: N/A).
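As a rough picture of what a DSN holds, the fields above can be written down as a small record. This is only an illustration in Python, not the Windows DSN tool; the function name `describe_data_source`, the DSN name, and the server `db.example.org` are hypothetical.

```python
# Default ports as listed on the slide; MS Access is file-based, so it has none.
DEFAULT_PORTS = {"MySQL": 3306, "Oracle": 1521, "MS Access": None}


def describe_data_source(name, engine, server, description="", port=None):
    """Gather the DSN fields into a dict, filling in the engine's default port."""
    if engine not in DEFAULT_PORTS:
        raise ValueError(f"unknown engine: {engine}")
    if port is None:
        port = DEFAULT_PORTS[engine]
    return {
        "Data Source Name": name,
        "Description": description,
        "Server": server,
        "Port": port,
    }


# Hypothetical values; the real server name comes from the database provider.
dsn = describe_data_source("my_mysql_db", "MySQL", "db.example.org")
```

If the provider specifies a non-default port, pass it explicitly, e.g. `describe_data_source("my_db", "MySQL", "db.example.org", port=3307)`.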
  • 25. Step 3: Building a Query. DEMO
  • 26. Resources – Excel: Summarizing Numerical Data. Data summarization (text): http://office.microsoft.com/en-us/assistance/HA011864391033.aspx
  • 27. Resources – MS Access: Free Online Training Resources. Using an Access database to store and information (2 min): http://office.microsoft.com/en-us/assistance/HA011709681033.aspx. Creating a database from Excel (5 min): http://office.microsoft.com/en-us/assistance/HA012013211033.aspx. Creating tables in Access (50 min): http://office.microsoft.com/training/training.aspx?AssetID=RC061183261033. Writing queries (50 min): http://office.microsoft.com/training/training.aspx?AssetID=RC010776611033
  • 28. Resources - Excel, Accessible from Lane Library (book covers; two marked “Available via Safari”)
  • 29. Resources - Excel, Available from Lane Library
  • 30. MS Query Resources. Excellent tutorial: http://office.microsoft.com/training/Training.aspx?AssetID=RP011856321033&CTT=6&Origin=RC011856161033
  • 31. Resources – SQL. SQL = Structured Query Language: the language to query relational databases. Beginning SQL, Wilton P & Colby JW (E): http://jenson.stanford.edu/uhtbin/cgisirsi/5AGuKeptoD/GREEN/59960102/9#holdings. Oracle SQL*Plus, Gennick, J. Beginning MySQL (E): http://site.ebrary.com/lib/stanford/Doc?id=10114227
  • 32. Resources – MS Access, Accessible from Lane Library (book covers, captioned: “Not in SU catalog; on order by Lane” and “1st edition available from SU; 2nd edition available via Safari”)