TM4P
Translational Medicine for Patients

TM Data Hub Project
Implementation of a Translational Medicine Data Integration Platform

tranSMART Community Meeting
Developer Stream, Nov 06-2013
Charlotte Raillère (tranSMART Expert)
Claire Virenque (Project Manager)

Content of the presentation
● Update on Sanofi latest achievements
  1. IT security assessment of tranSMART
  2. Improvement of SNP (subject level) data loading
● Update on work-in-progress
  3. New release under development (‘RC2’)
  4. tranSMART x MongoDB integration

tranSMART Community Meeting – Nov 06, 2013

Context – tranSMART at Sanofi
● Pilot experience with tranSMART from September 2011 to June 2012
  ● Evaluate tranSMART capabilities to support clinical biomarker research
● Implementation project launched in September 2012
  ● Identify the tranSMART improvements of highest value for Sanofi
  ● Implement these improvements through two successive tranSMART Release Candidates (RC)
    • RC1 has been available since March 2013; code base available on GitHub
    • RC2 build is in progress
    • RC2 is expected to move into production in Q2 2014
● Working version of tranSMART available for our early adopter business units
  ● Objective: meet their ongoing needs related to translational research data integration
  ● Support for data curation and loading is also provided

tranSMART IT security assessment
Feedback
Special thanks to Vincent Rossetto and the IS Security Team!
Part 1 – Scope and Context
● Objective of Security Risk Assessment: Protect R&D information
  ● Mission of the R&D IS Security team: control and assess the risks to R&D information assets
● Risk assessment methodology
  ● ‘Ethical’ hacking (penetration testing)
    • From vulnerability scans to exploitation
    • Using free tools (Nessus, BackTrack, Metasploit, SharePoint Perl script)
    • With no account on Sanofi systems and no Sanofi standard workstation
  ● Without an access account, try to gain high-level access (admin account, sensitive data)
● Risk classification: four grades
  ● From ‘High’: risk with important consequences for Sanofi activities; can happen or be caused easily
  ● To ‘Negligible’: risk with minor consequences; requires expert knowledge or a favorable context
● Recommendations: Remediation Action Plan
  ● With prioritization of the recommendations

Part 2 – tranSMART risk assessment results
● tranSMART strength overview
  ● No trivial system accounts found. No default database accounts found.
  ● Web servers are running under low privileges. User authentication cannot be bypassed.
    • Authentication through Sanofi’s Active Directory
● Main risks identified
  ● Credential disclosure (database, Tomcat, JBoss…)
  ● Session hijacking
  ● Privilege elevation
  ● Application malevolence (XSS)
● Impact
  ● Sensitive data disclosure
  ● Technical information disclosure
  ● Identity usurpation

Part 3 – Application Security weaknesses
● XSS attack: Certain parameters (tags) are prone to stored cross-site scripting attacks.
  ● This vulnerability can be exploited to take control of another administrator’s browser or, more probably, to mount phishing or viral spreading attacks
  ● Admin session hijacking XSS alert:
    • <script>alert(String.fromCharCode(88, 83, 83, 32, 97, 116, 116, 97, 99, 107, 32, 105, 110, 32, 112, 114, 111, 103, 114, 101, 115, 115))</script>
● Privilege escalation: Basic users can access some administrative features
  ● The following URLs must not be accessible to users with a standard account:
    • /transmart/secureObjectAccess/manageAccess
    • /transmart/secureObjectAccess/manageAccessBySecObj
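
The character codes in the alert payload above spell out a short message, and a stored XSS weakness of this kind generally comes from writing user-supplied values (here, tags) into a page without escaping them. The Python sketch below is only an illustration: tranSMART itself is not written in Python, and `render_tag` is a hypothetical helper, not part of the tranSMART code base. It decodes the payload and shows output escaping with the standard-library `html.escape`.

```python
import html

# Character codes copied from the proof-of-concept payload on the slide.
PAYLOAD_CODES = [88, 83, 83, 32, 97, 116, 116, 97, 99, 107, 32, 105, 110, 32,
                 112, 114, 111, 103, 114, 101, 115, 115]


def decode_payload(codes):
    """Python equivalent of JavaScript's String.fromCharCode(...)."""
    return "".join(chr(code) for code in codes)


def render_tag(value):
    """Hypothetical rendering helper: escape a user-supplied tag value
    before placing it inside HTML, so markup in it is shown, not executed."""
    return '<span class="tag">' + html.escape(value, quote=True) + "</span>"


if __name__ == "__main__":
    print(decode_payload(PAYLOAD_CODES))
    print(render_tag("<script>alert(1)</script>"))
```

Escaping at output time is one of several defenses; input validation and a restrictive content security policy are commonly combined with it.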

Part 4 – Recommendations and good practices
● Use good development practices to avoid XSS attacks and privilege escalation
  ● Based on development standards such as OWASP
● Ensure compliance of application accounts with the company’s password policy
  ● LDAP authentication using AD (preferred)
  ● Or set up a specific application password policy (password complexity, password expiration, time out…)
● Encrypt tranSMART authentication (https)
  ● Avoid sniffing attacks and credential disclosure
● Avoid default or weak accounts
  ● Administrative consoles (JBoss, Tomcat, Axis2) must have complex and secret passwords
    • Risk: Exploit vulnerability to access admin areas and compromise the application (crafted application…)
    • Consequence: Can impact the application availability or the data confidentiality & integrity
  ● Database accounts (DBA, application) must have complex and secret passwords
    • Risk: Exploit vulnerability to access the Web application database
    • Consequence: Can impact the data confidentiality & integrity
● Raise user awareness of security topics
  ● Lock the workstation or log off from the tranSMART session to avoid unauthorized access
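
The recommendation on privilege escalation can be pictured as a server-side check that refuses administrative URLs to non-admin users. The sketch below is a Python illustration under stated assumptions: the two paths are the ones flagged in Part 3, while the role name `ROLE_ADMIN` and the function `is_allowed` are hypothetical and do not describe how tranSMART actually enforces access.

```python
# Paths reported in the assessment as reachable by standard accounts.
ADMIN_ONLY_PATHS = (
    "/transmart/secureObjectAccess/manageAccess",
    "/transmart/secureObjectAccess/manageAccessBySecObj",
)

ADMIN_ROLE = "ROLE_ADMIN"  # hypothetical role name, for illustration only


def is_allowed(path, roles):
    """Return True if a user holding `roles` may request `path`.

    The query string and any trailing slash are ignored, so that
    '/x/manageAccess/?id=1' is treated like '/x/manageAccess'.
    """
    normalized = path.split("?", 1)[0].rstrip("/")
    if normalized in ADMIN_ONLY_PATHS:
        return ADMIN_ROLE in roles
    return True


if __name__ == "__main__":
    print(is_allowed("/transmart/secureObjectAccess/manageAccess", {"ROLE_USER"}))
    print(is_allowed("/transmart/secureObjectAccess/manageAccess", {ADMIN_ROLE}))
```

The important design point is that the check runs on the server for every request; hiding the admin links in the UI does not prevent a standard user from typing the URL.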

Loading of SNP data
Latest achievements

Loading of SNP genotyping data
● Modification of loader.jar (from tranSMART-ETL repository)
  ● Correction of errors
  ● Faster loading
    • Some inserts replaced by batch inserts
    • Parameters modified to insert/select data
  ● Fewer constraints on file format
    • Columns from the annotation file can be described in property files
    • New class to load SNP data from Illumina platform
● Loading of three studies with SNP data from Illumina platform (> 1 million SNPs)
  ● 4 patients → 40 minutes
  ● 30 patients → 5 hours
  ● 1500 patients → 80 hours
  Estimation (on-going)
● Integration of SNP loading in ICE (tranSMART Curation & Loading Tool) done
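
To illustrate what replacing single inserts with batch inserts means, here is a minimal Python sketch using the standard-library `sqlite3` module. It is not the loader.jar code (which runs on the JVM against the tranSMART database); the table `snp_calls`, its columns and the batch size are assumptions made for the example.

```python
import sqlite3

INSERT_SQL = "INSERT INTO snp_calls (patient_id, snp_id, genotype) VALUES (?, ?, ?)"


def make_db():
    """Create an in-memory database with a hypothetical genotype table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snp_calls (patient_id TEXT, snp_id TEXT, genotype TEXT)"
    )
    return conn


def load_row_by_row(conn, rows):
    """Baseline: one INSERT call per genotype row."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
    conn.commit()


def load_in_batches(conn, rows, batch_size=10_000):
    """Batched variant: hand rows to the database in groups of batch_size."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            conn.executemany(INSERT_SQL, batch)
            batch = []
    if batch:  # flush the final, partially filled batch
        conn.executemany(INSERT_SQL, batch)
    conn.commit()


if __name__ == "__main__":
    rows = [("P1", f"rs{i}", "AA") for i in range(25)]
    conn = make_db()
    load_in_batches(conn, rows, batch_size=10)
    print(conn.execute("SELECT COUNT(*) FROM snp_calls").fetchone()[0])
```

With a networked database the gain typically comes from fewer client/server round trips per row; the actual speed-up depends on the driver and database settings.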

New tranSMART release under development (‘RC2’)
Improvements – New features

tranSMART RC2 – Scope outline
● Accommodate new data types
  ● miRNA data (qPCR and microarray)
  ● Proteomic data (RBM data, mass spec data)
  ● Metabolomic data
  ● RNA sequencing data
● Accommodate serial data (time courses, dose responses, etc.)
● Enable sequential loading of data for a study
● Enhance critical current analytics
  ● Box Plot, Line Graph, Correlation Analysis, Grid View
  ● Plus adaptation of analytics to new data types
● Enhance data export features

Developments in progress, in partnership with Cognizant and The Hyve.
Completion of RC2 developments planned for January 2014.
Developments will be contributed back to the community.

Further details on RC2 enhancements are given in the additional slides.

tranSMART RC2 – Key points
● RC2 is built ‘on top of’ the Sanofi RC1 release
  ● ETL: impact of changes = high (Kettle scripts converted into Groovy, new ETL pipelines, mapping files modified)
  ● Data model: impact = high (creation of new tables for new data types, etc.)
  ● UI: impact = low
● Our goal is to converge towards the GPL version
  ● RC1 was merged with ‘Core DB’ & ‘Core API’ enhancements (from GPL1.1)
    • Start of the modularization of tranSMART
  ● New data types are implemented in a modular fashion
    • This should ease the future merging of RC2 with the open source code base

Limit deviation from the open source code base
Do not duplicate efforts
Maximally benefit from public tranSMART development efforts
Contribute back all developments to the community

tranSMART x MongoDB integration
Objective and timeline

MongoDB integration with tranSMART (1/2)
● MongoDB is a NoSQL document-oriented database
● Main need for tranSMART: physical storage of unstructured data (i.e., files)
  ● Any files that are uploaded and visible through the Browse tab of the Sanofi RC1 (raw data files, study-related documentation such as clinical protocols, etc.)
  ● Currently, files are stored on the tranSMART app server, which has limited storage capacity
  → Objective: Move storage of unstructured data from the tranSMART server to a MongoDB database
● Why MongoDB?
  ● Ability to store huge volumes of unstructured files
  ● Horizontal scalability
  ● Easy installation process
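
One common way to keep files in MongoDB is GridFS, which stores a file as chunks plus a metadata document. The slides do not say which mechanism the Sanofi integration uses, so the Python sketch below is an assumption made for illustration. The helpers `store_study_file` and `load_study_file` and the metadata fields are hypothetical; `bucket` is expected to behave like PyMongo's `gridfs.GridFS` object, whose `put` and `get` methods are the only calls used.

```python
import hashlib


def store_study_file(bucket, filename, data, study_id):
    """Store raw bytes through a GridFS-like bucket and return the file id.

    `bucket` is assumed to expose put(data, **fields) and get(file_id),
    as gridfs.GridFS does. The metadata layout is illustrative only.
    """
    checksum = hashlib.sha256(data).hexdigest()
    return bucket.put(
        data,
        filename=filename,
        metadata={"study_id": study_id, "sha256": checksum},
    )


def load_study_file(bucket, file_id):
    """Read a stored file back as bytes."""
    return bucket.get(file_id).read()
```

With a real deployment, `bucket` could be created as `gridfs.GridFS(pymongo.MongoClient()["transmart_files"])`, where the database name is again hypothetical; the application server then only keeps the returned file id rather than the file itself.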

MongoDB integration with tranSMART (2/2)
● Timelines
  ● Integration with Sanofi RC2 release (backend + UI): Q4 2013
  ● Testing in Q1 2014

Conclusion
Any questions?

Thank you!
Acknowledgement: Sherry Cao, Jike Cui, Angelo DeCristofano, Christophe Gibault, Lars
Greiffenberg, Manfred Hendlich, Rainer Kappes, Adam Palermo, Annick Peleraux, David
Peyruc, Charlotte Raillère, Vincent Rossetto, Claire Virenque

Making a difference in Healthcare with Information Technologies.

Additional slides

tranSMART RC1 – Summary
● Released in March 2013
  ● Code base available on GitHub
● Main improvements delivered in tranSMART RC1:

Topic 1: Data Management
• Ability to organize data within a hierarchical structure (Program/Study/Assay) with new tagging capabilities
• Synonym management for several dictionaries (e.g. compounds, genes, diseases)
• New capabilities for posting, searching and exporting files
• New functionality to load gene expression analysis results
• Better support for time points/series
• Improvement of tranSMART curation and loading tool & pipelines

Topic 2: tranSMART User Interface
• Simplification of tranSMART UI:
  – All searching functionalities centralized
  – Synchronization of the browser and analysis modules

Topic 3: Data Searching and Analysis
• Improvement of data searching capabilities:
  – Integrated search / filter for querying any data available (levels 1 to 4)
  – More search / filter criteria
• Implementation of standard analytics from GPL1.0

RC1 – New organization of tranSMART UI
● Two main tabs – synchronized with each other:

Browse tab
• Global view of all the data available, from level 1 data (uncurated/raw files) to levels 3-4 data (analysis results, findings)
• Navigate within Programs > Studies > Assays, Analysis and File Folders (see next slide)
• Search data using dictionaries
• Create new Programs > Studies > Assays and File Folders, and annotate (tag) them
• Export files
• Visualize gene expression analysis results

Analyze tab
• Run analysis on subject-level data (former Dataset Explorer)
• Browse level 2 (processed) data, incl. clinical / preclinical / molecular data, etc.
• Search subject-level data
• Select data subsets (cohorts)
• Run basic statistical and genomic analyses on those subsets (standard features from tranSMART v1.0)
• Export data subsets

tranSMART RC2 – Requirements (1/2)
| Area | Req # | Requirement | Sprint # |
|---|---|---|---|
| Data loading / ETL pipelines | 1 | Optimize the clinical ETL pipeline to accelerate loading time for large clinical studies | 2 |
| Data loading / ETL pipelines | 2 | Enable incremental loading of data for a given study | 4 |
| Data loading / ETL pipelines | 3 | Enable loading of ‘serial’ high and low dimensional data (time course, dose response, different sampling conditions, etc.) | 2 |
| Data loading / ETL pipelines | 4 | Improve sample handling | 1 |
| Data loading / ETL pipelines | 5 | Enable loading of RBM subject-level data as high dimensional data | 3 |
| Data loading / ETL pipelines | 6 | Enable loading of microarray miRNA subject-level data as high dimensional data | 2 |
| Data loading / ETL pipelines | 7 | Enable loading of qPCR miRNA subject-level data | 2 |
| Data loading / ETL pipelines | 8 | Enable loading of mass spec proteomic subject-level data as high dimensional data | 3 |
| Data loading / ETL pipelines | 9 | Enable loading of metabolomic subject-level data as high dimensional data | 3 |
| Data loading / ETL pipelines | 10 | Improve SNP subject-level data loading – in particular, accelerate loading time | Done |
| Data loading / ETL pipelines | 11 | Enable loading of RNA sequencing subject-level data (gene-level expression quantification) | 2 |
| Data loading / ETL pipelines | 12 | Optimize the management of annotation files for omic data | Done |
| Security | 13 | Set up user authentication through the company’s Active Directory | Done |
| Security | 14 | Implement security rules and user permissions in Browse tab (RC1 feature) | 1 |
| Analytics – Advanced Workflows | 15 | Allow better analysis of ‘serial’ high and low dimensional data using existing analytics | 2 |
| Analytics – Advanced Workflows | 16 | Improve the Line Graph analytics: enable Line Graph to use high dimensional data; better handle x axis; add option to plot individual data in addition to group means or medians | 4 |
| Analytics – Advanced Workflows | 17 | Improve sub categorization of high dimensional data (tissue, time points, etc.) in the high dimensional data node selection screen in Advanced Workflows – linked to req #3 | 2 |
| Analytics – Advanced Workflows | 18 | Improve the Boxplot analytics – make individual box plots for each variable when dragging multiple nodes in field ‘Dependent Variable’, and present output in table format | 4 |
| Analytics – Advanced Workflows | 19 | Improve the Correlation Analysis analytics | 4 |

tranSMART RC2 – Requirements (2/2)
| Area | Req # | Requirement | Sprint # |
|---|---|---|---|
| Analytics – Advanced Workflows | 20 | Allow analysis of RBM data using existing analytics for high dimensional data | 3 |
| Analytics – Advanced Workflows | 21 | Allow analysis of microarray miRNA data using existing analytics | 2 |
| Analytics – Advanced Workflows | 22 | Allow analysis of qPCR miRNA and mRNA data using existing analytics | 2 |
| Analytics – Advanced Workflows | 23 | Allow analysis of mass spectrometry subject-level data using existing analytics | 3 |
| Analytics – Advanced Workflows | 24 | Allow analysis of metabolomic subject-level data using existing analytics | 3 |
| Analytics – Advanced Workflows | 25 | Allow analysis of RNA sequencing data using existing analytics | 2 |
| Analytics – Grid View | 26 | Improve Grid View: enable categorical variables in a single column; enable column deletion, row or column selection; enable export of selection; automatically include variables used in Advanced Workflows | 3 |
| Analytics – Grid View | 27 | Display sample ID related to patient ID in Grid View | 1 |
| Export | 28 | Improve export of data: improve performance (response time) when exporting large data volumes; add advanced filters to allow users to limit the exported data to a subset of clinical fields, genes…; add ability to better categorize the data available for a study (clinical, gene expression, etc.); harmonize with Grid View export capabilities | 2+4 |
| Tagging | 29 | Add ability to preview a file in browser (IE8 and Firefox) | 1 |
| Tagging | 30 | Add dictionaries for miRNA, proteins, metabolites | 2 |
| Gene sign. | 31 | In Gene Signature/List tab, add gene symbols – linked to req #12 | Done |
| UI | 32 | Improve consistency and synchronization of data trees in Browse (Program Explorer panel) and in Analyze (Navigate Terms panel) | 2 |
| Search | 33 | Secure file indexing | Done |
| Search | 34 | After running a free text search in Browse tab, when clicking on bold items in Program Explorer panel, highlight in right-hand side Browse panel: string found in metadata (including in file names); files containing that string | 3 |

Risk Assessment methodology


tranSMART Community Meeting 5-7 Nov 13 - Session 3: Clinical Biomarker Discovery

  • 1. TM4P Translational Medicine for Patients TM Data Hub Project Implementation of a Translational Medicine Data Integration Platform tranSMART Community Meeting Developer Stream, Nov 06-2013 Charlotte Raillère (tranSMART Expert) Claire Virenque (Project Manager) | 1
  • 2. Content of the presentation ● Update on Sanofi latest achievements 1. IT security assessment of tranSMART 2. Improvement of SNP (subject level) data loading ● Update on work-in-progress 3. New release under development (‘RC2’) 4. tranSMART x MongoDB integration tranSMART Community Meeting – Nov 06, 2013 | 2
  • 3. Context – tranSMART at Sanofi ● Pilot experience with tranSMART from September 2011 to June 2012 ● Evaluate tranSMART capabilities to support clinical biomarker research ● Implementation project launched in September 2012 ● Identify the tranSMART improvements of highest value for Sanofi ● Implement these improvements through two successive tranSMART Release Candidates (RC) • RC1 has been available since March 2013; code base available on GitHub • RC2 build is in progress • RC2 is expected to move into production in Q2 next year ● Working version of tranSMART available for our early adopter business units ● Objective: meet their ongoing needs related to translational research data integration ● Support for data curation & loading is also provided
  • 4. tranSMART IT security assessment Feedback Special thanks to Vincent Rossetto and the IS Security Team!
  • 5. Part 1 – Scope and Context ● Objective of Security Risk Assessment: Protect R&D information ● Mission of R&D IS Security team: control and assess the risks on R&D information assets ● Risk assessment methodology ● ‘Ethical’ hacking – penetration testing • From vulnerability scans to exploitation • Using free tools (Nessus, BackTrack, Metasploit, Sharepoint perl script) • With no account on Sanofi systems and no Sanofi standard workstation ● Without an access account, try to gain high level access (admin account, sensitive data) ● Risk Classification: Four grades ● From ‘High’: Risk with important consequences on Sanofi activities – can happen or be caused easily ● To ‘Negligible’: Risk with minor consequences – requires expert knowledge or favorable context ● Recommendations: Remediation Action Plan ● With prioritization of the recommendations
  • 6. Part 2 – tranSMART risk assessment results ● tranSMART strength overview ● No trivial system accounts found. No default database accounts found. ● Web servers are running under low privileges. User authentication cannot be bypassed. • Authentication through Sanofi’s Active Directory ● Main risks identified ● Credential disclosure (database, Tomcat, Jboss…) ● Session hijacking ● Privilege elevation ● Application malevolence (XSS) ● Impact ● Sensitive data disclosure ● Technical information disclosure ● Identity usurpation
  • 7. Part 3 – Application Security weaknesses ● XSS attack: Certain parameters (tags) are prone to stored cross-site scripting attacks. ● This vulnerability can be exploited to take control of another administrator’s browser (admin session hijacking) or, more probably, to mount phishing or viral spreading attacks ● XSS alert: • <script>alert(String.fromCharCode(88, 83, 83, 32, 97, 116, 116, 97, 99, 107, 32, 105, 110, 32, 112, 114, 111, 103, 114, 101, 115, 115))</script> ● Privilege escalation: Basic users can access some administrative features ● The following URLs must not be accessible to users with a standard account: • /transmart/secureObjectAccess/manageAccess • /transmart/secureObjectAccess/manageAccessBySecObj
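
The proof-of-concept payload on slide 7 builds its alert text from character codes, so it needs no quote characters. Below is a minimal Python sketch that decodes those codes and shows how HTML-escaping a stored tag value would neutralize a script payload. The `render_tag` helper is hypothetical and only illustrates output escaping in general; it is not tranSMART code.

```python
import html

# Character codes copied from the payload shown on slide 7.
CODES = [88, 83, 83, 32, 97, 116, 116, 97, 99, 107, 32, 105, 110, 32,
         112, 114, 111, 103, 114, 101, 115, 115]


def decode_payload(codes):
    """Python equivalent of JavaScript's String.fromCharCode(...)."""
    return "".join(chr(c) for c in codes)


def render_tag(value):
    """Hypothetical output step: escape user-supplied tag text before
    embedding it in an HTML page."""
    return '<span class="tag">' + html.escape(value) + "</span>"


message = decode_payload(CODES)
stored = "<script>alert(1)</script>"  # example of a malicious stored tag
print(message)
print(render_tag(stored))
```

With escaping applied at output time, the stored value is displayed as text instead of being executed by the browser.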
  • 8. Part 4 – Recommendations and good practices ● Use good development practices to avoid XSS attacks and privilege escalation ● Based on development standards such as OWASP ● Ensure compliance of application accounts with the company’s password policy ● LDAP authentication using AD (preferred) ● Or set up a specific application password policy (pwd complexity, pwd expiration, time out…) ● Encrypt tranSMART authentication (https) ● Avoid sniffing attacks and credential disclosure ● Avoid default or weak accounts ● Administrative consoles (Jboss, Tomcat, Axis2) must have complex and secret passwords • Risk: Exploit vulnerability to access admin areas and compromise the application (crafted application…) • Consequence: Can impact the application availability or the data confidentiality & integrity ● Database accounts (DBA, application) must have complex and secret passwords • Risk: Exploit vulnerability to access the Web application database • Consequence: Can impact the data confidentiality & integrity ● Sensitize users on security topics ● Lock workstation or log off from tranSMART session to avoid unauthorized access
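
The privilege-escalation finding comes down to a missing server-side role check on the two administrative URLs listed on slide 7. The sketch below shows such a check in Python. The role name `ROLE_ADMIN` and the function `is_allowed` are assumptions made for illustration; tranSMART's actual access-control mechanism is not described in these slides.

```python
# The two administrative URLs named in the assessment (slide 7).
ADMIN_ONLY_PATHS = {
    "/transmart/secureObjectAccess/manageAccess",
    "/transmart/secureObjectAccess/manageAccessBySecObj",
}


def is_allowed(path, roles):
    """Return True if a user holding `roles` may request `path`.

    Hypothetical rule: admin-only paths require ROLE_ADMIN; every other
    path is left to whatever checks already apply to it.
    """
    # Ignore the query string and any trailing slash before comparing.
    normalized = path.split("?", 1)[0].rstrip("/")
    if normalized in ADMIN_ONLY_PATHS:
        return "ROLE_ADMIN" in roles
    return True


print(is_allowed("/transmart/secureObjectAccess/manageAccess", {"ROLE_USER"}))
print(is_allowed("/transmart/secureObjectAccess/manageAccess", {"ROLE_ADMIN"}))
```

The point of the sketch is that the decision is made on the server for every request, so hiding the admin links in the UI is not the only barrier.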
  • 9. Loading of SNP data Latest achievements
  • 10. Loading of SNP genotyping data ● Modification of loader.jar (from tranSMART-ETL repository) ● Correction of errors ● Loading sped up • Some inserts replaced by batch inserts • Parameters modified to insert/select data ● Fewer constraints on file format • Columns from the annotation file can be described in property files • New class to load SNP data from Illumina platform ● Loading of three studies with SNP data from Illumina platform (> 1 million SNPs) ● 4 patients → 40 minutes ● 30 patients → 5 hours ● 1500 patients → 80 hours (estimate, loading ongoing) ● Integration of SNP loading in ICE (tranSMART Curation & Loading Tool) done
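
Slide 10 attributes part of the speed-up to replacing single-row inserts with batch inserts. The sketch below illustrates the general technique in Python with the standard-library `sqlite3` module. The table `snp_calls`, its columns, and the batch size are invented for the example; the real loader is the Java `loader.jar`, whose code is not shown in the slides.

```python
import sqlite3


def load_genotypes(conn, rows, batch_size=1000):
    """Insert rows in batches with executemany instead of one
    execute() call per row. Returns the number of rows inserted."""
    cur = conn.cursor()
    total = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        cur.executemany(
            "INSERT INTO snp_calls (patient_id, snp_id, genotype) "
            "VALUES (?, ?, ?)",
            batch,
        )
        total += len(batch)
    conn.commit()  # single commit after all batches
    return total


# Hypothetical in-memory table and synthetic rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snp_calls (patient_id TEXT, snp_id TEXT, genotype TEXT)"
)
rows = [("P1", f"rs{i}", "AA") for i in range(2500)]
inserted = load_genotypes(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM snp_calls").fetchone()[0]
print(inserted, count)
```

Batching reduces the number of statement round trips, which matters when a single patient contributes more than a million SNP rows.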
  • 11. New tranSMART release under development (‘RC2’) Improvements – New features
  • 12. tranSMART RC2 – Scope outline ● Accommodate new data types ● miRNA data (qPCR and microarray) ● Proteomic data (RBM data, mass spec data) ● Metabolomic data ● RNA sequencing data ● Accommodate serial data (time courses, dose responses, etc.) ● Enable sequential loading of data for a study ● Enhance critical current analytics ● Box Plot, Line Graph, Correlation Analysis, Grid View ● Plus adaptation of analytics to new data types ● Enhance data export features ● Developments in progress. Partnership w/ Cognizant and The Hyve. Completion of RC2 developments planned for January 2014. Developments will be contributed back to the community.
  • 13. tranSMART RC2 – Key points ● RC2 is built ‘on top of’ Sanofi RC1 release ● ETL: impact of changes = high (Kettle scripts converted into Groovy, new ETL pipelines, mapping files modified) ● Data model: impact = high (creation of new tables for new data types, etc.) ● UI: impact = low ● Our goal is to converge towards the GPL version ● RC1 was merged with ‘Core DB’ & ‘Core API’ enhancements (from GPL1.1) • Start of the modularization of tranSMART ● New data types are implemented in a modular fashion. • This should help the future merging of RC2 with the open source code base ● Limit deviation from the open source code base ● Do not duplicate efforts ● Maximally benefit from public tranSMART development efforts ● Contribute back all developments to the community
  • 14. tranSMART x MongoDB integration Objective and timeline
  • 15. MongoDB integration with tranSMART (1/2) ● MongoDB is a NoSQL document-oriented database ● Main need for tranSMART: Physical storage of unstructured data (i.e., files) ● Any files that are uploaded and visible through the Browse tab of the Sanofi RC1 (raw data files, study related documentation such as clinical protocol, etc.) ● Currently, files are stored on the tranSMART app server, which has limited storage capacity ● Objective: Move storage of unstructured data from the tranSMART server to MongoDB ● Why MongoDB? ● Ability to store huge volumes of unstructured files ● Horizontal scalability ● Easy installation process
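
The slides do not say how files would be laid out in MongoDB. One common approach is MongoDB's GridFS convention, which splits a file into fixed-size chunk documents (255 KiB by default) that reference one metadata document. The Python sketch below only mimics that document layout in memory to show the idea. It does not call any MongoDB driver, and the use of GridFS by the Sanofi integration is an assumption, as are the helper names and the example file name.

```python
CHUNK_SIZE = 255 * 1024  # GridFS default chunk size, in bytes


def to_documents(file_id, filename, payload, chunk_size=CHUNK_SIZE):
    """Split `payload` (bytes) into one metadata document plus ordered
    chunk documents, following the GridFS field names."""
    file_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(payload),
        "chunkSize": chunk_size,
    }
    chunks = [
        {"files_id": file_id, "n": n, "data": payload[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(payload), chunk_size))
    ]
    return file_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# Hypothetical 600 KiB file standing in for an uploaded study document.
payload = b"x" * (600 * 1024)
file_doc, chunks = to_documents(1, "clinical_protocol.pdf", payload)
print(file_doc["length"], len(chunks))
```

Because each chunk is an ordinary document, the chunk collection can be distributed across servers, which is one way the horizontal scalability mentioned on the slide can apply to file storage.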
  • 16. MongoDB integration with tranSMART (2/2) ● Timelines ● Integration with Sanofi RC2 release (backend + UI): Q4-2013 ● Testing in Q1-2014
  • 17. Conclusion Any questions? Thank you! Acknowledgement: Sherry Cao, Jike Cui, Angelo DeCristofano, Christophe Gibault, Lars Greiffenberg, Manfred Hendlich, Rainer Kappes, Adam Palermo, Annick Peleraux, David Peyruc, Charlotte Raillère, Vincent Rossetto, Claire Virenque Making a difference in Healthcare with Information Technologies.
  • 18. Additional slides
  • 19. tranSMART RC1 – Summary ● Released in March 2013 ● Code base available on GitHub ● Main improvements delivered in tranSMART RC1: ● Topic 1: Data Management • Ability to organize data within a hierarchical structure (Program/Study/Assay) with new tagging capabilities • Synonym management for several dictionaries (e.g. compounds, genes, diseases) • New capabilities for posting, searching and exporting files • New functionality to load gene expression analysis results • Better support for time points/series • Improvement of tranSMART curation and loading tool & pipelines ● Topic 2: tranSMART User Interface • Simplification of tranSMART UI: – All searching functionalities centralized – Synchronization of the browser and analysis modules ● Topic 3: Data Searching and Analysis • Improvement of data searching capabilities: – Integrated search / filter for querying any data available (levels 1 to 4) – More search / filter criteria • Implementation of standard analytics from GPL1.0
  • 20. RC1 – New organization of tranSMART UI ● Two main tabs – synchronized with each other: Global view of all the data available From level 1 data (uncurated/raw files) to levels 3-4 data (analysis results, findings) Run analysis on subject-level data (former Dataset Explorer) Navigate within Programs > Studies > Assays , Analysis and File Folders (see next slide) Browse level 2 (processed) data – incl. clinical / preclinical / molecular data, etc. Search data using dictionaries Search subject-level data Create new Programs > Studies > Assays and Files Folders, and annotate (tag) them Select data subsets (cohorts) Export files Run basic statistical and genomic analyses on those subsets (standard features from tranSMART v1.0) Visualize gene expression analysis results Export out data subsets tranSMART Community Meeting – Nov 06, 2013 | 20
  • 21. tranSMART RC2 – Requirements (1/2)
      Area: Data loading / ETL pipelines
      – Req 1 (Sprint 2): Optimize the clinical ETL pipeline to accelerate loading time for large clinical studies
      – Req 2 (Sprint 4): Enable incremental loading of data for a given study
      – Req 3 (Sprint 2): Enable loading of ‘serial’ high and low dimensional data (time course, dose response, different sampling conditions, etc.)
      – Req 4 (Sprint 1): Improve samples handling
      – Req 5 (Sprint 3): Enable loading of RBM subject-level data as high dimensional data
      – Req 6 (Sprint 2): Enable loading of microarray miRNA subject-level data as high dimensional data
      – Req 7 (Sprint 2): Enable loading of qPCR miRNA subject-level data
      – Req 8 (Sprint 3): Enable loading of mass spec proteomic subject-level data as high dimensional data
      – Req 9 (Sprint 3): Enable loading of metabolomic subject-level data as high dimensional data
      – Req 10 (Done): Improve SNP subject-level data loading – in particular, accelerate loading time
      – Req 11 (Sprint 2): Enable loading of RNA sequencing subject-level data (gene-level expression quantification)
      – Req 12 (Done): Optimize the management of annotation files for omic data
      Area: Security
      – Req 13 (Done): Set up user authentication through the company’s Active Directory
      – Req 14 (Sprint 1): Implement security rules and user permissions in Browse tab (RC1 feature)
      Area: Analytics – Advanced Workflows
      – Req 15 (Sprint 2): Allow better analysis of ‘serial’ high and low dimensional data using existing analytics
      – Req 16 (Sprint 4): Improve the Line Graph analytics: • Enable Line Graph to use high dimensional data • Better handle x axis • Add option to plot individual data in addition to group means or medians
      – Req 17 (Sprint 2): Improve sub categorization of high dimensional data (tissue, time points, etc.) in the high dimensional data node selection screen in Advanced Workflows – linked to req #3
      – Req 18 (Sprint 4): Improve the Boxplot analytics – make individual box plots for each variable when dragging multiple nodes in field ‘Dependent Variable’, and present output in table format
      – Req 19 (Sprint 4): Improve the Correlation Analysis analytics
  • 22. tranSMART RC2 – Requirements (2/2) Area Req # 20 21 Analytics – 22 Advanced 23 Workflows 24 25 Analytics – Grid View 26 27 Requirement Allow analysis of RBM data using existing analytics for high dimensional data Allow analysis of microarray miRNA data using existing analytics Allow analysis of qPCR miRNA and mRNA data using existing analytics Allow analysis of mass spectrometry subject-level data using existing analytics Allow analysis of metabolomic subject-level data using existing analytics Allow analysis of RNA sequencing data using existing analytics Improve Grid View • • • • Sprint # 3 2 2 3 3 2 Enable categorical variables in a single column Enable column deletion, row or column selection Enable export of selection Automatically include variables used in Advanced Workflows 3 Display sample ID related to patient ID in Grid View Improve export of data • Improve performances (response time) when exporting large data volume 1 28 • Add advanced filters to allow users to limit the exported data to subset of clinical fields, genes… • Add ability to better categorize the data available for a study (clinical, gene expression, etc.) • Harmonize with Grid View export capabilities 2+4 Tagging Gene sign. 29 30 31 1 2 Done UI 32 Add ability to preview a file in browser (IE8 and Firefox) Add dictionaries for miRNA, proteins, metabolites In Gene Signature/List tab, add gene symbols – linked to req #12 Improve consistency and synchronization of data trees in Browse (Program Explorer panel) and in Analyze (Navigate Terms panel) Secure file indexing After running a free text search in Browse tab, when clicking on bold items in Program Explorer panel, highlight in right hand side Browse panel: Export 33 Search 34 2 Done • String found in metadata (including in file names) • Files containing that string tranSMART Community Meeting – Nov 06, 2013 3 | 22
  • 23. Risk Assessment methodology