Data Mining in the Pharmaceutical Industry
Introduction 
• Data mining is the process of extracting information from large data sets through the use of algorithms and techniques drawn from the fields of statistics, machine learning, and database management systems.
• “Mining” means finding something that already exists.
• Therefore, data mining can be defined as a process of identifying hidden patterns, relationships, and trends within data.
• Traditional methods often involve:
1) manual work
2) interpretation of data.
• Data mining, popularly called knowledge discovery in large data, enables organizations to make calculated decisions by
• assembling,
• accumulating,
• analyzing, and
• accessing corporate data.
[Figure: the knowledge discovery process. Labels: Raw Data, Integration, Data Warehouse, Target Data, Transformed Data, Patterns and Rules, Interpretation & Evaluation, Knowledge, Understanding.]
• The scope of pharmaceutical applications is large and it 
may involve drug manufacturing processes as well as 
data processing. 
• Data processing and analysis is a key area in the 
pharmaceutical industry. 
• The vision for the pharmaceutical industry that data mining can help achieve:
• pharmaceutical companies deliver drugs and develop test kits (including genetic tests) and computer programs to deliver the best drug to the patient.
Pharmaceutical companies can also apply data mining methods to huge masses of genomic data to predict how a patient’s genetic makeup determines his or her response to a drug therapy.
Genomic data: data on the genome, that is, the complete set of chromosomal and extrachromosomal genes of an organism, a cell, an organelle or a virus; the complete DNA component of an organism.
Data mining uses a variety of tools, such as:
• Query and reporting tools.
• Analytical processing tools, used to analyze database information from multiple database systems at one time.
Decision Support System (DSS) tools. 
• Decision support systems (DSS) are defined as interactive computer-based systems intended to help decision makers use data and models to identify problems, solve problems, and make decisions.
DATA MINING TECHNIQUES. 
•Many organizations generate mountains of data about their newly discovered drugs, performance reports, etc.
•This data is a strategic resource. Making the most of these strategic resources will lead to improving the quality of pharma industries.
• The data mining process has six important steps:
1. Problem Definition. 
2. Knowledge acquisition. 
3. Data selection. 
4. Data Preprocessing. 
5. Analysis and Interpretation. 
6. Reporting and Use.
The data mining process can also be identified as:
1. Definition of the objectives of the analysis.
2. Selection and pretreatment of the data.
3. Exploratory analysis.
4. Specification of the statistical methods.
5. Analysis of the data.
6. Evaluation and comparison of methods.
7. Interpretation of the chosen model.
1. Definition of the objectives of the analysis. 
Understanding the project objectives and 
requirements from a business perspective and then 
converting this knowledge into a data mining 
problem definition with a preliminary plan 
designed to achieve the objectives.
Relevant data sources for the pharma industry are: 
•clinical data (patient data, pharmaceutical data, 
medical treatments, length of stay); 
•administrative data (staff skills, overtime, nursing 
care hours, staff sick leave); 
• financial data (treatment costs, drug costs, staff 
salaries, accounting, cost-effectiveness studies); and 
• organizational data (room occupation, facilities, 
equipment).
Data mining is used to support: 
•The clinicians at the point of care delivery; 
•The controlling of clinical treatment pathways; 
•The administrative and management tasks; and 
•Efficient management of organizational and 
financial data.
Associations, Mining Frequent 
Patterns. 
• These methods identify rules of affinity among collections of items.
• Rules of affinity: relationships among data items that occur together frequently in the data being mined.
• Applications of association rules include:
• market basket analysis
• attached mailing in direct marketing
• fraud detection
• department store floor/shelf planning, etc.
•Association of diseases with drugs
•Association and analysis of staff movements
•Tracking physicians' adoption of drugs through customers' prescriptions
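To make "rules of affinity" concrete, here is a minimal sketch of the two usual measures behind an association rule, support and confidence. The records, item names, and function names are hypothetical and chosen only for illustration; real prescription data would need cleaning and a proper frequent-pattern algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: the items (diagnoses, drugs) seen together for one patient.
records = [
    {"hypertension", "drug_A"},
    {"hypertension", "drug_A", "drug_B"},
    {"diabetes", "drug_C"},
    {"hypertension", "drug_B"},
    {"diabetes", "drug_C", "drug_A"},
]

def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent, records):
    """Estimate of P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(both, records) / support(antecedent, records)

def frequent_pairs(records, min_support):
    """All item pairs whose support is at least min_support."""
    counts = Counter(pair for r in records for pair in combinations(sorted(r), 2))
    n = len(records)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

if __name__ == "__main__":
    print(support({"hypertension", "drug_A"}, records))
    print(confidence({"hypertension"}, {"drug_A"}, records))
    print(frequent_pairs(records, min_support=0.4))
```

A rule such as "hypertension → drug_A" is usually kept only if both its support and its confidence exceed thresholds chosen by the analyst.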
Classification And Prediction. 
• Classification and prediction are two data analysis techniques that are used to describe data classes and predict future data classes.
• E.g. a credit card company whose customers' credit history is known can classify each customer record as Good, Medium, or Poor.
•Predicting consumer behavior 
•Predicting the likelihood of success in a drug 
adoption process 
•Predicting the percentage accuracy in performance of 
a drug 
•Classifying the historical health records 
•Predicting which types of drugs are most likely to be retained, most likely to be dropped, and most likely to have their composition changed.
Predicting pharma product behavior and attitude 
•Predicting demand projections by seasonal variations 
•Predicting the performance progress of segments 
throughout the performance period 
•Identifying the best profile for different drugs 
•Classify trends of movements through the 
organization for successful/unsuccessful patient 
historical records 
•Categorization of drugs, diseases and patients.
• Classification schemes based on decision trees and neural networks are very useful in the pharma industry.
• Decision trees: a decision tree is a common knowledge representation used for classification.
• In classification, one is given data about a specific instance, and the decision tree predicts, based on that data, which of two or more classes the instance belongs to.
• Each instance contains data for multiple attributes.
• The training instances are previously acquired data that have already been sorted into class labels.
• The tree is built by determining which tests best divide the training instances into separate classes.
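The idea of "determining which tests best divide the instances" can be sketched as a search for a single best split. This example scores each candidate test with Gini impurity, which is one common choice and an assumption here (the slides do not name a criterion). The attributes, values, and labels are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 when all labels agree, larger when classes are mixed."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_test(rows, labels):
    """Return (attribute, threshold, score) for the test 'row[attribute] <= threshold'
    with the lowest weighted Gini impurity over the two resulting groups."""
    best = None
    for attr in rows[0]:
        for threshold in sorted({r[attr] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[attr] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[attr] > threshold]
            if not left or not right:
                continue  # a test that sends everything one way divides nothing
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (attr, threshold, score)
    return best

# Hypothetical training instances, each with multiple attributes and a class label.
rows = [
    {"age": 25, "dose_mg": 10},
    {"age": 62, "dose_mg": 10},
    {"age": 47, "dose_mg": 20},
    {"age": 70, "dose_mg": 20},
]
labels = ["responder", "non-responder", "responder", "non-responder"]

if __name__ == "__main__":
    print(best_test(rows, labels))
```

A full decision tree repeats this search on each resulting group until the groups are pure or too small to split further.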
• Neural Networks 
– Learn through training 
– Resemble biological
networks in structure
– Can produce very good 
predictions 
– Not easy to use and to 
understand 
– Cannot deal with 
missing data
Uses a Bayesian neural network
The prior probability is the probability that any report contains a reference to the adverse event
The posterior probability is the probability that a report links the drug to the adverse event
Determines the “strength” of the link between the adverse event and the drug (called the Information Component or IC)
More complicated than it appears: a patient may consume multiple drugs, so which one caused the adverse event?
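A minimal sketch of the Information Component, assuming its common definition as the base-2 logarithm of the ratio of posterior to prior probability. The sketch reads the posterior as the share of the drug's reports that mention the event, and it uses plain frequency estimates from report counts. A real Bayesian implementation would add priors that shrink the estimate when counts are small. The function name and counts are hypothetical.

```python
import math

def information_component(n_drug_and_event, n_drug, n_event, n_reports):
    """IC = log2(posterior / prior), estimated from raw report counts.

    prior     = share of all reports that mention the adverse event
    posterior = share of the drug's reports that mention the adverse event
    Assumes all counts are positive.
    """
    prior = n_event / n_reports
    posterior = n_drug_and_event / n_drug
    return math.log2(posterior / prior)

if __name__ == "__main__":
    # Hypothetical counts: 10,000 reports, 200 mention the event,
    # 500 mention the drug, 40 mention both.
    print(information_component(40, 500, 200, 10_000))
```

An IC above zero means the drug and the event are reported together more often than independence would predict; an IC near zero suggests no link in the reports.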
[Figure: Bayesian neural network connecting a drug to an adverse event; the connection represents the strength of the link between the adverse event and the drug.]
• Classification works on discrete and unordered data, while prediction works on continuous data.
• E.g. discrete data: this data set shows a group of discrete data.
Music format    Number sold
CD albums       140
CD singles      70
Downloads       55
Vinyl           5
Total sales     270
• This is called discrete data because the units of measurement (for example, CDs) cannot be split up; there is nothing between 1 CD and 2 CDs.
• E.g. continuous data:
Distance in miles: 0.1 0.2 0.6 1.1 1.2 1.8 2.0 2.7 3.4 4.6 6.2 8.0 12.1 14.2
• This data is called continuous because the scale of measurement (distance) has meaning at all points between the numbers given, e.g. we can travel a distance of 1.2, 1.85, or even 1.632 miles.
• Regression is often used, as it is a statistical method for numeric prediction.
• Primary emphasis should be placed on measurement accuracy and predictive efficiency for any new drug discovery.
• Simple or multiple regression is the basic prediction model that enables a decision maker to forecast the status of each criterion based on predictor information.
• Neural network technology is also useful in different areas of business.
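As an illustration of numeric prediction, here is a minimal sketch of simple (one-predictor) linear regression fitted by ordinary least squares. The dose and response values are made up for the example and lie exactly on a line, so the fit is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical predictor (dose) and criterion (measured response).
doses = [10, 20, 30, 40]
responses = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(doses, responses)
predicted = slope * 50 + intercept  # forecast for a dose that was not observed

if __name__ == "__main__":
    print(slope, intercept, predicted)
```

Multiple regression follows the same principle with several predictors at once and is normally fitted with a numerical library rather than by hand.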
CLUSTERING. 
• It is a method by which similar records are grouped together.
• Clustering is usually used to mean segmentation.
• An organization can build a hierarchy of classes that group similar events.
• Using clustering, patients can be grouped based on age, name, diseases, etc.
• In business, clustering helps identify groups of similar customers and characterize them based on purchasing patterns, etc.
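One way to group similar records is k-means, sketched below. The slides do not name a clustering algorithm, so k-means is an assumption made for illustration, as are the patient values (age, systolic blood pressure). The starting centroids are simply the first k points, so the result is deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of numbers; starts from the first k points."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the centroid with the smallest squared distance.
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical patients: (age, systolic blood pressure).
patients = [(25, 118), (62, 150), (30, 121), (58, 145), (27, 120), (65, 155)]
centroids, clusters = kmeans(patients, k=2)

if __name__ == "__main__":
    for centroid, members in zip(centroids, clusters):
        print(centroid, members)
```

In practice the attributes would be scaled first so that no single attribute dominates the distance.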
DATA MINING AND STATISTICS. 
• The ability to build a successful 
predictive model depends on past 
data. 
• Data mining is designed to learn from past successes and failures in order to predict what will happen next (future prediction).
• The data mining tool checks the statistical significance of the predicted patterns and reports them.
The difference between Data Mining 
and statistics 
• Data mining automates the statistical process, which otherwise requires several tools.
• Statistical inference is assumption driven, in the sense that a hypothesis is formed and tested against data.
• Data mining, in contrast, is discovery driven. That is, the hypothesis is automatically extracted from the given data.
Data Mining can answer analytical 
questions such as: 
• What new molecules have been discovered, and what issues surround them?
• What factors or combinations are directly 
impacting the drugs? 
• What are the best and outstanding drugs? 
• Which drugs are likely to be retained? 
• How to optimally allocate resources to ensure 
effectiveness and efficiency? etc.
• An intelligent text mining system could 
provide a platform for extracting and 
managing specific information at the entity 
level. 
• For example, information pertaining to genes, proteins, diseases, organisms, chemical substances, etc. can be analytically extracted for patterns.
It would also provide insights into interrelationships such as
• protein-protein,
• gene-gene,
• protein-chemical,
• gene-disease, and
• drug-drug interactions.
• Text mining can be applied to biomedical literature, 
clinical documents and other medical literary sources 
for data curation and database population in a semi-automated 
manner.
Applications Of Data Mining In 
The Pharmaceutical Industry 
• A lot of information is hidden in the legacy 
systems. 
• This information can easily be extracted. 
• Most of the time, however, this cannot be done directly from the legacy systems, because they were not built to answer unpredictable questions.
• A user-interface may be designed to accept all kinds 
of information from the user (e.g. weight, sex, age, 
foods consumed, reactions reported, dosage, length of 
usage). 
• Then, based upon the information in the databases and the relevant data entered by the user, a list of warnings or known reactions (accompanied by probabilities) should be reported.
• Note that user profiles can contain large amounts of 
information, and efficient and effective data mining 
tools need to be developed to probe the databases for 
relevant information.
• Secondly, the patient's (anonymous) profile should 
be recorded along with any adverse reactions 
reported by the patient, so that future correlations 
can be reported. 
• Over time, the databases will become much larger, 
and interaction data for existing medicines will 
become more complete. 
• The amount of existing pharmaceutical information (pharmacological properties, dosages, contraindications, warnings, etc.) is enormous; however, this fact reflects the number of medicines on the market, rather than an abundance of detailed information about each product.
One of the major problems with pharmaceutical 
data is a lack of information. 
• A food and drug administration department estimated that only about 1% of serious events are reported to it.
• Fear of litigation may be a contributing factor; however, most health care providers simply don't have the time to fill out reports of possible adverse drug reactions.
•Furthermore, it is expensive and time consuming 
for pharmaceutical companies to perform a 
thorough job of data collection, especially when 
most of the information is not required by law. 
•Finally, one should note that the food and drug 
administration department does not require 
manufacturers to test new medicines for potential 
interactions.
Four stages of drug development
• Research finds new drugs.
• Development tests and predicts drug behavior.
• Clinical trials test the drug in humans.
• Commercialization takes the drug and sells it to likely consumers (doctors and patients).
APPLICATIONS OF DATA 
MINING IN THE 
PHARMACEUTICAL INDUSTRY
1) Clinical data analysis – clinical data analysis evaluates and streamlines large amounts of information.
Data mining helps to see trends, irregularities, and risks during product development and launch.
2) Marketing and sales analysis – the identification of the most profitable products and the allocation of marketing funds.
Data mining here helps to examine consumer behavior in terms of prescription renewals and product purchases.
3) Customer analysis – using data mining one can 
develop more targeted customer profiles that focus 
not only on products, but also on the ability to pay 
for them by analyzing historical health trends in 
combination with demographics. 
4) Physician targeting – target physicians who have high prescription rates of a certain drug or treatment with information on new drugs that treat complementary symptoms or conditions.
DEVELOPMENT OF NEW 
DRUGS. 
• New drug development can be supported by clustering molecules into groups according to their chemical properties via cluster analysis.
• Every time a new molecule is discovered, it can be grouped with other chemically similar molecules.
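A minimal sketch of assigning a newly discovered molecule to the most chemically similar existing group. It uses the Tanimoto (Jaccard) coefficient over sets of structural features, a similarity measure widely used in cheminformatics but not named in the slides, so treat it as an assumption. The feature names and groups are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two feature sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty descriptions are treated as identical
    return len(a & b) / len(a | b)

def most_similar_group(new_features, groups):
    """Name of the group whose members are, on average, most similar to the new molecule."""
    def mean_similarity(members):
        return sum(tanimoto(new_features, m) for m in members) / len(members)
    return max(groups, key=lambda name: mean_similarity(groups[name]))

# Hypothetical groups of molecules, each molecule described by a set of features.
groups = {
    "group_1": [{"ring", "hydroxyl", "amine"}, {"ring", "hydroxyl"}],
    "group_2": [{"chain", "carboxyl"}, {"chain", "carboxyl", "ester"}],
}
new_molecule = {"ring", "amine"}

if __name__ == "__main__":
    print(most_similar_group(new_molecule, groups))
```

Real pipelines derive the feature sets from molecular fingerprints computed by a chemistry toolkit rather than from hand-written labels.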
•Mining can help us measure the chemical activity of a molecule against a specific disease, say tuberculosis, and find out which part of the molecule is causing the action.
•This way we can combine a vast number of molecules into a super molecule that keeps only the specific part of each molecule responsible for the action and inhibits the other parts.
•This would greatly reduce the adverse effects 
associated with drug actions.
• Researchers use high-speed screening to test tens, hundreds, or thousands of drugs very quickly.
• The general goal is to find activity on relevant genes or to find drug compounds that have desirable characteristics.
• The data mining techniques used in the development of new drugs are clustering, classification, and neural networks.
• The basic objective is to determine compounds with similar activity.
• The reason is that compounds with similar activity behave similarly.
• This is possible only when we have a known compound and are looking for something better.
• When we don’t have a known compound but have a desired activity and want to find a compound that exhibits it, data mining comes to the rescue.
DEVELOPMENT TESTS AND 
PREDICTS DRUG BEHAVIOR 
• Issues that affect the success of a drug and can impact its future development:
1) Adverse reactions to drugs are reported spontaneously and not in any organized manner.
2) We can only compare adverse reactions with the drugs of our own company and not with drugs from competing firms.
3) We only have information on the patient taking the drug, not on the adverse reaction that the patient is suffering from.
Solution 
• All this can be solved by creating a data warehouse for drug reactions and running business intelligence tools on it.
• BI tool:- Business intelligence tools are a type of 
software that is designed to retrieve, analyze and 
report data. 
• This broad definition includes everything from 
spreadsheets, visual analytics, and querying 
software to data mining, warehousing, and 
decision engineering.
•The drug undergoes testing in animals and human tissue to observe its effect and to determine how much of the drug must be consumed for the desired effect and how dangerous the drug is.
•The data mining techniques that can be used here are classification and neural networks.
• The goal here is to predict if treatment will aid 
patients. 
• If the drug will not aid patients, it serves no purpose.
• Predicting drug behavior is possible when we have data supporting use of the drug and also have training data that shows the effects of the drug (positive or negative).
• The test should be able to predict which patients will benefit, for example which treatments help sickle cell anemia patients.
How it works 
•Information such as gender, body weight, disease state, etc. plays a crucial role.
•This data is fed into a neural network, which predicts whether the patient will benefit from the drug.
•Each training example carries only one of two classifications, yes or no.
•The network is trained on the yes classifications, and a snapshot of the neural network is taken.
•Then the network is trained on the no classifications, and another snapshot is taken.
•The output is yes or no, depending on whether the inputs are more similar to the yes or the no training data.
•E.g. ARTMAP.
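The "two snapshots" idea can be illustrated with a nearest-template classifier: one template is built from the yes examples, another from the no examples, and a new patient receives the label of the closer template. This is a deliberately simplified stand-in and not an implementation of ARTMAP. The feature order (weight, height, gender, blood pressure) and all values are hypothetical.

```python
def make_template(examples):
    """Average the feature vectors of one class into a single template ("snapshot")."""
    return [sum(col) / len(examples) for col in zip(*examples)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(patient, yes_template, no_template):
    """Return "yes" if the patient is at least as close to the yes template as to the no template."""
    if squared_distance(patient, yes_template) <= squared_distance(patient, no_template):
        return "yes"
    return "no"

# Hypothetical training data: [weight_kg, height_cm, gender (0/1), systolic_bp].
yes_examples = [[60, 165, 0, 120], [65, 170, 1, 118]]
no_examples = [[95, 172, 1, 150], [100, 168, 0, 155]]

yes_template = make_template(yes_examples)
no_template = make_template(no_examples)

if __name__ == "__main__":
    print(predict([63, 166, 0, 121], yes_template, no_template))
    print(predict([98, 171, 1, 149], yes_template, no_template))
```

Because the features are not scaled in this sketch, attributes with large numeric ranges dominate the distance; a real model would normalize them first.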
[Figure: neural network with inputs Weight, Height, Gender, and Blood Pressure and output "Patient Benefits?". Imagine an array of weights, one for each “template”; the template closest to the input is chosen, and the path of “least resistance” is chosen for the output.]
CLINICAL TRIALS TEST THE 
DRUG IN HUMANS 
• The company tests drugs in actual patients on a larger scale.
• The company has to keep track of data about patient progress.
• Because the government wants to protect the health of citizens, many rules govern clinical trials.
• In developed countries, a food and drug administration oversees trials.
• The data mining technique used here can be neural networks.
• Here data is collected by the pharmaceutical company and undergoes statistical analysis to determine the success of the trial.
• Data is generally reported to the food and drug administration department and inspected closely.
• Too many negative reactions might indicate that the drug is too dangerous.
• An adverse event might be, for example, a medicine causing drowsiness.
• The goal is to detect when too many adverse 
events occur or detect link between drug and 
adverse event. 
• Too many adverse events linked to a drug might 
indicate drug is too dangerous or health of patient 
is at risk. 
• Adverse events are reported to food and drug 
administration when link is suspected. 
• One can feed information on the adverse events pertaining to drugs into a neural network and let the network lead us to what is meant by ‘too many’.
Benefits 
• Research Stage – instead of trial and error, data 
mining can help find drugs that have desirable 
activity 
• Development Stage – data mining can help 
predict who will benefit from drug 
• Clinical Trials Stage – data mining protects 
patients and helps regulate drug testing 
• Commercialization Stage – data mining can 
optimize use of sales resources like manpower, 
advertising
CONCLUSION. 
• Due to increased computerization and consumer/patient awareness, reporting (via the internet) by health care workers can easily be facilitated.
• Data collection in hospitals and extended care facilities is not 
difficult, and this information is of high quality since such 
institutions typically have tailored diets for their patients, and 
maintain accurate records of treatments, lab tests, and 
administration of prescriptions. 
• Furthermore, given the popularity of the internet, it is 
relatively easy for consumers to voluntarily fill in and submit 
detailed profiles of themselves.
•Data mining techniques are still seldom used in a pharmaceutical environment.
•Data mining can help find drugs that have desirable activity and predict who will benefit from a drug.
•Data mining protects patients, helps regulate drug testing, and optimizes the use of sales resources like manpower and advertising.

Contenu connexe

Tendances

Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Hellmuth Broda
 
Data Mining & Applications
Data Mining & ApplicationsData Mining & Applications
Data Mining & ApplicationsFazle Rabbi Ador
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : ConceptsPragya Pandey
 
Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an IntroductionAli Abbasi
 
Data mining & data warehousing
Data mining & data warehousingData mining & data warehousing
Data mining & data warehousingShubha Brota Raha
 
Business research data collection
Business research data collectionBusiness research data collection
Business research data collectionNishant Pahad
 
Very brief overview of AI in drug discovery
Very brief overview of AI in drug discoveryVery brief overview of AI in drug discovery
Very brief overview of AI in drug discoveryDr. Gerry Higgins
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data MiningDHIVYADEVAKI
 
Prescriptive analytics
Prescriptive analyticsPrescriptive analytics
Prescriptive analyticsIpsita Kulari
 
Big data analytics in healthcare
Big data analytics in healthcareBig data analytics in healthcare
Big data analytics in healthcareJoseph Thottungal
 
Global Promotional Review
Global Promotional Review Global Promotional Review
Global Promotional Review Alan Bergstrom
 
Introduction of mixed effect model
Introduction of mixed effect modelIntroduction of mixed effect model
Introduction of mixed effect modelVivian S. Zhang
 
Prescriptive Analytics
Prescriptive AnalyticsPrescriptive Analytics
Prescriptive AnalyticsŁukasz Grala
 
What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation Pralhad Rijal
 

Tendances (20)

Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
 
Data Mining & Applications
Data Mining & ApplicationsData Mining & Applications
Data Mining & Applications
 
Data Cleaning
Data CleaningData Cleaning
Data Cleaning
 
SAS - overview of SAS
SAS - overview of SASSAS - overview of SAS
SAS - overview of SAS
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : Concepts
 
Data mining
Data miningData mining
Data mining
 
Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Data mining & data warehousing
Data mining & data warehousingData mining & data warehousing
Data mining & data warehousing
 
Business research data collection
Business research data collectionBusiness research data collection
Business research data collection
 
Data mining
Data miningData mining
Data mining
 
Very brief overview of AI in drug discovery
Very brief overview of AI in drug discoveryVery brief overview of AI in drug discovery
Very brief overview of AI in drug discovery
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
Prescriptive analytics
Prescriptive analyticsPrescriptive analytics
Prescriptive analytics
 
Big data analytics in healthcare
Big data analytics in healthcareBig data analytics in healthcare
Big data analytics in healthcare
 
Global Promotional Review
Global Promotional Review Global Promotional Review
Global Promotional Review
 
Pharmacovigilance
PharmacovigilancePharmacovigilance
Pharmacovigilance
 
Introduction of mixed effect model
Introduction of mixed effect modelIntroduction of mixed effect model
Introduction of mixed effect model
 
Prescriptive Analytics
Prescriptive AnalyticsPrescriptive Analytics
Prescriptive Analytics
 
What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation
 

En vedette

Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma Ankur Khanna
 
Improving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutionsImproving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutionsPaul Grant
 
New Pharma Market Reality - Predictive Analytics is the Solution
New Pharma Market Reality - Predictive Analytics is the SolutionNew Pharma Market Reality - Predictive Analytics is the Solution
New Pharma Market Reality - Predictive Analytics is the SolutionDr. Sandeep Juneja
 
Application of BI in pharmaceutical industry
Application of BI in pharmaceutical industryApplication of BI in pharmaceutical industry
Application of BI in pharmaceutical industryBiBoard.Org
 
Data Mining in Healthcare: How Health Systems Can Improve Quality and Reduce...
Data Mining in Healthcare:  How Health Systems Can Improve Quality and Reduce...Data Mining in Healthcare:  How Health Systems Can Improve Quality and Reduce...
Data Mining in Healthcare: How Health Systems Can Improve Quality and Reduce...Health Catalyst
 
Pharmaceutical Industry Overview
Pharmaceutical Industry OverviewPharmaceutical Industry Overview
Pharmaceutical Industry OverviewDemetris Iacovides
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Case Study: Big Data Analytics
Case Study: Big Data AnalyticsCase Study: Big Data Analytics
Case Study: Big Data AnalyticsAbhinav Das
 
Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...
Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...
Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...Pistoia Alliance
 
Zeller Edm Summit Agile Deployment Of Predictive Analytics
Zeller Edm Summit   Agile Deployment Of Predictive AnalyticsZeller Edm Summit   Agile Deployment Of Predictive Analytics
Zeller Edm Summit Agile Deployment Of Predictive AnalyticsRonald.Ramos
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom IndustryCloudera, Inc.
 
Agile 2013 presentation, tom grant
Agile 2013 presentation, tom grantAgile 2013 presentation, tom grant
Agile 2013 presentation, tom grantTom Grant
 
20160512 predictive and adaptive approach
20160512   predictive and adaptive approach20160512   predictive and adaptive approach
20160512 predictive and adaptive approachSilvia Fragola
 
Bio variance j_scheiber_bioit_repurposingworkshop2013_draft
Bio variance j_scheiber_bioit_repurposingworkshop2013_draftBio variance j_scheiber_bioit_repurposingworkshop2013_draft
Bio variance j_scheiber_bioit_repurposingworkshop2013_draftJosef Scheiber
 
HealthCare Data Mining and Natural Language Processing
HealthCare Data Mining and Natural Language ProcessingHealthCare Data Mining and Natural Language Processing
HealthCare Data Mining and Natural Language ProcessingNehal (Neil) Shah
 
Data Mining A Healthcare Database
Data Mining A Healthcare DatabaseData Mining A Healthcare Database
Data Mining A Healthcare Databasebrucco
 

En vedette (20)

Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma
 
Analytics in Pharmaceutical Industry
Analytics in Pharmaceutical IndustryAnalytics in Pharmaceutical Industry
Analytics in Pharmaceutical Industry
 
Improving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutionsImproving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutions
 
New Pharma Market Reality - Predictive Analytics is the Solution
New Pharma Market Reality - Predictive Analytics is the SolutionNew Pharma Market Reality - Predictive Analytics is the Solution
New Pharma Market Reality - Predictive Analytics is the Solution
 
Application of BI in pharmaceutical industry
Application of BI in pharmaceutical industryApplication of BI in pharmaceutical industry
Application of BI in pharmaceutical industry
 
Data Mining in Healthcare: How Health Systems Can Improve Quality and Reduce...
Data Mining in Healthcare:  How Health Systems Can Improve Quality and Reduce...Data Mining in Healthcare:  How Health Systems Can Improve Quality and Reduce...
Data Mining in Healthcare: How Health Systems Can Improve Quality and Reduce...
 
Pharmaceutical Industry Overview
Pharmaceutical Industry OverviewPharmaceutical Industry Overview
Pharmaceutical Industry Overview
 
Introduction data mining
Introduction data miningIntroduction data mining
Introduction data mining
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Case Study: Big Data Analytics
Case Study: Big Data AnalyticsCase Study: Big Data Analytics
Case Study: Big Data Analytics
 
Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...
Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...
Pistoia Alliance Debates: SEND, the CDISC Standard for Exchange of Nonclinica...
 
Cisp dm
Cisp dmCisp dm
Cisp dm
 
Zeller Edm Summit Agile Deployment Of Predictive Analytics
Zeller Edm Summit   Agile Deployment Of Predictive AnalyticsZeller Edm Summit   Agile Deployment Of Predictive Analytics
Zeller Edm Summit Agile Deployment Of Predictive Analytics
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
 
Agile 2013 presentation, tom grant
Agile 2013 presentation, tom grantAgile 2013 presentation, tom grant
Agile 2013 presentation, tom grant
 
20160512 predictive and adaptive approach
20160512   predictive and adaptive approach20160512   predictive and adaptive approach
20160512 predictive and adaptive approach
 
Bio variance j_scheiber_bioit_repurposingworkshop2013_draft
Bio variance j_scheiber_bioit_repurposingworkshop2013_draftBio variance j_scheiber_bioit_repurposingworkshop2013_draft
Bio variance j_scheiber_bioit_repurposingworkshop2013_draft
 
Medical data mining
Medical data miningMedical data mining
Medical data mining
 
HealthCare Data Mining and Natural Language Processing
HealthCare Data Mining and Natural Language ProcessingHealthCare Data Mining and Natural Language Processing
HealthCare Data Mining and Natural Language Processing
 
Data Mining A Healthcare Database
Data Mining A Healthcare DatabaseData Mining A Healthcare Database
Data Mining A Healthcare Database
 

Data mining (DM) in the pharmaceutical industry

  • 1. Data Mining in the Pharmaceutical Industry
  • 2. Introduction • Data mining is the process of extracting information from large data sets through the use of algorithms and techniques drawn from the fields of statistics, machine learning and database management systems. • “Mining” means to find something that already exists. • Therefore, data mining can be defined as a process of identifying hidden patterns, relationships and trends within data. • Traditional methods often involve: 1) manual work 2) interpretation of data.
  • 3. • Data mining, popularly known as knowledge discovery in large data, • enables organizations to make calculated decisions by • assembling • accumulating • analyzing and • accessing corporate data.
  • 4. [Figure: the knowledge discovery process. Labels: Raw Data, Integration, Data Warehouse, Target Data, Transformed Data, Patterns and Rules, Interpretation & Evaluation, Understanding, Knowledge]
  • 5. • The scope of pharmaceutical applications is large and may involve drug manufacturing processes as well as data processing. • Data processing and analysis is a key area in the pharmaceutical industry. • The vision of the pharmaceutical industry that can be achieved with data mining: • pharmaceutical companies deliver drugs and develop test kits (including genetic tests) and computer programs to deliver the best drug to the patient.
  • 6.
  • 7. Pharmaceutical companies can also apply data mining methods to huge masses of genomic data to predict how a patient’s genetic makeup determines his or her response to a drug therapy. Genomic data: the complete set of chromosomal and extrachromosomal genes of an organism, a cell, an organelle or a virus; the complete DNA component of an organism.
  • 8. It uses a variety of tools, such as: • Query and reporting tools
  • 9. Analytical processing tools • Used to analyze database information from multiple database systems at one time.
  • 10. Decision Support System (DSS) tools. • Decision support systems (DSS) are defined as interactive computer-based systems intended to help decision makers use data and models to identify problems, solve problems and make decisions.
  • 11. DATA MINING TECHNIQUES. • Many organizations generate mountains of data about their newly discovered drugs, their performance reports, etc. • This data is a strategic resource. Making the most of these strategic resources will lead to • improving the quality of the pharma industry.
  • 12. • Six important steps in the data mining process: 1. Problem definition. 2. Knowledge acquisition. 3. Data selection. 4. Data preprocessing. 5. Analysis and interpretation. 6. Reporting and use.
  • 13. The data mining process can also be identified as: 1. Definition of the objectives of the analysis. 2. Selection and pretreatment of the data. 3. Exploratory analysis. 4. Specification of the statistical methods. 5. Analysis of the data. 6. Evaluation and comparison of methods. 7. Interpretation of the chosen model.
  • 14. 1. Definition of the objectives of the analysis. Understanding the project objectives and requirements from a business perspective and then converting this knowledge into a data mining problem definition with a preliminary plan designed to achieve the objectives.
  • 15. Relevant data sources for the pharma industry are: •clinical data (patient data, pharmaceutical data, medical treatments, length of stay); •administrative data (staff skills, overtime, nursing care hours, staff sick leave); • financial data (treatment costs, drug costs, staff salaries, accounting, cost-effectiveness studies); and • organizational data (room occupation, facilities, equipment).
  • 16. Data mining is used to support: •The clinicians at the point of care delivery; •The controlling of clinical treatment pathways; •The administrative and management tasks; and •Efficient management of organizational and financial data.
  • 17. Associations, Mining Frequent Patterns. • These methods identify rules of affinity among collections of items. • Rules of affinity: relationships among data, i.e. patterns that occur frequently in the data being mined. • The applications of association rules include • market basket analysis • attached mailing in direct marketing • fraud detection • department store floor/shelf planning, etc.
  • 18. •Association of training undertaken diseases with drugs •Association and analysis of staff movements •Application tracking mechanism in physicians adopting drugs with customer’s prescription
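The idea of a frequent pattern from the two slides above can be illustrated with the two usual association-rule measures, support and confidence. The following is a minimal sketch on made-up data: the basket contents, the drug names `drug_A` to `drug_C`, and the helper names are all assumptions for illustration and do not come from the slides.

```python
from itertools import combinations

# Hypothetical prescription "baskets": each set lists drugs dispensed together.
baskets = [
    {"drug_A", "drug_C"},
    {"drug_A", "drug_B", "drug_C"},
    {"drug_B", "drug_C"},
    {"drug_A"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

def frequent_pairs(baskets, min_support):
    """All item pairs whose support reaches `min_support`."""
    items = sorted(set().union(*baskets))
    result = {}
    for pair in combinations(items, 2):
        s = support(set(pair), baskets)
        if s >= min_support:
            result[pair] = s
    return result

pairs = frequent_pairs(baskets, min_support=0.5)
rule_conf = confidence({"drug_C"}, {"drug_A"}, baskets)
```

Real association-rule miners (Apriori, FP-growth) avoid enumerating every pair, but the measures they report are the same two quantities.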
  • 19. Classification And Prediction. • Classification and prediction are two data analysis techniques that are used to describe data classes and to predict future data classes. • E.g. a credit card company whose customers' credit history is known can classify its customer records as • Good, Medium, or Poor.
  • 20. • Predicting consumer behavior • Predicting the likelihood of success in a drug adoption process • Predicting the percentage accuracy in the performance of a drug • Classifying historical health records • Predicting which types of drugs are most likely to be retained, most likely to be dropped, and most likely to have their composition changed.
  • 21. Predicting pharma product behavior and attitude •Predicting demand projections by seasonal variations •Predicting the performance progress of segments throughout the performance period •Identifying the best profile for different drugs •Classify trends of movements through the organization for successful/unsuccessful patient historical records •Categorization of drugs, diseases and patients.
  • 22. • Decision trees and neural-network-based classification schemes are very useful in the pharma industry.
  • 23. • Decision trees: a decision tree is a common knowledge representation used for classification. • In classification, one is given data from a specific instance, and the decision tree predicts, based on that data, to which of two or more classes the instance belongs. • Each instance contains data from multiple attributes. • Training instances are collections of previously acquired data that have already been sorted into class labels. • The tree is built by determining which tests best divide the instances into separate classes.
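The step "determining which tests best divide the instances into separate classes" can be sketched as a search for the single best threshold test. The slide does not say which purity measure is used; the sketch below assumes Gini impurity, and the patient records, attribute names and labels are invented for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (attribute, threshold, weighted_gini) for the best test
    of the form row[attribute] <= threshold, or None if no test splits the data."""
    best = None
    for attr in rows[0]:
        for threshold in sorted({row[attr] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[attr] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[attr] > threshold]
            if not left or not right:
                continue  # a test that sends everything one way is useless
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (attr, threshold, score)
    return best

# Hypothetical training instances: attributes plus a known class label.
rows = [
    {"age": 34, "dose_mg": 10},
    {"age": 41, "dose_mg": 20},
    {"age": 67, "dose_mg": 10},
    {"age": 72, "dose_mg": 20},
]
labels = ["responder", "responder", "non-responder", "non-responder"]

split = best_split(rows, labels)  # the root test of the tree
```

A full tree would apply `best_split` recursively to the left and right groups until each group is pure or too small to split.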
  • 24.
  • 25. • Neural Networks – Learn through training – Resemble biological networks in structure – Can produce very good predictions – Not easy to use or to understand – Cannot deal with missing data
  • 26. Uses a Bayesian neural network • The prior probability is the probability that any report contains a reference to the adverse event • The posterior probability is the probability that a report links the drug and the adverse event • Determines the “strength” of the link between adverse event and drug (called the Information Component or IC) • More complicated than it appears: a patient may consume multiple drugs, so which one caused the adverse event?
  • 27. Bayesian Neural Network [Figure: Adverse Event and Drug, connected by the strength of the link between adverse event and drug]
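The Information Component compares how often the adverse event appears in reports mentioning the drug with how often it appears in all reports, on a base-2 log scale. The sketch below is a plain frequency estimate under that reading; it assumes all counts are positive, leaves out the Bayesian priors a real system would place on the probabilities, and uses invented report counts.

```python
import math

def information_component(n_drug_event, n_drug, n_event, n_total):
    """Frequency estimate of IC = log2(P(event | drug) / P(event)).

    n_drug_event: reports mentioning both the drug and the event
    n_drug:       reports mentioning the drug
    n_event:      reports mentioning the event
    n_total:      all reports in the database
    """
    prior = n_event / n_total          # P(event) over all reports
    posterior = n_drug_event / n_drug  # P(event | drug)
    return math.log2(posterior / prior)

# Hypothetical counts, for illustration only.
ic = information_component(n_drug_event=20, n_drug=500, n_event=100, n_total=10_000)
```

A positive IC means the event is reported with the drug more often than expected by chance; an IC near zero means no link is visible in the reports.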
  • 28. • Classification works on discrete and unordered data, while prediction works on continuous data. • E.g. discrete data (music format: number sold): CD albums: 140; CD singles: 70; Downloads: 55; Vinyl: 5; Total sales: 270. • This is called discrete data because the units of measurement (for example, CDs) cannot be split up; there is nothing between 1 CD and 2 CDs. • E.g. continuous data (distance in miles): 0.1, 0.2, 0.6, 1.1, 1.2, 1.8, 2.0, 2.7, 3.4, 4.6, 6.2, 8.0, 12.1, 14.2. • This data is called continuous because the scale of measurement (distance) has meaning at all points between the numbers given, e.g. we can travel a distance of 1.2, 1.85 or even 1.632 miles.
  • 29. • Regression is often used, as it is a statistical method for numeric prediction. • Primary emphasis should be placed on measurement accuracy and predictive efficiency for any new drug discovery. • Simple or multiple regression is the basic prediction model that enables a decision maker to forecast each criterion status based on predictor information. • Neural network technology is useful in different areas of business.
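Simple regression with one predictor can be sketched with the ordinary least-squares formulas. The dose and response values below are invented for illustration and were chosen to lie exactly on a line; the function name `fit_line` is likewise an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical predictor (dose) and criterion (measured response).
doses = [10, 20, 30, 40]
responses = [12, 22, 32, 42]

slope, intercept = fit_line(doses, responses)
predicted = slope * 50 + intercept  # forecast for an unseen dose
```

Multiple regression extends the same idea to several predictors and is usually solved with a linear algebra library rather than by hand.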
  • 30. CLUSTERING. • It is a method by which similar records are grouped together. • Clustering is usually used to mean segmentation. • An organization can build a hierarchy of classes that group similar events. • Using clustering, patients can be grouped based on age, name, diseases, etc. • In business, clustering helps identify groups with similarities and • characterize customer groups based on purchasing patterns, etc.
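Grouping patients by age can be sketched with a one-dimensional k-means loop. The slide does not name a clustering algorithm, so k-means is an assumption here, as are the ages and the starting centers.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of its group."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical patient ages and two starting guesses for the group centers.
ages = [22, 25, 27, 61, 64, 70]
centers, groups = kmeans_1d(ages, centers=[20, 60])
```

With more than one attribute, the absolute difference would be replaced by a distance over all attributes, which is what library implementations do.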
  • 31. DATA MINING AND STATISTICS. • The ability to build a successful predictive model depends on past data. • Data mining is designed to learn from past successes and failures so that it can predict what will happen next (future prediction). • The data mining tool checks the statistical significance of the predicted patterns and reports it.
  • 32. The difference between data mining and statistics • Data mining automates statistical processes that would otherwise require several tools. • Statistical inference is assumption driven, in the sense that a hypothesis is formed and tested against data. • Data mining, in contrast, is discovery driven. That is, the hypothesis is automatically extracted from the given data.
  • 33. Data mining can answer analytical questions such as: • What new molecules have been discovered, and what issues surround them? • What factors or combinations are directly impacting the drugs? • What are the best and most outstanding drugs? • Which drugs are likely to be retained? • How can resources be optimally allocated to ensure effectiveness and efficiency? etc.
  • 34. • An intelligent text mining system could provide a platform for extracting and managing specific information at the entity level. • For example, information pertaining to • genes • proteins • diseases • organisms • chemical substances, etc. can be analytically extracted for patterns.
  • 35. It would also provide insights into interrelationships such as • protein-protein • gene-gene • protein-chemical • gene-disease and • drug-drug interactions. • Text mining can be applied to biomedical literature, clinical documents and other medical literary sources for data curation and database population in a semi-automated manner.
  • 36. Applications Of Data Mining In The Pharmaceutical Industry • A lot of information is hidden in legacy systems. • This information can be extracted with data mining. • Most of the time, however, this cannot be done directly from the legacy systems, because they were not built to answer unpredictable questions.
  • 37. • A user-interface may be designed to accept all kinds of information from the user (e.g. weight, sex, age, foods consumed, reactions reported, dosage, length of usage). • Then, based upon the information in the databases and the relevant data entered by the user, • a list of warnings or known reactions (accompanied by probabilities) should be reported. • Note that user profiles can contain large amounts of information, and efficient and effective data mining tools need to be developed to probe the databases for relevant information.
  • 38. • Secondly, the patient's (anonymous) profile should be recorded along with any adverse reactions reported by the patient, so that future correlations can be reported. • Over time, the databases will become much larger, and interaction data for existing medicines will become more complete. • The amount of existing pharmaceutical information (pharmacological properties, dosages, contraindications, warnings, etc.) is enormous; • however, this fact reflects the number of medicines on the market, rather than an abundance of detailed information about each product.
  • 39. One of the major problems with pharmaceutical data is a lack of information. • A food and drug administration department has estimated that • only about 1% of serious events are reported to it. Fear of litigation may be a contributing factor; • however, most health care providers simply don't have the time to fill out reports of possible adverse drug reactions.
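The two slides above describe a loop: look up known reactions for a user profile, report them with probabilities, then record the new report. The sketch below reduces the profile to a drug and an age group; the record layout, the drug name `drug_X` and the reported reactions are all invented for illustration.

```python
from collections import Counter

# Hypothetical anonymous reports: (drug, age_group, reaction or None if no reaction).
reports = [
    ("drug_X", "65+", "dizziness"),
    ("drug_X", "65+", None),
    ("drug_X", "65+", "dizziness"),
    ("drug_X", "18-64", None),
    ("drug_X", "65+", "nausea"),
]

def reaction_warnings(reports, drug, age_group):
    """List (reaction, estimated probability) for profiles matching the user,
    most frequent reaction first. Returns [] when no matching reports exist."""
    matching = [r for d, a, r in reports if d == drug and a == age_group]
    if not matching:
        return []
    counts = Counter(r for r in matching if r is not None)
    return [(reaction, n / len(matching)) for reaction, n in counts.most_common()]

def record_report(reports, drug, age_group, reaction=None):
    """Store the user's own outcome so that later lookups can use it."""
    reports.append((drug, age_group, reaction))

warnings = reaction_warnings(reports, "drug_X", "65+")
```

Because so few events are reported in practice, probabilities estimated this way would be rough and should be shown with the number of matching reports.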
  • 40. •Furthermore, it is expensive and time consuming for pharmaceutical companies to perform a thorough job of data collection, especially when most of the information is not required by law. •Finally, one should note that the food and drug administration department does not require manufacturers to test new medicines for potential interactions.
  • 41. Stages of drug development • Discovery finds new drugs • Development tests and predicts drug behavior • Clinical trials test the drug in humans and • Commercialization takes the drug and sells it to likely consumers (doctors and patients).
  • 42. APPLICATIONS OF DATA MINING IN THE PHARMACEUTICAL INDUSTRY
  • 43. 1) Clinical data analysis – clinical data analysis evaluates and streamlines from large amount of information. Data mining helps to see trends, irregularity, and risk during product development and launch. 2) Marketing and sales analysis –the identification of the most profitable product and allocation of marketing funds. Data mining here helps to examine consumer behavior in terms of prescription renewal and product purchases.
  • 44. 3) Customer analysis – using data mining one can develop more targeted customer profiles that focus not only on products, but also on the ability to pay for them by analyzing historical health trends in combination with demographics. 4) Target physicians who have high prescription rates of a certain drug or treatment with new drug information that treat complementary symptoms or conditions.
  • 45. DEVELOPMENT OF NEW DRUGS. • This can be achieved by clustering the molecules into groups according to the chemical properties of the molecules via cluster analysis. • every time a new molecule is discovered it can be grouped with other chemically similar molecules.
  • 46. • Mining can help us measure the chemical activity of a molecule against a specific disease, say tuberculosis, and find out which part of the molecule is causing the action. • In this way, parts of a vast number of molecules can be combined into a super molecule that keeps only the parts responsible for the action and inhibits the other parts. • This could greatly reduce the adverse effects associated with drug actions.
  • 47. • Researchers use high-speed screening to test tens, hundreds, or thousands of drugs very quickly. • The general goal is to find activity on relevant genes or to find drug compounds that have desirable characteristics. • The data mining techniques used in developing new drugs are clustering, classification, and neural networks. • The basic objective is to identify compounds with similar activity.
  • 48. • The reason is that compounds with similar activity behave similarly. • This approach works only when we already have a known compound and are looking for something better. • When we have no known compound but know the desired activity and want to find a compound that exhibits it, data mining comes to the rescue.
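The "known compound, looking for something better" case can be sketched as a similarity search. The slides do not name a similarity measure; the example below assumes the Tanimoto (Jaccard) coefficient over sets of structural features, which is a common choice in cheminformatics. The compound names and feature sets are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity: shared features / all features seen in either set."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical structural features of a compound with known activity.
known_active = {"aromatic_ring", "amide", "hydroxyl", "nitro"}

# Hypothetical screening library; names and features are invented for illustration.
library = {
    "cmpd_A": {"aromatic_ring", "amide", "hydroxyl"},
    "cmpd_B": {"aromatic_ring", "carboxyl"},
    "cmpd_C": {"aromatic_ring", "amide", "hydroxyl", "nitro"},
    "cmpd_D": {"sulfonyl", "ether"},
}

# Rank library compounds from most to least similar to the known active compound.
ranked = sorted(library, key=lambda name: tanimoto(known_active, library[name]), reverse=True)
```

Compounds at the top of `ranked` would be the first candidates to test for similar activity.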
  • 49. DEVELOPMENT: TESTING AND PREDICTING DRUG BEHAVIOR • Several issues affect the success of a drug and can impact its future development. 1) Adverse reactions to drugs are reported spontaneously and not in any organized manner. 2) We can only compare adverse reactions among our own company's drugs, not with drugs from competing firms. 3) We only have information on the patient taking the drug, not on the adverse reaction the patient is suffering from.
  • 50. Solution • All of this can be solved by creating a data warehouse for drug reactions and running business intelligence (BI) tools on it. • BI tools: business intelligence tools are software designed to retrieve, analyze, and report data. • This broad definition includes everything from spreadsheets, visual analytics, and querying software to data mining, warehousing, and decision engineering.
  • 51.
  • 52. • The drug undergoes testing in animals and human tissue to observe its effects and to determine how much of the drug is needed for the desired effect and how dangerous the drug is. • The data mining techniques that can be used here are classification and neural networks.
  • 53. • The goal here is to predict whether a treatment will aid patients. • If a drug will not aid patients, it serves no purpose. • Prediction of drug behavior applies when we have data supporting use of the drug and also have training data showing its effects (positive or negative). • The test should be able to predict which patients will benefit, for example which treatments help sickle cell anemia patients.
  • 54. How it works • Information such as gender, body weight, and disease state plays a crucial role. • This data is fed into a neural network, which predicts whether the patient will benefit from the drug. • Only one of two classifications, yes or no, is available in the training data. • The network is trained on the yes classifications and a snapshot of the network is taken. • The network is then trained on the no classifications and another snapshot is taken. • The output is yes or no, depending on whether the inputs are more similar to the yes or the no training data. • Example: ARTMAP.
  • 55. [Diagram] Inputs: weight, height, gender, blood pressure. • Imagine an array of weights, one for each “template”. • The template closest to the input is chosen. • Output: does the patient benefit? The path of “least resistance” is chosen for the output.
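The "closest template wins" idea can be sketched as follows. This is a deliberately simplified stand-in and not an ARTMAP implementation: it assumes one template per class (the mean of that class's training examples) and Euclidean distance. The feature scaling, the data, and the names `train` and `predict` are hypothetical.

```python
from math import dist

def mean_template(rows):
    """Average the feature vectors of one class into a single template."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(examples):
    """Build one template for the 'yes' examples and one for the 'no' examples."""
    templates = {}
    for label in ("yes", "no"):
        rows = [features for features, lab in examples if lab == label]
        templates[label] = mean_template(rows)
    return templates

def predict(templates, features):
    """Return the label whose template lies closest to the input."""
    return min(templates, key=lambda label: dist(features, templates[label]))

# Hypothetical, pre-scaled inputs in [0, 1]: (weight, height, gender, blood pressure).
training = [
    ((0.40, 0.55, 0.0, 0.30), "yes"),
    ((0.45, 0.60, 1.0, 0.35), "yes"),
    ((0.80, 0.50, 0.0, 0.85), "no"),
    ((0.75, 0.65, 1.0, 0.90), "no"),
]
templates = train(training)

# Will this (made-up) patient benefit from the drug?
result = predict(templates, (0.42, 0.58, 1.0, 0.33))
```

A real ARTMAP network learns several templates per class and adds new ones when no existing template matches well enough; the single-template version above only illustrates the final matching step.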
  • 56. CLINICAL TRIALS TEST THE DRUG IN HUMANS • The company tests the drug in actual patients on a larger scale. • The company has to keep track of data about patient progress. • Because the government wants to protect the health of citizens, many rules govern clinical trials. • In developed countries, a food and drug administration oversees trials. • Neural networks are among the data mining techniques that can be used here.
  • 57. • Here data is collected by the pharmaceutical company and undergoes statistical analysis to determine the success of the trial. • Data is generally reported to the food and drug administration and inspected closely. • Too many negative reactions might indicate that the drug is too dangerous. • An adverse event might be, for example, a medicine causing drowsiness.
  • 58. • The goal is to detect when too many adverse events occur or to detect a link between a drug and an adverse event. • Too many adverse events linked to a drug might indicate that the drug is too dangerous or that the health of patients is at risk. • Adverse events are reported to the food and drug administration when a link is suspected. • One can feed information on the adverse events associated with drugs into a neural network and let the network learn what is meant by ‘too many’.
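One way to make "too many" concrete is sketched below. The slide proposes a neural network; this example swaps in a simpler disproportionality measure, the proportional reporting ratio (PRR), which compares how often an event is reported for one drug against how often it is reported for all other drugs. The report counts are invented, and the cut-off of 2 is an illustrative assumption rather than a value taken from the slides.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of spontaneous reports (assumes c > 0).

    a: reports naming the drug and the event of interest
    b: reports naming the drug and any other event
    c: reports naming other drugs and the event of interest
    d: reports naming other drugs and any other event
    """
    rate_drug = a / (a + b)      # share of the drug's reports that mention the event
    rate_others = c / (c + d)    # same share among all other drugs
    return rate_drug / rate_others

def is_signal(a, b, c, d, threshold=2.0):
    """Flag the drug-event pair when the PRR reaches the chosen threshold."""
    return proportional_reporting_ratio(a, b, c, d) >= threshold

# Hypothetical counts: 3% of this drug's reports mention drowsiness vs 1% for other drugs.
counts = dict(a=30, b=970, c=100, d=9900)
ratio = proportional_reporting_ratio(**counts)
flagged = is_signal(**counts)
```

With these made-up counts the ratio is about 3, so the pair would be flagged for closer review; a flag indicates a reporting imbalance worth investigating, not proof that the drug causes the event.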
  • 59. Benefits • Research stage – instead of trial and error, data mining can help find drugs that have desirable activity. • Development stage – data mining can help predict who will benefit from the drug. • Clinical trials stage – data mining protects patients and helps regulate drug testing. • Commercialization stage – data mining can optimize the use of sales resources such as manpower and advertising.
  • 60. CONCLUSION • Due to increased computerization and consumer/patient awareness, reporting (via the internet) by health care workers can easily be facilitated. • Data collection in hospitals and extended care facilities is not difficult, and this information is of high quality, since such institutions typically have tailored diets for their patients and maintain accurate records of treatments, lab tests, and administration of prescriptions. • Furthermore, given the popularity of the internet, it is relatively easy for consumers to voluntarily fill in and submit detailed profiles of themselves.
  • 61. • Data mining techniques are still seldom used in the pharmaceutical environment. • Data mining can help find drugs that have desirable activity and predict who will benefit from a drug. • Data mining protects patients, helps regulate drug testing, and optimizes the use of sales resources such as manpower and advertising.