SlideShare une entreprise Scribd logo
1  sur  7
Télécharger pour lire hors ligne
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
            Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                       Vol. 2, Issue4, July-August 2012, pp.1303-1309
    Data Mining Techniques and Their Implementation in Blood
                    Bank Sector –A Review

                Ankit Bhardwaj1, Arvind Sharma2, V.K. Shrivastava3
                    1
                        M.Tech Scholar, Department of CSE, HCTM, Kaithal, Haryana, India
                    2
                        Head, Department of Computer Science, DAV Kota, Rajasthan, India
                            3
                              Head, Department of I.T, HCTM, Kaithal, Haryana, India


ABSTRACT                                                  and challenge for data mining. Data mining is a
         This paper focuses on the data mining            process of the knowledge discovery in databases and
and the current trends associated with it. It             the goal is to find out the hidden and interesting
presents an overview of data mining system and            information [1]. Various important steps are
clarifies how data mining and knowledge                   involved in knowledge discovery in databases
discovery in databases are related both to each           (KDD) which helps to convert raw data into
other and to related fields. Data Mining is a             knowledge. Data mining is just a step in KDD
technology used to describe knowledge discovery           which is used to extract interesting patterns from
and to search for significant relationships such as       data that are easy to perceive, interpret, and
patterns, association and changes among                   manipulate. Several major kinds of data mining
variables in databases. This enables users to             methods, including generalization, characterization,
search, collect and donate blood to the patients          classification, clustering, association, evolution,
who are waiting for the last drop of the blood            pattern matching, data visualization, and meta-rule
and are nearby to death. We have also tried to            guided mining will be reviewed. The explosive
identify the research area in data mining where           growth of databases makes the scalability of data
further work can be continued.                            mining techniques increasingly important. Data
                                                          mining algorithms have the ability to rapidly mine
Keywords:        Data mining, KDD, Association            vast amount of data.
Rules, Classification, Clustering, Prediction.
                                                                   The paper is organised as follows: In
I. INTRODUCTION                                           section 2 we use the data mining process and several
          In today’s computer age data storage has        stages of the KDD. Section 3 presents the literature
been growing in size to unthinkable ranges that only      survey. Section 4 discusses about the research
computerized methods applied to find information          methodology to be proposed in our work. Finally,
among these large repositories of data available to       section 5 concludes the paper.
organizations whether it was online or offline. Data
mining was conceptualised in the 1990s as a means         II. DATA MINING AND KDD PROCESS
of addressing the problem of analyzing the vast           Data mining is a detailed process of analyzing large
repositories of data that are available to mankind,       amounts of data and picking out the relevant
and being added to continuously. Data mining is           information. It refers to extracting or mining
necessary to extract hidden useful information from       knowledge from large amounts of data [2]. Data
the large datasets in a given application. This           Mining is the fundamental stage inside the process
usefulness relates to the user goal, in other words       of extraction of useful and comprehensible
only the user can determine whether the resulting         knowledge, previously unknown, from large
knowledge answers his goal. The growing quality           quantities of data stored in different formats, with
demand in the blood bank sector makes it necessary        the objective of improving the decision of
to exploit the whole potential of stored data             companies, organizations where the data can be
efficiently, not only the clinical data and also to       collected. However data mining and overall process
improve the behaviours of the blood donors. Data          known as Knowledge Discovery from Databases
mining can contribute with important benefits to the      (KDD), is usually an expensive process, especially
blood bank sector, it can be a fundamental tool to        in the stages of business objectives elicitation, data
analyse the data gathered by blood banks through          mining objectives elicitation, and data preparation.
their information systems. In recent years, along         This is especially the case each time data mining is
with development of medical informatics and               applied to a blood bank. Data Mining can be defined
information technology, blood bank information            as the extraction or fetching of the relevant
system grows rapidly. With the growth of the blood        information i.e. Knowledge from the large
banks, enormous Blood Banks Information Systems           repositories of data. That’s the reason it is also
(BBIS) and databases are produced. It creates a need      called as Knowledge Mining. However many

                                                                                               1303 | P a g e
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
            Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                       Vol. 2, Issue4, July-August 2012, pp.1303-1309
synonyms are linked with Data Mining viz.                           Many people treat the data mining as a synonym for
Knowledge Mining from Data, Knowledge                               another popularly used term, Knowledge Discovery
extraction, Data/ pattern analysis, Data archaeology                from Data. The following figure (fig.1), shows the
and Data dredging. Data Mining is also popularly                    data mining as simply an essential step in the
known as Knowledge Discovery in data bases                          process of KDD i.e. Knowledge Discovery from
(KDD), refers to the nontrivial extraction of                       Data[3].
implicit, previously unknown and potentially useful
information from data in Databases.


             Selection                      Preprocessing                        Transformation


                                                                                                  Transformed data




                                                                                                                               Data Mining
                                                             Preprocessed data
   INITIAL               TARGET    TARGET
    DATA                  DATA      DATA




                                                                                                        PATTERNS
                                                 KNOWLEDGE

                                                                            Evaluation




                                  Fig.1: Data mining is the core of KDD Process


          The KDD process includes selecting the                    2.1 STEPS OF KDD PROCESS
data needed for data mining process & may be                                The knowledge discovery in databases
obtained from many different & heterogeneous data                   (KDD) process comprises of a few steps leading
sources. Preprocessing includes finding incorrect or                from raw data collections to some form of new
missing data. There may be many different activities                knowledge. The iterative process contains of the
performed at this time. Erroneous data may be                       following steps:
corrected or removed, whereas missing data must be
supplied. Preprocessing also include: removal of                    1. Data cleaning
noise or outliers, collecting necessary information to                        It also known as data cleansing, it is a
model or account for noise, accounting for time                     fundamental step in which noise data and irrelevant
sequence information and known changes.                             data are removed from the collection.
Transformation is converting the data into a
common format for processing. Some data may be                      2. Data integration
encoded or transformed into more usable format.                              In this step, multiple data sources, often
Data reduction, dimensionality reduction (e.g.                      heterogeneous, may be combined in a common
feature selection i.e. attribute subset selection,                  source.
heuristic method etc) & data transformation method                  3. Data selection
(e.g. sampling, aggregation, generalization etc) may                         At this step, the data relevant to the
be used to reduce the number of possible data values                analysis is decided on and retrieved from the data
being considered. Data Mining is the task being                     collection.
performed, to generate the desired result.
Interpretation/Evaluation is how the data mining                    4. Data transformation
results are presented to the users which are                                  It is also known as data consolidation, it is
extremely important because the usefulness of the                   a stage in which the selected data is transformed into
result is dependent on it. Various visualization &                  forms appropriate for the mining procedure.
GUI strategies are used at this step. Different kinds
of knowledge requires different kinds of                            5. Data mining
representation e.g. classification, clustering,                              It is the core step of KDD process in which
association rule etc.[4]                                            clever techniques are applied to extract patterns
                                                                    potentially useful.




                                                                                                                     1304 | P a g e
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
               Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                          Vol. 2, Issue4, July-August 2012, pp.1303-1309
  6. Pattern evaluation                                     4. Discovering Patterns and Rules: It concern with
           In this step, strictly interesting patterns      pattern detection, the aim is spotting fraudulent
  representing knowledge are identified based on            behavior by detecting regions of the space defining
  given measures.                                           the different types of transactions where the data
                                                            points significantly different from the rest.
  7. Knowledge representation
           This is the final step in which the              5. Retrieval by Content: It is finding pattern
  discovered knowledge is visually represented to the       similar to the pattern of interest in the data set. This
  user. This is a very essential step that uses             task is most commonly used for text and image data
  visualization techniques to help users understand         sets.
  and interpret the data mining results.
                                                           III. LITERATURE SURVEY
  2.2 DATA MINING MODELS                                        In this section, lot of research works have been
                                    The data mining        recorded from past few years. They are presented
  models and tasks are shown in figure 2 given below:      here:
                                                           [8] In June 2009, work is carried out on management
                       Data Mining                         information system that helps managers in providing
                         Models                            decision making in any organization. This MIS is
                                                           basically all about the process of collecting,
                                                           processing, storing and transmitting the relevant
                                                           information that is just like the life blood of the
                                                           organization that uses a data mining approach.
  Predictive Model                    Descriptive Model    [7] In June 2009, the research work entitled on web
    (Supervised)                       (Unsupervised)      based information system for blood donation, all the
                                                           information regarding blood donation are available on
                                                           world wide web i.e. online systems that
                                                           communicate/interconnect all the blood donor
Classification                       Clustering            societies in a country using LAN Technology.
                                     Association Rule
                                                           [6] In July 2009, the research work entitled on
Regression
                                                           Segmentation, Reconstruction and Analysis of Blood
Prediction                           Sequence discovery
                                                           thrombus formation in 3D-2 photons microscopy
Time Series Analysis                 Summarization         images. In this work, method and platform are
                                                           applied to differentiate between the thromby formed
             Fig.2: Data Mining Models and Tasks           in wild type and low FV11 mice. The high resolution
                                                           quantitative structural analysis using some algorithms
            The predictive model makes prediction          that provide a new matrix, that is likely to be critical
  about unknown data values by using the known             to categorize and understanding bio-medically
  values. The descriptive model identifies the patterns    relevant characteristics of thromby. Also, in this
  or relationships in data and explores the properties     work, composition of different components on the
  of the data examined.                                    clot surface and number of voxels in each clot
                                                           component has been computed.
  2.3 DATA MINING TASKS                                    [5] In December 2009, the work proposed on an
           There are different types of data mining        application to find spatial distribution of blood
  tasks depending on the use of data mining results.       donors from the blood bank information system,
  These data mining tasks are categorised as follows:      blood donation and transfusion services are carried
                                                           out and help the patients to access the availability of
  1. Exploratory Data Analysis: It is simply               blood from anywhere.
  exploring the data without any clear ideas of what       [9] In the year 2009, the research was based
  we are looking for. These techniques are interactive     geographical variation in correlates of blood donor
  and visual.                                              turn out rates-an investigation of Canadian
                                                           metropolitan areas. In this work, Canadian blood
  2. Descriptive Modeling: It describes all the data, it   services i.e. an organization that aims the collecting
  includes models for overall probability distribution     and distributing the blood supply across the country.
  of the data, partitioning of the p-dimensional space     Accessibility variables are introduced and calculated
  into groups and models describing the relationships      using a 2-step floating catchment area in order to
  between the variables.                                   analyse the accessibility of services. The regression
                                                           technique is also used.
  3. Predictive Modeling: This model permits the           [10] In the year 2010, the research is accomplished on
  value of one variable to be predicted from the           application of CART algorithm in blood donor’s
  known values of other variables.                         classification. In this work, one of the popular data
                                                                                                  1305 | P a g e
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
             Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                        Vol. 2, Issue4, July-August 2012, pp.1303-1309

mining technique i.e. Classification is used and through   [14] In year 2011, the work was proposed on an
the use of CART application, a model is created that       intelligent system for improving performance of
determines the donor’s behavior.                           blood donation. In this work, many important
[11] In the year 2010, the data mining tool is used to     techniques i.e. Clustering, K-means classification is
extract information from PPI (Protein-Protein              adopted that improves the performance of blood
Interaction) systems and developed a PPI search            donation services.
system known as PP Look that is an effective tool          [19] In January 2012, the data mining techniques are
4-tier information extraction systems based on a full      applied on medical databases and clinical databases
sentence parsing approach. In this paper, researchers      which are used to store huge amount of information
introduced a useful tool that is PP Look which uses        regarding patient’s diagnosis, lab test results,
an improved keyword, dictionary pattern matching           patient’s treatments etc. which is a way of mining the
algorithm to extract protein-protein interaction           medical information for doctors and medical
information from biomedical literature. Some visual        researchers. In all developed countries, it is found
methods were adopted to conclude PPI in the form of        that the routine health tests are very common among
3D stereoscopic displays.                                  all adults. It is also determined that precautionary
[12] In December 2010, the work is proposed on             measures are less expensive rather than the
binary classifiers for health care databases i.e. a        treatments and also facilitates a better chance for
comparative study of data mining classification            patient’s treatment at earlier stages. In this research
algorithms in the diagnosis of breast cancer. This         work, five medical databases are used and
work helps in uncovering the valuable knowledge            experimental results are computed using data mining
hidden behind them and also helping the decision           software tool.
makers to improve the health care services. In this        [20] In February 2012, the work is carried out
work, the presented experiment provides medical            through the Health Care Applications to diagnose
doctors and health care planers a tool to help them        different diseases using Red Blood Cells counting. In
quickly make sense of vast clinical databases.             this work, the researchers presented the automatic
[15] In February 2011, the research paper presented        process of RBC count from an image and some of the
on classifying blood donors using data mining              data mining techniques like Segmentation,
techniques, the work is performed on blood group           Equalization and K-means clustering are used for
donor data sets using classification technique.            preprocessing of the images. Some diseases which
[13] In August 2011, the research work carried out on      are related to sickness of RBC count also predicted.
interactive knowledge discovery in blood transfusion       [21] In this work, the main purpose of the system is
data set, the work is done through conducting data         to guide diabetic patients during the disease. Diabetic
mining experiments that help the health professionals      patients could benefit from the diabetes expert system
in better management of blood bank facility.               by entering their daily glucoses rate and insulin
[16] In year 2011, the work presented on rule              dosages; producing a graph from insulin history;
extraction for blood donors with fuzzy sequential          consulting their insulin dosage for next day. It’s also
pattern mining, fuzzy sequential pattern algorithm is      tried to determine an estimation method to predict
used to extract rules from blood transfusion service       glucose rate in blood which indicates diabetes risk. In
center data set that predicts the behaviour of donor in    this work, the WEKA data mining tool is used to
the future.                                                classify the data and the data is evaluated using
[17] In August 2011, the research work proposed on         10-fold cross validation and the results are compared.
a comparison of blood donor classification data
mining models, which uses decision tree to examine          IV. RESEARCH METHODOLOGY
the blood donor’s classification. In this work,
                                                            4.1 Data Mining Techniques in Blood Bank
comparison between extended RVD based model and
                                                            Sector
DB2K7 procedures are carried out and it was
discovered that RVD classification is better than           In our proposed methodology, we suppose a blood
DB2K7 in aspects of recalling and precision                 bank system has the following objectives:
capability.                                                  Increase blood donor’s rate and their attitude.
[18] In the September 2011, the research work                Attract more blood donors to donate blood.
carried out on real time blood donor management              Assist blood bank professionals in making
using dash boards based on data mining models, the               policy on the acquisition of blood donors and
data mining techniques are used to examine the blood             new blood banks.
donor classification and promote it to development of       There are several major data mining techniques had
real time blood donor management using dash boards          been developed and used in data mining research
with blood profile or RVD profile and geo-location          projects recently including classification, clustering,
data.                                                       association, prediction and sequential patterns.
                                                            These techniques and methods in data mining need
                                                            brief mention to have better understanding.
                                                                                                  1306 | P a g e
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
            Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                       Vol. 2, Issue4, July-August 2012, pp.1303-1309
4.1.1 Classification                                     specific target variable. Association rule has the
The estimation and prediction may be viewed as           several algorithms like: Apriori, CDA, DDA,
types of classification. The problem usually is          interestingness measure etc. Association rules are
evaluating the training data set and then we apply to    if/then statements that help uncover relationships
the future model.                                        between seemingly unrelated data in a relational
The following table1 shows different classification      database or other information repository. An
algorithms:                                              example of an association rule can be ‘if a customer
                                                         buys a dozen eggs, he is 80% likely to also purchase
    Type                   Name of algorithm
                                                         milk.’ An association rule has two parts, an
   Statistical            Regression
                                                         antecedent (if) and a consequent (then). An
                          Bayesian
                                                         antecedent is an item found in the data. A
     Distance             Simple distance
                                                         consequent is an item that is found in combination
                          K-nearest neighbors
                                                         with the antecedent.
     Decision             ID3                            The association rules are created by analyzing data
     Tree                 C4.5                           for frequent if/then patterns and using the
                          CART                           criteria support and confidence to identify the most
                          SPRINT                         important relationships. The support is an indication
     Neural               Propagation                    of how frequently the data items appear in the
     Network              NN Supervised learning         database. The confidence indicates the number of
                          Radial base function           times the if/then statements have been found to be
                          network                        true.
   Rule based             Genetic rules from DT          Types of Association Rule Techniques are as
                          Genetic rules from NN          follows:
                          Genetic rules without           Multilevel Association Rule
                          DT and NN                       Multidimensional Association Rule
           Table 1: Classification Algorithms             Quantitative Association Rule

4.1.2 Clustering                                         4.1.4 Prediction
Clustering technique is grouping of data, which is       The prediction as it name implied is one of the data
not predefined. By using clustering technique we         mining techniques that discovers relationship
can identify dense and sparse regions in object          between independent variables and relationship
space. The following table2 shows different              between dependent and independent variables. For
clustering algorithms:                                   instance, prediction analysis technique can be used
     Type               Name of algorithm                in blood donors to predict the behaviour for the
    Similarity and     Similarity    &   distance        future if we consider donor is an independent
    distance           measure                           variable, blood could be a dependent variable. Then
    measure                                              based on the historical data, we can draw a fitted
     Outlier            Outlier 3333                     regression curve that is used for donor’s behaviour
     Hierarchical      Agglomerative, Divisive           prediction. Regression technique can be adapted for
     Partitional       Minimum spanning tree             predication. Regression analysis can be used to
                       Squared matrix                    model the relationship between one or more
                       K-means                           independent variables and dependent variables. In
                       Nearest Neighbor                  data mining independent variables are attributes
                       PAM                               already known and response variables are what we
                       Bond Energy                       want to predict. Unfortunately, many real-world
                       Clustering with Neural            problems are not simply prediction.
                       Network                           Types of Regression Techniques are as follows:
    Clustering large BIRCH                                Linear Regression
    database           DB Scan                            Multivariate Linear Regression
                       CURE                               Nonlinear Regression
     Categorical       ROCK                               Multivariate Nonlinear Regression
            Table 2: Clustering Algorithms
                                                         4.1.5 Sequential Patterns
4.1.3 Association Rule                                   Sequential patterns analysis in one of data mining
The central task of association rule mining is to find   technique that seeks to discover similar patterns in
sets of binary variables that co-occur together          data transaction over a business period. The uncover
frequently in a transaction database, while the goal     patterns are used for further business analysis to
of feature selection problem is to identify groups of    recognize relationships among data.
that are strongly correlated with each other with a

                                                                                             1307 | P a g e
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
             Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                        Vol. 2, Issue4, July-August 2012, pp.1303-1309

                V. CONCLUSION
This paper presents an overview of data mining and        [11] Dr. Varun Kumar, Luxmiverma; ‘Binary
its techniques which have been used to extract            classifiers for Health Care databases - A
interesting patterns and to develop significant           comparative study of data mining classification
relationships among variables stored in a huge            algorithms in the diagnosis of Breast cancer’,
dataset. Data mining is needed in many fields to          IJCST, Vol. I, Issue 2, December 2010.
extract the useful information from the large amount      [12] Vikram Singh and Sapna Nagpal; ‘Interactive
of data. Large amount of data is maintained in every      Knowledge discovery in Blood Transfusion Data
field to keep different records such as medical data,     Set’; VSRD International Journal of Computer
scientific data, educational data, demographic data,      Science and Information Technology; Vol. I, Issue 8,
financial data, marketing data etc. Therefore,            2011.
different ways have been found to automatically           [13] Wen-ChanLee and Bor-Wen Cheng; ‘An
analyze the data, to summarize it, to discover and        Intelligent system for improving performance of
characterize trends in it and to automatically flag       blood donation’, Journal of Quality, Vol. 18, Issue
anomalies. The several data mining techniques are         No. II, 2011.
introduced by the different researchers. These            [14] P.Ramachandran et al. ‘Classifying blood
techniques are used to do classification, to do           donors using data mining techniques’; IJCST, Vol.
clustering, to find interesting patterns. In our future   I, Issue1, February 2011.
work, the data mining techniques will be                  [15] F. Zabihi et al. ‘Rule Extraction for Blood
implemented on blood donor’s data set for predicting      donators with fuzzy sequential pattern mining’; The
the blood donor’s behaviour and attitude, which have      Journal of mathematics and Computer Science; Vol.
been collected from the blood bank center.                II, Issue No. I, 2011.
                                                          [16] Shyam Sundaram and Santhanam T: ‘A
                                                          comparison of Blood donor classification data
REFERENCES                                                mining models’, Journal of Theoretical and Applied
 [1] Badgett RG: How to search for and evaluate           Information Technology, Vol.30, No.2, 31 August
 medical evidence. Seminars in Medical Practice           2011.
 1999, 2:8-14, 28.                                        [17] Shyam Sundaram and Santhanam T; ‘Real
 [2] Jaiwei Han and Micheline Kamber, ‘Data               Time Blood donor management using Dash boards
 Mining Concepts and Techniques’, Second Edition,         based on Data mining models’, International
 Morgan Kaufmann Publishers.                              Journal of Computing issues, Vol.8, Issue5, No.2,
 [3] ‘Data Mining Introductory and Advanced               September 2011.
 Topics’ by Margaret H. Dunham.                           [18] Prof. Dr. P.K. Srimani et al. ‘Outlier data
 [4] Sunita B Aher, LOBO L.M.R.J., International          mining in medical databases by using statistical
 Conference on Emerging Technology Trends                 methods’, Vol.4, No.1, January 2012.
 (ICETT)-2011       Proceedings       published      by   [19] Alaa hamouda et al; ‘Automated Red Blood
 International Journal of Computer Applications           Cells counting’, International Journal of Computing
 (IJCA).                                                  Science, Vol.1, No.2, February 2012.
 [5] B.G. Premasudha et al., ‘An Application to find      [20] Ivana D. Radojevic et al. ‘Total coliforms and
 spatial distribution of Blood Donors from Blood          data mining as a tool in water quality monitoring’,
 Bank Information’,Vol. II, Issue No.2, July-             African Journal of Microbiology Research, Vol.
 December 2009.                                           6(10), 16 March 2012.
 [6] Jian Mu et al., ‘Segmentation, Reconstruction,       [21] P.Yasodha, M. Kannan; Analysis of a
 and analysis of blood thrombus formation in 3 D-         Population of Diabetic Patients Databases in Weka
 2photon microscopy images’, EURASIP Journal on           Tool; International Journal of Scientific &
 Advances in Signal Processing, 10 July 2009.             Engineering Research Volume 2, Issue 5, May 2011.
 [7] Abdur Rashid Khan et and al. ‘Web based              .
 Information System for Blood Donation’,
                                                          Author’s Profile
 International Jounal of digital content Technology
 and its applications,Vol. 3, Issue No.2, July 2009.
 [8] G. Satyanaryana Reddy et al; ‘Management
 Information System to help Managers for providing
 decision making in any organization’.
 [9] T. Santhanm and Shyam Sunderam: ‘Application
 of Cart Algorithm in Blood donor’s classification’,
 Journal of computer Science Vol. 6, Issue 5.             Ankit Bhardwaj received the B.Tech. Degree in
 [10] Zhangetal. ‘PPLook: an automated data               Information Technology from Maharishi Dayanand
 mining tool for protein–protein interaction’, BMC        University, Rohtak, Haryana, India in 2009 and pursuing
 Bio-informatics- 2010.                                   M.Tech (Comp.Sc. & Engg) from Haryana College of

                                                                                                1308 | P a g e
Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering
            Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
                       Vol. 2, Issue4, July-August 2012, pp.1303-1309
Technology and Management, Kaithal, Kurukshetra
University, Kurukshetra. His research interest lies in the
area of Data Mining.




Arvind Sharma received the M.Sc. Degree in Computer
Science from Maharishi Dayanand University, Rohtak,
Haryana, India in 2003, M.C.A. Degree from JRNRV
University Udaipur, Rajasthan, India, M.Phil. Degree in
Computer Science with specialization in Web Data
Mining Applications from the Alagappa University,
Karaikudi, Tamil Nadu, India and pursuing Ph.D.
Computer Science from Jaipur National University,
Jaipur, Rajasthan, India. He has published many technical
papers in International and National reputed journals and
conferences. His research interest lies in the area of Data
Mining and Web Applications.




V.K. Shrivastava is working as an Associate Professor and
HOD, Department of Information Technology, Haryana
College of Technology and Management, Kaithal,
Haryana, India. He has published many research papers in
International and National reputed journals and
conferences. His research interest lies in the area of
Digital Signal Processing.




                                                                              1309 | P a g e

Contenu connexe

Tendances

Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an IntroductionAli Abbasi
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and workAmr Abd El Latief
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
Privacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted dataPrivacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted dataIOSR Journals
 
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...IRJET Journal
 
Data Mining and Knowledge Discovery in Large Databases
Data Mining and Knowledge Discovery in Large DatabasesData Mining and Knowledge Discovery in Large Databases
Data Mining and Knowledge Discovery in Large DatabasesSSA KPI
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHoang Nguyen
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data miningDevakumar Jain
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data MiningSi Krishan
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionIOSR Journals
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process Shuvra Ghosh
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsGDi Techno Solutions
 
Data Mining methodology
 Data Mining methodology  Data Mining methodology
Data Mining methodology rebeccatho
 
Additional themes of data mining for Msc CS
Additional themes of data mining for Msc CSAdditional themes of data mining for Msc CS
Additional themes of data mining for Msc CSThanveen
 

Tendances (18)

Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
50120130406032
5012013040603250120130406032
50120130406032
 
Privacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted dataPrivacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted data
 
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
 
Data Mining and Knowledge Discovery in Large Databases
Data Mining and Knowledge Discovery in Large DatabasesData Mining and Knowledge Discovery in Large Databases
Data Mining and Knowledge Discovery in Large Databases
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
U0 vqmtq3m tc=
U0 vqmtq3m tc=U0 vqmtq3m tc=
U0 vqmtq3m tc=
 
Data Mining
Data MiningData Mining
Data Mining
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno Solutions
 
Data Mining methodology
 Data Mining methodology  Data Mining methodology
Data Mining methodology
 
Additional themes of data mining for Msc CS
Additional themes of data mining for Msc CSAdditional themes of data mining for Msc CS
Additional themes of data mining for Msc CS
 

En vedette

July 12 Regulation University Presentation - SBA
July 12 Regulation University Presentation - SBAJuly 12 Regulation University Presentation - SBA
July 12 Regulation University Presentation - SBATrevor Carlsen
 
Periodic Table Project 2012
Periodic Table Project 2012Periodic Table Project 2012
Periodic Table Project 2012jmori1
 
Proyecto de aula sede el uvito
Proyecto de aula sede el uvitoProyecto de aula sede el uvito
Proyecto de aula sede el uvitoAlvaro Velandia
 
Курс по програмиране за напреднали (2012) - 5. Windows Presentation Foundation
Курс по програмиране за напреднали (2012) - 5. Windows Presentation FoundationКурс по програмиране за напреднали (2012) - 5. Windows Presentation Foundation
Курс по програмиране за напреднали (2012) - 5. Windows Presentation FoundationDAVID Academy
 
Nutrition In The Fast Lane
Nutrition In The Fast LaneNutrition In The Fast Lane
Nutrition In The Fast Laneroach10
 

En vedette (7)

July 12 Regulation University Presentation - SBA
July 12 Regulation University Presentation - SBAJuly 12 Regulation University Presentation - SBA
July 12 Regulation University Presentation - SBA
 
Periodic Table Project 2012
Periodic Table Project 2012Periodic Table Project 2012
Periodic Table Project 2012
 
Proyecto de aula sede el uvito
Proyecto de aula sede el uvitoProyecto de aula sede el uvito
Proyecto de aula sede el uvito
 
Proyecto de vida
Proyecto de vidaProyecto de vida
Proyecto de vida
 
Курс по програмиране за напреднали (2012) - 5. Windows Presentation Foundation
Курс по програмиране за напреднали (2012) - 5. Windows Presentation FoundationКурс по програмиране за напреднали (2012) - 5. Windows Presentation Foundation
Курс по програмиране за напреднали (2012) - 5. Windows Presentation Foundation
 
Nutrition In The Fast Lane
Nutrition In The Fast LaneNutrition In The Fast Lane
Nutrition In The Fast Lane
 
Health IT Beyond Hospitals
Health IT Beyond HospitalsHealth IT Beyond Hospitals
Health IT Beyond Hospitals
 

Similaire à Data Mining Techniques Review for Blood Banks

Introduction to dm and dw
Introduction to dm and dwIntroduction to dm and dw
Introduction to dm and dwANUSUYA T K
 
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVE
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVEDATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVE
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVEIJDKP
 
A Comprehensive Study on Outlier Detection in Data Mining
A Comprehensive Study on Outlier Detection in Data MiningA Comprehensive Study on Outlier Detection in Data Mining
A Comprehensive Study on Outlier Detection in Data MiningBRNSSPublicationHubI
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
 
DM-Unit-1-Part 1-R.pdf
DM-Unit-1-Part 1-R.pdfDM-Unit-1-Part 1-R.pdf
DM-Unit-1-Part 1-R.pdfssuserb933d8
 
A Review Of Data Mining Literature
A Review Of Data Mining LiteratureA Review Of Data Mining Literature
A Review Of Data Mining LiteratureAddison Coleman
 
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...ijtsrd
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1Mahmoud Alfarra
 
Association rule visualization technique
Association rule visualization techniqueAssociation rule visualization technique
Association rule visualization techniquemustafasmart
 
TTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining TechniqueTTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining TechniqueMehmet Beyaz
 

Similaire à Data Mining Techniques Review for Blood Banks (20)

Seminar Report Vaibhav
Seminar Report VaibhavSeminar Report Vaibhav
Seminar Report Vaibhav
 
Introduction to dm and dw
Introduction to dm and dwIntroduction to dm and dw
Introduction to dm and dw
 
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVE
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVEDATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVE
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVE
 
A Comprehensive Study on Outlier Detection in Data Mining
A Comprehensive Study on Outlier Detection in Data MiningA Comprehensive Study on Outlier Detection in Data Mining
A Comprehensive Study on Outlier Detection in Data Mining
 
Data mining
Data miningData mining
Data mining
 
15 19
15 1915 19
15 19
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope
 
Datamining
DataminingDatamining
Datamining
 
Data Mining Applications And Feature Scope Survey
Data Mining Applications And Feature Scope SurveyData Mining Applications And Feature Scope Survey
Data Mining Applications And Feature Scope Survey
 
G045033841
G045033841G045033841
G045033841
 
DM-Unit-1-Part 1-R.pdf
DM-Unit-1-Part 1-R.pdfDM-Unit-1-Part 1-R.pdf
DM-Unit-1-Part 1-R.pdf
 
A Review Of Data Mining Literature
A Review Of Data Mining LiteratureA Review Of Data Mining Literature
A Review Of Data Mining Literature
 
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
Applications, Techniques and Trends of Data Mining and Knowledge Discovery Da...
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
 
dwdm unit 1.ppt
dwdm unit 1.pptdwdm unit 1.ppt
dwdm unit 1.ppt
 
Fayyad model in data mining
Fayyad model in data miningFayyad model in data mining
Fayyad model in data mining
 
Association rule visualization technique
Association rule visualization techniqueAssociation rule visualization technique
Association rule visualization technique
 
Data mining
Data miningData mining
Data mining
 
TTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining TechniqueTTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining Technique
 
2 Data-mining process
2   Data-mining process2   Data-mining process
2 Data-mining process
 

Dernier

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Dernier (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Data Mining Techniques Review for Blood Banks

  • 1. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 Data Mining Techniques and Their Implementation in Blood Bank Sector –A Review Ankit Bhardwaj1, Arvind Sharma2, V.K. Shrivastava3 1 M.Tech Scholar, Department of CSE, HCTM, Kaithal, Haryana, India 2 Head, Department of Computer Science, DAV Kota, Rajasthan, India 3 Head, Department of I.T, HCTM, Kaithal, Haryana, India ABSTRACT and challenge for data mining. Data mining is a This paper focuses on the data mining process of the knowledge discovery in databases and and the current trends associated with it. It the goal is to find out the hidden and interesting presents an overview of data mining system and information [1]. Various important steps are clarifies how data mining and knowledge involved in knowledge discovery in databases discovery in databases are related both to each (KDD) which helps to convert raw data into other and to related fields. Data Mining is a knowledge. Data mining is just a step in KDD technology used to describe knowledge discovery which is used to extract interesting patterns from and to search for significant relationships such as data that are easy to perceive, interpret, and patterns, association and changes among manipulate. Several major kinds of data mining variables in databases. This enables users to methods, including generalization, characterization, search, collect and donate blood to the patients classification, clustering, association, evolution, who are waiting for the last drop of the blood pattern matching, data visualization, and meta-rule and are nearby to death. We have also tried to guided mining will be reviewed. The explosive identify the research area in data mining where growth of databases makes the scalability of data further work can be continued. mining techniques increasingly important. Data mining algorithms have the ability to rapidly mine Keywords: Data mining, KDD, Association vast amount of data. Rules, Classification, Clustering, Prediction. The paper is organised as follows: In I. INTRODUCTION section 2 we use the data mining process and several In today’s computer age data storage has stages of the KDD. Section 3 presents the literature been growing in size to unthinkable ranges that only survey. Section 4 discusses about the research computerized methods applied to find information methodology to be proposed in our work. Finally, among these large repositories of data available to section 5 concludes the paper. organizations whether it was online or offline. Data mining was conceptualised in the 1990s as a means II. DATA MINING AND KDD PROCESS of addressing the problem of analyzing the vast Data mining is a detailed process of analyzing large repositories of data that are available to mankind, amounts of data and picking out the relevant and being added to continuously. Data mining is information. It refers to extracting or mining necessary to extract hidden useful information from knowledge from large amounts of data [2]. Data the large datasets in a given application. This Mining is the fundamental stage inside the process usefulness relates to the user goal, in other words of extraction of useful and comprehensible only the user can determine whether the resulting knowledge, previously unknown, from large knowledge answers his goal. The growing quality quantities of data stored in different formats, with demand in the blood bank sector makes it necessary the objective of improving the decision of to exploit the whole potential of stored data companies, organizations where the data can be efficiently, not only the clinical data and also to collected. However data mining and overall process improve the behaviours of the blood donors. Data known as Knowledge Discovery from Databases mining can contribute with important benefits to the (KDD), is usually an expensive process, especially blood bank sector, it can be a fundamental tool to in the stages of business objectives elicitation, data analyse the data gathered by blood banks through mining objectives elicitation, and data preparation. their information systems. In recent years, along This is especially the case each time data mining is with development of medical informatics and applied to a blood bank. Data Mining can be defined information technology, blood bank information as the extraction or fetching of the relevant system grows rapidly. With the growth of the blood information i.e. Knowledge from the large banks, enormous Blood Banks Information Systems repositories of data. That’s the reason it is also (BBIS) and databases are produced. It creates a need called as Knowledge Mining. However many 1303 | P a g e
  • 2. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 synonyms are linked with Data Mining viz. Many people treat the data mining as a synonym for Knowledge Mining from Data, Knowledge another popularly used term, Knowledge Discovery extraction, Data/ pattern analysis, Data archaeology from Data. The following figure (fig.1), shows the and Data dredging. Data Mining is also popularly data mining as simply an essential step in the known as Knowledge Discovery in data bases process of KDD i.e. Knowledge Discovery from (KDD), refers to the nontrivial extraction of Data[3]. implicit, previously unknown and potentially useful information from data in Databases. Selection Preprocessing Transformation Transformed data Data Mining Preprocessed data INITIAL TARGET TARGET DATA DATA DATA PATTERNS KNOWLEDGE Evaluation Fig.1: Data mining is the core of KDD Process The KDD process includes selecting the 2.1 STEPS OF KDD PROCESS data needed for data mining process & may be The knowledge discovery in databases obtained from many different & heterogeneous data (KDD) process comprises of a few steps leading sources. Preprocessing includes finding incorrect or from raw data collections to some form of new missing data. There may be many different activities knowledge. The iterative process contains of the performed at this time. Erroneous data may be following steps: corrected or removed, whereas missing data must be supplied. Preprocessing also include: removal of 1. Data cleaning noise or outliers, collecting necessary information to It also known as data cleansing, it is a model or account for noise, accounting for time fundamental step in which noise data and irrelevant sequence information and known changes. data are removed from the collection. Transformation is converting the data into a common format for processing. Some data may be 2. Data integration encoded or transformed into more usable format. In this step, multiple data sources, often Data reduction, dimensionality reduction (e.g. heterogeneous, may be combined in a common feature selection i.e. attribute subset selection, source. heuristic method etc) & data transformation method 3. Data selection (e.g. sampling, aggregation, generalization etc) may At this step, the data relevant to the be used to reduce the number of possible data values analysis is decided on and retrieved from the data being considered. Data Mining is the task being collection. performed, to generate the desired result. Interpretation/Evaluation is how the data mining 4. Data transformation results are presented to the users which are It is also known as data consolidation, it is extremely important because the usefulness of the a stage in which the selected data is transformed into result is dependent on it. Various visualization & forms appropriate for the mining procedure. GUI strategies are used at this step. Different kinds of knowledge requires different kinds of 5. Data mining representation e.g. classification, clustering, It is the core step of KDD process in which association rule etc.[4] clever techniques are applied to extract patterns potentially useful. 1304 | P a g e
  • 3. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 6. Pattern evaluation 4. Discovering Patterns and Rules: It concern with In this step, strictly interesting patterns pattern detection, the aim is spotting fraudulent representing knowledge are identified based on behavior by detecting regions of the space defining given measures. the different types of transactions where the data points significantly different from the rest. 7. Knowledge representation This is the final step in which the 5. Retrieval by Content: It is finding pattern discovered knowledge is visually represented to the similar to the pattern of interest in the data set. This user. This is a very essential step that uses task is most commonly used for text and image data visualization techniques to help users understand sets. and interpret the data mining results. III. LITERATURE SURVEY 2.2 DATA MINING MODELS In this section, lot of research works have been The data mining recorded from past few years. They are presented models and tasks are shown in figure 2 given below: here: [8] In June 2009, work is carried out on management Data Mining information system that helps managers in providing Models decision making in any organization. This MIS is basically all about the process of collecting, processing, storing and transmitting the relevant information that is just like the life blood of the organization that uses a data mining approach. Predictive Model Descriptive Model [7] In June 2009, the research work entitled on web (Supervised) (Unsupervised) based information system for blood donation, all the information regarding blood donation are available on world wide web i.e. online systems that communicate/interconnect all the blood donor Classification Clustering societies in a country using LAN Technology. Association Rule [6] In July 2009, the research work entitled on Regression Segmentation, Reconstruction and Analysis of Blood Prediction Sequence discovery thrombus formation in 3D-2 photons microscopy Time Series Analysis Summarization images. In this work, method and platform are applied to differentiate between the thromby formed Fig.2: Data Mining Models and Tasks in wild type and low FV11 mice. The high resolution quantitative structural analysis using some algorithms The predictive model makes prediction that provide a new matrix, that is likely to be critical about unknown data values by using the known to categorize and understanding bio-medically values. The descriptive model identifies the patterns relevant characteristics of thromby. Also, in this or relationships in data and explores the properties work, composition of different components on the of the data examined. clot surface and number of voxels in each clot component has been computed. 2.3 DATA MINING TASKS [5] In December 2009, the work proposed on an There are different types of data mining application to find spatial distribution of blood tasks depending on the use of data mining results. donors from the blood bank information system, These data mining tasks are categorised as follows: blood donation and transfusion services are carried out and help the patients to access the availability of 1. Exploratory Data Analysis: It is simply blood from anywhere. exploring the data without any clear ideas of what [9] In the year 2009, the research was based we are looking for. These techniques are interactive geographical variation in correlates of blood donor and visual. turn out rates-an investigation of Canadian metropolitan areas. In this work, Canadian blood 2. Descriptive Modeling: It describes all the data, it services i.e. an organization that aims the collecting includes models for overall probability distribution and distributing the blood supply across the country. of the data, partitioning of the p-dimensional space Accessibility variables are introduced and calculated into groups and models describing the relationships using a 2-step floating catchment area in order to between the variables. analyse the accessibility of services. The regression technique is also used. 3. Predictive Modeling: This model permits the [10] In the year 2010, the research is accomplished on value of one variable to be predicted from the application of CART algorithm in blood donor’s known values of other variables. classification. In this work, one of the popular data 1305 | P a g e
  • 4. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 mining technique i.e. Classification is used and through [14] In year 2011, the work was proposed on an the use of CART application, a model is created that intelligent system for improving performance of determines the donor’s behavior. blood donation. In this work, many important [11] In the year 2010, the data mining tool is used to techniques i.e. Clustering, K-means classification is extract information from PPI (Protein-Protein adopted that improves the performance of blood Interaction) systems and developed a PPI search donation services. system known as PP Look that is an effective tool [19] In January 2012, the data mining techniques are 4-tier information extraction systems based on a full applied on medical databases and clinical databases sentence parsing approach. In this paper, researchers which are used to store huge amount of information introduced a useful tool that is PP Look which uses regarding patient’s diagnosis, lab test results, an improved keyword, dictionary pattern matching patient’s treatments etc. which is a way of mining the algorithm to extract protein-protein interaction medical information for doctors and medical information from biomedical literature. Some visual researchers. In all developed countries, it is found methods were adopted to conclude PPI in the form of that the routine health tests are very common among 3D stereoscopic displays. all adults. It is also determined that precautionary [12] In December 2010, the work is proposed on measures are less expensive rather than the binary classifiers for health care databases i.e. a treatments and also facilitates a better chance for comparative study of data mining classification patient’s treatment at earlier stages. In this research algorithms in the diagnosis of breast cancer. This work, five medical databases are used and work helps in uncovering the valuable knowledge experimental results are computed using data mining hidden behind them and also helping the decision software tool. makers to improve the health care services. In this [20] In February 2012, the work is carried out work, the presented experiment provides medical through the Health Care Applications to diagnose doctors and health care planers a tool to help them different diseases using Red Blood Cells counting. In quickly make sense of vast clinical databases. this work, the researchers presented the automatic [15] In February 2011, the research paper presented process of RBC count from an image and some of the on classifying blood donors using data mining data mining techniques like Segmentation, techniques, the work is performed on blood group Equalization and K-means clustering are used for donor data sets using classification technique. preprocessing of the images. Some diseases which [13] In August 2011, the research work carried out on are related to sickness of RBC count also predicted. interactive knowledge discovery in blood transfusion [21] In this work, the main purpose of the system is data set, the work is done through conducting data to guide diabetic patients during the disease. Diabetic mining experiments that help the health professionals patients could benefit from the diabetes expert system in better management of blood bank facility. by entering their daily glucoses rate and insulin [16] In year 2011, the work presented on rule dosages; producing a graph from insulin history; extraction for blood donors with fuzzy sequential consulting their insulin dosage for next day. It’s also pattern mining, fuzzy sequential pattern algorithm is tried to determine an estimation method to predict used to extract rules from blood transfusion service glucose rate in blood which indicates diabetes risk. In center data set that predicts the behaviour of donor in this work, the WEKA data mining tool is used to the future. classify the data and the data is evaluated using [17] In August 2011, the research work proposed on 10-fold cross validation and the results are compared. a comparison of blood donor classification data mining models, which uses decision tree to examine IV. RESEARCH METHODOLOGY the blood donor’s classification. In this work, 4.1 Data Mining Techniques in Blood Bank comparison between extended RVD based model and Sector DB2K7 procedures are carried out and it was discovered that RVD classification is better than In our proposed methodology, we suppose a blood DB2K7 in aspects of recalling and precision bank system has the following objectives: capability.  Increase blood donor’s rate and their attitude. [18] In the September 2011, the research work  Attract more blood donors to donate blood. carried out on real time blood donor management  Assist blood bank professionals in making using dash boards based on data mining models, the policy on the acquisition of blood donors and data mining techniques are used to examine the blood new blood banks. donor classification and promote it to development of There are several major data mining techniques had real time blood donor management using dash boards been developed and used in data mining research with blood profile or RVD profile and geo-location projects recently including classification, clustering, data. association, prediction and sequential patterns. These techniques and methods in data mining need brief mention to have better understanding. 1306 | P a g e
  • 5. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 4.1.1 Classification specific target variable. Association rule has the The estimation and prediction may be viewed as several algorithms like: Apriori, CDA, DDA, types of classification. The problem usually is interestingness measure etc. Association rules are evaluating the training data set and then we apply to if/then statements that help uncover relationships the future model. between seemingly unrelated data in a relational The following table1 shows different classification database or other information repository. An algorithms: example of an association rule can be ‘if a customer buys a dozen eggs, he is 80% likely to also purchase Type Name of algorithm milk.’ An association rule has two parts, an Statistical Regression antecedent (if) and a consequent (then). An Bayesian antecedent is an item found in the data. A Distance Simple distance consequent is an item that is found in combination K-nearest neighbors with the antecedent. Decision ID3 The association rules are created by analyzing data Tree C4.5 for frequent if/then patterns and using the CART criteria support and confidence to identify the most SPRINT important relationships. The support is an indication Neural Propagation of how frequently the data items appear in the Network NN Supervised learning database. The confidence indicates the number of Radial base function times the if/then statements have been found to be network true. Rule based Genetic rules from DT Types of Association Rule Techniques are as Genetic rules from NN follows: Genetic rules without  Multilevel Association Rule DT and NN  Multidimensional Association Rule Table 1: Classification Algorithms  Quantitative Association Rule 4.1.2 Clustering 4.1.4 Prediction Clustering technique is grouping of data, which is The prediction as it name implied is one of the data not predefined. By using clustering technique we mining techniques that discovers relationship can identify dense and sparse regions in object between independent variables and relationship space. The following table2 shows different between dependent and independent variables. For clustering algorithms: instance, prediction analysis technique can be used Type Name of algorithm in blood donors to predict the behaviour for the Similarity and Similarity & distance future if we consider donor is an independent distance measure variable, blood could be a dependent variable. Then measure based on the historical data, we can draw a fitted Outlier Outlier 3333 regression curve that is used for donor’s behaviour Hierarchical Agglomerative, Divisive prediction. Regression technique can be adapted for Partitional Minimum spanning tree predication. Regression analysis can be used to Squared matrix model the relationship between one or more K-means independent variables and dependent variables. In Nearest Neighbor data mining independent variables are attributes PAM already known and response variables are what we Bond Energy want to predict. Unfortunately, many real-world Clustering with Neural problems are not simply prediction. Network Types of Regression Techniques are as follows: Clustering large BIRCH  Linear Regression database DB Scan  Multivariate Linear Regression CURE  Nonlinear Regression Categorical ROCK  Multivariate Nonlinear Regression Table 2: Clustering Algorithms 4.1.5 Sequential Patterns 4.1.3 Association Rule Sequential patterns analysis in one of data mining The central task of association rule mining is to find technique that seeks to discover similar patterns in sets of binary variables that co-occur together data transaction over a business period. The uncover frequently in a transaction database, while the goal patterns are used for further business analysis to of feature selection problem is to identify groups of recognize relationships among data. that are strongly correlated with each other with a 1307 | P a g e
  • 6. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 V. CONCLUSION This paper presents an overview of data mining and [11] Dr. Varun Kumar, Luxmiverma; ‘Binary its techniques which have been used to extract classifiers for Health Care databases - A interesting patterns and to develop significant comparative study of data mining classification relationships among variables stored in a huge algorithms in the diagnosis of Breast cancer’, dataset. Data mining is needed in many fields to IJCST, Vol. I, Issue 2, December 2010. extract the useful information from the large amount [12] Vikram Singh and Sapna Nagpal; ‘Interactive of data. Large amount of data is maintained in every Knowledge discovery in Blood Transfusion Data field to keep different records such as medical data, Set’; VSRD International Journal of Computer scientific data, educational data, demographic data, Science and Information Technology; Vol. I, Issue 8, financial data, marketing data etc. Therefore, 2011. different ways have been found to automatically [13] Wen-ChanLee and Bor-Wen Cheng; ‘An analyze the data, to summarize it, to discover and Intelligent system for improving performance of characterize trends in it and to automatically flag blood donation’, Journal of Quality, Vol. 18, Issue anomalies. The several data mining techniques are No. II, 2011. introduced by the different researchers. These [14] P.Ramachandran et al. ‘Classifying blood techniques are used to do classification, to do donors using data mining techniques’; IJCST, Vol. clustering, to find interesting patterns. In our future I, Issue1, February 2011. work, the data mining techniques will be [15] F. Zabihi et al. ‘Rule Extraction for Blood implemented on blood donor’s data set for predicting donators with fuzzy sequential pattern mining’; The the blood donor’s behaviour and attitude, which have Journal of mathematics and Computer Science; Vol. been collected from the blood bank center. II, Issue No. I, 2011. [16] Shyam Sundaram and Santhanam T: ‘A comparison of Blood donor classification data REFERENCES mining models’, Journal of Theoretical and Applied [1] Badgett RG: How to search for and evaluate Information Technology, Vol.30, No.2, 31 August medical evidence. Seminars in Medical Practice 2011. 1999, 2:8-14, 28. [17] Shyam Sundaram and Santhanam T; ‘Real [2] Jaiwei Han and Micheline Kamber, ‘Data Time Blood donor management using Dash boards Mining Concepts and Techniques’, Second Edition, based on Data mining models’, International Morgan Kaufmann Publishers. Journal of Computing issues, Vol.8, Issue5, No.2, [3] ‘Data Mining Introductory and Advanced September 2011. Topics’ by Margaret H. Dunham. [18] Prof. Dr. P.K. Srimani et al. ‘Outlier data [4] Sunita B Aher, LOBO L.M.R.J., International mining in medical databases by using statistical Conference on Emerging Technology Trends methods’, Vol.4, No.1, January 2012. (ICETT)-2011 Proceedings published by [19] Alaa hamouda et al; ‘Automated Red Blood International Journal of Computer Applications Cells counting’, International Journal of Computing (IJCA). Science, Vol.1, No.2, February 2012. [5] B.G. Premasudha et al., ‘An Application to find [20] Ivana D. Radojevic et al. ‘Total coliforms and spatial distribution of Blood Donors from Blood data mining as a tool in water quality monitoring’, Bank Information’,Vol. II, Issue No.2, July- African Journal of Microbiology Research, Vol. December 2009. 6(10), 16 March 2012. [6] Jian Mu et al., ‘Segmentation, Reconstruction, [21] P.Yasodha, M. Kannan; Analysis of a and analysis of blood thrombus formation in 3 D- Population of Diabetic Patients Databases in Weka 2photon microscopy images’, EURASIP Journal on Tool; International Journal of Scientific & Advances in Signal Processing, 10 July 2009. Engineering Research Volume 2, Issue 5, May 2011. [7] Abdur Rashid Khan et and al. ‘Web based . Information System for Blood Donation’, Author’s Profile International Jounal of digital content Technology and its applications,Vol. 3, Issue No.2, July 2009. [8] G. Satyanaryana Reddy et al; ‘Management Information System to help Managers for providing decision making in any organization’. [9] T. Santhanm and Shyam Sunderam: ‘Application of Cart Algorithm in Blood donor’s classification’, Journal of computer Science Vol. 6, Issue 5. Ankit Bhardwaj received the B.Tech. Degree in [10] Zhangetal. ‘PPLook: an automated data Information Technology from Maharishi Dayanand mining tool for protein–protein interaction’, BMC University, Rohtak, Haryana, India in 2009 and pursuing Bio-informatics- 2010. M.Tech (Comp.Sc. & Engg) from Haryana College of 1308 | P a g e
  • 7. Ankit Bhardwaj, Arvind Sharma, V.K. Shrivastava / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue4, July-August 2012, pp.1303-1309 Technology and Management, Kaithal, Kurukshetra University, Kurukshetra. His research interest lies in the area of Data Mining. Arvind Sharma received the M.Sc. Degree in Computer Science from Maharishi Dayanand University, Rohtak, Haryana, India in 2003, M.C.A. Degree from JRNRV University Udaipur, Rajasthan, India, M.Phil. Degree in Computer Science with specialization in Web Data Mining Applications from the Alagappa University, Karaikudi, Tamil Nadu, India and pursuing Ph.D. Computer Science from Jaipur National University, Jaipur, Rajasthan, India. He has published many technical papers in International and National reputed journals and conferences. His research interest lies in the area of Data Mining and Web Applications. V.K. Shrivastava is working as an Associate Professor and HOD, Department of Information Technology, Haryana College of Technology and Management, Kaithal, Haryana, India. He has published many research papers in International and National reputed journals and conferences. His research interest lies in the area of Digital Signal Processing. 1309 | P a g e