Hexaware E-book on Predictive Analytics
Business Intelligence & Analytics
Actionable Intelligence Enabled
© Hexaware Technologies. All rights reserved.
www.hexaware.com
Published on : Feb 7, 2012
What is Data mining?
Data mining, or knowledge discovery, is the computer-assisted process of
finding hidden patterns in data. Data mining tools predict behaviors and
future trends, allowing businesses to make proactive, knowledge-driven
decisions. Hence it is also called predictive analytics.
Data mining tools can answer business questions that traditionally were
time consuming to resolve. They scour databases for hidden patterns,
finding predictive information that experts may miss because it lies outside
their expectations.
When and where could data mining and predictive analytics be
useful?
The amount of raw data stored in corporate databases is exploding. From
trillions of point-of-sale transactions and credit card purchases to
pixel-by-pixel images of galaxies, databases are now measured in
gigabytes and terabytes. Raw data by itself, however, does not provide
much information.
In today's fiercely competitive business environment, companies need to
rapidly turn these terabytes of raw data into significant insights into their
customers and markets to guide their marketing, investment, and
management strategies.
In these scenarios data mining would help you unlock the hidden potential
of your data and deliver actionable insights.
Can you name some specific uses of data mining?
Some specific uses of data mining include:
• Market segmentation - Identify the common characteristics of customers
who buy the same products from your company.
• Customer churn - Predict which customers are likely to leave your
company and go to a competitor.
• Fraud detection - Identify which transactions are most likely to be
fraudulent.
• Direct marketing - Identify which prospects should be included in a
mailing list to obtain the highest response rate.
• Interactive marketing - Predict what each individual accessing a Web site
is most likely interested in seeing.
• Market basket analysis - Understand what products or services are
commonly purchased together; e.g., beer and diapers.
• Trend analysis - Reveal the difference between typical customers this
month and last.
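As an illustration of market basket analysis, here is a minimal Python sketch that counts how often pairs of products appear in the same basket. The baskets, product names and helper functions (`pair_support`, `confidence`) are made up for this example and are not taken from any particular tool; real projects would normally use a dedicated association-rule algorithm.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return {(item_a, item_b): share of baskets containing both items}."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


# Hypothetical point-of-sale baskets (illustrative data only).
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"diapers", "milk"},
]

support = pair_support(baskets)
print(support[("beer", "diapers")])
print(confidence(baskets, "beer", "diapers"))
```

Support shows how common a pair is overall, while confidence shows how often the second product is bought given the first; both are usually reviewed together before acting on a pattern.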
Will sharing data for data mining raise privacy-related issues?
What needs to be done in such scenarios?
Sensitive data such as credit card numbers, insurance policy numbers
and account numbers can be handled safely. The data needs to be masked
or recoded before it is shared, to maintain privacy.
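The following Python sketch shows one possible way to mask and recode such fields. The function names, the choice to keep the last four digits, and the use of a keyed hash (HMAC-SHA256) for recoding are assumptions made for illustration; the actual approach should follow your organization's security policy.

```python
import hashlib
import hmac


def mask_number(value, visible=4, mask_char="*"):
    """Hide all but the last `visible` digits of a card or account number."""
    digits = "".join(ch for ch in value if ch.isdigit())
    if len(digits) <= visible:
        return mask_char * len(digits)
    hidden = len(digits) - visible
    return mask_char * hidden + digits[hidden:]


def recode(value, secret_key):
    """Replace a value with a stable pseudonym derived from a secret key.

    The same input always gives the same code, so records can still be
    joined and counted, but the original number is not exposed.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical values for illustration only.
key = b"example-secret-key"
print(mask_number("4111 1111 1111 1234"))
print(recode("ACC-000123", key))
```

Masking suits fields that people still need to recognize on screen, while recoding suits identifiers that only need to stay consistent across tables.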
Can you name some tools used for data mining?
a. The Microsoft BI stack includes SSAS (SQL Server Analysis Services)
as part of SQL Server 2008
b. R is an open source tool that offers data mining solutions
c. RapidMiner is another open source tool
d. SAS is a highly sophisticated tool with enormous
computational power
e. IBM has its own tool, SPSS
f. Oracle has ODM (Oracle Data Miner)
The above list is not exhaustive.
In which industries is predictive analytics applicable?
We have recently provided predictive analytics solutions for the following
industries:
• Insurance
• Education
• Mining
• Logistics
• Health & Hygiene
In short, predictive analytics can be deployed for diverse industries.
What are the steps involved in data mining?
The data mining process follows predefined steps, in this order:
• Business case understanding or problem understanding
• Data understanding
• Data extraction
• Pre-processing
• Mining model building
• Testing and evaluation
Is there any maintenance to be done for the mining model once it is
deployed?
Yes, the mining model needs to be calibrated at least once every six months.
The exact frequency varies with the business need and the volume of data
flow.
Calibration involves:
• Checking the predicted output with the actual output
• Modifying the mining model if required
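A calibration check of this kind can be sketched in a few lines of Python. The accuracy measure, the 0.8 threshold and the function names below are illustrative assumptions, not a prescribed method; the right measure and threshold depend on the business problem.

```python
def accuracy(predicted, actual):
    """Share of predictions that match the actual outcomes."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


def needs_recalibration(predicted, actual, min_accuracy=0.8):
    """Flag the model for review when accuracy drops below the threshold.

    `min_accuracy` is an assumed business threshold, chosen for illustration.
    """
    return accuracy(predicted, actual) < min_accuracy


# Hypothetical churn predictions versus what actually happened.
predicted = ["churn", "stay", "stay", "churn", "stay"]
actual = ["churn", "stay", "churn", "churn", "churn"]

print(accuracy(predicted, actual))
print(needs_recalibration(predicted, actual))
```

If the check flags the model, the next step is the second bullet above: revisit the model, for example by retraining it on more recent data.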
Can a data mining solution be offered on the cloud?
Yes, a data mining solution can be offered on the cloud.
Organizations can adopt a “pay per use” model without investing in the
infrastructure required. The challenge here is uploading huge amounts of
data to the cloud.
What are the risks involved in using a predictive analytics solution?
• Misunderstanding the business problem or the data results in a
prediction model that may use complex statistical algorithms yet is of no
use to the business.
• Misinterpreting the results leads to wrong decisions.
• Poor data quality results in poor predictions.
• Without maintenance of the mining model, its predictions become
obsolete.
• Building the predictive analytics solution with staff who lack
statistical knowledge leads to less accurate models.
What is Text mining?
Text mining is the process of deriving high-quality information from
unstructured text data. It draws on techniques such as computational
linguistics, information retrieval, statistics and machine learning.
Forms of text mining include categorization, classification,
clustering, concept extraction, summarization and sentiment analysis.
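To make one of these forms concrete, here is a toy Python sketch of sentiment analysis based on word lists. The word lists and the `sentiment` function are invented for illustration; production text mining relies on far richer linguistic resources or trained models.

```python
import re

# Tiny hand-made lexicons, assumed for this example only.
POSITIVE = {"good", "great", "excellent", "happy", "fast"}
NEGATIVE = {"bad", "poor", "slow", "unhappy", "broken"}


def sentiment(text):
    """Label text as positive, negative or neutral by counting lexicon words."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical customer comments.
comments = [
    "The service was great and the staff were excellent",
    "Delivery was slow and support was poor",
    "The parcel arrived on Tuesday",
]
for comment in comments:
    print(sentiment(comment), "-", comment)
```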
Are open source tools sufficient and robust enough to provide
answers to tough business cases?
Open source tools like R and RapidMiner provide excellent flexibility for
building models. The online R community constantly adds algorithms and
industry-specific solutions to R as packages after validating them. So far,
around 3500 packages have been built for R.
R’s popularity has been increasing relative to other predictive analytics
tools. A recent KDnuggets.com survey reports that R has a 24% market share
and is the most sought-after statistical programming language.
Can you list some of the best practices in data mining and
predictive analytics?
• Executive support: Support from the decision makers and middle
management makes a world of difference.
• Business problem specificity: Identifying the correct business problem
to apply predictive analytics to is vital to the success of the mining model.
• Availability of historical data: The richer the data, the more robust
the mining model will be.
• Good quality data: This is the most important factor for an accurate mining
model.
• Pre-processing: While building the mining model, one of the main
activities is pre-processing, where the data is cleansed, sliced, diced
and categorized to suit the mining model. Good business
knowledge and sound data mining knowledge are required to do this,
as it is the base for the predictive model.
• Selection of statistical techniques: An experienced data mining
practitioner can choose the correct statistical technique and compare
its accuracy against other techniques.
• Interpretation of output: It is extremely important to interpret the output
in the correct way and link it back to the business problem stated
initially.
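The pre-processing practice above can be pictured with a small Python sketch that cleanses raw records and categorizes one field into bands. The field names, the age bands and the rule of dropping rows with an unusable age are assumptions chosen for illustration; real cleansing rules come from business and data knowledge.

```python
def age_band(age):
    """Categorize an age into a coarse band (band limits are illustrative)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60+"


def preprocess(records):
    """Cleanse raw rows and add a categorized age band.

    Rows whose age is missing or not a whole number are dropped, and the
    region text is trimmed and lower-cased so equal values compare equal.
    """
    cleaned = []
    for row in records:
        raw_age = str(row.get("age", "")).strip()
        if not raw_age.isdigit():
            continue  # cleanse: skip rows with an unusable age
        age = int(raw_age)
        region = str(row.get("region", "")).strip().lower()
        cleaned.append({"age": age, "region": region, "age_band": age_band(age)})
    return cleaned


# Hypothetical raw extract with typical quality problems.
raw = [
    {"age": "34", "region": " North "},
    {"age": "", "region": "south"},
    {"age": "61", "region": "SOUTH"},
    {"age": "abc", "region": "east"},
]
for row in preprocess(raw):
    print(row)
```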
Thank you for reading our e-book. If you have any queries, please write to us at corporatemarketing@hexaware.com
To keep up with the industry's latest trends, please visit our BI blog at http://blogs.hexaware.com/index/business-intelligence
For more information on our Business Intelligence & Analytics services, please visit http://hexaware.com/business-intelligence-analytics.htm
About Hexaware
Hexaware is a leading global provider of IT and BPO services. The
company has achieved a leadership position in domains such as
Banking, Financial Services, Insurance, Transportation, Logistics and
HR-IT solutions. Hexaware focuses on delivering business results by
leveraging technology solutions and specializes in Business
Intelligence & Analytics, Enterprise Applications, Independent Testing
and Legacy Modernization. Hexaware has been providing business
technology solutions for over 20 years and offers world-class service
delivery, technology leadership and skilled human capital.