Bi ppt version 3.6.2
1. Aureole InfoTech Co.
Hamdard University A Summer Training Report on
Department of
Management Studies
New Delhi-India
November 2006
Presented by:
Parinaz SarafiGohar
Hamid ShamlouNasab
MBA 2nd year
2. Agenda
I. Introduction
II. Business Intelligence
III. Open-Source
IV. BI Products
V. Summary
3. I.1 Introduction
Research Methodology
Research Mandate:
Understanding BI systems, architecture and features
of available tools
Research Type:
Descriptive and Comparative
Approaches:
Product-Oriented:
Hardcopy report
Focus on products
Two categories
Side-by-side comparison
Concept-Oriented:
PPT Presentation
Different audience’s background in IT
Focus on concepts
Pentaho as a complete BI suite
4. I.2 Introduction
Objectives
The objective of this report rests on the
fact that most organizations have become
very good at capturing and storing the data
they generate every time they perform a
business operation
But the question is: “Is it enough?”
Does simply saving detailed
data/information on each and every entity
guarantee access to useful intelligence?
5. I.3 Introduction
Today’s Business Environment
Huge and constantly growing operational
databases, but little insight into the driving
forces of the business
Moving from product-centric world to a
customer-centric world
Rapidly advancing technology delivers
new opportunities
Reduced time to market
Highly competitive environment
Mergers and acquisitions cause
business confusion
The goal is to be more competitive
6. II. Business Intelligence
II.1 Importance of BI
II.2 What is BI?
II.3 BI Environment & Business Flow
II.4 BI Implementation
7. II.1 Importance of BI
Airplane Scenario
Pilot: “The airplane has lost all communications
with air traffic control”
The pilot has no way to understand the
flight environment (other airliners and potentially hazardous weather)
Pilot: “There’s nothing to worry about,
because I’m an experienced pilot and have
flown the same route many times!”
Yet many business leaders make
decisions daily without an
operational business radar:
A reliable BI system
8. II.1 Importance of BI
Airplane Scenario (Continued)
It doesn’t matter if the plane is large or
small;
the pilot must know the environment
in which the plane is flying
9. II.1 Importance of BI
Facts
“The question is not whether your company
will lose touch with the competitive arena or
not, but when it will lose touch.” Ben Gilad, educator and
author of Business Blindspots
In 1980, when Sears (Sears, Roebuck and Co., the international
department store chain founded by Richard Sears)
was the retail leader, they admitted they
had never heard of Sam Walton and Wal-Mart;
they know them now
More than 200 companies that made up the
1979 ‘Fortune 500’ are now out
of business, just 21 years later
Of the companies that made the
1955 list, 70% (350) no longer exist
10. II.1 Importance of BI
Facts (Continued)
Recently, executives of a leading
Internet network supplier, said they had no
need for a business intelligence system
because they’re the market leader and
have no real competition
If they don’t have competition now,
they will
11. II.1 Importance of BI
Facts (Continued)
In 1997, the CEO of a Silicon Valley-based
software company said he had all the
information he needed about his
competitors
Several years later, that same CEO used a
private investigator to unethically pilfer
through the trash of a major competitor
When that CEO needed competitive
intelligence to learn about the future
strategy and courses of action of that
competitor (a large Seattle-based software company),
he didn’t have the capability
12. II.1 Importance of BI
Questions
What was the net profit for a particular product
last year?
What will be the total sales for
coming year?
What are the key factors to be
focused on, in order to increase the
sales for this year?
How can we analyze our competitors?
How fast can we assess the
business environment?
How can we gain our competitive
advantages?
13. II.2 What is BI?
Definition
“It’s a systematic process that
collects,
analyzes, and
organizes
the flow of critical information,
focusing it on important strategic
and operational issues” James H. Thomas
Using BI, corporate data can be
organized and analyzed in a better
way and then converted into
useful knowledge
14. II.2 What is BI?
Definition
[Figure: data sources and systems (Movies, Graphics, Spreadsheets, Web Pages, Text, Video, Audio, Documents, CRM, DSS, DM, KM, GIS, EIS, OLAP, ERP, DW) converging on “A single version of the TRUTH”]
15. II.2 What is BI?
Features
BI applications include:
Query and Reporting
Online Analytical Processing (OLAP)
Statistical Analysis
Decision Support Systems
Forecasting
Data Mining
Dashboarding
16. II.2 What is BI?
BI Model
BI Models are based on:
Key Performance Indicators (KPI)
Multi-dimensional analysis
17. II.2 What is BI?
BI Model KPI: Key performance indicators
A KPI is a statistical measure used to quantify
objectives to reflect the strategic performance of
an organization
A KPI is used in BI to assess the present state of
business and to prescribe the course of action
KPIs are frequently used to “value”
difficult-to-measure activities, such as:
Benefits of leadership development
Engagement
Service
Satisfaction
KPIs are typically tied to an organization's
strategy
18. II.2 What is BI?
BI Model KPI - Key performance indicators (Continued)
KPIs differ depending on the nature of the organization
and the organization's strategy
A KPI is a key part of a measurable objective, which is
made up of a direction, KPI, benchmark, target and
timeframe
For example: "Increase Average Revenue per Customer
from $10 to $15 by EOY 2008"
Where 'Average Revenue Per Customer' is the KPI
KPI should not be confused with a Critical Success
Factor
For the example above, a critical success factor would be
something that needs to be in place to achieve that
objective; e.g. “launching a new product”
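The five parts of a measurable objective named on this slide can be written down as a small record. The sketch below is only an illustration in Python: the class name `MeasurableObjective`, its field names, and the `progress` helper are assumptions made for this example and are not part of any BI product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurableObjective:
    """Hypothetical record holding the five parts of a measurable objective."""
    direction: str    # e.g. "Increase" or "Decrease"
    kpi: str          # the indicator being measured
    benchmark: float  # starting value
    target: float     # value to reach
    timeframe: str    # deadline for reaching the target

    def describe(self) -> str:
        # Rebuild the sentence form used on the slide.
        return (f"{self.direction} {self.kpi} from ${self.benchmark:g} "
                f"to ${self.target:g} by {self.timeframe}")

    def progress(self, current: float) -> float:
        # Fraction of the benchmark-to-target gap closed so far.
        return (current - self.benchmark) / (self.target - self.benchmark)


objective = MeasurableObjective(
    direction="Increase",
    kpi="Average Revenue per Customer",
    benchmark=10,
    target=15,
    timeframe="EOY 2008",
)
print(objective.describe())
print(objective.progress(12.5))
```

A critical success factor such as “launching a new product” would sit outside this record: it is a precondition for reaching the target, not a measurement.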
19. II.2 What is BI?
BI Model Multi-Dimensional Data (Cube)
Sales of Laptop in India during Fall season
[Figure: data cube with three dimensions: Season (Spring, Summer, Fall, Winter), Product (PC, Laptop, PDA) and Branch (India, Australia, USA). The cube also carries Product-Branch, Season-Branch and Product-Season totals, Product, Branch and Seasonal totals, and a GRAND TOTAL]
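The cube on this slide can be imitated in a few lines of Python. This is a rough sketch, not how an OLAP engine stores data: the sales figures are made up, and the `build_cube` function and the `ALL` marker are names chosen for this example. Each fact row is added to its own cell and to every roll-up cell, which yields the pairwise totals, the single-dimension totals and the grand total.

```python
from collections import defaultdict
from itertools import product

# Made-up fact rows: (season, product, branch, units sold).
SALES = [
    ("Fall", "Laptop", "India", 120),
    ("Fall", "Laptop", "USA", 200),
    ("Fall", "PC", "India", 80),
    ("Spring", "Laptop", "India", 90),
    ("Winter", "PDA", "Australia", 40),
]

ALL = "Total"  # marker meaning "rolled up over this dimension"


def build_cube(rows):
    """Pre-aggregate every combination of (season, product, branch)."""
    cube = defaultdict(int)
    for season, prod, branch, units in rows:
        # Each fact feeds 2**3 = 8 cells: the detail cell plus all roll-ups.
        for key in product((season, ALL), (prod, ALL), (branch, ALL)):
            cube[key] += units
    return cube


cube = build_cube(SALES)
print(cube[("Fall", "Laptop", "India")])  # a single cell
print(cube[("Fall", ALL, "India")])       # Season-Branch total
print(cube[(ALL, "Laptop", ALL)])         # Product total
print(cube[(ALL, ALL, ALL)])              # grand total
```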
20. II.2 What is BI?
Goals
BI acts as binoculars that ensure
management isn’t blindsided
The primary goals of BI are:
Avoid surprises
Identify threats and opportunities
Understand where your company is vulnerable
Decrease reaction time
Out-think the competition
Protect intellectual capital
21. II.3 BI Environment & Business Flow
Journey from Data to Wisdom
According to Russell Ackoff, a systems theorist and professor of organizational change,
the content of the human mind can be classified into five categories:
Data: symbols, raw facts
Information: data that are processed to be useful; provides
answers to “who”, “what”, “where”, and “when” questions
Knowledge: relevant and actionable data and information;
answers “how” questions
Understanding: appreciation of “why”; the difference between
understanding and knowledge is like the difference between
learning and memorizing
Wisdom: evaluated understanding, by which we can judge
between right and wrong, between good and bad
“Wisdom is not a product of schooling, but of
the lifelong attempt to acquire it” Albert Einstein
22. II.2 What is BI?
Journey from Data to Wisdom (Continued)
BI is comprised of a variety of types of information that can range from
being fairly easy to acquire to being very difficult to acquire
Information on the left side is often available through secondary research
using online databases or the Internet
Information on the right side is typically only available through
primary research interviews
[Figure: spectrum of information sources. Labels on the figure include: Annual Reports, Product Literature, Dictionary Listings, D&B, ADS, DOW, Trade Press, Local Press, Street Pricing, Product Teardown, Customer Satisfaction, Interviews, Mix Emphasis, Volumes & Capacities, Sales Strategy, Research Programs, Material Costs, Tooling Decisions, Ultimate Pricing, What if?, Marketing Strategy, New Product Intro Plans, Division Succession Plans]
23. II.3 BI Environment & Business Flow
BI Architecture
[Figure: BI architecture in five stages]
1. OLTP Systems
2. Analytical Infrastructure
3. Data Management
4. BI Tools
5. Output: Performance Management
Components shown: Enterprise Applications, SQL Server, Oracle, Main Frame, Data Warehouse, Data Marts, Cubes
24. II.3 BI Environment & Business Flow
Challenges
Business Executive: Application freedom and unlimited access to data
CIO: Heterogeneous data sources
Building an information infrastructure that is accessible & can integrate data from all application servers
Providing a single transparent interface to those applications
Business Executive: Information systems in synch with business processes
CIO: A system infrastructure that dynamically allocates system resources to guarantee that business priorities are met
Ensuring that service level agreements are constantly honored
Appropriate volume of work is consistently produced
25. II.3 BI Environment & Business Flow
Challenges (Continued)
Business Executive: Low purchase cost
CIO: Total cost of ownership
Skill shortages and rising cost of workforce
Business Executive: Access to all the data all the time
CIO: Availability of data
Multi-tiered and multi-vendor solutions
Business Executive: Ability to transform information into actions
CIO: Real-time updates to operational data stores and data warehouses
No interruption of end user’s access
26. II.4 BI Implementation
BI Implementation
BI Implementation is a large process that involves:
Business Models
Data Models
Data sources
ETL
Tools
Then it transforms and organizes the data into:
Useful Information
Target Data warehouse
Data marts
OLAP analysis
Reporting Tools
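The flow on this slide, from data sources through ETL into a target store that reporting can query, can be sketched as follows. This is a minimal illustration and not any particular BI product's API: the source rows, the table name `sales_fact` and the function names are all hypothetical, and Python's built-in `sqlite3` module stands in for the target data warehouse.

```python
import sqlite3

# Hypothetical rows as they might arrive from an operational system:
# inconsistent casing, stray whitespace, amounts stored as text.
SOURCE_ROWS = [
    {"product": " laptop ", "branch": "india", "amount": "1200.50"},
    {"product": "PC", "branch": "USA", "amount": "800"},
    {"product": "Laptop", "branch": "India", "amount": "300"},
]


def extract():
    """Extract: read raw rows from the (simulated) source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: standardize names and convert amounts to numbers."""
    return [
        (row["product"].strip().upper(),
         row["branch"].strip().upper(),
         float(row["amount"]))
        for row in rows
    ]


def load(conn, rows):
    """Load: append the cleaned rows to the target fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(product TEXT, branch TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(conn, transform(extract()))

# A simple report on top of the loaded data.
report = conn.execute(
    "SELECT product, SUM(amount) FROM sales_fact "
    "GROUP BY product ORDER BY product"
).fetchall()
print(report)
```

Without the transform step, " laptop " and "Laptop" would be reported as two different products, which is the kind of inconsistency ETL is meant to remove before analysis.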
27. II.4 BI Implementation
Requirements for setting up a BI environment
Intelligence environment relies on
Tools
Techniques
Processes
Skilled business people
29. III.1 Open-Source
Definition
Open-Source software is
customer-constructed software that:
Comes with the source code
Is modifiable
Is resalable
Open-Source is like a stone
thrown into a pond;
the ripples spread outwards,
even if you can no longer see the
stone that caused them
30. III.2 Open-Source
Regulations
Free Redistribution
Source Code
Derived Works (Modification and redistribution)
No Discrimination Against Persons or Groups
No Discrimination Against Fields of Endeavor
License Must Not Be Specific to a Product
License Must Not Restrict
Other Software
License Must Be
Technology-Neutral
31. III.3 Open-Source
Benefits
Lower software costs
More flexibility
More reliable products
Better standardization and long term
stability
Not reliant on a single vendor
Faster pace of innovation
New projects can build on the existing
base of Open-Source code
Peer review increases security for systems
exposed to public networks
32. III.4 Open-Source
Risks
Open-Source projects may fail
Open-Source projects are not deadline
driven
There are some application areas where
the economics don't make sense
Open-Source software is not as well
established as proprietary software
Open-Source software is unproven
for non-technical applications
33. III.5 Open-Source
Why should we use it?
The main reasons are:
Internet as a key enabler for development & distribution of open-source
Linux & Apache (popularity & market share of 15%)
Changes in proprietary software pricing
Shortcomings of proprietary solutions
34. III.5 Open-Source
Shortcomings of Proprietary Solutions
Price
Usability
Lack of skills transfer
Customization
Tool-set Orientation
Lack of implementation methodologies
Extensibility
Reporting and analysis focus
Tracking and Auditing
Prototyping
[Slide text overlaps in the source; the fragments below are the recoverable portions]
Customers did not buy the software, they paid for the right to use it. This is like getting a lease on a car but making all the payments on day one: it’s the worst of both worlds
Who got the report? What action did they take? How long did it take? Was a business process initiated as a result? How far along is that process? What is the performance of the process?
Significant financial outlay upfront: the contractual agreements must be signed before the evaluation and prototyping can be done
35. III.6 Open-Source
L.A.M.P
The acronym LAMP refers to a set of free
software programs commonly used
together to run dynamic web sites or
servers:
Linux, the operating system
Apache, the Web server
MySQL, the database management system
Perl, PHP, Python, the scripting/programming
languages
To be precise, it is an
Open-Source Web platform
36. IV. BI Products
IV.1 Open-Source BI Products
IV.1.A Pentaho
IV.1.B BEE Project
IV.1.C Bizgres
IV.1.D MARVELit
IV.1.E Open I
IV.1.F SpagoBI
IV.1.G JasperSoft
IV.1.H Firebird
IV.1.I MySQL
IV.1.J PostgreSQL
IV.2 Proprietary BI Products
IV.2.A Microsoft
IV.2.B SAS
IV.2.C Cognos
IV.2.D Hyperion
IV.2.E Panorama
IV.2.F Prophix
IV.2.G Targit
IV.2.H TM1
37. IV.1 Open-Source BI Products
Pentaho
Process-Centric:
Processes can be easily customized and new processes can be added
Solution-Oriented:
Enables companies to develop complete solutions to business
intelligence problems
The platform consists of:
BI framework:
Provides logging, auditing, security, scheduling, ETL, Web services,
attribute repository, rules engines
BI component:
Includes reporting, analysis, workflow, dashboards, Data mining
BI workbench:
A set of design and administration tools that allows business analysts or
developers to create reports, dashboards, analysis, models, business
rules
Desktop Inboxes:
Third-party RSS readers
39. IV.1 Open-Source BI Products
Pentaho Architecture
Briefly, we can say Pentaho includes:
Server
BI work bench
Inbox alerter
40. IV.1 Open-Source BI Products
Pentaho Architecture - Server
Server is made up of
BI framework
BI components
The server runs inside a J2EE Web server
such as:
Apache
Oracle
WebLogic
JBoss
Websphere
In Pentaho, component content can be
retrieved as XML or HTML
41. IV.1 Open-Source BI Products
Pentaho Architecture - Server
The Pentaho Server includes embedded
repositories that store the data necessary
to define, execute and audit a solution
Solution repository: The meta data that defines
solutions
Runtime repository: Items of work that the workflow
engine
Audit repository: Tracking and auditing information
42. IV.1 Open-Source BI Products
Pentaho Architecture
External applications (data warehouse / data mart, using an Open-Source
ETL tool)
The services of the BI framework (Web services, Workflow engine)
The Pentaho Server includes:
Reporting
Workflow
Business rules
Dashboards / analysis
Web services
Scheduling
The Pentaho BI platform provides system monitoring via SNMP
(Simple Network Management Protocol)
The repositories are stored inside an RDBMS that is outside of
the Pentaho platform (Firebird (preferred), MySQL, Oracle, SQL Server or
DB2)
The Desktop Alerter is an application that provides alerts in
RSS format
43. IV.1 Open-Source BI Products
Pentaho Architecture – Inbox Alerter
Pentaho- Inbox Alerter
The optional inbox alerter is an agent that
needs to be installed on the machines of the
users that wish to take advantage of its
functionality
It has these features:
Notification of new workflow tasks
Notification of report delivery
Management of Off-line content
44. IV.1 Open-Source BI Products
Pentaho Architecture – Work Bench
Pentaho- BI Work Bench (Continued)
Analysis
Enables ad-hoc, interactive data exploration with
the ability to slice-and-dice, drill-down, and pivot
information
Includes highly graphical front-end to OLAP
cubes for automated
aggregation and
speed-of-thought
response times
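The three operations named on this slide (slice-and-dice, drill-down and pivot) can be illustrated on a small in-memory fact table. The sketch below is plain Python with made-up figures. The `aggregate` and `slice_` helpers are names invented for this example and say nothing about how Pentaho's analysis front-end is implemented.

```python
from collections import defaultdict

# Made-up fact rows: (year, quarter, product, branch, units).
FIELDS = ("year", "quarter", "product", "branch", "units")
FACTS = [
    (2006, "Q1", "Laptop", "India", 50),
    (2006, "Q1", "PC", "India", 30),
    (2006, "Q2", "Laptop", "India", 70),
    (2006, "Q2", "Laptop", "USA", 90),
]


def aggregate(rows, *dims):
    """Sum units grouped by the named dimensions."""
    idx = [FIELDS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(rows, **fixed):
    """Keep only rows where each named dimension has the given value."""
    return [
        r for r in rows
        if all(r[FIELDS.index(k)] == v for k, v in fixed.items())
    ]


by_year = aggregate(FACTS, "year")                # summary level
by_quarter = aggregate(FACTS, "year", "quarter")  # drill-down: year to quarter
india = aggregate(slice_(FACTS, branch="India"), "product")  # slice on branch
pivot = aggregate(FACTS, "product", "branch")     # pivot: product by branch

print(by_year)
print(by_quarter)
print(india)
print(pivot)
```

An OLAP cube gets its fast response times by computing aggregations like these ahead of time instead of scanning the fact rows on every request.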
45. IV.1 Open-Source BI Products
Pentaho Architecture – Work Bench
Pentaho- BI Work Bench
It provides easy to use design tools for
reports, dashboards, analytic views
Reporting
From simple reports on a web page
to high quality production reporting
for applications such as financial statements
and other formal reporting needs
Enterprise-class features include
automated bursting of reports tailored
by role, parameter-driven filtering,
and a server-based report repository
46. IV.1 Open-Source BI Products
Pentaho Architecture – Work Bench
Pentaho- BI Work Bench (Continued)
Dashboards
Brings together reports, analyses, and other
displays into a single graphical place for easy
access
Can be customized by
person, business role,
and/or subject matter
47. IV.1 Open-Source BI Products
Pentaho Architecture – Work Bench
Pentaho- BI Work Bench (Continued)
Data mining console for data preparation
Uncovers hidden relationships in data which can
be used to optimize business processes and
predict future results
Provides a full range of advanced data mining
algorithms
Enables results
to be displayed to
users in an
easy-to-understand
format
48. IV.1 Open-Source BI Products
How Pentaho solves the problem
It integrates: Design & administration tools
Workflow
Analysis tools
Business rules
Dashboards
Information delivery
Data warehouse
Notification
Data mining
Scheduling
Inbox alerter
Auditing
Application Integration
Content navigation
User Interfaces
Reporting tools
49. Summary
Importance of BI
As a competitive advantage
Business Intelligence
A systematic process that collects, analyzes and organizes the flow of critical information
As a binocular which ensures management isn’t blindsided
The right information at the right time
Assists all levels of people in the org. in making strategic, tactical and operational DECISIONs
Single version of the TRUTH
BI features
Analytical Tools
Data Warehousing
OLAP
ETL Tools
Dashboarding
Flexible Access
Reports
Open-Source benefits
Lower software costs
More flexibility
More reliable products
Better standardization and long term stability
Not reliant on a single vendor
L.A.M.P: web platform
BI Product
Process-Centric
Solution-Oriented
Architecture: Server, BI work bench, Inbox alerter
Workflow designing
50. References
Magazines
Data Quest Magazine - Issued: May 31, 2006
World Business Magazine - Issued: June 5, 2006
Articles
Business Intelligence, by Elizabeth Vitt, Michael Luckevich, Stacia Misner
Sun Microsystems - Business Intelligence and Data Warehousing - Transform raw data into business results
IBM Systems Journal: The integration of business intelligence and knowledge management, by W. F. Cody, J. T. Kreulen, V. Krishna, and W. S. Spangler
Migration to Open-Source Databases, by Jutta Horstmann
Moving to Strategic Business Intelligence, Butler Group, Mar. 1, 2006
The Business Value of Business Intelligence, by Steve Williams, Nancy Williams
Business Intelligence, why?, by James H. Thomas Jr.
IBM - Business Intelligence Architecture on S/390, by Viviane Anavi-Chaput, Patrick Bossman, Robert Catterall, Kjell Hansson, Vicki Hicks, Ravi Kumar, Jongmin Son
Data, Information, Knowledge, and Wisdom, by Gene Bellinger, Durval Castro, Anthony Mills
Web
http://www.Bee.insightstrategy.cz
http://www.CaMagazine.com
http://www.180Systems.com
http://www.DestinationCRM.com
http://www.DMReview.com
http://www.LearnBI.com
http://www.Wikipedia.com
http://www.Oreillynet.com
http://www.OpenSource.org
http://www.Openi.sourceforge.net
http://www.pentaho.com
http://www.sas.com
http://www.hyperion.com
http://www.cognos.com
Thank You For Your Time And Attention
Questions ?
Editor's notes
This report is based on our research via the Web and magazines. First, we give a brief introduction covering the objectives of this report and today’s business environment. Then we describe the basic concepts and features of BI technology. Next we look at the Open-Source concept, its importance, and its effects on BI products and tools. In the fourth section we introduce the best-known BI products, focusing on “Pentaho” as a complete BI suite, to explore how a BI product solves organizational problems. In the final section we conclude our discussion.
There are two approaches to this project. In the first approach, used in the hard-copy report, our focus is on BI products. Products are divided into two categories, Open-Source and proprietary. Each product is discussed in detail, and a side-by-side comparison makes it easy to choose between them based on organizational requirements. In the second approach, used for this presentation because the audience has different backgrounds in IT and there is limited time to evaluate each and every product, our focus is more on basic BI concepts and their importance, and on Open-Source advantages. We have selected Pentaho as the most powerful product in the BI environment, and we use it to understand more about a BI product, its features, abilities and architecture.
It should be mentioned that this report does not imply that organizations are guaranteed success by using only these tools to improve overall corporate performance. An assessment of an organization's unique needs is essential to gaining the most benefit from any technology.
Businesses today are faced with a highly competitive marketplace, where technology is moving at an unprecedented pace and customers’ demands are changing just as quickly. Understanding customers, rather than markets, is recognized as the key to success. Industry leaders are quickly moving from a product-centric world into a customer-centric world. Information technology is taking on a new level of importance due to its business intelligence application solutions. The goal is to be more competitive
Imagine being a passenger on an airplane when the pilot suddenly announces that the airplane has lost all communications with air traffic control as well as on-board radar. In other words, the pilot has no way to understand the flight environment, including other airliners and potentially hazardous weather. Would it make you feel better if the pilot assured you that there’s nothing to worry about because he’s an experienced pilot and has flown the same route many times? It shouldn’t, yet many business leaders make decisions daily without an operational business radar: a reliable business intelligence system. It doesn’t matter if the plane is large or small; the pilot must know the environment in which the plane is flying.
Sears, Roebuck and Company is a mid-range chain of international department stores, founded by Richard Sears and Alvah Roebuck. Sears merged with Kmart in early 2005, creating the Sears Holdings Corporation. The company competes on an average price level on par with J.C. Penney. Sears has also recently competed with Belk, Dillard's, and Macy's. However, the company competes below Bloomingdale's, Neiman Marcus, Nordstrom and Saks Fifth Avenue.
James H. Thomas Jr. is a market intelligence consultant who served for 26 years as a federal intelligence officer. He is managing director of the J Thomas Group Inc., specializing in the development of strategic business intelligence and counterintelligence systems. He developed the business intelligence system for three Fortune 100 companies and numerous smaller companies. e-Mail: jt-group@mindspring.com; Website: www.mindspring.com/~jt-group/default.htm.
Identifying indicators
Performance indicators differ with business drivers and aims (or goals). A school might consider the graduation rate of its students as a Key Performance Indicator which might help the school understand its position in the educational community, whereas a business might consider the percentage of income from return customers as a potential KPI. But it is necessary for an organization to at least identify its KPIs. The key conditions before properly identifying KPIs are:
Having a pre-defined business process.
Having clear goals/performance requirements for the business processes.
Having a quantitative/qualitative measurement of the results and comparison with set goals.
Investigating variances and tweaking processes or resources to achieve long-term goals.
Categorization of indicators
Key Performance Indicators define a set of values used to measure against. These raw sets of values fed to systems to summarize information against are called indicators. Indicators identifiable as possible candidates for KPIs can be summarized into the following sub-categories:
Quantitative indicators, which can be presented as a number.
Practical indicators, which interface with existing company processes.
Directional indicators, which specify whether an organization is getting better or not.
Actionable indicators, which are sufficiently in an organization's control to effect change.
Ackoff indicates that the first four categories relate to the past; they deal with what has been or what is known. Only the fifth category, wisdom, deals with the future because it incorporates vision and design. With wisdom, people can create the future rather than just grasp the present and past. But achieving wisdom isn't easy; people must move successively through the other categories.
Knowledge ... knowledge is the appropriate collection of information, such that its intent is to be useful. Knowledge is a deterministic process. When someone "memorizes" information (as less-aspiring test-bound students often do), then they have amassed knowledge. This knowledge has useful meaning to them, but it does not provide for, in and of itself, an integration such as would infer further knowledge. For example, elementary school children memorize, or amass knowledge of, the "times table". They can tell you that "2 x 2 = 4" because they have amassed that knowledge (it being included in the times table). But when asked what is "1267 x 300", they cannot respond correctly because that entry is not in their times table. To correctly answer such a question requires a true cognitive and analytical ability that is only encompassed in the next level... understanding. In computer parlance, most of the applications we use (modeling, simulation, etc.) exercise some type of stored knowledge.
Understanding ... understanding is an interpolative and probabilistic process. It is cognitive and analytical. It is the process by which I can take knowledge and synthesize new knowledge from the previously held knowledge. The difference between understanding and knowledge is the difference between "learning" and "memorizing". People who have understanding can undertake useful actions because they can synthesize new knowledge, or in some cases, at least new information, from what is previously known (and understood). That is, understanding can build upon currently held information, knowledge and understanding itself. In computer parlance, AI systems possess understanding in the sense that they are able to synthesize new knowledge from previously stored information and knowledge.
Wisdom ... wisdom is an extrapolative and non-deterministic, non-probabilistic process. It calls upon all the previous levels of consciousness, and specifically upon special types of human programming (moral, ethical codes, etc.). It beckons to give us understanding about which there has previously been no understanding, and in doing so, goes far beyond understanding itself. It is the essence of philosophical probing. Unlike the previous four levels, it asks questions to which there is no (easily-achievable) answer, and in some cases, to which there can be no humanly-known answer period. Wisdom is therefore the process by which we also discern, or judge, between right and wrong, good and bad. I personally believe that computers do not have, and will never have, the ability to possess wisdom. Wisdom is a uniquely human state, or as I see it, wisdom requires one to have a soul, for it resides as much in the heart as in the mind. And a soul is something machines will never possess (or perhaps I should reword that to say, a soul is something that, in general, will never possess a machine).
Data represents a fact or statement of event without relation to other things. Ex: It is raining.
Information embodies the understanding of a relationship of some sort, possibly cause and effect. Ex: The temperature dropped 15 degrees and then it started raining.
Knowledge represents a pattern that connects and generally provides a high level of predictability as to what is described or what will happen next. Ex: If the humidity is very high and the temperature drops substantially, the atmosphere is often unlikely to be able to hold the moisture, so it rains.
Wisdom embodies more of an understanding of fundamental principles embodied within the knowledge that are essentially the basis for the knowledge being what it is. Wisdom is essentially systemic. Ex: It rains because it rains. And this encompasses an understanding of all the interactions that happen between raining, evaporation, air currents, temperature gradients, changes, and raining.
Now consider the following: I have a box. The box is 3' wide, 3' deep, and 6' high. The box is very heavy. The box has a door on the front of it. When I open the box it has food in it. It is colder inside the box than it is outside. You usually find the box in the kitchen. There is a smaller compartment inside the box with ice in it. When you open the door the light comes on. What is it? A refrigerator. You knew that, right? At some point in the sequence you connected with the pattern and understood it was a description of a refrigerator. From that point on each statement only added confirmation to your understanding. If you lived in a society that had never seen a refrigerator you might still be scratching your head as to what the sequence of statements referred to. Also, realize that I could have provided you with the above statements in any order and still at some point the pattern would have connected. When the pattern connected, the sequence of statements represented knowledge to you. To me all the statements convey nothing, as they are simply 100% confirmation of what I already knew: I knew what I was describing even before I started.
Multiple operational systems often store data in different formats. Often, the transactional data does not provide a comprehensive view of the business environment and must be integrated with data from external sources such as industry reports and media data. Existing data in the operational data store is updated to reflect the current status of the source system. Typically, the data is stored in “real time” and used for day-to-day management of business operations.
• Data warehouse: A data warehouse (or an enterprise data warehouse) contains detailed and summarized data extracted from transaction processing systems and possibly other sources. The data is cleansed, transformed, integrated, and loaded into databases separate from the production databases. The data that flows into the data warehouse does not replace existing data; rather, it is accumulated to maintain historical data over a period of time. The historical data facilitates detailed analysis of business trends and can be used for decision making in multiple business units.
• Data mart: A data mart contains a subset of corporate data that is important to a particular business unit or a set of users. A data mart is usually defined by the functional scope of a given business problem within a business unit or set of users. It is created to help solve a particular business problem, such as customer attrition, loyalty, market share, issues with a retail catalog, or issues with suppliers. A data mart, however, does not facilitate analysis across multiple business units.
• Extract, transform and load (ETL) tools: These solutions are concerned with the collection of data from disparate systems (enterprise solutions across the business), the standardization of that data, and then the population of the data warehouse (DW).
• Data quality (DQ) tools: The usefulness of analysis of data from the DW depends on its quality. So-called 'dirty' data can significantly reduce the value of a CRM system; problems include duplicate records, incomplete records and inconsistent formatting of data from different sources. DQ tools are focused on addressing these issues.
• Data warehouses (DW): Acting as an enterprise-wide data depository, the DW should enable what has become widely referred to as the 'single customer view'. The single customer view represents the full range of information a business holds on its customers and their interactions with the company. It should be held in a standardized format and refreshed as appropriate for that company's needs.
• Business intelligence tools: Rather than attempt to create an exhaustive list of the different types of tool used to analyze data, Datamonitor defines the broad range as business intelligence tools. These may include online analytical processing (OLAP), data mining, reporting, dashboards, ad-hoc reporting and numerous other tools.
This range of technologies can be simplified further:
• Analytical infrastructure, which includes ETL and DQ tools;
• Data warehousing and data management tools;
• Business intelligence tools: the tools employed to analyse data collected by the first two components.
BI processes and tasks can be summarized as follows:
• Understand the business problem to be addressed
• Design the warehouse
• Learn how to extract source data and transform it for the warehouse
• Implement extract-transform-load (ETL) processes
• Load the warehouse, usually on a scheduled basis
• Connect users and provide them with tools
• Provide users a way to find the data of interest in the warehouse
• Leverage the data (use it) to provide information and business knowledge
• Administer all these processes
• Document all this information in meta-data
BI processes extract the appropriate data from operational systems. Data is then cleansed, transformed and structured for decision making. Then the data is loaded into a data warehouse and/or subsets are loaded into data mart(s) and made available to advanced analytical tools or applications for multidimensional analysis or data mining.
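The flow described above (extract from operational systems, cleanse and standardize, load into a warehouse that accumulates history, then derive a data mart) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the standard-library sqlite3 module stands in for the warehouse, and the table name, column names, sample records and data-quality rules are all invented for the example. None of them come from a product discussed in this report.

```python
import sqlite3

# Hypothetical rows from two operational systems with different formats.
CRM_ROWS = [
    {"customer": " Alice Smith ", "region": "north", "amount": "120.50"},
    {"customer": "Bob Jones", "region": "SOUTH", "amount": "80"},
    {"customer": "Bob Jones", "region": "SOUTH", "amount": "80"},  # duplicate
]
BILLING_ROWS = [
    ("carol white", "North", 200.0),
    ("dave black", None, 50.0),  # incomplete record: no region
]


def extract():
    """Collect rows from both sources into one common shape."""
    for row in CRM_ROWS:
        yield row["customer"], row["region"], row["amount"]
    for customer, region, amount in BILLING_ROWS:
        yield customer, region, amount


def transform(rows):
    """Standardize formats, drop incomplete rows and exact duplicates."""
    seen = set()
    for customer, region, amount in rows:
        if not customer or not region:
            continue  # illustrative data-quality rule: skip incomplete records
        clean = (customer.strip().title(), region.strip().upper(), float(amount))
        if clean in seen:
            continue  # illustrative data-quality rule: skip duplicates
        seen.add(clean)
        yield clean


def load(conn, rows, load_date):
    """Append rows to the warehouse; existing history is never overwritten."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(load_date TEXT, customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
        [(load_date, *row) for row in rows],
    )
    conn.commit()


def build_data_mart(conn, region):
    """A data mart here is simply a regional subset of the warehouse."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM sales_fact "
        "WHERE region = ? GROUP BY customer ORDER BY customer",
        (region,),
    )
    return cur.fetchall()


def run_etl(conn, load_date):
    """One scheduled run: extract, transform, then load."""
    load(conn, transform(extract()), load_date)
```

Calling run_etl(conn, "2006-11-01") on a connection from sqlite3.connect(":memory:") and then calling it again with a later date appends a second batch rather than replacing the first, which mirrors how a warehouse accumulates historical data. build_data_mart(conn, "NORTH") then returns only the subset relevant to one business unit.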
The business executive views the challenges of implementing effective business intelligence solutions differently than does the CIO, who must build the infrastructure and support the technology.
The business executive wants:
• Application freedom and unlimited access to data - the flexibility and freedom to utilize any application or tool on the desktop, whether developed internally or purchased off the shelf, and access to any and all of the many sources of data that are required to feed the business process, such as operational data, text and HTML data from e-mail and the internet, flat files from industry consultants, and audio and video from the media, without regard to the source or format of that data. And the executive wants that information accessible at all times.
The CIO’s challenge is:
• Connectivity and heterogeneous data sources - building an information infrastructure with a database technology that is accessible from all application servers and can integrate data from all data formats into a single transparent interface to those applications.
The business executive wants:
• Information systems in sync with business processes - information systems that can recognize the executive's business priorities and adjust automatically when those priorities change.
The CIO’s challenge is:
• Dynamic resource management/performance and throughput - a system infrastructure that dynamically allocates system resources to guarantee that business priorities are met, ensuring that service level agreements are constantly honored and that the appropriate volume of work is consistently produced.
The business executive wants:
• Low purchase cost. Today’s BI solutions are most often funded in large part or entirely by the business unit, and the focus at this level is on the cost of purchasing the solution and bringing it on line.
The CIO’s challenge is:
• Total cost of ownership. Skill shortages and the rising cost of the workforce, along with incentives to come in under budget, drive the CIO to leverage the infrastructure and skill investments already made.
The business executive wants:
• Access to all the data all the time/the ability to transform information into actions. Most e-business companies operate across multiple time zones and multiple nations. With decision makers in headquarters and regional offices around the world, BI systems must be on line 24x365. Furthermore, the goal of integrating customer relationship management with real-time transactions makes currency of data in the decision support systems critical.
The CIO’s challenge is:
• Availability/multi-tiered and multi-vendor solutions. Reliability and integrity of the hardware and software technology for decision support systems are as critical as those for transaction systems. Growing in importance is the need to do real-time updates to operational data stores and data warehouses without interrupting access by the end users.
Implementing BI is a long process and it requires a lot of analysis and investment. A typical BI environment involves business models, data models, data sources, ETL, tools needed to transform and organize the data into useful information, target data warehouse, data marts, OLAP analysis and reporting tools.
Setting up a Business Intelligence environment relies not only on tools, techniques and processes; it also requires skilled business people to carefully drive these in the right direction. Care should be taken in understanding the business requirements, setting the targets, analyzing and defining the various processes associated with them, determining what kind of data needs to be analyzed, determining the source and target for that data, defining how to integrate that data for BI analysis, and determining and gathering the tools and techniques to achieve this goal.
At its base, Open-Source software is software that comes with the source code in a form that customers can modify for their own needs and resell or give away to others under the same terms. Users of the software fund its development directly by either working on the software themselves or contracting someone to do it. This is the key to its success and why it is revolutionizing the software industry. A Linux vs. Windows face-off is the wrong way to think about Open-Source.
1. Free Redistribution: The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.
2. Source Code: The program must include source code, and must allow distribution in source code as well as compiled form.
3. Derived Works: The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
5. No Discrimination against Persons or Groups: The license must not discriminate against any person or group of persons.
6. No Discrimination against Fields of Endeavor: The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
8. License Must Not Be Specific to a Product: The rights attached to the program must not depend on the program's being part of a particular software distribution.
9. License Must Not Restrict Other Software: The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be Open-Source software.
10. License Must Be Technology-Neutral: No provision of the license may be predicated on any individual technology or style of interface.
One of the most exciting things about Open-Source is that it represents a huge shift of power from vendors to end users, who are not left without recourse if the original developer abandons the marketplace.
Open-Source projects may fail. Open-Source methods, like all software development methods, do not guarantee success. The key technical factors for success are the skills and dedication of the core developers and interface designers on the project. Open-Source projects can also fail for market reasons if they do not produce results faster than competitive projects. Although no statistics are available, the failure rate for new Open-Source projects is probably similar to the failure rate for new proprietary development projects. Canceled Open-Source projects leave a legacy of source code and ideas that can be merged into more successful efforts or recycled into other projects.
Open-Source projects are not deadline driven. With an open-ended development team, it is impossible to reliably predict release dates. This is not a problem when deploying finished works, but can be a problem if customers become dependent on anticipated future events. Customers can manage this risk by active participation in the Open-Source project concerned.
There are some application areas where the economics don't make sense. Where the number of users is small and they are in strong competition, the value of contributing to an Open-Source project is less clear.
Open-Source software is not as well established as proprietary software. Open-Source software has been available and growing in scope for decades, but there are still many application areas where Open-Source solutions are not yet available in final form. There are an extremely large number of active projects working to close this gap.
Open-Source software is also unfamiliar to many potential users. Individuals and corporations with UNIX experience have a wide range of Open-Source products that are familiar and available. Users habituated to other platforms have fewer Open-Source products available without changing operating systems and face more of a learning curve.
In the past, the press and market research organizations have not evaluated Open-Source alternatives to proprietary software. Background information on Open-Source software is only now being written.
Open-Source software is unproven for non-technical applications. Not surprisingly, the first successes of Open-Source software have been in areas where the users and developers are one and the same. Open-Source originated with developers who had unmet problems, needs or desires, wrote code for their own use (often in their spare time), and shared the results with other developers. Open-Source is now expanding into new areas and producing products for non-technical users, but this work is in its infancy.
But if Open-Source software isn't reliable enough to use, then neither is the Internet, because the Internet infrastructure relies heavily on Open-Source software. Every single internet address, both web and email, depends on the Domain Name System (DNS). At the heart of the DNS is an Open-Source program called BIND (Berkeley Internet Name Domain, previously Berkeley Internet Name Daemon), the most commonly used DNS server on the Internet, especially on Unix-like systems. It's also well known that the Open-Source Apache web server hosts more than 60% of the world's web sites, including many of the most heavily trafficked, such as Yahoo!, which runs on a network of more than 2000 FreeBSD-based machines running a modified version of the Apache web server.
The Internet is a key enabler for the development and distribution of Open-Source software. The rapid expansion of the Internet into business and the home has extended the reach of Open-Source and more widely publicized its benefits. Open-Source products have been available for years and used extensively in the UNIX world. The creation of Open-Source operating systems such as Linux has "completed the circle", enabling complete systems and networks to be deployed entirely with Open-Source software. The popularity of Linux has generated new revenue for Open-Source vendors that is now being used to expand development efforts. The unbundling of support from products makes the Open-Source business model more attractive to vendors and more familiar to customers. Linux is the only operating system other than the Microsoft Windows family with a growing market share. According to press reports quoting IDC and Datapro studies, Linux is now used by more than 14% of businesses and its market share is expected to overtake the Mac OS before 2010.
Many customers spend massive amounts of money on proprietary BI solutions in the hope that these software products will help them. However, commercial BI solutions are consistently criticized in the following areas:
Price: The price, maintenance costs, support, and services are too expensive.
Usability: Too difficult to use for most users.
Skills: Lack of adequate skills transfer from vendor to customer. Lack of implementation methodologies.
Customization: Too difficult for customers to develop solutions and integrate business rules.
Tool-Set orientation: The ‘solutions’ are tool sets and not solutions at all.
Extensibility: The solutions are proprietary, and it is impossible for customers and aftermarket suppliers to extend and direct the system. Customers did not buy the software; they paid upfront for the right to use it. This is like getting a lease on a car but making all the payments on day one: it’s the worst of both worlds.
Reporting and analysis focus: The solutions are focused on the reporting and analysis of KPIs, and ignore the performance of the processes that affect the metric.
Process influence: They cannot ensure that changes are driven in a business process. They assume that the delivery of a report will have the side effect of influencing a business process.
Tracking and Auditing: They are unable to provide complete tracking and auditing. Who got the report? What action did they take? How long did it take? Was a business process initiated as a result? How far along is that process? What is the performance of the process?
Prototyping: The software pricing models do not support the prototyping phases necessary to ensure the success of Business Intelligence projects. Significant financial outlay is required and contractual agreements must be signed before full evaluation and prototyping can be done.
The Pentaho BI Platform is different from traditional BI products. It is a process-centric, solution-oriented framework with Business Intelligence (BI) components that enable companies to develop complete solutions to Business Intelligence problems.
The BI Platform is process-centric because the central controller is a workflow engine. The workflow engine uses process definitions to define the Business Intelligence processes that execute within the BI Platform. The processes can be easily customized and new processes can be added. The BI Platform includes components and reports for analyzing the performance of these processes.
The BI Platform is solution-oriented because the operations of the Platform are specified in process definitions and action documents that specify every activity. These processes and operations collectively define the solution to a Business Intelligence problem. This BI Solution can be easily integrated into business processes that are external to the Platform. The definition of a Solution can contain any number of processes and operations.
The Platform consists of a BI Framework, BI Components, a BI Workbench, and desktop Inboxes:
• The BI Framework provides logging, auditing, security, scheduling, ETL, web services, attribute repository and rules engines.
• The BI Components include reporting, analysis, workflow, dashboards, and data mining.
• The BI Workbench is a set of design and administration tools that are integrated into the popular Eclipse environment. These tools allow business analysts or developers to create reports, dashboards, analysis models, business rules, and BI processes.
• The desktop Inboxes can be third-party RSS readers or the Pentaho Inbox Alerter. The Inboxes deliver tasks and report / exception notifications.
The BI Framework and BI Components form the Pentaho Server. BI Solutions are designed using the BI Workbench and deployed to the Pentaho Server.
The Pentaho Server is the runtime engine, driven by the workflow engine, which coordinates the execution and communication between all the BI Components. The architecture is a combination of original source code and mature Open-Source components that have been integrated to form a complete, scalable, sophisticated BI Platform.
The Pentaho BI Platform is built upon a foundation of servers, engines, and components. These provide the J2EE server, security, portal, workflow, rules engines, charting, collaboration, content management, data integration, analysis, and modeling features of the system. Many of these components are standards-based and can be replaced with other products.
To create a truly integrated, single-source solution, Pentaho adds the following:
• Common metadata in the form of solution definition documents
• Common user interfaces and user interface components
• Security
• Email and desktop notifications
• Installation, integration and validation of all components
• Sample solutions
• Application connectors
• Usage and diagnostic tools
• Design tools
• Customization and configuration
• Process performance analysis reports and ‘what-if’ modeling
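To make the "process-centric" idea more concrete, the following is a toy sketch of a workflow engine that reads a process definition, runs each named action through a registered component, and records an audit entry for every step. It is not Pentaho's API or file format: the class names, action names, sample data and e-mail address are all hypothetical and exist only to illustrate the pattern of a central controller coordinating components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical component type: takes the shared context and returns it updated.
Component = Callable[[dict], dict]


@dataclass
class ProcessDefinition:
    """A named, ordered list of actions (a stand-in for a process definition)."""
    name: str
    actions: List[str]


@dataclass
class WorkflowEngine:
    """Toy central controller that runs actions and audits every step."""
    components: Dict[str, Component] = field(default_factory=dict)
    audit_log: List[str] = field(default_factory=list)

    def register(self, action: str, component: Component) -> None:
        self.components[action] = component

    def run(self, process: ProcessDefinition, context: dict) -> dict:
        for action in process.actions:
            self.audit_log.append(f"{process.name}:{action}:start")
            context = self.components[action](context)
            self.audit_log.append(f"{process.name}:{action}:done")
        return context


def query_sales(ctx: dict) -> dict:
    # Illustrative data only; a real component would query the warehouse.
    return {**ctx, "sales": [120, 80, 200]}


def build_report(ctx: dict) -> dict:
    return {**ctx, "report": f"Total sales: {sum(ctx['sales'])}"}


def notify(ctx: dict) -> dict:
    return {**ctx, "notified": [ctx["recipient"]]}


engine = WorkflowEngine()
engine.register("query", query_sales)
engine.register("report", build_report)
engine.register("notify", notify)

# Changing the process means editing this definition, not the engine.
monthly = ProcessDefinition("monthly_sales", ["query", "report", "notify"])
result = engine.run(monthly, {"recipient": "manager@example.com"})
```

Because the engine sees every step, the audit log can answer questions such as which actions ran and in what order, which is the kind of tracking that report-only tools are criticized for lacking.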
The BI Platform is integrated with external applications that provide data to drive the solutions. This data is loaded into a data warehouse or data mart using an Open-Source ETL tool. The Solution Engine is central to the architecture and manages access to the BI components.
The services of the BI Framework:
• Provide web services to external applications
• Have access to the same Solution Engine as the user interface components
• Are called by the workflow engine and scheduler to execute system actions
The Server includes the components and technologies required to build a Business Intelligence solution: reporting, workflow, business rules, dashboards/analysis, web services, scheduling, a mix of convenient web and desktop user interfaces, and auditing. The Pentaho BI Platform provides system monitoring via Simple Network Management Protocol (SNMP).
The repositories are stored inside an RDBMS that is outside of the Pentaho platform. The embedded repositories in the preconfigured installation are stored inside an Open-Source database, either Firebird (preferred) or MySQL. These repositories can be replaced with other relational databases such as Oracle, SQL Server or DB2 if required.
The Desktop Alerter is an application that provides alerts in RSS format when new workflow tasks are assigned to a user or reports are made available to that user. This application must be installed on the computer of every user who needs to use it.
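Since the alerts are delivered in RSS format, any RSS reader can act as an inbox. As a rough illustration of what such a feed could look like, the sketch below builds a small RSS 2.0 document with Python's standard xml.etree.ElementTree module. The function name, user name, alert texts and URL are placeholders invented for this example; the actual feed layout produced by the Pentaho platform is not shown in this report and may differ.

```python
import xml.etree.ElementTree as ET


def build_alert_feed(user, alerts):
    """Return an RSS 2.0 document (as a string) listing alerts for one user."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 channels carry a title, a link and a description.
    ET.SubElement(channel, "title").text = f"BI alerts for {user}"
    ET.SubElement(channel, "link").text = "http://example.com/alerts"  # placeholder URL
    ET.SubElement(channel, "description").text = "New workflow tasks and reports"
    for title, detail in alerts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "description").text = detail
    return ET.tostring(rss, encoding="unicode")


# Hypothetical alerts: one assigned task and one newly available report.
feed = build_alert_feed(
    "analyst1",
    [
        ("New task", "Approve the Q3 budget variance report"),
        ("Report ready", "Monthly sales report is available"),
    ],
)
```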
Importance of BI
As a competitive advantage
As a binocular which ensures management isn’t blindsided
Access the right information at the right time
Single version of the TRUTH
Business Intelligence
A systematic process that collects, analyzes and organizes the flow of critical information
Assists all levels of people in the organization in making strategic, tactical and operational DECISIONS
BI features
Analytical Tools
Data Warehousing
OLAP
ETL Tools
Dashboarding
Flexible Reports
Workflow designing
Open-Source benefits
Lower software costs
More flexibility
More reliable products
Better standardization and long term stability
Not reliant on a single vendor
L.A.M.P: web platform
BI Product
Process-Centric
Solution-Oriented
Architecture
Server
BI workbench
Inbox alerter