Individual assignment 3328773

INFOMGMT 393 | Data Mining & Decision Support Systems

Individual Assignment | Semester One, 2008

Global Stationery Supplies

Document Contents:

Task One
Task Two
References

Jess Maher | 3328773 | jmah021


Task One

Data Warehousing

Implementing a data warehouse within Global Stationery Supplies (GSS) is largely feasible because a data
warehouse is a ‘physical repository where relational data are specially organised to provide
enterprise wide, cleansed data in an integrated, standardised format which is multidimensional’ (Turban,
Aronson, Liang & Sharda, 2007). Because GSS operates globally, information collected from a number of
physically dispersed locations must be used together, which causes difficulties due to varying time
zones, currencies, languages, laws and taxes, holidays, policies and practices. A data warehouse would give
the organisation an enterprise perspective of its data, allowing otherwise redundant data to be used more
richly for decision making.

GSS’s strong commitment to upholding its reputation for value for money and speedy, reliable service
reinforces the importance of the role and responsibility held by the Procurement team within the
organisation. For the Procurement team to make effective use of its established connections, which are often
held directly with producing factories, and to supply stock to the required locations on a “just-in-time” basis,
effective communication and interaction must occur between the Procurement, Logistics and Ordering teams.
Because GSS operates its stationery procurement service on a “just-in-time” basis, paperwork, ordering,
logistics, procurement and delivery all need to be processed and forecast through an integrated information
source. There is also a clear need for information to be transferred between the stores and warehouses and
the payments and invoicing teams, which may prove difficult when dealing with different locations, recording
procedures or time zones.

According to Ariyachandra & Watson (2005) (as cited in Turban et al.), the key factors that potentially
affect the architecture selection decision include ‘information independence between units, upper management’s
information needs and the nature of the end users’ tasks’ (2007, pp?). GSS, as described, requires an
enterprise-wide perspective of the information it collects, which in turn requires free flows of information.
A number of data warehouse structures and designs could potentially be used within GSS; those likely to be
the most beneficial are explored in more detail below.

Data Mart Centric
Data marts contain lightly and highly summarised data and metadata, and are generally easy to construct both
organisationally and technically (Turban et al., 2007; Hsieh & Lin, 2003). They use an architecture which
would enable GSS to extract information from sources within the business, transform the data (into a
consolidated, standardised format) and load it into a data warehouse which acts as a repository for current
and historical data of particular interest (Turban et al., 2007). They are, however, somewhat limited in their
application: they do not provide a business enterprise view and, as the data warehouse itself cannot create
more data, there is often a high cost of redundant data (Turban et al., 2007). This architecture may require
more maintenance than the alternatives, and in turn the cost of capturing and maintaining the data can be high
(Hsieh & Lin, 2003).
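
The extract, transform and load steps described above can be illustrated with a minimal sketch. The regional
sources, date formats, exchange rates and table name below are all hypothetical values chosen for illustration,
and an in-memory SQLite database simply stands in for the data mart; a real implementation would read from the
operational systems of each region.

```python
import sqlite3
from datetime import datetime

# Hypothetical regional extracts: each source uses its own currency and date format.
SOURCES = {
    "nz_store": {"currency": "NZD", "date_format": "%d/%m/%Y",
                 "rows": [("12/03/2008", "A4 paper", 120.0)]},
    "us_store": {"currency": "USD", "date_format": "%m-%d-%Y",
                 "rows": [("03-12-2008", "A4 paper", 80.0)]},
}

# Assumed fixed exchange rates to one reporting currency (illustrative only).
RATES_TO_USD = {"NZD": 0.80, "USD": 1.00}


def extract(sources):
    """Yield raw rows together with the description of their source system."""
    for name, source in sources.items():
        for sale_date, product, amount in source["rows"]:
            yield name, source, sale_date, product, amount


def transform(raw_rows):
    """Standardise dates to ISO 8601 and amounts to USD."""
    for name, source, sale_date, product, amount in raw_rows:
        iso_date = datetime.strptime(sale_date, source["date_format"]).date().isoformat()
        amount_usd = round(amount * RATES_TO_USD[source["currency"]], 2)
        yield name, iso_date, product, amount_usd


def load(rows, connection):
    """Insert the standardised rows into a single consolidated table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(source TEXT, sale_date TEXT, product TEXT, amount_usd REAL)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    connection.commit()


def run_etl(sources=SOURCES):
    connection = sqlite3.connect(":memory:")
    load(transform(extract(sources)), connection)
    return connection


connection = run_etl()
```

After the run, both regional records sit in one table with the same date format and currency, which is the
standardised form the rest of the analysis would rely on.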



Distributed Data Warehouse
Distributed data warehouses, which are commonly constructed incrementally over time, provide no common
metadata components across data marts, which can lead to complications when attempting to integrate aspects
of the organisation (Hsieh & Lin, 2003; Turban et al., 2007). Most of the workload in such a data warehouse
is placed on the individual workstations, meaning the data warehouse itself is only viable for low volumes of
data (Turban et al., 2007). These issues have led to distributed data warehouse designs being widely perceived
as unacceptable in the long run (Hsieh & Lin, 2003), and for the same reasons this architecture is not
recommended for GSS.

Hub & Spoke Data Mart
The Data Warehousing Institute (TDWI) research findings report that the architecture most frequently
implemented among a number of large, powerful firms is the hub-and-spoke data warehouse design (Agosta, 2003).
Such architectures enable easy customisation of user interfaces and reports (Turban et al., 2007), which would
benefit GSS by allowing them to tailor these interfaces and reports for all areas of the organisation on an
international scale. The hub-and-spoke architecture would give GSS a centrally planned area of operations while
also allowing flexibility, as ‘the number of times the data must be transformed is optimal for a majority of
scenarios involving many to many nodes in a network of source and target data stores’ (Agosta, 2003). This
structure uses dependent data marts, which are subsets created directly from the data warehouse. This ensures
consistent, quality data, but limits access to only one user at a time (Turban et al., 2007). The separated,
dependent nature of the stored data means a business enterprise view can be challenging to obtain, and the
costs of redundant data, database administration and operations can be high (Turban et al., 2007).
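
The idea of dependent data marts as subsets drawn directly from a central warehouse can be sketched as follows.
The `warehouse_sales` table, the region names and the use of SQLite views are assumptions made purely for
illustration; the sketch does not describe any particular vendor's hub-and-spoke product.

```python
import sqlite3


def build_hub_with_marts(sales_rows):
    """Load a central (hub) table, then derive one dependent mart per region."""
    hub = sqlite3.connect(":memory:")
    hub.execute(
        "CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL)"
    )
    hub.executemany("INSERT INTO warehouse_sales VALUES (?, ?, ?)", sales_rows)

    regions = [row[0] for row in hub.execute(
        "SELECT DISTINCT region FROM warehouse_sales ORDER BY region"
    )]
    for region in regions:
        # View definitions cannot take bound parameters, so the name is validated first.
        if not region.isalnum():
            raise ValueError(f"unexpected region name: {region!r}")
        hub.execute(
            f"CREATE VIEW mart_{region.lower()} AS "
            f"SELECT product, amount FROM warehouse_sales "
            f"WHERE region = '{region}'"
        )
    return hub


# Hypothetical sales rows (illustrative values only).
hub = build_hub_with_marts([
    ("APAC", "pens", 10.0),
    ("EMEA", "pens", 20.0),
    ("APAC", "paper", 5.0),
])
apac_total = hub.execute("SELECT SUM(amount) FROM mart_apac").fetchone()[0]
```

Because each mart is defined over the hub table rather than holding its own copy, regional figures stay
consistent with the central store, which is the property the dependent data mart approach is valued for.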

Enterprise Data Warehouse (EDW)
The large scope of an EDW allows it to draw information from a variety of sources and consolidate it so that
it can be organised in a way which gives a business enterprise view of the organisation (King, 2006; Turban et
al., 2007). The EDW provides a number of tools which help end users with no computer programming knowledge to
find, understand and evaluate the data stored within the data warehouse (King, 2006). Predefined reports
provide instant access to relevant information, retrieved and organised in a way the end user can understand,
while query and analysis tools can be used to discover further patterns and draw conclusions from them (King,
2006).

The EDW can provide data for many additional types of decision support system, allowing shared, consolidated
information to be used by areas other than the analysts and executives of GSS. For example, the procurement,
ordering and logistics teams, the marketing and advertising departments and the financial functions of the
organisation could make use of the data through support systems such as CRM, SCM, BPM, PLM and KMS (Turban et
al., 2007). Developing an EDW is not a fast process and can therefore be costly to set up (Turban et al.,
2007). EDWs are usually built a step at a time over several years, with the collection of data prioritised in
order of importance (King, 2006). Broad participation in development and strong executive support are also
essential, as without them the project to develop such a large system will lose momentum and development may
become cost ineffective (Turban et al., 2007).



When comparing the data mart and EDW approaches to data warehousing, Turban, Aronson, Liang & Sharda (2007)
found that the scope of the EDW is much larger (it can include several subject areas) and that it can support
a greater number of simultaneous users, providing cross-functional optimisation and decision making. In
comparison, the data mart approach has a considerably shorter development time, lower cost and lower
difficulty, yet requires more frequent updates. The users of the data mart approach are typically business area
analysts and managers, and the focus is on optimising activities within the business area, which does not suit
GSS’s structure and nature of operations. The EDW approach is more commonly used by enterprise analysts and
senior executives, as it provides an enterprise-wide view of the business’s data (p. 229), which better suits
GSS’s intentions.

Developing a data warehouse, regardless of the architecture chosen, will require time, resources and buy-in
from key members of the organisation, and should therefore only be undertaken with a strong commitment. The
design decision needs to be researched more thoroughly against the intended purpose of the data warehouse
before any of the recommendations here can be considered appropriate. However, as the need to integrate and
consolidate data from a variety of sources is critical to any beneficial data warehouse implementation within
GSS, the issues posed by the data mart centric and distributed data warehouse approaches are of great concern.
Both approaches have difficulty providing an accurate enterprise view, and the integration provided by the
distributed data warehouse is questionable due to its lack of common metadata (Hsieh & Lin, 2003; Turban et
al., 2007), making these the least favourable options for GSS.

It is recommended that an enterprise data warehouse be the preferred architecture for GSS. It would allow GSS
to consolidate information from a variety of sources, providing an enterprise view of information and
potentially assisting other units of the organisation by providing data for a range of decision support
systems (Turban et al., 2007; King, 2006). However, because the prioritised data collections are obtained
incrementally, the time and cost of developing such an architecture can be considerable (Turban et al., 2007;
King, 2006), possibly making it infeasible for GSS. Alternatively, benefit could also be gained from a
hub-and-spoke data warehouse architecture, which could enable GSS to take an enterprise view through a network
of linked source and target data stores (Agosta, 2003).

This structure would enable consistent, quality data to be retrieved from a number of dependent data marts
(Turban et al., 2007), which would enable the independent regions of GSS to accurately record and report
their data to a central data store. GSS would, however, need to consider the limitations of this approach
before implementation in order to develop strategies to minimise their implications. The limit of one user at
a time may not pose a great concern within GSS, as the majority of operational analysis support data will be
stored in regional data marts, meaning access to the data warehouse itself would be limited to executives
based in head office. With adequate consideration and planning to minimise the costs of operation, database
administration and redundant data, the hub-and-spoke architecture could provide GSS with decision support in
an enterprise view, which could be of considerable benefit if used correctly.



Data Mining

Global Stationery Supplies (GSS), like any large multinational organisation, could potentially benefit from
accessing data which may otherwise be redundant because it is left untapped. The ability to understand its
business and customers more fully, and to customise its offering to the needs of different clients and
markets, is of great potential advantage to GSS. Data mining overcomes many of the limitations of the
traditional forecasting which many organisations use, and allows them to understand and analyse complex data
sets in a number of ways (Garver, 2002). Data mining could be used within GSS to improve advertising and
marketing campaigns, to help identify the most effective communication processes, and to gain a more thorough
understanding of potential customers through the information held about current customers (Berry & Linoff,
2004).

Hypothesis-driven data mining could be used within GSS by a user who has a proposition which can be validated
using data mining tools. Alternatively, a discovery-driven data mining approach can find patterns, associations
and relationships among data, uncovering facts that were previously unknown to GSS (Turban et al., 2007). For
example, hypothesis-driven data mining could benefit the procurement and ordering teams, who recognise that
there are peaks in their business at certain times of the year in different locations. By using a data mining
technique, GSS could understand the time frame of such peaks more clearly and validate their existing
understanding, which could assist in planning for and accommodating such periods. Discovery-driven data mining
techniques could benefit GSS in a number of areas. One is the processes and operations of the procurement team,
which are of considerable strategic importance to the business. The current and historical data collected from
store and warehouse sales records, ordering teams, procurement processes, marketing efforts and client
information, which comes from dispersed and differing locations, could be analysed to discover patterns,
associations and relationships. The data mining tools and techniques which could be implemented within GSS are
based on a varied array of methods, including statistical, mathematical or next generation, and artificial
intelligence or machine learning approaches (Turban et al., 2007; Berson, Smith & Thearling, 1999).
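
A minimal sketch of the hypothesis-driven case described above is given here. It assumes hypothetical monthly
order counts for a single region and a simple rule (a month counts as a peak if its volume is more than one
standard deviation above the mean); both the figures and the rule are illustrative assumptions rather than a
method prescribed by the sources cited.

```python
from statistics import mean, pstdev


def peak_months(monthly_orders, threshold_sd=1.0):
    """Return months whose order volume exceeds mean + threshold_sd * std dev."""
    volumes = list(monthly_orders.values())
    cutoff = mean(volumes) + threshold_sd * pstdev(volumes)
    return [month for month, volume in monthly_orders.items() if volume > cutoff]


# Hypothetical monthly order counts for one region (illustrative values only).
orders = {
    "Jan": 100, "Feb": 300, "Mar": 100, "Apr": 100, "May": 100, "Jun": 100,
    "Jul": 100, "Aug": 100, "Sep": 100, "Oct": 100, "Nov": 100, "Dec": 100,
}

# Hypothesis held by the procurement team: February is a peak month.
hypothesis_supported = "Feb" in peak_months(orders)
```

The user starts from a proposition ("February is a peak") and the data either supports it or does not; a
discovery-driven approach would instead look for peaks without naming a month in advance.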

A number of statistical data mining methods could benefit GSS by discovering patterns and building predictive
models. These include linear and non-linear regression, point estimation, Bayes’ theorem, correlations and
cluster analysis (Turban et al., 2007). Regression analysis is a commonly used technique for forecasting or
predicting values based on patterns observed within large data sets (Turban et al., 2007). A number of
mathematical and next generation data mining techniques could also benefit GSS through their ability to
uncover new information in large databases while also building new predictive models (Turban et al., 2007;
Berson et al., 1999). These include decision trees, algorithms, and classification and regression trees (CART),
as well as neural networks, which are also classified as a machine learning approach (Turban et al., 2007;
Berson et al., 1999). The approaches considered to have the most potential benefit to GSS are explored further
below.
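
As a brief illustration of the regression technique mentioned above, the sketch below fits a straight line by
ordinary least squares and uses it to forecast the next period. The quarterly sales figures are hypothetical,
and a single-variable linear model is assumed only to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(slope, intercept, x):
    """Predict y for a new value of x using the fitted line."""
    return slope * x + intercept


# Hypothetical sales (in thousands of units) for quarters 1 to 4.
quarters = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(quarters, sales)
next_quarter_sales = forecast(slope, intercept, 5)
```

With these figures the fitted line rises by two thousand units per quarter, so the forecast for the fifth
quarter simply continues that trend; real sales data would rarely be this regular.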

Cluster Analysis
Cluster analysis is an exploratory data analysis tool for solving classification problems. It can assist in
providing measures of definition and rules for assigning classes for identification, targeting and diagnostic
purposes (Turban et al., 2007). Cluster analysis can be approached in either a hierarchical or a partitional
manner: hierarchical algorithms use previously established clusters to discover successive ones, while
partitional algorithms determine all the clusters at once (Wikipedia, 2008). GSS could use cluster analysis in
a number of ways. For example, it could segment the client market and determine target markets, enabling better
use of advertising resources. Alternatively, the procurement team could determine which locations, featuring
which clients, purchase which products from which suppliers, potentially assisting the planning of the
logistics and procurement teams. A number of software options integrate tools for cluster analysis; one such
option, Microsoft Visual Studio, would allow GSS to complete a range of analyses which could support decision
making in many areas of the organisation. A number of specialised software packages have also been developed
specifically for cluster analysis, including Clustan Graphics and SPSS (Turban et al., 2007), in addition to a
range of open source code freely available on the internet.
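
A partitional approach of the kind described above can be sketched with k-means, one common partitional
algorithm (the sources cited do not prescribe it). The client figures are hypothetical, each point being an
assumed pair of (annual spend in thousands, orders per month), and the first k points are used as starting
centroids only so that the example is repeatable.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Partition points into k clusters by repeatedly reassigning to the nearest centroid."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                # New centroid is the coordinate-wise mean of the cluster members.
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters


# Hypothetical clients: (annual spend in thousands, orders per month).
clients = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(clients, k=2)
```

On this data the clients separate into a low-spend group and a high-spend group, the kind of segmentation
that could inform where advertising resources are directed.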

Decision Trees
Decision tree analysis identifies predictor variables by searching all variables until all relevant ones are
selected; variables which are not selected are not important to the prediction (Garver, 2002). Using
classification and clustering methods, decision trees can be developed to assist GSS with decision making by
taking complex problems and breaking them down into increasingly discrete subsets (Turban et al., 2007).
Decision trees can be applied to a wide variety of business problems for both exploration and prediction, using
a variety of algorithms based on the tree created (Berson et al., 1999). A number of decision tree algorithms
are commonly implemented in organisations, including ID3 and C4.5 (Berson et al., 1999; Turban et al., 2007).
These algorithms pick predictors and splitting values based on the information provided by that split or splits
(Berson et al., 1999).
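
The way a split is chosen on the basis of the information it provides can be sketched with the entropy and
information gain calculation used in ID3-style algorithms. The order records, attribute names and values below
are hypothetical, and the sketch covers only the choice of a single split, not the construction of a full tree.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of the target after splitting rows on an attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Pick the attribute whose split provides the most information."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical order records (illustrative values only).
orders = [
    {"region": "NZ", "season": "peak", "reorder": "yes"},
    {"region": "NZ", "season": "off", "reorder": "no"},
    {"region": "US", "season": "peak", "reorder": "yes"},
    {"region": "US", "season": "off", "reorder": "no"},
]

chosen = best_split(orders, ["region", "season"], "reorder")
```

Here the season separates the reorder outcomes completely while the region tells us nothing, so the season is
selected as the first split; an unselected variable such as region is, in Garver's terms, not important to
the prediction.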

Decision tree analysis uses categorical and continuous data and can accommodate missing data (Thomas, 2004),
which would give GSS an accurate, historically based representation of the patterns, associations and subsets
in the data collected. Decision trees can discover unexpected relationships and more clearly identify the
differences between subgroups (Thomas, 2004). An example of the potential insight is described by Garver
(2002), who uses decision trees within the pizza delivery industry to analyse customer satisfaction and loyalty
data (p. 62). In this case, decision trees provide greater insight into the different loyalty segments analysed,
giving perceptions of both importance and performance (Garver, 2002). Using decision tree analysis in a similar
manner within GSS could provide benefit in a number of areas, for example in analysing customer preferences in
stores, merchandise order patterns in different countries or periods of elevated purchasing.

A number of software packages provide decision tree analysis as one of their tools or options, including
Oracle, Microsoft Visual Studio and Clementine (Turban et al., 2007). The interface provided by Microsoft
Visual Studio would possibly be the easiest to implement, as members of GSS are all currently users of other
Microsoft software, such as Office, and the familiar interface would encourage use of the system. Microsoft
Visual Studio also allows for a range of other analysis options, such as cluster analysis, association rule
analysis, linear and logistic regression, Naïve Bayes and even neural networks, all within the same project.

Neural Computing
Neural computing uses masses of historical data to identify and analyse changes in patterns, situations and
tactics (Williams, 1994), using a method which emulates how the brain works (Turban et al., 2007) in a manner
which is mathematically driven (Garver, 2002). Such networks require a step-by-step process to create the
knowledge, which is stored in the weights associated with the connection between two neurons (Turban et al.,
2007). A neural network has the advantage of being able to perform tasks that linear analysis options cannot,
while also being able to be implemented in a wide variety of applications (Artificial Neural Networks, 2008).
The neural network is a form of artificial intelligence which can be based either on a mainframe (such as a
data mart) or on a network of personal computers (Williams, 1994), which would give GSS more flexibility and
choice when implementing and using such a data mining technique.
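
The notion of knowledge being stored in weights that are adjusted step by step can be illustrated with a
single artificial neuron (a perceptron). This is a deliberately minimal sketch: the two inputs and the
"reorder now" outcome are hypothetical, and a single neuron can only learn linearly separable patterns, so it
is far simpler than the multi-layer networks offered by the software packages discussed here.

```python
def step(value):
    """Threshold activation: the neuron fires (1) only for positive input."""
    return 1 if value > 0 else 0


def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Adjust weights after each example in proportion to the prediction error."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: (low_stock, peak_season) -> reorder_now
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in samples]
```

No rule such as "reorder only when stock is low and the season is a peak" is ever written into the code; the
behaviour emerges from the trained weights, which is the sense in which a neural network learns from examples.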

Neural networks can learn from experience and do not require fixed rules or equations to be programmed in
order to analyse quantities of complex data and identify patterns from which predictions can be made (Taylor,
1997). The architecture of the neural network, however, needs to be emulated, and the network itself requires
training in order to operate (Artificial Neural Networks, 2008). There are a number of software options for
neural network analysis. Those specific to neural networks include the Stuttgart Neural Network Simulator
(SNNS), Emergent and JavaNNS (Wikipedia, 2008). A large number of general data mining packages also include
neural network analysis, including Clementine, SAS Enterprise Miner and again Microsoft Visual Studio (Turban
et al., 2007), while open source software such as WEKA, developed at the University of Waikato, also provides
neural network analysis. While the techniques used to analyse the data in decision tree and neural network
analysis differ conceptually, the results from both are similar, providing predictions and examining the impact
of certain variables on those predictions (Garver, 2002). However, it is clear that the initial effort required
to set up a neural network is considerably greater.

Any one of the data mining techniques explored shows large potential benefit if implemented within GSS. It
would be important to reconsider the organisation’s strategic objectives and goals in order to assess the best
application of any technique in accordance with its intended purpose. One software option which could easily
be used within GSS to gain benefit from cluster, decision tree and neural network analysis, in conjunction with
a number of further tools, is Microsoft Visual Studio. Neural computing would be considered top of the line for
GSS. However, it is not apparent that such a large project is clearly feasible within GSS, especially
considering that the result of the investment could be achieved through other methods, such as decision tree
analysis (Garver, 2002). Decision tree analysis would also give GSS decision making support in a way that
allows for both the discovery of new information and the creation of predictive models (Turban et al., 2007;
Berson et al., 1999).

Cluster analysis would be of clear benefit to GSS in understanding the trends and patterns of its business.
However, implementing software dedicated solely to cluster analysis would not be adequate for the intended
purpose. It is recommended that GSS use a software package which enables cluster analysis to verify findings
from other next generation data mining techniques. By using the tools and techniques provided within Microsoft
Visual Studio, GSS would be able to create decision trees, cluster analysis, association rule analysis, linear
and logistic regression, Naïve Bayes and even neural networks, all within the same project. The potential
benefit of such analysis tools could be experienced through various layers and units within GSS.


Task Two

Prototype Project Plan for the implementation of Microsoft Visual Studio
Data Mining Tools & Techniques within Global Stationery Supplies

Project Brief
The intention of this project is to assess the feasibility of implementing the data mining tools provided
within Microsoft Visual Studio for the potential benefit of a number of areas of GSS, with particular reference
in this project to the procurement processes and the sales and marketing functions of the business. A number
of assumptions have been made in order to appropriately consider the viability of such an implementation
within GSS, including assumptions about organisation structure and processes, data collection and storage. It
is assumed that the procurement and marketing operations for GSS are based primarily at the head office
location, with regional units based on store locations globally. It is also assumed that GSS will use an
appropriate data warehouse architecture to provide consolidated, cleansed data for the purposes of data mining,
allowing consideration from a store, region or enterprise perspective which can be used by various layers and
areas of the organisation.

For the purposes of this prototype project, it is recommended that a particular region be selected for
analysis. Using Microsoft Visual Studio tools, GSS would be able to implement a number of data mining
techniques, such as decision tree and cluster analysis, to discover patterns, new data and predictive models
from data collected throughout the organisation. The information gained could benefit the procurement team,
for whom understanding the peaks and activity of business in different regions could assist in planning for
such times. It could also benefit the sales, advertising and marketing functions in a range of ways, such as
understanding which consumers are most likely to purchase a particular product, which would enable them to
target their marketing efforts at the most appropriate audience.

Project Aim
To implement a regional prototype of Microsoft Visual Studio and the data mining tools and techniques it
provides, such as decision tree and cluster analysis, in order to test its feasibility for general application
organisation wide within GSS and to provide benefit and support to decision makers within the organisation.

Requirements for Project
It is assumed that a data warehouse will be used to provide information for the suggested data mining tools
and techniques. As the prototype project will only consider information and data collected within the selected
region, the scope of the requirements is limited to that region. It is also assumed that, for the purposes of
this prototype project, the focus of data mining use will be the procurement, sales, advertising and marketing
functions of the organisation; the scope of data requirements is therefore also limited by this assumption.

Data required for prototyping purposes
In order to accurately complete the analysis suggested, information will be required from a number of sources:

• Transactional Data (information from sales, store records and client purchases).
• Ordering and Stock holding Data (information about the patterns of ordering for stores, products they
hold on hand and potentially product life cycle information).
• Procurement and Logistics Data (information which allows categorisation of products and potential
assessment of process).
• Marketing and Advertising Data (information to assess the efforts of functions, benefits and
relationship to sales).

Platform and software intended for use
For the purposes of this prototype project, Microsoft Visual Studio will be used to access and assess
information held within GSS’s data warehouse records. Using the Microsoft Visual Studio software and tools
provided, GSS would be able to create not only the recommended decision tree and cluster analysis but also
have access to further analysis methods such as association rule analysis, linear and logistic regression,
Naïve Bayes and even neural networks, all within the same project.

Project Implementation Concerns

• Training of staff
In order to successfully use the tools and techniques provided within the Microsoft Visual Studio software
package, the end users who will be using the software will need to be trained to ensure these processes can be
accessed. For the purposes of this project, it is assumed that the regional and executive leaders within the
GSS structure would be the users of such tools, to assist their decision making within the procurement,
marketing and advertising functions. Whilst the tools provided within Microsoft Visual Studio are relatively
easy to use, the software may be difficult to navigate for a user who has not used it before, and a common
cause of software implementation failure is a lack of understanding by its key users.

On-the-job training is generally the best method for such software, as information is much more readily
remembered if it is learned in the same environment in which it is applied (Read, Hunt & Ellis, 2004). With
this in mind, someone within the organisation would be required to act as a trainer and mentor for the software
and tools selected, and to train the appropriate regional leaders. There is also a clear requirement to gain
buy-in from the key members of the proposed project in order to make the most of the knowledge and expertise
they hold and achieve the best implementation of the tools provided in the prototype project.



• Project Development & Implementation Costs
The cost of training staff is only one aspect which must be considered. A number of other costs must also be
considered in order to complete a working prototype of the potential use of Microsoft Visual Studio tools,
such as clustering and data mining analysis. As it is assumed that a data warehouse will be used for data
retrieval in this project, there is no requirement to consider data cleansing or consolidation. However, as
this project uses software provided by Microsoft, GSS will need to arrange an appropriate licence agreement to
provide access to potential users of such systems. The Microsoft Visual Studio software can be expensive to
purchase; however, as multiple licences will be required by GSS in order to use such tools throughout regions
and areas of the business, it is recommended that a package be sought from Microsoft. For the purposes of this
prototype project, a trial version of Microsoft Visual Studio could be obtained free of charge from the
Microsoft website (http://office.microsoft.com/en-us/default.aspx).

Feasibility Elements being tested
This prototype project has been designed to test the viability of implementing the data mining tools (such as
decision trees and cluster analysis) provided by Microsoft Visual Studio within GSS. There are a number of
areas of potential concern to GSS, the feasibility of which is intended to be considered through the
development and execution of the prototype project. The areas of concern for which feasibility will be
considered are outlined below:

• Viability of data mining techniques selected (Can the right information be retrieved?)
The prototype project will allow GSS to assess the feasibility of using decision tree and cluster analysis
within regional units. This will be assessed largely through leaders' experience of not only gaining
information from the data but also using it within business operations.

• Potential benefit of information provided by data mining techniques to business units through
implementation (Can the information be used?)
The prototype project will allow GSS to assess the potential benefit that information gained from the
analysis brings to decision making and to the analysis of general business processes. Some additional
training may be required to help leaders and executives understand how to use the analysis to obtain
the specific information they require.

• Viability of Microsoft Visual Studio as software to utilise data mining tools and techniques
The prototype project will allow GSS to assess the software's potential for organisation-wide application.
The assessment and feedback of key users within the prototype project will greatly enrich this
evaluation.

• Appropriateness of Implementation for such tools and software
The prototype project's process and results, and the feedback of those involved, will assist GSS in
assessing the feasibility of the implementation approach taken to introduce such software and tools to
the organisation as a whole.
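
To illustrate the first feasibility question (can the right information be retrieved?), the following is a minimal sketch of cluster analysis using plain k-means. It is written in Python purely for illustration and is not the Microsoft tool proposed for the prototype. The customer figures, the two chosen attributes and the function name k_means are all hypothetical assumptions and are not drawn from GSS data.

```python
from math import dist  # available in Python 3.8+


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: group the points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(c)  # an empty cluster keeps its old centroid
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (orders per month, average order value in hundreds).
customers = [(2, 1.0), (3, 1.5), (2, 2.0), (20, 9.0), (22, 10.0), (21, 11.0)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
```

In this invented data the customers fall into a low-volume group and a high-volume group. With real data the attributes would normally need to be scaled first, and the number of clusters would be a judgement for the regional leaders involved.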



Outline of Process
• Determine the region, people and information which will specifically be utilised by the prototype
project.

• Work with regional managers and other members involved in the prototype project to gain 'buy-in' to
the project, and draw on their knowledge and expertise in its development and execution so that the
greatest potential benefit is gained from the process.

• Clearly define the variables for consideration based on the intended question or pattern being analysed.

• Assess and define the most suitable decision tree algorithms and variables for discovering trends and
patterns within GSS business data, together with appropriate cluster analysis to support and build on
the information the analysis is intended to provide.

• Define the required methods and procedures and advise the appropriate trial parties of them in order
to gain beneficial results, and use the same material in staff training for project implementation.

• Assess the viability of the results gained by the regional unit from the defined decision tree and
cluster analysis, as well as from any other supporting assessments or investigations completed
independently within the regional unit, in order to determine the viability of the data mining tools
themselves, of the investigations defined, and of the prototype project's implementation and training
process.

• Use the information gained from the prototype project, together with the feedback and assessment of
those involved, to develop a proposed plan for the potential enterprise-wide use of data mining tools
to support better decision making across the different layers and areas of GSS.
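
The step of choosing decision tree variables can be sketched as follows. This is a minimal illustration, assuming a Gini impurity criterion and categorical variables, of how a decision tree picks the variable for its first split. It is written in Python for illustration only and does not reproduce the Microsoft decision tree algorithm. The order records, field names and helper functions are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def split_impurity(rows, variable, target):
    """Weighted Gini impurity after splitting the rows on one categorical variable."""
    groups = {}
    for row in rows:
        groups.setdefault(row[variable], []).append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


def best_split(rows, variables, target):
    """Return the variable whose split leaves the least impurity in the target."""
    return min(variables, key=lambda v: split_impurity(rows, v, target))


# Hypothetical order records; every field name and value here is invented.
orders = [
    {"region": "Asia", "supplier": "factory", "late": "no"},
    {"region": "Asia", "supplier": "agent", "late": "yes"},
    {"region": "Europe", "supplier": "factory", "late": "no"},
    {"region": "Europe", "supplier": "agent", "late": "yes"},
    {"region": "Americas", "supplier": "factory", "late": "no"},
    {"region": "Americas", "supplier": "agent", "late": "no"},
]
root = best_split(orders, ["region", "supplier"], target="late")
```

In this invented data, splitting on supplier separates late from on-time orders more cleanly than splitting on region, so supplier would become the first branch of the tree. A full decision tree repeats the same choice within each resulting group.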

Expected Outcomes:
The defined algorithms, variables and analysis options recommended for use by the prototype project's
regional unit are intended only as a guide. Benefit is expected from the decision analysis tools used in
this project; however, potentially greater benefit is expected from regional leaders' own use and
exploration of such tools within this prototype. The knowledge and expertise held by members of
procurement, sales and marketing, and indeed all areas and functions of the business within GSS, can
provide the best basis for predictive modelling and data discovery. By implementing software such as
Microsoft Visual Studio, which provides a range of data mining tools and techniques, and by providing
adequate explanation and training to key members, GSS can gain a true understanding of the potential
benefit available through implementing such tools enterprise-wide.



References

Agosta, L., (2003, 1 Dec), Hub-and-Spoke Architecture Most Popular for Data Warehousing, DM Review,
New York, Vol. 13 (12), Retrieved 18 April, 2008 from Proquest Computing database

Artificial Neural Networks, (2008), Artificial Intelligence Technologies Tutorials, Retrieved 22 April, 2008 from;
http://www.learnartificalneuralnetworks.com/

Berry, M. J. A., Linoff, G., (2004), Data Mining Techniques: For Marketing, Sales & Customer Relationship
Management, John Wiley & Sons, Indianapolis, pp. 6-40

Berson, A., Smith, S., Thearling, K., (1999), An Overview of Data Mining Techniques, Building Data Mining
Applications for CRM, Retrieved 20 April, 2008 from;
http://www.thearling.com/text/dmtechniques/dmtechniques.htm

Garver, M. S., (2002, 16 Sep), Try New Data-Mining Techniques, Marketing News, Chicago, Vol 36 (19),
Retrieved 20 April, 2008 from Proquest Computing database

Hsieh, C., Lin, B., (2003), Web Based Data Warehousing: Current Status and Perspective, Journal of Computer
Science Information Systems, 43(02), Retrieved 18 April, 2008 from Proquest Computing database

King, T., (2006, 18 Sep), A Road Map for Decision Making: Data Warehouse Architecture, iNews: Planning,
architecture & development, IST Division, UC Berkeley, Retrieved 16 April, 2008 from;
http://istpub.berkeley.edu:4201/bcc/Fall2006/929.html

Taylor, P., (1997, 21 June), A Head for Business: New Software can mimic human thought when processing
information, Financial Post, Retrieved 21 April, 2008 from CBCA Business database

Thomas, E., (2004, Jan), Data Mining: Definitions and Decision Tree Examples, Stony Brook State University
of New York, Retrieved 20 April, 2008 from;
http://airpo.binghamton.edu/conference/jan2004/Thomas_data_mining.pdf

Turban, E., Aronson, J. E., Liang, T., Sharda, R., (2007), Decision Support and Business Intelligence Systems,
Eighth Edition, Pearson Prentice Hall, New Jersey, pp. 206-244, 302-337

Wikipedia, (2008), Neural Network Software, Retrieved 22 April, 2008 from;
http://www.en.wikipedia.org/wiki/Neural_network_software

Wikipedia, (2008, 11 April), Cluster Analysis, Retrieved 21 April, 2008 from;
http://www.en.wikipedia.org/wiki/Data_clustering

Williams, N., (1994, Mar), Data Mining with Neural Networks, Insurance Systems Bulletin, 9(7), Retrieved 21
April, 2008 from ABI/INFORM Global Database

