1. BI from open sources
with
GT data mining
By: Edith Ohri, Datalert
Home of GT data mining
edith@datalert.co.il
2. About
Edith Ohri – developer of GT data mining and founder of Datalert,
a startup for early detection.
Industrial & Management Eng. from the Technion, MSc from NY
Polytech.
Member of the management of the IE group in the Association of Engineers and
Architects in Israel, and liaison to the Israel Society for Quality.
GT applications include:
SMU Singapore – early detection of top students and dropouts
RAFAEL – root cause of late deliveries in Purchasing
SCD Israel – root cause of a quality issue in production
Detection of patterns of behavior in earthquake seismology – Israel
27Oct 2015 BI from open sources with GT data mining. All rights reserved © Edith Ohri 2015 Slide 2
3. The challenge
How to exploit free data?
4. Open data integrity issues
1. Unsupervised records
2. Interdependencies
3. “Long tail”
4. “Overfitting”
5. Big Data concerns, such as inconsistencies and dynamics...
5. The GT data mining solution
GT is about pattern detection* in unsupervised
complex data, including rare patterns and newly
emerging ones.
*Patterns always provide further resolution,
associations and insight.
6. Some of GT's new principles
1. Thou shalt not clean input data!
2. Always prefer unsupervised data!
3. Include exceptions;
4. Include the data environment;
5. Make no prior assumptions about data behavior:
consider variables interdependent unless proved
otherwise;
6. Conclusions have to be explicit and fully traceable.
7. Example: predicting market prices
The target: pricing new products, based on
historic price lists.
The client doubts whether analysis can help at all, since
there is no data on competitors' prices or clients'
behavior. Currently, Marketing determines prices
by trial and error.
*See how GT resolves that issue in slide #13
8. Cont. example – predicting prices,
input data
The input contains ~20,000 lines and 22 interdependent
variables (yet NO data about competitors' or
clients' behavior)
9. Cont. example – predicting prices,
patterns
GT identifies 3 product families (Inserts, Tools and Solid
Carbide) and, within them, 9 subgroups:
₋ 5 subgroups defined by specific functions, including 6Q CUT-Grip,
30B Turning-with-hole and 21T Milling-new
₋ 3 subgroups defined by their typical Prod.Type, Grade, Shape and Size
₋ 1 Exceptions subgroup
10. Cont. example – predicting prices,
key factors
Key factors, in addition to the already existing
definitions of Sales-group and Item-group:
₋ Type of product
₋ Geometric shape
₋ Type of chipbreaker
₋ Grade
₋ Product radius (length, though, has no effect)
*GT arrives at similar key factors on two independent data sets.
[Slide callout: Non-quantitative]
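The slides do not show GT's formulas, so the following is only a rough stand-in to make the idea concrete: a minimal sketch, assuming a plain group-wise median over the categorical key factors listed above. The toy price list, the factor values and the function names (`build_price_table`, `suggest_price`) are all hypothetical, and the quantitative factor (radius) is left out for brevity.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy price list: (product type, shape, chipbreaker, grade, price).
# These values are invented for illustration; they are not the client's data.
HISTORIC = [
    ("Insert", "square", "CB1", "G10", 10.0),
    ("Insert", "square", "CB1", "G10", 12.0),
    ("Insert", "round", "CB2", "G20", 20.0),
    ("Tool", "round", "CB2", "G20", 50.0),
    ("Tool", "round", "CB2", "G20", 54.0),
]


def build_price_table(rows):
    """Map each combination of categorical key factors to its median price."""
    groups = defaultdict(list)
    for *factors, price in rows:
        groups[tuple(factors)].append(price)
    return {key: median(prices) for key, prices in groups.items()}


def suggest_price(table, product_type, shape, chipbreaker, grade):
    """Return the group's median price, or None for an unseen combination."""
    return table.get((product_type, shape, chipbreaker, grade))


table = build_price_table(HISTORIC)
print(suggest_price(table, "Insert", "square", "CB1", "G10"))
```

A lookup of this kind only covers combinations already present in the historic list; an unseen combination returns `None` and would need a real model rather than a table.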
11. Cont. example – predicting prices,
test
Projecting prices with GT formulas:

Sub-group   Price by formula   Actual price $   Description             Item
Gr732       44.11              45.74            38U drilling Inserts    5505456
Gr750       7.52               9.84             1E self-grip Inserts    6002918
Gr736       18.10              18.88            12n~A do-grip Inserts   6002410
Gr736       16.17              15.96            12n~A do-grip Inserts   6095285
…
Predicted prices are very close to the actual prices; see the full simulation on the next slide.
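The closeness of the four sample rows on this slide can be checked with simple arithmetic. The sketch below reads each row as sub-group, price by formula and actual price; that column split of the slide's table is an assumption, and the relative-error measure is chosen here for illustration rather than taken from the presentation.

```python
# Sample rows from the slide: (sub-group, price by formula, actual price $).
# Assumption: the column boundaries are inferred from the slide's table text.
ROWS = [
    ("Gr732", 44.11, 45.74),
    ("Gr750", 7.52, 9.84),
    ("Gr736", 18.10, 18.88),
    ("Gr736", 16.17, 15.96),
]


def relative_error(predicted, actual):
    """Absolute difference between the prices as a fraction of the actual price."""
    return abs(predicted - actual) / actual


errors = [relative_error(predicted, actual) for _, predicted, actual in ROWS]
mean_error = sum(errors) / len(errors)

for (group, predicted, actual), err in zip(ROWS, errors):
    print(f"{group}: formula {predicted:.2f}, actual {actual:.2f}, error {err:.1%}")
print(f"mean relative error: {mean_error:.1%}")
```

Four rows are too few to judge the formulas; the same check run over the full simulation would give a fairer picture.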
12. Cont. example – predicting prices,
simulation
13. Cont. example – predicting prices,
GT insights
Answer to slide 7: pricing can be done without
data on competitors and clients, by
reverse-engineering the client's old item price lists.
It means that we may know the competitors'
prices better than they do themselves!
PS: the characteristics of the GT groups also help improve the
existing Sales-group and Item-group definitions.
14. Cont. example – predicting prices,
GT Benefits
1. Shorter time to market.
2. Improved specifications of new products.
3. Marketing competitive edge.
4. Extra windfall from data...
15. GT features Sum-up
Discovery, Root-causes
Early detection
Automation, fits all platforms
Low cost application method
Fast adaptation to changes
Scalability (by fast & affordable adaptation).
16. Thanks
Edith Ohri
Home of GT data mining
*Imported pictures are from free web sources.
Editor's notes
Open-source free data is in most cases unfit for statistics due to integrity issues.
Function: CUT-Grip, Turning with hole, etc.
Prod.Type: Tools, Solid Carbide
There are 4 types of benefits:
Enhancing patterns of success at the expense of others
Managing key success factors
Early detection, fast reaction and seizing opportunities
In-process watch control
What is not listed here?
How fast the algorithms work!
Why?
- There are 2 levels of algorithms. The client algorithms must work very fast and are designed to do so, but the GT service algorithm can be slow, since the project starts only when it is ready.