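Korean Society of Health Informatics and Statistics, Fall Conference 2013

Analytic Visualization of “Big” Data
Analytic Data Visualization
Myung-Hoe Huh (許明會)

2013.11.29

Department of Statistics, Korea University stat420@korea.ac.kr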

Health Info & Stat
Data Visualization
- Descriptive vs Analytic ...
- Small vs Big ...

[Figure: science, technology, art]

Contents
- Scatterplot
- Biplot
- Regression Biplot
- Kernel PCA
- SVM Biplot

Scatterplot
- The “Lego” (basic building block) of analytic data visualization
- Reflecting the third variable

quakes: longitude (= x), latitude (= y), depth (= z)

Scatterplot
- For large $n$, over-plotting can seriously obscure the display.

Skin Segmentation Data: red vs. green

Scatterplot
- For large $n$, the alpha channel (transparency) can be used to reveal point density.

Skin Segmentation Data: red vs. green

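One way to see why the alpha channel helps with over-plotting: under standard "over" compositing, $k$ marks of opacity $\alpha$ stacked on the same pixel give an accumulated opacity of $1 - (1 - \alpha)^k$, so dense regions saturate while sparse regions stay faint. The sketch below is only illustrative; the function name `stacked_opacity` and the choice of compositing rule are assumptions, not taken from the slides.

```python
def stacked_opacity(alpha, k):
    """Opacity after k overlapping marks, each drawn with opacity alpha.

    Assumes standard "over" compositing: every extra mark covers a
    fraction alpha of whatever background is still showing through.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


# Print how quickly a faint mark (alpha = 0.01) builds up with overlap.
for k in (1, 10, 100, 1000):
    print(k, round(stacked_opacity(0.01, k), 3))
```

A small alpha keeps isolated points visible but faint, while regions with hundreds of overlapping points approach full opacity.
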
Scatterplot
- lowess: A nonparametric regression for bivariate data

cars data: distance vs. speed
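
The idea behind lowess can be sketched as a locally weighted linear fit: at each point, nearby observations get tricube weights and a weighted straight line is fitted. The code below is a simplified single-pass version without the robustness iterations of full lowess; the names `tricube` and `lowess_fit`, the default `frac`, and the synthetic data are assumptions for illustration (the cars data are not reproduced here).

```python
def tricube(u):
    """Tricube weight (1 - |u|^3)^3 on [-1, 1], zero outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess_fit(x, y, frac=2.0 / 3.0):
    """Fitted values of a locally weighted linear regression at each x[i].

    Simplification: one pass only, no robustness re-weighting.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need the same length, at least 3")
    k = max(3, min(n, int(round(frac * n))))      # points used per local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        w = [tricube((xi - x0) / h) for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0.0 else 0.0   # flat fit if x has no spread
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Synthetic example (not the cars data): a noiseless straight line.
xs = [float(i) for i in range(10)]
ys = [2.0 * xi + 1.0 for xi in xs]
smooth = lowess_fit(xs, ys, frac=0.5)
```

Because each local fit is linear, exactly linear data are reproduced; on curved data the fitted values follow the local trend.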

Scatterplot
- 3D rotation for three variables

Skin Segmentation Data: red, green, blue

- ggobi: 3D rotation for four or more variables

Biplot of Observations and Variables, Gabriel (1971)

- The biplot is a graph that shows $n$ observations and $p$ variables.

Protein data (rows: 25 nations, columns: 9 protein sources)

Biplot of Observations and Variables, Gabriel (1971)

- Idea: Linear projection

Protein data: variable cereal
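
The linear-projection idea can be sketched with a singular value decomposition: row markers $g_i$ and column markers $h_j$ are chosen so that the inner product $g_i' h_j$ approximates the centered data value, which is what projecting an observation onto a variable's arrow reads off. This is a minimal sketch assuming column centering only and the principal-component scaling of the markers; the function name `biplot_coordinates` and the synthetic data are assumptions (the protein data are not reproduced here).

```python
import numpy as np


def biplot_coordinates(X, dim=2):
    """Row markers G and column markers H with G @ H.T approximating the
    column-centred data matrix in the least-squares (rank-dim) sense."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    G = U[:, :dim] * s[:dim]                 # observations: principal component scores
    H = Vt[:dim].T                           # variables: arrow directions (loadings)
    return G, H, Xc


# Synthetic rank-2 data: the 2-D biplot then reproduces the centred data exactly.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 3))
G_demo, H_demo, Xc_demo = biplot_coordinates(X_demo)
```

For real data of higher rank, `G @ H.T` is only an approximation, and the quality of the biplot depends on how much variance the first two singular values carry.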

Regression Biplot, Huh and Lee (2013)

- Regression biplot is a graph for $n$ observations of $x_1, \dots, x_p$, arranged by predicted $y$.
- Assume that the model fit is determined by a function of a linear combination of $x_1, \dots, x_p$. For instance,
  $\hat{y} = b_0 + b_1 x_1 + \cdots + b_p x_p$,
  or
  $\log \{ \hat{\pi} / (1 - \hat{\pi}) \} = b_0 + b_1 x_1 + \cdots + b_p x_p$.
- Set the vertical dimension by the direction of regression coefficients, $b = (b_1, \dots, b_p)'$, or $u = b / \|b\|$.
- Set the horizontal dimension by the direction of the principal axis of $\mathbf{x}_1^{\perp}, \dots, \mathbf{x}_n^{\perp}$, where $\mathbf{x}_i^{\perp}$ denotes the orthogonal component generated from the projection of the $i$-th observation $\mathbf{x}_i$ on $u$.
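
The two axes described above can be sketched for the linear regression case as follows: the vertical axis is the unit vector along the fitted coefficients, and the horizontal axis is the first principal axis of the observations after removing their component along that vector. This is a sketch under assumptions, not the authors' code: the function name `regression_biplot_axes` is hypothetical, predictors are only centered (standardizing them first is a common alternative), and the data are synthetic.

```python
import numpy as np


def regression_biplot_axes(X, y):
    """Unit vertical axis u, unit horizontal axis v, and n x 2 plot coordinates
    (horizontal, vertical) for a linear regression of y on the columns of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)                        # centre the predictors
    A = np.column_stack([np.ones(len(y)), Xc])     # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    b = coef[1:]                                   # slope coefficients
    u = b / np.linalg.norm(b)                      # vertical direction
    X_perp = Xc - np.outer(Xc @ u, u)              # part of each row orthogonal to u
    _, _, Vt = np.linalg.svd(X_perp, full_matrices=False)
    v = Vt[0]                                      # principal axis of the orthogonal part
    coords = np.column_stack([Xc @ v, Xc @ u])
    return u, v, coords


# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(30, 3))
y_demo = 2.0 + X_demo @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30)
u_demo, v_demo, coords_demo = regression_biplot_axes(X_demo, y_demo)
```

With this construction the vertical coordinate is a linear function of the fitted value, so observations are ordered vertically by their predictions.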

Regression Biplot, Huh and Lee (2013)

Example 1. Stack Loss Data ($y$: loss of ammonia)

Regression Biplot, Huh and Lee (2013)

Example 2. Magazine Data ($y$: subscription, 0 or 1)

Kernel PCA, Schölkopf et al. (1998)

- For $n$ observations $x_1, \dots, x_n$ ($p \times 1$), consider the nonlinear mapping $\Phi(x_1), \dots, \Phi(x_n)$ to a Hilbert space, in which $\langle \Phi(x_i), \Phi(x_j) \rangle = k(x_i, x_j)$.
- Denoting $K = \{ k(x_i, x_j) \}$, Kernel PCA is obtained from eigen-decomposing
  $\tilde{K} = (I - 11'/n) \, K \, (I - 11'/n)$.
- Kernel PCA yields a plot of observations by projecting the centered images $\tilde{\Phi}(x_1), \dots, \tilde{\Phi}(x_n)$, with $\tilde{\Phi}(x_i) = \Phi(x_i) - \frac{1}{n} \sum_{l=1}^{n} \Phi(x_l)$, on
  $v = \sum_{i=1}^{n} a_i \tilde{\Phi}(x_i)$,
  where $a' \tilde{K} a = 1$ and $a = (a_1, \dots, a_n)'$ is an eigenvector of $\tilde{K}$.

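The Kernel PCA steps can be sketched numerically: build the kernel matrix, double-center it, eigen-decompose, and scale each eigenvector so that the corresponding feature-space axis has unit length. This is a minimal sketch, not code from the slides; the function names are hypothetical, and the RBF parametrisation $k(a, b) = \exp(-\sigma \|a - b\|^2)$ is an assumption about how sigma is defined.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """k(a, b) = exp(-sigma * ||a - b||^2); this parametrisation is an assumption."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def kernel_pca(K, dim=2):
    """Scores of the n observations on the first `dim` kernel principal axes."""
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix I - 11'/n
    Kc = J @ K @ J                            # centred kernel matrix
    lam, E = np.linalg.eigh(Kc)               # eigenvalues in ascending order
    top = np.argsort(lam)[::-1][:dim]         # indices of the largest eigenvalues
    lam, E = lam[top], E[:, top]
    alpha = E / np.sqrt(lam)                  # scaled so that alpha' Kc alpha = 1
    scores = Kc @ alpha                       # projections of the centred images
    return scores, alpha, lam


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(8, 3))
scores_demo, alpha_demo, lam_demo = kernel_pca(rbf_kernel(X_demo, X_demo, sigma=0.5))
```

With the linear kernel $K = XX'$ the scores coincide, up to sign, with ordinary principal component scores, which is a convenient sanity check.
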
Kernel PCA Diagram (or Kernel Biplot), Huh (2013)

- Aim: Representation of $p$ variables in the Kernel PC plot of observations.
- Proposed Procedure:
  1) For each $x_i$ ($i = 1, \dots, n$), map $\Phi(x_i + \delta e_j)$ on the plane, $j = 1, \dots, p$, where $\delta$ is a constant and $e_j = (0, \dots, 1, \dots, 0)'$ has 1 in the $j$-th position. For a point $x$, the projection on a kernel principal axis is given by
     $a' (I - 11'/n) \{ k(x) - K 1 / n \}$,
     where $k(x) = (k(x_1, x), \dots, k(x_n, x))'$, $K = \{ k(x_i, x_j) \}$, and $a$ is the scaled eigenvector of the centered kernel matrix that defines the axis.
  2) For each $j$, link the projection points of $\Phi(x_i)$ and $\Phi(x_i + \delta e_j)$ by an arrow.
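
The procedure above can be sketched as follows: fit kernel PCA on the observations, then project each perturbed point $x_i + \delta e_j$ with the out-of-sample (centered) kernel formula and pair it with the projection of $x_i$. This is a sketch under assumptions rather than the author's implementation: the function names are hypothetical, the RBF kernel is parametrised as $\exp(-\sigma \|a - b\|^2)$, the variables are used unscaled, and the data are synthetic.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """k(a, b) = exp(-sigma * ||a - b||^2); this parametrisation is an assumption."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def fit_kpca(X, sigma, dim=2):
    """Kernel matrix and scaled eigenvectors of the centred kernel matrix."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    J = np.eye(n) - np.ones((n, n)) / n
    lam, E = np.linalg.eigh(J @ K @ J)             # ascending eigenvalues
    top = np.argsort(lam)[::-1][:dim]
    return K, E[:, top] / np.sqrt(lam[top])


def project(Z, X, K, alpha, sigma):
    """Coordinates of the centred images of the rows of Z on the kernel PC axes."""
    kz = rbf_kernel(Z, X, sigma)                   # m x n kernel values k(z, x_j)
    kz_c = (kz - kz.mean(axis=1, keepdims=True)    # double centring with training means
            - K.mean(axis=0, keepdims=True) + K.mean())
    return kz_c @ alpha


def arrow_diagram(X, sigma, delta):
    """starts[i] is the plot position of x_i; ends[i, j] that of x_i + delta * e_j."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    K, alpha = fit_kpca(X, sigma)
    starts = project(X, X, K, alpha, sigma)
    ends = np.empty((n, p, 2))
    for j in range(p):
        Xj = X.copy()
        Xj[:, j] += delta                          # perturb variable j only
        ends[:, j, :] = project(Xj, X, K, alpha, sigma)
    return starts, ends


# Synthetic data for illustration only.
rng = np.random.default_rng(2)
starts_demo, ends_demo = arrow_diagram(rng.normal(size=(10, 3)), sigma=0.5, delta=0.1)
```

Each arrow runs from `starts[i]` to `ends[i, j]`; unlike in a linear biplot, the arrows for one variable generally differ in direction and length from observation to observation.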

Example 1. Arrow diagrams for kernel PCA of the iris data with RBF kernel

Example 2. Arrow diagrams for kernel PCA of the spam data

SVM-Guided Biplot as an extension of Regression Biplot

- Idea: Combine Linear/Logistic Regression Biplot and Kernel PCA.
- Classification/Regression Part: SVM classifier
  Observations are classified as $y_i = -1$ or $1$ for $i = 1, \dots, n$.
  $f(x) = \sum_{i=1}^{n} \alpha_i y_i k(x_i, x) + b$, where $\alpha_i \geq 0$.
  Vertical dimension is set to
  $u = w / \|w\|$, with $w = \sum_{i=1}^{n} \alpha_i y_i \Phi(x_i)$ and $\|w\|^2 = \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j k(x_i, x_j)$.

SVM-Guided Biplot: Classification

- Kernel PCA Part:
  Let $u$ be the unit vector of the vertical dimension and $g_i = \langle \Phi(x_i), u \rangle$. Then
  $\Phi^{\perp}(x_i) = \Phi(x_i) - g_i u$, $i = 1, \dots, n$,
  $\therefore \langle \Phi^{\perp}(x_i), \Phi^{\perp}(x_{i'}) \rangle = k(x_i, x_{i'}) - g_i g_{i'}$, $i, i' = 1, \dots, n$.
  Hence $K \to K^{\perp} = K - g g'$, with $g = (g_1, \dots, g_n)'$.
  Horizontal dimension is determined by eigen-decomposing $K^{\perp}$, centered as in Kernel PCA.
- Perturbation Scheme for Arrow Diagrams.
  Define $x_i + \delta e_j$ ($p \times 1$), where $e_j$ represents a perturbation of which the magnitude is controlled by $\delta$. Then, project $\Phi(x_i + \delta e_j)$ on the first (vertical) and the second (horizontal) dimension.
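
The combination of the two parts can be sketched in kernel-matrix terms: the vertical coordinate is the projection on the SVM direction, and the horizontal coordinate comes from the leading eigenvector of the kernel matrix after that direction has been removed and the result centered. This is a hedged sketch, not the author's code: the function name `svm_guided_coordinates` is hypothetical, and `dual_coef` (the products $\alpha_i y_i$) is assumed to come from an SVM that has already been fitted elsewhere.

```python
import numpy as np


def svm_guided_coordinates(K, dual_coef):
    """n x 2 plot coordinates (horizontal, vertical) from a kernel matrix K and
    SVM dual coefficients c_i = alpha_i * y_i (assumed to be given)."""
    K = np.asarray(K, dtype=float)
    c = np.asarray(dual_coef, dtype=float)
    n = K.shape[0]
    g = K @ c / np.sqrt(c @ K @ c)              # g_i = <Phi(x_i), w / ||w||>
    K_perp = K - np.outer(g, g)                 # Gram matrix orthogonal to w
    J = np.eye(n) - np.ones((n, n)) / n         # centring matrix
    lam, E = np.linalg.eigh(J @ K_perp @ J)     # ascending eigenvalues
    horizontal = E[:, -1] * np.sqrt(max(lam[-1], 0.0))
    return np.column_stack([horizontal, g])


# Toy check with a linear kernel and hand-picked coefficients (not a fitted SVM).
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 3.0]])
c_demo = np.array([1.0, -1.0, 0.5, -0.5, 0.0])
coords_demo = svm_guided_coordinates(X_demo @ X_demo.T, c_demo)
```

In the toy example the direction $w$ lies along the first input variable, so the vertical coordinate reproduces that variable (with flipped sign) and the horizontal coordinate reproduces the centered second variable up to sign.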

Example 1. Iris Data: Versicolor vs. Virginica [sigma=0.1, C=1]

Importance of Variables (in the case of large $p$)

- It is necessary to select a small number of variables in determining the first and second dimensions.
- Measures of importance (definition): length of arrows
  1) in vertical direction,
  2) in horizontal direction.
- Plot arrow diagrams for important variables only.
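
The importance measures can be sketched from the arrow endpoints: for each variable, summarize how far its arrows move vertically and horizontally, then keep the variables with the largest values. Averaging absolute arrow components over observations is an assumption made here for illustration (the slides only say "length of arrows"), and the function names and toy numbers are hypothetical.

```python
import numpy as np


def arrow_importance(starts, ends):
    """Mean absolute arrow length per variable, in each plot direction.

    starts: n x 2 array of (horizontal, vertical) positions of the observations.
    ends:   n x p x 2 array of positions after perturbing each of p variables.
    """
    starts = np.asarray(starts, dtype=float)
    ends = np.asarray(ends, dtype=float)
    diff = ends - starts[:, None, :]                 # arrows, shape n x p x 2
    horizontal = np.abs(diff[:, :, 0]).mean(axis=0)
    vertical = np.abs(diff[:, :, 1]).mean(axis=0)
    return vertical, horizontal


def top_variables(scores, k):
    """Indices of the k largest importance scores, most important first."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    return [int(j) for j in order[:k]]


# Toy arrows for 2 observations and 3 variables (made-up numbers).
starts_demo = np.zeros((2, 2))
ends_demo = np.array([[[1.0, 0.0], [0.0, 3.0], [0.1, 0.1]],
                      [[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]])
vert_demo, horiz_demo = arrow_importance(starts_demo, ends_demo)
```

Arrow diagrams would then be drawn only for the variables returned by `top_variables` in either direction.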

Example 2. Spam Data [sigma=0.1, C=10]

SVM-Guided Biplot: Regression
- The same method can be applied to SVM regression.
- Example 3. Aerobic Fitness data for oxygen uptake (= $y$) with RBF kernel (sigma=0.1, C=10, epsilon=0.1)

Concluding Remarks
- The biplot method can be extended to suit linear regression or classification (logistic regression).
- The biplot method can be extended to allow nonlinear mapping of observations and variables, by fully utilizing the kernel trick.

http://blog.naver.com/huh4200

Goldfish bowl (on the iPad)

References
Gabriel, K.R. (1971). “The biplot graphic display of matrices with application to principal component analysis”. Biometrika, 58. 453-467.
Huh, M.H. (2013). “Arrow diagrams for kernel principal component analysis”. Communications for Statistical Applications and Methods, 20. 175-184.
Huh, M.H. (2013). “SVM-guided biplot of observations and variables”. Communications for Statistical Applications and Methods. (to appear)
Huh, M.H. and Lee, Y.G. (2013). “Biplots of multivariate data guided by linear and/or logistic regression”. Communications for Statistical Applications and Methods, 20. 129-136.
Schölkopf, B., Smola, A. and Müller, K.R. (1998). “Nonlinear component analysis as a kernel eigenvalue problem”. Neural Computation, 10. 1299-1319.
