Pattern Mining for Chinese Unknown Word Extraction
楊傑程 (student ID 955202037), third-year master's student, Computer Science and Information Engineering
2008/10/14
Outline
Introduction
Introduction - types of unknown words
Types of Chinese unknown words:
  • Proper names
      • Personal names. Ex: 王小明
      • Organization names. Ex: 華碩電腦
  • Abbreviation. Ex: 中油、中大
  • Derived words. Ex: 總經理、電腦化
  • Compounds. Ex: 電腦桌、搜尋法
  • Numeric type compounds. Ex: 1986 年、19 巷
Introduction - unknown word identification
Introduction - applied techniques
Related Works - particular methods
Related Works - general methods (Rule-based)
Related Works - general methods (Machine Learning-based)
Related Works - Imbalanced Data
Unknown Word Detection & Extraction
Unknown Word Detection & Extraction
[Figure: system flow in two phases.
Phase 1, Unknown Word Detection (Detection Rule Mining): 8/10 corpus, initial segmentation, POS tagging, mining tool (Prowl), rules, 1/10 corpus (validation), judge.
Phase 2, Unknown Word Extraction (Machine Learning, Classification): 8/10 corpus + detection tags (training), model, 1/10 corpus + detection tags (testing), classification decision, judge.]
Unknown Word Detection
Unknown word detection - Pattern Mining
Unknown word detection - Continuity Pattern Mining
Encoding
Create Detection Rules
Example pattern counts (Y / N is the detection tag on a character):
  ( 葡 (Na), 萄 (Na) ) : 2
  ( 葡 (Na) Y, 萄 (Na) ) : 1
  ( 葡 (Na) N, 萄 (Na) N ) : 1
  ( 葡 (Na) Y, 萄 (Na) Y ) : 1
  ( 葡, 萄 ) : 2
  ( 葡 (Na), 萄 ) : 2
  ( 葡, 萄 (Na) ) : 2
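The pattern counts above can be turned into a rule score. The sketch below is only an illustration: it assumes that a rule's accuracy is the count of the tagged pattern divided by the count of the same pattern without tags, and that rules are kept when they pass an accuracy threshold and a minimum frequency (both appear as parameters in the experiments, but their exact definitions are not recoverable from the slides). The names `untagged`, `rule_accuracy` and `keep_rule` are hypothetical.

```python
# Pattern counts copied from the slide. Each element is (character, POS, tag);
# None means the pattern leaves that field unconstrained.
COUNTS = {
    (("葡", "Na", None), ("萄", "Na", None)): 2,
    (("葡", "Na", "Y"), ("萄", "Na", None)): 1,
    (("葡", "Na", "N"), ("萄", "Na", "N")): 1,
    (("葡", "Na", "Y"), ("萄", "Na", "Y")): 1,
    (("葡", None, None), ("萄", None, None)): 2,
    (("葡", "Na", None), ("萄", None, None)): 2,
    (("葡", None, None), ("萄", "Na", None)): 2,
}


def untagged(pattern):
    """Drop the detection tags so the pattern matches every occurrence."""
    return tuple((char, pos, None) for char, pos, _tag in pattern)


def rule_accuracy(pattern, counts):
    """Assumed definition: tagged occurrences / all occurrences of the pattern."""
    total = counts.get(untagged(pattern), 0)
    return counts.get(pattern, 0) / total if total else 0.0


def keep_rule(pattern, counts, min_accuracy, min_frequency):
    """Keep a rule only if it is frequent and accurate enough (assumed filter)."""
    frequent = counts.get(untagged(pattern), 0) >= min_frequency
    return frequent and rule_accuracy(pattern, counts) >= min_accuracy


rule = (("葡", "Na", "Y"), ("萄", "Na", "Y"))
print(rule_accuracy(rule, COUNTS))  # 1 tagged occurrence out of 2 -> 0.5
```

Under these assumptions the rule "both 葡 (Na) and 萄 (Na) are tagged Y" scores 0.5 and would be rejected at any accuracy threshold used in the experiments (0.7 and above).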
Unknown Word Extraction
Unknown Word Extraction - feature (POS)
Unknown Word Extraction - feature (term_attribute)
Example (a "?" inside the parentheses marks a token flagged by detection):
  Token (detection mark)    term_attribute
  運動會 ()                  ps()
  ‧ ()                      dot()
  四年 ()                    ds()
  甲班 ()                    ds()
  王 (?)                    ms(?)
  姿 (?)                    ms(?)
  分 (?)                    ms(?)
  ‧ ()                      dot()
  本校 ()                    ds()
  為 ()                     ms()
  響 ()                     ms()
  應 ()                     ms()
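The example above is consistent with a simple length-based mapping. The sketch below assumes, from that example only, that `ms`, `ds` and `ps` stand for tokens of one, two, and three or more characters, and that `dot` marks punctuation; the slides do not spell out these definitions. The `PUNCTUATION` set and the function name are hypothetical.

```python
# Punctuation treated as "dot"; this set is an assumption for illustration.
PUNCTUATION = set("‧，。、；：？！")


def term_attribute(token, flagged):
    """Return a label such as 'ms(?)', 'ds()', 'ps()' or 'dot()'."""
    mark = "?" if flagged else ""
    if all(ch in PUNCTUATION for ch in token):
        kind = "dot"
    elif len(token) == 1:
        kind = "ms"   # assumed: one-character token
    elif len(token) == 2:
        kind = "ds"   # assumed: two-character token
    else:
        kind = "ps"   # assumed: three or more characters
    return f"{kind}({mark})"


# (token, flagged by detection) pairs from the slide's example sentence.
SENTENCE = [("運動會", False), ("‧", False), ("四年", False), ("甲班", False),
            ("王", True), ("姿", True), ("分", True), ("‧", False),
            ("本校", False), ("為", False), ("響", False), ("應", False)]

ATTRIBUTES = [term_attribute(token, flagged) for token, flagged in SENTENCE]
print(" ".join(ATTRIBUTES))
```

With these assumptions the output reproduces the term_attribute row shown on the slide.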
Data Processing - Sliding Window
[Figure: sliding window over tokens t0 to t4: prefix (t0), 3-gram (t1 t2 t3), suffix (t4).]
EX: 3-gram Model
Sentence: 運動會 () ‧ () 四年 () 甲班 () 王 (?) 姿 (?) 分 (?) ‧ () 本校 () 為 () 響 () 應 ()
Windows (prefix | 3-gram | suffix) and their labels:
  運動會 | ‧ 四年 甲班 | 王 (?)            discard
  ‧ | 四年 甲班 王 (?) | 姿 (?)            negative
  四年 | 甲班 王 (?) 姿 (?) | 分 (?)        negative
  甲班 | 王 (?) 姿 (?) 分 (?) | ‧          positive
  王 (?) | 姿 (?) 分 (?) ‧ | 本校          negative
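The windowing in this example can be sketched as follows. The labelling rule is inferred from the example and is an assumption: a window is discarded when its n-gram contains no token flagged by detection, it is positive when the n-gram is exactly a known unknown word (here the personal name 王姿分), and negative otherwise. The function names and the `gold_words` argument are hypothetical.

```python
def windows(tokens, n=3):
    """Yield (prefix, n-gram, suffix) for each n-gram with a token on both sides."""
    for i in range(1, len(tokens) - n):
        yield tokens[i - 1], tokens[i:i + n], tokens[i + n]


def label(ngram, gold_words):
    """Assumed labelling rule, inferred from the slide's example."""
    if not any(flagged for _text, flagged in ngram):
        return "discard"  # no detected morpheme inside the n-gram
    word = "".join(text for text, _flagged in ngram)
    return "positive" if word in gold_words else "negative"


# (token, flagged by detection) pairs for the part of the sentence used on the slide.
SENTENCE = [("運動會", False), ("‧", False), ("四年", False), ("甲班", False),
            ("王", True), ("姿", True), ("分", True), ("‧", False), ("本校", False)]

LABELS = [label(ngram, {"王姿分"})
          for _prefix, ngram, _suffix in windows(SENTENCE, n=3)]
print(LABELS)  # ['discard', 'negative', 'negative', 'positive', 'negative']
```

Changing `n` to 2 or 4 gives the candidate windows for the 2-gram and 4-gram models under the same assumptions.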
Unknown Word Extraction - feature (Statistical Information)
[Figure: sliding window over tokens t0 to t4: prefix (t0), 3-gram (t1 t2 t3), suffix (t4).]
Data presentation
Each instance is laid out as:
  • prefix: pos (55), term_attribute (6)
  • t1: pos (55), term_attribute (6)
  • t2: pos (55), term_attribute (6)
  • ...
  • suffix: pos (55), term_attribute (6)
  • statistics (7)
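One way to read this layout is as a fixed-length vector. The sketch below assumes one-hot encoding for the 55 POS values and the 6 term_attribute values at each position, followed by the 7 statistical features; the encoding and the order of the blocks are assumptions, since the slide gives only the block sizes. All names are hypothetical.

```python
N_POS, N_ATTR, N_STATS = 55, 6, 7  # block sizes taken from the slide


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vector = [0.0] * size
    vector[index] = 1.0
    return vector


def encode_window(positions, stats):
    """Encode one window.

    positions: (pos_index, attr_index) for prefix, t1..tn, suffix, in that order.
    stats: the 7 statistical features for the n-gram.
    """
    if len(stats) != N_STATS:
        raise ValueError("expected 7 statistical features")
    vector = []
    for pos_index, attr_index in positions:
        vector += one_hot(pos_index, N_POS) + one_hot(attr_index, N_ATTR)
    return vector + list(stats)


# A 3-gram window has 5 positions: prefix, t1, t2, t3, suffix.
example = encode_window([(0, 0)] * 5, [0.0] * N_STATS)
print(len(example))  # 5 * (55 + 6) + 7 = 312
```

Under these assumptions a 2-gram instance has 4 * 61 + 7 = 251 dimensions and a 4-gram instance has 6 * 61 + 7 = 373.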
Experiments
Unknown Word Detection

  Threshold (Accuracy)   Precision   Recall   F-measure (our system)   F-measure (AS system)
  0.7                    0.9324      0.4305   0.589035                 0.71250
  0.8                    0.9008      0.5289   0.66648                  0.752447
  0.9                    0.8343      0.7148   0.769941                 0.76955
  0.95                   0.764       0.8288   0.795082                 0.76553
  0.98                   0.686       0.8786   0.770446                 0.744036

  Frequency >=   Precision   Recall   F-measure
  3              0.764       0.8288   0.795082
  7              0.7113      0.8819   0.787466
  11             0.6924      0.8932   0.780085
  19             0.6736      0.8995   0.77033
  29             0.6552      0.9092   0.76158
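The F-measure values reported for this system match the balanced F-measure, F = 2PR / (P + R). A small check, using the precision and recall figures from the threshold table (the function name is hypothetical):

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (threshold, precision, recall, reported F-measure) from the detection results.
ROWS = [
    (0.7, 0.9324, 0.4305, 0.589035),
    (0.8, 0.9008, 0.5289, 0.66648),
    (0.9, 0.8343, 0.7148, 0.769941),
    (0.95, 0.764, 0.8288, 0.795082),
    (0.98, 0.686, 0.8786, 0.770446),
]

for threshold, precision, recall, reported in ROWS:
    print(threshold, round(f_measure(precision, recall), 6), reported)
```

Raising the accuracy threshold trades recall for precision; in these figures the balance is best at 0.95.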
Unknown Word Extraction
Extraction result

  n-gram   Precision   Recall   F1-score
  2-gram   56.7%       67.1%    0.614
  3-gram   63.3%       80%      0.707
  4-gram   30.6%       70.3%    0.426
  Total    58.1%       68.2%    0.627
Ensemble Method Improvement

  Classifier    2-gram                          3-gram                          4-gram
  model         Precision  Recall  F1-Score     Precision  Recall  F1-Score     Precision  Recall  F1-Score
  C1            0.518      0.64    0.572        0.542      0.808   0.649        0.252      0.419   0.315
  C2            0.569      0.657   0.61         0.627      0.791   0.7          0.219      0.743   0.338
  C3            0.535      0.633   0.58         0.563      0.81    0.664        0.222      0.378   0.28
  C4            0.557      0.645   0.598        0.574      0.796   0.667        0.305      0.676   0.42
  C5            0.555      0.66    0.603        0.549      0.779   0.644        0.205      0.554   0.299
  C6            0.536      0.636   0.582        0.568      0.735   0.641        0.23       0.608   0.333
  C7            0.557      0.66    0.604        0.611      0.691   0.648        0.211      0.703   0.325
  C8            0.541      0.673   0.6          0.579      0.813   0.676        0.226      0.486   0.309
  C9            0.548      0.657   0.598        0.587      0.715   0.645        0.215      0.635   0.321
  C10           0.543      0.661   0.596        0.599      0.723   0.655        0.232      0.662   0.344
  C11           0.533      0.668   0.593        0.607      0.74    0.667        0.24       0.554   0.335
  C12           0.538      0.645   0.587        0.587      0.776   0.669        0.299      0.662   0.412
  Caverage      0.544      0.653   0.594        0.583      0.765   0.66         0.238      0.59    0.336
  Censemble     0.567      0.671   0.614        0.633      0.8     0.707        0.306      0.703   0.426
Experiment - One phase

  Classification performance   Precision   Recall   F-score
  One Phase                    40.8%       71.4%    0.52
  Two Phases                   58.1%       68.2%    0.627
Conclusions
