Machine Learning-powered Taxonomization: AI Lends Taxonomists a Hand
* A talk by Alena Vasilevich, Computational Linguist at Coreon.

In the realm of data-driven businesses, formalized knowledge is a valuable resource for AI projects, but it is created at great expense.

IATE, with almost one million concepts storing multilingual terms and metadata, holds a large part of the textual knowledge of the EU. However, it can only be searched lexically, and its concepts stand alone, unconnected to one another.

Taxonomization means linking a flat set of concepts into a hierarchical knowledge graph. If IATE were converted into a full-fledged ontology, its data could not only be consumed by linguists but would also become accessible to machines, for example through a SPARQL endpoint.
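To make the idea of taxonomization concrete, here is a minimal Python sketch that links a flat concept list into a hierarchy and expresses the links as SKOS-style `skos:broader` triples. The concept IDs, terms, and parent links are hypothetical illustrations, not IATE data, and the code is not the method presented in the talk; it only shows what "flat list in, hierarchy out" can look like.

```python
from collections import defaultdict

# Hypothetical flat concept list: concept id -> preferred English term.
concepts = {
    "c1": "disease",
    "c2": "infectious disease",
    "c3": "COVID-19",
    "c4": "vaccine",
}

# Hypothetical child -> parent links, as a model or a linguist might propose them.
broader = {"c2": "c1", "c3": "c2"}


def build_tree(concepts, broader):
    """Return (roots, children), where children maps a parent id to its child ids."""
    children = defaultdict(list)
    for child, parent in broader.items():
        children[parent].append(child)
    # A concept with no parent link is a root of the hierarchy.
    roots = [cid for cid in concepts if cid not in broader]
    return roots, children


def render(node, children, concepts, depth=0):
    """Return the subtree under `node` as indented lines of text."""
    lines = ["  " * depth + concepts[node]]
    for child in sorted(children[node]):
        lines.extend(render(child, children, concepts, depth + 1))
    return lines


def to_triples(broader):
    """Express each link as a (child, 'skos:broader', parent) triple."""
    return [(child, "skos:broader", parent) for child, parent in sorted(broader.items())]


roots, children = build_tree(concepts, broader)
outline = [line for root in roots for line in render(root, children, concepts)]
print("\n".join(outline))
print(to_triples(broader))
```

Triples of this shape are what a triple store would hold, which is what makes the data queryable through a SPARQL endpoint rather than by lexical lookup alone.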

In this talk, we will present our approach to the semi-automatic generation of taxonomized concept maps, elevating a sub-domain of IATE terminology into a multilingual knowledge graph. We taxonomized a flat list of concepts within the COVID sub-domain, benchmarking two approaches to this task: automatic concept map creation using an enhanced ML-powered language model, and manual creation of the graph by an expert linguist.

We will discuss the performance and resource-saving advantages of our collaborative method, made easy by Coreon's user-friendly UI, and show how the achieved productivity rate can make the taxonomization of even larger terminology databases economically viable.

To demonstrate empirically the effectiveness of the semi-automatic approach in a typical industry use case, the resulting IATE/COVID graph was used to initialize a CNN for a multilingual document classification task. Leveraging the created taxonomy, we achieved a classification granularity that state-of-the-art models, such as non-initialized CNNs and zero-shot classifiers, do not reach.
