Slides of our presentation "What's going on out there right now? A beehive based machine to give snapshots of the ongoing stories on the Web", presented at the NaBIC 2012 conference, Mexico City, Mexico.
NaBIC 2012 presentation
1. What's going on out there right now?
A beehive based machine to give snapshot of
the ongoing stories on the Web
Štefan Sabo and Pavol Návrat
sabo@fiit.stuba.sk, navrat@fiit.stuba.sk
2. General overview
• Method to extract keywords related to stories from news
articles is proposed.
• Multiple agents inspired by honey bees foraging for food
are used.
• Connections between articles are explored one keyword at
a time.
• Most promising keywords that provide links between
articles are propagated, uninteresting keywords are
discarded.
4. Motivation
• News stories are often represented by terms that identify
the story by providing an easily recognizable label for it.
• These keywords are interesting for navigation in the space
of news stories.
• It is difficult to predict in advance which articles will develop
into stories over time and which keywords will represent
them.
• Dynamic system is needed to follow new articles and
account for the changes in the old ones.
• Corpus of all the articles is unavailable.
5. Method overview
• Most representative keywords are chosen by comparing
relevance of multiple articles to a given keyword.
• If two articles are both relevant to a keyword, a link is
established between them.
• Keywords that provide links between most articles are
selected as most interesting.
• Comparison between every two articles regarding every
keyword would be impractical.
• To facilitate the process of comparison, the process is
performed by a swarm of agents inspired by honey bees.
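The exhaustive comparison that the slide above calls impractical can be sketched as follows. This is only an illustration of the linking idea, not the authors' implementation: the function name `keyword_link_counts`, the relevance scores, and the threshold of 0.5 are all assumptions made for the example.

```python
from itertools import combinations


def keyword_link_counts(relevance, threshold=0.5):
    """Count, per keyword, the article pairs in which both articles are relevant.

    relevance: dict mapping article id -> dict of keyword -> relevance score.
    threshold: assumed cut-off above which an article counts as relevant.
    """
    counts = {}
    # Brute force: every pair of articles, every keyword they share.
    for first, second in combinations(sorted(relevance), 2):
        shared = set(relevance[first]) & set(relevance[second])
        for keyword in shared:
            if (relevance[first][keyword] >= threshold
                    and relevance[second][keyword] >= threshold):
                counts[keyword] = counts.get(keyword, 0) + 1
    return counts


# Hypothetical relevance scores, invented for illustration only.
articles = {
    "a1": {"Syria": 0.9, "attack": 0.6},
    "a2": {"Syria": 0.8, "court": 0.7},
    "a3": {"Syria": 0.7, "attack": 0.2},
}
link_counts = keyword_link_counts(articles)
```

The number of pairs grows quadratically with the number of articles, and each pair is checked against every shared keyword, which is the cost the swarm of agents is meant to avoid.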
6. Method overview - agents
• Every agent carries a single keyword at a time and can
independently perform one of 3 actions:
o foraging – comparing articles
o dancing – propagating its current keyword
o observing – selecting a new keyword
• Based on the keyword quality, an agent may decide to
propagate an interesting keyword through dancing or
select a new keyword through observation.
• This mechanism focuses the swarm on the most
interesting keywords for currently visited articles.
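A minimal sketch of one such agent is given below. The slides do not specify how keyword quality is computed or how the decision between dancing and observing is made, so the class `BeeAgent`, the quality update rule, both thresholds, and the shared `dance_floor` list are assumptions chosen to keep the example small.

```python
import random


class BeeAgent:
    """Hypothetical agent that carries a single keyword at a time."""

    def __init__(self, keyword):
        self.keyword = keyword
        self.quality = 0.0  # running estimate of how often the keyword links articles

    def forage(self, articles, rng, relevance_threshold=0.5):
        # Foraging: compare two randomly chosen articles on the carried keyword.
        first, second = rng.sample(sorted(articles), 2)
        linked = all(
            articles[name].get(self.keyword, 0.0) >= relevance_threshold
            for name in (first, second)
        )
        # Assumed update rule: equal-weight blend of old estimate and new observation.
        self.quality = 0.5 * self.quality + 0.5 * (1.0 if linked else 0.0)

    def step(self, articles, dance_floor, rng, dance_threshold=0.5):
        self.forage(articles, rng)
        if self.quality >= dance_threshold:
            # Dancing: advertise the current keyword to other agents.
            dance_floor.append(self.keyword)
            return "dancing"
        if dance_floor:
            # Observing: adopt a keyword that another agent is advertising.
            self.keyword = rng.choice(dance_floor)
            self.quality = 0.0
            return "observing"
        # Nothing is being advertised yet, so keep the current keyword.
        return "foraging"


# Hypothetical relevance scores, invented for illustration only.
articles = {
    "a1": {"Syria": 0.9, "attack": 0.6},
    "a2": {"Syria": 0.8, "court": 0.7},
    "a3": {"Syria": 0.7},
}
rng = random.Random(0)
dance_floor = []
scout = BeeAgent("Syria")
follower = BeeAgent("court")
scout_action = scout.step(articles, dance_floor, rng)
follower_action = follower.step(articles, dance_floor, rng)
```

In this toy run the agent carrying "Syria" finds a link and dances, while the agent carrying "court" finds none and adopts "Syria" by observation, which illustrates how the swarm drifts toward keywords that connect many articles.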
7. Results
• News articles from the Reuters web page were checked
daily for a period of 9 days.
• 298 unique keywords were identified.
• On average, 287 articles were assigned keywords
every day.
• Increased prevalence of proper nouns amongst the top
keywords can be noted.
8. Results – best keywords
keyword     n(k)      n(k) / N      keyword     n(k)      n(k) / N
Syria       177.30    6.87 %        court       49.90     1.93 %
Egypt        98.10    3.80 %        ECB         49.85     1.93 %
Apple        92.65    3.59 %        attack      49.41     1.91 %
Afghan       78.23    3.03 %        Colorado    41.79     1.62 %
Euro         75.50    2.92 %        trial       28.90     1.12 %
shooting     56.32    2.18 %        Libor       27.75     1.07 %
Samsung      55.71    2.16 %        murder      26.38     1.02 %
China        55.30    2.14 %        Aleppo      25.31     0.98 %
9. Results – development over time
[Figure: development over time of the keywords Colorado, China, shooting, Afghan, Egypt, Apple, Euro and Syria; vertical axis from 0 to 120, horizontal axis daily from 4.8. to 12.8.]
10. Summary
• Proposed approach utilizes agents inspired by honey bees
foraging for food to extract story related keywords from a
set of news articles.
• Articles are compared and their proximity is evaluated
multiple times with regard to various keywords.
• To reduce the number of performed comparisons, agents
use the mechanisms of propagation and observation to
select the best keywords and discard those less desirable.
• Dynamic nature of the process enables agents to react to
new articles as well as to changes in the old ones without
need for article corpus or machine learning.
11. Future work
• Multi-level hierarchical grouping of keywords based on
their generality.
• Visualization of stories.