Using Content-Based Filtering in a System of
Recommendation in the Context of Digital Mobile
Interactive TV
Elaine Cecília Gatto, Sergio Donizetti Zorzo
IEEE Conference Publishing
Computer Science Department. Federal University of São Carlos – UFSCar
Highway Washington Luís, Km 235, PO Box 676, 13565-9. São Carlos, São Paulo, Brazil.
{elaine_gatto, zorzo}@dc.ufscar.br.
Abstract-Recommendation systems provide suggestions based
on information about the preferences of users. Recommender
systems use information filtering to process information and
produce suggestions for users, and content-based filtering is
an information filtering approach widely used in such systems.
Content-based filtering analyzes the correlation between the
content of items and the user profile, suggesting relevant items
and discarding irrelevant ones. Widely used on the Internet,
recommendation systems are now being studied for use in the
context of Digital TV, and several studies in this direction
already exist. Just as on the Internet, recommendation systems
can be used in Digital TV to recommend TV programs,
advertising and electronic commerce. Thus, in the context of
Digital TV, the items may be programs, advertisements and
products to be sold. Applying content-based filtering to the
recommendation of programs, for example, means correlating
the content of these programs with the user's preferences,
which in this scenario are the types of programs the user has
preferred to watch. This paper presents studies of content-based
filtering applied to Digital TV data. The studies seek to observe
and evaluate how some content-based filtering techniques can
be used in recommendation systems in the context of Digital TV.

I. INTRODUCTION

The implementation of Digital TV in Brazil opens new markets
to be explored. Technologies that have succeeded in the Web
environment, for example, can be applied in the Digital TV
domain and achieve the same success.
User interaction, whether through the remote control, the cell
phone keypad or other means, will allow many applications to
be brought to this environment.
One of the areas that has been extensively studied and has
succeeded on the Web is personalization. There are several
studies concerning recommendation systems for Digital TV,
for example [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], among others.
Recommendation systems can contribute to a better use of
Digital TV in residences, by groups or individuals, or on a cell
phone, for example. These systems can help the user choose a
program, avoiding wasted time and suggesting programs that
really interest the user. Moreover, recommendation systems can
be applied to advertising on Digital TV, as well as to
T-Commerce.

This work is organized as follows: Section 1 introduces the
paper, Section 2 compares IDTV in residences with IDTV for
cell phones, Section 3 presents related work, Section 4
discusses content-based filtering, Section 5 presents our
recommendation system, Section 6 describes the characteristics
of the households, the EPG, the user history and the
methodology used for the tests, Section 7 discusses the results
and Section 8 presents the conclusion.
II. COMPARING IDTV IN RESIDENCES AND IDTV FOR CELL
PHONES
The use of IDTV on cell phones is expected to grow quickly
once cell phones with IDTV become available to the population,
since these devices are increasingly outnumbering television
sets in Brazil. Some differences between IDTV in residences
and on cell phones can therefore be noted.
The IDTV standard adopted in Brazil calls fixed devices such
as set-top boxes full-seg, and devices such as cell phones,
mini TVs and PDAs one-seg. In residences, IDTV is used by
all residents, while on a cell phone it is normally used by only
one user, the owner of the device.
Another characteristic is the size of the display. In
residences, IDTV sets have screens larger than 30", which
allows more flexible development, presentation and display of
content. Cell phone screens, however, are smaller than 10",
requiring more development effort to display content on the
screen without visual clutter and confusion for the user.
A distinctive characteristic of this environment is that IDTV
on cell phones can be watched anywhere and anytime. On the
other hand, viewing periods in residences can be longer than
on cell phones, which are used while waiting or commuting.
IDTV on cell phones can use the existing 2G/3G network
architecture, and 4G in the future, as a return channel, making
interactivity possible in this environment before it arrives in
residential IDTV.
The middleware adopted in Brazil is based on national
technology and is called Ginga. Both Ginga-NCL and Ginga-J,
the declarative and imperative portions of the middleware, are
required for full-seg devices. For one-seg devices, only the
declarative portion, Ginga-NCL, is required. There is a
reference implementation of the middleware for full-seg
devices. For one-seg devices a reference implementation is not
yet available, but groups such as PUC-Rio and UFES are
working on its development (for Symbian and Android). [11, 12, 13]
Users of these devices need special attention because of the
current constraints of this environment, such as processing
power, storage capacity and battery life.
III. RELATED WORK
There are many works on recommendation systems for IDTV
on set-top boxes and, more recently, on portable devices. This
section presents recent works on recommendation systems for
IDTV.
The recommendation system in [5] falls into the content-based
filtering category and uses text mining. The system offers a
simple user interface and accepts natural-language text as
input, as well as four values which reflect the user's
preferences for comedy, action, horror and erotic content.
First, the system extracts the texts, then searches for emotions
in the text, and the distances among themes are calculated.
Finally, an index is calculated for each entry and a list of
programs ordered by this index is returned.
In [3] the main aim of the system is to replace common content
with personalized, adapted content that is more attractive to
the user. To that end, the system supports TV reception
through either broadcast or multimedia streaming. The system
uses explicit collection (on first use, the user must state his
preferences) and also implicit collection (the user's actions on
the device are monitored, stored and sent to the server). The
personalized content, chosen based on these preferences, is
sent by the server to the user's portable device, where it is
stored before being shown.
ZapTV [4], developed for the DVB-H standard, allows the user
to create his own content and offers value-added services such
as multimodal access (Web and cell phones), a return channel,
video annotation, and personalized sharing and distribution of
content. Besides the technology provided by DVB-H, ZapTV
encompasses other technologies such as TV-Anytime, emerging
Web 2.0 technologies and those of the Semantic Web. The main
functionalities of ZapTV include a social network, personalized
content broadcasting (implicit or explicit recommendation),
planning of thematic channel broadcasts (by age group, genre
or specific theme), a client application and transmission of the
electronic programming guide. ZapTV seeks to improve
recommendation using an intelligent personalization mechanism
which combines information filtering with semantic reasoning
processes. It is based on the principles of participation and
sharing among Web 2.0 users, so that the creation, sharing,
classification and annotation of content make searching for
content easier.
Our recommendation system runs on the portable device itself.
No servers need to be added to the Brazilian IDTV architecture
for cell phones in order to provide recommendations, and
consequently no remote communication is needed. This spares
the user from paying for network data traffic to receive
recommendations or send data, and thus also protects the
privacy of the user's data.
IV. CONTENT-BASED FILTERING
Content-Based Filtering (CBF) uses content attributes to
describe the items and then calculates their similarity. This
approach does not depend on other users' evaluations of the
items. CBF is an information retrieval technique which bases
its predictions on the premise that users' previous preferences
are reliable indicators of future behavior. In order to formulate
recommendations, a variety of algorithms has been proposed to
evaluate the content of documents and find regularities. Some
of these algorithms treat the task as a classification problem
and others as a regression problem. Some of the problems and
limitations found in systems using CBF are overspecialization,
the new-user problem and limited content analysis. [7, 6, 14]
V. RECOMMENDER SYSTEM
Our recommendation system aims to make the IDTV user's
routine easier: through a simple interface, it provides content
the user prefers without the user spending much time to find
it.
The process starts when the user turns on the TV on the cell
phone. The collected user history data are submitted to
content-based information filtering in order to find the user's
profile. The data resulting from this process are formatted.
The user profile is stored in a database with the date and time
of its generation. With the user profile updated, it is possible
to search the EPG for compatible TV programs that will be
broadcast around the current time, producing a list of these
programs. This list is also stored in a database with the date
and time of its generation.
The recommendations are presented to the user, and those the
user selects are stored with the user history. While IDTV on
the cell phone is turned on, all programs viewed by the user
are stored in the database that holds the user history. This
process is repeated every time the user turns on the TV.
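The flow described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names `build_profile` and `recommend`, the record fields, the 30-minute window and the sample data are all assumptions introduced here.

```python
from collections import Counter
from datetime import datetime, timedelta


def build_profile(history, top_n=2):
    """Return the top_n genres by total minutes watched (hypothetical profile rule)."""
    minutes = Counter()
    for entry in history:
        minutes[entry["genre"]] += entry["minutes"]
    return [genre for genre, _ in minutes.most_common(top_n)]


def recommend(profile, epg, now, window=timedelta(minutes=30)):
    """List EPG programs whose genre is in the profile and that start near `now`."""
    return [
        program["title"]
        for program in epg
        if program["genre"] in profile and abs(program["start"] - now) <= window
    ]


# Made-up viewing history and EPG excerpt, for illustration only.
history = [
    {"genre": "news", "minutes": 40},
    {"genre": "movie", "minutes": 90},
    {"genre": "sports", "minutes": 10},
    {"genre": "news", "minutes": 25},
]
epg = [
    {"title": "Evening News", "genre": "news", "start": datetime(2008, 3, 6, 20, 0)},
    {"title": "Cup Match", "genre": "sports", "start": datetime(2008, 3, 6, 20, 15)},
    {"title": "Late Movie", "genre": "movie", "start": datetime(2008, 3, 6, 23, 0)},
]

profile = build_profile(history)
suggestions = recommend(profile, epg, now=datetime(2008, 3, 6, 20, 10))
print(profile, suggestions)
```

In a real deployment the profile and the suggestion list would also be persisted with their generation timestamp, as the text describes; that storage step is left out here.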
The Ginga-NCL middleware has a layer for resident
applications, responsible for presentation, another layer for
the common core, responsible for offering several services, and
a last layer for the protocol stack.
The recommendation system is considered an element of the
Ginga-NCL architecture, within the Ginga Common Core,
because it needs to keep using data locally and also to use the
Tuner library (to obtain information about the tuned
channels), ESInformation (to obtain information from the EIT
table that generates the EPG) and the Context Manager (to
obtain system information).
As the Ginga-NCL middleware is mandatory in Brazil for
portable devices, the recommendation system was planned,
designed and modeled according to the Brazilian standards for
portable devices, thus meeting the needs of these devices.
More details on our system can be found in [15].
VI. TESTS
For the tests we used data on TV viewing and program
schedules. These data were provided by IBOPE, a privately
held Brazilian multinational and a leading market research
company in Latin America. For 67 years IBOPE has provided a
wide range of information and studies on media, public
opinion, voting intention, consumption, brands and market
behavior. The following subsections detail the characteristics
of these data and the tests. [16]
A. Characteristics of Residences
The data provided by IBOPE contain the EPG (TV
programming), the users' viewing history (what each viewer
watched) and socioeconomic information. All these data were
separated and stored in a MySQL database. The data
correspond to 15 days of programming and monitoring of six
Brazilian households with free-to-air TV. These households
were monitored every minute, and each individual was also
monitored separately.
TABLE I
NUMBER OF INDIVIDUALS BY RESIDENCE
Residence   Individuals   TVs
1           2             1
2           3             1
3           3             2
4           2             2
5           2             1
6           3             2


B. Characteristics of the Data
The data used for these tests underwent a process of manual
adjustment. Each of the algorithms used required
pre-processing for correct use and analysis. Subsections C, D,
E and F detail the composition of these data.
C. EPG
The EPG (Electronic Program Guide) is composed of 15
TXT files called programming files, one for each day
(05/03/2008 to 19/03/2008), with the grid of 10 free-to-air TV
stations, starting at 00:00:00 and ending at 05:59:00. After
the structure of the EPG files was understood, the data were
copied from the programming files into a BrOffice spreadsheet
and unnecessary data were cleaned out.
We noticed some inconsistencies in the schedules, which were
corrected immediately so that later analyses would not produce
erroneous results. This was repeated for each of the 15
programming files, generating a single spreadsheet containing
the entire 15 days of EPG.
The day of the week and the duration of each program were
added to these data. At this step the EPG was still
incomplete, lacking the genre and subgenre of each program.
The genre of each program was looked up on the official
websites of the broadcasting stations and then classified
according to the Brazilian standard ABNT NBR 15603-3:2007,
Annex C, "genre descriptor in the content descriptor." [17]
A new table was created, identical to the EPG table but with
added fields named after the genres, to be used with the
cosine technique. These fields were populated with 0 or 1
depending on whether the program fits that genre, forming a
matrix.
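As a rough illustration of this encoding, the sketch below builds one 0/1 row per program. The genre list, program titles and the helper name `genre_row` are hypothetical; the real genre set comes from ABNT NBR 15603-3 and is not reproduced here.

```python
# Hypothetical genre columns; the actual set follows the ABNT genre descriptor.
GENRES = ["news", "sports", "movie", "series"]


def genre_row(program_genres):
    """Encode a program as a 0/1 vector with one position per genre column."""
    return [1 if genre in program_genres else 0 for genre in GENRES]


# Made-up EPG excerpt mapping each program to the genres it fits.
epg_genres = {
    "Evening News": {"news"},
    "Cup Match": {"sports"},
    "Crime Drama": {"series", "movie"},
}

matrix = {title: genre_row(genres) for title, genres in epg_genres.items()}
for title, row in matrix.items():
    print(title, row)
```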
D. User History
Users' viewing history data are needed to discover their
preferences. In the Digital TV context that we are
considering, these data are collected and stored implicitly.
The tuning spreadsheets sent by IBOPE, which contain the
user data, were modified so that content-based filtering
techniques could be applied.
The data were then separated by household. Although some
homes have more than one TV, no record was found of more
than one TV being monitored at the same time in these
households, and therefore each household is considered to
have only one TV.
The data were also formatted: dates as yyyy-mm-dd, times as
hh:mm:ss and TV identifiers as 00X. The resulting sheets were
converted to CSV files, which were then loaded into MySQL.
Day of week, time of day and viewing duration were also added
to the user data.
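A minimal sketch of this formatting step is given below. It assumes the raw dates arrive as dd/mm/yyyy (the form used for the EPG dates above); the function names, the time-of-day buckets and the sample record are hypothetical.

```python
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def time_of_day(hour):
    """Bucket an hour into a period of the day (hypothetical boundaries)."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def normalize(raw_date, start, end, tv_number):
    """Format one viewing record and derive the extra fields described in the text."""
    begin = datetime.strptime(f"{raw_date} {start}", "%d/%m/%Y %H:%M:%S")
    finish = datetime.strptime(f"{raw_date} {end}", "%d/%m/%Y %H:%M:%S")
    return {
        "date": begin.strftime("%Y-%m-%d"),          # yyyy-mm-dd
        "time": begin.strftime("%H:%M:%S"),          # hh:mm:ss
        "tv": f"{tv_number:03d}",                    # 00X
        "day_of_week": DAYS[begin.weekday()],
        "time_of_day": time_of_day(begin.hour),
        "duration_min": int((finish - begin).total_seconds() // 60),
    }


row = normalize("05/03/2008", "20:00:00", "20:45:00", 1)
print(row)
```

Viewing sessions that cross midnight would need the end date handled separately; the sketch ignores that case.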
E. Methodology
We simulated content-based filtering using two different
techniques, Apriori and cosine, both using genre as the target
attribute. In the case of Apriori, we applied the settings
shown in Figure 1.
The simulation was then started for Apriori. For each
household the process was the same: the first day was used to
generate recommendations for the second day; the second day,
together with what was seen the day before, was used to
generate recommendations for the third day; and so on.
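The day-by-day protocol can be sketched as below. The sketch assumes the history accumulates (days 1 to k are used to recommend for day k+1) and substitutes a simple most-watched-genres rule for the actual profile discovery; `rolling_splits`, `top_genres` and the sample data are hypothetical.

```python
from collections import Counter


def rolling_splits(days):
    """Yield (training_days, target_day): days 1..k are used to predict day k+1."""
    for k in range(1, len(days)):
        yield days[:k], days[k]


def top_genres(training_days, n=2):
    """Stand-in profile: the n most frequently watched genres so far."""
    counts = Counter(genre for day in training_days for genre in day)
    return {genre for genre, _ in counts.most_common(n)}


# Made-up genres watched by one household on four consecutive days.
viewing = [
    ["news", "movie"],
    ["news", "movie", "sports"],
    ["movie"],
    ["series"],
]

hits = []
for training, target in rolling_splits(viewing):
    recommended = top_genres(training)
    # A hit means someone watched at least one recommended genre the next day.
    hits.append(bool(recommended & set(target)))
print(hits)
```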
First we opened the CSV file corresponding to household X
and day 1. Next, filters were applied to convert some
attributes from String and others from Numeric to Nominal.
Apriori was then run and its output saved. With the data
saved, it was possible to assess whether, on the following day,
someone from the household watched any of the genres found
by Apriori.
For the cosine, we first found and saved the profile, then
calculated the cosine distance, the norm and the cosine itself,
and finally counted the correct answers. The process is
iterative and is performed for each day and each household.
F. Apriori and Cosine
Association algorithms identify associations between data
records that are somehow related. The basic premise is to find
items whose presence implies the presence of others in the
same transaction, in order to determine which things are
related.
Moreover, we also calculated the average percentage of
correct answers for the number of recommendations generated
using the following formula:
(2)
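One plausible reading of this averaging step is the arithmetic mean of the daily hit percentages for each household. The sketch below follows that reading; the function name `average_percentage` and the sample percentages are invented for illustration and are not the paper's results.

```python
def average_percentage(percentages):
    """Arithmetic mean of a list of daily hit percentages."""
    return sum(percentages) / len(percentages)


# Made-up daily hit percentages per household, for illustration only.
daily_by_household = {
    1: [25.0, 50.0, 75.0],
    2: [100.0, 50.0],
}

averages = {
    household: average_percentage(values)
    for household, values in daily_by_household.items()
}
print(averages)
```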

Figure 2 and Figure 3 show charts of the average hit
percentages calculated for each household for the Cosine and
Apriori techniques. Households 2, 4 and 6 had the best results
for the cosine, and households 4 and 6 for Apriori. As can be
seen in Figure 4, which compares the two techniques overall,
the cosine outperformed Apriori.

Figure 1. Parameters used in Apriori.

Association rules interconnect objects in an attempt to
expose patterns and trends. The discovery of associations must
reveal both trivial and non-trivial associations.
The Apriori algorithm is often used to mine association
rules. Apriori can work with a high number of attributes,
generating various combinations of them and performing
successive searches across the database, while maintaining
good performance in terms of processing time.
The algorithm tries to find all the relevant association rules
between items, which have the form X (antecedent) ==> Y
(consequent). If x% of the transactions that contain X also
contain Y, then x% is the confidence factor (the confidence of
the rule). The support factor is the percentage of cases in
which X and Y occur together over the total number of records
(their frequency). [18]
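The two factors can be computed directly from a list of transactions, as in the sketch below. It only illustrates the definitions of support and confidence for a rule X ==> Y; it is not the Apriori search itself, and the transactions and function names are made up.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= transaction for transaction in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing X that also contain Y."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(transactions, set(antecedent) | set(consequent)) / base


# Made-up viewing transactions: each set holds attributes of one viewing session.
transactions = [
    {"evening", "movie"},
    {"evening", "movie"},
    {"evening", "news"},
    {"morning", "news"},
]

rule_support = support(transactions, {"evening", "movie"})
rule_confidence = confidence(transactions, {"evening"}, {"movie"})
print(rule_support, rule_confidence)
```

Here the rule evening ==> movie holds in 2 of the 4 sessions (support 0.5) and in 2 of the 3 evening sessions (confidence about 0.67).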
The cosine is a measure of similarity, a metric that can be
applied to find out whether or not an item correlates with the
user profile. Consider two binary vectors, x and y, in an
n-dimensional space, where n is the number of items in each
vector. The cosine between the vectors can then be calculated
and taken as the similarity between the user profile and its
history. The closer the value of the cosine is to 1, the
greater the similarity. [19]
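A minimal sketch of the cosine between two binary genre vectors follows, using the standard definition (dot product divided by the product of the norms). The genre order, the profile vector and the program names are assumptions made for illustration.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


# Hypothetical genre order: [news, sports, movie, series].
profile = [1, 0, 1, 0]
programs = {
    "Evening News": [1, 0, 0, 0],
    "Cup Match": [0, 1, 0, 0],
    "News Movie Special": [1, 0, 1, 0],
}

# Rank programs from most to least similar to the profile.
ranked = sorted(programs, key=lambda title: cosine(profile, programs[title]), reverse=True)
print(ranked)
```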

Figure 2. Parameters used in Cosine.

Figure 3. Parameters used in Apriori.

VII. RESULTS
For every household, Excel spreadsheets were built to
compute the success percentage of each technique. Given the
number of recommendations generated, the basic formula for
calculating the percentage of correct answers was used:

(1)
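Read literally, the percentage of correct answers can be written as below, where h (the number of recommendations that matched what was actually watched) and r (the number of recommendations generated) are symbols introduced here for illustration and are not taken from the original formula.

```latex
\[
  P = \frac{h}{r} \times 100
\]
```

For example, under this reading, 3 matching recommendations out of 4 generated would give P = 75.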
Figure 4. Comparison between means of Apriori and Cosine.
VIII. CONCLUSION

During the tests, we observed some peculiarities. Our
system recommends content based on program genres, and our
analyses were made according to that parameter. For the
Apriori algorithm, the data are already collected in a format
suitable for use. For the cosine, the EPG needs to be
converted into a matrix before the profile discovery and
recommendation process can start.
Apriori can only mine the user's viewing history,
discovering the user's profile from the rules. To select the
programs to be recommended, another technique must be used.
The cosine can do both. However, Apriori can discover more
features in the user history, for example, "the user is in
front of the TV more often at night, enjoys watching movies
and watches TV more frequently on Mondays."
The cosine cannot find these features, but it can reach our
goal. To find patterns similar to those in the association
rules, more complex database queries are needed.
The output of Apriori must be processed to generate the
correct user profile, i.e., the rules must be interpreted,
which is somewhat cumbersome to implement. The cosine output
is more readable: its result goes straight to the intended
goal and can be used without post-processing.
As for input, Apriori needs no treatment, since the data are
used the way they are collected. For the cosine, however,
whenever the EPG is updated, the table containing the EPG
matrix must be modified according to the new EPG, which is
somewhat laborious.
We then simulated, with both techniques, the process of
delivering and accepting recommendations, calculating the
percentage of correct answers and generating charts. The genre
profiles found by both algorithms are similar. Although both
techniques cover the needs of the system, the cosine is the one
that can be put to better use.
On a desktop, as in our tests, computing the cosine returns
faster than the Apriori association rules. However, further
studies on the processing cost of these algorithms on a cell
phone with IDTV are not yet possible in Brazil. The time that
the whole recommendation process takes to complete varies
according to the personalization technique used. In our tests
and simulations, the cosine finished the process before
Apriori.
The studies show that, although both algorithms meet our
needs, of the two techniques the cosine is better suited to the
recommendation system for portable interactive Digital TV.

ACKNOWLEDGMENT
We thank IBOPE for providing real data on the electronic
program guide and on viewers' behavior from March 5, 2008 to
March 19, 2008.
REFERENCES
[1] Avila, P. M. TV Recommender: Application Development Support of
     Recommendation for the Brazilian System of Digital TV. Dissertation.
     Graduate in Computer Science. Department of Computer Science.
     Federal University of Sao Carlos. 90 pages, 2010.
[2] Lucas, A. Customization for Digital TV using the strategy of
     Recommender System for multiuser environments. Dissertation.
     Graduate in Computer Science. Department of Computer Science.
     Federal University of Sao Carlos. 103 pages, 2009.
[3] Uribe, S. et al. Mobile TV Targeted Advertisement and Content
     Personalization. In 16th International Workshop Conference on Systems,
     Signals and Image Processing, Chalkida, Greece, 18-19/06/2009.
[4] Solla, A. G. et al. ZapTV: Personalized User-Generated Content for
     Handheld Devices in DVB-H Mobile Networks. In Proceedings 6th
     European Interactive TV Conference, p. 193-203, Salzburg, Austria,
     03-04/07/2008.
[5] Bär, A. et al. A Lightweight Mobile TV Recommender: Towards a
     One-Click-to-Watch Experience. In Proceedings 6th European Interactive
     TV Conference, p. 142-147, Salzburg, Austria, 03-04/07/2008.
[6] Einarsson, O. P. Content Personalization for Mobile TV Combining
     Content-Based and Collaborative Filtering. Master Thesis. Center for
     Information and Communication Technologies. Technical University of
     Denmark. August 22, 2007.
[7] Chorianopoulos, K. Personalized and mobile digital TV applications. In
     Proceedings of the Multimedia Tools and Applications, p. 1-10, vol. 36,
     27 January 2007.
[8] Choi, J. Y.; Koh, D.; Lee, J. Ex-ante simulation of mobile TV market
     based on consumers' preference data. In Proceedings of the
     Technological Forecasting & Social Change, p. 1043-1053, 2007.
[9] Yu, Z. et al. TV program recommendation for multiple viewers based on
     user profile merging. In Proceedings of the User Model User-Adap Inter,
     p. 63-82, 2006.
[10] Das, D. and ter Horst, H. Recommender Systems for TV. In Proceedings
     of 15th AAAI Conference, Madison, Wisconsin, July 1998.
[11] Ginga. Available at: <http://www.ginga.org.br/>. Accessed January 6,
     2010.
[12] Ginga-NCL. Available at: <http://www.ginga.org.br/>. Accessed
     January 6, 2010. http://www.gingancl.org.br/
[13] Ginga-J. Available at: <http://www.ginga.org.br/>. Accessed January 6,
     2010. http://dev.openginga.org/
[14] Pazzani, M. J. A framework for Collaborative, Content-Based and
     Demographic Filtering. Artificial Intelligence Review, p. 393-408,
     December 1999.
[15] Gatto, Elaine C., Zorzo, Sergio D. Recommender System for Digital TV
     Portable Interactive Brasileira. In 8th International Information and
     Telecommunication Technologies Symposium, December 09-11, 2009.
     Florianopolis, Santa Catarina, Brazil.
[16] IBOPE. Available at: <www.ibope.com.br>
[17] ABNT NBR 15603-2. Digital terrestrial television – Multiplexing and
     service information (SI) Part 2: Data structure and definition of basic
     information of SI.
[18] Witten, I. H.; Frank, E. Data Mining: Practical Machine Learning Tools
     and Techniques, 2nd Edition, Morgan Kaufmann, 525 pages, June 2005.
[19] Cristo, M. Sistemas de Recomendação, Métodos e Avaliação. 81 slides.
     2009.

Pipeline desdobramento escalonamento
 
Cheat sheet Mips 32 bits
Cheat sheet Mips 32 bitsCheat sheet Mips 32 bits
Cheat sheet Mips 32 bits
 
Resumo das Instruções de Desvio Incondicionais MIPS 32 bits
Resumo das Instruções de Desvio Incondicionais MIPS 32 bitsResumo das Instruções de Desvio Incondicionais MIPS 32 bits
Resumo das Instruções de Desvio Incondicionais MIPS 32 bits
 
Como descobrir e classificar coisas usando machine learning sem compilcação
Como descobrir e classificar coisas usando machine learning sem compilcaçãoComo descobrir e classificar coisas usando machine learning sem compilcação
Como descobrir e classificar coisas usando machine learning sem compilcação
 

Dernier

Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Dernier (20)

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 

Using content-based filtering in a system of recommendation in the context of digital mobile interactive tv

  • 1. Using Content-Based Filtering in a System of Recommendation in the Context of Digital Mobile Interactive TV
Elaine Cecília Gatto, Sergio Donizetti Zorzo
IEEE Conference Publishing
Computer Science Department, Federal University of São Carlos (UFSCar), Highway Washington Luís, Km 235, PO Box 676, 13565-9, São Carlos, São Paulo, Brazil. {elaine_gatto, zorzo}@dc.ufscar.br

Abstract-Recommendation systems provide suggestions based on information about users' preferences. Recommender systems rely on information filtering to process information and make suggestions to users, and content-based filtering is a widely used approach to information filtering. Content-based filtering analyzes the correlation between the content of items and the user profile, suggesting relevant items and discarding irrelevant ones. Widely used on the Internet, recommendation systems are now being studied for Digital TV, and several studies already exist in this direction. As on the Internet, recommendation systems can be used in Digital TV to recommend TV programs, advertising and publicity, and electronic commerce. Thus, the items in the context of Digital TV may be programs, advertisements and products for sale. When content-based filtering is applied to program recommendation, for example, it should correlate the content of the programs with the user's preferences, which in this scenario are the types of programs the user has preferred to watch. This paper presents studies of content-based filtering applied to Digital TV data. The studies seek to observe and evaluate how some content-based filtering techniques can be used in recommendation systems in the context of Digital TV.

I. INTRODUCTION
The implementation of Digital TV in Brazil opens new markets to be explored.
Technologies that have succeeded in the Web environment, for example, can be applied in the Digital TV domain and achieve the same success. User interaction through the remote control, the cell phone keypad and similar means will allow many applications to be brought to this environment. One area that has been extensively studied and has succeeded on the Web is personalization. Several works address recommendation systems for Digital TV, for example [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], among others. Recommendation systems can contribute to a better use of Digital TV in residences, by groups or individuals, or on a cell phone, for example. These systems can help the user choose a program, avoiding wasted time and suggesting programs that really interest the user. Moreover, recommendation systems can be applied to publicity and advertising on Digital TV, as well as to T-Commerce. This work is organized as follows: Section 1 introduces the paper; Section 2 compares digital TV in homes and on portable devices; Section 3 presents related work; Section 4 discusses content-based filtering; Section 5 presents our recommendation system; Section 6 covers the characteristics of the households, the EPG, the user history and the methodology used for the tests; Section 7 discusses the results; and Section 8 presents the conclusion.

II. COMPARING IDTV IN RESIDENCES AND IDTV FOR CELL PHONES
The use of IDTV on cell phones is expected to boom once cell phones with IDTV become available to the population, because these devices increasingly outnumber television sets in Brazil. Some differences between IDTV for residences and for cell phones can be noted. The IDTV standard adopted in Brazil calls fixed devices such as set-top boxes full-seg, and devices such as cell phones, mini TVs and PDAs one-seg.
In residences, IDTV is used by all residents, while on a cell phone it is normally used by a single user, the owner of the device. Another difference is display size. In residences, IDTV sets have screens larger than 30", which allows more flexible development, presentation and display of content. Cell phone screens, however, are smaller than 10", which requires more development effort to display content without cluttering the screen and confusing the user. A distinctive characteristic of this environment is that IDTV on cell phones can be watched anywhere and at any time. On the other hand, viewing periods in residences can be longer than on cell phones, which are used while waiting or in transit. IDTV on cell phones can use the existing 2G/3G network architecture, and 4G in the future, as a return channel, making interactivity possible in this environment before it occurs in IDTV for residences. The middleware adopted in Brazil, called Ginga, is a national technology. Both the declarative (Ginga-NCL) and imperative (Ginga-J) portions of the middleware are required for full-seg devices.
  • 2. For one-seg devices, only the declarative Ginga-NCL portion is required. A reference implementation of the middleware exists for full-seg devices. For one-seg devices, a reference implementation is not yet available, but groups such as PUC-Rio and UFES are working on one (Symbian and Android) [11, 12, 13]. Users of these devices need special attention because of the current constraints of this environment, such as processing power, storage capacity and battery life.

III. RELATED WORK
Many works involve recommendation systems for IDTV on set-top boxes and, more recently, on portable devices. This section presents recent works on recommendation systems for IDTV. The system in [5] belongs to the content-based filtering category and uses text mining. It offers a simple user interface and accepts natural-language text input as well as four values reflecting the user's preferences for comedy, action, horror and erotic content. First, the system extracts texts; it then searches for emotions in the text and calculates the distances among themes. Finally, an index is calculated for each entry, and a list of programs ordered by this index is returned. In [3], the main aim of the system is to replace common content with personalized, adapted content that is more attractive to the user. The system therefore supports TV reception through either broadcast or multimedia streaming. It uses explicit collection (on first use, the user must state their preferences) and implicit collection (the user's actions on the device are monitored, stored and sent to the server). The personalized content, chosen based on the preferences, is sent by the server to the user's portable device, where it is stored before being displayed.
ZapTV [4], developed for the DVB-H standard, allows users to create their own content and offers value-added services such as multimodal access (Web and cell phones), a return channel, video notes, and personalized sharing and distribution of content. Besides the technology provided by DVB-H, ZapTV incorporates other technologies such as TV-Anytime and technologies emerging from Web 2.0 and the Semantic Web. The main functionalities of ZapTV include a social network, personalized content broadcasting (implicit or explicit recommendation), scheduling of thematic channels (by age group, genre or specific theme), a client application and transmission of the electronic program guide. ZapTV seeks to improve recommendation through an intelligent personalization mechanism that combines information filtering with semantic reasoning. It is based on the Web 2.0 principles of participation and sharing among users, so that creating, sharing, classifying and annotating content makes content easier to find. Our recommendation system runs on the portable device itself. It does not require adding servers to the Brazilian IDTV architecture for cell phones in order to provide recommendations, so no remote communication is needed. The user therefore does not pay for network data traffic to receive recommendations or send data, and the privacy of the user's data is protected.

IV. CONTENT-BASED FILTERING
Content-Based Filtering (CBF) uses content attributes to describe the items and then calculates similarity. This approach does not depend on other users' evaluations of the items. CBF is an information retrieval technique that bases its predictions on the premise that a user's previous preferences are reliable indicators of future behavior. To formulate recommendations, a variety of algorithms have been proposed to evaluate the content of documents and find regularities.
Some of these algorithms treat the task as a classification problem and others as a regression problem. Problems and limitations found in systems using CBF include overspecialization, the new-user problem and limited content analysis [7, 6, 14].

V. RECOMMENDER SYSTEM
Our recommendation system aims to simplify the IDTV user's routine through a simple interface that provides preferred content without the user spending much time searching for it. The process starts when the user turns on the TV on the cell phone. The collected user history data is submitted to content-based information filtering in order to find the user's profile. The data resulting from this process is formatted, and the user profile is stored in a database with the date and time of its generation. With the updated user profile, it is possible to search the EPG for compatible TV programs that will be broadcast around the current time and to produce a list of these programs. This list is also stored in a database with the date and time of its generation. The recommendations are presented to the user, and those the user requests are stored with the user history. While IDTV on the cell phone is turned on, all programs viewed by the user are stored in the user-history database. This process is repeated every time the user turns on the TV. The Ginga-NCL middleware has a layer for resident applications, responsible for presentation; a layer for the common core, responsible for offering several services; and a layer for the protocol stack. The recommendation system is treated as an element of the Ginga-NCL architecture, within the Ginga Common Core, because it needs to keep data local and to use the Tuner library (to obtain information about the tuned channel), the ESInformation library (to obtain information from the EIT table that generates the EPG) and the Context Manager (to obtain system information).
As the Ginga-NCL middleware is mandatory in Brazil for portable devices, the recommendation system was planned, designed and modeled according to the Brazilian standards for portable devices, thus meeting the needs of these devices.
  • 3. More details on our system can be found in [15].

VI. TESTS
For the tests we used TV viewing data and program schedules. These data were provided by IBOPE, a privately held Brazilian multinational and a leading market research firm in Latin America. For 67 years, IBOPE has provided a wide range of information and studies on media, public opinion, voting intention, consumption, brands and market behavior [16]. The following subsections detail the characteristics of these data and the tests.

A. Characteristics of Residences
The data provided by IBOPE contain the EPG (TV programming), the users' viewing history (what each viewer watched) and socioeconomic information. All these data were separated and stored in a MySQL database. The data correspond to 15 days of programming and monitoring of six Brazilian households with open (free-to-air) TV. The households were monitored minute by minute, and each individual was also monitored separately.

TABLE I
NUMBER OF INDIVIDUALS BY RESIDENCE
Residence   Individuals   TVs
1           2             1
2           3             1
3           3             2
4           2             2
5           2             1
6           3             2

B. Characteristics of the Data
The data used for these tests underwent manual adjustment. Each of the algorithms used required pre-processing for correct use and analysis. Subsections C, D, E and F detail the composition of these data.

C. EPG
The EPG (Electronic Program Guide) is composed of 15 TXT files called programming files, one for each day (05/03/2008 to 19/03/2008), with the schedule grid of 10 open TV stations, starting at 00:00:00 and ending at 05:59:00. Once the files that make up the EPG were understood, the data were copied from the programming files into a BrOffice spreadsheet, and unnecessary data were cleaned out.
We noticed some inconsistencies in the schedules, which were corrected immediately so that later analyses would not produce erroneous results. This was repeated for each of the 15 programming files, producing a single spreadsheet containing the full 15 days of EPG. The day of the week and the duration of each program were added to these data. At this stage the EPG was still incomplete, lacking the genre and subgenre of each program. The genre of each program was looked up on the broadcasters' official websites and then assigned according to the Brazilian standard ABNT NBR 15603-3:2007, Annex C, "genre descriptor in the content descriptor" [17]. A new table was created, identical to the EPG table but with one added field per genre name, for use with the cosine technique. These fields were populated with 0 or 1 depending on whether or not the program fits that genre, forming a matrix.

D. User History
Users' viewing history data are needed to discover their preferences. In the Digital TV context considered here, these data are collected and stored implicitly. The tuning spreadsheets sent by IBOPE, which contain the user data, were modified so that content-based filtering techniques could be applied. The data were then separated by household. Although some homes have more than one TV, no household had records of more than one TV being monitored at the same time, so each household is treated as having only one TV. The data were also formatted: dates as yyyy-mm-dd, times as hh:mm:ss and TVs as 00X. The resulting sheets were converted to CSV files and loaded into MySQL. The day of the week, time of day and viewing duration were also added to the user data.

E. Methodology
We simulated content-based filtering using two different techniques, Apriori and cosine, both with genre as the target attribute. For Apriori, we applied the settings shown in Figure 1.
We then started the simulation for Apriori. The process was the same for each household: the first day's data were used to generate recommendations for the second day; the first and second days' data together were used to generate recommendations for the third day; and so on. First, we opened the CSV file corresponding to household X and day 1. Next, filters were applied to convert some String and Numeric attributes to Nominal. Then Apriori was executed and its output saved. With the saved data, it was possible to check whether, on the following day, someone in the household watched any of the genres found by Apriori. For the cosine, we first found and saved the profile, then calculated the cosine distance, the cosine itself and the norm, and finally determined the correct answers. The process is iterative and is performed for each day and each household.

F. Apriori and Cosine
Association algorithms identify associations between data records that are related in some way. The basic premise is to find items whose presence implies the presence of others in the same transaction, in order to determine which things are related.
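To make this premise concrete, the following is a minimal sketch of the two measures usually attached to an association rule X ==> Y: support (the fraction of all transactions containing both X and Y) and confidence (the fraction of transactions containing X that also contain Y). It is not the Apriori algorithm itself and not the authors' implementation; the function names and the four viewing records are hypothetical and invented purely for illustration.

```python
from typing import FrozenSet, List


def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


# Hypothetical viewing records (period of day, genre); not data from the paper.
transactions = [
    frozenset({"night", "movie"}),
    frozenset({"night", "movie"}),
    frozenset({"night", "news"}),
    frozenset({"morning", "news"}),
]

# Candidate rule: night ==> movie
rule_support = support(transactions, frozenset({"night", "movie"}))
rule_confidence = confidence(transactions, frozenset({"night"}), frozenset({"movie"}))
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

In this toy data, two of the four records contain both items, and two of the three "night" records are movies, so the rule has support 0.5 and confidence of about 0.67. Apriori searches for all rules whose support and confidence exceed chosen thresholds rather than scoring a single hand-picked rule.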
  • 4. Association rules interconnect objects in an attempt to expose patterns and trends. The discovery of associations should reveal both trivial and non-trivial associations. The Apriori algorithm is often used to mine association rules. Apriori can work with a large number of attributes, generating various combinations among them and performing successive passes over the database while maintaining optimum performance in terms of processing time. The algorithm tries to find all relevant association rules between items, which have the form X (antecedent) ==> Y (consequent). If x% of the transactions that contain X also contain Y, then x% is the confidence factor of the rule. The support factor is the percentage of cases in which X and Y occur together over the total number of records (frequency) [18]. The cosine is a similarity measure, a metric that can be applied to determine whether or not an item correlates with the user profile. Consider two binary vectors, x and y, in an n-dimensional space, where n is the number of items in each vector. The cosine between the vectors can then be calculated and used as the similarity between the user profile and the user history. The higher the cosine value, the higher the similarity: the closer to 1, the greater the similarity [19].

Figure 1. Parameters used in Apriori.

VII. RESULTS
For every household, Excel spreadsheets were created to compute the success percentage of each technique. Based on the number of recommendations generated, the basic formula for calculating the percentage of correct answers was used: (1)
Moreover, we also calculated the average percentage of correct answers over the number of recommendations generated, using the following formula: (2)
Figures 2 and 3 show the average hit percentages calculated for each household for the cosine and Apriori techniques. Households 2, 4 and 6 had the best results for the cosine, and households 4 and 6 for Apriori. As can be seen in Figure 4, which compares the two techniques overall, the cosine outperformed Apriori.

Figure 2. Parameters used in Cosine.
Figure 3. Parameters used in Apriori.
Figure 4. Comparison between means of Apriori and Cosine.
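The cosine technique and the hit percentage discussed above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the genre list, the profile, the EPG rows and all function names are hypothetical, and because formulas (1) and (2) are not reproduced in this transcript, the hit percentage here is an assumed definition (recommended genres that were watched the next day, divided by the number of recommended genres, times 100).

```python
import math
from typing import Dict, List, Sequence, Set


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def hit_percentage(recommended: List[str], watched: Set[str]) -> float:
    """Assumed metric: share of recommended genres that were actually watched, in percent."""
    if not recommended:
        return 0.0
    hits = sum(1 for genre in recommended if genre in watched)
    return 100.0 * hits / len(recommended)


# Hypothetical genre columns of the 0/1 EPG matrix.
GENRES = ["news", "movie", "sports", "series"]

# Hypothetical user profile: 1 means the genre appears in the viewing history.
profile = [1, 1, 0, 0]

# Hypothetical EPG rows, one binary genre vector per program.
epg: Dict[str, List[int]] = {
    "Evening News": [1, 0, 0, 0],
    "Film Night": [0, 1, 0, 0],
    "Football": [0, 0, 1, 0],
}

# Rank programs by similarity to the profile, most similar first.
ranked = sorted(epg, key=lambda name: cosine(profile, epg[name]), reverse=True)

# Evaluate recommended genres against what was watched the following day.
score = hit_percentage(["news", "movie"], {"movie", "sports"})
print(ranked, score)
```

A design note: with binary genre vectors the cosine depends only on how many genres the program and the profile share, so programs with a single matching genre tie, as the first two programs do here. A real system would need a tie-breaking rule such as broadcast time.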
  • 5. VIII. CONCLUSION
During the tests, we observed some peculiarities. Our system recommends content based on program genres, and our analyses were made according to that parameter. For the Apriori algorithm, the data are already collected in a format suitable for use. For the cosine, the EPG must be converted to a matrix before the discovery of profiles and recommendations can begin. Apriori can mine only the user's viewing history, discovering the user's profile from the rules; another technique must be used to select the programs to recommend. The cosine can do both. However, Apriori can discover more features in the user history, for example, "the user is in front of the TV more often at night, enjoys watching movies and watches TV more frequently on Mondays." The cosine cannot find these features, but it can reach our goal. To find patterns similar to those revealed by association rules, more complex database queries are needed. The Apriori output must be processed to generate the correct user profile, i.e., the rules must be interpreted, which is somewhat cumbersome to implement. The cosine output is more readable and leads directly to the intended goal, so it can be used without post-processing. As for input, Apriori needs no treatment, since the data are used as collected. For the cosine, however, whenever the EPG is updated, the table containing the EPG matrix must be modified to match the new EPG, which is somewhat laborious. We then simulated, with both techniques, the delivery and acceptance of recommendations, calculating the percentage of correct answers and generating charts. The genre profiles found by the two algorithms are similar. Although both techniques cover the needs of the system, the cosine is the one that can be put to better use.
On a desktop, as in our tests, the cosine calculation returns faster than the Apriori association rules. However, further study of the processing cost of these algorithms on a cell phone with IDTV is not yet possible in Brazil. The time the whole recommendation process takes to complete varies with the personalization technique used. In our tests and simulations, the cosine finished the process before Apriori. The studies show that, although both algorithms meet our needs, the cosine is the technique that can be better exploited in the recommendation system for portable IDTV.

ACKNOWLEDGMENT
We thank IBOPE for providing real data about the electronic program guide and viewer behavior from March 5, 2008 to March 19, 2008.

REFERENCES
[1] Avila, P. M. TV Recommender: Application Development Support of Recommendation for the Brazilian System of Digital TV. Dissertation. Graduate in Computer Science. Department of Computer Science. Federal University of Sao Carlos. 90 pages, 2010.
[2] Lucas, A. Customization for Digital TV using the strategy of Recommender System for multiuser environments. Dissertation. Graduate in Computer Science. Department of Computer Science. Federal University of Sao Carlos. 103 pages, 2009.
[3] Uribe, S. et al. Mobile TV Targeted Advertisement and Content Personalization. In 16th International Workshop Conference on Systems, Signals and Image Processing, Chalkida, Greece, 18-19/06/2009.
[4] Solla, A. G. et al. ZapTV: Personalized User-Generated Content for Handheld Devices in DVB-H Mobile Networks. In Proceedings 6th European Interactive TV Conference, p. 193-203, Salzburg, Austria, 03-04/07/2008.
[5] Bär, A. et al. A Lightweight Mobile TV Recommender: Towards a One-Click-to-Watch Experience. In Proceedings 6th European Interactive TV Conference, p. 142-147, Salzburg, Austria, 03-04/07/2008.
[6] Einarsson, O. P. Content Personalization for Mobile TV Combining Content-Based and Collaborative Filtering. Master Thesis. Center for Information and Communication Technologies. Technical University of Denmark. August 22, 2007.
[7] Chorianopoulos, K. Personalized and mobile digital TV applications. In Proceedings of the Multimedia Tools and Applications, p. 1-10, vol. 36, 27 January 2007.
[8] Choi, J. Y.; Koh, D.; Lee, J. Ex-ante simulation of mobile TV market based on consumers' preference data. In Proceedings of the Technological Forecasting & Social Change, p. 1043-1053, 2007.
[9] Yu, Z. et al. TV program recommendation for multiple viewers based on user profile merging. In Proceedings of the User Model User-Adap Inter, p. 63-82, 2006.
[10] Das, D. and ter Horst, H. Recommender Systems for TV. In Proceedings of 15th AAAI Conference, Madison, Wisconsin, July 1998.
[11] Ginga. Available at: <http://www.ginga.org.br/>, accessed January 6, 2010.
[12] Ginga-NCL. Available at: <http://www.ginga.org.br/> and <http://www.gingancl.org.br/>, accessed January 6, 2010.
[13] Ginga-J. Available at: <http://www.ginga.org.br/> and <http://dev.openginga.org/>, accessed January 6, 2010.
[14] Pazzani, M. J. A framework for Collaborative, Content-Based and Demographic Filtering. Artificial Intelligence Review, p. 393-408, December 1999.
[15] Gatto, Elaine C., Zorzo, Sergio D. Recommender System for Digital TV Portable Interactive Brasileira. In 8th International Information and Telecommunication Technologies Symposium, December 09-11, 2009. Florianopolis, Santa Catarina, Brazil.
[16] IBOPE. Available at: <www.ibope.com.br>
[17] ABNT NBR 15603-2. Digital terrestrial television: Multiplexing and service information (SI), Part 2: Data structure and definition of basic information of SI.
[18] Witten, I. H.; Frank, E. Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Morgan Kaufmann, 525 pages, June 2005.
[19] Cristo, M. Sistemas de Recomendação, Métodos e Avaliação. 81 slides. 2009.