Why one-size-fits all does not work in Explainable Artificial Intelligence!

17 Mar 2023

Editor's notes

  1. Title: Interactive and adaptive explanation interfaces for recommender systems Abstract: Recommender systems, such as Amazon's, offer users recommendations: suggestions of items to try or buy. These systems rely on ranking algorithms, which increase the prominence of some information; other information (e.g., low-rank or low-confidence recommendations) is consequently filtered out and not shown to people. However, it is often not clear to a user whether the recommender system's advice is suitable to be followed, e.g., whether it is correct, whether the right information was taken into account, or whether crucial information is missing. This keynote will describe how explanations can help bridge that gap, what constitutes a helpful explanation, and the considerations and external factors we must weigh when determining how helpful an explanation is. I will introduce the notion of interactive explanation interfaces for recommender systems, which support exploration rather than explaining (potentially poor) recommendations post hoc. These interfaces support users in exploring the effects of the user model and algorithmic parameters on the resulting recommendations. I will also address the importance of user modeling in the design of explanation interfaces. Both the characteristics of the people (e.g., expertise, cognitive capacity) and the situation (e.g., which other people are affected, or which variables are influential) place different requirements on which explanations are useful, also with respect to different presentational choices (e.g., modality, degree of interactivity, level of detail). Simply put, ``One size does not fit all''. 45 minutes including Q&A
  2. Emphasis keywords
  3. https://www.um.org/umap2021/12-posters/25-the-unreasonable-effectiveness-of-graphical-user-interfaces-for-recommender-systems
  4. Ways of presenting: a list with scores, top-N, a single item, a set of items. Variety of domains: some more expected than others.
  5. People had a hard time forming an opinion about ”baseline” explanations
  6. Add references from RecSys
  7. Effects of personal characteristics on perceptions of diversity and item acceptance (Yucheng Jin et al., UMAP’18). Perceptions of diversity differed between two visualizations for people with high musical sophistication. They also differed for people with high visual working memory. Individual differences matter!
  8. Novel algorithms for diverse selection. Enrichment to move beyond diversification on `relevance': topical (query) relevance, ranking by click-through or popularity, previous consumption (e.g., LFP). Which diversity metrics can we develop on the basis of debatability metadata? Which diversification strategies can we model? What are the accuracy and coverage of different diversification models? Can we further improve models of diversification by considering individual users?
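One way to make the question about diversity metrics concrete is intra-list diversity: the average pairwise dissimilarity of the items in a recommendation list. A minimal sketch in Python, assuming items are represented as sets of tags and using a Jaccard-based distance (both the representation and the distance are illustrative choices, not from the talk):

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Dissimilarity between two items represented as sets of tags/genres."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0

def intra_list_diversity(items):
    """Average pairwise dissimilarity over a recommendation list."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)

# Hypothetical playlist: three tracks described by genre tags.
playlist = [{"rock", "indie"}, {"rock", "pop"}, {"jazz"}]
print(round(intra_list_diversity(playlist), 3))  # → 0.889
```

Per-user variants (the note's last question) could reweight the pairwise distances by each user's sensitivity to particular tags.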
  9. Touch back on exploratory aims. Different people need different explanations. There is a benefit in interaction.
  10. Lightweight interventions, but we can have repeat interactions. In order to study enduring effects, we developed an interactive mockup that simulates a search engine. This allows for repeat interactions and for recording behavior as it evolves. We have just completed a first study, and I look forward to seeing how this and other interactive interfaces enable us to study how learning occurs and how meta-cognition develops.
  11. Left to right: 1) the user’s profile items; 2) top-k similar users; 3) top-n recommendations. Adds, deletes, or re-rates produce updated recommendations. The dark blue lines appear when a user clicks a node and show provenance data for recommendations and the nearest neighbors who contributed to them. Clicking a movie recommendation on the right side of the page opens the movie information page on Rottentomatoes.com.
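The flow this note describes (profile items, then top-k similar users, then top-n recommendations, with provenance back to the contributing neighbors) can be sketched as a small user-based k-NN recommender. All names and ratings below are hypothetical; this is not the visualized system's actual code:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    num = sum(u[i] * v[i] for i in set(u) & set(v))
    den = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def recommend(profile, others, k=2, n=3):
    """Top-n unseen items from the k most similar users, with provenance."""
    sims = {name: cosine(profile, ratings) for name, ratings in others.items()}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    scores = {}  # item -> {"score": weighted sum, "from": contributing neighbors}
    for name in neighbors:
        for item, rating in others[name].items():
            if item not in profile:
                entry = scores.setdefault(item, {"score": 0.0, "from": []})
                entry["score"] += sims[name] * rating
                entry["from"].append(name)
    return sorted(scores.items(), key=lambda kv: kv[1]["score"], reverse=True)[:n]

me = {"Heat": 5, "Alien": 4}
others = {"ann": {"Heat": 5, "Blade Runner": 4},
          "bob": {"Alien": 4, "Seven": 5},
          "cat": {"Titanic": 2}}
for item, info in recommend(me, others):
    print(item, info["from"])  # each recommendation with its provenance
```

The `"from"` lists are what the dark blue provenance lines would draw: which neighbors contributed each recommendation.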
  12. We used the Spotify API~\footnote{\url{https://developer.spotify.com/web-api}, accessed June 2018.} to design a music recommender system and to present the user controls for three distinct recommender components. We leveraged the algorithm introduced in Section \ref{sec:general-algorithm} to control the recommendation process as well as our user interface and user controls. \begin{figure*} \centering \includegraphics[width=1\columnwidth]{figures/exp1/vis1} \caption{a): the recommendation source shows available top artists, tracks and genre tags. b): the recommendation processor enables users to adjust the weight of the input data type and individual data items. c): play-list style recommendations. Some UI controls are disabled in specific settings of user control, e.g., the sliders in b) are grayed out in the setting 5: REC*PRO. }~\label{fig:vis1} \end{figure*} We created four scenarios for the user task of selecting music, with each scenario represented by setting a pair of audio feature values between 0.0 and 1.0. We set a value for each scenario based on the explanation of audio feature values in the Spotify API. The scenarios used include: \textit{``Rock night - my life needs passion''} assigning attribute ``energy'' between 0.6 and 1.0; \textit{``Dance party - dance till the world ends''} setting ``danceability'' between 0.6 and 1.0; \textit{``A joyful after all exams''} with ``danceability'' between 0.6 and 1.0; \textit{``Cannot live without hip-hop''} with ``speechiness'' from 0.33 to 0.66. \subsubsection{User interface} The user interface of the recommender features ``drag and drop'' interactions. The interface consists of three parts, as presented in Figure~\ref{fig:vis1}. \begin{itemize} \item[(a)] The \emph{user profile} works as a warehouse of source data, such as top artists, top tracks, and top genres, generated from past listening history. \item[(b)] The \emph{algorithm parameters} shows areas in which source items can be dropped from part (a). 
The dropped data are bound to UI controls such as sliders or sortable lists for weight adjustment. It also contains an additional info view to inspect details of selected data items. \item[(c)] The \emph{recommendations}: the recommended results are shown in a play-list style. \end{itemize} As presented in Figure~\ref{fig:vis1}, we use three distinct colors to represent the recommendation source data as visual cues: brown for artists, green for tracks, and blue for genres. Additional source data for a particular type is loaded by clicking the \textbf{``+''} icon next to the title of the source data type. Likewise, we use the same color schema to code the seeds (a), selected source data and data type slider (b), and recommendations (c). As a result, the visual cues show the relation among the data in three steps of the recommendation process. When users click on a particular data item in the recommendation processor, the corresponding recommended items are highlighted, and an additional info view displays its details. \subsubsection{User controls} %The interactive recommendation framework proposed by ~\cite{he2016interactive} defines three main components of interactive recommenders. %KV omitted and referred to our own framework Based on our framework described in Section \ref{sec:UIframework}, we defined three user control components in our study: (1) user profile (PRO), (2) algorithm parameters (PAR), (3) recommendations (REC) (see Table~\ref{tab:levelofcontrol}). 
\begin{table} \centering \begin{tabular}{p{14em} p{6em} p{15em}} {\small \textit{Components}} & {\small \textit{Control levels}} & {\small \textit{User controls}}\\ \midrule \small{Algorithm parameters (\textbf{PAR})} & \small{High} & \small{Modify the weight of the selected or generated data in the recommender engine} \\ \small{User profile (\textbf{PRO})} & \small{Middle} & \small{Select which user profile will be used in the recommender engine and check additional info of the user profile} \\ \small{Recommendations (\textbf{REC})} & \small{Low} & \small{Remove and sort recommendations} \\ \bottomrule \end{tabular} \caption{The three types of user control employed in our study.}~\label{tab:levelofcontrol} \end{table} ~\paragraph{Control for algorithm parameters (\textbf{PAR}).} This type of control allows users to tweak the influence of different underlying algorithms. To support this level of control, multiple UI components are developed to adjust the weight associated with the type of data items, or the weight associated with an individual data item. Users are able to specify their preferences for each data type by manipulating a slider for each data type. By sorting the list of dropped data items, users can set the weight of each item in this list (Figure~\ref{fig:vis1}b). ~\paragraph{Control for user profile (\textbf{PRO}).} This type of control influences the seed items used for recommendation. A drag and drop interface allows users to intuitively add a new source data item to update the recommendations (Figure~\ref{fig:vis1}a). When a preferred source item is dropped to the recommendation processor, a progress animation will play until the end of the processing. Users are also able to simply remove a dropped data item from the processor by clicking the corresponding \textbf{``x''} icon. 
Moreover, by selecting an individual item, users can inspect its detail: artists are accompanied by their name, an image, popularity, genres, and number of followers, tracks are shown with their name, album cover, and audio clip, and genres are accompanied by a play-list whose name contains the selected genre tag. ~\paragraph{Control for recommendations (\textbf{REC}).} This type of control influences the recommended songs directly. Since the order of items in a list may affect the experience of recommendations~\citep{zhao2017toward}, manipulations on recommendations include reordering tracks in a play-list. It also allows users to remove an unwanted track from a play-list. When doing so, a new recommendation candidate replaces the removed item. The action of removing can be regarded as a kind of implicit feedback to recommendations. Although a rating function has been implemented for each item in a play-list, the rating data is not used to update the user's preference for music recommendations. Therefore, user rating is not considered as a user control for the purposes of this study.
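The PAR controls described above boil down to a weighted blend: a slider weight per data type and a list-position weight per seed item. A minimal sketch of that scoring, with hypothetical seed names and base scores (the real system drives the Spotify API; this function is only an illustration of the weighting logic):

```python
def blend(candidates, type_weights, item_weights=None):
    """Rank candidate tracks by weighted contributions from seed items.

    candidates: {track: {seed: base_score}}, with seeds like "artist:Muse".
    type_weights: slider value per data type ("artist", "track", "genre").
    item_weights: optional per-seed weight, e.g. from a sortable list position.
    """
    item_weights = item_weights or {}
    scores = {
        track: sum(type_weights.get(seed.split(":")[0], 0.0)
                   * item_weights.get(seed, 1.0) * base
                   for seed, base in contribs.items())
        for track, contribs in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

candidates = {"Song A": {"artist:Muse": 0.9, "genre:rock": 0.5},
              "Song B": {"genre:rock": 0.8}}
print(blend(candidates, {"artist": 1.0, "genre": 0.2}))  # artist slider high
print(blend(candidates, {"artist": 0.0, "genre": 2.0}))  # genre slider high
```

Moving a slider changes `type_weights`, and reordering the sortable list changes `item_weights`, which is why the recommendations update immediately after either interaction.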
  13. We employed a between-subjects study to investigate the effects of interactions among different user control on acceptance, perceived diversity, and cognitive load. We consider each of three user control components as a variable. By following the 2x2x2 factorial design we created eight experimental settings (Table~\ref{tab:table1}), which allows us to analyze three main effects, three two-way interactions, and one three-way interaction. We also investigate which specific \textit{personal characteristics} (musical sophistication, visual memory capacity) influence acceptance and perceived diversity. Each experimental setting is evaluated by a group of participants (N=30). Of note, to minimize the effects of UI layout, all settings have the same UI and disable the unsupported UI controls, e.g., graying out sliders. As shown in section ~\ref{evaluation questions}, we employed Knijnenburg et al.'s framework~\citep{knijnenburg2012explaining} to measure the six subjective factors, perceived quality, perceived diversity, perceived accuracy, effectiveness, satisfaction, and choice difficulty~\citep{knijnenburg2012explaining}. In addition, we measured cognitive load by using a classic cognitive load testing questionnaire, the NASA-TLX~\footnote{https://humansystems.arc.nasa.gov/groups/tlx}. It assesses the cognitive load on six aspects: mental demand, physical demand, temporal demand, performance, effort, and frustration. The procedure follows the design outlined in the general methodology (c.f., Section \ref{sec:general-procedure}). The \textit{experimental task} is to compose a play-list for the chosen scenario by interacting with the recommender system. Participants were presented with play-list style recommendations (Figure~\ref{fig:vis1}c). Conditions were altered on a between-subjects basis. Each participant was presented with only one setting of user control. 
For each setting, initial recommendations are generated based on the selected top three artists, top two tracks, and top one genre. According to the controls provided in a particular setting, participants were able to manipulate the recommendation process.
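The 2x2x2 crossing of the three control components can be enumerated directly; a small sketch (the condition labels follow the note's PRO/PAR/REC naming, and the baseline label "NONE" is ours):

```python
from itertools import product

components = ("PRO", "PAR", "REC")  # user profile, algorithm parameters, recommendations

# The full 2x2x2 crossing yields the eight between-subjects settings,
# each enabling a subset of the three control components.
settings = ["*".join(c for c, on in zip(components, flags) if on) or "NONE"
            for flags in product((False, True), repeat=3)]
print(settings)
```

With N=30 participants per setting, this crossing is what licenses the analysis of three main effects, three two-way interactions, and one three-way interaction.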
  14. Main effects: REC has the lowest cognitive load and the highest acceptance. Two-way: All the settings that combine two control components do not lead to significantly higher cognitive load than using only one control component. Combining multiple control components potentially increases acceptance without increasing cognitive load significantly. Visual memory is not a significant factor that affects the cognitive load of controlling recommender systems. In other words, controlling the more advanced recommendation components in this study does not seem to demand a high visual memory. In addition, we did not find an effect of visual memory on acceptance (or perceived accuracy and quality). One possible explanation is that users with higher musical sophistication are able to leverage different control components to explore songs, and this influences their perception of recommendation quality, thereby accepting more songs. Our results show that the settings of user control significantly influence cognitive load and recommendation acceptance. We discuss the results by the main effects and interaction effects in a 2x2x2 factorial design. Moreover, we discuss how visual memory and musical sophistication affect cognitive load, perceived diversity, and recommendation acceptance. \subsubsection{Main effects} We discuss the main effects of the three control components. Increased control level, from control of recommendations (REC) to user profile (PRO) to algorithm parameters (PAR), leads to higher cognitive load (see Figure \ref{fig:margin}c). The increased cognitive load, in turn, leads to lower interaction times. Compared to the control of algorithm parameters (PAR) or user profile (PRO), the control of recommendations (REC) introduces the least cognitive load and supports users in finding songs they like. We observe that most existing music recommender systems only allow users to manipulate the recommendation results, e.g., users provide feedback to a recommender through acceptance. 
However, the control of recommendations is a limited operation that does not allow users to understand or control the deep mechanism of recommendations. \subsubsection{Two-way interaction effects} Adding multiple controls allows us to improve on existing systems w.r.t. control, and does not necessarily result in higher cognitive load. Adding an additional control component to algorithm parameters increases the acceptance of recommended songs significantly. Interestingly, all the settings that combine two control components do \textit{not} lead to significantly higher cognitive load than using only one control component. We even find that users' cognitive load is significantly \textit{lower} for (PRO*PAR) than (PRO, PAR), which shows a benefit of combining user profile and algorithm parameters in user control. Moreover, combining multiple control components potentially increases acceptance without increasing cognitive load significantly. Arguably, it is beneficial to combine multiple control components in terms of acceptance and cognitive load. \subsubsection{Three-way interaction effects} The interaction of PRO*PAR*REC tends to increase acceptance (see Figure \ref{fig:margin}a), and it does not lead to higher cognitive load (see Figure \ref{fig:margin}c). Moreover, it also tends to increase interaction times and accuracy. Therefore, we may consider having three control components in a system. Consequently, we answer the research question. \textbf{RQ1}: \textit{Does the UI setting (user control, visualization, or both) have a significant effect on recommendation acceptance?} It seems that combining PAR with a second control component or combining three control components increases acceptance significantly. \subsubsection{Effects of personal characteristics} Having observed the trends across all users, we survey the difference in cognitive load and item acceptance due to personal characteristics. 
We study two kinds of characteristics: visual working memory and musical sophistication. \paragraph{Visual working memory} The SEM model suggests that visual memory is not a significant factor that affects the cognitive load of controlling recommender systems. The cognitive load for the type of controls used may not be strongly affected by individual differences in visual working memory. In other words, controlling the more advanced recommendation components in this study does not seem to demand a high visual memory. In addition, we did not find an effect of visual memory on acceptance (or perceived accuracy and quality). Finally, the question items for diversity did not converge in our model, so we are not able to make a conclusion about the influence of visual working memory on diversity. \paragraph{Musical sophistication} Our results imply that high musical sophistication allows users to perceive higher recommendation quality, and may thereby be more likely to accept recommended items. However, higher musical sophistication also increases choice difficulty, which may negatively influence acceptance. One possible explanation is that users with higher musical sophistication are able to leverage different control components to explore songs, and this influences their perception of recommendation quality, thereby accepting more songs. Finally, the question items for diversity did not converge in our model, so we are not able to make a conclusion about the influence of musical sophistication on diversity.
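For intuition on what "main effects, two-way, and three-way interactions" mean in this 2x2x2 design, here is a small contrast-coding sketch over hypothetical cell means (it illustrates effect estimation only; the study itself fitted an SEM following Knijnenburg et al.'s framework):

```python
from itertools import product

def factorial_effects(cell_means):
    """Estimate effects in a 2x2x2 design via +/-1 contrast coding.

    cell_means maps (pro, par, rec) tuples of 0/1 flags to that cell's mean
    outcome (e.g. acceptance). Each effect is the mean outcome over its +1
    cells minus the mean over its -1 cells.
    """
    labels = ("PRO", "PAR", "REC")
    effects = {}
    for mask in range(1, 8):  # 3 main, 3 two-way, and 1 three-way effect
        name = "*".join(labels[i] for i in range(3) if mask >> i & 1)
        total = 0.0
        for cell in product((0, 1), repeat=3):
            sign = 1
            for i in range(3):
                if mask >> i & 1 and cell[i] == 0:
                    sign = -sign  # product of +/-1 codes over the effect's factors
            total += sign * cell_means[cell]
        effects[name] = total / 4  # 8 cells: 4 coded +1, 4 coded -1
    return effects

# Hypothetical, purely additive cell means: PRO adds 1, PAR adds 2, REC adds 3.
means = {c: 1 * c[0] + 2 * c[1] + 3 * c[2] for c in product((0, 1), repeat=3)}
effects = factorial_effects(means)
print(effects["PRO"], effects["PAR"], effects["PRO*PAR"])
```

With purely additive cell means all interaction effects come out zero; a non-zero PRO*PAR here is exactly the kind of combination benefit the note reports for (PRO*PAR) versus (PRO, PAR).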
  15. Tag Box – keyword summary. Query Box – Keywords that originated the current ranking state. Document List – Augmented document titles Ranking View – relevance scores Document Viewer – title, year, and snippet of document with augmented keywords Ranking Controls – wrap buttons for ranking settings