This presentation covers (1) the rich content developed at Chegg, (2) a knowledge graph that organizes that content in a hierarchical fashion, and (3) student interactions across multiple products that enhance the user signal in individual products.
I am going to talk about personalizing the learning experience at Chegg using recommendation systems.
Here is an outline of the presentation.
Chegg is a centralized learning platform where a student comes to learn the concepts required for academic performance, job interviews, or other activities. The goal of any recommendation system here is to present content that is high quality and relevant, i.e., we show students what they want to study. As an example, suppose the student has a data analyst job interview coming up. We know this from past user interactions, so we show the student content related to learning "SQL".
This is an example of the student experience at Chegg. A student logs in and finds suggestions in Mechanical Engineering and Chemistry.
As you can see, this model suggests textbook solutions for users based on their past behavior, and it is accompanied by the message "based on your progress".
Another example is a concept-based recommendation module in Study, which is placed below an expert answer that the student is viewing.
I wanted to use this slide to give you a look into our content. As you can see, most of our content consists of academic materials.
Now I will segue into how this content is organized. We have built a knowledge graph which represents a hierarchy of subjects, courses, and concepts. The nodes in this graph are provided by subject matter experts, and we constantly iterate on the graph as we get suggestions for more nodes and edges. The machine learning component comes in when we create edges between concept nodes and content. How does this look?
Here is an example of how we connect content from different products to the nodes of the knowledge graph.
When users interact with the content, we are able to connect users to a node of the knowledge graph. Since user interactions constantly change with time, the edges between users and KG nodes are constantly updated.
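To make this concrete, here is a minimal sketch of how such a graph could be represented; it is not Chegg's actual implementation, and the node names, edge relations, and scores are illustrative assumptions.

```python
# Minimal sketch of a hierarchical knowledge graph: subject -> course -> concept
# nodes curated by experts, ML-derived edges from concepts to content, and
# interaction edges from users to concepts. All names below are hypothetical.
import networkx as nx

kg = nx.DiGraph()

# Hierarchy provided by subject matter experts.
kg.add_node("Computer Science", type="subject")
kg.add_node("Databases", type="course")
kg.add_node("SQL joins", type="concept")
kg.add_edge("Computer Science", "Databases", relation="has_course")
kg.add_edge("Databases", "SQL joins", relation="has_concept")

# Edge created by the content classifier: concept -> content item, with a score.
kg.add_node("qa:12345", type="content", product="Study")
kg.add_edge("SQL joins", "qa:12345", relation="covers", score=0.92)

# Edge updated from recent user interactions: user -> concept, with a weight.
kg.add_node("user:987", type="user")
kg.add_edge("user:987", "SQL joins", relation="studied", weight=0.7)

# Example query: candidate content for a user, via the concepts they touched.
for _, concept in kg.out_edges("user:987"):
    for _, item in kg.out_edges(concept):
        if kg.nodes[item]["type"] == "content":
            print(item, kg.edges[concept, item]["score"])
```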
Let's now do a deep dive into content classification, since that is the backbone of all the recommendations here.
Convolution and pooling layers are good at picking up signatures at the n-gram level, i.e., they are able to pick up when certain phrases are indicative of membership in certain classes.
The two layers ensure that the correlations between n-grams are picked up at two different scales.
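A rough PyTorch sketch of this kind of text CNN is below. The layer sizes, kernel widths, and vocabulary size are illustrative assumptions, not the exact architecture used at Chegg; the point is that the second convolution operates on pooled features, so it sees a wider span of the text than the first.

```python
# Text CNN encoder: two convolution + pooling stages capture n-gram
# signatures at two different scales. Dimensions are assumptions.
import torch
import torch.nn as nn

class TextCNNEncoder(nn.Module):
    def __init__(self, vocab_size=50_000, embed_dim=128, hidden_dim=256, out_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # First conv layer: reacts to short n-grams (kernel_size=3 ~ trigrams).
        self.conv1 = nn.Conv1d(embed_dim, hidden_dim, kernel_size=3, padding=1)
        self.pool1 = nn.MaxPool1d(kernel_size=2)
        # Second conv layer: runs on pooled features, so it picks up
        # correlations between n-grams over a wider window.
        self.conv2 = nn.Conv1d(hidden_dim, out_dim, kernel_size=3, padding=1)

    def forward(self, token_ids):                    # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)    # (batch, embed_dim, seq_len)
        x = self.pool1(torch.relu(self.conv1(x)))    # (batch, hidden_dim, seq_len/2)
        x = torch.relu(self.conv2(x))                # (batch, out_dim, seq_len/2)
        return x.max(dim=2).values                   # global max pool -> (batch, out_dim)
```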
We define two different tasks for optimization. One of them is to match the front of a flashcard with the back of the card; we use the CNN model defined on the previous slide, with the dot product as the similarity function and a cross-entropy loss. For the classification problem, we feed the CNN output into a softmax layer to predict the courses. Both tasks are optimized simultaneously.
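The sketch below shows one plausible way to wire up this joint training, reusing the encoder above: an in-batch dot-product matching loss for front/back pairs, plus a softmax course classifier, summed into a single objective. Names, dimensions, and the equal weighting of the two losses are assumptions for illustration.

```python
# Multi-task training sketch: (a) match each card front to its back via
# dot-product similarity with cross-entropy over in-batch candidates, and
# (b) classify the content into courses with a softmax head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskModel(nn.Module):
    def __init__(self, encoder, num_courses, out_dim=256):
        super().__init__()
        self.encoder = encoder                       # e.g. TextCNNEncoder above
        self.course_head = nn.Linear(out_dim, num_courses)

    def forward(self, front_ids, back_ids, course_labels):
        front = self.encoder(front_ids)              # (batch, out_dim)
        back = self.encoder(back_ids)                # (batch, out_dim)

        # Task 1: dot products against all backs in the batch act as logits;
        # the diagonal entries are the correct front/back pairs.
        sim = front @ back.t()                       # (batch, batch)
        targets = torch.arange(sim.size(0), device=sim.device)
        match_loss = F.cross_entropy(sim, targets)

        # Task 2: predict the course from the front-of-card encoding.
        course_logits = self.course_head(front)
        course_loss = F.cross_entropy(course_logits, course_labels)

        # Both tasks optimized simultaneously.
        return match_loss + course_loss
```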