Research into Practice: Building and implementing learning analytics at Tribal

  1. Research into Practice: Building and implementing learning analytics at Tribal. Chris Ballard, Data Scientist
  2. Building and implementing learning analytics: 1. Start at the very beginning 2. How research and practice differ 3. Building a learning analytics platform 4. Implementing learning analytics 5. Summary
  3. About Tribal: a leading provider of technology-enabled management solutions for the international education, learning and training markets. Higher Education: research universities, employment-focused universities, government agencies. Vocational Learning: further education colleges, training providers and employers, government agencies. Schools / K-12: schools, school groups, state and district government agencies.
  4. R&D project overview. Objectives: • Predict student academic performance to optimise success • Predict students at risk of non-continuation • Build on research into the link between VLE activity and academic success • Scale data processing • Understand risk factors and compare to cohorts. 3 years of matched student and activity data used to build predictive models. Staff can use student, engagement and academic data to understand how they affect student outcomes. Information accessible in one place on easy-to-understand dashboards. Integrated with Tribal SITS:Vision and the staff e:vision portal. Consultation with academic staff on presentation and design. Accuracy of module academic performance predictions: 79%* (*using module academic history and demographic factors)
  5. Student Insight
  6. Current projects: providing learning analytics for 160,000 students across a state-wide vocational and further education provider in Australia; Student Insight being implemented as part of the JISC UK Effective Learning Analytics programme
  7. What do we mean by practice?
  8. From research to practice. Research: • Domain knowledge • Interpretation • In-depth understanding • Testing an approach. Practice: • Integrated into everyday life • Interpret easily • Take action • Implementing an approach
  9. Domain and people: • What is the problem? • Identify the users and stakeholders • Data owners • Are research results sufficient? • Design • Project cost
  10. Technical: • Research limitations • Architecture • Data munging • Automating manual processes • Data suitability • Robustness of technical platform
  11. Building a learning analytics platform. Key design decisions: 1. Transparency – knowing why a student is at risk 2. Flexibility – viewing learning analytics which relates to an institution's curriculum and organisation 3. Efficiency – ease of use, implementation and interpretation
  12. Information relevant for different users
  13. Aggregating warning indicators
  14. Aggregating warning indicators
  15. Transparency – individual risk
  16. Ensemble decision combination (diagram): enrolment, academic performance and engagement datasets – demographics, module history, assessments, formative assessments, historic module results, VLE event data, library event data, attendance – each produce a weighted prediction (%), and ensemble learning combines them into a single risk prediction for the student
  17. Transparency – risk factors
  18. Providing flexibility to an institution
  19. Reflecting differences between courses. Student activity data is not consistent across all courses/modules: 1. Standardise data so it is comparable across all courses and modules 2. Build different models for each course or module. Need to be careful that you have sufficient data for the model to generalise to new data.
  20. Provide opportunity for intervention (workflow): learning analytics – identify student at risk, log intervention details, assess intervention effectiveness; student support teams – allocate intervention to support team, assign SLA, SLA-based alerts, monitor intervention progress
  21. Embed into business process. Need to consider how learning analytics becomes embedded into the day-to-day working life of academic and support staff. • Notifications – analytics becomes proactive; support different types of notification • Integration – accessible from existing tools and services through single sign-on
  22. Implementing learning analytics. CRISP-DM – Cross Industry Standard Process for Data Mining. https://the-modeling-agency.com/crisp-dm.pdf
  23. CRISP-DM process
  24. Data understanding: understanding which features are important. Example: end month of unit for successful and failed units
  25. Data preparation: creating comparative features. Example: total proportion of hours worked on failed units
  26. Modelling and evaluation: understanding whether the model is under- or over-fitting. Example: learning curve for Random Forest model
  27. Evaluation: • Define business-focused success criteria • Define model-focused success criteria • Define what baseline performance is acceptable • Consider a model cost-benefit analysis that takes into account intervention cost. Cost-benefit matrix: predicting Withdrawn for an actually Withdrawn student is a benefit; predicting Withdrawn for an Enrolled student is a cost; predicting Enrolled for a Withdrawn student is a cost; predicting Enrolled for an Enrolled student is a benefit
  28. Summary. Design: • Embed learning analytics into business process • Ensure that analytics can be interpreted easily by staff • Intervention processes that are clearly articulated • Measure intervention effectiveness. Implementation: • Use a standard project approach such as CRISP-DM • Evaluate data in the context of the business problem and process • Define what success means, including acceptable accuracy and how it needs to be measured
  29. Thank you. Chris Ballard, @chrisaballard, chris.ballard@tribalgroup.com, www.tribalgroup.com

Editor's notes

  • Tribal provides management solutions to the international education, training and learning markets.

    Our tools allow education providers such as universities, colleges and local government to manage student admin and learning processes such as recruitment, admissions, finance, timetabling and course portfolios.

    Student management systems – for example 70% of UK universities use our Higher Education Student Management system called SITS:Vision.
    Involved in large scale education technology implementations – e.g. we have recently implemented one of our student management systems across all schools and campuses in New South Wales, Australia.

  • Student Insight has been developed through a close working partnership between the University of Wolverhampton and Tribal. The original objectives of this partnership were to identify how we could build on initial research carried out by the university and build a solution that could enable Wolverhampton to benefit from improved use of student data. The outcome of this partnership is the Student Insight product, which has been developed as a configurable solution that can now be adopted by other institutions. Consultation with academic staff at the university has enabled us to design the system in such a way that it is flexible and can be tailored to meet the unique needs of each institution.
  • Example of violin practice from my childhood…
  • What do we mean by practice?

    Applying something that has been proven by research to be effective
    Moving from a research technique to application of that technique
    Research is focused on testing a technique or proving a theory; practice is focused on applying the results of that research so that we can benefit from it.

    To put it into perspective, we can compare research to a spreadsheet and practice to mobile apps. The latter "just works" and we can interpret what it is saying easily, and integrate it within our lives. The former requires in depth understanding, evaluation, interpretation and domain knowledge. We can't just stop and use it at a bus stop. But to go from one to the other takes effort and time to translate it from one area to the other.

  • Knowing what problem we are using the results of the research to solve. This has to be a real problem for which we have real data (or could collect data). Something that people are willing to use (and in the case of a product, pay for!)
    Identify the users and what their problems are - we might have a model that can identify students at risk, but how do staff want to see this presented? Do they want to see individuals, or are they more interested in overall patterns to help future planning?
    Who are the stakeholders? Different from the users. Data owners. Senior leadership team - input into the university's strategy. Important that their needs are represented.
    Asking ourselves whether the research results are sufficient to be applied in the real world, whether further refinement is necessary, or whether what we have done needs to be put on the shelf. Very often, we will identify further improvements, but these may not necessarily stop us from moving to application. These may come as gradual refinements to something that is already being used.
    Design decisions - how does our technique need to change to be used in the real world? User interface; interpreting the results of the analysis; understanding of the objective. We might want to pilot different approaches to see what works (A/B testing of the kind commonly employed by Facebook and others).

  • Technical decisions - often something written for research is not suitable for use in the real world, e.g. scalability, limitations accepted in order to keep the research focus narrow, hard coding, handling flexibility.
    Data munging - often data used in research will have been collected from multiple sources and manipulated to resolve data quality issues, sampled etc. We will need to identify how to automate the manual approaches which were sufficient during research. Integration may present a real practical problem - often data owners want to hold onto their data!
    Data monitoring - a model is only as good as the data you use, therefore we need to be sure that data is loaded into the system correctly and there are no data quality issues.
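    As a minimal sketch of the kind of automated data quality check this implies (the file name, expected columns and thresholds are illustrative assumptions, not the product's actual checks):

      import pandas as pd

      # Minimal sketch: basic checks run after each data load, before models use it.
      students = pd.read_csv("student_extract.csv")

      expected_columns = {"student_id", "course_code", "enrolment_status"}
      missing_columns = expected_columns - set(students.columns)
      if missing_columns:
          raise ValueError(f"Extract is missing columns: {missing_columns}")

      # Flag loads that are empty or have unusually high rates of missing values.
      null_rates = students[list(expected_columns)].isna().mean()
      if students.empty or (null_rates > 0.05).any():
          print("Data quality warning:", null_rates.to_dict())
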
  • How did we move from research to a platform that could be used by different institutions? Key design decisions were:
     
    Transparency - ensuring you know why a student is at risk
    Flexibility - allowing an institution to see analytics in a way that relates to their curriculum and organisation
    Efficiency - allowing the product to be implemented quickly, reducing implementation cost

    Here are some examples of key decisions that we made during the development of the product.

  • Worked with the University of Wolverhampton to identify the main users. Identified three main classes - Course Director, Module Director, Personal Tutor. Wanted to be able to monitor the students they are responsible for quickly and easily, from an aggregated perspective and down to individuals. Realised that we needed to do this generically as other institutions may have different requirements - different user roles and different institutional structures. Ultimately this changed how we approached the technical design of the system but also what features the system provides. We therefore designed an institution structure that could represent these different structures allowing the platform to adapt to different situations. Security can then be applied to different student groups within that structure. Discuss prediction aggregation.
  • Seeing an aggregated view of student risk as well as individuals - staff said they wanted to see how much groups of students may be at risk, not just individuals. Predictions are aggregated across the institution structure automatically to provide this information.
  • We didn't want our models to be a "black box" where no-one could understand what they do, or why they identify a student at risk. An important consideration when intervening with a student is understanding what the data is saying and why the system has flagged a student. The initial research suggested a way forward and we evaluated a number of ways to solve this problem as part of the early "second phase" of R&D of the product. There are two examples of this:

    Ensemble learning - multiple predictions, single overall decision. Influence chart allowing comparisons across different data sources. Helps at an individual student level.
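    As a minimal sketch of how a weighted ensemble decision of this kind can be combined, assuming one probability-of-risk score per data source (the source names, weights and threshold below are illustrative assumptions, not the production model):

      # Combine per-data-source risk probabilities into one overall decision,
      # keeping the individual scores visible for transparency.
      source_scores = {
          "demographics": 0.35,      # P(at risk) from the demographics model
          "module_history": 0.62,    # P(at risk) from historic module results
          "vle_activity": 0.71,      # P(at risk) from VLE event data
          "attendance": 0.55,        # P(at risk) from attendance data
      }
      source_weights = {
          "demographics": 0.15,
          "module_history": 0.35,
          "vle_activity": 0.30,
          "attendance": 0.20,
      }

      overall_risk = sum(source_scores[s] * source_weights[s] for s in source_scores)
      at_risk = overall_risk >= 0.5    # illustrative decision threshold
      print(f"Overall risk: {overall_risk:.2f}, flagged: {at_risk}")
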
  • Group influence chart - ability to see what is going on from a more strategic level. What factors influence student outcomes? Helps at a more strategic level when designing intervention measures.
  • Flexible data - institutions have different characteristics and students with different needs and backgrounds. Therefore it is important that the data you bring into the system reflects the needs of both the institution and your students. Your data requirements may change over time - new data may become available and you may want to test its efficacy in student early warning prediction. During research, data is necessarily hard coded and fixed, and, as a result, any models built from that data are more static and relate to the structure of the data that has been used. We built a flexible modelling approach that allows you to bring any data into the system, and map it and view what that data looks like. You can then test models to verify the usefulness of the data to student early warning prediction.

    Configuration by an institution - an institution can configure the application once it has been set up.

  • Institutions are complex because rather than one overall consistent business process, often different faculties, departments, courses or even modules have different approaches to the delivery of their curriculum. The best example of this is in the use of the VLE for the delivery of course materials - some modules may not make use of the VLE at all, or may have a different approach to how the content is delivered. This will change the VLE usage patterns in the data and this needs to be taken into account when using student activity data sources such as those from the VLE. This represents one of the key areas that needs to be considered when implementing learning analytics - designing an approach that takes into account the different needs of students taking different courses and how that is reflected in the data. A conventional approach is to normalise the raw data to reflect this - ensuring that the predictive features can be compared across different courses or modules. An alternative approach is to build separate models for different courses. However, an impact of this may be that your training data for an individual course/module becomes too limited to generalise well to new data. Both approaches need to be compared in your context to identify which one works best for you. We are planning on building a tool in Student Insight that allows you to perform this comparison and automatically segment models by course using the institution structure hierarchy.
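    As a minimal sketch of the first option (standardising activity so that it is comparable across modules), assuming a pandas DataFrame with hypothetical columns module_code and vle_events_per_week:

      import pandas as pd

      # Z-score VLE activity within each module, so "high" or "low" engagement is
      # relative to how that particular module actually uses the VLE.
      activity = pd.DataFrame({
          "student_id": [1, 2, 3, 4, 5, 6],
          "module_code": ["BIO101", "BIO101", "BIO101", "HIS200", "HIS200", "HIS200"],
          "vle_events_per_week": [40, 25, 10, 4, 2, 0],
      })

      grouped = activity.groupby("module_code")["vle_events_per_week"]
      activity["vle_events_z"] = (
          activity["vle_events_per_week"] - grouped.transform("mean")
      ) / grouped.transform("std")
      print(activity)

    The alternative is to train a separate model per course or module instead, but only where each segment has enough students for the model to generalise.
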
  • Intervention - unless the research is focused on evaluating the impact of interventions taken, one of the key areas which separates practice from research is in the action which is taken as a result of an early warning of risk. Staff should be able to determine whether an intervention is necessary based on the analytics and information they can see about the student and their context. They need a process to determine what types of intervention exist, record the intervention and then track the progress the student is making following the measures that have been taken. A key starting point is having clear institutional guidance about the intervention measures that are available, and in what circumstances they should be applied. With Student Insight, we have built in integration with our student support product that allows an intervention to be manually applied and assigned to the correct student support team for action and communication with the student. Institutions differ in their approach to how this is handled and therefore it is important that the workflows can be configured to reflect that.
  • Notifications - you need to consider how learning analytics can become embedded into the day to day working life of staff. There may be barriers to widespread adoption if staff need to go somewhere to look up information. Although predictive analytics is by nature proactive, if staff need to go and look for early warning indicators, then it will start to be used retrospectively. Notifications are one method to overcome this, where staff receive a communication via email highlighting students who are at risk. They can then take action and arrange meetings directly from this communication, or choose to view more information about the student. In addition, notifications may need to raise the profile of students at risk who have not been dealt with, perhaps highlighting those who have not been reviewed to ensure that they do not slip through the net.
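    As a minimal sketch of a daily notification digest along these lines (the field names, risk threshold and message format are illustrative assumptions; the real notification and single sign-on integration sit in the product itself):

      # Build a daily digest of at-risk students who have not yet been reviewed,
      # so that they do not slip through the net.
      predictions = [
          {"student": "A. Nguyen", "risk": 0.82, "reviewed": False},
          {"student": "B. Okafor", "risk": 0.76, "reviewed": True},
          {"student": "C. Silva", "risk": 0.41, "reviewed": False},
      ]

      RISK_THRESHOLD = 0.7  # illustrative cut-off
      unreviewed = [p for p in predictions
                    if p["risk"] >= RISK_THRESHOLD and not p["reviewed"]]

      digest_lines = [f"- {p['student']} (risk {p['risk']:.0%}), not yet reviewed"
                      for p in unreviewed]
      email_body = "Students flagged at risk today:\n" + "\n".join(digest_lines)
      print(email_body)  # in practice this would be emailed to the relevant tutor
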
  • When implementing learning analytics, institutions need to follow a project approach that ensures that key decisions have been taken and the project has been fully evaluated at different steps. This will increase the likelihood of project success and the embedding of learning analytics into the day to day activities of the institution.
     
    When implementing Student Insight with an institution, we use an established data mining project methodology called "CRISP-DM" - Cross Industry Standard Process for Data Mining. CRISP-DM was first conceived in 1996 and a review in 2009 called CRISP-DM the "de facto standard for developing data mining and knowledge discovery projects."
     
    The process model provides an overview of the lifecycle of a data mining project. Here we use the terminology "data mining", but it applies equally well to any analytics project. It breaks the analytics process down into six discrete phases. It is important to note that the order in which the phases are approached is not fixed, as the result of each phase will dictate which phase needs to be performed next.
     
    The diagram illustrates analytics as a cyclical process. Lessons learned during deployment and use of a model provide inputs to further iterations of the process and provide inputs to more in depth understanding of the business problem to be solved.
  • Business understanding
     
    Understand what you are trying to accomplish, what is the problem that we are trying to use analytics to solve? This is a key step because although there are commonalities between how institutions are currently looking to deploy analytics, they are sufficiently different to have different slants on the same problem. For example, reducing student attrition may be a key issue, but there may be differences between the causes, effects and students for whom it is a particular issue. For example, although our objective may be to reduce attrition, it may be better to focus on student success for some cohorts of students, rather than specifically identifying students at risk of dropping out.
     
    If we don't do this step we might identify the wrong objectives, or possibly spend a lot of time analysing data to get the right answers to the wrong problems in the first place.
     
    In addition to identifying the objectives for the project, you need to agree what will be considered success, and therefore establish some success criteria. Such criteria may cover different areas, and relate to tangible improvements against which we want to target the use of analytics. For example in the case of retention, we may want to increase retention by a specified amount. If we decide on criteria such as this, we will need to carefully plan whether we wish to try and attribute any improvement to the implementation of analytics, taking into account other factors.
     
    JISC Discovery Phase accomplishes most of the tasks required in the Business Understanding phase of a project.
  • Data understanding
     
    Having good quality data available is at the heart of successful implementation of learning analytics. Although there are a large variety of algorithms, having good data with well thought out predictive attributes is the most important step in learning analytics implementation. Indeed, it has been said that 80% of the total time spent on an analytics project is not spent building models, but working with the data. So this is an important phase. The data understanding phase starts with the identification of potential data sources, collection of that data, assessment of data quality and initial analysis to help with further understanding. It is important that this is carried out in cooperation with data owners across the institution, as well as those who understand the link between a business process and how that process is reflected in the data itself. So, during a Tribal project, we identify the relevant people across the institution who need to be involved in this process.
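    As a minimal sketch of the kind of exploratory check behind the slide's example (end month of unit for successful and failed units), assuming a pandas DataFrame with illustrative columns:

      import pandas as pd

      # Compare the distribution of a candidate feature between successful and
      # failed units before it is fed into any model.
      units = pd.DataFrame({
          "end_month": [6, 6, 12, 12, 12, 6, 11, 12],
          "outcome": ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "fail"],
      })

      # Cross-tabulating end month against outcome shows whether the feature
      # separates the two groups at all.
      print(pd.crosstab(units["end_month"], units["outcome"], normalize="columns"))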
     
  • Data preparation
     
    Once the data understanding phase has been completed, the raw data collected during that phase will need to be processed and prepared ready for modelling. This may involve resolving data quality issues, creating aggregated summaries or merging multiple data sources together. Both business and data understanding feed into this stage - it may involve transforming the data to reflect the business process we are modelling. A common issue that you will encounter when working with student data is dealing with time dependencies, where data updates made at the time an event occurs affect data that acts as inputs to our model. For example, current modules a student is studying may be automatically recorded as a fail once the student has withdrawn. If we're not careful then failure can become a very good predictor of likelihood of dropout. It may be, but we need to control for these quirks in the way that the data is recorded, otherwise the model that we build will not reflect the real world.
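    As a minimal sketch of controlling for that time dependency, assuming each module result row carries the date on which it was recorded (the column names are illustrative assumptions):

      import pandas as pd

      # Only use module results as they were known at the prediction date, so that
      # fails recorded automatically after a withdrawal cannot leak the outcome
      # into the model's features.
      results = pd.DataFrame({
          "student_id": [1, 1, 2],
          "module_result": ["pass", "fail", "pass"],
          "result_recorded_on": pd.to_datetime(["2016-01-20", "2016-06-30", "2016-01-22"]),
      })

      prediction_date = pd.Timestamp("2016-03-01")
      features_as_of_date = results[results["result_recorded_on"] <= prediction_date]
      print(features_as_of_date)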
  • Modelling
     
    Once data has been prepared, we're ready to build a model against the data. This will involve selecting appropriate algorithms and comparing the performance of models built using each algorithm. Parameters for each algorithm will need to be chosen so as to optimise the performance of a particular model. One of the most important aspects of this process is identifying whether our model is underfitting or overfitting the data. There can be a number of reasons for this, but one of the most important factors is that more complex models will tend to overfit the data and the converse is true in the case of underfitting. Complexity can come in different forms, for example models with a large number of predictive features in comparison to the number of training examples will tend to be more complex, and thus overfit. In each case, it will mean that our model does not generalise well to instances it has not seen before, and may not give us optimal performance.
     
    When implementing a model in Student Insight at Tribal, one of the main techniques we use is learning and fitting curves, which can be used to diagnose whether a model is under- or over-fitting.
     
    [Illustrate our modelling process]
     
    We have also built functionality into Student Insight to allow a model to be optimised automatically whilst it is being trained.
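    As a minimal sketch of the learning-curve diagnostic described above, using scikit-learn's learning_curve with a Random Forest on synthetic placeholder data (the real models are trained on institutional data inside Student Insight):

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import learning_curve

      # A growing gap between training and validation scores suggests overfitting;
      # two low, converging scores suggest underfitting.
      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

      train_sizes, train_scores, val_scores = learning_curve(
          RandomForestClassifier(n_estimators=100, random_state=0),
          X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5), scoring="roc_auc",
      )

      for size, tr, va in zip(train_sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
          print(f"train size {size:4d}: train AUC {tr:.3f}, validation AUC {va:.3f}")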


  • Evaluation
     
    In the case of predictive analytics, another type of success criterion may relate to the accuracy of the models which have been built. In this situation, it is not sufficient to arbitrarily choose a baseline accuracy figure; we need to decide what will be a sufficient baseline accuracy for our model. The accuracy of a predictive model is measured in different ways, according to our goal and the nature of the data we have available. We may decide that having a model which makes as few false positive predictions as possible is our goal, at the expense of the overall number of positive predictions made. Conversely, we may decide that we wish to make a large number of positive predictions in order to capture as many students at risk as possible. Choosing the appropriate balance between these can be tricky and needs to be based around an assessment of business objectives and intervention cost. Often, a cost benefit analysis can be helpful.
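    As a minimal sketch of the cost-benefit calculation shown on the evaluation slide, with assumed illustrative values for intervention cost, retention benefit and intervention success rate (none of these figures come from the talk):

      from sklearn.metrics import confusion_matrix

      # Weigh a model's confusion matrix by assumed costs and benefits.
      y_true = ["withdrawn", "enrolled", "withdrawn", "enrolled", "enrolled", "withdrawn"]
      y_pred = ["withdrawn", "enrolled", "enrolled", "withdrawn", "enrolled", "withdrawn"]

      cm = confusion_matrix(y_true, y_pred, labels=["withdrawn", "enrolled"])
      tp, fn, fp, tn = cm[0, 0], cm[0, 1], cm[1, 0], cm[1, 1]

      INTERVENTION_COST = 200    # assumed cost of intervening with one flagged student
      RETENTION_BENEFIT = 9000   # assumed fee income retained if an at-risk student stays
      SUCCESS_RATE = 0.3         # assumed share of interventions that succeed

      # Interventions cost money for every flagged student, but pay off when a
      # genuinely at-risk student is retained; missed students bring no benefit.
      net_value = (tp * (SUCCESS_RATE * RETENTION_BENEFIT - INTERVENTION_COST)
                   - fp * INTERVENTION_COST)
      print(f"TP={tp} FN={fn} FP={fp} TN={tn}; net value of deploying the model: {net_value}")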
