The document discusses segmenting unsegmented Sanskrit text tokens into words using a personalized context-aware random walk (PCRW) approach. The PCRW treats segmentation as a query expansion problem, starting with tokens and iteratively selecting candidate words from a set of possibilities based on different language models and morphological rules, until a complete sentence is formed. It incorporates various types of linguistic information through different edge weights in the random walk.
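The underlying "split an unsegmented token into known words" problem can be made concrete with a much simpler baseline than the PCRW: dictionary-based segmentation via dynamic programming. The vocabulary below is a toy assumption; the document's approach additionally uses language models, morphological rules, and random-walk edge weights, none of which appear in this sketch.

```python
# Minimal dictionary-based segmentation by dynamic programming.
# This is a baseline sketch only, not the PCRW method itself.

def segment(token, vocab):
    """Return one segmentation of `token` into words from `vocab`, or None."""
    n = len(token)
    best = [None] * (n + 1)   # best[i] = list of words covering token[:i]
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and token[j:i] in vocab:
                best[i] = best[j] + [token[j:i]]
                break
    return best[n]

vocab = {"rama", "vana", "gacchati"}  # toy vocabulary; real sandhi is harder
print(segment("ramavanagacchati", vocab))  # ['rama', 'vana', 'gacchati']
```

Real Sanskrit segmentation is harder than this because sandhi changes characters at word boundaries, which is exactly why the document resorts to richer linguistic signals.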
The document discusses named entity recognition (NER) in Sanskrit. It notes that conventional methods used for English work poorly for Sanskrit because annotated data is scarce and the linguistic style differs: supervised approaches require corpora with annotated ground truth, and the natural feature signals exploited in English, such as capitalization, have no Sanskrit counterpart. The document describes applying hidden Markov models and conditional random fields to a dataset drawn from the Bhagavatam, with generated ground truth and features covering parts of speech, position, lemmas, suffixes, and context.
This document describes an app called ShutApp that predicts a user's mood from phone activity, using data collected from the phone without explicit user feedback. In summary:
1. ShutApp collects phone usage data such as app history, call logs, and keyboard patterns without explicit user input, and analyzes emails to label training data.
2. It uses this collected data, together with sentiment analysis of emails and texts, to build a real-time mood prediction model.
3. The app aims to predict a user's mood in real time without requiring explicit feedback.
1) The document discusses automating the Taddhita section of Sanskrit grammar as described by Pāṇini.
2) It aims to simulate the affixation process, handle issues like affix polysemy and synonymy, and determine methods for rule selection and conflict resolution.
3) Analyzing the effects on derived forms and preserved linguistic features can provide supplementary information for lexical databases and serve as a pedagogical tool.
This document describes FeRoSA, a faceted recommendation system for scientific articles. FeRoSA uses a collection of 20,000 computational linguistics articles and constructs a citation network with four facets - alternative approaches, background, comparison, and method. It performs random walks on induced subgraphs for each facet to generate personalized recommendations. Experimental results found FeRoSA outperformed other systems in providing high specificity recommendations, particularly for papers with low citations.
- The document discusses program synthesis through solving optimization problems to find the shortest program that fits the given observations and constraints.
- It proposes using probabilistic context-free grammars to define the search space of possible programs and casting the problem as finding a satisfying assignment for a set of constraints over the program variables.
- An iterative algorithm is described that finds program solutions, adds a minimum length constraint, and repeats to find shorter programs that still satisfy the constraints.
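The iterate-and-shrink loop above can be illustrated with a toy enumerator: generate candidate programs (here, arithmetic expressions over a made-up grammar of `x`, small constants, `+`, and `*`) in order of size, keep the first that fits the observations, and because sizes are tried in increasing order, the first hit is already the shortest. This is only an illustration; the document's method uses probabilistic context-free grammars and constraint solving rather than brute-force enumeration.

```python
# Toy shortest-program search over an assumed expression grammar.
import itertools

def programs_of_size(size):
    """Yield (description, function) pairs for expressions of a given size.
    Size 1: x or a constant; size k: (e1 op e2) with subsizes summing to k-1."""
    if size == 1:
        yield "x", lambda x: x
        for c in range(0, 4):
            yield str(c), (lambda c: lambda x: c)(c)
    else:
        for ls in range(1, size - 1):
            for (d1, f1), (d2, f2) in itertools.product(
                    programs_of_size(ls), programs_of_size(size - 1 - ls)):
                yield f"({d1}+{d2})", (lambda f1, f2: lambda x: f1(x) + f2(x))(f1, f2)
                yield f"({d1}*{d2})", (lambda f1, f2: lambda x: f1(x) * f2(x))(f1, f2)

def shortest_program(examples, max_size=7):
    """Smallest-size expression consistent with (input, output) pairs."""
    for size in range(1, max_size + 1):          # increasing size ==
        for desc, fn in programs_of_size(size):  # first hit is minimal
            if all(fn(x) == y for x, y in examples):
                return desc
    return None

print(shortest_program([(1, 3), (2, 5)]))  # some smallest expression for 2x+1
```

The constraint-solving formulation in the document replaces this exhaustive search with a solver query plus a "length strictly smaller than the last solution" constraint, repeated until unsatisfiable.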
This document describes an assignment to analyze response times for questions on Q&A websites like StackOverflow. Specifically, it involves analyzing questions from a gaming StackExchange dataset and calculating features related to tags for each question, including tag popularity, number of popular tags, number of active subscribers to tags, and percentage of subscribers who are active. Response times will then be plotted against these tag features to analyze correlations. The deliverables are to calculate the features for each question programmatically and generate box plots and CDF plots of response times for each feature.
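The CDF plots the assignment asks for can be computed from raw response times without any plotting library: sort the values and pair each with its cumulative fraction. The sample response times below are made up for illustration.

```python
# Empirical CDF of response times: sorted values vs. cumulative fractions.

def ecdf(values):
    """Return (sorted_values, cumulative_fractions) for an empirical CDF."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys

response_times_min = [3, 12, 7, 45, 7, 120, 9]   # toy data, minutes
xs, ys = ecdf(response_times_min)
print(list(zip(xs, ys)))  # ys rises to 1.0 at the largest response time
```

The (xs, ys) pairs feed directly into any plotting tool; grouping questions by a tag feature first and computing one ECDF per group gives the per-feature comparison the deliverables describe.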
Asterix and the Magic Potion - Suffix tree problem (Amrith Krishna)
Asterix needs to solve a problem to find the secret ingredients for the Magic Potion. Getafix gave each villager a string and said that all substrings appearing at least twice would reveal the ingredients. To solve this, Asterix must build a suffix tree and use it to find all repeated substrings in O(N^3) time before the writing disappears. The document then explains the node structure of a suffix tree and provides examples and functions to output the intermediate and final suffix trees and the repeated substrings.
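A suffix-tree implementation of this task is easiest to validate against a brute-force baseline: collect every substring and keep those seen at least twice. This runs in roughly O(N^3) time and space and is only suitable for small test strings, which is exactly why the assignment builds a suffix tree instead.

```python
# Brute-force reference for "all substrings occurring at least twice".
from collections import Counter

def repeated_substrings(s):
    """All distinct substrings of s that occur at least twice (overlaps count)."""
    counts = Counter(s[i:j] for i in range(len(s))
                            for j in range(i + 1, len(s) + 1))
    return sorted(sub for sub, c in counts.items() if c >= 2)

print(repeated_substrings("banana"))  # ['a', 'an', 'ana', 'n', 'na']
```

In a suffix tree the same answer falls out structurally: every internal node (a point where at least two suffixes branch) spells a repeated substring, along with every prefix of its path label.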
This document describes an assignment to simulate a roller coaster ride using multiple Linux processes. The processes include Tourist processes that board and ride the roller coaster, and processes that control the loading and unloading of tourists. Shared memory and semaphores must be used to coordinate access to data structures like queues and tables among the processes. The simulation tracks tourist wait times and emotions before and after riding.
The document describes an assignment to create a file watcher program in C with 4 parts. The program will watch 3 directories for any changes, notify the changes through a log file, copy files to destinations defined in a file map, remove duplicate files while keeping newer copies, and combine the functionality into a shell with file watching capabilities. The program is designed to automatically organize files downloaded or otherwise placed into the watched directories.
R - Eigenvector centrality with product reviews (Amrith Krishna)
This document describes using eigenvector centrality to rank authors and documents for a search engine on Amazon food reviews. It involves building a graph of authors based on shared products reviewed and calculating transition probabilities between authors. Eigenvector centrality is then calculated iteratively on this graph to rank authors. For a query, the top 25 documents by cosine similarity are ranked by multiplying their similarity score by the author's eigenvector centrality score to produce the final ranked lists. Correlations between the different ranking methods are calculated using Spearman's rank correlation coefficient.
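The iterative centrality computation described above is essentially power iteration on the author graph. The sketch below uses a tiny made-up adjacency structure; in the assignment, the real graph and transition probabilities come from shared reviewed products.

```python
# Power iteration for eigenvector centrality on a toy undirected graph.

def eigenvector_centrality(neighbors, iters=100):
    """Iterate score[v] <- sum of neighbor scores, renormalizing each round."""
    nodes = list(neighbors)
    score = {v: 1.0 for v in nodes}
    for _ in range(iters):
        new = {v: sum(score[u] for u in neighbors[v]) for v in nodes}
        norm = sum(x * x for x in new.values()) ** 0.5
        score = {v: x / norm for v, x in new.items()}
    return score

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
cent = eigenvector_centrality(graph)
print(max(cent, key=cent.get))  # the best-connected author, here "c"
```

The final ranking step in the document then multiplies each document's cosine similarity by its author's centrality score, so a well-connected author boosts otherwise similar documents.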
This document describes an assignment to implement skip lists to solve queries on student data. Students are represented by unique IDs encoding their degree type, year, department, and enrollment number. Skip lists are built on a master list of all students and on subject-specific lists. Queries include finding B.Tech 4th-year students in departments 2-3, students enrolled in Algorithms but not IR from departments 3-5, and counts of students in both or either of Algorithms and IR, along with DOT files tracing each query's solution. The program must be submitted with documentation, and in the output for each query the traced path should be shown in red.
This assignment involves implementing a dictionary using a skip list data structure that supports insertion, deletion, searching, and displaying the entire list. The task has two parts: in part 1 the input is given in a file and the output is a GraphViz DOT file representing the skip list; in part 2 the input is given in two files, and the task is to search the skip list for given keys under different level probabilities p and plot search time versus p using gnuplot. The deliverables include the C/C++ program with a makefile and a Doxygen configuration file.
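A minimal skip list makes the assignment's operations concrete. The sketch below (in Python rather than the C/C++ the assignment asks for) chooses node levels by coin flips with probability p, the same parameter part 2 varies; the fixed RNG seed is only there so the sketch is reproducible.

```python
# Minimal skip list with insert and search; deletion and display are omitted.
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    MAX_LEVEL = 8

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.head = Node(None, self.MAX_LEVEL)  # sentinel, holds no key
        self.level = 1

    def _random_level(self):
        lvl = 1
        while self.rng.random() < self.p and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        update = [self.head] * self.MAX_LEVEL   # rightmost node < key, per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl):                     # splice in at each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):  # drop down level by level
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

sl = SkipList(p=0.5, seed=1)
for k in [5, 1, 9, 3, 7]:
    sl.insert(k)
print(sl.search(3), sl.search(4))  # True False
```

Emitting the DOT file the assignment wants is then a walk over each level's forward chain, printing one edge per pointer; search time versus p falls out of repeating the search loop with different p values.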
The document describes a problem involving coordinating the delivery of food items ("Maach-Dal-Bhaat") to hotel guests by multiple processes. Three food items must be delivered together for a guest's meal. A chef process produces two items at a time, while a waiter process ensures all the items of a guest's full meal are delivered using three service rooms. The processes use shared memory and semaphores to synchronize delivery and avoid deadlocks between the chef, waiter, and service room processes.
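The core chef/waiter coordination can be sketched with a counting semaphore: the chef signals once per produced item, and the waiter waits until all three items of a meal are available before delivering. This is a deliberately simplified two-thread version; the assignment's shared memory, per-room semaphores, and multi-process setup are omitted.

```python
# Counting-semaphore sketch of the chef/waiter handoff.
import threading

items_ready = threading.Semaphore(0)   # signalled once per cooked item
meals_done = []

def chef(total_items):
    for _ in range(total_items):
        items_ready.release()          # one item placed in the service area

def waiter(meals):
    for m in range(meals):
        for _ in range(3):             # a meal needs all three items
            items_ready.acquire()      # block until an item is available
        meals_done.append(m)           # deliver the complete meal

c = threading.Thread(target=chef, args=(6,))
w = threading.Thread(target=waiter, args=(2,))
w.start(); c.start()
c.join(); w.join()
print(meals_done)  # [0, 1]
```

The deadlock risk in the full problem comes from the chef producing only two items at a time while a meal needs three; the waiter's role, and the reason for the extra service-room semaphores, is to buffer items so no process waits forever holding a resource.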
This document contains code snippets and mathematical expressions related to programming and algorithms. It includes:
1) Code for a recursive function to sort a list in ascending order by splitting it into subsets less than and greater than or equal to an element.
2) Formal grammar definitions for syntax rules.
3) Code to print a string of characters based on their ASCII values.
4) An expression to filter a range by checking for prime numbers.
5) Pseudocode for an in-place Cooley–Tukey FFT algorithm that computes FFTs of size N using recursive decomposition.
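The recursive sort in item 1 is the classic quicksort-by-partition pattern; a direct rendering (pivot choice and list-copying style are assumptions, since the original snippet is only summarized here):

```python
# Quicksort by partitioning around the first element.

def qsort(xs):
    """Sort by splitting into elements below and at-or-above the pivot."""
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger_or_equal = [x for x in rest if x >= pivot]
    return qsort(smaller) + [pivot] + qsort(larger_or_equal)

print(qsort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Putting duplicates of the pivot in the "greater than or equal" subset, as the summary describes, keeps the recursion finite even when all elements are equal.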
Alternative au Tramway de la ville de Quebec Rev 1 sml.pdf (Daniel Bedard)
CDPQ Infra unveils a $15 billion, 15-year mobility plan for the Quebec City region. Wouldn't a more economical and faster alternative be possible?
- Leverage CN's existing railway infrastructure by creating a Réseau Express Métropolitain (REM) instead of a new tramway, or a combination of the two.
- Optimize the use of the rails for combined freight and passenger transport, giving priority to passenger travel during peak hours.
- Integrate a cross-river cable car as a third urban link dedicated to pedestrians and cyclists, with a connection to the REM.
- Rethink the third road link as a road tunnel that continues onto the new Île d'Orléans bridge, with some reconfiguration of its lanes.
https://www.linkedin.com/in/bedarddaniel/