The document discusses segmenting unsegmented Sanskrit text tokens into words using a personalized context-aware random walk (PCRW) approach. The PCRW treats segmentation as a query expansion problem, starting with tokens and iteratively selecting candidate words from a set of possibilities based on different language models and morphological rules, until a complete sentence is formed. It incorporates various types of linguistic information through different edge weights in the random walk.
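The underlying "split an unsegmented token into known words" problem can be made concrete with a much simpler baseline than the PCRW: dictionary-based segmentation via dynamic programming. The vocabulary below is a toy assumption; the document's approach additionally uses language models, morphological rules, and random-walk edge weights, none of which appear in this sketch.

```python
# Minimal dictionary-based segmentation by dynamic programming.
# This is a baseline sketch only, not the PCRW method itself.

def segment(token, vocab):
    """Return one segmentation of `token` into words from `vocab`, or None."""
    n = len(token)
    best = [None] * (n + 1)   # best[i] = list of words covering token[:i]
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and token[j:i] in vocab:
                best[i] = best[j] + [token[j:i]]
                break
    return best[n]

vocab = {"rama", "vana", "gacchati"}  # toy vocabulary; real sandhi is harder
print(segment("ramavanagacchati", vocab))  # ['rama', 'vana', 'gacchati']
```

Real Sanskrit segmentation is harder than this because sandhi changes characters at word boundaries, which is exactly why the document resorts to richer linguistic signals.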
The document discusses named entity recognition (NER) in Sanskrit. It notes that conventional methods used for English work poorly for Sanskrit because annotated data is scarce and the linguistic style differs: supervised approaches require corpora with annotated ground truth, and the natural feature signals exploited in English, such as capitalization, have no Sanskrit counterpart. The document describes applying hidden Markov models and conditional random fields to a dataset drawn from the Bhagavatam, with generated ground truth and features covering parts of speech, position, lemmas, suffixes, and context.
This document describes an app called ShutApp that predicts a user's mood from phone activity, using data collected from the phone without explicit user feedback. In summary:
1. ShutApp collects phone usage data such as app history, call logs, and keyboard patterns without explicit user input, and analyzes emails to label training data.
2. It uses this collected data, together with sentiment analysis of emails and texts, to build a real-time mood prediction model.
3. The app aims to predict a user's mood in real time without requiring explicit feedback.
1) The document discusses automating the Taddhita section of Sanskrit grammar as described by Pāṇini.
2) It aims to simulate the affixation process, handle issues like affix polysemy and synonymy, and determine methods for rule selection and conflict resolution.
3) Analyzing the effects on derived forms and preserved linguistic features can provide supplementary information for lexical databases and serve as a pedagogical tool.
This document describes FeRoSA, a faceted recommendation system for scientific articles. FeRoSA uses a collection of 20,000 computational linguistics articles and constructs a citation network with four facets - alternative approaches, background, comparison, and method. It performs random walks on induced subgraphs for each facet to generate personalized recommendations. Experimental results found FeRoSA outperformed other systems in providing high specificity recommendations, particularly for papers with low citations.
- The document discusses program synthesis through solving optimization problems to find the shortest program that fits the given observations and constraints.
- It proposes using probabilistic context-free grammars to define the search space of possible programs and casting the problem as finding a satisfying assignment for a set of constraints over the program variables.
- An iterative algorithm is described that finds program solutions, adds a minimum length constraint, and repeats to find shorter programs that still satisfy the constraints.
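The iterate-and-shrink loop above can be illustrated with a toy enumerator: generate candidate programs (here, arithmetic expressions over a made-up grammar of `x`, small constants, `+`, and `*`) in order of size, keep the first that fits the observations, and because sizes are tried in increasing order, the first hit is already the shortest. This is only an illustration; the document's method uses probabilistic context-free grammars and constraint solving rather than brute-force enumeration.

```python
# Toy shortest-program search over an assumed expression grammar.
import itertools

def programs_of_size(size):
    """Yield (description, function) pairs for expressions of a given size.
    Size 1: x or a constant; size k: (e1 op e2) with subsizes summing to k-1."""
    if size == 1:
        yield "x", lambda x: x
        for c in range(0, 4):
            yield str(c), (lambda c: lambda x: c)(c)
    else:
        for ls in range(1, size - 1):
            for (d1, f1), (d2, f2) in itertools.product(
                    programs_of_size(ls), programs_of_size(size - 1 - ls)):
                yield f"({d1}+{d2})", (lambda f1, f2: lambda x: f1(x) + f2(x))(f1, f2)
                yield f"({d1}*{d2})", (lambda f1, f2: lambda x: f1(x) * f2(x))(f1, f2)

def shortest_program(examples, max_size=7):
    """Smallest-size expression consistent with (input, output) pairs."""
    for size in range(1, max_size + 1):          # increasing size ==
        for desc, fn in programs_of_size(size):  # first hit is minimal
            if all(fn(x) == y for x, y in examples):
                return desc
    return None

print(shortest_program([(1, 3), (2, 5)]))  # some smallest expression for 2x+1
```

The constraint-solving formulation in the document replaces this exhaustive search with a solver query plus a "length strictly smaller than the last solution" constraint, repeated until unsatisfiable.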
This document describes an assignment to analyze response times for questions on Q&A websites like StackOverflow. Specifically, it involves analyzing questions from a gaming StackExchange dataset and calculating features related to tags for each question, including tag popularity, number of popular tags, number of active subscribers to tags, and percentage of subscribers who are active. Response times will then be plotted against these tag features to analyze correlations. The deliverables are to calculate the features for each question programmatically and generate box plots and CDF plots of response times for each feature.
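The CDF plots the assignment asks for can be computed from raw response times without any plotting library: sort the values and pair each with its cumulative fraction. The sample response times below are made up for illustration.

```python
# Empirical CDF of response times: sorted values vs. cumulative fractions.

def ecdf(values):
    """Return (sorted_values, cumulative_fractions) for an empirical CDF."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys

response_times_min = [3, 12, 7, 45, 7, 120, 9]   # toy data, minutes
xs, ys = ecdf(response_times_min)
print(list(zip(xs, ys)))  # ys rises to 1.0 at the largest response time
```

The (xs, ys) pairs feed directly into any plotting tool; grouping questions by a tag feature first and computing one ECDF per group gives the per-feature comparison the deliverables describe.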
Asterix and the Magic Potion - Suffix tree problem (Amrith Krishna)
Asterix needs to solve a problem to find the secret ingredients for the Magic Potion. Getafix gave each villager a string and said that all substrings appearing at least twice would reveal the ingredients. To solve this, Asterix must build a suffix tree and use it to find all repeated substrings in O(N^3) time before the writing disappears. The document then explains the node structure of a suffix tree and provides examples and functions to output the intermediate and final suffix trees and the repeated substrings.
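A suffix-tree implementation of this task is easiest to validate against a brute-force baseline: collect every substring and keep those seen at least twice. This runs in roughly O(N^3) time and space and is only suitable for small test strings, which is exactly why the assignment builds a suffix tree instead.

```python
# Brute-force reference for "all substrings occurring at least twice".
from collections import Counter

def repeated_substrings(s):
    """All distinct substrings of s that occur at least twice (overlaps count)."""
    counts = Counter(s[i:j] for i in range(len(s))
                            for j in range(i + 1, len(s) + 1))
    return sorted(sub for sub, c in counts.items() if c >= 2)

print(repeated_substrings("banana"))  # ['a', 'an', 'ana', 'n', 'na']
```

In a suffix tree the same answer falls out structurally: every internal node (a point where at least two suffixes branch) spells a repeated substring, along with every prefix of its path label.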
This document describes an assignment to simulate a roller coaster ride using multiple Linux processes. The processes include Tourist processes that board and ride the roller coaster, and processes that control the loading and unloading of tourists. Shared memory and semaphores must be used to coordinate access to data structures like queues and tables among the processes. The simulation tracks tourist wait times and emotions before and after riding.
The document describes an assignment to create a file watcher program in C with 4 parts. The program will watch 3 directories for any changes, notify the changes through a log file, copy files to destinations defined in a file map, remove duplicate files while keeping newer copies, and combine the functionality into a shell with file watching capabilities. The program is designed to automatically organize files downloaded or otherwise placed into the watched directories.
R - Eigenvector centrality with product reviews (Amrith Krishna)
This document describes using eigenvector centrality to rank authors and documents for a search engine on Amazon food reviews. It involves building a graph of authors based on shared products reviewed and calculating transition probabilities between authors. Eigenvector centrality is then calculated iteratively on this graph to rank authors. For a query, the top 25 documents by cosine similarity are ranked by multiplying their similarity score by the author's eigenvector centrality score to produce the final ranked lists. Correlations between the different ranking methods are calculated using Spearman's rank correlation coefficient.
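The iterative centrality computation described above is essentially power iteration on the author graph. The sketch below uses a tiny made-up adjacency structure; in the assignment, the real graph and transition probabilities come from shared reviewed products.

```python
# Power iteration for eigenvector centrality on a toy undirected graph.

def eigenvector_centrality(neighbors, iters=100):
    """Iterate score[v] <- sum of neighbor scores, renormalizing each round."""
    nodes = list(neighbors)
    score = {v: 1.0 for v in nodes}
    for _ in range(iters):
        new = {v: sum(score[u] for u in neighbors[v]) for v in nodes}
        norm = sum(x * x for x in new.values()) ** 0.5
        score = {v: x / norm for v, x in new.items()}
    return score

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
cent = eigenvector_centrality(graph)
print(max(cent, key=cent.get))  # the best-connected author, here "c"
```

The final ranking step in the document then multiplies each document's cosine similarity by its author's centrality score, so a well-connected author boosts otherwise similar documents.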
This document describes an assignment to implement skip lists to solve queries on student data. Students are represented by unique IDs encoding their degree type, year, department, and enrollment number. Skip lists are built on a master list of all students and on subject-specific lists. Queries include finding B.Tech 4th-year students in departments 2-3, students enrolled in Algorithms but not IR from departments 3-5, and counts of students in both or either of Algorithms and IR, along with DOT files tracing each query's solution. The program must be submitted with documentation, and in the output for each query the traced path should be shown in red.
This assignment involves implementing a dictionary using a skip list data structure that supports insertion, deletion, searching, and displaying the entire list. The task has two parts: in part 1 the input is given in a file and the output is a GraphViz DOT file representing the skip list; in part 2 the input is given in two files, and the task is to search the skip list for given keys under different level probabilities p and plot search time versus p using gnuplot. The deliverables include the C/C++ program with a makefile and a Doxygen configuration file.
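A minimal skip list makes the assignment's operations concrete. The sketch below (in Python rather than the C/C++ the assignment asks for) chooses node levels by coin flips with probability p, the same parameter part 2 varies; the fixed RNG seed is only there so the sketch is reproducible.

```python
# Minimal skip list with insert and search; deletion and display are omitted.
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    MAX_LEVEL = 8

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.head = Node(None, self.MAX_LEVEL)  # sentinel, holds no key
        self.level = 1

    def _random_level(self):
        lvl = 1
        while self.rng.random() < self.p and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        update = [self.head] * self.MAX_LEVEL   # rightmost node < key, per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl):                     # splice in at each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):  # drop down level by level
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

sl = SkipList(p=0.5, seed=1)
for k in [5, 1, 9, 3, 7]:
    sl.insert(k)
print(sl.search(3), sl.search(4))  # True False
```

Emitting the DOT file the assignment wants is then a walk over each level's forward chain, printing one edge per pointer; search time versus p falls out of repeating the search loop with different p values.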
The document describes a problem involving coordinating the delivery of food items ("Maach-Dal-Bhaat") to hotel guests by multiple processes. Three food items must be delivered together for a guest's meal. A chef process produces two items at a time, while a waiter process ensures all the items of a guest's full meal are delivered using three service rooms. The processes use shared memory and semaphores to synchronize delivery and avoid deadlocks between the chef, waiter, and service room processes.
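The core chef/waiter coordination can be sketched with a counting semaphore: the chef signals once per produced item, and the waiter waits until all three items of a meal are available before delivering. This is a deliberately simplified two-thread version; the assignment's shared memory, per-room semaphores, and multi-process setup are omitted.

```python
# Counting-semaphore sketch of the chef/waiter handoff.
import threading

items_ready = threading.Semaphore(0)   # signalled once per cooked item
meals_done = []

def chef(total_items):
    for _ in range(total_items):
        items_ready.release()          # one item placed in the service area

def waiter(meals):
    for m in range(meals):
        for _ in range(3):             # a meal needs all three items
            items_ready.acquire()      # block until an item is available
        meals_done.append(m)           # deliver the complete meal

c = threading.Thread(target=chef, args=(6,))
w = threading.Thread(target=waiter, args=(2,))
w.start(); c.start()
c.join(); w.join()
print(meals_done)  # [0, 1]
```

The deadlock risk in the full problem comes from the chef producing only two items at a time while a meal needs three; the waiter's role, and the reason for the extra service-room semaphores, is to buffer items so no process waits forever holding a resource.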
This document contains code snippets and mathematical expressions related to programming and algorithms. It includes:
1) Code for a recursive function to sort a list in ascending order by splitting it into subsets less than and greater than or equal to an element.
2) Formal grammar definitions for syntax rules.
3) Code to print a string of characters based on their ASCII values.
4) An expression to filter a range by checking for prime numbers.
5) Pseudocode for an in-place Cooley–Tukey FFT algorithm that computes FFTs of size N using recursive decomposition.
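The recursive sort in item 1 is the classic quicksort-by-partition pattern; a direct rendering (pivot choice and list-copying style are assumptions, since the original snippet is only summarized here):

```python
# Quicksort by partitioning around the first element.

def qsort(xs):
    """Sort by splitting into elements below and at-or-above the pivot."""
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger_or_equal = [x for x in rest if x >= pivot]
    return qsort(smaller) + [pivot] + qsort(larger_or_equal)

print(qsort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Putting duplicates of the pivot in the "greater than or equal" subset, as the summary describes, keeps the recursion finite even when all elements are equal.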
Alternative au Tramway de la ville de Quebec Rev 1 sml.pdf (Daniel Bedard)
CDPQ Infra unveils a $15 billion, 15-year mobility plan for the Quebec City region. Wouldn't a more economical and faster alternative be possible?
- Leverage CN's existing railway infrastructure by creating a Réseau Express Métropolitain (REM) instead of a new tramway, or a combination of the two.
- Optimize the use of the rails for combined freight and passenger transport, giving priority to passenger travel during peak hours.
- Integrate a cross-river cable car as a third urban link dedicated to pedestrians and cyclists, with a connection to the REM.
- Rethink the third road link as a road tunnel that continues onto the new Île d'Orléans bridge, with some reconfiguration of its lanes.
https://www.linkedin.com/in/bedarddaniel/