3. Chandassu
Chandas Sastra is a literary framework: a set of rules to be followed when writing a poem or prose.
Chandassu was first used in the Vedas.
Telugu Chandassu was derived from Sanskrit, but it has its own set of rules.
A literary work in Telugu that follows the Chandassu framework is referred to as a padyam.
4. Features of a Chandassu
The features that define a chandassu are
gana structure
yati
prAsa
prAsa yati
5. Guru and Laghu
Syllables are classified as guru or laghu.
The symbols associated with guru and laghu are U and I respectively.
Other symbols in use include the dot, the dash and the inverted U.
The classification is based on the time it takes to pronounce the syllable:
A laghu syllable takes 1 unit.
A guru syllable takes 2 units.
The classification also depends on the position of the syllable within a word.
Ex:
On its own, అ is laghu (I).
But in the word అమ్మ the sequence is UI (అ becomes guru because a conjunct follows it).
Chandas Sastra defines these rules.
6. gana
A sequence of guru and laghu symbols forms a gaNa.
gaNas are classified as:
Named gaNa (Akshara)
Symbol sequences of length 1, 2 or 3 are given names. Ex: la, ga, va, ha, ya, ma, ta, ra, ja, bha, na, sa
Compound gaNa
Sequences of named gaNas form compound gaNas. Ex: la-la means II
Grouped gaNa
Sets of named gaNas and compound gaNas are classified as grouped gaNas.
Matra gaNa
gaNas are classified by the total time they take to pronounce.
Ex: la-la takes 2 units of time.
7. yati, prAsa, prAsa yati
yati
In Sanskrit Chandassu, it is the position at which a word break takes place.
In Telugu Chandassu, it is a syllable that is the same as, or a yati friend of, the 1st syllable of a pada.
prAsa
It is usually the 2nd syllable or the last syllable of a pada.
The same syllable, or a prAsa friend of it, must be maintained in every line.
prAsa yati
It is a relaxation available in some Chandassus when the poet is not able to apply prAsa.
The yati and prAsa syllables form a group, and a prAsa-yati-friendly group must appear at the yati position.
8. Classification of Chandassu’s
Chandassu type | gaNa structure | yati | prAsa | prAsa yati
Jati Grouped gaNa ✓ ✓
upaJati Grouped gaNa ✓ ✓ ✓
vRutta Named gaNa ✓ ✓
matrA matra gaNa ✓* ✓*
*Optional in many cases
9. How many?
Chandassu uses binary symbols (U, I) to represent a sequence of syllables.
There is no restriction on the number of syllables per line or per poem (padyam).
The number of sequences that can be formed with up to n syllables is
$2^1 + 2^2 + 2^3 + \dots + 2^n = 2^{n+1} - 2$
Chandas Sastra names Chandassus of up to 26 syllables.
Ex: gayatri Chandassu means any sequence of 6 syllable symbols*
udduramala Chandassu is the name given to a Chandassu with more than 26 syllables.
Put simply, the total number of possible sequences is infinite.
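The closed form for the count can be checked numerically. A minimal Python sketch; the function name count_sequences is illustrative and not part of Chandam:

```python
def count_sequences(n):
    """Count the distinct U/I sequences of length 1 to n by direct summation."""
    return sum(2 ** k for k in range(1, n + 1))

# The direct sum agrees with the closed form 2^(n+1) - 2.
for n in (1, 6, 26):
    assert count_sequences(n) == 2 ** (n + 1) - 2

print(count_sequences(26))  # 134217726
```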
10. How many?
If n = 26, the total number of possible sequences is $2^{27} - 2$ = 13,42,17,726.
Chandas Sastra defines very few of them: only around 2000 sequences.
Ancient Telugu poets frequently used 30 Telugu Chandassus.
Ancient Sanskrit poets used more than 1200 sequences but fewer than 1500*
Many poets create their own Chandassus in our time too.
11. Why?
The quantity of literature written under the Chandassu framework has fallen sharply.
This is the era of digitization.
Tools for publishers: to ensure quality.
Tools for learners: to learn in an easy and interactive way.
Tools for professional poets: to experiment with new sequences and to reduce the effort of computation and validation against the rules.
Tools for language study and analysis: to understand and distinguish a poet's style, language, vocabulary etc. at that time.
What if a single tool could do all of this for padyam?
15. White Listing
Digitized text may include references to sources and sometimes foreign characters.
Only characters in the Unicode sub-range of Telugu (the targeted language), along with a few identified punctuation marks, are kept; everything else is filtered out.
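A minimal sketch of such a whitelist filter in Python. The Telugu block is U+0C00 to U+0C7F; the punctuation set and the function name are assumptions made for illustration, not Chandam's actual list:

```python
# Assumed punctuation whitelist; the real tool's list is not given in the slides.
ALLOWED_PUNCTUATION = " \n,.;:!?-"

def whitelist(text):
    """Keep Telugu-block characters and whitelisted punctuation; drop the rest."""
    return "".join(
        ch for ch in text
        if "\u0C00" <= ch <= "\u0C7F" or ch in ALLOWED_PUNCTUATION
    )

print(whitelist("abc కొత్త [ref 12] అమ్మ!"))
```

Latin letters, digits and brackets are dropped, while the Telugu words, spaces and the exclamation mark survive.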
17. Syllable Chunks
Building syllable chunks is necessary to create the Laghu-Guru stream.
Any syllable extraction mechanism can be used.
Ex: కొత్త → కొ, త్త
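One possible extraction mechanism, sketched in Python under simplifying assumptions: a chunk starts at an independent vowel or consonant (U+0C05 to U+0C39) unless the previous character is a virama, and every other sign attaches to the current chunk. The names are illustrative, and a word-final consonant with virama is not handled specially here.

```python
VIRAMA = "\u0C4D"

def is_base(ch):
    """Independent vowels and consonants of the Telugu block."""
    return "\u0C05" <= ch <= "\u0C39"

def split_syllables(word):
    chunks = []
    for ch in word:
        joins_conjunct = bool(chunks) and chunks[-1].endswith(VIRAMA)
        if not chunks or (is_base(ch) and not joins_conjunct):
            chunks.append(ch)      # start a new syllable chunk
        else:
            chunks[-1] += ch       # vowel sign, virama or conjunct member
    return chunks

print(split_syllables("కొత్త"))  # ['కొ', 'త్త']
```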
19. Laghu-Guru Stream
Each syllable chunk is assigned a symbol (U or I).
Every laghu syllable is then checked to see whether the next syllable influences it.
Ex: కొత్త
కొ [I, 1, 0]
త్త [I, 1, 1]
Format: [current symbol, check for next-syllable influence, can influence previous syllable]
i.e. కొ, త్త → U, I
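The three-flag scheme can be sketched in Python as follows. The rule set is a simplified assumption (long vowels, anusvara, visarga and a closing virama make a syllable guru on its own; a following conjunct promotes a laghu to guru); the special rules of Chandas Sastra are not covered, and all names are illustrative.

```python
VIRAMA = "\u0C4D"
LONG = set("\u0C06\u0C08\u0C0A\u0C60\u0C0F\u0C10\u0C13\u0C14"   # long independent vowels
           "\u0C3E\u0C40\u0C42\u0C44\u0C47\u0C48\u0C4B\u0C4C")  # long vowel signs
CLOSERS = {"\u0C02", "\u0C03"}                                  # anusvara, visarga

def classify(chunk):
    """Return [symbol, check-next flag, can-influence-previous flag]."""
    own_guru = any(c in LONG or c in CLOSERS for c in chunk) or chunk.endswith(VIRAMA)
    symbol = "U" if own_guru else "I"
    check_next = 0 if own_guru else 1                 # only laghu needs the check
    influences_prev = 1 if VIRAMA in chunk[:-1] else 0  # chunk holds a conjunct
    return [symbol, check_next, influences_prev]

def laghu_guru_stream(chunks):
    flags = [classify(c) for c in chunks]
    out = []
    for i, (symbol, check_next, _) in enumerate(flags):
        if check_next and i + 1 < len(flags) and flags[i + 1][2]:
            symbol = "U"                              # promoted by the next syllable
        out.append(symbol)
    return "".join(out)

print(laghu_guru_stream(["కొ", "త్త"]))  # UI
```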
22. gaNa Parser
The symbol stream is parsed according to the characteristics of the target gaNa.
Ex: UIIUIIUIIU
Named gaNa:
The sequence above can be parsed as bha-bha-bha-ga, as gala-laga-lala-gala-laga, or in many other ways.
While parsing the symbols, the threshold (length) of the next expected gaNa is taken into account.
If the feature for the above sequence is defined as bha-bha-bha-ga, the thresholds are 3-3-3-1.
UII-UII-UII-U → bha-bha-bha-ga
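Threshold-driven parsing of named gaNas can be sketched in Python like this. The table lists the usual one-, two- and three-symbol named gaNas; the function name and the result layout are assumptions for illustration.

```python
NAMED = {
    "la": "I", "ga": "U", "va": "IU", "ha": "UI",
    "ya": "IUU", "ma": "UUU", "ta": "UUI", "ra": "UIU",
    "ja": "IUI", "bha": "UII", "na": "III", "sa": "IIU",
}

def parse_named(stream, expected):
    """Cut the stream by each expected gaNa's length and report matches."""
    pos, result = 0, []
    for name in expected:
        threshold = len(NAMED[name])          # 3-3-3-1 for bha-bha-bha-ga
        piece = stream[pos:pos + threshold]
        result.append((name, piece, piece == NAMED[name]))
        pos += threshold
    return result

for row in parse_named("UIIUIIUIIU", ["bha", "bha", "bha", "ga"]):
    print(row)
```

Each row carries the expected name, the symbols actually found, and whether they match, so a mismatch can later be reported against a specific gaNa.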
23. gaNa Parser
Grouped gaNa
In the case of grouped gaNas, the expected threshold is not constant.
The gaNa is taken to be the shortest symbol sequence that matches the expected group, or else the sequence ending at the symbol where the max threshold is reached.
Ex: UIUUI → UI-UUI (Surya-Indra)
Group             Min threshold   Max threshold
Surya (Brahma)    2               3
Indra (Vishnu)    3               4
Chandra (Rudra)   4               5
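A Python sketch of grouped-gaNa parsing with variable thresholds. The Surya and Indra memberships below are the commonly cited ones and should be treated as an assumption; Chandra gaNas are left out, and the names are illustrative.

```python
GROUPS = {
    "Surya": {"III", "UI"},
    "Indra": {"IIII", "IIIU", "IIUI", "UII", "UIU", "UUI"},
}
THRESHOLDS = {"Surya": (2, 3), "Indra": (3, 4)}   # (min, max) from the table

def parse_grouped(stream, expected):
    pos, result = 0, []
    for group in expected:
        low, high = THRESHOLDS[group]
        piece = None
        for size in range(low, high + 1):         # shortest match wins
            candidate = stream[pos:pos + size]
            if candidate in GROUPS[group]:
                piece = candidate
                break
        matched = piece is not None
        if not matched:
            piece = stream[pos:pos + high]        # fall back to the max threshold
        result.append((group, piece, matched))
        pos += len(piece)
    return result

print(parse_grouped("UIUUI", ["Surya", "Indra"]))
```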
24. gaNa Parser
Matra gaNa
The gaNa is the shortest symbol sequence at which the expected matra count is reached or exceeded.
Ex: UIUUUI can be parsed as UIU-UUI when the expected matra gaNa is 5-5
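Matra-based parsing can be sketched in Python by accumulating 1 unit per laghu and 2 per guru until the expected count is reached or exceeded. The function name and the example stream UIUUUI are illustrative assumptions.

```python
MATRA = {"I": 1, "U": 2}   # laghu = 1 unit, guru = 2 units

def parse_matra(stream, expected):
    """Split the stream into pieces whose matra totals reach each target."""
    pos, result = 0, []
    for target in expected:
        start, total = pos, 0
        while pos < len(stream) and total < target:
            total += MATRA[stream[pos]]
            pos += 1
        result.append((stream[start:pos], total))
    return result

print(parse_matra("UIUUUI", [5, 5]))  # [('UIU', 5), ('UUI', 5)]
```

A piece whose total overshoots or falls short of its target is returned as is, which lets the matching step flag it.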
25. Pairs Parser
yati, prAsa and prAsa yati are each a pair of syllables.
These are extracted based on the position of yati.
Position of yati
Usually an absolute number in the case of vRuttas.
Ex: 10th place means the 10th syllable in each line.
Or a position relative to a given gaNa.
Ex: 1st syllable of the 3rd gaNa.
Ex: Last syllable of the 3rd gaNa.
For each line, a pair is created from the 1st and nth syllables, together with their preceding syllables.
prAsa:
prAsa is usually the 2nd or last syllable of each line.
Hence an array of prAsa syllables is created, again with the preceding syllable.
The preceding syllable plays an important role, since it can influence the validity of the yati, prAsa and prAsa-yati pairs.
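A Python sketch of pair extraction for an absolute yati position. It assumes each line is already a list of syllables with at least yati_position entries, that yati_position is 2 or more, and that prAsa is the 2nd syllable; the dictionary keys are illustrative.

```python
def extract_pairs(lines, yati_position):
    """Collect yati pairs and prAsa entries, each with its preceding syllable."""
    yati_pairs, prasa = [], []
    for syllables in lines:
        n = yati_position                       # 1-based position of yati
        yati_pairs.append({
            "first": syllables[0],
            "nth": syllables[n - 1],
            "before_nth": syllables[n - 2],     # may affect validity of the pair
        })
        prasa.append({"syllable": syllables[1], "previous": syllables[0]})
    return yati_pairs, prasa

# Placeholder syllables s1..s12 stand in for real text.
line = ["s%d" % i for i in range(1, 13)]
print(extract_pairs([line], 10))
```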
27. Match Features
Extracted features (gaNas and pairs) are matched against the expected features.
A scoring system is defined to find the match percentage:
-1 → Key feature not found or mismatched.
0 → Feature not found or mismatched
+1 → Feature found and matched.
+2 → Key feature found and matched exactly.
Customised Scoring Systems are open for experiments.
Percentage of match, or confidence:
$\text{Confidence} = \dfrac{(\text{sum of the scores gained by all features}) \times 100}{(2 \times \text{no. of key features}) + \text{no. of normal features}}$
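The scoring rule and the confidence formula can be sketched in Python. The input layout, a list of (is_key, matched) pairs with at least one entry, is an assumption made for illustration.

```python
def feature_score(is_key, matched):
    """+2 / -1 for key features, +1 / 0 for normal features."""
    if is_key:
        return 2 if matched else -1
    return 1 if matched else 0

def confidence(results):
    """results: non-empty list of (is_key, matched) pairs, one per feature."""
    total = sum(feature_score(is_key, matched) for is_key, matched in results)
    key_count = sum(1 for is_key, _ in results if is_key)
    normal_count = len(results) - key_count
    return total * 100 / (2 * key_count + normal_count)

# Two key features matched, one normal feature matched, one normal feature missed.
print(confidence([(True, True), (True, True), (False, True), (False, False)]))
```

Because a missed key feature scores -1, the confidence can fall below zero when most key features fail.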
28. Match Results
Match results can be delivered in the format the user needs: HTML, PDF, Excel, text etc.
Mismatches are presented as errors.
The match score is presented as the confidence of the match.
30. Chandassu Identification
Why?
To determine the Chandassu of an unknown padyam.
To find multiple matches, if any.
To resolve conflicts.
Mechanism
Match each and every Chandassu against the given padyam.
Identify the Chandassu for which the max score is obtained.
Can be applied only to known Chandassus.
Determining Sama/Vishama pada Chandassus is also possible [not in scope].
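The brute-force mechanism can be sketched in Python as below. The scoring function is passed in; toy_score and the pattern names are stand-ins invented for this example and do not reflect the real matching engine.

```python
def identify(padyam, known_chandassus, match_score):
    """Return the best score and every Chandassu name that reaches it."""
    scores = {name: match_score(padyam, spec)
              for name, spec in known_chandassus.items()}
    best = max(scores.values())
    return best, sorted(name for name, s in scores.items() if s == best)

def toy_score(stream, pattern):
    """Stand-in scorer: percentage of positions where the symbols agree."""
    hits = sum(a == b for a, b in zip(stream, pattern))
    return 100 * hits / max(len(stream), len(pattern))

known = {"pattern-1": "UIIUIIUIIU", "pattern-2": "UIUIUIUIUI"}
print(identify("UIIUIIUIIU", known, toy_score))  # (100.0, ['pattern-1'])
```

Returning all names that share the best score keeps multiple matches visible, so conflicts can be resolved in a later step.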
32. Identification Engine
Need for optimization
Running the matching engine on all known Chandassus could take a long time.
Ex:
Consider a known-Chandassu set of size 400 (in the case of Telugu).
The average time to match the features of a given Chandassu is 40-100 milliseconds.
Total time to identify is 40 ms × 400 to 100 ms × 400, i.e. 16 sec. to 40 sec.
Language          Size   Min time   Max time
Telugu/Kannada    400    16 sec.    40 sec.
Sanskrit          1200   48 sec.    120 sec.
33. Identification Engine
Eliminating redundant steps and caching the results
The results of the text analysis are cached.
Determining the eligible candidates
Find the syllable count of each line. Ex: 7, 12, 8, 15
Find the range of the syllable counts, i.e. the min and max values. Ex: 7-15
Find all the candidates that fall within this range. Ex: 7-15
If the digitization has errors, the syllable count may not match the actual count.
So an extended range is calculated, assuming say t% digitization errors:
Floor(X1*((100-t)/100)) to Ceil(X2*((100+t)/100))
where X1 = min value, X2 = max value.
Ex: the extended range would be 6-16.
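The extended range and the candidate filter can be sketched in Python. The value t = 5 is an assumption that happens to reproduce the 6-16 example; the candidate table (name mapped to syllables per line) is a hypothetical layout.

```python
def extended_range(line_counts, t):
    """Floor(X1*(100-t)/100) to Ceil(X2*(100+t)/100), in integer arithmetic."""
    x1, x2 = min(line_counts), max(line_counts)
    low = (x1 * (100 - t)) // 100
    high = -((-x2 * (100 + t)) // 100)     # ceiling via negated floor division
    return low, high

def eligible(candidates, low, high):
    """Keep only the candidates whose syllable count lies in [low, high]."""
    return [name for name, size in candidates.items() if low <= size <= high]

low, high = extended_range([7, 12, 8, 15], t=5)
print(low, high)  # 6 16
print(eligible({"cand-a": 6, "cand-b": 20, "cand-c": 16}, low, high))
```

Integer arithmetic is used instead of floating-point Floor and Ceil so that exact multiples are not pushed across a boundary by rounding error.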
47. Here we can see that the first pada is faulty.
Computation results
48. Computation results
We can also see that the prAsa yati in the second pada is identified correctly.
Note that ఛాయనొసగు in the previous padyam now reads ఛాయననొసగు.
55. Case Studies-1 Telugu Bhagavatam
Sri U. Samba Siva Rao digitized the Telugu Bhaghavatam
http://telugubhaghavatam.org
Total padyams: 10061 (7400), with 900K words and 16K unique words.
Total time taken to run 7400 padyams is 18 min, i.e. about 150 ms per padyam.
Running these through Chandam gave results with 70% confidence.
Percentage   Result         Examples
40           Human errors   Spelling mistakes or misplaced punctuation
30           Human errors   Misplaced spaces and punctuation; compound words treated as independent words and vice versa
30           False alarm    Due to limitations of the tool
56. Case Studies-2 Poets & Usage
Poets with primary or intermediate skill at writing padyams have credited the tool on various forums:
Improved quality
More focus on the creative and literary part
Computation is outsourced
Poets who have mastered writing padyams:
Experiment with and practise (or learn) new patterns.
Around 30-40 regular Telugu poets are using Chandam.
Avg. computations per poet per day: 3-5
57. Case Studies-3 Research
Mr. M. Narasimha Rao and I have started analyzing Annamaya kIrtaNas
To find whether there is any influence of Chandassu in his writings.
To compute a statistical analysis of Chandassu patterns.
Mr. Sri Ganesh T is working on determining author style in metrical poetry.
58. Limitations
Handling Special Rules
In Determining Symbols Ex: అదుర చు, క్దుర చు
yati matching when there is Sandhi
ఆటగా ఛందాలనల్లించి, యలరించి where ఛందాలనల్లించి = ఛందాలను + అల్లించి
#Yati matching based on achchu
59. What’s Next?
Resolving the lines
Ancient poets used to write the whole padyam in a single line.
In some cases there is no indicator of line breaks or of the Chandassu name.
This makes it difficult to determine the Chandassu.
It can be resolved easily with a little customization of the Identification Engine.
Discovering the art forms
Bandha or citra kavitvas
Configured for Kannada Chandassus too. [Alpha version]
60. Technologies.
Runs as a web client and server, and as a Windows client application.
A JS API is available for integration with external sites.
http://chandam.apphb.com/?qpi
Technologies
JavaScript via Script#
HTML5
MongoDB
C#
ASP.NET
For ports to Java/PHP, contact me to work collaboratively.
61. Dileep Miriyala
Contribution to Indic Languages:
Indic PDF http://indicpdf.apphb.com
Telugu Bhaghavatam http://telugubhaghavatam.org
Chandam : http://chandam.apphb.com
7 Keyboard Layouts for Telugu [Web/Windows/Mac]
Importable Mac Keyboard Layout on Windows
ASCII to Unicode Fonts (Not TEXT Conversion)
Some Works in progress
Sandhi Merger and Identifier
Spell checker
Content Clustering