Computational Approaches to Melodic Analysis of Indian Art Music
Sankalp Gulati
Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
Indian Institute of Science, Bengaluru, India, 2016
4. Tonic Identification
[Figure: spectrogram of an excerpt (time in s vs. frequency in Hz) and its multi-pitch salience histogram (frequency in 10-cent bins, reference 55 Hz, normalized salience) with harmonic peaks f2–f6 and the tonic peak marked]
Approaches: signal processing and learning-based
• Tanpura / drone background sound
• Extent of gamakas on Sa and Pa svaras
• Vādi and samvādi svaras of the rāga
• S. Gulati, A. Bellur, J. Salamon, H. G. Ranjani, V. Ishwar, H. A. Murthy, and X. Serra, "Automatic tonic identification in Indian art music: approaches and evaluation," Journal of New Music Research, vol. 43, no. 1, pp. 55–73, 2014.
• J. Salamon, S. Gulati, and X. Serra, "A multipitch approach to tonic identification in Indian classical music," in Proc. of the Int. Society for Music Information Retrieval Conf. (ISMIR), Porto, Portugal, 2012, pp. 499–504.
• A. Bellur, V. Ishwar, X. Serra, and H. Murthy, "A knowledge based signal processing approach to tonic identification in Indian classical music," in Proc. of the 2nd CompMusic Workshop, Istanbul, Turkey, 2012, pp. 113–118.
• H. G. Ranjani, S. Arthi, and T. V. Sreenivas, "Carnatic music analysis: Shadja, swara identification and raga verification in alapana using stochastic models," in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, 2011, pp. 29–32.
Accuracy: ~90%
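As a toy illustration of the histogram-based family of approaches (not any one published system), a tonic candidate can be taken as the strongest peak of a pitch histogram on a cents grid; the function and parameter names below are made up for this sketch:

```python
import numpy as np

def pitch_histogram(f0_hz, ref_hz=55.0, bin_cents=10, n_bins=360):
    """Histogram of voiced F0 values on a cents grid (illustrative helper)."""
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[f0 > 0]                                   # drop unvoiced frames
    cents = 1200.0 * np.log2(f0 / ref_hz)
    hist, edges = np.histogram(cents, bins=n_bins, range=(0, n_bins * bin_cents))
    return hist / hist.max(), edges

def tonic_candidate(f0_hz, ref_hz=55.0):
    """Pick the highest histogram peak as the tonic candidate; the cited
    systems instead select among several peaks using multi-pitch salience
    or learned rules."""
    hist, edges = pitch_histogram(f0_hz, ref_hz)
    peak_bin = int(np.argmax(hist))
    peak_cents = 0.5 * (edges[peak_bin] + edges[peak_bin + 1])
    return ref_hz * 2 ** (peak_cents / 1200.0)
```

Quantizing to 10-cent bins limits the recovered tonic to the bin-center resolution, which is why published systems refine the peak location afterwards.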
8. Predominant Pitch Estimation
• Pitch (fundamental frequency, F0) of the lead artist
• Pitch estimation
  § Melodic contour characteristics
  § Dual melodic lines in Indian art music
Approaches: signal processing and learning-based
• J. Salamon and E. Gómez, "Melody extraction from polyphonic music signals using pitch contour characteristics," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, pp. 1759–1770, 2012.
• V. Rao and P. Rao, "Vocal melody extraction in the presence of pitched accompaniment in polyphonic music," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 8, pp. 2145–2154, 2010.
• A. de Cheveigné and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music," The Journal of the Acoustical Society of America, vol. 111, no. 4, pp. 1917–1930, 2002.
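Of the cited estimators, YIN is the most compact to sketch. Below is a minimal version of its cumulative-mean-normalized difference function, omitting the parabolic-interpolation and best-local-estimate refinements of the published algorithm; the parameter values are illustrative:

```python
import numpy as np

def yin_f0(frame, sr, fmin=80.0, fmax=800.0, threshold=0.1):
    """Simplified YIN on a single frame: difference function, cumulative-mean
    normalization, first dip below a threshold, then descent to the dip's
    local minimum."""
    tau_min = int(sr / fmax)
    tau_max = int(sr / fmin)
    # squared difference function d(tau) for lags 1 .. tau_max
    d = np.array([np.sum((frame[:-tau] - frame[tau:]) ** 2)
                  for tau in range(1, tau_max + 1)])
    # cumulative-mean normalization removes the trivial minimum at lag 0
    cmnd = d * np.arange(1, tau_max + 1) / np.maximum(np.cumsum(d), 1e-12)
    search = cmnd[tau_min - 1:]                # lags tau_min .. tau_max
    below = np.where(search < threshold)[0]
    if below.size:
        j = int(below[0])
        # walk down to the local minimum of the first qualifying dip
        while j + 1 < len(search) and search[j + 1] < search[j]:
            j += 1
    else:
        j = int(np.argmin(search))
    return sr / (tau_min + j)
```

Because the lag is an integer number of samples, the estimate is quantized; the real algorithm interpolates around the minimum for sub-sample accuracy.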
16. Melody Histogram Computation
[Figure: pitch contour of an excerpt (time in s vs. frequency in cents, reference = tonic) and the resulting melody histogram (10-cent bins, reference 55 Hz, normalized salience) with the lower, middle, and higher Sa and the tonic marked]
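A minimal sketch of the melody-histogram computation, assuming a pre-extracted F0 track and a known tonic (the function name and octave folding are illustrative choices, not a specific published recipe):

```python
import numpy as np

def melody_histogram(f0_hz, tonic_hz, bin_cents=10):
    """Octave-folded, tonic-normalized melody histogram: convert voiced F0
    values to cents relative to the tonic, fold into one octave, and bin.
    The tonic (Sa) then sits at 0 cents."""
    f0 = np.asarray(f0_hz, dtype=float)
    cents = 1200.0 * np.log2(f0[f0 > 0] / tonic_hz)   # 0 = tonic
    folded = np.mod(cents, 1200.0)                    # fold into one octave
    edges = np.arange(0.0, 1200.0 + bin_cents, bin_cents)
    hist, _ = np.histogram(folded, bins=edges)
    return hist / max(hist.max(), 1), edges
```

Folding collapses the lower, middle, and higher Sa peaks of the unfolded histogram into a single peak at 0 cents, which is convenient for svara-level analysis.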
17. Intonation Analysis
[Figure: pitch distributions of svara G in rāga Mohana vs. rāga Begada]
• G. K. Koduri, V. Ishwar, J. Serrà, and X. Serra, "Intonation analysis of rāgas in Carnatic music," Journal of New Music Research, vol. 43, no. 1, pp. 72–93, 2014.
• G. K. Koduri, J. Serrà, and X. Serra, "Characterization of intonation in Carnatic music by parametrizing pitch histograms," in Proc. of the Int. Society for Music Information Retrieval Conf. (ISMIR), 2012, pp. 199–204.
24. Melodic Pattern Discovery
Pipeline: data processing → intra-recording discovery → inter-recording search → rank refinement
Data processing steps:
- Predominant pitch estimation
- Downsampling
- Hz to cents
- Tonic normalization
- Brute-force segmentation
- Segment filtering (flat / non-flat)
- Uniform time-scaling
• S. Gulati, J. Serrà, V. Ishwar, and X. Serra, "Mining melodic patterns in large audio collections of Indian art music," in Int. Conf. on Signal Image Technology & Internet Based Systems - MIRA, Marrakech, Morocco, 2014, pp. 264–271.
25. Data Preprocessing
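The preprocessing chain listed above (downsampling, Hz-to-cents conversion, tonic normalization, brute-force segmentation) can be sketched as follows; sampling rates, segment length, and hop are illustrative placeholders, not the paper's values:

```python
import numpy as np

def preprocess_pitch(f0_hz, tonic_hz, sr_in=100, sr_out=25):
    """Downsample the F0 track, then convert to tonic-normalized cents.
    Unvoiced frames (F0 <= 0) become NaN."""
    step = sr_in // sr_out
    f0 = np.asarray(f0_hz, dtype=float)[::step]       # downsampling
    voiced = f0 > 0
    cents = np.full(f0.shape, np.nan)
    cents[voiced] = 1200.0 * np.log2(f0[voiced] / tonic_hz)
    return cents

def segment(cents, length=50, hop=10):
    """Brute-force segmentation into fixed-length overlapping subsequences."""
    return [cents[i:i + length]
            for i in range(0, len(cents) - length + 1, hop)]
```

Segment filtering (e.g., discarding flat segments) and uniform time-scaling would operate on the list this returns.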
26. Melodic Similarity
• S. Gulati, J. Serrà, and X. Serra, "An evaluation of methodologies for melodic similarity in audio recordings of Indian art music," in Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, Australia, 2015.
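Melodic similarity between two pitch subsequences is typically computed with dynamic time warping. Here is a minimal, unconstrained DTW sketch with squared-difference cost; practical systems add band constraints and length normalization:

```python
import numpy as np

def dtw_distance(x, y):
    """Plain DTW distance between two 1-D sequences (squared-difference
    local cost, no warping-window constraint)."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because warping absorbs local timing differences, two contours tracing the same melodic shape at different speeds score near zero.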
28. Computational Complexity
• Lower-bounding techniques for DTW
• T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, …, and E. Keogh, "Searching and mining trillions of time series subsequences under dynamic time warping," in Proc. of the 18th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2012, pp. 262–270.
Image taken from: http://www.cs.ucr.edu/~eamonn/LB_Keogh.htm
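Lower bounds let a search discard most candidates with a cheap check before running full DTW. A minimal sketch of the LB_Keogh bound for equal-length sequences under a Sakoe-Chiba band of half-width r (squared-difference cost, matching the DTW variant it would prune for):

```python
import numpy as np

def lb_keogh(query, candidate, r=5):
    """LB_Keogh lower bound: build upper/lower envelopes of the candidate
    within the warping window r, and accumulate cost only for query
    samples that fall outside the envelope."""
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    n = len(c)
    lb = 0.0
    for i, qi in enumerate(q):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        u, l = c[lo:hi].max(), c[lo:hi].min()   # envelope at position i
        if qi > u:
            lb += (qi - u) ** 2
        elif qi < l:
            lb += (qi - l) ** 2
    return lb
```

If the bound already exceeds the best DTW distance found so far, the candidate can be skipped without computing DTW at all.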
29. Melodic Similarity Improvements
• S. Gulati, J. Serrà, and X. Serra, "Improving melodic similarity in Indian art music using culture-specific melodic characteristics," in Int. Society for Music Information Retrieval Conf. (ISMIR), Spain, 2015, pp. 680–686.
31. Melodic Pattern Network
• M. E. J. Newman, "The structure and function of complex networks," SIAM Review, vol. 45, no. 2, pp. 167–256, 2003.
Undirected network
33. Similarity Threshold Estimation
• S. Maslov and K. Sneppen, "Specificity and stability in topology of protein networks," Science, vol. 296, no. 5569, pp. 910–913, 2002.
Optimal similarity threshold: Ts*
34. Melodic Pattern Characterization
V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, no. 10, P10008, 2008.
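The cited Louvain method detects communities by greedy modularity optimization. As a compact, self-contained stand-in (explicitly not Louvain itself), label propagation illustrates the same goal of grouping pattern nodes into communities:

```python
def label_propagation(adj, n_iter=20):
    """Minimal label-propagation community detection over an adjacency map
    {node: set of neighbours}: each node repeatedly adopts the most
    frequent label among its neighbours (ties broken by the smallest
    label, for determinism)."""
    labels = {v: v for v in adj}
    for _ in range(n_iter):
        for v in sorted(adj):
            if not adj[v]:
                continue                      # isolated node keeps its label
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            best = max(counts.values())
            labels[v] = min(l for l, c in counts.items() if c == best)
    return labels
```

On a melodic-pattern network, each resulting label set is a candidate community of related patterns, which the following slides then characterize.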
35. Melodic Pattern Characterization
36. Melodic Pattern Characterization
38. Melodic Pattern Characterization
• S. Gulati, J. Serrà, V. Ishwar, S. Şentürk, and X. Serra, "Discovering rāga motifs by characterizing communities in networks of melodic patterns," in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 286–290.
43. Rāga Characterization: Svaras
[Figure: pitch contour (time in s vs. frequency in cents, reference = tonic) and melody histogram (10-cent bins, reference 55 Hz, normalized salience) with the lower, middle, and higher Sa and the tonic marked]
• P. Chordia and S. Şentürk, "Joint recognition of raag and tonic in North Indian music," Computer Music Journal, vol. 37, no. 3, pp. 82–98, 2013.
• G. K. Koduri, S. Gulati, P. Rao, and X. Serra, "Rāga recognition based on pitch distribution methods," Journal of New Music Research, vol. 41, no. 4, pp. 337–350, 2012.
44. Rāga Characterization: Intonation
[Figure: pitch contour and tonic-normalized melody histogram with the lower, middle, and higher Sa marked]
45. Rāga Characterization: Intonation
[Figure: pitch contour and tonic-normalized melody histogram with the lower, middle, and higher Sa marked]
• G. K. Koduri, V. Ishwar, J. Serrà, and X. Serra, "Intonation analysis of rāgas in Carnatic music," Journal of New Music Research, vol. 43, no. 1, pp. 72–93, 2014.
• H. G. Ranjani, S. Arthi, and T. V. Sreenivas, "Carnatic music analysis: Shadja, swara identification and raga verification in alapana using stochastic models," in IEEE WASPAA, 2011, pp. 29–32.
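In the spirit of the histogram-parametrization approach cited above, a svara's intonation can be summarized from a tonic-normalized pitch track. The parameter set below (mean offset, spread, salience) is illustrative, not the published feature set:

```python
import numpy as np

def svara_intonation(cents, svara_cents, width=60):
    """Summarize intonation of one svara: collect pitch samples within
    +/- `width` cents of the svara's nominal octave-folded position and
    report mean offset, spread, and relative salience."""
    c = np.asarray(cents, dtype=float)
    c = c[np.isfinite(c)]                       # drop unvoiced (NaN) frames
    folded = np.mod(c, 1200.0)
    d = folded - svara_cents
    d = (d + 600.0) % 1200.0 - 600.0            # wrap offsets to [-600, 600)
    near = d[np.abs(d) <= width]
    if near.size == 0:
        return None
    return {"mean_offset": float(near.mean()),
            "std": float(near.std()),
            "salience": near.size / c.size}
```

Differences in these per-svara summaries (e.g., for G in Mohana vs. Begada) are what intonation-based rāga characterization exploits.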
47. Rāga Characterization: Ārōh-Avrōh
• S. Shetty and K. K. Achary, "Raga mining of Indian music by extracting arohana-avarohana pattern," Int. Journal of Recent Trends in Engineering, vol. 1, no. 1, pp. 362–366, 2009.
• V. Kumar, H. Pandya, and C. V. Jawahar, "Identifying ragas in Indian music," in 22nd Int. Conf. on Pattern Recognition (ICPR), 2014, pp. 767–772.
• P. V. Rajkumar, K. P. Saishankar, and M. John, "Identification of Carnatic raagas using hidden Markov models," in IEEE 9th Int. Symposium on Applied Machine Intelligence and Informatics (SAMI), 2011, pp. 107–110.
• Melodic progression templates
• N-gram distribution
• Hidden Markov models
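Of the three approaches listed, the n-gram distribution is the simplest to sketch: transcribe the melody to a svara symbol sequence, then estimate the probability of each length-n subsequence (the svara symbols and n below are illustrative):

```python
from collections import Counter

def ngram_distribution(svara_seq, n=3):
    """N-gram distribution over a svara symbol sequence: counts of all
    length-n subsequences, normalized to probabilities."""
    grams = [tuple(svara_seq[i:i + n])
             for i in range(len(svara_seq) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}
```

Because ascending (ārōh) and descending (avrōh) movements produce different n-grams, these distributions separate rāgas that share the same svara set but differ in melodic progression.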
49. Rāga Characterization: Melodic Motifs
• R. Sridhar and T. V. Geetha, "Raga identification of Carnatic music for music information retrieval," Int. Journal of Recent Trends in Engineering, vol. 1, no. 1, pp. 571–574, 2009.
• S. Dutta, S. P. V. Krishnaraj, and H. A. Murthy, "Raga verification in Carnatic music using longest common segment set," in Int. Society for Music Information Retrieval Conf. (ISMIR), 2015, pp. 605–611.
[Figure: characteristic melodic motifs grouped by rāga: Rāga A, Rāga B, Rāga C]
50. Goal
• Automatic rāga recognition
Training corpus → Rāga recognition system → Rāga label
Yaman
Shankarabharnam
Todī
Darbari
Kalyan
Bageśrī
Kambhojī
Hamsadhwani
Des
Harikambhoji
Kirvani
Atana
Behag
Kapi
Begada
51. Goal
• Automatic rāga recognition
Training corpus → Rāga recognition system → Rāga label
[Figure: training-corpus recordings shown as pitch-contour thumbnails (time in s)]
Yaman
Shankarabharnam
Todī
Darbari
Kalyan
Bageśrī
Kambhojī
Hamsadhwani
Des
Harikambhoji
Kirvani
Atana
Behag
Kapi
Begada
58. Classification Methodology
• Experimental setup
  § Stratified 12-fold cross-validation (balanced)
  § Each experiment repeated 20 times
  § Evaluation measure: mean classification accuracy
• Classifiers
  § Multinomial, Gaussian, and Bernoulli naive Bayes (NBM, NBG, NBB)
  § SVMs with linear and RBF kernels, and with SGD learning (SVML, SVMR, SGD)
  § Logistic regression (LR) and random forest (RF)
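The stratified split can be sketched without any ML library: shuffle each class's indices and deal them round-robin into k folds, so every fold keeps roughly the class proportions of the full set (k=12 matches the setup above; the function name is illustrative):

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=12, seed=0):
    """Stratified k-fold index split: per-class shuffle, then round-robin
    assignment of each class's indices across the k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds
```

Repeating the whole procedure with different seeds (20 times in the setup above) and averaging the fold accuracies gives the reported mean classification accuracy.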
59. Results
db        Mtd  Ftr      NBM   NBB   LR    SVML  1NN
DB10rāga  M    F1       90.6  74    84.1  81.2  -
          M    F2       91.7  73.8  84.8  81.2  -
          M    F3       90.5  74.5  84.3  80.7  -
          S1   PCD120   -     -     -     -     82.2
          S1   PCDfull  -     -     -     -     89.5
          S2   PDparam  37.9  11.2  70.1  65.7  -
DB40rāga  M    F1       69.6  61.3  55.9  54.6  -
          M    F2       69.6  61.7  55.7  54.3  -
          M    F3       69.5  61.5  55.9  54.5  -
          S1   PCD120   -     -     -     -     66.4
          S1   PCDfull  -     -     -     -     74.1
          S2   PDparam  20.8  2.6   51.4  44.2  -

Table 1. Accuracy (in percent) of different methods (Mtd) on two datasets (db) using different classifiers and features (Ftr).
S. Gulati, J. Serrà, V. Ishwar, and X. Serra, "Phrase-based Rāga Recognition Using Vector Space Modelling," in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 66–70, Shanghai, China, 2016.
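The vector space model behind this approach treats recordings as documents and melodic phrases as words. A minimal TF-IDF sketch of that idea (the paper's exact weighting variants may differ from this textbook form):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weighting over 'documents' of melodic-phrase IDs: term
    frequency within each document times log inverse document frequency
    across the collection."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                    # document frequency per phrase
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors
```

Phrases occurring in every recording get zero weight, so the vectors emphasize phrases that discriminate between rāgas; a standard classifier (as in Table 1) then operates on these vectors.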