Last updated: 01.12.2024
[CalvoZaragoza2024] |
Jorge Calvo-Zaragoza, Eliseo Fuentes-Martínez, Noelia Luna-Barahona, and
Antonio Ríos-Vila.
Can multimodal large language models read music score images?
In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors,
Proceedings of the 6th International Workshop on Reading Music
Systems, pages 4-6, Online, 2024.
[ bib |
DOI |
http ]
This paper investigates whether multimodal large language models (MLLMs), which combine visual and textual understanding, can effectively read and interpret music score images. Given their ability to process and integrate information from multiple modalities, MLLMs present a promising approach for Optical Music Recognition (OMR). Through empirical evaluation, we demonstrate that while MLLMs exhibit potential in recognizing musical structures, challenges remain in addressing the complexity of music notation. This work highlights the need for further refinements in ML
|
[Coueasnon2024] | Bertrand Coüasnon, Mathieu Giraud, Christophe Guillotel Nothmann, Aurélie Lemaitre, and Philippe Rigaux. CollabScore project - From Optical Recognition to Multimodal Music Sources. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 33-37, Online, 2024. [ bib | DOI | http ] |
[Dvorak2024] | Vojtěch Dvořák, Jan Hajič jr., and Jiří Mayer. Staff Layout Analysis Using the YOLO Platform. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 18-22, Online, 2024. [ bib | DOI | http ] |
[Hartelt2024] | Alexander Hartelt and Frank Puppe. OMMR4all revisited - a Semiautomatic Online Editor for Medieval Music Notations. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 46-49, Online, 2024. [ bib | DOI | http ] |
[Lambertye2024] | Grégoire de Lambertye and Alexander Pacha. Semantic Reconstruction of Sheet Music with Graph-Neural Networks. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 12-17, Online, 2024. [ bib | DOI | http ] |
[MenarguezBox2024] | Aitana Menárguez-Box, Alejandro H. Toselli, and Enrique Vidal. Enhanced User-Machine Interaction for Historical Sheet Music Retrieval: a Musical Notation Approach. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 28-32, Online, 2024. [ bib | DOI | http ] |
[Repolusk2024] | Tristan Repolusk and Eduardo Veas. Semi-Automatic Annotation of Chinese Suzipu Notation Using a Component-Based Prediction and Similarity Approach. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 38-42, Online, 2024. [ bib | DOI | http ] |
[RiosVila2024] | Antonio Ríos-Vila, Eliseo Fuentes-Martínez, and Jorge Calvo-Zaragoza. Towards Sheet Music Information Retrieval: A Unified Approach Using Multitask Transformers. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 7-11, Online, 2024. [ bib | DOI | http ] |
[Tirupati2024] | Nivesara Tirupati, Elona Shatri, and György Fazekas. Crafting Handwritten Notations: Towards Sheet Music Generation. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 50-56, Online, 2024. [ bib | DOI | http ] |
[Torras2024] | Pau Torras, Sanket Biswas, and Alicia Fornés. On Designing a Representation for the Evaluation of Optical Music Recognition Systems. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 23-27, Online, 2024. [ bib | DOI | http ] |
[Umbreit2024] | Janosch Umbreit and Silvana Schumann. OMR on Early Music Sources at the Bavarian State Library with MuRET - Prototyping, Automating, Scaling. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 6th International Workshop on Reading Music Systems, pages 43-45, Online, 2024. [ bib | DOI | http ] |
[AlfaroContreras2023] | María Alfaro-Contreras. Few-Shot Music Symbol Classification via Self-Supervised Learning and Nearest Neighbor. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 39-43, Milan, Italy, 2023. [ bib | DOI | http ] |
[Castellanos2023] | Francisco J. Castellanos, Antonio Javier Gallego, and Ichiro Fujinaga. A Preliminary Study of Few-shot Learning for Layout Analysis of Music Scores. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 44-48, Milan, Italy, 2023. [ bib | DOI | http ] |
[Fujinaga2023] | Ichiro Fujinaga and Gabriel Vigliensoni. Optical Music Recognition Workflow for Medieval Music Manuscripts. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 4-6, Milan, Italy, 2023. [ bib | DOI | http ] |
[Hajic2023] | Jan Hajič jr., Petr Žabička, Jan Rychtář, Jiří Mayer, Martina Dvořáková, Filip Jebavý, Markéta Vlková, and Pavel Pecina. The OmniOMR Project. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 12-14, Milan, Italy, 2023. [ bib | DOI | http ] |
[Hande2023] | Pranjali Hande, Elona Shatri, Benjamin Timms, and György Fazekas. Towards Artificially Generated Handwritten Sheet Music Datasets. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 25-30, Milan, Italy, 2023. [ bib | DOI | http ] |
[Havelka2023] | Jonáš Havelka, Jiří Mayer, and Pavel Pecina. Symbol Generation via Autoencoders for Handwritten Music Synthesis. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 20-24, Milan, Italy, 2023. [ bib | DOI | http ] |
[MartinezSevilla2023] | Juan Carlos Martinez-Sevilla and Francisco J. Castellanos. Towards Music Notation and Lyrics Alignment: Gregorian Chants as Case Study. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 15-19, Milan, Italy, 2023. [ bib | DOI | http ] |
[Repolusk2023] | Tristan Repolusk and Eduardo Veas. The Suzipu Musical Annotation Tool for the Creation of Machine-Readable Datasets of Ancient Chinese Music. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 7-11, Milan, Italy, 2023. [ bib | DOI | http ] |
[RiosVila2023] | Antonio Ríos-Vila. Rotations Are All You Need: A Generic Method For End-To-End Optical Music Recognition. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 34-38, Milan, Italy, 2023. [ bib | DOI | http ] |
[Zhang2023] | Zihui Zhang, Elona Shatri, and György Fazekas. Improving Sheet Music Recognition using Data Augmentation and Image Enhancement. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 5th International Workshop on Reading Music Systems, pages 31-33, Milan, Italy, 2023. [ bib | DOI | http ] |
[Egozy2022] | Eran Egozy and Ian Clester. Computer-Assisted Measure Detection in a Music Score-Following Application. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 33-36, Online, 2022. [ bib | DOI | http ] |
[GarridoMunoz2022] | Carlos Garrido-Muñoz, Antonio Ríos-Vila, and Jorge Calvo-Zaragoza. End-to-End Graph Prediction for Optical Music Recognition. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 25-28, Online, 2022. [ bib | DOI | http ] |
[Jacquemard2022] | Florent Jacquemard, Lydia Rodriguez-de la Nava, and Martin Digard. Automated Transcription of Electronic Drumkits. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 37-41, Online, 2022. [ bib | DOI | http ] |
[Mayer2022] | Jiří Mayer and Pavel Pecina. Obstacles with Synthesizing Training Data for OMR. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 15-19, Online, 2022. [ bib | DOI | http ] |
[Moss2022] | Fabian C. Moss, Néstor Nápoles López, Maik Köster, and David Rizo. Challenging sources: a new dataset for OMR of diverse 19th-century music theory examples. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 4-8, Online, 2022. [ bib | DOI | http ] |
[Penarrubia2022] | Carlos Penarrubia, Carlos Garrido-Muñoz, Jose J. Valero-Mas, and Jorge Calvo-Zaragoza. Efficient Approaches for Notation Assembly in Optical Music Recognition. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 29-32, Online, 2022. [ bib | DOI | http ] |
[RiosVila2022] | Antonio Ríos-Vila, Jose M. Iñesta, and Jorge Calvo-Zaragoza. End-To-End Full-Page Optical Music Recognition of Monophonic Documents via Score Unfolding. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 20-24, Online, 2022. [ bib | DOI | http ] |
[Torras2022] | Pau Torras, Arnau Baró, Lei Kang, and Alicia Fornés. Improving Handwritten Music Recognition through Language Model Integration. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, Online, 2022. [ bib | DOI | http ] |
[Walwadkar2022] | Dnyanesh Walwadkar, Elona Shatri, Benjamin Timms, and György Fazekas. CompIdNet: Sheet Music Composer Identification using Deep Neural Network. In Jorge Calvo-Zaragoza, Alexander Pacha, and Elona Shatri, editors, Proceedings of the 4th International Workshop on Reading Music Systems, pages 9-14, Online, 2022. [ bib | DOI | http ] |
[AlfaroContreras2021] | María Alfaro-Contreras, Jose J. Valero-Mas, and José Manuel Iñesta. Neural architectures for exploiting the components of Agnostic Notation in Optical Music Recognition. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 33-37, Alicante, Spain, 2021. [ bib | http ] |
[Baro2021] | Arnau Baró, Carles Badal, Pau Torras, and Alicia Fornés. Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 55-59, Alicante, Spain, 2021. [ bib | http ] |
[Castellanos2021] | Francisco J. Castellanos and Antonio-Javier Gallego. Unsupervised Neural Document Analysis for Music Score Images. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 50-54, Alicante, Spain, 2021. [ bib | http ] |
[Fuente2021] | Carlos de la Fuente, Jose J. Valero-Mas, Francisco J. Castellanos, and Jorge Calvo-Zaragoza. Multimodal Audio and Image Music Transcription. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 18-22, Alicante, Spain, 2021. [ bib | http ] |
[Kletz2021] | Marc Kletz and Alexander Pacha. Detecting Staves and Measures in Music Scores with Deep Learning. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 8-12, Alicante, Spain, 2021. [ bib | http ] |
[MasCandela2021] | Enrique Mas-Candela and María Alfaro-Contreras. Sequential Next-Symbol Prediction for Optical Music Recognition. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 13-17, Alicante, Spain, 2021. [ bib | http ] |
[Pacha2021] | Alexander Pacha. The Challenge of Reconstructing Digits in Music Scores. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 4-7, Alicante, Spain, 2021. [ bib | http ] |
[RiosVila2021] | Antonio Ríos-Vila, David Rizo, Jorge Calvo-Zaragoza, and José Manuel Iñesta. Completing Optical Music Recognition with Agnostic Transcription and Machine Translation. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 28-32, Alicante, Spain, 2021. [ bib | http ] |
[Samiotis2021] | Ioannis Petros Samiotis, Christoph Lofi, and Alessandro Bozzon. Hybrid Annotation Systems for Music Transcription. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 23-27, Alicante, Spain, 2021. [ bib | http ] |
[Shatri2021] | Elona Shatri and György Fazekas. DoReMi: First glance at a universal OMR dataset. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 43-49, Alicante, Spain, 2021. [ bib | http ] |
[Wenzlitschke2021] | Nils Wenzlitschke. Implementation and evaluation of a neural network for the recognition of handwritten melodies. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, Proceedings of the 3rd International Workshop on Reading Music Systems, pages 38-42, Alicante, Spain, 2021. [ bib | http ] |
[AlfaroContreras2020] | María Alfaro-Contreras, Jorge Calvo-Zaragoza, and José M. Iñesta. Reconocimiento holístico de partituras musicales. Technical report, Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Spain, 2020. [ bib | .pdf ] |
[Calvo-Zaragoza2020] |
Jorge Calvo-Zaragoza, Jan Hajič Jr., and Alexander Pacha.
Understanding Optical Music Recognition.
ACM Comput. Surv., 53 (4), 2020.
ISSN 0360-0300.
[ bib |
DOI |
http ]
For over 50 years, researchers have been trying to teach computers to read music notation, referred to as Optical Music Recognition (OMR). However, this field is still difficult to access for new researchers, especially those without a significant musical background: Few introductory materials are available, and, furthermore, the field has struggled with defining itself and building a shared terminology. In this work, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, with most notably a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.
|
[Castellanos2020] |
Francisco J. Castellanos, Antonio-Javier Gallego, and Jorge Calvo-Zaragoza.
Automatic scale estimation for music score images.
Expert Systems with Applications, page 113590, 2020.
ISSN 0957-4174.
[ bib |
DOI |
http ]
Optical Music Recognition (OMR) is the research field focused on the automatic reading of music from scanned images. Its main goal is to encode the content into a digital and structured format with the advantages that this entails. This discipline is traditionally aligned to a workflow whose first step is the document analysis. This step is responsible for recognizing and detecting different sources of information (e.g. music notes, staff lines and text), extracting them, and then automatically processing the content in the following steps of the workflow. One of the most difficult challenges it faces is to provide a generic solution to analyze documents with diverse resolutions. The endless number of existing music sources does not meet a standard that normalizes the data collections, giving complete freedom for a wide variety of image sizes and scales, thereby making this operation unsustainable. In the literature, this question is commonly overlooked and a uniform scale is assumed. In this paper, a machine learning-based approach to estimate the scale of music documents with respect to a reference scale is presented. Our goal is to propose a robust and generalizable method to adapt the input image to the requirements of an OMR system. For this, two goal-directed case studies are included to evaluate the proposed approach over common tasks within the OMR workflow, comparing the behavior with other state-of-the-art methods. Results suggest that it is necessary to perform this additional step in the first stage of the workflow to correct the scale of the input images. In addition, it is empirically demonstrated that our specialized approach is more promising than image augmentation strategies for the multi-scale challenge.
|
[Elezi2020] | Ismail Elezi. Exploiting Contextual Information with Deep Neural Networks. Master's thesis, Ca' Foscari, University of Venice, 2020. [ bib | .pdf ] |
[Henkel2020] | Florian Henkel, Rainer Kelz, and Gerhard Widmer. Learning to Read and Follow Music in Complete Score Sheet Images. In Proceedings of the 21st Int. Society for Music Information Retrieval Conf., 2020. [ bib | .html ] |
[Mico2020] |
Luisa Micó, Jose Oncina, and José M. Iñesta.
Adaptively Learning to Recognize Symbols in Handwritten Early Music.
In Peggy Cellier and Kurt Driessens, editors, Machine Learning
and Knowledge Discovery in Databases, pages 470-477, Cham, 2020. Springer
International Publishing.
ISBN 978-3-030-43887-6.
[ bib |
DOI ]
Human supervision is necessary for a correct edition and publication of handwritten early music collections. The output of an optical music recognition system for that kind of documents may contain a significant number of errors, making it tedious to correct for a human expert. An adequate strategy is needed to optimize the human feedback information during the correction stage to adapt the classifier to the specificities of each manuscript. In this paper, we compare the performance of a neural system, difficult and slow to be retrained, and a nearest neighbor strategy, based on the neural codes provided by a neural net, trained offline, used as a feature extractor.
|
[MuNG] | Alexander Pacha and Jan Hajič jr. The Music Notation Graph (MuNG) Repository. https://github.com/OMR-Research/mung, 2020. [ bib | http ] |
[Tardon2020] | Lorenzo J. Tardón, Isabel Barbancho, Ana M. Barbancho, and Ichiro Fujinaga. Automatic Staff Reconstruction within SIMSSA Project. Applied Sciences, 10 (7): 2468-2484, 2020. [ bib | DOI | http ] |
[Tsai2020] | Timothy J. Tsai, Daniel Yang, Mengyi Shan, Thitaree Tanprasert, and Teerapat Jenrungrot. Using Cell Phone Pictures of Sheet Music To Retrieve MIDI Passages. IEEE Transactions on Multimedia, pages 1-13, 2020. [ bib | DOI | http ] |
[Tuggener2020] |
Lukas Tuggener, Yvan Putra Satyawan, Alexander Pacha, Jürgen Schmidhuber,
and Thilo Stadelmann.
The DeepScoresV2 Dataset and Benchmark for Music Object Detection.
In Proceedings of the 25th International Conference on Pattern
Recognition, Milan, Italy, 2020.
[ bib |
DOI ]
In this paper, we present DeepScoresV2, an extended version of the DeepScores dataset for optical music recognition (OMR). We improve upon the original DeepScores dataset by providing much more detailed annotations, namely (a) annotations for 135 classes including fundamental symbols of non-fixed size and shape, increasing the number of annotated symbols by 23%; (b) oriented bounding boxes; (c) higher-level rhythm and pitch information (onset beat for all symbols and line position for noteheads); and (d) a compatibility mode for easy use in conjunction with the MUSCIMA++ dataset for OMR on handwritten documents. These additions open up the potential for future advancement in OMR research. Additionally, we release two state-of-the-art baselines for DeepScoresV2 based on Faster R-CNN and the Deep Watershed Detector. An analysis of the baselines shows that regular orthogonal bounding boxes are unsuitable for objects which are long, small, and potentially rotated, such as ties and beams, which demonstrates the need for detection algorithms that naturally incorporate object angles.
|
[Wick2020] | Christoph Wick and Frank Puppe. Automatic Neume Transcription of Medieval Music Manuscripts using CNN/LSTM-Networks and the segmentation-free CTC-Algorithm. Technical report, University of Würzburg, 2020. [ bib | DOI ] |
[Miro2019] |
Jordi Burgués Miró.
Recognition of musical symbols in scores using neural networks.
Master's thesis, Universitat Politècnica de Catalunya, Barcelona,
June 2019.
[ bib |
http ]
Object detection is present nowadays in many aspects of our life. From security to entertainment, its applications play a key role in computer vision and image processing worlds. This thesis addresses, through the usage of an object detector, the creation of an application that allows its user to play a music score. The main goal is to display a digital music score and be able to play it by touching on its notes. In order to achieve the proposed system, deep learning techniques based on neural networks are used to detect musical symbols from a digitized score and infer their position along the staff lines. Different models and approaches are considered to tackle the main objective.
|
[Wick2019] | Christoph Wick, Alexander Hartelt, and Frank Puppe. Staff, Symbol, and Melody Detection of Medieval Manuscripts Written in Square Notation Using Deep Fully Convolutional Networks. May 2019a. [ bib | DOI | http ] |
[Baro2019] |
Arnau Baró, Pau Riba, Jorge Calvo-Zaragoza, and Alicia Fornés.
From Optical Music Recognition to Handwritten Music Recognition: A
baseline.
Pattern Recognition Letters, 123: 1-8, 2019.
ISSN 0167-8655.
[ bib |
DOI |
http ]
Optical Music Recognition (OMR) is the branch of document image analysis that aims to convert images of musical scores into a computer-readable format. Despite decades of research, the recognition of handwritten music scores, concretely the Western notation, is still an open problem, and the few existing works only focus on a specific stage of OMR. In this work, we propose a full Handwritten Music Recognition (HMR) system based on Convolutional Recurrent Neural Networks, data augmentation and transfer learning, that can serve as a baseline for the research community.
|
[Calvo-Zaragoza2019] |
Jorge Calvo-Zaragoza, Alejandro H. Toselli, and Enrique Vidal.
Hybrid hidden Markov models and artificial neural networks for
handwritten music recognition in mensural notation.
Pattern Analysis and Applications, Mar 2019b.
ISSN 1433-755X.
[ bib |
DOI ]
In this paper, we present a hybrid approach using hidden Markov models (HMM) and artificial neural networks to deal with the task of handwritten Music Recognition in mensural notation. Previous works have shown that the task can be addressed with Gaussian density HMMs that can be trained and used in an end-to-end manner, that is, without prior segmentation of the symbols. However, the results achieved using that approach are not sufficiently accurate to be useful in practice. In this work, we hybridize HMMs with deep multilayer perceptrons (MLPs), which lead to remarkable improvements in optical symbol modeling. Moreover, this hybrid architecture maintains important advantages of HMMs such as the ability to properly model variable-length symbol sequences through segmentation-free training, and the simplicity and robustness of combining optical models with N-gram language models, which provide statistical a priori information about regularities in musical symbol concatenation observed in the training data. The results obtained with the proposed hybrid MLP-HMM approach outperform previous works by a wide margin, achieving symbol-level error rates around 26%, as compared with about 40% reported in previous works.
|
[Calvo-Zaragoza2019a] | Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha. Understanding Optical Music Recognition. Computing Research Repository, 2019a. [ bib | http ] |
[Calvo-Zaragoza2019b] |
Jorge Calvo-Zaragoza, Alejandro H. Toselli, and Enrique Vidal.
Handwritten Music Recognition for Mensural notation with
convolutional recurrent neural networks.
Pattern Recognition Letters, 128: 115-121, 2019c.
ISSN 0167-8655.
[ bib |
DOI |
http ]
Optical Music Recognition is the technology that allows computers to read music notation, which is also referred to as Handwritten Music Recognition when it is applied over handwritten notation. This technology aims at efficiently transcribing written music into a representation that can be further processed by a computer. This is of special interest to transcribe the large amount of music written in early notations, such as the Mensural notation, since they represent largely unexplored heritage for the musicological community. Traditional approaches to this problem are based on complex strategies with many explicit rules that only work for one particular type of manuscript. Machine learning approaches offer the promise of generalizable solutions, based on learning from just labelled examples. However, previous research has not achieved sufficiently acceptable results for handwritten Mensural notation. In this work we propose the use of deep neural networks, namely convolutional recurrent neural networks, which have proved effective in other similar domains such as handwritten text recognition. Our experimental results achieve, for the first time, recognition results that can be considered effective for transcribing handwritten Mensural notation, decreasing the symbol-level error rate of previous approaches from 25.7% to 7.0%.
|
[Colesnicov2019] |
Alexandru Colesnicov, Svetlana Cojocaru, Mihaela Luca, and Ludmila Malahov.
On Digitization of Documents with Script Presentable Content.
In Proceedings of the Fifth Conference of Mathematical Society
of Moldova, 2019.
[ bib |
.pdf ]
The paper is dedicated to details of the digitization of printed documents that include formalized script presentable content, in connection with the revitalization of the cultural heritage. We discuss the process and the necessary software by an example of music, as the recognition of scores is a solved task.
|
[Eipert2019] | Tim Eipert, Felix Herrman, Christoph Wick, Frank Puppe, and Andreas Haug. Editor Support for Digital Editions of Medieval Monophonic Music. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 4-7, Delft, The Netherlands, 2019. [ bib | http ] |
[Goularas2019] |
Dionysis Goularas and Kürsat Çinar.
Optical Music Recognition of the Hamparsum Notation.
In 2019 Ninth International Conference on Image Processing
Theory, Tools and Applications (IPTA), pages 1-7, Nov 2019.
[ bib |
DOI ]
This paper presents a method for the recognition of music notes from the Hamparsum music notation system. This notation was widely used during the last two centuries of the Ottoman Empire and it is still in use today. The Hamparsum notation presents significant differences compared to the European music notation, in terms of symbols and structure. Moreover, the notes can consist of more than one individual symbol. The proposed recognition method comprises several steps and algorithms, including a feature extraction based on Gabor Filters, recognition of symbols using a Support Vector Machine classifier, a method for assigning recognized symbols to a candidate Hamparsum note and a final recognition system based on template matching. This work will help to popularize this unique cultural heritage by providing Hamparsum scores in a machine-readable format.
|
[Gover2019] | Matan Gover and Ichiro Fujinaga. A Notation-Based Query Language for Searching in Symbolic Music. In 6th International Conference on Digital Libraries for Musicology, DLfM ’19, pages 79-83, New York, NY, USA, 2019. Association for Computing Machinery. ISBN 9781450372398. [ bib | DOI | http ] |
[Hajicjr.2019] |
Jan Hajič jr.
Optical Recognition of Handwritten Music Notation.
PhD thesis, Charles University, Prague, 2019.
[ bib ]
Optical Music Recognition (OMR) is the field of computationally reading music notation. This thesis presents, in the form of dissertation by publication, contributions to the theory, resources, and methods of OMR especially for handwritten notation. The main contributions are (1) the Music Notation Graph (MuNG) formalism for describing arbitrarily complex music notation using an oriented graph that can be unambiguously interpreted in terms of musical semantics, (2) the MUSCIMA++ dataset of musical manuscripts with MuNG as ground truth that can be used to train and evaluate OMR systems and subsystems from the image all the way to extracting the musical semantics encoded therein, and (3) a pipeline for performing OMR on musical manuscripts that relies on machine learning both for notation symbol detection and the notation assembly stage, and on properties of the inferred MuNG representation to deterministically extract the musical semantics. While the OMR pipeline does not perform flawlessly, this is the first OMR system to perform at basic useful tasks over musical semantics extracted from handwritten music notation of arbitrary complexity.
|
[Hakim2019] | Dzikry Maulana Hakim and Ednawati Rainarli. Convolutional Neural Network untuk Pengenalan Citra Notasi Musik. Techno.COM, 18 (3): 214-226, 2019. ISSN 2356-2579. [ bib | DOI | http ] |
[Henkel2019] | Florian Henkel, Rainer Kelz, and Gerhard Widmer. Audio-Conditioned U-Net for Position Estimation in Full Sheet Images. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 8-11, Delft, The Netherlands, 2019. [ bib | http ] |
[Huang2019] |
Zhiquing Huang, Xiang Jia, and Yifan Guo.
State-of-the-Art Model for Music Object Recognition with Deep
Learning.
Applied Sciences, 9 (13): 2645-2665, 2019.
ISSN 2076-3417.
[ bib |
DOI |
http ]
Optical music recognition (OMR) is an area in music information retrieval. Music object detection is a key part of the OMR pipeline. Notes are used to record pitch and duration and have semantic information. Therefore, note recognition is the core and key aspect of music score recognition. This paper proposes an end-to-end detection model based on a deep convolutional neural network and feature fusion. This model is able to directly process the entire image and then output the symbol categories and the pitch and duration of notes. We show a state-of-the-art recognition model for general music symbols which can get 0.92 duration accuracy and 0.96 pitch accuracy.
|
[Inesta2019] | José M. Iñesta, David Rizo, and Jorge Calvo-Zaragoza. MuRET as a software for the transcription of historical archives. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 12-15, Delft, The Netherlands, 2019. [ bib | http ] |
[Ju2019] | Qinjie Ju, René Chalon, and Stéphane Derrode. Assisted Music Score Reading Using Fixed-Gaze Head Movement: Empirical Experiment and Design Implications. Proc. ACM Hum.-Comput. Interact., 3 (EICS): 3:1-3:29, 2019. ISSN 2573-0142. [ bib | DOI | http ] |
[Mateiu2019] |
Tudor Nicolae Mateiu.
Unsupervised Learning for Domain Adaptation in automatic
classification tasks through Neural Networks.
Master's thesis, Universidad de Alicante, 2019.
[ bib |
http ]
Machine Learning systems have improved dramatically in recent years for automatic recognition and artificial intelligence tasks. In general, these systems are based on the use of a large amount of labeled data - also called training sets - in order to learn a model that fits the problem in question. The training set consists of examples of possible inputs to the system and the output expected from them. Obtaining this training set is the main limitation of using Machine Learning systems, since it requires human effort to find and map possible inputs to their corresponding outputs. The situation is often frustrating since systems learn to solve the task for a specific domain - that is, a type of input with relatively homogeneous conditions - and they are not able to generalize to correctly solve the same task in other domains. This project considers the use of Domain Adaptation algorithms, which are capable of learning to adapt a Machine Learning model to work in an unknown domain based on only unlabeled data (unsupervised learning). This facilitates the transfer of systems to new domains because obtaining unlabeled data is relatively cheap; the cost lies in labeling it. To date, Domain Adaptation algorithms have been used in very restricted contexts, so this project aims to make an empirical evaluation of these algorithms in a greater number of cases, as well as propose possible improvements.
|
[Mateiu2019a] |
Tudor N. Mateiu, Antonio-Javier Gallego, and Jorge Calvo-Zaragoza.
Domain Adaptation for Handwritten Symbol Recognition: A Case of Study
in Old Music Manuscripts.
In Aythami Morales, Julian Fierrez, José Salvador Sánchez,
and Bernardete Ribeiro, editors, Pattern Recognition and Image
Analysis, pages 135-146, Cham, 2019. Springer International Publishing.
ISBN 978-3-030-31321-0.
[ bib |
DOI ]
The existence of a large number of untranscribed music manuscripts has prompted initiatives that use Machine Learning (ML) for Optical Music Recognition, in order to efficiently transcribe the music sources into a machine-readable format. Although most music manuscripts are similar in nature, they inevitably vary from one another. This fact can negatively influence the complexity of the classification task because most ML models fail to transfer their knowledge from one domain to another, thereby requiring learning from scratch on new domains after manually labeling new data. This work studies the ability of a Domain Adversarial Neural Network for domain adaptation in the context of classifying handwritten music symbols. The main idea is to exploit the knowledge of a specific manuscript to classify symbols from different (unlabeled) manuscripts. The reported results are promising, obtaining a substantial improvement over a conventional Convolutional Neural Network approach, which can be used as a basis for future research.
|
[Mengarelli2019] |
Luciano Mengarelli, Bruno Kostiuk, João G. Vitório, Maicon A. Tibola,
William Wolff, and Carlos N. Silla.
OMR metrics and evaluation: a systematic review.
Multimedia Tools and Applications, Dec 2019.
ISSN 1573-7721.
[ bib |
DOI ]
Music is rhythm, timbre, tones, intensity and performance. Conventional Western Music Notation (CWMN) is used to generate Music Scores in order to register music on paper. Optical Music Recognition (OMR) studies techniques and algorithms for converting music scores into a readable format for computers. This work presents a systematic literature review (SLR) searching for metrics and methods of evaluation and comparison for OMR systems and algorithms. The most commonly used metrics in OMR works are described. A research protocol is elaborated and executed. From 802 publications found, 94 are evaluated. All results are organized and classified focusing on metrics, stages, comparisons, OMR datasets and related works. Although there is still no standard methodology for evaluating OMR systems, a good number of datasets and metrics are already available and apply to all the stages of OMR. Some of the analyzed works can give good directions for future works.
|
[Metaj2019] | Stiven Metaj and Federico Magnolfi. MNR: MUSCIMA Notes Recognition. Using Faster R-CNN on handwritten music dataset. Research report, Politecnico di Milano, 2019. [ bib | DOI ] |
[Noll2019] | Justus Noll. Intelligentes Notenlesen. c't, 18: 122-126, 2019. [ bib | http ] |
[NunezAlcover2019] |
Alicia Núñez Alcover.
Glyph and Position Classification of Music Symbols in Early
Manuscripts.
Master's thesis, Universidad de Alicante, 2019.
[ bib |
http ]
In this research, we study how to classify handwritten music symbols in early music manuscripts written in white Mensural notation, a common notation system used from the fourteenth century until the Renaissance. The field of Optical Music Recognition researches how to automate the reading of musical scores to transcribe their content into a structured digital format such as MIDI. When dealing with music manuscripts, the traditional workflow establishes two separate stages of detection and classification of musical symbols. In the classification stage, most of the research focuses on detecting musical symbols, without taking into account that a musical note is defined by two components: its glyph and its position with respect to the staff. Our purpose is to design and implement Deep Learning architectures using Convolutional Neural Networks (CNNs), and to evaluate and compare them in order to determine which model provides the best performance in terms of efficiency and precision for use in an interactive scenario.
|
[Nunez-Alcover2019] |
Alicia Núñez-Alcover, Pedro J. Ponce de León, and Jorge
Calvo-Zaragoza.
Glyph and Position Classification of Music Symbols in Early Music
Manuscripts.
In Aythami Morales, Julian Fierrez, José Salvador Sánchez,
and Bernardete Ribeiro, editors, Pattern Recognition and Image
Analysis, pages 159-168, Cham, 2019. Springer International Publishing.
ISBN 978-3-030-31321-0.
[ bib |
DOI ]
Optical Music Recognition is a field of research that automates the reading of musical scores so as to transcribe their content into a structured digital format. When dealing with music manuscripts, the traditional workflow establishes separate stages of detection and classification of musical symbols. In the latter, most of the research has focused on detecting musical glyphs, ignoring that the meaning of a musical symbol is defined by two components: its glyph and its position within the staff. In this paper we study how to perform both glyph and position classification of handwritten musical symbols in early music manuscripts written in white Mensural notation, a common notation system used for the most part of the XVI and XVII centuries. We make use of Convolutional Neural Networks as the classification method, and we tested several alternatives such as using independent models for each component, combining label spaces, or using both multi-input and multi-output models. Our results on early music manuscripts provide insights about the effectiveness and efficiency of each approach.
|
[OmrBibliography] | Alexander Pacha. The definitive bibliography for research on Optical Music Recognition. https://omr-research.github.io, 2019a. [ bib | http ] |
[Pacha2019] |
Alexander Pacha.
Self-Learning Optical Music Recognition.
PhD thesis, TU Wien, 2019b.
[ bib |
.pdf ]
Music is an essential part of our culture and heritage. Throughout the centuries, millions of songs were composed and written down in documents using music notation. Optical Music Recognition (OMR) is the research field that investigates how the computer can learn to read those documents. Despite decades of research, OMR is still considered far from being solved. One reason is that traditional approaches rely heavily on heuristics and often do not generalize well. In this thesis, I propose a different approach to let the computer learn to read music notation documents mostly by itself using machine learning, especially deep learning. In several experiments, I have demonstrated that the computer can learn to robustly solve many tasks involved in OMR by using supervised learning. These include the structural analysis of the document, the detection and classification of symbols in the scores as well as the construction of the music notation graph, which is an intermediate representation that can be exported into a format suitable for further processing. A trained deep convolutional neural network can reliably detect whether an image contains music or not, while another one is capable of finding and linking individual measures across multiple sources for easy navigation between them. Detecting symbols in typeset and handwritten scores can be learned, given a sufficient amount of annotated data, and classifying isolated symbols can be performed at even lower error rates than those of humans. For scores written in mensural notation the complete recognition can even be simplified into just three steps, two of which can be solved with machine learning. Apart from publishing a number of scientific articles, I have gathered and documented the most extensive collection of datasets for OMR as well as probably the most comprehensive bibliography currently available. Both are available online.
Moreover, I was involved in the organization of the International Workshop on Reading Music Systems, in a joint tutorial at the International Society For Music Information Retrieval Conference on OMR as well as in another workshop at the Music Encoding Conference. Many challenges of OMR can be solved efficiently with deep learning, such as the layout analysis or music object detection. As music notation is a configurational writing system where the relations and interplay between symbols determine the musical semantics, these relationships have to be recognized as well. A music notation graph is a suitable representation for storing this information. It allows one to distinguish clearly between the challenges involved in recovering information from the music score image and the encoding of the recovered information into a specific output format while complying with the rules of music notation. While the construction of such a graph can be learned as well, there are still many open issues that need future research. But I am confident that training the computer on a sufficiently large dataset under human supervision is a sustainable approach that will help to solve many applications of OMR in the future.
|
[Pacha2019a] | Alexander Pacha, Jorge Calvo-Zaragoza, and Jan Hajič jr. Learning Notation Graph Construction for Full-Pipeline Optical Music Recognition. In 20th International Society for Music Information Retrieval Conference, pages 75-82, 2019. [ bib | .pdf ] |
[Pacha2019b] | Alexander Pacha. Incremental Supervised Staff Detection. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 16-20, Delft, The Netherlands, 2019c. [ bib | http ] |
[Panadero2019] |
Ivan Santos Panadero.
Alignment of handwritten music scores.
Technical report, Universitat Autònoma de Barcelona, 2019.
[ bib |
.pdf ]
Some musicologists spend their time analyzing musical pieces more than a century old in order to link them to other pre-existing pieces by the same author but written by different hands. It is a tedious task, since many representations of a single piece have been made over time, and the writing variability among those representations can be extensive. The purpose is to build a varied database of these old compositions for study, reproduction and diffusion. This work is divided into two phases. The first consists of detecting the primitive elements present in each measure of a score using the existing transcription of the piece, thus obtaining the desired guided alignment. The second seeks to analyze this alignment. The results obtained are encouraging.
|
[Parada-Cabaleiro2019] |
Emilia Parada-Cabaleiro, Anton Batliner, and Björn Schuller.
A Diplomatic Edition of Il Lauro Secco: Ground Truth for OMR of White
Mensural Notation.
In 20th International Society for Music Information Retrieval
Conference, pages 557-564, Delft, The Netherlands, 2019.
[ bib |
.pdf ]
Early musical sources in white mensural notation, the most common notation in European printed music during the Renaissance, are nowadays preserved by libraries worldwide through digitalisation. Still, the application of music information retrieval to this repertoire is restricted by the use of digitalisation techniques which produce an uncodified output. Optical Music Recognition (OMR) automatically generates a symbolic representation of image-based musical content, thus making this repertoire reachable from the computational point of view; yet, further improvements are often constrained by the limited ground truth available. We address this lacuna by presenting a symbolic representation in original notation of Il Lauro Secco, an anthology of Italian madrigals in white mensural notation. For musicological analytic purposes, we encoded the repertoire in **mens and MEI formats; for OMR ground truth, we automatically codified the repertoire in agnostic and semantic formats, via conversion from the **mens files.
|
[Regimbal2019] | Juliette Regimbal, Zoé McLennan, Gabriel Vigliensoni, Andrew Tran, and Ichiro Fujinaga. Neon2: A Verovio-based square-notation editor. In Music Encoding Conference 2019, Vienna, Austria, 2019. [ bib | .pdf ] |
[Reuse2019] | Timothy de Reuse and Ichiro Fujinaga. Robust Transcript Alignment on Medieval Chant Manuscripts. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 21-26, Delft, The Netherlands, 2019. [ bib | http ] |
[Rios-Vila2019] | Antonio Ríos-Vila, Jorge Calvo-Zaragoza, David Rizo, and José M. Iñesta. ReadSco: An Open-Source Web-Based Optical Music Recognition Tool. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 27-30, Delft, The Netherlands, 2019. [ bib | http ] |
[Thomae2019] | Martha E. Thomae, Julie E. Cumming, and Ichiro Fujinaga. The Mensural Scoring-up Tool. In 6th International Conference on Digital Libraries for Musicology, DLfM ’19, pages 9-19, New York, NY, USA, 2019. Association for Computing Machinery. ISBN 9781450372398. [ bib | DOI | http ] |
[Vigliensoni2019] | Gabriel Vigliensoni, Alex Daigle, Eric Liu, Jorge Calvo-Zaragoza, Juliette Regimbal, Minh Anh Nguyen, Noah Baxter, Zoé McLennan, and Ichiro Fujinaga. From image to encoding: Full optical music recognition of Medieval and Renaissance music. In Music Encoding Conference, 2019. [ bib | .pdf ] |
[Waloschek2019] | Simon Waloschek, Aristotelis Hadjakos, and Alexander Pacha. Identification and Cross-Document Alignment of Measures in Music Score Images. In 20th International Society for Music Information Retrieval Conference, pages 137-143, 2019. [ bib | .pdf ] |
[Wick2019a] |
Christoph Wick, Alexander Hartelt, and Frank Puppe.
Staff, Symbol and Melody Detection of Medieval Manuscripts Written in
Square Notation Using Deep Fully Convolutional Networks.
Applied Sciences, 9 (13): 2646-2673, 2019b.
ISSN 2076-3417.
[ bib |
DOI |
http ]
Even today, the automatic digitisation of scanned documents in general, but especially the automatic optical music recognition (OMR) of historical manuscripts, still remains an enormous challenge, since both handwritten musical symbols and text have to be identified. This paper focuses on the Medieval so-called square notation developed in the 11th–12th century, which is already composed of staff lines, staves, clefs, accidentals, and neumes that are, roughly speaking, connected single notes. The aim is to develop an algorithm that captures the neumes and, in particular, their melody, which can be used to reconstruct the original writing. Our pipeline is similar to the standard OMR approach and comprises a novel staff line and symbol detection algorithm based on deep Fully Convolutional Networks (FCN), which perform pixel-based predictions for either staff lines or symbols and their respective types. Then, the staff line detection combines the extracted lines into staves and yields an F1-score of over 99% for both detecting lines and complete staves. For the music symbol detection, we choose a novel approach that skips the step of identifying neumes and instead directly predicts note components (NCs) and their respective affiliation to a neume. Furthermore, the algorithm detects clefs and accidentals. Our algorithm predicts the symbol sequence of a staff with a diplomatic symbol accuracy rate (dSAR) of about 87%, which includes symbol type and location. If only the NCs without their respective connection to a neume, all clefs, and accidentals are of interest, the algorithm reaches a harmonic symbol accuracy rate (hSAR) of approximately 90%. In general, the algorithm recognises a symbol in the manuscript with an F1-score of over 96%.
|
[Wick2019b] | Christoph Wick and Frank Puppe. OMMR4all - a Semiautomatic Online Editor for Medieval Music Notations. In Jorge Calvo-Zaragoza and Alexander Pacha, editors, 2nd International Workshop on Reading Music Systems, pages 31-34, Delft, The Netherlands, 2019. [ bib | http ] |
[Xiao2019] |
Zhe Xiao, Xin Chen, and Li Zhou.
Real-Time Optical Music Recognition System for Dulcimer Musical
Robot.
Journal of Advanced Computational Intelligence and Intelligent
Informatics, 23 (4): 782-790, 2019.
[ bib |
DOI ]
Traditional optical music recognition (OMR) is an important technology that automatically recognizes scanned paper music sheets. In this study, traditional OMR is combined with robotics, and a real-time OMR system for a dulcimer musical robot is proposed. This system gives the musical robot a stronger ability to perceive and understand music. The proposed OMR system can read music scores, and the recognized information is converted into a standard electronic music file for the dulcimer musical robot, thus achieving real-time performance. During the recognition steps, we treat note groups and isolated notes separately. Specially structured note groups are identified by primitive decomposition and structural analysis. The note groups are decomposed into three fundamental elements: note stem, note head, and note beams. Isolated music symbols are recognized based on shape model descriptors. We conduct tests on real pictures taken live by a camera. The tests show that the proposed method has a higher recognition rate.
|
[Zalkow2019] | Frank Zalkow, Angel Villar Corrales, TJ Tsai, Vlora Arifi-Müller, and Meinard Müller. Tools For Semi-Automatic Bounding Box Annotation Of Musical Measures In Sheet Music. In Late Breaking/Demo at 20th International Society for Music Information Retrieval, Delft, The Netherlands, 2019. [ bib ] |
[Achankunju2018] | Sanu Pulimootil Achankunju. Music Search Engine from Noisy OMR Data. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 23-24, Paris, France, 2018. [ bib | http ] |
[Balke2018] |
Stefan Balke, Christian Dittmar, Jakob Abeßer, Klaus Frieler, Martin
Pfleiderer, and Meinard Müller.
Bridging the Gap: Enriching YouTube Videos with Jazz Music
Annotations.
Frontiers in Digital Humanities, 5: 1-11, 2018.
ISSN 2297-2668.
[ bib |
DOI ]
Web services allow permanent access to music from all over the world. Especially in the case of web services with user-supplied content, e.g., YouTube(TM), the available metadata is often incomplete or erroneous. On the other hand, a vast amount of high-quality and musically relevant metadata has been annotated in research areas such as Music Information Retrieval (MIR). Although they have great potential, these musical annotations are often inaccessible to users outside the academic world. With our contribution, we want to bridge this gap by enriching publicly available multimedia content with musical annotations available in research corpora, while maintaining easy access to the underlying data. Our web-based tools offer researchers and music lovers novel possibilities to interact with and navigate through the content. In this paper, we consider a research corpus called the Weimar Jazz Database (WJD) as an illustrating example scenario. The WJD contains various annotations related to famous jazz solos. First, we establish a link between the WJD annotations and corresponding YouTube videos employing existing retrieval techniques. With these techniques, we were able to identify 988 corresponding YouTube videos for 329 solos out of 456 solos contained in the WJD. We then embed the retrieved videos in a recently developed web-based platform and enrich the videos with solo transcriptions that are part of the WJD. Furthermore, we integrate publicly available data resources from the Semantic Web in order to extend the presented information, for example, with a detailed discography or artist-related information. Our contribution illustrates the potential of modern web-based technologies for the digital humanities, and novel ways for improving access and interaction with digitized multimedia content.
|
[Baro2018] | Arnau Baró, Pau Riba, and Alicia Fornés. A Starting Point for Handwritten Music Recognition. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 5-6, Paris, France, 2018. [ bib | http ] |
[Bonnici2018] | Alexandra Bonnici, Julian Abela, Nicholas Zammit, and George Azzopardi. Automatic Ornament Localisation, Recognition and Expression from Music Sheets. In ACM Symposium on Document Engineering, pages 25:1-25:11, Halifax, NS, Canada, 2018. ACM. ISBN 978-1-4503-5769-2. [ bib | DOI | http ] |
[Calvo-Zaragoza2018] |
Jorge Calvo-Zaragoza and David Rizo.
End-to-End Neural Optical Music Recognition of Monophonic Scores.
Applied Sciences, 8 (4), 2018a.
ISSN 2076-3417.
[ bib |
DOI |
http ]
Optical Music Recognition is a field of research that investigates how to computationally decode music notation from images. Despite the efforts made so far, there are hardly any complete solutions to the problem. In this work, we study the use of neural networks that work in an end-to-end manner. This is achieved by using a neural model that combines the capabilities of convolutional neural networks, which work on the input image, and recurrent neural networks, which deal with the sequential nature of the problem. Thanks to the use of the so-called Connectionist Temporal Classification loss function, these models can be directly trained from input images accompanied by their corresponding transcripts into music symbol sequences. We also present the Printed Music Scores dataset, containing more than 80,000 monodic single-staff real scores in common western notation, that is used to train and evaluate the neural approach. In our experiments, it is demonstrated that this formulation can be carried out successfully. Additionally, we study several considerations about the codification of the output musical sequences, the convergence and scalability of the neural models, as well as the ability of this approach to locate symbols in the input score.
|
[Calvo-Zaragoza2018a] |
Jorge Calvo-Zaragoza, Francisco J. Castellanos, Gabriel Vigliensoni, and Ichiro
Fujinaga.
Deep Neural Networks for Document Processing of Music Score Images.
Applied Sciences, 8 (5), 2018a.
ISSN 2076-3417.
[ bib |
DOI |
http ]
There is an increasing interest in the automatic digitization of medieval music documents. Despite efforts in this field, the detection of the different layers of information on these documents still poses difficulties. The use of Deep Neural Networks techniques has reported outstanding results in many areas related to computer vision. Consequently, in this paper, we study the so-called Convolutional Neural Networks (CNN) for performing the automatic document processing of music score images. This process is focused on layering the image into its constituent parts (namely, background, staff lines, music notes, and text) by training a classifier with examples of these parts. A comprehensive experimentation in terms of the configuration of the networks was carried out, which illustrates interesting results with regard to both the efficiency and effectiveness of these models. In addition, a cross-manuscript adaptation experiment was presented in which the networks are evaluated on a different manuscript from the one on which they were trained. The results suggest that the CNN is capable of adapting its knowledge, and so starting from a pre-trained CNN reduces (or eliminates) the need for new labeled data.
|
[Calvo-Zaragoza2018b] | Jorge Calvo-Zaragoza and David Rizo. Camera-PrIMuS: Neural End-to-End Optical Music Recognition on Realistic Monophonic Scores. In 19th International Society for Music Information Retrieval Conference, pages 248-255, Paris, France, 2018b. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Calvo-Zaragoza2018c] | Jorge Calvo-Zaragoza. Why WoRMS? In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 7-8, Paris, France, 2018. [ bib | http ] |
[Calvo-Zaragoza2018d] |
Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha.
Discussion Group Summary: Optical Music Recognition.
In Alicia Fornés and Bart Lamiroy, editors, Graphics
Recognition, Current Trends and Evolutions, Lecture Notes in Computer
Science, pages 152-157. Springer International Publishing, 2018b.
ISBN 978-3-030-02283-9.
[ bib |
DOI ]
This document summarizes the discussion of the interest group on Optical Music Recognition (OMR) that took place in the 12th IAPR International Workshop on Graphics Recognition, and presents the main conclusions drawn during the session: OMR should revisit how it describes itself, and the OMR community should intensify its collaboration both internally and with other stakeholders.
|
[Calvo-Zaragoza2018e] |
Jorge Calvo-Zaragoza, Alejandro H. Toselli, and Enrique Vidal.
Probabilistic Music-Symbol Spotting in Handwritten Scores.
In 16th International Conference on Frontiers in Handwriting
Recognition, pages 558-563, Niagara Falls, USA, 2018d.
[ bib |
DOI ]
Content-based search on musical manuscripts is usually performed assuming that there are accurate transcripts of the sources in a symbolic, structured format. Given that current systems for Handwritten Music Recognition are far from offering guarantees about their accuracy, this traditional approach does not represent a scalable scenario. In this work we propose a probabilistic framework for Music-Symbol Spotting (MSS) that allows for content-based music search directly over the images of the manuscripts. By means of statistical recognition systems, a probabilistic index is built upon which the search can be carried out efficiently. Our experiments over a dataset of an early handwritten music manuscript in Mensural notation demonstrate that this MSS framework can be presented as a promising alternative to the traditional approach for content-based music search.
|
[Castellanos2018] | Francisco J. Castellanos, Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga. Document Analysis of Music Score Images with Selectional Auto-Encoders. In 19th International Society for Music Information Retrieval Conference, pages 256-263, Paris, France, 2018. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Chen2018] | Liang Chen and Christopher Raphael. Optical Music Recognition and Human-in-the-loop Computation. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 11-12, Paris, France, 2018. [ bib | http ] |
[Choi2018] | Kwon-Young Choi, Bertrand Coüasnon, Yann Ricquebourg, and Richard Zanibbi. Music Symbol Detection with Faster R-CNN Using Synthetic Annotations. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 9-10, Paris, France, 2018. [ bib | http ] |
[Crawford2018] | Tim Crawford, Golnaz Badkobeh, and David Lewis. Searching Page-Images of Early Music Scanned with OMR: A Scalable Solution Using Minimal Absent Words. In 19th International Society for Music Information Retrieval Conference, pages 233-239, Paris, France, 2018. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Diet2018] | Jürgen Diet. Optical Music Recognition in der Bayerischen Staatsbibliothek. BIBLIOTHEK - Forschung und Praxis, 2018a. [ bib | DOI ] |
[Diet2018a] |
Jürgen Diet.
Innovative MIR Applications at the Bayerische Staatsbibliothek.
In 5th International Conference on Digital Libraries for
Musicology, Paris, France, 2018b.
[ bib |
.pdf ]
This short position paper gives an insight into the digitization of music prints in the Bayerische Staatsbibliothek and describes two music information retrieval applications in the Bayerische Staatsbibliothek. One of them is a melody search application based on OMR data that has been generated from 40,000 pages of digitized music prints containing all compositions of L. van Beethoven, G. F. Händel, F. Liszt, and F. Schubert. The other one is the incipit search in the International Inventory of Musical Sources (Répertoire International des Sources Musicales, RISM).
|
[Dorfer2018] | Matthias Dorfer, Jan Hajič jr., Andreas Arzt, Harald Frostel, and Gerhard Widmer. Learning Audio-Sheet Music Correspondences for Cross-Modal Retrieval and Piece Identification. Transactions of the International Society for Music Information Retrieval, 1 (1): 22-33, 2018a. [ bib | DOI ] |
[Dorfer2018a] | Matthias Dorfer, Florian Henkel, and Gerhard Widmer. Learning To Listen, Read And Follow: Score Following As A Reinforcement Learning Game. In 19th International Society for Music Information Retrieval Conference, pages 784-791, Paris, France, 2018b. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Elezi2018] | Ismail Elezi, Lukas Tuggener, Marcello Pelillo, and Thilo Stadelmann. DeepScores and Deep Watershed Detection: current state and open issues. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 13-14, Paris, France, 2018. [ bib | http ] |
[Fornes2018] | Alicia Fornés and Bart Lamiroy, editors. Graphics Recognition, Current Trends and Evolutions, volume 11009 of Lecture Notes in Computer Science, 2018. Springer International Publishing. ISBN 978-3-030-02283-9. [ bib | DOI ] |
[Fujinaga2018] |
Ichiro Fujinaga, Andrew Hankinson, and Laurent Pugin.
Automatic Score Extraction with Optical Music Recognition (OMR).
In Springer Handbook of Systematic Musicology, pages 299-311.
Springer Berlin Heidelberg, Berlin, Heidelberg, 2018.
ISBN 978-3-662-55004-5.
[ bib |
DOI ]
Optical music recognition (OMR) describes the process of automatically transcribing music notation from a digital image. Although similar to optical character recognition (OCR), the process and procedures of OMR diverge due to the fundamental differences between text and music notation, such as the two-dimensional nature of the notation system and the overlay of music symbols on top of staff lines. The OMR process can be described as a sequence of steps, with techniques adapted from disciplines including image processing, machine learning, grammars, and notation encoding. The sequence and specific techniques used can differ depending on the condition of the image, the type of notation, and the desired output.
|
[Gotham2018] | Mark Gotham, Peter Jonas, Bruno Bower, William Bosworth, Daniel Rootham, and Leigh VanHandel. Scores of Scores: An Openscore Project to Encode and Share Sheet Music. In 5th International Conference on Digital Libraries for Musicology, pages 87-95, Paris, France, 2018. ACM. ISBN 978-1-4503-6522-2. [ bib | DOI | http ] |
[Hajicjr.2018] |
Jan Hajič jr., Marta Kolárová, Alexander Pacha, and Jorge
Calvo-Zaragoza.
How Current Optical Music Recognition Systems Are Becoming Useful for
Digital Libraries.
In 5th International Conference on Digital Libraries for
Musicology, pages 57-61, Paris, France, 2018b. ACM.
ISBN 978-1-4503-6522-2.
[ bib |
DOI |
http ]
Optical Music Recognition (OMR) promises to make large collections of sheet music searchable by their musical content. It would open up novel ways of accessing the vast amount of written music that has never been recorded before. For a long time, OMR was not living up to that promise, as its performance was simply not good enough, especially on handwritten music or under non-ideal image conditions. However, OMR has recently seen a number of improvements, mainly due to the advances in machine learning. In this work, we take an OMR system based on the traditional pipeline and an end-to-end system, which represent the current state of the art, and illustrate in proof-of-concept experiments their applicability in retrieval settings. We also provide an example of a musicological study that can be replicated with OMR outputs at much lower costs. Taken together, this indicates that in some settings, current OMR can be used as a general tool for enriching digital libraries.
|
[Hajicjr.2018a] | Jan Hajič jr., Matthias Dorfer, Gerhard Widmer, and Pavel Pecina. Towards Full-Pipeline Handwritten OMR with Musical Symbol Detection by U-Nets. In 19th International Society for Music Information Retrieval Conference, pages 225-232, Paris, France, 2018a. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Hajicjr.2018b] | Jan Hajič jr. A Case for Intrinsic Evaluation of Optical Music Recognition. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 15-16, Paris, France, 2018. [ bib | http ] |
[Hemmatifar2018] | Ali Hemmatifar and Ashish Krishna. DeepPiano: A Deep Learning Approach to Translate Music Notation to English Alphabet. Technical report, Stanford University, 2018. [ bib | .pdf ] |
[Inesta2018] | José Manuel Iñesta, Pedro J. Ponce de León, David Rizo, José Oncina, Luisa Micó, Juan Ramón Rico-Juan, Carlos Pérez-Sancho, and Antonio Pertusa. HISPAMUS: Handwritten Spanish Music Heritage Preservation by Automatic Transcription. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 17-18, Paris, France, 2018. [ bib | http ] |
[Konwer2018] |
Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Ankan Kumar Bhunia, Prithaj
Banerjee, Partha Pratim Roy, and Umapada Pal.
Staff line Removal using Generative Adversarial Networks.
In 2018 24th International Conference on Pattern Recognition
(ICPR), pages 1103-1108, Aug 2018.
[ bib |
DOI ]
Staff line removal is a crucial pre-processing step in Optical Music Recognition. In this paper we propose a novel approach for staff line removal, based on Generative Adversarial Networks. We convert staff line images into patches and feed them into a U-Net, used as Generator. The Generator intends to produce staff-less images at the output. Then the Discriminator does binary classification and differentiates between the generated fake staff-less image and real ground truth staff-less image. For training, we use a Loss function which is a weighted combination of L2 loss and Adversarial loss. L2 loss minimizes the difference between real and fake staff-less image. Adversarial loss helps to retrieve more high quality textures in generated images. Thus our architecture supports solutions which are closer to ground truth and it reflects in our results. For evaluation we consider the ICDAR/GREC 2013 staff removal database. Our method achieves superior performance in comparison to other conventional approaches on the same dataset.
|
[Li2018] |
Chuanzhen Li, Jiaqi Zhao, Juanjuan Cai, Hui Wang, and Huaichang Du.
Optical Music Notes Recognition for Printed Music Score.
In 11th International Symposium on Computational Intelligence
and Design (ISCID), volume 01, pages 285-288, Dec 2018.
[ bib |
DOI ]
To convert printed music scores into a machine-readable format, a system that can automatically decode the symbolic image and play the music is proposed. The system takes a music score image as input, segments music symbols after preprocessing the image, then recognizes their pitch and duration. Finally, MIDI files are generated. The experiments on the Rebelo Database show that the proposed method obtains superior recognition accuracy against other methods.
|
[McLeod2018] | Andrew McLeod and Mark Steedman. Evaluating Automatic Polyphonic Music Transcription. In 19th International Society for Music Information Retrieval Conference, pages 42-49, Paris, France, 2018. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Mico2018] | Luisa Micó, José Manuel Iñesta, and David Rizo. Incremental Learning for Recognition of Handwritten Mensural Notation. In 11th International Workshop on Machine Learning and Music, 2018. [ bib | http ] |
[Moonlight] | Dan Ringwalt. Moonlight. https://github.com/ringw/moonlight, 2018. [ bib | http ] |
[Napoles2018] | Néstor Nápoles, Gabriel Vigliensoni, and Ichiro Fujinaga. Encoding Matters. In 5th International Conference on Digital Libraries for Musicology, pages 69-73, Paris, France, 2018. ACM. ISBN 978-1-4503-6522-2. [ bib | DOI | http ] |
[Niitsuma2018] |
Masahiro Niitsuma, Yo Tomita, Wei Qi Yan, and David Bell.
Towards Musicologist-Driven Mining of Handwritten Scores.
IEEE Intelligent Systems, 33 (4): 24-34, 2018.
ISSN 1541-1672.
[ bib |
DOI ]
Historical musicologists have been seeking objective and powerful techniques to collect, analyse and verify their findings for many decades. The aim of this study is to propose a musicologist-driven mining method for extracting quantitative information from early music manuscripts. Our focus is on finding evidence for the chronological ordering of J.S. Bach's manuscripts. Bach's C-clefs were extracted from a wide range of manuscripts under the direction of domain experts, and with these the classification of C-clefs was conducted. The proposed methods were evaluated on a dataset containing over 1000 clefs extracted from J.S. Bach's manuscripts. The results show more than 70% accuracy for dating J.S. Bach's manuscripts, providing a rough barometer to be combined with other evidence to evaluate musicologists' hypotheses, and the practicability of this domain-driven approach is demonstrated.
|
[OmrDatasetTools] | Alexander Pacha. Documentation of the OMR Dataset Tools Python package. https://omr-datasets.readthedocs.io/en/latest, 2018a. [ bib | http ] |
[OmrTutorialOnYoutube] | Jorge Calvo-Zaragoza, Jan Hajič jr., Alexander Pacha, and Ichiro Fujinaga. The recording of the ISMIR Tutorial "OMR for Dummies" on YouTube. https://www.youtube.com/playlist?list=PL1jvwDVNwQke-04UxzlzY4FM33bo1CGS0, 2018c. [ bib | http ] |
[Pacha2018] |
Alexander Pacha, Kwon-Young Choi, Bertrand Coüasnon, Yann Ricquebourg,
Richard Zanibbi, and Horst Eidenberger.
Handwritten Music Object Detection: Open Issues and Baseline Results.
In 13th International Workshop on Document Analysis Systems,
pages 163-168, 2018a.
[ bib |
DOI ]
Optical Music Recognition (OMR) is the challenge of understanding the content of musical scores. Accurate detection of individual music objects is a critical step in processing musical documents because a failure at this stage corrupts any further processing. So far, all proposed methods were either limited to typeset music scores or were built to detect only a subset of the available classes of music symbols. In this work, we propose an end-to-end trainable object detector for music symbols that is capable of detecting almost the full vocabulary of modern music notation in handwritten music scores. By training deep convolutional neural networks on the recently released MUSCIMA++ dataset which has symbol-level annotations, we show that a machine learning approach can be used to accurately detect music objects with a mean average precision of over 80%.
|
[Pacha2018a] | Alexander Pacha. Self-learning Optical Music Recognition. In Philipp Hans, Gerald Artner, Johanna Grames, Heinz Krebs, Hamid Reza Mansouri Khosravi, and Taraneh Rouhi, editors, Vienna Young Scientists Symposium, pages 34-35. TU Wien, Book-of-Abstracts.com, Heinz A. Krebs, 2018b. ISBN 978-3-9504017-8-3. [ bib | http ] |
[Pacha2018b] |
Alexander Pacha and Jorge Calvo-Zaragoza.
Optical Music Recognition in Mensural Notation with Region-Based
Convolutional Neural Networks.
In 19th International Society for Music Information Retrieval
Conference, pages 240-247, Paris, France, 2018.
ISBN 978-2-9540351-2-3.
[ bib |
.pdf ]
In this work, we present an approach for the task of optical music recognition (OMR) using deep neural networks. Our intention is to simultaneously detect and categorize musical symbols in handwritten scores, written in mensural notation. We propose the use of region-based convolutional neural networks, which are trained in an end-to-end fashion for that purpose. Additionally, we make use of a convolutional neural network that predicts the relative position of a detected symbol within the staff, so that we cover the entire image-processing part of the OMR pipeline. This strategy is evaluated over a set of 60 ancient scores in mensural notation, with more than 15000 annotated symbols belonging to 32 different classes. The results reflect the feasibility and capability of this approach, with a weighted mean average precision of around 76% for symbol detection, and over 98% accuracy for predicting the position.
|
[Pacha2018c] |
Alexander Pacha, Jan Hajič jr., and Jorge Calvo-Zaragoza.
A Baseline for General Music Object Detection with Deep Learning.
Applied Sciences, 8 (9): 1488-1508, 2018b.
ISSN 2076-3417.
[ bib |
DOI |
http ]
Deep learning is bringing breakthroughs to many computer vision subfields including Optical Music Recognition (OMR), which has seen a series of improvements to musical symbol detection achieved by using generic deep learning models. However, so far, each such proposal has been based on a specific dataset and different evaluation criteria, which made it difficult to quantify the new deep learning-based state-of-the-art and assess the relative merits of these detection models on music scores. In this paper, a baseline for general detection of musical symbols with deep learning is presented. We consider three datasets of heterogeneous typology but with the same annotation format, three neural models of different nature, and establish their performance in terms of a common evaluation standard. The experimental results confirm that the direct music object detection with deep learning is indeed promising, but at the same time illustrates some of the domain-specific shortcomings of the general detectors. A qualitative comparison then suggests avenues for OMR improvement, based both on properties of the detection model and how the datasets are defined. To the best of our knowledge, this is the first time that competing music object detection systems from the machine learning paradigm are directly compared to each other. We hope that this work will serve as a reference to measure the progress of future developments of OMR in music object detection.
|
[Pacha2018d] | Alexander Pacha. Advancing OMR as a Community: Best Practices for Reproducible Research. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 19-20, Paris, France, 2018c. [ bib | http ] |
[Paeaekkoenen2018] | Tuula Pääkkönen, Jukka Kervinen, and Kimmo Kettunen. Digitisation and Digital Library Presentation System - Sheet Music to the Mix. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 21-22, Paris, France, 2018. [ bib | http ] |
[PhotoScore] | Neuratron. PhotoScore 2018. http://www.neuratron.com/photoscore.htm, 2018. [ bib | http ] |
[Rizo2018] | David Rizo, Jorge Calvo-Zaragoza, and José M. Iñesta. MuRET: A Music Recognition, Encoding, and Transcription Tool. In 5th International Conference on Digital Libraries for Musicology, pages 52-56, Paris, France, 2018. ACM. ISBN 978-1-4503-6522-2. [ bib | DOI | http ] |
[Roggenkemper2018] | Heinz Roggenkemper and Ryan Roggenkemper. How can Machine Learning make Optical Music Recognition more relevant for practicing musicians? In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 25-26, Paris, France, 2018. [ bib | http ] |
[Sotoodeh2017] |
Mahmood Sotoodeh, Farshad Tajeripour, Sadegh Teimori, and Kirk Jorgensen.
A music symbols recognition method using pattern matching along with
integrated projection and morphological operation techniques.
Multimedia Tools and Applications, 77 (13): 16833-16866,
2018.
ISSN 1573-7721.
[ bib |
DOI ]
Optical Music Recognition (OMR) can be divided into three main phases: (i) staff line detection and removal. The goal of this phase is to detect and to remove staff lines from sheet music images. (ii) music symbol detection and segmentation. The purpose of this phase is to detect the remaining musical symbols such as single symbols and group symbols, then segment the group symbols to single or primitive symbols after removing staff lines. (iii) musical symbols recognition. In this phase, recognition of musical symbols is the main objective. The method presented in this paper covers all three phases. One advantage of the first phase of the proposed method is that it is robust to staff lines rotation and staff lines which have curvature in sheet music images. Moreover, the staff lines are removed accurately and quickly and also fewer details of the musical symbols are omitted. The proposed method in the first phase focuses on the hand-written documents databases which have been introduced in the CVC-MUSCIMA and ICDAR 2013. It has the lowest error rate among well-known methods and outperforms the state of the art in CVC-MUSCIMA database. In ICDAR 2013, the specificity measure of this method is 99.71% which is the highest specificity among available methods. Also, in terms of accuracy, recall rate, and F-measure, it is only slightly below the best method. Therefore, our method compares favorably with the existing methods. In the second phase, the symbols are divided into two categories, single and group. In the recognition phase, we use a pattern matching method to identify single symbols. For recognizing group symbols, a hierarchical method is proposed. The proposed method in the third phase has several advantages over the previous methods. It is quite robust to skewness of musical group symbols. Furthermore, it provides high accuracy in recognition of the symbols.
|
[Tuggener2018] |
Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, Marcello Pelillo, and
Thilo Stadelmann.
DeepScores - A Dataset for Segmentation, Detection and Classification
of Tiny Objects.
In 24th International Conference on Pattern Recognition,
Beijing, China, 2018a.
[ bib |
DOI |
http ]
We present the DeepScores dataset with the goal of advancing the state-of-the-art in small objects recognition, and by placing the question of object recognition in the context of scene understanding. DeepScores contains high quality images of musical scores, partitioned into 300,000 sheets of written music that contain symbols of different shapes and sizes. With close to a hundred millions of small objects, this makes our dataset not only unique, but also the largest public dataset. DeepScores comes with ground truth for object classification, detection and semantic segmentation. DeepScores thus poses a relevant challenge for computer vision in general, beyond the scope of optical music recognition (OMR) research. We present a detailed statistical analysis of the dataset, comparing it with other computer vision datasets like Caltech101/256, PASCAL VOC, SUN, SVHN, ImageNet, MS-COCO, smaller computer vision datasets, as well as with other OMR datasets. Finally, we provide baseline performances for object classification and give pointers to future research based on this dataset.
|
[Tuggener2018a] | Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, and Thilo Stadelmann. Deep Watershed Detector for Music Object Recognition. In 19th International Society for Music Information Retrieval Conference, pages 271-278, Paris, France, 2018b. ISBN 978-2-9540351-2-3. [ bib | .pdf ] |
[Vigliensoni2018] | Gabriel Vigliensoni, Jorge Calvo-Zaragoza, and Ichiro Fujinaga. Developing an environment for teaching computers to read music. In Jorge Calvo-Zaragoza, Jan Hajič jr., and Alexander Pacha, editors, 1st International Workshop on Reading Music Systems, pages 27-28, Paris, France, 2018. [ bib | http ] |
[Vo2017] |
Quang Nhat Vo, Guee Sang Lee, Soo Hyung Kim, and Hyung Jeong Yang.
Recognition of Music Scores with Non-Linear Distortions in Mobile
Devices.
Multimedia Tools and Applications, 77 (12): 15951-15969,
2018.
ISSN 1573-7721.
[ bib |
DOI ]
Optical music recognition (OMR), when the input music score is captured by a handheld or a mobile phone camera, suffers from severe degradation in the image quality and distortions caused by non-planar document curvature and perspective projection. Hence the binarization of the input often fails to preserve the details of the original music score, leading to a poor performance in recognition of music symbols. This paper addresses the issue of staff line detection, which is the most important step in OMR, in the presence of nonlinear distortions and describes how to cope with severe degradations in recognition of music symbols. First, a RANSAC-based detection of curved staff lines is presented and staves are segmented into sub-areas for the rectification with bi-quadratic transformation. Then, run length coding is used to recognize music symbols such as stem, note head, flag, and beam. The proposed system is implemented on smart phones, and it shows promising results with music score images captured in the mobile environment.
|
[Yin2018] | Yu Yin, Zhenya Huang, Enhong Chen, Qi Liu, Fuzheng Zhang, Xing Xie, and Guoping Hu. Transcribing Content from Structural Images with Spotlight Mechanism. In 24th International Conference on Knowledge Discovery & Data Mining, pages 2643-2652, London, United Kingdom, 2018. ACM. ISBN 978-1-4503-5552-0. [ bib | DOI | http ] |
[Baro2017] | Arnau Baró, Pau Riba, Jorge Calvo-Zaragoza, and Alicia Fornés. Optical Music Recognition by Recurrent Neural Networks. In 14th International Conference on Document Analysis and Recognition, pages 25-26, Kyoto, Japan, 2017. IEEE. [ bib | DOI ] |
[Baro-Mas2017] | Arnau Baró-Mas. Optical Music Recognition by Long Short-Term Memory Recurrent Neural Networks. Master's thesis, Universitat Autònoma de Barcelona, 2017. [ bib | .pdf ] |
[Bountouridis2017] |
Dimitrios Bountouridis, Frans Wiering, Dan Brown, and Remco C. Veltkamp.
Towards Polyphony Reconstruction Using Multidimensional Multiple
Sequence Alignment.
In João Correia, Vic Ciesielski, and Antonios Liapis, editors,
Computational Intelligence in Music, Sound, Art and Design, pages
33-48, Cham, 2017. Springer International Publishing.
ISBN 978-3-319-55750-2.
[ bib |
DOI ]
The digitization of printed music scores through the process of optical music recognition is imperfect. In polyphonic scores, with two or more simultaneous voices, errors of duration or position can lead to badly aligned and inharmonious digital transcriptions. We adapt biological sequence analysis tools as a post-processing step to correct the alignment of voices. Our multiple sequence alignment approach works on multiple musical dimensions and we investigate the contribution of each dimension to the correct alignment. Structural information, such as musical phrase boundaries, is of major importance; therefore, we propose the use of the popular bioinformatics aligner Mafft which can incorporate such information while being robust to temporal noise. Our experiments show that a harmony-aware Mafft outperforms sophisticated, multidimensional alignment approaches and can achieve near-perfect polyphony reconstruction.
|
[Calvo-Zaragoza2017] |
Jorge Calvo-Zaragoza, Antonio Pertusa, and Jose Oncina.
Staff-line detection and removal using a convolutional neural
network.
Machine Vision and Applications, pages 1-10, 2017b.
ISSN 1432-1769.
[ bib |
DOI ]
Staff-line removal is an important preprocessing stage for most optical music recognition systems. Common procedures to solve this task involve image processing techniques. In contrast to these traditional methods based on hand-engineered transformations, the problem can also be approached as a classification task in which each pixel is labeled as either staff or symbol, so that only those that belong to symbols are kept in the image. In order to perform this classification, we propose the use of convolutional neural networks, which have demonstrated an outstanding performance in image retrieval tasks. The initial features of each pixel consist of a square patch from the input image centered at that pixel. The proposed network is trained by using a dataset which contains pairs of scores with and without the staff lines. Our results in both binary and grayscale images show that the proposed technique is very accurate, outperforming both other classifiers and the state-of-the-art strategies considered. In addition, several advantages of the presented methodology with respect to traditional procedures proposed so far are discussed.
|
[Calvo-Zaragoza2017a] |
Jorge Calvo-Zaragoza, Alejandro H. Toselli, and Enrique Vidal.
Early handwritten music recognition with Hidden Markov Models.
In 15th International Conference on Frontiers in Handwriting
Recognition, pages 319-324. Institute of Electrical and Electronics
Engineers Inc., 2017d.
ISBN 9781509009817.
[ bib |
DOI ]
This work presents a statistical method to tackle the Handwritten Music Recognition task for Early notation, which comprises more than 200 different symbols. Unlike previous approaches to deal with music notation, our strategy is to perform a holistic recognition without any previous segmentation or staff removal process. The input consists of a page of a music book, which is processed to extract and normalize the staves contained. Then, a feature extraction process is applied to define such sections as a sequence of numerical vectors. The recognition is based on the use of Hidden Markov Models for the optical processing and smoothed N-grams as language model. Experimentation results over a historical archive of Hispanic music reported an error around 40% [...] account the difficulty of the task.
|
[Calvo-Zaragoza2017b] |
Jorge Calvo-Zaragoza, Alejandro Toselli, and Enrique Vidal.
Handwritten Music Recognition for Mensural Notation: Formulation,
Data and Baseline Results.
In 14th International Conference on Document Analysis and
Recognition, pages 1081-1086, Kyoto, Japan, 2017c.
[ bib |
DOI ]
Music is a key element for cultural transmission, and so large collections of music manuscripts have been preserved over the centuries. In order to develop computational tools for analysis, indexing and retrieval from these sources, it is necessary to transcribe the content to some machine-readable format. In this paper we discuss the Handwritten Music Recognition problem, which refers to the development of automatic transcription systems for musical manuscripts. We focus on mensural notation, one of the most widespread varieties of Western classical music. For that, we present a labeled corpus containing 576 staves, along with a baseline recognition system based on a combination of hidden Markov models and N-gram language models. The baseline error obtained at symbol level is about 40%, which, given the difficulty of the task, can be considered a good starting point for future developments. Our aim is that these data and preliminary results help to promote this research field, serving as a reference in future developments.
|
[Calvo-Zaragoza2017c] | Jorge Calvo-Zaragoza, Jose J. Valero-Mas, and Antonio Pertusa. End-to-end Optical Music Recognition using Neural Networks. In 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017e. ISBN 978-981-11-5179-8. [ bib | .pdf ] |
[Calvo-Zaragoza2017d] | Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga. One-step detection of background, staff lines, and symbols in medieval music manuscripts with convolutional neural networks. In 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017f. ISBN 978-981-11-5179-8. [ bib | .pdf ] |
[Calvo-Zaragoza2017e] | Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga. A machine learning framework for the categorization of elements in images of musical documents. In 3rd International Conference on Technologies for Music Notation and Representation, A Coruña, Spain, 2017g. University of A Coruña. [ bib | .pdf ] |
[Calvo-Zaragoza2017f] |
Jorge Calvo-Zaragoza, Antonio-Javier Gallego, and Antonio Pertusa.
Recognition of Handwritten Music Symbols with Convolutional Neural
Codes.
In 14th International Conference on Document Analysis and
Recognition, pages 691-696, Kyoto, Japan, 2017a.
[ bib |
DOI ]
There are large collections of music manuscripts preserved over the centuries. In order to analyze these documents it is necessary to transcribe them into a machine-readable format. This process can be done automatically using Optical Music Recognition (OMR) systems, which typically consider segmentation plus classification workflows. This work is focused on the latter stage, presenting a comprehensive study for classification of handwritten musical symbols using Convolutional Neural Networks (CNN). The power of these models lies in their ability to transform the input into a meaningful representation for the task at hand, and that is why we study the use of these models to extract features (Neural Codes) for other classifiers. For the evaluation we consider four datasets containing different configurations and notation styles, along with a number of network models, different image preprocessing techniques and several supervised learning classifiers. Our results show that a remarkable accuracy can be achieved using the proposed framework, which significantly outperforms the state of the art in all datasets considered.
|
[Calvo-Zaragoza2017g] |
Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga.
Pixelwise classification for music document analysis.
In 7th International Conference on Image Processing Theory,
Tools and Applications, pages 1-6, 2017h.
[ bib |
DOI ]
Content within musical documents not only contains music symbols but also includes different elements such as staff lines, text, or frontispieces. Before attempting to automatically recognize components in these layers, it is necessary to perform an analysis of the musical documents in order to detect and classify each of these constituent parts. The obstacle for this analysis is the high heterogeneity amongst music collections, especially with ancient documents, which makes it difficult to devise methods that can be generalizable to a broader range of sources. In this paper we propose a data-driven document analysis framework based on machine learning that focuses on classifying regions of interest at pixel level. For that, we make use of Convolutional Neural Networks trained to infer the category of each pixel. The main advantage of this approach is that it can be applied regardless of the type of document provided, as long as training data is available. Since this work represents first efforts in that direction, our experimentation focuses on reporting a baseline classification using our framework. The experiments show promising performance, achieving an accuracy around 90% in two corpora of old music documents.
|
[Calvo-Zaragoza2017h] | Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga. Pixel-wise binarization of musical documents with convolutional neural networks. In 15th International Conference on Machine Vision Applications, pages 362-365, 2017i. [ bib | DOI ] |
[Calvo-Zaragoza2017i] |
Jorge Calvo-Zaragoza and Jose Oncina.
Recognition of pen-based music notation with finite-state machines.
Expert Systems with Applications, 72: 395-406, 2017.
ISSN 0957-4174.
[ bib |
DOI ]
This work presents a statistical model to recognize pen-based music compositions using stroke recognition algorithms and finite-state machines. The series of strokes received as input is mapped onto a stochastic representation, which is combined with a formal language that describes musical symbols in terms of stroke primitives. Then, a Probabilistic Finite-State Automaton is obtained, which defines probabilities over the set of musical sequences. This model is eventually crossed with a semantic language to avoid sequences that do not make musical sense. Finally, a decoding strategy is applied in order to output a hypothesis about the musical sequence actually written. Comprehensive experimentation with several decoding algorithms, stroke similarity measures and probability density estimators is carried out and evaluated following different metrics of interest. Results found have shown the goodness of the proposed model, obtaining competitive performances in all metrics and scenarios considered.
|
[Calvo-Zaragoza2017j] |
Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga.
Staff-Line Detection on Grayscale Images with Pixel Classification.
In Luís A. Alexandre, José Salvador Sánchez, and João
M. F. Rodrigues, editors, Pattern Recognition and Image Analysis,
pages 279-286, Cham, 2017j. Springer International Publishing.
ISBN 978-3-319-58838-4.
[ bib |
http ]
Staff-line detection and removal are important processing steps in most Optical Music Recognition systems. Traditional methods make use of heuristic strategies based on image processing techniques with binary images. However, binarization is a complex process for which it is difficult to achieve perfect results. In this paper we describe a novel staff-line detection and removal method that deals with grayscale images directly. Our approach uses supervised learning to classify each pixel of the image as symbol, staff, or background. This classification is achieved by means of Convolutional Neural Networks. The features of each pixel consist of a square window from the input image centered at the pixel to be classified. As a case of study, we performed experiments with the CVC-Muscima dataset. Our approach showed promising performance, outperforming state-of-the-art algorithms for staff-line removal.
|
[Calvo-Zaragoza2017k] |
Jorge Calvo-Zaragoza, Ké Zhang, Zeyad Saleh, Gabriel Vigliensoni, and
Ichiro Fujinaga.
Music Document Layout Analysis through Machine Learning and Human
Feedback.
In 2017 14th IAPR International Conference on Document Analysis
and Recognition (ICDAR), volume 02, pages 23-24, Nov 2017k.
[ bib |
DOI ]
Music documents often include musical symbols as well as other relevant elements such as staff lines, text, and decorations. To detect and separate these constituent elements, we propose a layout analysis framework based on machine learning that focuses on pixel-level classification of the image. For that, we make use of supervised learning classifiers trained to infer the category of each pixel. In addition, our scenario considers a human-aided computing approach in which the user is part of the recognition loop, providing feedback where relevant errors are made.
|
[Chen2017] | Liang Chen, Rong Jin, and Christopher Raphael. Human-Guided Recognition of Music Score Images. In 4th International Workshop on Digital Libraries for Musicology. ACM Press, 2017. [ bib | DOI ] |
[Chen2017b] | Liang Chen and Christopher Raphael. Renotation of Optical Music Recognition Data. In 14th Sound and Music Computing Conference, Espoo, Finland, 2017. [ bib | .pdf ] |
[Choi2017] |
Kwon-Young Choi, Bertrand Coüasnon, Yann Ricquebourg, and Richard Zanibbi.
Bootstrapping Samples of Accidentals in Dense Piano Scores for
CNN-Based Detection.
In 14th International Conference on Document Analysis and
Recognition, Kyoto, Japan, 2017. IAPR TC10 (Technical Committee on Graphics
Recognition), IEEE Computer Society.
ISBN 978-1-5386-3586-5.
[ bib |
DOI ]
State-of-the-art Optical Music Recognition system often fails to process dense and damaged music scores, where many symbols can present complex segmentation problems. We propose to resolve these segmentation problems by using a CNN-based detector trained with few manually annotated data. A data augmentation bootstrapping method is used to accurately train a deep learning model to do the localization and classification of an accidental symbol associated with a note head, or the note head if there is no accidental. Using 5-fold cross-validation, we obtain an average of 98.5 and a classification accuracy of 99.2%.
|
[Gallego2017] |
Antonio-Javier Gallego and Jorge Calvo-Zaragoza.
Staff-line removal with selectional auto-encoders.
Expert Systems with Applications, 89: 138-148, 2017.
ISSN 0957-4174.
[ bib |
DOI |
http ]
Staff-line removal is an important preprocessing stage as regards most Optical Music Recognition systems. The common procedures employed to carry out this task involve image processing techniques. In contrast to these traditional methods, which are based on hand-engineered transformations, the problem can also be approached from a machine learning point of view if representative examples of the task are provided. We propose doing this through the use of a new approach involving auto-encoders, which select the appropriate features of an input feature set (Selectional Auto-Encoders). Within the context of the problem at hand, the model is trained to select those pixels of a given image that belong to a musical symbol, thus removing the lines of the staves. Our results show that the proposed technique is quite competitive and significantly outperforms the other state-of-art strategies considered, particularly when dealing with grayscale input images.
|
[Gomez2017] | Ashley Antony Gomez and C. N. Sujatha. Optical Music Recognition: Staffline Detection and Removal. International Journal of Application or Innovation in Engineering & Management, 2017. [ bib ] |
[Hajicjr.2017] | Jan Hajič jr. and Pavel Pecina. In Search of a Dataset for Handwritten Optical Music Recognition: Introducing MUSCIMA++. Computing Research Repository, abs/1703.04824: 1-16, 2017a. [ bib | http ] |
[Hajicjr.2017a] | Jan Hajič jr. and Pavel Pecina. Detecting Noteheads in Handwritten Scores with ConvNets and Bounding Box Regression. Computing Research Repository, abs/1708.01806, 2017b. [ bib | http ] |
[Hajicjr.2017b] | Jan Hajič jr. and Matthias Dorfer. Prototyping Full-Pipeline Optical Music Recognition with MUSCIMarker. In Extended abstracts for the Late-Breaking Demo Session of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017. [ bib | .pdf ] |
[Hajicjr.2017c] |
Jan Hajič jr. and Pavel Pecina.
Groundtruthing (Not Only) Music Notation with MUSCIMarker: A
Practical Overview.
In 14th International Conference on Document Analysis and
Recognition, pages 47-48, Kyoto, Japan, 2017c.
[ bib |
DOI ]
Dataset creation for graphics recognition, especially for hand-drawn inputs, is often an expensive and time-consuming undertaking. The MUSCIMarker tool used for creating the MUSCIMA++ dataset for Optical Music Recognition (OMR) led to efficient use of annotation resources, and it provides enough flexibility to be applicable to creating datasets for other graphics recognition tasks where the ground truth can be represented similarly. First, we describe the MUSCIMA++ ground truth to define the range of tasks for which using MUSCIMarker to annotate ground truth is applicable. We then describe the MUSCIMarker tool itself, discuss its strong and weak points, and share practical experience with the tool from creating the MUSCIMA++ dataset.
|
[Hajicjr.2017d] |
Jan Hajič jr. and Pavel Pecina.
The MUSCIMA++ Dataset for Handwritten Optical Music Recognition.
In 14th International Conference on Document Analysis and
Recognition, pages 39-46, Kyoto, Japan, 2017d.
[ bib |
DOI ]
Optical Music Recognition (OMR) promises to make accessible the content of large amounts of musical documents, an important component of cultural heritage. However, the field does not have an adequate dataset and ground truth for benchmarking OMR systems, which has been a major obstacle to measurable progress. Furthermore, machine learning methods for OMR require training data. We design and collect MUSCIMA++, a new dataset for OMR. Ground truth in MUSCIMA++ is a notation graph, which our analysis shows to be a necessary and sufficient representation of music notation. Building on the CVC-MUSCIMA dataset for staffline removal, the MUSCIMA++ dataset v1.0 consists of 140 pages of handwritten music, with 91254 manually annotated notation symbols and 82247 explicitly marked relationships between symbol pairs. The dataset allows training and directly evaluating models for symbol classification, symbol localization, and notation graph assembly, and indirectly musical content extraction, both in isolation and jointly. Open-source tools are provided for manipulating the dataset, visualizing the data and annotating further, and the data is made available under an open license.
|
[iSeeNotes] | Gear Up AB. iSeeNotes. http://www.iseenotes.com, 2017. [ bib | http ] |
[Jin2017] | Rong Jin. Graph-Based Rhythm Interpretation in Optical Music Recognition. PhD thesis, Indiana University, 2017. [ bib | http ] |
[KompApp] | Gene Ragan. KompApp. http://kompapp.com, 2017. [ bib | http ] |
[Mexin2017] | Yevgen Mexin, Aristotelis Hadjakos, Axel Berndt, Simon Waloschek, Anastasia Wawilow, and Gerd Szwillus. Tools for Annotating Musical Measures in Digital Music Editions. In 14th Sound and Music Computing Conference, pages 279-286, Espoo, Finland, 2017. [ bib | .pdf ] |
[Montagner2017] |
Igor dos Santos Montagner, Nina S.T. Hirata, and Roberto Hirata Jr.
Staff removal using image operator learning.
Pattern Recognition, 63: 310-320, 2017.
ISSN 0031-3203.
[ bib |
DOI ]
Staff removal is an image processing task that aims to facilitate further analysis of music score images. Even when restricted to images in specific domains such as music score recognition, solving image processing problems usually requires the design of customized algorithms. To cope with image variabilities and the growing amount of data, machine learning based techniques emerge as a natural approach to be employed in image processing problems. In this sense, image operator learning methods are concerned with estimating, from sample pairs of input-output images of a transformation, a local function that characterizes the image transformation. These methods require the definition of some parameters, including the local information to be considered in the processing which is defined by a window. In this work we show how to apply the image operator learning technique to the staff line removal problem. We present an algorithm for window determination and show that it captures visual information relevant for staff removal. We also present a reference window set to be used in cases where the training set is not sufficiently large. Experimental results obtained with respect to synthetic and handwritten music scores under varying image conditions show that the learned image operators are comparable with especially designed state-of-the-art heuristic algorithms.
|
[MusicScoreClassifier] | Alexander Pacha. Github Repository of the Music Score Classifier. https://github.com/apacha/MusicScoreClassifier, 2017a. [ bib | http ] |
[Oh2017] | Jiyong Oh, Sung Joon Son, Sangkuk Lee, Ji-Won Kwon, and Nojun Kwak. Online recognition of handwritten music symbols. International Journal on Document Analysis and Recognition, 20 (2): 79-89, 2017. [ bib | DOI ] |
[OmrDatasetsProject] | Alexander Pacha. The OMR Datasets Project. https://apacha.github.io/OMR-Datasets, 2017b. [ bib | http ] |
[Pacha2017] |
Alexander Pacha and Horst Eidenberger.
Towards a Universal Music Symbol Classifier.
In 14th International Conference on Document Analysis and
Recognition, pages 35-36, Kyoto, Japan, 2017a. IAPR TC10 (Technical
Committee on Graphics Recognition), IEEE Computer Society.
ISBN 978-1-5386-3586-5.
[ bib |
DOI ]
Optical Music Recognition (OMR) aims to recognize and understand written music scores. With the help of Deep Learning, researchers were able to significantly improve the state-of-the-art in this research area. However, Deep Learning requires a substantial amount of annotated data for supervised training. Various datasets have been collected in the past, but without a common standard that defines data formats and terminology, combining them is a challenging task. In this paper we present our approach towards unifying multiple datasets into the largest currently available body of over 90000 musical symbols that belong to 79 classes, containing both handwritten and printed music symbols. A universal music symbol classifier, trained on such a dataset using Deep Learning, can achieve an accuracy that exceeds 98%.
|
[Pacha2017a] |
Alexander Pacha and Horst Eidenberger.
Towards Self-Learning Optical Music Recognition.
In 16th International Conference on Machine Learning and
Applications, pages 795-800, 2017b.
[ bib |
DOI ]
Optical Music Recognition (OMR) is a branch of artificial intelligence that aims at automatically recognizing and understanding the content of music scores in images. Several approaches and systems have been proposed that try to solve this problem by using expert knowledge and specialized algorithms that tend to fail at generalization to a broader set of scores, imperfect image scans or data of different formatting. In this paper we propose a new approach to solve OMR by investigating how humans read music scores and by imitating that behavior with machine learning. To demonstrate the power of this approach, we conduct two experiments that teach a machine to distinguish entire music sheets from arbitrary content through frame-by-frame classification and distinguishing between 32 classes of handwritten music symbols which can be a basis for object detection. Both tasks can be performed at high rates of confidence (>98%), comparable to the performance of humans on the same task.
|
[Parada-Cabaleiro2017] | Emilia Parada-Cabaleiro, Anton Batliner, Alice Baird, and Björn Schuller. The SEILS Dataset: Symbolically Encoded Scores in Modern-Early Notation for Computational Musicology. In 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017. ISBN 978-981-11-5179-8. [ bib | .pdf ] |
[Riba2017] |
Pau Riba, Alicia Fornés, and Josep Lladós.
Towards the Alignment of Handwritten Music Scores.
In B. Lamiroy and R. D. Lins, editors, Graphic Recognition. Current
Trends and Challenges, Lecture Notes in Computer Science, pages 103-116.
Springer Verlag, 2017.
ISBN 9783319521589.
[ bib |
DOI ]
It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist point of view, these variations are very interesting and deserve study. This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The bar units alignment is performed by using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results.
|
[RicoBlanes2017] | Adrià Rico Blanes and Alicia Fornés Bisquerra. Camera-Based Optical Music Recognition Using a Convolutional Neural Network. In 14th International Conference on Document Analysis and Recognition, pages 27-28, Kyoto, Japan, 2017. IEEE. [ bib | DOI ] |
[Roy2017] |
Partha Pratim Roy, Ayan Kumar Bhunia, and Umapada Pal.
HMM-based writer identification in music score documents without
staff-line removal.
Expert Systems with Applications, 89: 222-240, 2017.
ISSN 0957-4174.
[ bib |
DOI |
http ]
Writer identification from musical score documents is a challenging task due to its inherent problem of overlapping of musical symbols with staff-lines. Most of the existing works in the literature of writer identification in musical score documents were performed after a pre-processing stage of staff-lines removal. In this paper we propose a novel writer identification framework in musical score documents without removing staff-lines from the documents. In our approach, Hidden Markov Model (HMM) has been used to model the writing style of the writers without removing staff-lines. The sliding window features are extracted from musical score-lines and they are used to build writer specific HMM models. Given a query musical sheet, writer specific confidence for each musical line is returned by each writer specific model using a log-likelihood score. Next, a log-likelihood score in page level is computed by weighted combination of these scores from the corresponding line images of the page. A novel Factor Analysis-based feature selection technique is applied in sliding window features to reduce the noise appearing from staff-lines which proves efficiency in writer identification performance. In our framework we have also proposed a novel score-line detection approach in musical sheet using HMM. The experiment has been performed in CVC-MUSCIMA data set and the results obtained show that the proposed approach is efficient for score-line detection and writer identification without removing staff-lines. To get the idea of computation time of our method, detail analysis of execution time is also provided.
|
[Saleh2017] | Zeyad Saleh, Ke Zhang, Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga. Pixel.js: Web-Based Pixel Classification Correction Platform for Ground Truth Creation. In 14th International Conference on Document Analysis and Recognition, pages 39-40, Kyoto, Japan, 2017. [ bib | DOI ] |
[Shi2017] |
Baoguang Shi, Xiang Bai, and Cong Yao.
An End-to-End Trainable Neural Network for Image-Based Sequence
Recognition and Its Application to Scene Text Recognition.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 39 (11): 2298-2304, 2017.
ISSN 0162-8828.
[ bib |
DOI ]
Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based sequence recognition. A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed. Compared with previous systems for scene text recognition, the proposed architecture possesses four distinctive properties: (1) It is end-to-end trainable, in contrast to most of the existing algorithms whose components are separately trained and tuned. (2) It naturally handles sequences in arbitrary lengths, involving no character segmentation or horizontal scale normalization. (3) It is not confined to any predefined lexicon and achieves remarkable performances in both lexicon-free and lexicon-based scene text recognition tasks. (4) It generates an effective yet much smaller model, which is more practical for real-world application scenarios. The experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets, demonstrate the superiority of the proposed algorithm over the prior arts. Moreover, the proposed algorithm performs well in the task of image-based music score recognition, which evidently verifies the generality of it.
|
[SmartScore] | Musitek. SmartScore X2. http://www.musitek.com/smartscore-pro.html, 2017. [ bib | .html ] |
[Sober-Mira2017] | Javier Sober-Mira, Jorge Calvo-Zaragoza, David Rizo, and José Manuel Iñesta. Pen-Based Music Document Transcription. In 14th International Conference on Document Analysis and Recognition, pages 21-22, Kyoto, Japan, 2017a. IEEE. [ bib | DOI ] |
[Sober-Mira2017a] | Javier Sober-Mira, Jorge Calvo-Zaragoza, David Rizo, and José Manuel Iñesta. Multimodal Recognition for Music Document Transcription. In 10th International Workshop on Machine Learning and Music, Barcelona, Spain, 2017b. [ bib | .pdf ] |
[StaffPad] | StaffPad Ltd. StaffPad. http://www.staffpad.net, 2017. [ bib | http ] |
[Wel2017] | Eelco van der Wel and Karen Ullrich. Optical Music Recognition with Convolutional Sequence-to-Sequence Models. In 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017. ISBN 978-981-11-5179-8. [ bib | .pdf ] |
[Wu2017] |
Fu-Hai Frank Wu.
Applying Machine Learning in Optical Music Recognition of Numbered
Music Notation.
In International Journal of Multimedia Data Engineering and
Management, page 21. IGI Global, 2017.
[ bib |
DOI ]
Although research of optical music recognition (OMR) has existed for few decades, most of efforts were put in step of image processing to approach upmost accuracy and evaluations were not in common ground. And major music notations explored were the conventional western music notations with staff. On contrary, the authors explore the challenges of numbered music notation, which is popular in Asia and used in daily life for sight reading. The authors use different way to improve recognition accuracy by applying elementary image processing with rough tuning and supplementing with methods of machine learning. The major contributions of this work are the architecture of machine learning specified for this task, the dataset, and the evaluation metrics, which indicate the performance of OMR system, provide objective function for machine learning and highlight the challenges of the scores of music with the specified notation.
|
[Zhang2017a] |
Emily H. Zhang.
An Efficient Score Alignment Algorithm and its Applications.
Master's thesis, Massachusetts Institute of Technology, 2017.
[ bib |
http ]
String alignment and comparison in Computer Science is a well-explored space with classic problems such as Longest Common Subsequence that have practical application in bioinformatic genomic sequencing and data comparison in revision control systems. In the field of musicology, score alignment and comparison is a problem with many similarities to string comparison and alignment but also vast differences. In particular we can use ideas in string alignment and comparison to compare a music score in the MIDI format with a music score generated from Optical Musical Recognition (OMR), both of which have incomplete or wrong information, and correct errors that were introduced in the OMR process to create an improved third score. This thesis creates a set of algorithms that align and compare MIDI and OMR music scores to produce a corrected version of the OMR score that borrows ideas from classic computer science string comparison and alignment algorithm but also incorporates optimizations and heuristics from music theory.
|
[Baro2016] |
Arnau Baró, Pau Riba, and Alicia Fornés.
Towards the recognition of compound music notes in handwritten music
scores.
In 15th International Conference on Frontiers in Handwriting
Recognition, pages 465-470. Institute of Electrical and Electronics
Engineers Inc., 2016.
ISBN 9781509009817.
[ bib |
DOI ]
The recognition of handwritten music scores still remains an open problem. The existing approaches can only deal with very simple handwritten scores mainly because of the variability in the handwriting style and the variability in the composition of groups of music notes (i.e. compound music notes). In this work we focus on this second problem and propose a method based on perceptual grouping for the recognition of compound music notes. Our method has been tested using several handwritten music scores of the CVC-MUSCIMA database and compared with a commercial Optical Music Recognition (OMR) software. Given that our method is learning-free, the obtained results are promising.
|
[Byrd2016] | Donald Byrd and Eric Isaacson. A Music Representation Requirement Specification for Academia. Technical report, Indiana University, Bloomington, 2016. [ bib | http ] |
[Calvo-Zaragoza2016c] | Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga. Document Analysis for Music Scores via Machine Learning. In 3rd International Workshop on Digital Libraries for Musicology, pages 37-40, New York, USA, 2016c. ACM. ISBN 978-1-4503-4751-8. [ bib | DOI ] |
[Calvo-Zaragoza2016d] | Jorge Calvo-Zaragoza, David Rizo, and José Manuel Iñesta. Two (note) heads are better than one: pen-based multimodal interaction with music scores. In Johanna Devaney et al., editors, 17th International Society for Music Information Retrieval Conference, pages 509-514, New York City, 2016b. ISBN 978-0-692-75506-8. [ bib | .pdf ] |
[Calvo-Zaragoza2016e] |
Jorge Calvo-Zaragoza, Luisa Micó, and Jose Oncina.
Music staff removal with supervised pixel classification.
International Journal on Document Analysis and Recognition, 19
(3): 211-219, 2016a.
[ bib |
DOI ]
This work presents a novel approach to tackle the music staff removal. This task is devoted to removing the staff lines from an image of a music score while maintaining the symbol information. It represents a key step in the performance of most optical music recognition systems. In the literature, staff removal is usually solved by means of image processing procedures based on the intrinsics of music scores. However, we propose to model the problem as a supervised learning classification task. Surprisingly, although there is a strong background and a vast amount of research concerning machine learning, the classification approach has remained unexplored for this purpose. In this context, each foreground pixel is labelled as either staff or symbol. We use pairs of scores with and without staff lines to train classification algorithms. We test our proposal with several well-known classification techniques. Moreover, in our experiments no attempt of tuning the classification algorithms has been made, but the parameters were set to the default setting provided by the classification software libraries. The aim of this choice is to show that, even with this straightforward procedure, results are competitive with state-of-the-art algorithms. In addition, we also discuss several advantages of this approach for which conventional methods are not applicable such as its high adaptability to any type of music score.
|
[Campos2016] |
Vicente Bosch Campos, Jorge Calvo-Zaragoza, Alejandro H. Toselli, and Enrique
Vidal Ruiz.
Sheet Music Statistical Layout Analysis.
In 15th International Conference on Frontiers in Handwriting
Recognition, pages 313-318, 2016.
[ bib |
DOI ]
In order to provide access to the contents of ancient music scores to researchers, the transcripts of both the lyrics and the musical notation is required. Before attempting any type of automatic or semi-automatic transcription of sheet music, an adequate layout analysis (LA) is needed. This LA must provide not only the locations of the different image regions, but also adequate region labels to distinguish between different region types such as staff, lyric, etc. To this end, we adapt a stochastic framework for LA based on Hidden Markov Models that we had previously introduced for detection and classification of text lines in typical handwritten text images. The proposed approach takes a scanned music score image as input and, after basic preprocessing, simultaneously performs region detection and region classification in an integrated way. To assess this statistical LA approach several experiments were carried out on a representative sample of a historical music archive, under different difficulty settings. The results show that our approach is able to tackle these structured documents providing good results not only for region detection but also for classification of the different regions.
|
[Chen2016] |
Liang Chen and Kun Duan.
MIDI-assisted egocentric optical music recognition.
In Winter Conference on Applications of Computer Vision.
Institute of Electrical and Electronics Engineers Inc., 2016.
ISBN 9781509006410.
[ bib |
DOI ]
Egocentric vision has received increasing attention in recent years due to the vast development of wearable devices and their applications. Although there are numerous existing work on egocentric vision, none of them solve Optical Music Recognition (OMR) problem. In this paper, we propose a novel optical music recognition approach for egocentric device (e.g. Google Glass) with the assistance of MIDI data. We formulate the problem as a structured sequence alignment problem as opposed to the blind recognition in traditional OMR systems. We propose a linear-chain Conditional Random Field (CRF) to model the note event sequence, which translates the relative temporal relations contained by MIDI to spatial constraints over the egocentric observation. We performed evaluations to compare the proposed approach with several different baselines and proved that our approach achieved the highest recognition accuracy. We view our work as the first step towards egocentric optical music recognition, and believe it will bring insights for next-generation music pedagogy and music entertainment.
|
[Chen2016a] | Liang Chen and Christopher Raphael. Human-Directed Optical Music Recognition. Electronic Imaging, 2016 (17): 1-9, 2016. [ bib | DOI ] |
[Chen2016b] | Liang Chen, Erik Stolterman, and Christopher Raphael. Human-Interactive Optical Music Recognition. In Michael I. Mandel, Johanna Devaney, Douglas Turnbull, and George Tzanetakis, editors, 17th International Society for Music Information Retrieval Conference, pages 647-653, 2016b. ISBN 978-0-692-75506-8. [ bib | .pdf ] |
[Chen2016e] | Liang Chen, Rong Jin, Simo Zhang, Stefan Lee, Zhenhua Chen, and David Crandall. A Hybrid HMM-RNN Model for Optical Music Recognition. In Extended abstracts for the Late-Breaking Demo Session of the 17th International Society for Music Information Retrieval Conference, 2016a. [ bib | .pdf ] |
[Dinh2016] |
Cong Minh Dinh, Hyung-Jeong Yang, Guee-Sang Lee, and Soo-Hyung Kim.
Fast lyric area extraction from images of printed Korean music
scores.
IEICE Transactions on Information and Systems, E99D (6):
1576-1584, 2016.
ISSN 0916-8532.
[ bib |
DOI ]
In recent years, optical music recognition (OMR) has been extensively developed, particularly for use with mobile devices that require fast processing to recognize and play live the notes in images captured from sheet music. However, most techniques that have been developed thus far have focused on playing back instrumental music and have ignored the importance of lyric extraction, which is time consuming and affects the accuracy of the OMR tools. The text of the lyrics adds complexity to the page layout, particularly when lyrics touch or overlap musical symbols, in which case it is very difficult to separate them from each other. In addition, the distortion that appears in captured musical images makes the lyric lines curved or skewed, making the lyric extraction problem more complicated. This paper proposes a new approach in which lyrics are detected and extracted quickly and effectively. First, in order to resolve the distortion problem, the image is undistorted by a method using information of stave lines and bar lines. Then, through the use of a frequency count method and heuristic rules based on projection, the lyric areas are extracted, the cases where symbols touch the lyrics are resolved, and most of the information from the musical notation is kept even when the lyrics and music notes are overlapping. Our algorithm demonstrated a short processing time and remarkable accuracy on two test datasets of images of printed Korean musical scores: The first set included three hundred scanned musical images; the second set had two hundred musical images that were captured by a digital camera.
|
[Dorfer2016] | Matthias Dorfer, Andreas Arzt, and Gerhard Widmer. Towards End-to-End Audio-Sheet-Music Retrieval. Computing Research Repository, abs/1612.05070, 2016a. [ bib | http ] |
[Dorfer2016a] | Matthias Dorfer, Andreas Arzt, and Gerhard Widmer. Towards Score Following In Sheet Music Images. In Michael I. Mandel, Johanna Devaney, Douglas Turnbull, and George Tzanetakis, editors, 17th International Society for Music Information Retrieval Conference, pages 789-795, 2016b. ISBN 978-0-692-75506-8. [ bib | .pdf ] |
[Hajicjr.2016] | Jan Hajič jr., Jiří Novotný, Pavel Pecina, and Jaroslav Pokorný. Further Steps towards a Standard Testbed for Optical Music Recognition. In Michael Mandel, Johanna Devaney, Douglas Turnbull, and George Tzanetakis, editors, 17th International Society for Music Information Retrieval Conference, pages 157-163, New York, USA, 2016. New York University. ISBN 978-0-692-75506-8. [ bib | http ] |
[Jastrzebska2016] |
Agnieszka Jastrzebska and Wojciech Lesinski.
Optical Music Recognition as the Case of Imbalanced Pattern
Recognition: A Study of Single Classifiers.
In Andrzej M.J. Skulimowski and Janusz Kacprzyk, editors,
Knowledge, Information and Creativity Support Systems: Recent Trends,
Advances and Solutions, pages 493-505, Cham, 2016. Springer International
Publishing.
ISBN 978-3-319-19090-7.
[ bib |
DOI ]
The article is focused on a particular aspect of classification, namely the imbalance of recognized classes. The paper contains a comparative study of results of musical symbols classification using known algorithms: k-nearest neighbors, k-means, Mahalanobis minimal distance, and decision trees. Authors aim at addressing the problem of imbalanced pattern recognition. First, we theoretically analyze difficulties entailed in the classification of music notation symbols. Second, in the enclosed case study we investigate the fitness of named single classifiers on real data. Conducted experiments are based on own implementations of named algorithms with all necessary image processing tasks. Results are highly satisfying.
|
[Laplante2016] | Audrey Laplante and Ichiro Fujinaga. Digitizing musical scores: Challenges and opportunities for libraries. In 3rd International Workshop on Digital Libraries for Musicology, pages 45-48. ACM, 2016. [ bib | DOI ] |
[Lee2016a] |
Sangkuk Lee, Sung Joon Son, Jiyong Oh, and Nojun Kwak.
Handwritten Music Symbol Classification Using Deep Convolutional
Neural Networks.
In International Conference on Information Science and
Security, pages 1-5, 2016.
[ bib |
DOI ]
In this paper, we utilize deep Convolutional Neural Networks (CNNs) to classify handwritten music symbols in HOMUS data set. HOMUS data set is made up of various types of strokes which contain time information and it is expected that online techniques are more appropriate for classification. However, experimental results show that CNN which does not use time information achieved classification accuracy around 94.6%, outperforming the prior state-of-the-art online technique. Finally, we achieved the best accuracy around 95.6% with the ensemble of CNNs.
|
[Lehman-Borer2016] | Ryerson Lehman-Borer. Optical Music Recognition. Technical report, Swarthmore College, 2016. [ bib | http ] |
[Pedersoli2016] |
Fabrizio Pedersoli and George Tzanetakis.
Document segmentation and classification into musical scores and
text.
International Journal on Document Analysis and Recognition, 19
(4): 289-304, 2016.
ISSN 1433-2825.
[ bib |
DOI ]
A new algorithm for segmenting documents into regions containing musical scores and text is proposed. Such segmentation is a required step prior to applying optical character recognition and optical music recognition on scanned pages that contain both music notation and text. Our segmentation technique is based on the bag-of-visual-words representation followed by random block voting (RBV) in order to detect the bounding boxes containing the musical score and text within a document image. The RBV procedure consists of extracting a fixed number of blocks whose position and size are sampled from a discrete uniform distribution that “over”-covers the input image. Each block is automatically classified as either coming from musical score or text and votes with a particular posterior probability of classification in its spatial domain. An initial coarse segmentation is obtained by summarizing all the votes in a single image. Subsequently, the final segmentation is obtained by subdividing the image in microblocks and classifying them using a N-nearest neighbor classifier which is trained using the coarse segmentation. We demonstrate the potential of the proposed method by experiments on two different datasets. One is on a challenging dataset of images collected and artificially combined and manipulated for this project. The other is a music dataset obtained by the scanning of two music books. The results are reported using precision/recall metrics of the overlapping area with respect to the ground truth. The proposed system achieves an overall averaged F-measure of 85 %. The complete source code package and associated data are available at https://github.com/fpeder/mscr under the FreeBSD license to support reproducibility.
|
[PinheiroPereira2016] | Roberto M. Pinheiro Pereira, Caio E.F. Matos, Geraldo Jr. Braz, João D.S. de Almeida, and Anselmo C. de Paiva. A Deep Approach for Handwritten Musical Symbols Recognition. In 22nd Brazilian Symposium on Multimedia and the Web, pages 191-194, Teresina, Piauí, Brazil, 2016. ACM. ISBN 978-1-4503-4512-5. [ bib | DOI ] |
[PlayScore] | Organum. PlayScore. http://www.playscore.co, 2016. [ bib | http ] |
[Rhodes2016] |
Christophe Rhodes, Tim Crawford, and Mark d'Inverno.
Duplicate Detection in Facsimile Scans of Early Printed Music.
In Analysis of Large and Complex Data, pages 449-459.
Springer International Publishing, Cham, 2016.
ISBN 978-3-319-25226-1.
[ bib |
DOI ]
There is a growing number of collections of readily available scanned musical documents, whether generated and managed by libraries, research projects, or volunteer efforts. They are typically digital images; for computational musicology we also need the musical data in machine-readable form. Optical Music Recognition (OMR) can be used on printed music, but is prone to error, depending on document condition and the quality of intermediate stages in the digitization process such as archival photographs. This work addresses the detection of one such error (duplication of images) and the discovery of other relationships between images in the process.
|
[Vo2016] |
Quang Nhat Vo, Soo Hyung Kim, Hyung Jeong Yang, and Gueesang Lee.
An MRF model for binarization of music scores with complex
background.
Pattern Recognition Letters, 69: 88-95, 2016.
ISSN 0167-8655.
[ bib |
DOI ]
We present a Gaussian Mixture Markov Random Field (GMMRF) model that is effective for the binarization of music score images with complex backgrounds. The binarization of music score documents containing noises with arbitrary shapes and/or non-uniform colors in the background area is a very challenging problem. In order to extract the content knowledge of music score documents, the staff lines are extracted by first applying a stroke width transform. With the color and spatial information of the detected staff lines, we can accurately model the foreground and background color distribution, in which a GMMRF framework is used to make the binarization robust to variations in colors. Then, the staff line information is employed for guiding the GMMRF labeling process. In the experiment, the music score images captured by camera show promising results compared to existing methods.
|
[Wen2016] |
Cuihong Wen, Jing Zhang, Ana Rebelo, and Fanyong Cheng.
A Directed Acyclic Graph-Large Margin Distribution Machine Model for
Music Symbol Classification.
PLoS ONE, 11 (3): 1-11, 2016.
[ bib |
DOI ]
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).
|
[Wu2016] |
Fu-Hai Frank Wu.
An Evaluation Framework of Optical Music Recognition in Numbered
Music Notation.
In International Symposium on Multimedia, pages 626-631,
2016.
[ bib |
DOI ]
In this study, we refine the ecosystem for optical music recognition (OMR) of numbered music notation with better accuracy. The ecosystem includes users, the OMR system, a dataset of music scores, ground-truth building, symbolic representation of sheet music, checking by musicological rules, and performance evaluation. In particular, the evaluation metric includes exact and approximate approaches to counting accuracy automatically. The hands-on dataset comprises 110 music score manuscripts in a songbook for singing reference. The experimental results justify the value of the evaluation framework and show the necessity of checks that comply with musicological properties.
|
[Adamska2015] |
Julia Adamska, Mateusz Piecuch, Mateusz Podgórski, Piotr Walkiewicz, and
Ewa Lukasik.
Mobile System for Optical Music Recognition and Music Sound
Generation.
In Khalid Saeed and Wladyslaw Homenda, editors, Computer
Information Systems and Industrial Management, pages 571-582, Cham, 2015.
Springer International Publishing.
ISBN 978-3-319-24369-6.
[ bib |
DOI ]
The paper presents a mobile system for generating a melody based on a photo of a musical score. A client-server architecture was applied. The client role is assigned to a mobile application responsible for taking a photo of a score, sending it to the server for further processing, and playing the mp3 file received from the server. The server's role is to recognize notes from the image, generate an mp3 file, and send it to the client application. The key element of the system is the program implementing the note recognition algorithm. It is based on decision trees and characteristics of the individual symbols extracted from the image. The system is implemented in the Windows Phone 8 framework and uses the Microsoft Azure cloud operating system. It enables easy archiving of photos, recognized notes in the MusicXML format, and generated mp3 files. An easy transition to other mobile operating systems is possible, as is the processing of scans of multiple music collections.
|
[Balke2015] |
Stefan Balke, Sanu Pulimootil Achankunju, and Meinard Müller.
Matching Musical Themes Based on Noisy OCR and OMR Input.
In International Conference on Acoustics, Speech and Signal
Processing, pages 703-707. Institute of Electrical and Electronics
Engineers Inc., 2015.
ISBN 9781467369978.
[ bib |
DOI ]
In the year 1948, Barlow and Morgenstern published the book 'A Dictionary of Musical Themes', which contains 9803 important musical themes from the Western classical music literature. In this paper, we deal with the problem of automatically matching these themes to other digitally available sources. To this end, we introduce a processing pipeline that automatically extracts from the scanned pages of the printed book textual metadata using Optical Character Recognition (OCR) as well as symbolic note information using Optical Music Recognition (OMR). Due to the poor printing quality of the book, the OCR and OMR results are quite noisy containing numerous extraction errors. As one main contribution, we adjust alignment techniques for matching musical themes based on the OCR and OMR input. In particular, we show how the matching quality can be substantially improved by fusing the OCR- and OMR-based matching results. Finally, we report on our experiments within the challenging Barlow and Morgenstern scenario, which also indicates the potential of our techniques when considering other sources of musical themes such as digital music archives and the world wide web.
|
[Burgoyne2015] |
John Ashley Burgoyne, Ichiro Fujinaga, and J. Stephen Downie.
Music Information Retrieval.
In Susan Schreibman, Ray Siemens, and John Unsworth, editors, A
New Companion to Digital Humanities, pages 213-228. Wiley Blackwell, 2015.
ISBN 9781118680605.
[ bib |
DOI ]
Music information retrieval (MIR) is "a multidisciplinary research endeavor that strives to develop innovative content-based searching schemes, novel interfaces, and evolving networked delivery mechanisms in an effort to make the world's vast store of music accessible to all." MIR was born from computational musicology in the 1960s and has since grown to have links with music cognition and audio engineering, a dedicated annual conference (ISMIR) and an annual evaluation campaign (MIREX). MIR combines machine learning with expert human knowledge to use digital music data - images of music scores, "symbolic" data such as MIDI files, audio, and metadata about musical items - for information retrieval, classification and estimation, or sequence labeling. This chapter gives a brief history of MIR, introduces classical MIR tasks from optical music recognition to music recommendation systems, and outlines some of the key questions and directions for future developments in MIR. © 2016 John Wiley & Sons, Ltd.
|
[Byrd2015] |
Donald Byrd and Jakob Grue Simonsen.
Towards a Standard Testbed for Optical Music Recognition:
Definitions, Metrics, and Page Images.
Journal of New Music Research, 44 (3): 169-195, 2015.
ISSN 0929-8215.
[ bib |
DOI ]
We posit that progress in Optical Music Recognition (OMR) has been held up for years by the absence of anything resembling the standard testbeds in use in other fields that face difficult evaluation problems. One example of such a field is text information retrieval (IR), where the Text Retrieval Conference (TREC) has annually-renewed IR tasks with accompanying data sets. In music informatics, the Music Information Retrieval Exchange (MIREX), with its annual tests and meetings held during the ISMIR conference, is a close analog to TREC; but MIREX has never had an OMR track or a collection of music such a track could employ. We describe why the absence of an OMR testbed is a problem and how this problem may be mitigated. To aid in the establishment of a standard testbed, we provide (1) a set of definitions for the complexity of music notation; (2) a set of performance metrics for OMR tools that gauge score complexity and graphical quality; and (3) a small corpus of music for use as a baseline for a proper OMR testbed.
|
[Calvo-Zaragoza2015] |
Jorge Calvo-Zaragoza, Isabel Barbancho, Lorenzo J. Tardón, and Ana M.
Barbancho.
Avoiding staff removal stage in optical music
recognition: application to scores written in white mensural notation.
Pattern Analysis and Applications, 18 (4): 933-943, 2015.
ISSN 1433-755X.
[ bib |
DOI ]
Staff detection and removal is one of the most important issues in optical music recognition (OMR) tasks, since common approaches for symbol detection and classification are based on this process. Due to its complexity, staff detection and removal is often inaccurate, leading to a great number of errors in subsequent stages. For this reason, a new approach that avoids this stage is proposed in this paper, which is expected to overcome these drawbacks. Our approach is put into practice in a case study focused on scores written in white mensural notation. Symbol detection is performed by using the vertical projection of the staves. The cross-correlation operator for template matching is used at the classification stage. The quality of our proposal is shown in an experiment in which it attains an extraction rate of 96 % and a classification rate of 92 %, on average. The results have reinforced the idea of pursuing a new research line in OMR systems that does not require the removal of staff lines.
|
[Calvo-Zaragoza2015a] |
Jorge Calvo-Zaragoza and Jose Oncina.
Clustering of strokes from pen-based music notation: An experimental
study.
Lecture Notes in Computer Science, 9117: 633-640, 2015.
ISSN 0302-9743.
[ bib |
DOI ]
A comfortable way of digitizing a new music composition is by using a pen-based recognition system, in which the digital score is created with the sole effort of the composition itself. In this kind of system, the input consists of a set of pen strokes. However, it is as yet unclear which types of strokes must be considered for this task. This paper presents an experimental study on automatic labeling of these strokes using the well-known k-medoids algorithm. Since recognition of pen-based music scores is highly related to stroke recognition, it may be profitable to repeat the process when new data is received through user interaction. Therefore, our intention is not to propose a particular stroke labeling but to show which stroke dissimilarities perform better within the clustering process. Results show that some methods offer a good trade-off between cluster complexity and classification accuracy, whereas others perform very poorly. © Springer International Publishing Switzerland 2015.
|
[Chen2015] | Liang Chen, Rong Jin, and Christopher Raphael. Renotation from Optical Music Recognition. In Mathematics and Computation in Music, pages 16-26, Cham, 2015. Springer International Publishing. [ bib | DOI ] |
[Chen2015a] | Liang Chen and Christopher Raphael. Ceres: An Interactive Optical Music Recognition System. In Extended abstracts for the Late-Breaking Demo Session of the 16th International Society for Music Information Retrieval Conference, Málaga, Spain, 2015. [ bib | .pdf ] |
[Fang2015] |
Yang Fang and Teng Gui-fa.
Visual music score detection with unsupervised feature learning
method based on K-means.
International Journal of Machine Learning and Cybernetics, 6
(2): 277-287, 2015.
ISSN 1868-8071.
[ bib |
DOI ]
Automatic music score detection plays an important role in optical music recognition (OMR). In a visual image, the characteristics of music scores are frequently degraded by illumination, distortion, and other background elements. In this paper, to reduce the influence of those degradations on OMR, especially the interference of Chinese characters, an unsupervised feature learning detection method is proposed to improve the correctness of music score detection. First, a detection framework was constructed. Then sub-image block features were extracted by a simple unsupervised feature learning (UFL) method based on K-means and classified by an SVM. Finally, music score detection was completed by a connected component searching algorithm based on the sub-image block labels. Taking Chinese text as the main interference, the detection rate was compared between the UFL method and a texture feature method based on a 2D Gabor filter in the same framework. The experimental results show that the unsupervised feature learning method yields a lower error detection rate than the Gabor texture feature method with a limited training set. © 2014, Springer-Verlag Berlin Heidelberg.
|
[Huang2015] | Yu-Hui Huang, Xuanli Chen, Serafina Beck, David Burn, and Luc Van Gool. Automatic Handwritten Mensural Notation Interpreter: From Manuscript to MIDI Performance. In Meinard Müller and Frans Wiering, editors, 16th International Society for Music Information Retrieval Conference, pages 79-85, Málaga, Spain, 2015. ISBN 978-84-606-8853-2. [ bib | .pdf ] |
[Lesinski2015] | Wojciech Lesinski and Agnieszka Jastrzebska. Optical Music Recognition: Standard and Cost-Sensitive Learning with Imbalanced Data. In IFIP International Conference on Computer Information Systems and Industrial Management, pages 601-612. Springer, 2015. [ bib | DOI ] |
[Liu2015] |
Xiaoxiang Liu, Mi Zhou, and Peng Xu.
A Robust Method for Musical Note Recognition.
In 14th International Conference on Computer-Aided Design and
Computer Graphics, pages 212-213. Institute of Electrical and Electronics
Engineers Inc., 2015.
ISBN 9781467380201.
[ bib |
DOI ]
Musical note recognition plays a fundamental role in the optical music recognition process. In this paper, we propose a robust method for recognizing notes. The method includes three parts: (1) the description of relationships between primitives by introducing the concept of an interaction field, (2) the definition of six hierarchical structure features for analyzing note structures, (3) the workflow of primitive assembly under the guidance of giving priority to key structure features. To evaluate the performance of our method, we present experimental results on real-life scores and comparisons with two commercial products. Experiments show that our method leads to quite good results, especially for complicated scores.
|
[Mehta2015] |
Apurva A. Mehta and Malay S. Bhatt.
Optical Music Notes Recognition for Printed Piano Music Score Sheet.
In International Conference on Computer Communication and
Informatics, Coimbatore, India, 2015.
ISBN 9781479968053.
[ bib |
DOI ]
Entertainment, therapy, and education are fields where music always accompanies humans. Music is presented to us in various formats, such as aural and visual, and in one more, the written form of music, which is much less familiar to us. In a way, music dominates our lives. The system discussed in this paper takes as input an image of a music score written for piano in modern staff notation. Segmentation is carried out by hierarchical decomposition using thresholding along with the stave lines of the score sheet. Segmented symbols are recognized by an artificial neural network based on a boosting approach. Recognized symbols are represented in an admissible way. The system is capable of addressing very complex cases, and validation is done over 53 songs available at various global music score resources. The segmentation algorithms achieve an accuracy of 99.12%, and segmented symbols are recognized with an accuracy of 92.38% with the help of PCA and AdaBoost.
|
[Nguyen2015] | Tam Nguyen and Gueesang Lee. A Lightweight and Effective Music Score Recognition on Mobile Phones. Journal of Information Processing Systems, 11 (3): 438-449, 2015. [ bib | DOI ] |
[NotateMe] | Neuratron. NotateMe. http://www.neuratron.com/notateme.html, 2015. [ bib | .html ] |
[Novotny2015] |
Jiri Novotny and Jaroslav Pokorny.
Introduction to Optical Music Recognition: Overview and Practical
Challenges.
In J. Pokorny, M. Necasky, and P. Moravec, editors, Annual
International Workshop on DAtabases, TExts, Specifications and Objects,
pages 65-76. CEUR-WS, 2015.
[ bib |
.pdf ]
Music has always been an integral part of human culture. In our computer age, it is not surprising that there is a growing interest in storing music in a digitized form. Optical music recognition (OMR) refers to a discipline that investigates music score recognition systems. This is similar to well-known optical character recognition systems, except OMR systems try to automatically transform scanned sheet music into a computer-readable format. In such a digital format, semantic information is also stored (instrumentation, notes, pitches and duration, contextual information, etc.). This article introduces the OMR field and presents an overview of the relevant literature and basic techniques. Practical challenges and questions arising from the automatic recognition of music notation and its semantic interpretation are discussed as well as the most important open issues.
|
[Pham2015] | Viet-Khoi Pham, Hai-Dang Nguyen, and Minh-Triet Tran. Virtual Music Teacher for New Music Learners with Optical Music Recognition. In International Conference on Learning and Collaboration Technologies, pages 415-426. Springer, 2015. [ bib | DOI ] |
[Pham2015a] |
Van Khien Pham and Guee-Sang Lee.
Music Score Recognition Based on a Collaborative Model.
International Journal of Multimedia and Ubiquitous
Engineering, 10 (8): 379-390, 2015.
ISSN 1975-0080.
[ bib |
DOI ]
Recognizing musical symbols is very important in a music score system, and the result depends on the methods researchers use. Most existing approaches to OMR (optical music recognition) remove staff lines before symbols are detected, so the symbols can easily get damaged. Other methods recognize symbols without staff line removal, but all of them have a low accuracy rate and a high processing time. In this paper, recognition both without and with staff removal is proposed to improve the symbol recognition result. Many symbols, such as vertical lines, note heads, pitches, beams, and tails, are detected before the staff lines are deleted; the staff lines are then removed to identify the other symbols using connected components. The proposed method is applied to a Samsung smartphone with an embedded high-resolution camera. Experimental results show that the recognition rate is higher than that of existing methods and the computation time is reduced significantly.
|
[Pham2015b] |
Viet-Khoi Pham, Hai-Dang Nguyen, Tung-Anh Nguyen-Khac, and Minh-Triet Tran.
Apply lightweight recognition algorithms in optical music
recognition.
In 7th International Conference on Machine Vision. SPIE,
2015.
ISBN 9781628415605.
[ bib |
DOI ]
The problems of digitalization and transformation of musical scores into machine-readable format are necessary to be solved since they help people to enjoy music, to learn music, to conserve music sheets, and even to assist music composers. However, the results of existing methods still require improvements for higher accuracy. Therefore, the authors propose lightweight algorithms for Optical Music Recognition to help people to recognize and automatically play musical scores. In our proposal, after removing staff lines and extracting symbols, each music symbol is represented as a grid of identical M × N cells, and the features are extracted and classified with multiple lightweight SVM classifiers. Through experiments, the authors find that the size of 10 × 12 cells yields the highest precision value. Experimental results on the dataset consisting of 4929 music symbols taken from 18 modern music sheets in the Synthetic Score Database show that our proposed method is able to classify printed musical scores with accuracy up to 99.56%.
|
[Ringwalt2015] | Dan Ringwalt, Roger Dannenberg, and Andrew Russell. Optical Music Recognition for Interactive Score Display. In Edgar Berdahl and Jesse T. Allison, editors, International Conference on New Interfaces for Musical Expression, pages 95-98, Baton Rouge, Louisiana, USA, 2015. The School of Music and the Center for Computation and Technology (CCT), Louisiana State University. ISBN 978-0-692-49547-6. [ bib | http ] |
[Ringwalt2015a] | Dan Ringwalt and Roger B. Dannenberg. Image Quality Estimation for Multi-Score OMR. In 16th International Society for Music Information Retrieval Conference, pages 17-23, 2015. ISBN 978-84-606-8853-2. [ bib | .pdf ] |
[Taele2015] |
Paul Taele, Laura Barreto, and Tracy Hammond.
Maestoso: An Intelligent Educational Sketching Tool for Learning
Music Theory.
In 27th Conference on Innovative Applications of Artificial
Intelligence, pages 3999-4005, Austin, Texas, 2015. AAAI Press.
ISBN 0-262-51129-0.
[ bib |
http ]
Learning music theory not only has practical benefits for musicians to write, perform, understand, and express music better, but also for non-musicians to improve critical thinking, math analytical skills, and music appreciation. However, current external tools applicable for learning music theory through writing when human instruction is unavailable are either limited in feedback, lacking a written modality, or assuming already strong familiarity with music theory concepts. In this paper, we describe Maestoso, an educational tool for novice learners to learn music theory through sketching practice of quizzed music structures. Maestoso first automatically recognizes students' sketched input of quizzed concepts, then relies on existing sketch and gesture recognition techniques to automatically recognize the input, and finally generates instructor-emulated feedback. From our evaluations, we demonstrate that Maestoso performs reasonably well on recognizing music structure elements and that novice students can comfortably grasp introductory music theory in a single session.
|
[Wen2015] |
Cuihong Wen, Ana Rebelo, Jing Zhang, and Jamie dos Santos Cardoso.
A new optical music recognition system based on combined neural
network.
Pattern Recognition Letters, 58: 1-7, 2015.
ISSN 0167-8655.
[ bib |
DOI |
http ]
Optical music recognition (OMR) is an important tool to recognize a scanned page of music sheet automatically, which has been applied to preserving music scores. In this paper, we propose a new OMR system to recognize the music symbols without segmentation. We present a new classifier named combined neural network (CNN) that offers superior classification capability. We conduct tests on fifteen pages of music sheets, which are real and scanned images. The tests show that the proposed method constitutes an interesting contribution to OMR.
|
[Alirezazadeh2014] | Fatemeh Alirezazadeh and Mohammad Reza Ahmadzadeh. Effective staff line detection, restoration and removal approach for different quality of scanned handwritten music sheets. Journal of Advanced Computer Science & Technology, 3 (2): 136-142, 2014. [ bib | DOI ] |
[Bainbridge2014] |
David Bainbridge, Xiao Hu, and J. Stephen Downie.
A Musical Progression with Greenstone: How Music Content Analysis and
Linked Data is Helping Redefine the Boundaries to a Music Digital Library.
In 1st International Workshop on Digital Libraries for
Musicology. Association for Computing Machinery, 2014.
ISBN 9781450330022.
[ bib |
DOI ]
Despite the recasting of the web's technical capabilities through Web 2.0, conventional digital library software architectures, from which many of our leading Music Digital Libraries (MDLs) are formed, result in digital resources that are, surprisingly, disconnected from other online sources of information, and embody a "read-only" mindset. Leveraging from Music Information Retrieval (MIR) techniques and Linked Open Data (LOD), in this paper we demonstrate a new form of music digital library that encompasses management, discovery, delivery, and analysis of the musical content it contains. Utilizing open source tools such as Greenstone, audioDB, Meandre, and Apache Jena we present a series of transformations to a musical digital library sourced from audio files that steadily increases the level of support provided to the user for musicological study. While the seed for this work was motivated by better supporting musicologists in a digital library, the developed software architecture alters the boundaries to what is conventionally thought of as a digital library, and in doing so challenges core assumptions made in mainstream digital library software design. Copyright 2014 ACM.
|
[Bui2014] |
Hoang-Nam Bui, In-Seop Na, and Soo-Hyung Kim.
Staff Line Removal Using Line Adjacency Graph and Staff Line Skeleton
for Camera-Based Printed Music Scores.
In 22nd International Conference on Pattern Recognition, pages
2787-2789, 2014.
[ bib |
DOI ]
On camera-based music scores, curved and uneven staff-lines tend to occur more frequently, and with the loss in performance of binarization methods, line thickness variation and space variation between lines are inevitable. We propose a novel and effective staff-line removal method based on the following three main ideas. First, the state-of-the-art staff-line detection method, Stable Path, is used to extract staff-line skeletons of the music score. Second, a line adjacency graph (LAG) model is exploited in a different manner of over-segmentation to cluster pixel runs generated from the run-length encoding (RLE) of the image. Third, a two-pass staff-line removal pipeline called filament filtering is applied to remove clusters lying on the staff-line. Our method shows impressive results on music score images captured from cameras, and gives high performance when applied to the ICDAR/GREC 2013 database.
|
[Calvo-Zaragoza2014] |
Jorge Calvo-Zaragoza and Jose Oncina.
Recognition of Pen-Based Music Notation: The HOMUS Dataset.
In 22nd International Conference on Pattern Recognition, pages
3038-3043. Institute of Electrical & Electronics Engineers (IEEE), 2014.
[ bib |
DOI ]
A profitable way of digitizing a new musical composition is by using a pen-based (online) system, in which the score is created with the sole effort of the composition itself. However, the development of such systems is still largely unexplored. Some studies have been carried out, but the use of small, study-specific datasets has prevented objective comparisons between different approaches. To solve this situation, this work presents the Handwritten Online Musical Symbols (HOMUS) dataset, which consists of 15200 samples of 32 types of musical symbols from 100 different musicians. Several recognition alternatives for the two modalities (online, using the strokes drawn by the pen, and offline, using the image generated after drawing the symbol) are also presented. Some experiments are included, aimed at drawing the main conclusions about the recognition of these data. It is expected that this work can establish a binding point in the field of recognition of online handwritten music notation and serve as a baseline for future developments.
|
[Chanda2014] | Sukalpa Chanda, Debleena Das, Umapada Pal, and Fumitaka Kimura. Offline Hand-Written Musical Symbol Recognition. 14th International Conference on Frontiers in Handwriting Recognition, pages 405-410, 2014. [ bib | DOI | http ] |
[Chen2014] |
Gen-Fang Chen and Jia-Shing Sheu.
An optical music recognition system for traditional Chinese Kunqu
Opera scores written in Gong-Che Notation.
EURASIP Journal on Audio, Speech, and Music Processing, 2014
(1): 7, 2014.
ISSN 1687-4722.
[ bib |
DOI ]
This paper presents an optical music recognition (OMR) system to process the handwritten musical scores of Kunqu Opera written in Gong-Che Notation (GCN). First, it introduces the background of Kunqu Opera and GCN. Kunqu Opera is one of the oldest forms of musical activity, spanning the sixteenth to eighteenth centuries, and GCN has been the most popular notation for recording musical works in China since the seventh century. Many Kunqu Operas that use GCN are available as original manuscripts or photocopies, and transforming these versions into a machine-readable format is a pressing need. The OMR system comprises six stages: image pre-processing, segmentation, feature extraction, symbol recognition, musical semantics, and musical instrument digital interface (MIDI) representation. This paper focuses on the symbol recognition stage and obtains the musical information with Bayesian, genetic algorithm, and K-nearest neighbor classifiers. The experimental results indicate that symbol recognition for Kunqu Opera's handwritten musical scores is effective. This work will help to preserve and popularize Chinese cultural heritage and to store Kunqu Opera scores in a machine-readable format, thereby ensuring the possibility of spreading and performing original Kunqu Opera musical scores.
|
[Chen2014a] | Liang Chen, Rong Jin, and Christopher Raphael. Optical Music Recognition with Human Labeled Constraints. In CHI'14 Workshop on Human-Centred Machine Learning, Toronto, Canada, 2014. [ bib | .pdf ] |
[Church2014] | Maura Church and Michael Scott Cuthbert. Improving Rhythmic Transcriptions via Probability Models Applied Post-OMR. In Hsin-Min Wang, Yi-Hsuan Yang, and Jin Ha Lee, editors, 15th International Society for Music Information Retrieval Conference, pages 643-648, 2014. [ bib | .pdf ] |
[Ding2014] |
Ing-Jr Ding, Chih-Ta Yen, Che-Wei Chang, and He-Zhong Lin.
Optical music recognition of the singer using formant frequency
estimation of vocal fold vibration and lip motion with interpolated GMM
classifiers.
Journal of Vibroengineering, 16 (5): 2572-2581, 2014.
ISSN 1392-8716.
[ bib |
http ]
The main work of this paper is to identify the musical genres of the singer by performing the optical detection of lip motion. Recently, optical music recognition has attracted much attention. Optical music recognition in this study is a type of automatic techniques in information engineering, which can be used to determine the musical style of the singer. This paper proposes a method for optical music recognition where acoustic formant analysis of both vocal fold vibration and lip motion are employed with interpolated Gaussian mixture model (GMM) estimation to perform musical genre classification of the singer. The developed approach for such classification application is called GMM-Formant. Since humming and voiced speech sounds cause periodic vibrations of the vocal folds and then the corresponding motion of the lip, the proposed GMM-Formant firstly operates to acquire the required formant information. Formant information is important acoustic feature data for recognition classification. The proposed GMM-Formant method then uses linear interpolation for combining GMM likelihood estimates and formant evaluation results appropriately. GMM-Formant will effectively adjust the estimated formant feature evaluation outcomes by referring to certain degree of the likelihood score derived from GMM calculations. The superiority and effectiveness of presented GMM-Formant are demonstrated by a series of experiments on musical genre classification of the singer.
|
[Fornes2014] |
Alicia Fornés, Van Cuong Kieu, Muriel Visani, Nicholas Journet, and Anjan
Dutta.
The ICDAR/GREC 2013 Music Scores Competition: Staff Removal.
In Bart Lamiroy and Jean-Marc Ogier, editors, Graphics
Recognition. Current Trends and Challenges, pages 207-220, Berlin,
Heidelberg, 2014. Springer Berlin Heidelberg.
ISBN 978-3-662-44854-0.
[ bib |
http ]
The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide an extended description of the dataset, degradation models, evaluation metrics, the participants' methods and the obtained results that could not be presented in the ICDAR and GREC proceedings due to page limitations.
|
[Fujinaga2014] | Ichiro Fujinaga, Andrew Hankinson, and Julie E. Cumming. Introduction to SIMSSA (Single Interface for Music Score Searching and Analysis). In 1st International Workshop on Digital Libraries for Musicology, pages 1-3. ACM, 2014. [ bib | DOI ] |
[Fujinaga2014a] | Ichiro Fujinaga and Andrew Hankinson. SIMSSA: Single Interface for Music Score Searching and Analysis. Journal of the Japanese Society for Sonic Arts, 6 (3): 25-30, 2014. [ bib | .pdf ] |
[Galea2014] | Dan Gâlea, Florin Rotaru, Silviu-Ioan Bejinariu, Mihai Bulea, Dan Murgu, Simona Pescaru, Vasile Apopei, Mihaela Murgu, and Irina Rusu. A review on printed music recognition system developed in Institute of Computer Science Iasi. Technical Report LXIV, Universitatea Tehnica Gheorghe Asachi din Iasi, 2014. [ bib | .pdf ] |
[Geraud2014] |
Thierry Géraud.
A morphological method for music score staff removal.
In International Conference on Image Processing, pages
2599-2603. Institute of Electrical and Electronics Engineers Inc., 2014.
ISBN 9781479957514.
[ bib |
DOI ]
Removing the staff in music score images is key to improving the recognition of music symbols and, with ancient and degraded handwritten music scores, it is not a straightforward task. In this paper we present the method that won the 2013 staff removal competition organized at the International Conference on Document Analysis and Recognition (ICDAR). The main characteristic of this method is that it essentially relies on mathematical morphology filtering. It is therefore simple and fast, and its full source code is provided to favor reproducible research.
|
[Han2014] | Sejin Han and Gueesang Lee. Optical Music Score Recognition System for Smart Mobile Devices. International Journal of Contents, 10 (4): 63-68, 2014. [ bib | DOI ] |
[Hankinson2014] | Andrew Hankinson. Optical music recognition infrastructure for large-scale music document analysis. PhD thesis, McGill University, 2014. [ bib | http ] |
[Helsen2014] | Kate Helsen, Jennifer Bain, Ichiro Fujinaga, Andrew Hankinson, and Debra Lacoste. Optical music recognition and manuscript chant sources. Early Music, 42 (4): 555-558, 2014. [ bib | DOI ] |
[Homenda2014] |
Wladyslaw Homenda and Wojciech Lesinski.
Decision trees and their families in imbalanced pattern recognition:
Recognition with and without Rejection.
Lecture Notes in Computer Science, 8838: 219-230, 2014.
ISSN 0302-9743.
[ bib |
DOI ]
Decision trees are considered to be among the best classifiers. In this work we apply decision trees and their families to the problem of imbalanced data recognition. We consider recognition both without and with rejection: in the first case it is assumed that all recognized elements belong to desired classes, and in the second that some of them fall outside such classes and are not known at the classifier training stage. The facets of imbalanced data and recognition with rejection affect different real-world problems. In this paper we discuss the results of an experiment on imbalanced data recognition using music notation symbols as a case study. Decision trees and three methods of joining decision trees (simple voting, bagging and random forest) are studied. These methods are used for recognition without and with rejection.
|
[Jastrzebska2014] |
Agnieszka Jastrzebska and Wojciech Lesinski.
Optical Music Recognition as the Case of Imbalanced Pattern
Recognition: A Study of Complex Classifiers.
In International Conference on Systems Science 2013, pages
325-335. Springer International Publishing, Cham, 2014.
ISBN 978-3-319-01857-7.
[ bib |
DOI ]
The article focuses on a particular aspect of classification, namely the imbalance of recognized classes. Imbalanced data adversely affects recognition ability and requires proper classifier construction. The aim of the presented study is to explore the capabilities of classifier combination methods on this problem. In this paper the authors discuss the results of an experiment on imbalanced data recognition using music notation symbols as a case study. The classification methods applied include simple voting, bagging and random forest.
|
[Jastrzebski2014] | Krzysztof Jastrzebski. OMR for sheet music digitization. Master's thesis, Politechnika Wroclawska, 2014. [ bib | .pdf ] |
[Kiriella2014] |
Dawpadee B. Kiriella, Shyama C. Kumari, Kavindu C. Ranasinghe, and Lakshman
Jayaratne.
Music Training Interface for Visually Impaired through a Novel
Approach to Optical Music Recognition.
GSTF Journal on Computing, 3 (4): 45, 2014.
ISSN 2010-2283.
[ bib |
DOI ]
Some inherited barriers which limit human abilities can, surprisingly, be overcome through technology. This research focuses on defining a more reliable and controllable interface for visually impaired people to read and study eastern music notations, which are widely available in printed format. Another underlying concept is that differently-abled people should be assisted in a way that lets them carry out tasks of interest independently. The research provides a means to further investigate the validity of using a controllable auditory interface instead of Braille music scripts converted with the help of third parties. The research further summarizes the requirements raised by the relevant users, the design considerations, and the evaluation results based on user feedback on the proposed interface.
|
[Kodirov2014] |
Elyor Kodirov, Sejin Han, Guee-Sang Lee, and YoungChul Kim.
Music with Harmony: Chord Separation and Recognition in Printed Music
Score Images.
In 8th International Conference on Ubiquitous Information
Management and Communication, pages 1-8, Siem Reap, Cambodia, 2014. ACM.
ISBN 978-1-4503-2644-5.
[ bib |
DOI ]
Optical music recognition systems have recently attracted general interest. These systems achieve accurate symbol recognition at some level. However, chords are not considered in these systems, even though they play a role in music. Therefore, we aimed to develop an algorithm that can deal with the separation and recognition of chords in music score images. Separation is necessary because the chords can be touching, overlapping and/or broken due to noise and other reasons. Considering these problems, we propose a top-down separation using domain information and characteristics of the chords. To handle recognition, we propose a modified zoning method with a k-nearest neighbor classifier. We also analyzed several classifiers with different features to see which method is reliable for chord recognition. Since this topic has not previously been considered with special focus, there is no standard benchmark for evaluating the performance of the algorithm. Thus, we introduce a new dataset, namely OMR-ChSR6306, which includes a wide range of chords such as single chords, touching chords, and overlapping chords. Experiments on the proposed dataset demonstrate that our algorithm can separate and recognize the chords, with 100% and 98.98% recognition accuracy respectively.
|
[Kusakunniran2014] | Worapan Kusakunniran, Attapol Prempanichnukul, Arthid Maneesutham, Kullachut Chocksawud, Suparus Tongsamui, and Kittikhun Thongkanchorn. Optical music recognition for traditional Thai sheet music. In International Computer Science and Engineering Conference, pages 157-162. IEEE, 2014. [ bib | DOI ] |
[Mehta2014] | Apurva Ashokbhai Mehta and Malay S. Bhatt. Practical Issues in the Field of Optical Music Recognition. International Journal of Advance Research in Computer Science and Management Studies, 2 (1): 513-518, 2014. ISSN 2321-7782. Dubious Journal. [ bib | .pdf ] |
[Montagner2014] |
Igor dos Santos Montagner, Roberto Hirata Jr., and Nina S. T. Hirata.
Learning to remove staff lines from music score images.
In International Conference on Image Processing, pages
2614-2618, 2014.
[ bib |
DOI ]
The methods for removal of staff lines rely on characteristics specific to musical documents and they are usually not robust to some types of imperfections in the images. To overcome this limitation, we propose the use of binary morphological operator learning, a technique that estimates a local operator from a set of example images. Experimental results in both synthetic and real images show that our approach can adapt to different types of deformations and achieves similar or better performance than existing methods in most of the test scenarios.
|
[Montagner2014a] |
Igor dos Santos Montagner, Roberto Hirata Jr., and Nina S. T. Hirata.
A Machine Learning based method for Staff Removal.
In 22nd International Conference on Pattern Recognition, pages
3162-3167. Institute of Electrical and Electronics Engineers Inc.,
2014.
ISBN 9781479952083.
[ bib |
DOI ]
Staff line removal is an important pre-processing step to convert content of music score images to machine readable formats. Many heuristic algorithms have been proposed for staff removal and recently a competition was organized in the 2013 ICDAR/GREC conference. Music score images are often subject to different deformations and variations, and existing algorithms do not work well for all cases. We investigate the application of a machine learning based method for the staff removal problem. The method consists in learning multiple image operators from training input-output pairs of images and then combining the results of these operators. Each operator is based on local information provided by a neighborhood window, which is usually manually chosen based on the content of the images. We propose a feature selection based approach for automatically defining the windows and also for combining the operators. The performance of the proposed method is superior to several existing methods and is comparable to the best method in the competition.
|
[Ng2014] | Kia Ng, Alex McLean, and Alan Marsden. Big Data Optical Music Recognition with Multi Images and Multi Recognisers. In EVA London 2014 on Electronic Visualisation and the Arts, pages 215-218. BCS, 2014. [ bib | DOI | .pdf ] |
[Nguyen2014] |
Hong Quy Nguyen, Hyung-Jeong Yang, Soo-Hyung Kim, and Guee-Sang Lee.
Automatic Touching Detection and Recognition of Music Chord Using
Auto-encoding and Softmax.
In 8th International Conference on Ubiquitous Information
Management and Communication, Siem Reap, 2014. Association for Computing
Machinery.
[ bib |
DOI ]
Humankind envisioned an age of automatic where many machines perform all cumbersome and tedious tasks and we just enjoy. Playing music is not a tedious work but a program that plays music from music sheet image automatically can increase productivity of musician or bring convenience to amateurs. Following its requirement, we studied a specific task in Optical Music Recognition problem that is touching chord. Specially, touching chord becomes a critical problem on mobile device captured image because of some objective conditions. In this paper we showed our proposed method which used Autoencoder and Softmax classifier. The experiment results showed that our method is very promising. We get 94.117 96.261% in separate phase.
|
[Nhat2014] |
Vo Quang Nhat and GueeSang Lee.
Adaptive Line Fitting for Staff Detection in Handwritten Music Score
Images.
In 8th International Conference on Ubiquitous Information
Management and Communication, pages 991-996, Siem Reap, Cambodia, 2014.
ACM.
ISBN 978-1-4503-2644-5.
[ bib |
DOI ]
The target of staff line detection is to extract staff lines accurately in order to remove them while preserving the shape of musical symbols. Several studies on staff line detection and removal report good results with printed scores. However, in the case of handwritten music scores, detecting staff lines remains problematic due to the diversity of musical symbol shapes, line curvature and disconnection. In this paper, we present a novel line fitting method for detecting staff lines in handwritten music score images. Our method starts with the estimation of staff line height and staff space height. Then the staff segments are selected. Based on these staff candidates, we construct a line with the orientation of the staff segment and gradually fit it to the real lines. The staff line is then removed and the process continues until no line is detected. To show the effectiveness of our proposed method with different types of handwritten music score, images from the ICDAR/GREC 2013 dataset are tested. The experimental results show the advantages of our algorithm compared with previous approaches.
|
[Padilla2014] |
Victor Padilla, Alan Marsden, Alex McLean, and Kia Ng.
Improving OMR for Digital Music Libraries with Multiple Recognisers
and Multiple Sources.
In 1st International Workshop on Digital Libraries for
Musicology, pages 1-8, London, United Kingdom, 2014. ACM.
ISBN 978-1-4503-3002-2.
[ bib |
DOI ]
Large quantities of scanned music are now available in public digital music libraries. However, the information in such sources is represented as pixel data in images rather than symbolic information about the notes of a piece of music, and therefore it is opaque to musically meaningful computational processes (e.g., to search for a particular melodic pattern). Optical Music Recognition (Optical Character Recognition for music) holds out the prospect of a solution to this issue and of allowing access to very large quantities of musical information in digital libraries. Despite the efforts made by the different commercial OMR developers to improve the accuracy of their systems, mistakes in the output are currently too frequent to make OMR a practical tool for bulk processing. One possibility for improving the accuracy of OMR is to use multiple recognisers and combine the results to achieve an output better than each of them individually. The general process presented here can be divided into three subtasks, S1, S2, and S3. S1 focuses on the correction of rhythmical errors at bar level: counting the errors of the different OMR outputs, establishing a ranking of the results, and making a pairwise alignment to select the best measures. S2 is based on the alignment and voting of individual symbols. For this task we have implemented a conversion of the most important symbols to a simple grammar. Finally, S3 improves the output of S2 by comparing and adding symbols from S1 and detecting gaps through the alignment of wrong measures. The process described in this paper is part of our "Big Data Approach" where a large amount of data is available in music score libraries, such as the International Music Score Library Project (IMSLP), for the purpose of Music Information Retrieval (MIR).
|
[Ramirez2014] |
Carolina Ramirez and Jun Ohya.
Automatic Recognition of Square Notation Symbols in Western
Plainchant Manuscripts.
Journal of New Music Research, 43 (4): 390-399, 2014.
ISSN 0929-8215.
[ bib |
DOI ]
While the Optical Music Recognition (OMR) of printed and handwritten music scores in modern standard notation has been broadly studied, this is not the case for early music manuscripts. This is mainly due to the high variability in the sources introduced by their severe physical degradation, the lack of notation standards and, in the case of the scanned versions, by non-homogeneous image-acquisition protocols. The volume of early musical manuscripts available is considerable, and therefore we believe that computational methods can be extremely useful in helping to preserve, share and analyse this information. This paper presents an approach to recognizing handwritten square musical notation in degraded western plainchant manuscripts from the XIVth to XVIth centuries. We propose the use of image processing techniques that behave robustly under high data variability and which do not require strong hypotheses regarding the condition of the sources. The main differences from traditional OMR approaches are our avoidance of the staff line removal stage and the use of grey-level images to perform primitive segmentation and feature extraction. We used 136 images from the Digital Scriptorium repository (DS, 2007), from which we were able to extract over 90% of all symbols present. For symbol classification, we used gradient-based features and SVM classifiers, obtaining over 90% over eight basic symbol classes.
|
[Saitis2014] | Charalampos Saitis, Andrew Hankinson, and Ichiro Fujinaga. Correcting Large-Scale OMR Data with Crowdsourcing. In 1st International Workshop on Digital Libraries for Musicology, pages 1-3. ACM, 2014. [ bib | DOI ] |
[Stramer2014] | Tal Stramer. Digitizing sheet music. Technical report, Stanford University, 2014. [ bib | .pdf ] |
[Vo2014] |
Quang Nhat Vo, Tam Nguyen, Soo-Hyung Kim, Hyung-Jeong Yang, and Guee-Sang Lee.
Distorted music score recognition without Staffline removal.
In 22nd International Conference on Pattern Recognition, pages
2956-2960. Institute of Electrical and Electronics Engineers Inc., 2014.
ISBN 9781479952083.
[ bib |
DOI |
http ]
This paper proposes a new approach for recognizing the primitive musical symbols in distorted music scores without staff line removal. We try to overcome two main issues. The first problem is the difficult and unreliable removal of staff lines required as a pre-processing step by most recognition systems. The second problem is the non-linear distortion of the music score images captured by digital cameras. First, we detect the locations of bar-lines on each staff and segment it into sub-areas which can be rectified into undistorted shapes by biquadratic transformation. Then, musical rules, template matching, run length coding and projection methods are employed to extract the musical note information without the application of staff removal. The proposed method is implemented on smart phones and shows promising results.
|
[Wallner2014] | Matthias Wallner. A System for Optical Music Recognition and Audio Synthesis. Master's thesis, TU Wien, 2014. [ bib | .pdf ] |
[Wen2014] |
Cuihong Wen, Ana Rebelo, Jing Zhang, and Jaime dos Santos Cardoso.
Classification of optical music symbols based on combined neural
network.
In International Conference on Mechatronics and Control, pages
419-423, 2014.
[ bib |
DOI ]
In this paper, a new method for music symbol classification named Combined Neural Network (CNN) is proposed. Tests are conducted on more than 9000 music symbols from both real and scanned music sheets, which show that the proposed technique offers superior classification capability. At the same time, the performance of the new network is compared with the single Neural Network (NN) classifier using the same music scores. The average classification accuracy increased more than ten percent, reaching 98.82%.
|
[Chen2013] | Yung-Sheng Chen, Feng-Sheng Chen, and Chin-Hung Teng. An Optical Music Recognition System for Skew or Inverted Musical Scores. International Journal of Pattern Recognition and Artificial Intelligence, 27 (07), 2013. [ bib | DOI ] |
[Fornes2013] |
Alicia Fornés, Anjan Dutta, Albert Gordo, and Josep Lladós.
The 2012 Music Scores Competitions: Staff Removal and Writer
Identification.
In Young-Bin Kwon and Jean-Marc Ogier, editors, Graphics
Recognition. New Trends and Challenges, pages 173-186, Berlin, Heidelberg,
2013. Springer Berlin Heidelberg.
ISBN 978-3-642-36824-0.
[ bib |
DOI ]
Since there has been a growing interest in the analysis of handwritten music scores, we have tried to foster this interest by proposing in ICDAR and GREC two different competitions: Staff removal and Writer identification. Both competitions have been tested on the CVC-MUSCIMA database of handwritten music score images. In the corresponding ICDAR publication, we have described the ground-truth, the evaluation metrics, the participants' methods and results. As a result of the discussions with attendees in ICDAR and GREC concerning our music competition, we decided to propose a new experiment for an extended competition. Thus, this paper is focused on this extended competition, describing the new set of images and analyzing the new results.
|
[Gordo2013] |
Albert Gordo, Alicia Fornés, and Ernest Valveny.
Writer identification in handwritten musical scores with bags of
notes.
Pattern Recognition, 46 (5): 1337-1345, 2013.
ISSN 0031-3203.
[ bib |
DOI |
http ]
Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset.
|
[Hankinson2013] | Andrew Hankinson and Ichiro Fujinaga. Using optical music recognition to navigate and retrieve music documents. In Conference of the International Association of Music Libraries, Vienna, Austria, 2013. [ bib | .pdf ] |
[Malik2013] |
Rakesh Malik, Partha Pratim Roy, Umapada Pal, and Fumitaka Kimura.
Handwritten Musical Document Retrieval Using Music-Score Spotting.
In 12th International Conference on Document Analysis and
Recognition, pages 832-836, 2013.
[ bib |
DOI ]
In this paper, we present a novel approach for retrieval of handwritten musical documents using a query sequence/word of musical scores. In our algorithm, the musical score-words are described as sequences of symbols generated from a universal codebook vocabulary of musical scores. Staff lines are first removed from musical documents using structural analysis of staff lines, and the symbol codebook vocabulary is created offline. Next, using this symbol codebook, the music symbol information in each document image is encoded. Given a query sequence of musical symbols in a musical score-line, the symbols in the query are searched in each of these encoded documents. Finally, a sub-string matching algorithm is applied to find query words. For the codebook, two different feature extraction methods, namely Zernike Moments and 400-dimensional gradient features, are tested and two unsupervised classifiers using SOM and K-Means are evaluated. The results are compared with a baseline approach of DTW. The performance is measured on a collection of handwritten musical documents and results are promising.
|
[Pugin2013] | Laurent Pugin and Tim Crawford. Evaluating OMR on the Early Music Online Collection. In Alceu de Souza Britto Jr., Fabien Gouyon, and Simon Dixon, editors, 14th International Society for Music Information Retrieval Conference, pages 439-444, Curitiba, Brazil, 2013. [ bib | .pdf ] |
[Raphael2013] | Christopher Raphael and Rong Jin. Optical music recognition on the international music score library project. In IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2013. [ bib | DOI ] |
[Rebelo2013] | Ana Rebelo, André Marçal, and Jaime dos Santos Cardoso. Global constraints for syntactic consistency in OMR: an ongoing approach. In International Conference on Image Analysis and Recognition, 2013. [ bib | .pdf ] |
[Rebelo2013a] |
Ana Rebelo and Jaime dos Santos Cardoso.
Staff Line Detection and Removal in the Grayscale Domain.
In 12th International Conference on Document Analysis and
Recognition, pages 57-61, 2013.
[ bib |
DOI ]
The detection of staff lines is the first step of most Optical Music Recognition (OMR) systems. Its great significance derives from the ease with which we can then proceed with the extraction of musical symbols. All OMR tasks are usually achieved using binary images by setting thresholds that can be local or global. These techniques, however, may remove relevant information from the music sheet and introduce artifacts which will degrade results in the later stages of the process. There is therefore a need for a method that reduces the loss of information due to binarization. The baseline for the methodology proposed in this paper follows the shortest path algorithm proposed in [CardosoTPAMI08]. The concept of strong staff pixels (SSPs), which is a set of pixels with a high probability of belonging to a staff line, is proposed to guide the cost function. The SSPs make it possible to improve on the results of binary-based detection and to generalize the binary framework to grayscale music scores. The proposed methodology achieves good results.
|
[Sapp2013] | Craig Sapp. OMR Comparison of SmartScore and SharpEye. https://ccrma.stanford.edu/~craig/mro-compare-beethoven, 2013. [ bib | http ] |
[Silva2013] | Rui Miguel Filipe da Silva. Mobile framework for recognition of musical characters. Master's thesis, Universidade do Porto, 2013. [ bib | .pdf ] |
[Tambouratzis2013] |
Tatiana Tambouratzis.
The Digital Music Stand as a Minimal Processing Custom-Made Optical
Music Recognition System, Part 1: Key Music Symbol Recognition.
International Journal of Intelligent Systems, 28 (5):
474-504, 2013.
ISSN 0884-8173.
[ bib |
DOI ]
The digital music stand is proposed as a minimal-processing optical music recognition implementation, where music score (MS) presentation is realized without prior alignment, noise, or staff line removal. After each MS page is segmented into systems, staves, measures, and candidate music symbols, music symbol recognition is accomplished via probabilistic neural networks: Only the key music symbols (namely clefs, global accidentals, time signatures) of the MS are identified, while the remaining music symbols are generally classified. Subsequently, satisfactory quality of on-screen MS viewing is accomplished via the concatenation and/or substitution of appropriately selected parts and isolated music symbols of the original MS. In this piece of research, the processing stages leading to on-screen MS presentation are detailed.
|
[Timofte2013] |
Radu Timofte and Luc Van Gool.
Automatic Stave Discovery for Musical Facsimiles.
In Kyoung Mu Lee, Yasuyuki Matsushita, James M. Rehg, and Zhanyi Hu,
editors, Computer Vision - ACCV 2012, pages 510-523, Berlin,
Heidelberg, 2013. Springer Berlin Heidelberg.
ISBN 978-3-642-37447-0.
[ bib |
DOI ]
Lately, there is an increased interest in the analysis of music score facsimiles, aiming at automatic digitization and recognition. Noise, corruption, variations in handwriting, non-standard page layouts and notations are common problems affecting especially the centuries-old manuscripts.
|
[Vigliensoni2013] | Gabriel Vigliensoni, Gregory Burlet, and Ichiro Fujinaga. Optical measure recognition in common music notation. In 14th International Society for Music Information Retrieval Conference, Curitiba, Brazil, 2013. [ bib | .pdf ] |
[Visaniy2013] |
Muriel Visani, V.C. Kieu, Alicia Fornés, and Nicholas Journet.
The ICDAR 2013 Music Scores Competition: Staff Removal.
In 12th International Conference on Document Analysis and
Recognition, pages 1407-1411, 2013.
[ bib |
DOI ]
The first competition on music scores that was organized at ICDAR in 2011 awoke the interest of researchers, who participated in both the staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario: old music scores. For this purpose, we have generated a new set of images using two kinds of degradations: local noise and 3D distortions. This paper describes the dataset, distortion methods, evaluation metrics, the participants' methods and the obtained results.
|
[Witt2013] |
Carl Witt.
Optical Music Recognition Symbol Detection using Contour Traces,
2013.
[ bib ]
A novel approach to symbol detection in optical music recognition is presented. The binarized image of a scanned score is transformed into an intermediate representation by computing its contours and assigning additional visual features to them. The resulting contour points are accessed via a high dimensional spatial index that aids a heuristic search to detect a given symbol as described by a template image. An automatic and a manual method for generating ground truth data are presented, amongst other web-based tools to evaluate and supervise the recognition process.
|
[Baba2012] | Tetsuaki Baba, Yuya Kikukawa, Toshiki Yoshiike, Tatsuhiko Suzuki, Rika Shoji, Kumiko Kushiyama, and Makoto Aoki. Gocen: A Handwritten Notational Interface for Musical Performance and Learning Music. In ACM SIGGRAPH 2012 Emerging Technologies, pages 9-9, New York, USA, 2012. ACM. ISBN 978-1-4503-1680-4. [ bib | DOI ] |
[Burlet2012] | Gregory Burlet, Alastair Porter, Andrew Hankinson, and Ichiro Fujinaga. Neon.js: Neume Editor Online. In 13th International Society for Music Information Retrieval Conference, pages 121-126, Porto, Portugal, 2012. [ bib | .pdf ] |
[Fornes2012] |
Alicia Fornés, Anjan Dutta, Albert Gordo, and Josep Lladós.
CVC-MUSCIMA: A Ground-truth of Handwritten Music Score Images for
Writer Identification and Staff Removal.
International Journal on Document Analysis and Recognition, 15
(3): 243-251, 2012.
ISSN 1433-2825.
[ bib |
DOI ]
The analysis of music scores has been an active research field in the last decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper, we present the CVC-MUSCIMA database and ground truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground truth, partitioning, and evaluation metrics, we also provide some baseline results for easing the comparison between different approaches.
|
[Hankinson2012] | Andrew Hankinson, John Ashley Burgoyne, Gabriel Vigliensoni, Alastair Porter, Jessica Thompson, Wendy Liu, Remi Chiu, and Ichiro Fujinaga. Digital Document Image Retrieval Using Optical Music Recognition. In Fabien Gouyon, Perfecto Herrera, Luis Gustavo Martins, and Meinard Müller, editors, 13th International Society for Music Information Retrieval Conference, pages 577-582, 2012b. [ bib | .pdf ] |
[Hankinson2012a] | Andrew Hankinson, John Ashley Burgoyne, Gabriel Vigliensoni, and Ichiro Fujinaga. Creating a Large-scale Searchable Digital Collection from Printed Music Materials. In 21st International Conference on World Wide Web, pages 903-908, Lyon, France, 2012a. ACM. ISBN 978-1-4503-1230-1. [ bib | DOI ] |
[Hankinson2012b] | Andrew Hankinson and Ichiro Fujinaga. SIMSSA: Single Interface for Music Score Searching and Analysis. In Conference of the International Association of Music Libraries, Montréal, QC, 2012. [ bib | .pdf ] |
[Hankinson2012c] | Andrew Hankinson. Optical Music Recognition Bibliography. http://ddmal.music.mcgill.ca/research/omr/omr_bibliography, 2012. [ bib | http ] |
[Jin2012] | Rong Jin and Christopher Raphael. Interpreting Rhythm in Optical Music Recognition. In Fabien Gouyon, Perfecto Herrera, Luis Gustavo Martins, and Meinard Müller, editors, 13th International Society for Music Information Retrieval Conference, pages 151-156, Porto, Portugal, 2012. [ bib | .pdf ] |
[Liu2012] |
Xiaoxiang Liu.
Note Symbol Recognition for Music Scores.
In Jeng-Shyang Pan, Shyi-Ming Chen, and Ngoc Thanh Nguyen, editors,
Intelligent Information and Database Systems, pages 263-273, Berlin,
Heidelberg, 2012. Springer Berlin Heidelberg.
ISBN 978-3-642-28490-8.
[ bib |
http ]
Note symbol recognition plays a fundamental role in the process of an OMR system. In this paper, we propose new approaches for recognizing notes by extracting primitives and assembling them into constructed symbols. Firstly, we propose robust algorithms for extracting primitives (stems, noteheads and beams) based on Run-Length Encoding. Secondly, introduce the concept of interaction field to describe the relationship between primitives, and define six hierarchical categories for the structure of notes. Thirdly, propose an effective sequence to assemble the primitives into notes, guided by the mechanism of giving priority to the key structures. To evaluate the performance of those approaches, we present experimental results on real-life scores and comparisons with commercial systems. The results show our approaches can recognize notes with high-accuracy and powerful adaptability, especially for the complicated scores with high density of symbols.
|
[Low2012] | Grady Low and Yung-Ho Chang. Optical Music Recognition Application, 2012. [ bib | .pdf ] |
[Luangnapa2012] | Nawapon Luangnapa, Thongchai Silpavarangkura, Chakarida Nukoolkit, and Pornchai Mongkolnam. Optical Music Recognition on Android Platform. In International Conference on Advances in Information Technology, pages 106-115. Springer, 2012. [ bib | DOI ] |
[Rebelo2012] | Ana Rebelo, Ichiro Fujinaga, Filipe Paszkiewicz, Andre R.S. Marcal, Carlos Guedes, and Jamie dos Santos Cardoso. Optical music recognition: state-of-the-art and open issues. International Journal of Multimedia Information Retrieval, 1 (3): 173-190, 2012. [ bib | DOI ] |
[Rebelo2012a] | Ana Rebelo. Robust Optical Recognition of Handwritten Musical Scores based on Domain Knowledge. PhD thesis, University of Porto, 2012. [ bib | .pdf ] |
[Sebastien2012] | Véronique Sébastien, Henri Ralambondrainy, Olivier Sébastien, and Noël Conruyt. Score Analyzer: Automatically Determining Scores Difficulty Level for Instrumental e-Learning. In Fabien Gouyon, Perfecto Herrera, Luis Gustavo Martins, and Meinard Müller, editors, 13th International Society for Music Information Retrieval Conference, pages 571-576, Porto, Portugal, 2012. [ bib | .pdf ] |
[Su2012] |
Bolan Su, Shijian Lu, Umapada Pal, and Chew Lim Tan.
An effective staff detection and removal technique for musical
documents.
In 10th International Workshop on Document Analysis Systems,
pages 160-164. IEEE, 2012.
ISBN 9780769546612.
[ bib |
DOI ]
Musical staff line detection and removal techniques detect the staff positions in musical documents and segment musical score from musical documents by removing those staff lines. It is an important preprocessing step for ensuing the Optical Music Recognition ...
|
[Tsandilas2012] | Theophanis Tsandilas. Interpreting Strokes on Paper with a Mobile Assistant. In 25th Annual ACM Symposium on User Interface Software and Technology, pages 299-308, Cambridge, Massachusetts, USA, 2012. ACM. ISBN 978-1-4503-1580-7. [ bib | DOI ] |
[Vidal2012] | Vitor Hugo Couto Vidal. Optical Music Recognition in the grey-scale domain. Technical report, Universidade do Porto, 2012. [ bib | .pdf ] |
[Yin-xian2012] |
Yang Yin-xian and Yang Ding-li.
Staff Line Removal Algorithm Based on Trajectory Tracking and
Topological Structure of Score.
In 4th International Conference on Computer Modeling and
Simulation, 2012.
[ bib ]
Staff line removal plays a vital role in OMR technology, and is the preconditions of succeeding segmentation & recognition of music sheets. For the phenomena of over-deletion or mistaken deletion and under-deletion which often appear in removal process of staff lines, a novel staff line removal algorithm based on trajectory tracking and topological structure of music symbols is put forward to solve the deletion faults of partial notions. Experimental results show the presented algorithms can remove staff lines fast and effectively.
|
[Bugge2011] |
Esben Paul Bugge, Kim Lundsteen Juncher, Brian Soborg Mathiasen, and Jakob Grue
Simonsen.
Using Sequence Alignment and Voting To Improve Optical Music
Recognition From Multiple Recognizers.
In 12th International Society for Music Information Retrieval
Conference, pages 405-410, 2011.
ISBN 9780615548654.
[ bib |
.pdf ]
Digitalizing sheet music using Optical Music Recognition (OMR) is error-prone, especially when using noisy images created from scanned prints. Inspired by DNA-sequence alignment, we devise a method to use multiple sequence alignment to automatically compare output from multiple third party OMR tools and perform automatic error-correction of pitch and duration of notes. We perform tests on a corpus of 49 one-page scores of varying quality. Our method on average reduces the amount of errors from an ensemble of 4 commercial OMR tools. The method achieves, on average, fewer errors than each recognizer by itself, but statistical tests show that it is significantly better than only 2 of the 4 commercial recognizers. The results suggest that recognizers may be improved somewhat by sequence alignment and voting, but that more elaborate methods may be needed to obtain substantial improvements. All software, scanned music data used for testing, and experiment protocols are open source and available at: http://code.google.com/p/omr-errorcorrection/
|
[Fornes2011] |
Alicia Fornés, Anjan Dutta, Albert Gordo, and Josep Lladós.
The ICDAR 2011 Music Scores Competition: Staff Removal and Writer
Identification.
In International Conference on Document Analysis and
Recognition, pages 1511-1515, 2011.
[ bib |
DOI ]
In the last years, there has been a growing interest in the analysis of handwritten music scores. In this sense, our goal has been to foster the interest in the analysis of handwritten music scores by the proposal of two different competitions: Staff removal and Writer Identification. Both competitions have been tested on the CVC-MUSCIMA database: a ground-truth of handwritten music score images. This paper describes the competition details, including the dataset and ground-truth, the evaluation metrics, and a short description of the participants, their methods, and the obtained results.
|
[Min2011] |
Du Min.
Research on numbered musical notation recognition and performance in
a intelligent system.
In International Conference on Business Management and
Electronic Information, pages 340-343, 2011.
[ bib |
DOI ]
A intelligent system with numbered musical notation recognition and performance (NMRPIS) is presented which is based on notation recognition and can play digital music automatically. The system combines with OMR to analyze musical notation, interpret completely, form the output quickly and efficiently by the embedded program. The experimental result indicates this system has high classification rate and higher recognition performance.
|
[Pinto2011] |
Telmo Pinto, Ana Rebelo, Gilson Giraldi, and Jamie dos Santos Cardoso.
Music Score Binarization Based on Domain Knowledge.
In Jordi Vitrià, João Miguel Sanches, and Mario
Hernández, editors, Pattern Recognition and Image Analysis, pages
700-708. Springer Berlin Heidelberg, 2011.
ISBN 978-3-642-21257-4.
[ bib |
DOI ]
Image binarization is a common operation in the pre-processing stage in most Optical Music Recognition (OMR) systems. The choice of an appropriate binarization method for handwritten music scores is a difficult problem. Several works have already evaluated the performance of existing binarization processes in diverse applications. However, no goal-directed studies for music sheets documents were carried out. This paper presents a novel binarization method based in the content knowledge of the image. The method only needs the estimation of the staffline thickness and the vertical distance between two stafflines. This information is extracted directly from the gray level music score. The proposed binarization procedure is experimentally compared with several state of the art methods.
|
[Raphael2011] | Christopher Raphael. Optical Music Recognition on the IMSLP. Technical report, Indiana University, Bloomington, 2011. [ bib ] |
[Raphael2011a] | Christopher Raphael and Jingya Wang. New Approaches to Optical Music Recognition. In Anssi Klapuri and Colby Leider, editors, 12th International Society for Music Information Retrieval Conference, pages 305-310, Miami, Florida, 2011. University of Miami. [ bib | .pdf ] |
[Rebelo2011] |
Ana Rebelo, Jakub Tkaczuk, Ricardo Sousa, and Jamie dos Santos Cardoso.
Metric Learning for Music Symbol Recognition.
In 10th International Conference on Machine Learning and
Applications and Workshops, pages 106-111, 2011b.
[ bib |
DOI ]
Although Optical Music Recognition (OMR) has been the focus of much research for decades, the processing of handwritten musical scores is not yet satisfactory. The efforts made to find robust symbol representations and learning methodologies have not found a similar quality in the learning of the dissimilarity concept. Simple Euclidean distances are often used to measure dissimilarity between different examples. However, such distances do not necessarily yield the best performance. In this paper, we propose to learn the best distance for the k-nearest neighbor (k-NN) classifier. The distance concept will be tuned both for the application domain and the adopted representation for the music symbols. The performance of the method is compared with the support vector machine (SVM) classifier using both real and synthetic music scores. The synthetic database includes four types of deformations inducing variability in the printed musical symbols which exist in handwritten music sheets. The work presented here can open new research paths towards a novel automatic musical symbols recognition module for handwritten scores.
|
[Rebelo2011a] | Ana Rebelo, Filipe Paszkiewicz, Carlos Guedes, Andre R. S. Marcal, and Jamie dos Santos Cardoso. A Method for Music Symbols Extraction based on Musical Rules. In Bridges 2011: Mathematics, Music, Art, Architecture, Culture, pages 81-88, 2011a. ISBN 098460426X. [ bib | .pdf ] |
[Tambouratzis2011] |
Tatiana Tambouratzis.
Identification of key music symbols for optical music recognition and
on-screen presentation.
In International Joint Conference on Neural Networks, pages
1935-1942, 2011.
[ bib |
DOI ]
A novel optical music recognition (OMR) system is put forward, where the custom-made on-screen presentation of the music score (MS) is promoted via the recognition of key music symbols only. The proposed system does not require perfect manuscript alignment or noise removal. Following the segmentation of each MS page into systems and, subsequently, into staves, staff lines, measures and candidate music symbols (CMS's), music symbol recognition is limited to the identification of the clefs, accidentals and time signatures. Such an implementation entails significantly less computational effort than that required by classic OMR systems, without an observable compromise in the quality of the on-screen presentation of the MS. The identification of the music symbols of interest is performed via probabilistic neural networks (PNN's), which are trained on a small set of exemplars from the MS itself. The initial results are promising in terms of efficiency, identification accuracy and quality of viewing.
|
[Thompson2011] | Jessica Thompson, Andrew Hankinson, and Ichiro Fujinaga. Searching the Liber Usualis: Using CouchDB and ElasticSearch to Query Graphical Music Documents. In 12th International Society for Music Information Retrieval Conference, 2011. [ bib | .pdf ] |
[Vigliensoni2011] | Gabriel Vigliensoni, John Ashley Burgoyne, Andrew Hankinson, and Ichiro Fujinaga. Automatic Pitch Detection in Printed Square Notation. In Anssi Klapuri and Colby Leider, editors, 12th International Society for Music Information Retrieval Conference, pages 423-428, Miami, Florida, 2011. University of Miami. [ bib | .pdf ] |
[Viro2011] |
Vladimir Viro.
Peachnote: Music Score Search and Analysis Platform.
In 12th International Society for Music Information Retrieval
Conference, pages 359-362, Miami, FL, 2011.
[ bib |
.pdf ]
Our system takes the scores in PDF format, runs optical music recognition (OMR) software over them, indexes the data and makes them accessible for querying and data mining. The search engine is built upon Hadoop and HBase and runs on a cluster.
|
[Byrd2010] | Donald Byrd, William Guerin, Megan Schindele, and Ian Knopke. OMR Evaluation and Prospects for Improved OMR via Multiple Recognizers. Technical report, Indiana University, Bloomington, IN, USA, 2010. [ bib | http ] |
[Dutta2010] |
Anjan Dutta, Umapada Pal, Alicia Fornés, and Josep Lladós.
An Efficient Staff Removal Approach from Printed Musical Documents.
In 20th International Conference on Pattern Recognition, pages
1965-1968, 2010.
[ bib |
DOI ]
Staff removal is an important preprocessing step of the Optical Music Recognition (OMR). The process aims to remove the stafflines from a musical document and retain only the musical symbols, later these symbols are used effectively to identify the music information. This paper proposes a simple but robust method to remove stafflines from printed musical scores. In the proposed methodology we have considered a staffline segment as a horizontal linkage of vertical black runs with uniform height. We have used the neighbouring properties of a staffline segment to validate it as a true segment. We have considered the dataset along with the deformations described in for evaluation purpose. From experimentation we have got encouraging results.
|
[Gozzi2010] | Gianmarco Gozzi. OMRJX: A framework for piano scores optical music recognition. Master's thesis, Politecnico di Milano, 2010. [ bib | .pdf ] |
[Hankinson2010] | Andrew Hankinson, Laurent Pugin, and Ichiro Fujinaga. An Interchange Format for Optical Music Recognition Applications. In 11th International Society for Music Information Retrieval Conference, pages 51-56, Utrecht, The Netherlands, 2010. [ bib | .pdf ] |
[Pinto2010] | Telmo Pinto, Ana Rebelo, Gilson Giraldi, and Jamie dos Santos Cardoso. Content Aware Music Score Binarization. Technical report, Universidade do Porto, Portugal, 2010. [ bib | .pdf ] |
[Rebelo2010] |
Ana Rebelo, G. Capela, and Jamie dos Santos Cardoso.
Optical recognition of music symbols.
International Journal on Document Analysis and Recognition, 13
(1): 19-31, 2010.
ISSN 1433-2825.
[ bib |
DOI ]
Many musical works produced in the past are still currently available only as original manuscripts or as photocopies. The preservation of these works requires their digitalization and transformation into a machine-readable format. However, and despite the many research activities on optical music recognition (OMR), the results for handwritten musical scores are far from ideal. Each of the proposed methods lays the emphasis on different properties and therefore makes it difficult to evaluate the efficiency of a proposed method. We present in this article a comparative study of several recognition algorithms of music symbols. After a review of the most common procedures used in this context, their respective performances are compared using both real and synthetic scores. The database of scores was augmented with replicas of the existing patterns, transformed according to an elastic deformation technique. Such transformations aim to introduce invariances in the prediction with respect to the known variability in the symbols, particularly relevant on handwritten works. The following study and the adopted databases can constitute a reference scheme for any researcher who wants to confront a new OMR algorithm face to well-known ones.
|
[Rizo2010] | David Rizo. Symbolic music comparison with tree data structures. PhD thesis, Universidad de Alicante, 2010. [ bib | .pdf ] |
[Burgoyne2009] | John Ashley Burgoyne, Yue Ouyang, Tristan Himmelman, Johanna Devaney, Laurent Pugin, and Ichiro Fujinaga. Lyric Extraction and Recognition on Digital Images of Early Music Sources. In 10th International Society for Music Information Retrieval Conference, pages 723-727, Kobe, Japan, 2009. [ bib | .pdf ] |
[Byrd2009] | Donald Byrd. Studying Music is Difficult and Important: Challenges of Music Knowledge Representation. In Eleanor Selfridge-Field, Frans Wiering, and Geraint A. Wiggins, editors, Knowledge representation for intelligent music processing, number 09051 in Dagstuhl Seminar Proceedings, Wadern, Germany, 2009. Leibniz-Center for Informatics, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany. [ bib | http ] |
[Cardoso2009] |
Jamie dos Santos Cardoso, Artur Capela, Ana Rebelo, Carlos Guedes, and Joaquim
Pinto da Costa.
Staff Detection with Stable Paths.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 31 (6): 1134-1139, 2009.
ISSN 0162-8828.
[ bib |
DOI ]
The preservation of musical works produced in the past requires their digitalization and transformation into a machine-readable format. The processing of handwritten musical scores by computers remains far from ideal. One of the fundamental stages to carry out this task is the staff line detection. We investigate a general-purpose, knowledge-free method for the automatic detection of music staff lines based on a stable path approach. Lines affected by curvature, discontinuities, and inclination are robustly detected. Experimental results show that the proposed technique consistently outperforms well-established algorithms.
|
[Dalitz2009] |
Christoph Dalitz and Christine Pranzas.
German Lute Tablature Recognition.
In 10th International Conference on Document Analysis and
Recognition, pages 371-375, 2009.
[ bib |
DOI ]
This paper describes a document recognition system for 16th century German staffless lute tablature notation. We present methods for page layout analysis, symbol recognition and symbol layout analysis and report error rates for these methods on a variety of historic prints. Page layout analysis is based on horizontal separator lines, which may interfere with other symbols. The proposed algorithm for their detection and removal is also applicable to other single staff line detection problems (like percussion notation), for which common staff line removal algorithms fail.
|
[Fornes2009] | Alicia Fornés, Josep Lladós, Gemma Sánchez, and Horst Bunke. On the Use of Textural Features for Writer Identification in Old Handwritten Music Scores. In 10th International Conference on Document Analysis and Recognition, pages 996-1000, 2009. [ bib | DOI | http ] |
[Fornes2009a] | Alicia Fornés. Writer Identification by a Combination of Graphical Features in the Framework of Old Handwritten Music Scores. PhD thesis, Universitat Autònoma de Barcelona, 2009. [ bib | .pdf ] |
[Fremerey2009] | Christian Fremerey, David Damm, Frank Kurth, and Michael Clausen. Handling Scanned Sheet Music and Audio Recordings in Digital Music Libraries. In International Conference on Acoustics NAG/DAGA, pages 1-2, 2009. [ bib | .pdf ] |
[Genfang2009] |
Chen Genfang, Zhang Wenjun, and Wang Qiuqiu.
Pick-up the Musical Information from Digital Musical Score Based on
Mathematical Morphology and Music Notation.
In 1st International Workshop on Education Technology and
Computer Science, pages 1141-1144, 2009.
[ bib |
DOI ]
The basic rule of musical notation for image processing is analyzed, in this paper. Using the structuring elements of musical notation and the basic algorithms of mathematical morphology, a new recognizing for the musical information of digital musical score is presented, and then the musical information is transformed to MIDI file for the communication and restoration of musical score. The results of experiment show that the statistic average value of recognition rate for musical information from digital musical score is 94.4%, and can be satisfied the practical applied demand, and it is a new way for applications of digital library, musical education, musical theory analysis and so on.
|
[Johansen2009] | Linn Saxrud Johansen. Optical Music Recognition. Master's thesis, University of Oslo, 2009. [ bib | http ] |
[Sharif2009] | Muhammad Sharif, Quratul-Ain Arshad, Mudassar Raza, and Wazir Zada Khan. [COMSCAN]: An Optical Music Recognition System. In 7th International Conference on Frontiers of Information Technology, page 34. ACM, 2009. [ bib | DOI ] |
[Tardon2009] |
Lorenzo J. Tardón, Simone Sammartino, Isabel Barbancho, Verónica
Gómez, and Antonio Oliver.
Optical Music Recognition for Scores Written in White Mensural
Notation.
EURASIP Journal on Image and Video Processing, 2009 (1):
843401, 2009.
ISSN 1687-5281.
[ bib |
DOI ]
An Optical Music Recognition (OMR) system especially adapted for handwritten musical scores of the XVII-th and the early XVIII-th centuries written in white mensural notation is presented. The system performs a complete sequence of analysis stages: the input is the RGB image of the score to be analyzed and, after a preprocessing that returns a black and white image with corrected rotation, the staves are processed to return a score without staff lines; then, a music symbol processing stage isolates the music symbols contained in the score and, finally, the classification process starts to obtain the transcription in a suitable electronic format so that it can be stored or played. This work will help to preserve our cultural heritage keeping the musical information of the scores in a digital format that also gives the possibility to perform and distribute the original music contained in those scores.
|
[Vrist2009] | Søren Bjerregaard Vrist. Optical Music Recognition for structural information from high-quality scanned music, 2009. [ bib ] |
[Bellini2008] |
Pierfrancesco Bellini, Ivan Bruno, and Paolo Nesi.
Optical Music Recognition: Architecture and Algorithms.
In Kia Ng and Paolo Nesi, editors, Interactive Multimedia Music
Technologies, pages 80-110. IGI Global, Hershey, PA, USA, 2008.
[ bib |
http ]
Optical music recognition is a key problem for coding western music sheets in the digital world. This problem has been addressed in several manners obtaining suitable results only when simple music constructs are processed. To this end, several different strategies have been followed, to pass from the simple music sheet image to a complete and consistent representation of music notation symbols (symbolic music notation or representation). Typically, image processing, pattern recognition and symbolic reconstruction are the technologies that have to be considered and applied in several manners the architecture of the so called OMR (Optical Music Recognition) systems. In this chapter, the O3MR (Object Oriented Optical Music Recognition) system is presented. It allows producing from the image of a music sheet the symbolic representation and save it in XML format (WEDELMUSIC XML and MUSICXML). The algorithms used in this process are those of the image processing, image segmentation, neural network pattern recognition, and symbolic reconstruction and reasoning. Most of the solutions can be applied in other field of image understanding. The development of the O3MR solution with all its algorithms has been partially supported by the European Commission, in the IMUTUS Research and Development project, while the related music notation editor has been partially funded by the research and development WEDELMUSIC project of the European Commission. The paper also includes a methodology for the assessment of other OMR systems. The set of metrics proposed has been used to assess the quality of results produce by the O3MR with respect the best OMR on market.
|
[Bullen2008] | Andrew H. Bullen. Bringing Sheet Music to Life: My Experiences with OMR. code4lib Journal, 3 (84), 2008. ISSN 1940-5758. [ bib | http ] |
[Burgoyne2008] | John Ashley Burgoyne, Johanna Devaney, Laurent Pugin, and Ichiro Fujinaga. Enhanced Bleedthrough Correction for Early Music Documents with Recto-Verso Registration. In 9th International Conference on Music Information Retrieval, pages 407-412, Philadelphia, PA, 2008. [ bib | .pdf ] |
[Capela2008] | Artur Capela, Jamie dos Santos Cardoso, Ana Rebelo, and Carlos Guedes. Integrated recognition system for music scores. In International Computer Music Conference, pages 3-6, 2008a. [ bib | http ] |
[Capela2008a] | Artur Capela, Ana Rebelo, Jamie dos Santos Cardoso, and Carlos Guedes. Staff Line Detection and Removal with Stable Paths. In International Conference on Signal Processing and Multimedia Applications, 2008b. [ bib | .pdf ] |
[Cardoso2008] |
Jamie dos Santos Cardoso, Artur Capela, Ana Rebelo, and Carlos Guedes.
A connected path approach for staff detection on a music score.
In 15th International Conference on Image Processing, pages
1005-1008, 2008.
[ bib |
DOI ]
The preservation of many music works produced in the past entails their digitalization and consequent accessibility in an easy-to-manage digital format. Carrying this task manually is very time consuming and error prone. While optical music recognition systems usually perform well on printed scores, the processing of handwritten musical scores by computers remain far from ideal. One of the fundamental stages to carry out this task is the staff line detection. In this paper a new method for the automatic detection of music staff lines based on a connected path approach is presented. Lines affected by curvature, discontinuities, and inclination are robustly detected. Experimental results show that the proposed technique consistently outperforms well-established algorithms.
|
[Craig-McFeely2008] | Julia Craig-McFeely. Digital Image Archive of Medieval Music: The evolution of a digital resource. Digital Medievalist, 3, 2008. [ bib | DOI ] |
[Dalitz2008] |
Christoph Dalitz, Michael Droettboom, Bastian Pranzas, and Ichiro Fujinaga.
A Comparative Study of Staff Removal Algorithms.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 30 (5): 753-766, 2008a.
ISSN 0162-8828.
[ bib |
DOI ]
This paper presents a quantitative comparison of different algorithms for the removal of stafflines from music images. It contains a survey of previously proposed algorithms and suggests a new skeletonization-based approach. We define three different error metrics, compare the algorithms with respect to these metrics, and measure their robustness with respect to certain image defects. Our test images are computer-generated scores on which we apply various image deformations typically found in real-world data. In addition to modern western music notation, our test set also includes historic music notation such as mensural notation and lute tablature. Our general approach and evaluation methodology is not specific to staff removal but applicable to other segmentation problems as well.
|
[Dalitz2008a] |
Christoph Dalitz, Georgios K. Michalakis, and Christine Pranzas.
Optical recognition of psaltic Byzantine chant notation.
International Journal on Document Analysis and Recognition, 11
(3): 143-158, 2008b.
ISSN 1433-2825.
[ bib |
DOI |
http ]
This paper describes a document recognition system for the modern neume based notation of Byzantine music. We propose algorithms for page segmentation, lyrics removal, syntactical symbol grouping and the determination of characteristic page dimensions. All algorithms are experimentally evaluated on a variety of printed books for which we also give an optimal feature set for a nearest neighbour classifier. The system is based on the Gamera framework for document image analysis. Given that we cover all aspects of the recognition process, the paper can also serve as an illustration how a recognition system for a non standard document type can be designed from scratch.
|
[Damm2008] | David Damm, Christian Fremerey, Frank Kurth, Meinard Müller, and Michael Clausen. Multimodal Presentation and Browsing of Music. In 10th International Conference on Multimodal Interfaces, pages 205-208, Chania, Greece, 2008. ACM. ISBN 978-1-60558-198-9. [ bib | DOI | http ] |
[Fornes2008] |
Alicia Fornés, Josep Lladós, Gemma Sánchez, and Horst Bunke.
Writer Identification in Old Handwritten Music Scores.
In 8th International Workshop on Document Analysis Systems,
pages 347-353, Nara, Japan, 2008.
[ bib |
DOI ]
The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores. Even though an important amount of compositions contains handwritten text in the music scores, the aim of our work is to use only music notation to determine the author. The steps of the system proposed are the following. First of all, the music sheet is preprocessed and normalized for obtaining a single binarized music line, without the staff lines. Afterwards, 100 features are extracted for every music line, which are subsequently used in a k-NN classifier that compares every feature vector with prototypes stored in a database. By applying feature selection and extraction methods on the original feature set, the performance is increased. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving a recognition rate of about 95%.
|
[Fornes2008a] |
Alicia Fornés, Josep Lladós, and Gemma Sánchez.
Old Handwritten Musical Symbol Classification by a Dynamic Time
Warping Based Method.
In Wenyin Liu, Josep Lladós, and Jean-Marc Ogier, editors,
Graphics Recognition. Recent Advances and New Opportunities, pages
51-60, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
ISBN 978-3-540-88188-9.
[ bib |
DOI ]
A growing interest in the document analysis field is the recognition of old handwritten documents, towards the conversion into a readable format. The difficulties when we work with old documents are increased, and other techniques are required for recognizing handwritten graphical symbols that are drawn in such documents. In this paper we present a Dynamic Time Warping based method that outperforms the classical descriptors, being also invariant to scale, rotation, and elastic deformations typically found in handwritten musical notation.
|
[Fremerey2008] |
Christian Fremerey, Meinard Müller, Frank Kurth, and Michael Clausen.
Automatic Mapping of Scanned Sheet Music to Audio Recordings.
In 9th International Conference on Music Information
Retrieval, pages 413-418, 2008.
ISBN 978-0-615-24849-3.
[ bib |
.pdf ]
Significant digitization efforts have resulted in large multimodal music collections comprising visual (scanned sheet music) as well as acoustic material (audio recordings). In this paper, we present a novel procedure for mapping scanned pages of sheet music to a given collection of audio recordings by identifying musically corresponding audio clips. To this end, both the scanned images as well as the audio recordings are first transformed into a common feature representation using optical music recognition (OMR) and methods from digital signal processing, respectively. Based on this common representation, a direct comparison of the two different types of data is facilitated. This allows for a search of scan-based queries in the audio collection. We report on systematic experiments conducted on the corpus of Beethoven’s piano sonatas showing that our mapping procedure works with high precision across the two types of music data in the case that there are no severe OMR errors. The proposed mapping procedure is relevant in a real-world application scenario at the Bavarian State Library for automatically identifying and annotating scanned sheet music by means of already available annotated audio material.
|
[Jones2008] |
Graham Jones, Bee Ong, Ivan Bruno, and Kia Ng.
Optical Music Imaging: Music Document Digitisation, Recognition,
Evaluation, and Restoration.
In Interactive multimedia music technologies, pages 50-79.
IGI Global, 2008.
[ bib |
DOI ]
This paper presents the applications and practices in the domain of music imaging for musical scores (music sheets and music manuscripts), which include music sheet digitisation, optical music recognition (OMR) and optical music restoration. With a general background of Optical Music Recognition (OMR), the paper discusses typical obstacles in this domain and reports currently available commercial OMR software. It reports hardware and software related to music imaging, discusses the SharpEye optical music recognition system and provides an evaluation of a number of OMR systems. Besides the main focus on the transformation from images of music scores to symbolic format, this paper also discusses optical music image restoration and the application of music imaging techniques for graphical preservation and potential applications for cross-media integration.
|
[Kolakowska2008] |
Agata Kolakowska.
Applying decision trees to the recognition of musical symbols.
In 1st International Conference on Information Technology,
pages 1-4, 2008.
[ bib |
DOI ]
The paper presents an experimental study on the recognition of printed musical scores. The first part of the study focuses on data preparation. Bitmaps containing musical symbols are converted to feature vectors using various methods. The vectors created in such a way are used to train classifiers which are the essential part of the study. Several decision tree classifiers are applied to this recognition task. These classifiers are created using different decision tree induction methods. The algorithms incorporate different criteria to select attributes in the nodes of the trees. Moreover, some of them apply stopping criteria, whereas the others perform tree pruning. The classification accuracy of the decision trees is estimated on data taken from musical scores. Eventually the usefulness of decision trees in the recognition of printed musical symbols is evaluated.
|
[Kurth2008] |
Frank Kurth, David Damm, Christian Fremerey, Meinard Müller, and Michael
Clausen.
A Framework for Managing Multimodal Digitized Music Collections.
In Birte Christensen-Dalsgaard, Donatella Castelli, Bolette
Ammitzbøll Jurik, and Joan Lippincott, editors, Research and
Advanced Technology for Digital Libraries, pages 334-345, Berlin,
Heidelberg, 2008. Springer Berlin Heidelberg.
ISBN 978-3-540-87599-4.
[ bib |
DOI ]
In this paper, we present a framework for managing heterogeneous, multimodal digitized music collections containing visual music representations (scanned sheet music) as well as acoustic music material (audio recordings). As a first contribution, we propose a preprocessing workflow comprising feature extraction, audio indexing, and music synchronization (linking the visual with the acoustic data). Then, as a second contribution, we introduce novel user interfaces for multimodal music presentation, navigation, and content-based retrieval. In particular, our system offers high quality audio playback with time-synchronous display of the digitized sheet music. Furthermore, our system allows a user to select regions within the scanned pages of a musical score in order to search for musically similar sections within the audio documents. Our novel user interfaces and search functionalities will be integrated into the library service system of the Bavarian State Library as part of the Probado project.
|
[Pugin2008] | Laurent Pugin, Jason Hockman, John Ashley Burgoyne, and Ichiro Fujinaga. Gamera versus Aruspix - Two Optical Music Recognition Approaches. In 9th International Conference on Music Information Retrieval, 2008. [ bib | .pdf ] |
[Rebelo2008] | Ana Rebelo. New Methodologies Towards an Automatic Optical Recognition of Handwritten Musical Scores. Master's thesis, Universidade do Porto, 2008. [ bib | .pdf ] |
[Smiatacz2008] |
Maciej Smiatacz and Witold Malina.
Matrix-based classifiers applied to recognition of musical notation
symbols.
In 1st International Conference on Information Technology,
pages 1-4, 2008.
[ bib |
DOI ]
The paper presents the application of matrix-based classifiers to the problem of automatic recognition of musical notation symbols. The idea of classification algorithms operating on matrices instead of feature vectors is briefly introduced together with a short description of methods that we have recently proposed. The experiments that we report show that the matrix-based approach can be used to improve the effectiveness and usefulness of the OMR system developed in our department as a part of the digital library of musical documents.
|
[Szwoch2008] |
Mariusz Szwoch.
Using MusicXML to Evaluate Accuracy of OMR Systems.
In International Conference on Theory and Application of
Diagrams, pages 419-422, Herrsching, Germany, 2008. Springer,
Springer-Verlag.
ISBN 978-3-540-87729-5.
[ bib |
DOI ]
In this paper a methodology for automatic accuracy evaluation in optical music recognition (OMR) applications is proposed. The presented approach assumes the use of ground truth images together with digital music scores describing their content. The automatic evaluation algorithm measures differences between the tested score and the reference one, both stored in MusicXML format. Some preliminary test results of this approach are presented based on the algorithm’s implementation in the OMR Guido application.
|
[Wei2008] |
Lee Ling Wei, Qussay A. Salih, and Ho Sooi Hock.
Optical Tablature Recognition (OTR) system: Using Fourier
Descriptors as a recognition tool.
In International Conference on Audio, Language and Image
Processing, pages 1532-1539, 2008.
[ bib |
DOI ]
This paper presents an optical recognition system for the guitar tablature. Images of guitar tablature are fed as input to the system whereby each image undergoes four main stages of processing to produce a music output in MIDI format. Algorithms both existing and self-devised were used. Each input image was first cropped to the desired region, followed by a process for removal of the string lines and detection of the numbers. Recognition of the numbers was carried out using Fourier descriptors based on 8 selected feature points. Once completed, the numbers were matched to their corresponding chords and then rearranged and played. The algorithms and methods used within the system are presented here with a justification on the selection of Fourier descriptors as the recognition tool.
|
[Yoo2008] |
JaeMyeong Yoo, Nguyen Dinh Toan, DeokJai Choi, HyukRo Park, and Gueesang Lee.
Advanced Binarization Method for Music Score Recognition Using Local
Thresholds.
In 8th International Conference on Computer and Information
Technology Workshops, pages 417-420, 2008.
[ bib |
DOI ]
Application technology for mobile phones has been developing for the delivery of various contents over a simple voice channel. Music score recognition is one such application service provided by mobile phone manufacturers, which transforms a music score taken by the phone camera into a MIDI file. For the successful recognition of the music score, the input image should be properly binarized to be fed into the recognition process. In this paper, an adaptive binarization algorithm is proposed which exploits local thresholds with several levels to deal with illumination changes over the entire image. Experimental results show advanced performance of music score recognition.
|
[Bellini2007] | Pierfrancesco Bellini, Ivan Bruno, and Paolo Nesi. Assessing Optical Music Recognition Tools. Computer Music Journal, 31 (1): 68-93, 2007. [ bib | DOI ] |
[Burgoyne2007] | John Ashley Burgoyne, Laurent Pugin, Greg Eustace, and Ichiro Fujinaga. A Comparative Survey of Image Binarisation Algorithms for Optical Recognition on Degraded Musical Sources. In 8th International Conference on Music Information Retrieval, 2007. [ bib | .pdf ] |
[Castro2007] |
Pedro Castro and J. R. Caldas Pinto.
Methods for Written Ancient Music Restoration.
In Mohamed Kamel and Aurélio Campilho, editors, Image
Analysis and Recognition, pages 1194-1205, Berlin, Heidelberg, 2007.
Springer Berlin Heidelberg.
ISBN 978-3-540-74260-9.
[ bib |
DOI ]
Degradation in old documents has been a matter of concern for a long time. With the easy access to information provided by technologies such as the Internet, new ways have arisen for consulting those documents without exposing them to yet more dangers of degradation. While restoration methods are present in the literature in relation to text documents and artworks, little attention has been given to the restoration of ancient music. This paper describes and compares different methods to restore images of ancient music documents degraded over time. Six different methods were tested, including global and adaptive thresholding, color clustering and edge detection. In this paper we conclude that those based on Sauvola's thresholding algorithm are best suited for our proposed goal of ancient music restoration.
|
[Castro2007a] |
Pedro Castro, R. J. Almeida, and J. R. Caldas Pinto.
Restoration of Double-Sided Ancient Music Documents with
Bleed-Through.
In Luis Rueda, Domingo Mery, and Josef Kittler, editors,
Progress in Pattern Recognition, Image Analysis and Applications,
pages 940-949, Berlin, Heidelberg, 2007. Springer Berlin Heidelberg.
ISBN 978-3-540-76725-1.
[ bib |
DOI ]
Access to collections of cultural heritage is increasingly becoming a topic of interest for institutions like libraries. With the easy access to information provided by technologies such as the Internet, new ways exist for consulting ancient documents without exposing them to more dangers of degradation. One of those types of documents is written ancient music. These documents suffer from multiple kinds of degradation, where bleed-through stands out as the most damaging. This paper proposes a new method based on the Takagi Sugeno fuzzy classification algorithm to classify the pixels as bleed-through, after performing a general background restoration. This method is applied to a set of double-sided ancient music documents, and the obtained results are compared with methods present in the literature.
|
[Diet2007] | Jürgen Diet and Frank Kurth. The Probado Music Repository at the Bavarian State Library. In 8th International Conference on Music Information Retrieval, pages 501-504, Vienna, Austria, 2007. [ bib | .pdf ] |
[Knopke2007] |
Ian Knopke and Donald Byrd.
Towards Musicdiff : A Foundation for Improved Optical Music
Recognition Using Multiple Recognizers.
In 8th International Conference on Music Information
Retrieval, pages 123-126, Vienna, Austria, 2007.
ISBN 978-3-85403-218.
[ bib |
.pdf ]
This paper presents work towards a “musicdiff” program for comparing files representing different versions of the same piece, primarily in the context of comparing versions produced by different optical music recognition (OMR) programs. Previous work by the current authors and others strongly suggests that using multiple recognizers will make it possible to improve OMR accuracy substantially. The basic methodology requires several stages: documents must be scanned and submitted to several OMR programs, programs whose strengths and weaknesses have previously been evaluated in detail. We discuss techniques we have implemented for normalization, alignment and rudimentary error correction. We also describe a visualization tool for comparing multiple versions on a measure-by-measure basis.
|
[McKay2007] | Cory McKay and Ichiro Fujinaga. Style-independent computer-assisted exploratory analysis of large music collections. Journal of Interdisciplinary Music Studies, 1 (1): 63-85, 2007. [ bib | .pdf ] |
[Pugin2007] | Laurent Pugin, John Ashley Burgoyne, and Ichiro Fujinaga. Goal-directed Evaluation for the Improvement of Optical Music Recognition on Early Music Prints. In 7th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 303-304, Vancouver, Canada, 2007b. ACM. ISBN 978-1-59593-644-8. [ bib | DOI ] |
[Pugin2007a] | Laurent Pugin, John Ashley Burgoyne, and Ichiro Fujinaga. MAP Adaptation to Improve Optical Music Recognition of Early Music Documents Using Hidden Markov Models. In 8th International Conference on Music Information Retrieval, pages 513-516, 2007c. [ bib | .pdf ] |
[Pugin2007b] |
Laurent Pugin, John Ashley Burgoyne, and Ichiro Fujinaga.
Reducing Costs for Digitising Early Music with Dynamic Adaptation.
In László Kovács, Norbert Fuhr, and Carlo Meghini,
editors, Research and Advanced Technology for Digital Libraries, pages
471-474, Berlin, Heidelberg, 2007d. Springer Berlin Heidelberg.
ISBN 978-3-540-74851-9.
[ bib ]
Optical music recognition (OMR) enables librarians to digitise early music sources on a large scale. The cost of expert human labour to correct automatic recognition errors dominates the cost of such projects. To reduce the number of recognition errors in the OMR process, we present an innovative approach to adapt the system dynamically, taking advantage of the human editing work that is part of any digitisation project. The corrected data are used to perform MAP adaptation, a machine-learning technique used previously in speech recognition and optical character recognition (OCR). Our experiments show that this technique can reduce editing costs by more than half.
|
[Pugin2007c] | Laurent Pugin, John Ashley Burgoyne, Douglas Eck, and Ichiro Fujinaga. Book-Adaptive and Book-Dependent Models to Accelerate Digitization of Early Music. Technical report, McGill University, Whistler, BC, 2007a. [ bib | http ] |
[Rebelo2007] |
Ana Rebelo, Artur Capela, Joaquim F. Pinto da Costa, Carlos Guedes, Eurico
Carrapatoso, and Jaime dos Santos Cardoso.
A Shortest Path Approach for Staff Line Detection.
In 3rd International Conference on Automated Production of
Cross Media Content for Multi-Channel Distribution, pages 79-85, 2007.
[ bib |
DOI ]
Many music works produced in the past still exist only as original manuscripts or as photocopies. Preserving them entails their digitalization and consequent accessibility in a digital format easy-to-manage. The manual process to carry out this task is very time consuming and error prone. Optical music recognition (OMR) is a form of structured document image analysis where music symbols are isolated and identified so that the music can be conveniently processed. While OMR systems perform well on printed scores, current methods for reading handwritten musical scores by computers remain far from ideal. One of the fundamental stages of this process is the staff line detection. In this paper a new method for the automatic detection of music stave lines based on a shortest path approach is presented. Lines with some curvature, discontinuities, and inclination are robustly detected. The proposed algorithm behaves favourably when compared experimentally with well-established algorithms.
|
[Szwoch2007] |
Mariusz Szwoch.
Guido: A Musical Score Recognition System.
In 9th International Conference on Document Analysis and
Recognition, pages 809-813, 2007.
[ bib |
DOI ]
This paper presents an optical music recognition system Guido that can automatically recognize the main musical symbols of music scores that were scanned or taken by a digital camera. The application is based on object model of musical notation and uses linguistic approach for symbol interpretation and error correction. The system offers musical editor with a partially automatic error correction.
|
[Bainbridge2006] |
David Bainbridge and Tim Bell.
Identifying music documents in a collection of images.
In 7th International Conference on Music Information
Retrieval, pages 47-52, Victoria, Canada, 2006.
[ bib |
http ]
Digital libraries and search engines are now well-equipped to find images of documents based on queries. Many images of music scores are now available, often mixed up with textual documents and images. For example, using the Google “images” search feature, a search for “Beethoven” will return a number of scores and manuscripts as well as pictures of the composer. In this paper we report on an investigation into methods to mechanically determine if a particular document is indeed a score, so that the user can specify that only musical scores should be returned. The goal is to find a minimal set of features that can be used as a quick test that will be applied to large numbers of documents. A variety of filters were considered, and two promising ones (run-length ratios and Hough transform) were evaluated. We found that a method based around run-lengths in vertical scans (RL) out-performs a comparable algorithm using the Hough transform (HT). On a test set of 1030 images, RL achieved recall and precision of 97.8% and 88.4% respectively while HT achieved 97.8% and 73.5%. In terms of processor time, RL was more than five times as fast as HT.
|
[Byrd2006] | Donald Byrd and Megan Schindele. Prospects for Improving OMR with Multiple Recognizers. In 7th International Conference on Music Information Retrieval, pages 41-46, 2006. ISBN 1-55058-349-2. [ bib | .pdf ] |
[Desaedeleer2006] |
Arnaud F. Desaedeleer.
Reading Sheet Music.
Master's thesis, University of London, 2006.
[ bib |
http ]
Optical Music Recognition is the process of recognising a printed music score and converting it to a format that is understood by computers. This process involves detecting all musical elements present in the music score in such a way that the score can be represented digitally. For example, the score could be recognised and played back through the computer speakers. Much research has been carried out in this area and several approaches to performing OMR have been suggested. A more recent approach involves segmenting the image using a neural network to recognise the segmented symbols from which the score can be reconstructed. This project will survey the different techniques that have been used to perform OMR on printed music scores and an application by the name of OpenOMR will be developed. One of the aims is to create an open source project in which developers in the open source community will be able to contribute their ideas in order to enhance this application and progress the research in the OMR field.
|
[Fornes2006] |
Alicia Fornés, Josep Lladós, and Gemma Sánchez.
Primitive Segmentation in Old Handwritten Music Scores.
In Wenyin Liu and Josep Lladós, editors, Graphics
Recognition. Ten Years Review and Future Perspectives, pages 279-290,
Berlin, Heidelberg, 2006. Springer Berlin Heidelberg.
ISBN 978-3-540-34712-5.
[ bib |
DOI ]
Optical Music Recognition consists in the identification of music information from images of scores. In this paper, we propose a method for the early stages of the recognition: segmentation of staff lines and graphical primitives in handwritten scores. After introducing our work with modern musical scores (where projections and Hough Transform are effectively used), an approach to deal with ancient handwritten scores is presented. The recognition of such old scores is more difficult due to paper degradation and the lack of a standard in musical notation. Our method has been tested with several scores of the 19th century with high performance rates.
|
[Homenda2006] |
Wladyslaw Homenda and Marcin Luckner.
Automatic Knowledge Acquisition: Recognizing Music Notation with
Methods of Centroids and Classifications Trees.
In International Joint Conference on Neural Network, pages
3382-3388, Vancouver, Canada, 2006.
[ bib |
DOI ]
This paper presents a pattern recognition study aimed at music symbols recognition. The study is focused on classification methods of music symbols based on decision trees and clustering method applied to classes of music symbols that face classification problems. Classification is made on the basis of extracted features. A comparison of selected classifiers was made on some classes of notation symbols distorted by a variety of factors such as image noise, printing defects, different fonts, skew and curvature of scanning, and overlapped symbols.
|
[Homenda2006a] |
Wladyslaw Homenda.
Automatic understanding of images: integrated syntactic and semantic
analysis of music notation.
In International Joint Conference on Neural Network, pages
3026-3033, Vancouver, Canada, 2006.
[ bib |
DOI ]
The paper introduces an approach to image processing and recognition based on the perception of images as subjects being exchanged in the man-computer communication. The approach reveals the parallel syntactic and semantic attempts to automatic image understanding. Both attempts are reflected in the paradigms of information granulation and granular computing. The parallel syntactic and semantic processing of images allows for solving problems raised by difficulties and complexity of the detailed syntactic description of images as well as difficulties of detailed semantic analysis. The study presented in this paper is cast on the practical task of the music notation recognition.
|
[Luckner2006] |
Marcin Luckner.
Recognition of Noised Patterns Using Non-Disruption Learning Set.
In 6th International Conference on Intelligent Systems Design
and Applications, pages 557-562, 2006.
[ bib |
DOI ]
In this paper the recognition of strongly noised symbols on the basis of non-disruption patterns is discussed taking music symbols as an example. Although Optical Music Recognition technology is not developed as successfully as OCR technology, several systems do recognize typical musical symbols to quite a good level. However, the recognition of non-typical fonts is still an unsolved issue. In this paper a model of a recognition system for unusual scores is presented. In the model described, non-disruption symbols are used to generate a learning set that makes possible improved recognition, as is presented on a real example of rests and accidentals recognition. Some techniques are presented with various recognition rates and computing times, including supervised and unsupervised ones.
|
[McPherson2006] | John R. McPherson. Coordinating Knowledge To Improve Optical Music Recognition. PhD thesis, The University of Waikato, 2006. [ bib | .pdf ] |
[Pugin2006] | Laurent Pugin. Optical Music Recognition of Early Typographic Prints using Hidden Markov Models. In 7th International Conference on Music Information Retrieval, pages 53-56, Victoria, Canada, 2006a. [ bib | .pdf ] |
[Pugin2006a] | Laurent Pugin. Aruspix: an Automatic Source-Comparison System. Computing in Musicology, 14: 49-59, 2006b. ISSN 1057-9478. [ bib | http ] |
[Pugin2006b] | Laurent Pugin. Lecture et traitement informatique de typographies musicales anciennes: un logiciel de reconnaissance de partitions par modèles de Markov cachés. PhD thesis, Geneva University, Geneva, Switzerland, 2006c. [ bib | DOI ] |
[Rossant2006] |
Florence Rossant and Isabelle Bloch.
Robust and Adaptive OMR System Including Fuzzy Modeling, Fusion of
Musical Rules, and Possible Error Detection.
EURASIP Journal on Advances in Signal Processing, 2007 (1):
081541, 2006.
ISSN 1687-6180.
[ bib |
DOI ]
This paper describes a system for optical music recognition (OMR) in case of monophonic typeset scores. After clarifying the difficulties specific to this domain, we propose appropriate solutions at both image analysis level and high-level interpretation. Thus, a recognition and segmentation method is designed, that allows dealing with common printing defects and numerous symbol interconnections. Then, musical rules are modeled and integrated, in order to make a consistent decision. This high-level interpretation step relies on the fuzzy sets and possibility framework, since it allows dealing with symbol variability, flexibility, and imprecision of music rules, and merging all these heterogeneous pieces of information. Other innovative features are the indication of potential errors and the possibility of applying learning procedures, in order to gain in robustness. Experiments conducted on a large data base show that the proposed method constitutes an interesting contribution to OMR.
|
[Toyama2006] |
Fubito Toyama, Kenji Shoji, and Juichi Miyamichi.
Symbol Recognition of Printed Piano Scores with Touching Symbols.
In 18th International Conference on Pattern Recognition, pages
480-483, 2006.
[ bib |
DOI ]
To build a music database efficiently, an automatic score recognition system is a critical component. Many previous methods are applicable only to some simple music scores. In case of complex music scores it becomes difficult to detect symbols correctly because of noise and connection between symbols included in the scores. In this paper, we propose a score recognition method which is applicable to the complex music scores. Symbol candidates are detected by template matching. From these candidates correct symbols are selected by considering their relative positions and mutual connections. Under the presence of noise and connected symbols, the proposed method outperformed "Score Maker", which is an optical music score recognition software.
|
[Barton2005] | Louis W. G. Barton, John A. Caldwell, and Peter G. Jeavons. E-library of Medieval Chant Manuscript Transcriptions. In 5th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 320-329, Denver, CO, USA, 2005. ACM. ISBN 1-58113-876-8. [ bib | DOI ] |
[Dalitz2005] | Christoph Dalitz and Thomas Karsten. Using the Gamera framework for building a lute tablature recognition system. In 6th International Conference on Music Information Retrieval, pages 478-481, London, UK, 2005. [ bib | .pdf ] |
[Fornes2005] | Alicia Fornés. Analysis of Old Handwritten Musical Scores. Master's thesis, Universitat Autònoma de Barcelona, 2005. [ bib | .pdf ] |
[Gan2005] | Ting Gan. Música Colonial: 18th Century Music Score Meets 21st Century Digitalization Technology. In 5th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 379-379, Denver, USA, 2005. ACM. ISBN 1-58113-876-8. [ bib | DOI ] |
[Homenda2005] |
Wladyslaw Homenda.
Optical Music Recognition: the Case Study of Pattern Recognition.
In Marek Kurzyński, Edward Puchala, Michal Woźniak,
and Andrzej Żolnierek, editors, Computer Recognition Systems,
pages 835-842, Berlin, Heidelberg, 2005. Springer Berlin Heidelberg.
ISBN 978-3-540-32390-7.
[ bib |
DOI ]
The paper presents a pattern recognition study aimed at music notation recognition. The study is focused on practical aspects of optical music recognition; it presents a variety of methods applied in optical music recognition technology. The following logically separated stages of music notation recognition are distinguished: acquiring music notation structure, recognizing symbols of music notation, analyzing contextual information. The directions for OMR package development are drawn.
|
[Rossant2005] |
Florence Rossant and Isabelle Bloch.
Optical music recognition based on a fuzzy modeling of symbol classes
and music writing rules.
In IEEE International Conference on Image Processing 2005,
pages II-538, 2005.
[ bib |
DOI ]
We propose an OMR method based on fuzzy modeling of the information extracted from the scanned score and of musical rules. The aim is to disambiguate the recognition hypotheses output by the individual symbol analysis process. Fuzzy modeling makes it possible to account for imprecision in symbol detection, for typewriting variations, and for flexibility of rules. Tests conducted on a hundred music sheets result in a global recognition rate of 98.55%, and show good performance compared to SmartScore.
|
[Szwoch2005] |
Mariusz Szwoch.
A Robust Detector for Distorted Music Staves.
In André Gagalowicz and Wilfried Philips, editors, Computer
Analysis of Images and Patterns, pages 701-708, Berlin, Heidelberg, 2005.
Springer Berlin Heidelberg.
ISBN 978-3-540-32011-1.
[ bib |
DOI ]
In this paper an algorithm for music staves detection is presented. The algorithm is based on horizontal projections in local windows of a score image and further processing of the resulting histograms and their connections. The experiments carried out demonstrated the high efficiency of the presented algorithm and its robustness in the case of non-ideal staff lines: skewed and with barrel and pincushion distortions. The algorithm allows for the use of acquisition devices alternative to a scanner, such as digital cameras.
|
[Taubman2005] | Gabriel Taubman. MusicHand : A Handwritten Music Recognition System. Technical report, Brown University, 2005. [ bib | .pdf ] |
[Audiveris] | Hervé Bitteur. Audiveris. https://github.com/audiveris, 2004. [ bib | http ] |
[Bellini2004] | Pierfrancesco Bellini, Ivan Bruno, and Paolo Nesi. An Off-Line Optical Music Sheet Recognition. In Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 40-77. IGI Global, 2004. [ bib | DOI ] |
[Clausen2004] |
Michael Clausen and Frank Kurth.
A unified approach to content-based and fault-tolerant music
recognition.
IEEE Transactions on Multimedia, 6 (5): 717-731, 2004.
ISSN 1520-9210.
[ bib |
DOI ]
In this paper, we propose a unified approach to fast index-based music recognition. As an important area within the field of music information retrieval (MIR), the goal of music recognition is, given a database of musical pieces and a query document, to locate all occurrences of that document within the database, up to certain possible errors. In particular, the identification of the query with regard to the database becomes possible. The approach presented in this paper is based on a general algorithmic framework for searching complex patterns of objects in large databases. We describe how this approach may be applied to two important music recognition tasks: The polyphonic (musical score-based) search in polyphonic score data and the identification of pulse-code modulation audio material from a given acoustic waveform. We give an overview on the various aspects of our technology including fault-tolerant search methods. Several areas of application are suggested. We describe several prototypic systems we have developed for those applications including the notify! and the audentify! systems for score- and waveform-based music recognition, respectively.
|
[Dovey2004] |
Matthew J. Dovey.
Overview of the OMRAS Project: Online Music Retrieval and
Searching.
Journal of the American Society for Information Science and
Technology, 55 (12): 1100-1107, 2004.
[ bib |
DOI |
http ]
Until recently, most research on music information retrieval concentrated on monophonic music. Online Music Retrieval and Searching (OMRAS) is a three-year project funded under the auspices of the JISC (Joint Information Systems Committee)/NSF (National Science Foundation) International Digital Library Initiative which began in 1999 and whose remit was to investigate the issues surrounding polyphonic music information retrieval. Here we outline the work OMRAS has achieved in pattern matching, document retrieval, and audio transcription, as well as some prototype work in how to implement these techniques into library systems.
|
[Droettboom2004] | Michael Droettboom and Ichiro Fujinaga. Symbol-level groundtruthing environment for OMR. In 5th International Conference on Music Information Retrieval, pages 497-500, 2004. [ bib | .pdf ] |
[Fujinaga2004] | Ichiro Fujinaga. Staff detection and removal. In Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 1-39. IGI Global, 2004. [ bib | DOI ] |
[George2004] | Susan E. George. Visual Perception of Music Notation On-Line and Off-Line Recognition. IRM Press, 2004a. ISBN 1931777942. [ bib | http ] |
[George2004a] | Susan E. George. Evaluation in the Visual Perception of Music Notation. In S. George, editor, Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 304-349. IRM Press, Hershey, PA, 2004b. [ bib | DOI ] |
[George2004b] | Susan E. George. Lyric Recognition and Christian Music. In S. George, editor, Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 198-226. IRM Press, Hershey, PA, 2004c. [ bib | DOI ] |
[George2004c] | Susan E. George. Wavelets for Dealing with Super-Imposed Objects in Recognition of Music Notation. In S. George, editor, Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 78-107. IRM Press, Hershey, PA, 2004d. [ bib | DOI ] |
[George2004d] | Susan E. George. Pen-Based Input for On-Line Handwritten Music Notation. In S. George, editor, Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 128-160. IRM Press, Hershey, PA, 2004e. [ bib | DOI ] |
[Homenda2004] | Wladyslaw Homenda and Marcin Luckner. Automatic Recognition of Music Notation Using Neural Networks. In International Conference on AI and Systems, Divnomorskoye, Russia, 2004. [ bib | http ] |
[Homenda2004a] | Wladyslaw Homenda and K. Mossakowski. Music Symbol Recognition: Neural Networks vs. Statistical Methods. In B. De Baets, R. De Caluwe, G. De Tre, Janos Fodor, J. Kacprzyk, and S. Zadrozny, editors, EUROFUSE Workshop On Data And Knowledge Engineering, Warszawa, Poland, 2004. [ bib | http ] |
[Mitobe2004] |
Youichi Mitobe, Hidetoshi Miyao, and Minoru Maruyama.
A fast HMM algorithm based on stroke lengths for on-line recognition
of handwritten music scores.
In 9th International Workshop on Frontiers in Handwriting
Recognition, pages 521-526, 2004.
[ bib |
DOI ]
The hidden Markov model (HMM) has been successfully applied to various kinds of on-line recognition problems, including speech recognition, handwritten character recognition, etc. In this paper, we propose an on-line method to recognize handwritten music scores. To speed up the recognition process and improve usability of the system, the following methods are explained: (1) The target HMMs are restricted based on the length of a handwritten stroke, and (2) Probability calculations of HMMs are successively made as a stroke is being written. As a result, a recognition rate of 85.78% and an average recognition time of 5.19 ms/stroke were obtained for 6,999 test strokes of handwritten music symbols. The proposed HMM recognition rate is 2.4% higher than that achieved with the traditional method, and the processing time was 73% of that required by the traditional method.
|
[Miyao2004] | Hidetoshi Miyao and Minoru Maruyama. An online handwritten music score recognition system. In 17th International Conference on Pattern Recognition. Institute of Electrical & Electronics Engineers (IEEE), 2004. [ bib | DOI ] |
[Ng2004] | Kia Ng. Optical Music Analysis for Printed Music Score and Handwritten Music Manuscript. In Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 108-127. IGI Global, 2004. [ bib | DOI ] |
[Rossant2004] |
Florence Rossant and Isabelle Bloch.
A fuzzy model for optical recognition of musical scores.
Fuzzy Sets and Systems, 141 (2): 165-201, 2004.
ISSN 0165-0114.
[ bib |
DOI |
http ]
Optical music recognition aims at automatically reading scanned scores in order to convert them into an electronic format, such as a MIDI file. We only consider here classical monophonic music: we exclude any music written on several staves, but also any music that contains chords. In order to overcome recognition failures due to the lack of methods dealing with structural information, non-local rules and corrections, we propose a recognition approach integrating structural information in the form of relationships between symbols and of musical rules. Another contribution of this paper is to solve ambiguities by accounting for sources of imprecision and uncertainty, within the fuzzy set and possibility theory framework. We add to a single symbol analysis several rules for checking the consistency of hypotheses: graphical consistency (compatibility between accidental and note, between grace note and note, between note and augmentation dot, etc.), and syntactic consistency (accidentals, tonality, metric). All these rules are combined in order to lead to better decisions. Experimental results on 65 music sheets show that our approach leads to very good results, and is able to correct errors made by other approaches, such as the one of SmartScore.
|
[Sheridan2004] |
Scott Sheridan and Susan E. George.
Defacing Music Scores for Improved Recognition.
In 2nd Australian Undergraduate Students' Computing
Conference, pages 142-148, 2004.
[ bib |
.pdf ]
The area of Optical Music Recognition (OMR) has long been plagued by an inability to provide a definitive method for locating and identifying musical objects superimposed on musical stave lines. The first step in the process of recognising musical symbols in OMR has previously been to either remove the stave lines, or ignore them. Removing stave lines leads to many problems of fragmented and deformed musical symbols, or in the case of ignoring them, a lowered chance of recognition. Most OMR systems attempt to correct these deficiencies later on in the process through many varied approaches including bounding box analysis, k-nearest-neighbour (k-NN) and neural network (ANN) classification schemes. All of these have a level of success, but none have provided nearly the desired level of accuracy. This paper aims to show that this removal of the stave lines before symbol recognition is not the only first step and may not be the best. Instead of removing stave lines, more should be added! This process is called ‘defacing’ since it adds stave lines to the score at a 1/2 stave line width, and actually overwrites the score - apparently complicating the recognition procedure. However, the addition of signal to the image means that subsequent symbol recognition is ‘normalised’ and a musical symbol will look the same whether it was above, below or on a stave line. As a result of this, a classification system trained with double stave lines should provide a higher level of accuracy than the traditional approaches of removing/ignoring the stave lines.
|
[Bainbridge2003] |
David Bainbridge and Tim Bell.
A music notation construction engine for optical music recognition.
Software: Practice and Experience, 33 (2): 173-200, 2003.
ISSN 1097-024X.
[ bib |
DOI ]
Optical music recognition (OMR) systems are used to convert music scanned from paper into a format suitable for playing or editing on a computer. These systems generally have two phases: recognizing the graphical symbols (such as note-heads and lines) and determining the musical meaning and relationships of the symbols (such as the pitch and rhythm of the notes). In this paper we explore the second phase and give a two-step approach that admits an economical representation of the parsing rules for the system. The approach is flexible and allows the system to be extended to new notations with little effort: the current system can parse common music notation, Sacred Harp notation and plainsong. It is based on a string grammar and a customizable graph that specifies relationships between musical objects. We observe that this graph can be related to printing as well as recognizing music notation, bringing the opportunity for cross-fertilization between the two areas of research.
|
[Bruder2003] |
Ilvio Bruder, Andreas Finger, Andreas Heuer, and Temenushka Ignatova.
Towards a Digital Document Archive for Historical Handwritten Music
Scores.
In Tengku Mohd Tengku Sembok, Halimah Badioze Zaman, Hsinchun Chen,
Shalini R. Urs, and Sung-Hyon Myaeng, editors, Digital Libraries:
Technology and Management of Indigenous Knowledge for Global Access, pages
411-414, Berlin, Heidelberg, 2003. Springer Berlin Heidelberg.
ISBN 978-3-540-24594-0.
[ bib |
DOI ]
Contemporary digital libraries and archives of music scores focus mainly on providing efficient storage and access methods for their data. However, digital archives of historical music scores can enable musicologists not only to easily store and access research material, but also to derive new knowledge from existing data. In this paper we present the first steps in building a digital archive of historical music scores from the 17th and 18th century. Along with the architectural and accessibility aspects of the system, we describe an integrated approach for classification and identification of the scribes of music scores.
|
[Byrd2003] | Donald Byrd and Eric Isaacson. A Music Representation Requirement Specification for Academia. Computer Music Journal, 27 (4): 43-57, 2003. ISSN 01489267, 15315169. [ bib | http ] |
[George2003] | Susan E. George. Online Pen-Based Recognition of Music Notation with Artificial Neural Networks. Computer Music Journal, 27 (2): 70-79, 2003. [ bib | DOI ] |
[Goecke2003] |
Roland Göcke.
Building a system for writer identification on handwritten music
scores.
In IASTED International Conference on Signal Processing,
Pattern Recognition, and Applications, pages 250-255. Acta Press, 2003.
ISBN 0 88986 363 6.
[ bib |
.pdf ]
A significant example of the integration of musicology and computer science. The problem of the writer identification process as carried out by historical musicologists is identified, and possible solutions using computer technology are assessed. The system outline is unique and seems convincing, including interesting ideas such as feature trees and a consistency check. However, it lacks any concrete methods to implement the proposed system and any evaluation.
|
[Nehab2003] | Diego Nehab. Staff Line Detection by Skewed Projection. Technical report, 2003. [ bib | .pdf ] |
[Pinto2003] |
João Caldas Pinto, Pedro Vieira, and João M. Sousa.
A new graph-like classification method applied to ancient handwritten
musical symbols.
Document Analysis and Recognition, 6 (1): 10-22, 2003.
ISSN 1433-2825.
[ bib |
DOI |
http ]
Several algorithms have been proposed in the past to solve the problem of binary pattern recognition. The problem of finding features that clearly distinguish two or more different patterns is a key issue in the design of such algorithms. In this paper, a graph-like recognition process is proposed that combines a number of different classifiers to simplify the type of features and classifiers used in each classification step. The graph-like classification method is applied to ancient music optical recognition, and a high degree of accuracy has been achieved.
|
[Riley2003] | Jenn Riley and Ichiro Fujinaga. Recommended best practices for digital image capture of musical scores. OCLC Systems & Services, 19 (2): 62-69, 2003. ISSN 1065-075X. [ bib | DOI ] |
[Barton2002] |
Louis W. G. Barton.
The NEUMES Project: digital transcription of medieval chant
manuscripts.
In 2nd International Conference on Web Delivering of Music,
pages 211-218, 2002.
[ bib |
DOI ]
This paper introduces the NEUMES Project from a top-down perspective. The purpose of the project is to design a software infrastructure for digital transcription of medieval chant manuscripts, such that transcriptions can be interoperable across many types of applications programs. Existing software for modern music does not provide an effective solution. A distributed library of chant document resources for the Web is proposed, to encompass photographic images, transcriptions, and searchable databases of manuscript descriptions. The NEUMES encoding scheme for chant transcription is presented, with NeumesXML serving as a 'wrapper' for transmission, storage, and editorial markup of transcription data. A scenario of use is given and future directions for the project are briefly discussed.
|
[Clausen2002] |
Michael Clausen and Frank Kurth.
A unified approach to content-based and fault tolerant music
identification.
In 2nd International Conference on Web Delivering of Music,
pages 56-65, 2002.
[ bib |
DOI ]
In this paper we propose a unified approach to content-based search in different kinds of music data. Our approach is based on a general algorithmic framework for searching patterns of complex objects in large databases. In particular we describe how this approach may be used to allow for polyphonic search in polyphonic scores as well as for the identification of PCM audio material. We give an overview on the various aspects of our technology including fault tolerant search methods. Several areas of application are suggested. We give an overview on several prototypic systems we developed for those applications including the notify! and the audentify! systems.
|
[Droettboom2002] |
Michael Droettboom, Ichiro Fujinaga, and Karl MacMillan.
Optical Music Interpretation.
In Terry Caelli, Adnan Amin, Robert P. W. Duin, Dick de Ridder, and
Mohamed Kamel, editors, Structural, Syntactic, and Statistical Pattern
Recognition, pages 378-387, Berlin, Heidelberg, 2002a. Springer Berlin
Heidelberg.
ISBN 978-3-540-70659-5.
[ bib |
DOI ]
A system to convert digitized sheet music into a symbolic music representation is presented. A pragmatic approach is used that conceptualizes this primarily two-dimensional structural recognition problem as a one-dimensional one. The transparency of the implementation owes a great deal to its implementation in a dynamic, object-oriented language. This system is a part of a locally developed end-to-end solution for the conversion of digitized sheet music into symbolic form.
|
[Droettboom2002a] | Michael Droettboom, Ichiro Fujinaga, Karl MacMillan, G. Sayeed Chouhury, Tim DiLauro, Mark Patton, and Teal Anderson. Using the Gamera framework for the recognition of cultural heritage materials. In Joint Conference on Digital Libraries, pages 12-17, London, UK, 2002b. [ bib | .pdf ] |
[Gezerlis2002] |
Velissarios G. Gezerlis and Sergios Theodoridis.
Optical character recognition of the Orthodox Hellenic Byzantine
Music notation.
Pattern Recognition, 35 (4): 895-914, 2002.
ISSN 0031-3203.
[ bib |
DOI |
http ]
In this paper we present for the first time, the development of a new system for the off-line optical recognition of the characters used in the orthodox Hellenic Byzantine Music notation, that has been established since 1814. We describe the structure of the new system and propose algorithms for the recognition of the 71 distinct character classes, based on Wavelets, 4-projections and other structural and statistical features. Using a nearest neighbor classifier, combined with a post classification schema and a tree-structured classification philosophy, an accuracy of 99.4% was achieved, in a database of about 18,000 Byzantine character patterns that have been developed for the needs of the system.
|
[Lopresti2002] | Daniel Lopresti and George Nagy. Issues in Ground-Truthing Graphic Documents. In Graphics Recognition Algorithms and Applications, pages 46-67. Springer Berlin Heidelberg, Ontario, Canada, 2002. ISBN 978-3-540-45868-5. [ bib | DOI ] |
[Luth2002] | Nailja Luth. Automatic Identification of Music Notations. In 2nd International Conference on WEB Delivering of Music, 2002. ISBN 0769518621. [ bib | DOI ] |
[MacMillan2002] | Karl MacMillan, Michael Droettboom, and Ichiro Fujinaga. Gamera: Optical music recognition in a new shell. In International Computer Music Conference, pages 482-485, 2002. [ bib | .pdf ] |
[McPherson2002] | John R. McPherson. Introducing Feedback into an Optical Music Recognition System. In 3rd International Conference on Music Information Retrieval, Paris, France, 2002. [ bib | .pdf ] |
[McPherson2002a] | John R. McPherson and David Bainbridge. Coordinating Knowledge Within an Optical Music Recognition System. Technical report, University of Waikato, Hamilton, New Zealand, 2002. [ bib | http ] |
[Miyao2002] |
Hidetoshi Miyao.
Stave Extraction for Printed Music Scores.
In Hujun Yin, Nigel Allinson, Richard Freeman, John Keane, and Simon
Hubbard, editors, Intelligent Data Engineering and Automated Learning,
pages 562-568. Springer Berlin Heidelberg, 2002.
ISBN 978-3-540-45675-9.
[ bib |
http ]
In this paper, a satisfactory method is described for the extraction of staff lines in which there are some inclinations, discontinuities, and curvatures. The extraction calls for four processes: (1) Extraction of specific points on a stave on vertical scan lines, (2) Connection of the points using DP matching, (3) Composition of stave groups using labeling, and (4) Extraction and adjustment of the edges of lines. The experiment resulted in an extraction rate of 99.4% for 71 printed music scores that included lines with some inclinations, discontinuities, and curvatures.
|
[Ng2002] | Kia Ng. Music manuscript tracing. Lecture Notes in Computer Science, 2390: 322-334, 2002. ISSN 1611-3349. [ bib | DOI | .pdf ] |
[Roland2002] | Perry Roland. The music encoding initiative (MEI). In 1st International Conference on Musical Applications Using XML, pages 55-59, 2002. [ bib | .pdf ] |
[Rossant2002] |
Florence Rossant.
A global method for music symbol recognition in typeset music sheets.
Pattern Recognition Letters, 23 (10): 1129-1141, 2002.
ISSN 0167-8655.
[ bib |
DOI ]
This paper presents an optical music recognition (OMR) system that can automatically recognize the main musical symbols of a scanned paper-based music score. Two major stages are distinguished: the first one, using low-level pre-processing, detects the isolated objects and outputs some hypotheses about them; the second one has to take the final correct decision, through high-level processing including contextual information and music writing rules. This article exposes both stages of the method: after explaining in detail the first one, the symbol analysis process, it shows through first experiments that its outputs can efficiently be used as inputs for a high-level decision process.
|
[Soak2002] |
Sang Moon Soak, Seok Cheol Chang, Taehwan Shin, and Byung-Ha Ahn.
Music recognition system using ART-1 and GA.
In AeroSense 2002, 2002.
[ bib |
DOI ]
Previously, most optical music recognition (OMR) systems have used the neural network, and used mainly back-propagation training method. One of the disadvantages of BP is that much time is required to train data sets. For example, when new data sets are added, all data sets have to be trained. Another disadvantage is that weighting values cannot be guaranteed as global optima after training them. It means that weighting values can fall down to local optimum solution. In this paper, we propose the new OMR method which combines the adaptive resonance theory (ART-1) with the genetic algorithms (GA). For reducing the training time, we use ART-1 which classifies several music symbols. It has another advantage to reduce the number of datasets, because classified symbols through ART-1 are used as input vectors of BP. And for guaranteeing the global optima in training data set, we use GA which is known as one of the best method for finding optimal solutions at complex problems.
|
[Bainbridge2001] |
David Bainbridge and Tim Bell.
The Challenge of Optical Music Recognition.
Computers and the Humanities, 35 (2): 95-121, 2001.
ISSN 1572-8412.
[ bib |
DOI ]
This article describes the challenges posed by optical music recognition - a topic in computer science that aims to convert scanned pages of music into an on-line format. First, the problem is described; then a generalised framework for software is presented that emphasises key stages that must be solved: staff line identification, musical object location, musical feature classification, and musical semantics. Next, significant research projects in the area are reviewed, showing how each fits the generalised framework. The article concludes by discussing perhaps the most open question in the field: how to compare the accuracy and success of rival systems, highlighting certain steps that help ease the task.
|
[Bainbridge2001a] | David Bainbridge, Gerry Bernbom, Mary Wallace Davidson, Andrew P. Dillon, Matthew Dovey, Jon W. Dunn, Michael Fingerhut, Ichiro Fujinaga, and Eric J. Isaacson. Digital Music Libraries - Research and Development. In 1st ACM/IEEE-CS Joint Conference on Digital Libraries, pages 446-448, Roanoke, Virginia, USA, 2001. [ bib | DOI ] |
[Bellini2001] |
Pierfrancesco Bellini, Ivan Bruno, and Paolo Nesi.
Optical music sheet segmentation.
In 1st International Conference on WEB Delivering of Music,
pages 183-190. Institute of Electrical & Electronics Engineers (IEEE),
2001.
ISBN 0769512844.
[ bib |
DOI ]
The optical music recognition problem has been addressed in several ways, obtaining suitable results only when simple music constructs are processed. The most critical phase of the optical music recognition process is the first analysis of the image sheet. The first analysis consists of segmenting the acquired sheet into smaller parts which may be processed to recognize the basic symbols. The segmentation module of the O³MR (Object Oriented Optical Music Recognition) system is presented. The proposed approach is based on the adoption of projections for the extraction of basic symbols that constitute a graphic element of the music notation. A set of examples is also included.
|
[Choudhury2001] | G. Sayeed Choudhury, Tim DiLauro, Michael Droettboom, Ichiro Fujinaga, and Karl MacMillan. Strike Up the Score: Deriving searchable and playable digital formats from sheet music. D-Lib Magazine, 7 (2), 2001. ISSN 1082-9873. [ bib | DOI | .html ] |
[Coueasnon2001] |
Bertrand Coüasnon.
DMOS: a generic document recognition method, application to an
automatic generator of musical scores, mathematical formulae and table
structures recognition systems.
In 6th International Conference on Document Analysis and
Recognition, pages 215-220, 2001.
[ bib |
DOI ]
Genericity in structured document recognition is a difficult challenge. We therefore propose a new generic document recognition method, called DMOS (Description and MOdification of Segmentation), that is made up of a new grammatical formalism, called EPF (Enhanced Position Formalism) and an associated parser which is able to introduce context in segmentation. We implement this method to obtain a generator of document recognition systems. This generator can automatically produce new recognition systems. It is only necessary to describe the document with an EPF grammar, which is then simply compiled. In this way, we have developed various recognition systems: one on musical scores, one on mathematical formulae and one on recursive table structures. We have also defined a specific application to damaged military forms of the 19th Century. We have been able to test the generated system on 5,000 of these military forms. This has permitted us to validate the DMOS method on a real-world application.
|
[Droettboom2001] | Michael Droettboom and Ichiro Fujinaga. Interpreting the semantics of music notation using an extensible and object-oriented system. Technical report, Johns Hopkins University, 2001. [ bib | http ] |
[Homenda2001] |
Wladyslaw Homenda.
Optical Music Recognition: the Case of Granular Computing.
In Granular Computing: An Emerging Paradigm, pages 341-366.
Physica-Verlag HD, Heidelberg, 2001.
ISBN 978-3-7908-1823-9.
[ bib |
DOI |
http ]
The paper deals with optical music recognition (OMR) as a process of structured data processing applied to music notation. Granularity of OMR in both its aspects, data representation and data processing, is especially emphasised in the paper. OMR is a challenge in intelligent computing technologies, especially in such fields as pattern recognition and knowledge representation and processing. Music notation is a language allowing for communication in music, one of the most sophisticated fields of human activity, and has a high level of complexity itself. On the one hand, music notation symbols vary in size and have complex shapes; they often touch and overlap each other. This feature makes the recognition of music symbols a very difficult and complicated task. On the other hand, music notation is a two-dimensional language in which the importance of geometrical and logical relations between its symbols may be compared to the importance of the symbols alone. Due to the complexity of music and music notation, music representation, necessary to store and reuse recognised information, is also a key issue in music notation recognition and music processing. Both the data representation and the data processing used in OMR are highly structured, granular rather than numeric. OMR technology fits the paradigm of granular computing.
|
[MacMillan2001] | Karl MacMillan, Michael Droettboom, and Ichiro Fujinaga. Gamera: A structured document recognition application development environment. In 2nd International Symposium on Music Information Retrieval, pages 15-16, Bloomington, IN, 2001. [ bib | http ] |
[McPherson2001] | John R. McPherson. Using feedback to improve Optical Music Recognition, 2001. [ bib ] |
[Pugin2001] | Laurent Pugin. Réalisation d'un système de superposition de partitions de musique anciennes. Technical report, Geneva University, Geneva, Switzerland, 2001. [ bib | .pdf ] |
[Rossant2001] | Florence Rossant and Isabelle Bloch. Reconnaissance de Partitions Musicales par Modélisation Floue et Intégration de Règles Musicales. In GRETSI, Toulouse, France, 2001. [ bib | http ] |
[Su2001] |
Mu-Chun Su, Chee-Yuen Tew, and Hsin-Hua Chen.
Musical symbol recognition using SOM-based fuzzy systems.
In Joint 9th IFSA World Congress and 20th NAFIPS International
Conference, pages 2150-2153 vol.4, 2001.
[ bib |
DOI ]
A large number of research activities have been undertaken to investigate optical music recognition (OMR). OMR involves identifying musical symbols on a scanned sheet of music and transforming them into a computer readable format. We propose an efficient method based on SOM-based fuzzy systems to recognize musical symbols. A database consisting of 9 kinds of musical symbols was used to test the performance of the SOM-based fuzzy systems.
|
[Vieira2001] |
Pedro Vieira and João Caldas Pinto.
Recognition of musical symbols in ancient manuscripts.
In International Conference on Image Processing, pages 38-41
vol.3, 2001.
[ bib |
DOI ]
This paper presents a system for the automatic retrieval of music from ancient music collections (XVI-XVIII century), creating digital documents of music from images of music sheets. This is an optical music recognition system that uses image processing and pattern recognition techniques. Finally, we obtain a document that contains the music semantics: description of the notes, in time and pitches, as well as other relevant information.
|
[Anquetil2000] |
Éric Anquetil, Bertrand Coüasnon, and Frédéric Dambreville.
A Symbol Classifier Able to Reject Wrong Shapes for Document
Recognition Systems.
In Atul K. Chhabra and Dov Dori, editors, Graphics Recognition
Recent Advances, pages 209-218, Berlin, Heidelberg, 2000. Springer Berlin
Heidelberg.
ISBN 978-3-540-40953-3.
[ bib |
http ]
We propose in this paper a new framework to develop a transparent classifier able to deal with reject notions. The generated classifier can be characterized by a strong reliability without losing good properties in generalization. We show on a musical scores recognition system that this classifier is very well suited to develop a complete document recognition system. Indeed this classifier allows them firstly to extract known symbols in a document (text for example) and secondly to validate segmentation hypotheses. Tests had been successfully performed on musical and digit symbols databases.
|
[Choudhury2000] | G. Sayeed Choudhury, M. Droettboom, Tim DiLauro, Ichiro Fujinaga, and Brian Harrington. Optical Music Recognition System within a Large-Scale Digitization Project. In 1st International Symposium on Music Information Retrieval, 2000a. [ bib | http ] |
[Choudhury2000a] |
G. Sayeed Choudhury, Cynthia Requardt, Ichiro Fujinaga, Tim DiLauro,
Elisabeth W. Brown, James W. Warner, and Brian Harrington.
Digital workflow management: The Lester S. Levy digitized collection
of sheet music.
First Monday, 5 (6), 2000b.
[ bib |
DOI ]
The paper describes the development of a set of workflow management tools (WMS) that will reduce the manual input necessary to manage the workflow of large-scale digitization projects. The WMS will also support the path from physical object and/or digitized material into a digital library repository by providing effective tools for perusing multimedia elements. The Lester S. Levy Collection of Sheet Music Project at the Milton S. Eisenhower Library at The Johns Hopkins University provides an ideal testbed for the development and evaluation of the WMS. Building upon previous effort to digitize the entire collection of over 29000 pieces of sheet music, optical music recognition (OMR) software will create sound files and full-text lyrics. The combination of image, text and sound files provide a comprehensive multimedia environment. The functionality of the collection will be enhanced by the incorporation of metadata, the implementation of a disk based search engine for lyrics, and the development of toolkits for searching sound files.
|
[Fotinea2000] | Stavroula-Evita Fotinea, George Giakoupis, Aggelos Livens, Stylianos Bakamidis, and George Carayannis. An Optical Notation Recognition System for Printed Music Based on Template Matching and High Level Reasoning. In RIAO '00 Content-Based Multimedia Information Access, pages 1006-1014, Paris, France, 2000. Le centre de hautes etudes internationales d'informatique documentaire. [ bib | http ] |
[Fujinaga2000] | Ichiro Fujinaga. Optical Music Recognition Bibliography. http://www.music.mcgill.ca/~ich/research/omr/omrbib.html, 2000. [ bib | .html ] |
[Lallican2000] | P. M. Lallican, C. Viard-Gaudin, and S. Knerr. From Off-Line to On-Line Handwriting Recognition. In L. R. B. Schomaker and L. G. Vuurpijl, editors, 7th International Workshop on Frontiers in Handwriting Recognition, pages 303-312, Amsterdam, 2000. International Unipen Foundation. ISBN 90-76942-01-3. [ bib | .pdf ] |
[Lin2000] |
Karen Lin and Tim Bell.
Integrating Paper and Digital Music Information Systems.
In International Society for Music Information Retrieval,
pages 23-25, 2000.
[ bib |
.pdf ]
Active musicians generally rely on extensive personal paper-based music information retrieval systems containing scores, parts, compositions, and arrangements of published and hand-written music. Many have a bias against using computers to store, edit and retrieve music, and prefer to work in the paper domain rather than using digital documents, despite the flexibility and powerful retrieval opportunities available. In this paper we propose a model of operation that blurs the boundaries between the paper and digital domains, offering musicians the best of both worlds. A survey of musicians identifies the problems and potential of working with digital tools, and we propose a system using colour printing and scanning technology that simplifies the process of moving music documents between the two domains.
|
[Miyao2000] | Hidetoshi Miyao and Robert Martin Haralick. Format of Ground Truth Data Used in the Evaluation of the Results of an Optical Music Recognition System. In 4th International Workshop on Document Analysis Systems, pages 497-506, Brazil, 2000. [ bib | .pdf ] |
[Pinto2000] |
João Caldas Pinto, Pedro Vieira, M. Ramalho, M. Mengucci, P. Pina, and
F. Muge.
Ancient Music Recovery for Digital Libraries.
In José Borbinha and Thomas Baker, editors, Research and
Advanced Technology for Digital Libraries, pages 24-34, Berlin, Heidelberg,
2000. Springer Berlin Heidelberg.
ISBN 978-3-540-45268-3.
[ bib |
DOI ]
The purpose of this paper is to present a description and the current state of the “ROMA” (Reconhecimento Óptico de Música Antiga or Ancient Music Optical Recognition) Project, which consists of building an application specialised in the recognition and restoration of ancient music manuscripts (from the XVI to the XVIII century). This project, beyond the inventory of the Biblioteca Geral da Universidade de Coimbra musical funds, aims to develop algorithms for score restoration and musical symbol recognition in order to allow a suitable representation and restoration in digital format. Both objectives have an intrinsic research nature, one in the area of musicology and the other in digital libraries.
|
[Bainbridge1999] |
David Bainbridge and K. Wijaya.
Bulk processing of optically scanned music.
In 7th International Conference on Image Processing and its
Applications, pages 474-478. Institution of Engineering and Technology,
1999.
[ bib |
DOI |
http ]
For many years now optical music recognition (OMR) has been advocated as the leading methodology for transferring the vast repositories of music notation from paper to digital database. Other techniques exist for acquiring music on-line; however, these methods require operators with musical and computer skills. The notion, therefore, of an entirely automated process through OMR is highly attractive. It has been an active area of research since its inception in 1966 (Pruslin), and even though there has been the development of many systems with impressively high accuracy rates it is surprising to note that there is little evidence of large collections being processed with the technology, with work by Carter (1994) and Bainbridge and Carter (1997) being the only known notable exception. This paper outlines some of the insights gained, and algorithms implemented, through the practical experience of converting collections in excess of 400 pages. In doing so, the work demonstrates that there are additional factors not currently considered by other research centres that are necessary for OMR to reach its full potential.
|
[Beran1999] |
Tomáš Beran and Tomáš Macek.
Recognition of Printed Music Score.
In Petra Perner and Maria Petrou, editors, Machine Learning and
Data Mining in Pattern Recognition, pages 174-179. Springer Berlin
Heidelberg, 1999.
ISBN 978-3-540-48097-6.
[ bib |
DOI ]
This article describes our implementation of the Optical Music Recognition System (OMR). The system implemented in our project is based on the binary neural network ADAM. ADAM has been used for recognition of music symbols. Preprocessing was implemented by conventional techniques. We decomposed the OMR process into several phases. The results of these phases are summarized.
|
[Blostein1999] |
Dorothea Blostein and Lippold Haken.
Using diagram generation software to improve diagram recognition: a
case study of music notation.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 21 (11): 1121-1136, 1999.
ISSN 0162-8828.
[ bib |
DOI ]
Diagrams are widely used in society to transmit information such as circuit designs, music, mathematical formulae, architectural plans, and molecular structure. Computers must process diagrams both as images (marks on paper) and as information. A diagram recognizer translates from image to information and a diagram generator translates from information to image. Current technology for diagram generation is ahead of the technology for diagram recognition. Diagram generators have extensive knowledge of notational conventions which relate to readability and aesthetics, whereas current diagram recognizers focus on the hard constraints of the notation. To create a recognizer capable of exploiting layout information, it is expedient to reuse the expertise in existing diagram generators. In particular, we discuss the use of Lime (our editor and generator for music notation) to proofread and correct the raw output of MIDIScan (a third-party commercial recognizer for music notation). Over the past several years, this combination of software has been distributed to thousands of users.
|
[Ferrand1999] | Miguel Ferrand, João Alexandre Leite, and Amilcar Cardoso. Hypothetical reasoning: An application to Optical Music Recognition. In Appia-Gulp-Prode'99 joint conference on declarative programming, pages 367-381, 1999a. [ bib | http ] |
[Ferrand1999a] |
Miguel Ferrand, João Alexandre Leite, and Amilcar Cardoso.
Improving Optical Music Recognition by Means of Abductive Constraint
Logic Programming.
In Pedro Barahona and José J. Alferes, editors, Progress in
Artificial Intelligence, pages 342-356, Berlin, Heidelberg, 1999b.
Springer Berlin Heidelberg.
ISBN 978-3-540-48159-1.
[ bib |
DOI ]
In this paper we propose a hybrid system that bridges the gap between traditional image processing methods, used for low-level object recognition, and abductive constraint logic programming used for high-level musical interpretation. Optical Music Recognition (OMR) is the automatic recognition of a scanned page of printed music. All such systems are evaluated by their rate of successful recognition; therefore a reliable OMR program should be able to detect and eventually correct its own recognition errors. Since we are interested in dealing with polyphonic music, some additional complexity is introduced as several concurrent voices and simultaneous musical events may occur. In RIEM, the OMR system we are developing, when events are inaccurately recognized they will generate inconsistencies in the process of voice separation. Furthermore if some events are missing a consistent voice separation may not even be possible.
|
[Hori1999] |
Toyokazu Hori, Shinichiro Wada, Howzan Tai, and S. Y. Kung.
Automatic music score recognition/play system based on decision based
neural network.
In 3rd Workshop on Multimedia Signal Processing, pages
183-184, 1999.
[ bib |
DOI ]
This paper proposes an automatic music score recognition system based on a hierarchically structured decision based neural network (DBNN), which can classify patterns with nonlinear decision boundaries. Currently, this system yields around a 97% recognition rate for printed music scores.
|
[Marinai1999] |
Simone Marinai and Paolo Nesi.
Projection Based Segmentation of Musical Sheets.
In 5th International Conference on Document Analysis and
Recognition, pages 3-6, 1999.
ISBN 0-7695-0318-7.
[ bib |
DOI ]
The automatic recognition of music scores is a key process for the electronic treatment of music information. In this paper we present the segmentation module of an OMR system. The proposed approach is based on the use of projection profiles for the location of elementary symbols that constitute the music notation. Extensive experimentation was carried out with the help of a tool developed for this purpose. Reported results show a high efficiency in the correct location of elementary symbols.
|
[McPherson1999] | John R. McPherson. Page Turning - Score Automation for Musicians. Technical report, University of Canterbury, New Zealand, 1999. [ bib | http ] |
[Ng1999] | Kia Ng, David Cooper, Ewan Stefani, Roger Boyle, and Nick Bailey. Embracing the Composer: Optical Recognition of Handwritten Manuscripts. In International Computer Music Conference, pages 500-503, 1999. [ bib | http ] |
[VuilleumierStueckelberg1999] |
Marc Vuilleumier Stückelberg and David Doermann.
On musical score recognition using probabilistic reasoning.
In 5th International Conference on Document Analysis and
Recognition, pages 115-118, 1999.
ISBN 0-7695-0318-7.
[ bib |
DOI |
http ]
We present a probabilistic framework for document analysis and recognition and illustrate it on the problem of musical score recognition. Our system uses an explicit descriptive model of the document class to find the most likely interpretation of a scanned document image. In contrast to the traditional pipeline architecture, we carry out all stages of the analysis with a single inference engine, allowing for an end-to-end propagation of the uncertainty. The global modeling structure is similar to a stochastic attribute grammar, and local parameters are estimated using hidden Markov models.
|
[Wijaya1999] |
K. Wijaya and David Bainbridge.
Staff line restoration.
In 7th International Conference on Image Processing and its
Applications, pages 760-764. Institution of Engineering and Technology,
1999.
[ bib |
DOI ]
Optical music recognition (OMR), the conversion of scanned pages of music into a musical database, has reached an exciting level of maturity. Like optical character recognition, it has now reached the point where the returns in accuracy from increasingly sophisticated pattern recognition algorithms appear saturated and more significant gains are being made from the application of structured a priori knowledge. This paper describes one such technique for improved staff line processing: the detection and subsequent correction of bowing in the staff lines, which is an important category given the significant source of music in book form. Two versions of the algorithm are tested: the first, based on mathematical morphology, has the added benefit of automatically fusing small breaks in staff lines, common for example in older works; the second, based on a flood-fill algorithm, requires a minor modification if fragmented staff lines are to be repaired. The correct detection and processing of staff lines is fundamental to OMR. Without adequate knowledge of staff line location, notation superimposed on the staves cannot be correctly separated, classified and processed.
|
[Bainbridge1998] |
David Bainbridge and Stuart Inglis.
Musical image compression.
In Data Compression Conference, pages 209-218, 1998.
[ bib |
DOI ]
Optical music recognition aims to convert the vast repositories of sheet music in the world into an on-line digital format. In the near future it will be possible to assimilate music into digital libraries and users will be able to perform searches based on a sung melody in addition to typical text-based searching. An important requirement for such a system is the ability to reproduce the original score as accurately as possible. Due to the huge amount of sheet music available, the efficient storage of musical images is an important topic of study. This paper investigates whether the "knowledge" extracted from the optical music recognition (OMR) process can be exploited to gain higher compression than the JBIG international standard for bi-level image compression. We present a hybrid approach where the primitive shapes of music extracted by the optical music recognition process (note heads, note stems, staff lines and so forth) are fed into a graphical symbol based compression scheme originally designed for images containing mainly printed text. Using this hybrid approach the average compression rate for a single page is improved by 3.5% over JBIG. When multiple pages with similar typography are processed in sequence, the file size is decreased by 4-8%. The relevant background to both optical music recognition and textual image compression is presented. Experiments performed on 66 test images are described, outlining the combinations of parameters that were examined to give the best results.
|
[Chhabra1998] |
Atul K. Chhabra.
Graphic symbol recognition: An overview.
In Karl Tombre and Atul K. Chhabra, editors, Graphics
Recognition Algorithms and Systems, pages 68-79, Berlin, Heidelberg, 1998.
Springer Berlin Heidelberg.
ISBN 978-3-540-69766-4.
[ bib |
DOI ]
Symbol recognition is one of the primary stages of any graphics recognition system. This paper reviews the current state of the art in graphic symbol recognition and raises some open issues that need further investigation. Work on symbol recognition tends to be highly application specific. Therefore, this review presents the symbol recognition methods in the context of specific applications.
|
[Fahmy1998] |
Hoda M. Fahmy and Dorothea Blostein.
A graph-rewriting paradigm for discrete relaxation: Application to
sheet-music recognition.
International Journal of Pattern Recognition and Artificial
Intelligence, 12 (6): 763-799, 1998.
[ bib |
DOI ]
In image analysis, recognition of the primitives plays an important role. Subsequent analysis is used to interpret the arrangement of primitives. This subsequent analysis must make allowance for errors or ambiguities in the recognition of primitives. In this paper, we assume that the primitive recognizer produces a set of possible interpretations for each primitive. To reduce this primitive-recognition ambiguity, we use contextual information in the image, and apply constraints from the image domain. This process is variously termed constraint satisfaction, labeling or discrete relaxation. Existing methods for discrete relaxation are limited in that they assume a priori knowledge of the neighborhood model: before relaxation begins, the system is told (or can determine) which sets of primitives are related by constraints. These methods do not apply to image domains in which complex analysis is necessary to determine which primitives are related by constraints. For example, in music notation, we must recognize which notes belong to one measure, before it is possible to apply the constraint that the number of beats in the measure should match the time signature. Such constraints can be handled by our graph-rewriting paradigm for discrete relaxation: here neighborhood-model construction is interleaved with constraint-application. In applying this approach to the recognition of simple music notation, we use approximately 180 graph-rewriting rules to express notational constraints and semantic-interpretation rules for music notation. The graph rewriting rules express both binary and higher-order notational constraints. As image-interpretation proceeds, increasingly abstract levels of interpretation are assigned to (groups of) primitives. This allows application of higher-level constraints, which can be formulated only after partial interpretation of the image.
|
[Ferrand1998] |
Miguel Ferrand and Amílcar Cardoso.
Scheduling to Reduce Uncertainty in Syntactical Music Structures.
In Flávio Moreira de Oliveira, editor, Advances in
Artificial Intelligence, pages 249-258, Berlin, Heidelberg, 1998. Springer
Berlin Heidelberg.
ISBN 978-3-540-49523-9.
[ bib |
DOI ]
In this paper, we focus on the syntactical aspects of music representation. We look at a music score as a structured layout of events with intrinsic temporal significance and we show that important basic relations between these events can be inferred from the topology of symbol objects in a music score. Within this framework, we propose a scheduling algorithm to find consistent assignments of events to voices, in the presence of uncertain information. Based on some experimental results, we show how we may use this approach to improve the accuracy of an Optical Music Recognition system.
|
[Fujinaga1998] | Ichiro Fujinaga, Stephan Moore, and David S. Sullivan. Implementation of exemplar-based learning model for music cognition. In International Conference on Music Perception and Cognition, pages 171-179, Seoul, South Korea, 1998. [ bib | .pdf ] |
[Bainbridge1997] |
David Bainbridge and Tim Bell.
Dealing with Superimposed Objects in Optical Music Recognition.
In 6th International Conference on Image Processing and its
Applications, number 443, pages 756-760, 1997.
ISBN 0 85296 692 X.
[ bib |
DOI ]
Optical music recognition (OMR) involves identifying musical symbols on a scanned sheet of music, and interpreting them so that the music can either be played by the computer, or put into a music editor. Applications include providing an automatic accompaniment, transposing or extracting parts for individual instruments, and performing an automated musicological analysis of the music. A key problem with music recognition, compared with character recognition, is that symbols very often overlap on the page. The most significant form of this problem is that the symbols are superimposed on a five-line staff. Although the staff provides valuable positional information, it creates ambiguity because it is difficult to determine whether a pixel would be black or white if the staff line was not there. The other main difference between music recognition and character recognition is the set of permissible symbols. In text, the alphabet size is fixed. Conversely, in music notation there is no standard "alphabet" of shapes, with composers inventing new notation where necessary, and music for particular instruments using specialised notation where appropriate. The focus of this paper is on techniques we have developed to deal with superimposed objects.
|
[Bainbridge1997a] | David Bainbridge. Extensible optical music recognition. PhD thesis, University of Canterbury, 1997. [ bib | http ] |
[Bainbridge1997b] |
David Bainbridge and Nicholas Paul Carter.
Automatic reading of music notation.
In H. Bunke and P. Wang, editors, Handbook of Character
Recognition and Document Image Analysis, pages 583-603. World Scientific,
Singapore, 1997.
[ bib |
DOI ]
The aim of Optical Music Recognition (OMR) is to convert optically scanned pages of music into a machine-readable format. In this tutorial level discussion of the topic, an historical background of work is presented, followed by a detailed explanation of the four key stages to an OMR system: stave line identification, musical object location, symbol identification, and musical understanding. The chapter also shows how recent work has addressed the issues of touching and fragmented objects—objectives that must be solved in a practical OMR system. The report concludes by discussing remaining problems, including measuring accuracy.
|
[VuilleumierStueckelberg1997] | Marc Vuilleumier Stückelberg, Christian Pellegrini, and Mélanie Hilario. A preview of an architecture for musical score recognition. Technical report, University of Geneva, 1997b. [ bib | http ] |
[VuilleumierStueckelberg1997a] |
Marc Vuilleumier Stückelberg, Christian Pellegrini, and Mélanie
Hilario.
An architecture for musical score recognition using high-level domain
knowledge.
In 4th International Conference on Document Analysis and
Recognition, pages 813-818 vol.2, 1997a.
[ bib |
DOI ]
Proposes an original approach to musical score recognition, a particular case of high-level document analysis. In order to overcome the limitations of existing systems, we propose an architecture which allows for a continuous and bidirectional interaction between high-level knowledge and low-level data, and which is able to improve itself over time by learning. This architecture is made of three cooperating layers, one made of parameterized feature detectors, another working as an object-oriented knowledge repository and the other as a supervising Bayesian metaprocessor. Although the implementation is still in progress, we show how this architecture is adequate for modeling and processing knowledge.
|
[Anstice1996] |
Jamie Anstice, Tim Bell, Andy Cockburn, and Martin Setchell.
The design of a pen-based musical input system.
In 6th Australian Conference on Computer-Human Interaction,
pages 260-267, 1996.
[ bib |
DOI ]
Computerising the task of music editing can avoid a considerable amount of tedious work for musicians, particularly for tasks such as key transposition, part extraction, and layout. However the task of getting the music onto the computer can still be time consuming and is usually done with the help of bulky equipment. This paper reports on the design of a pen-based input system that uses easily-learned gestures to facilitate fast input, particularly if the system must be portable. The design is based on observations of musicians writing music by hand, and an analysis of the symbols in samples of music. A preliminary evaluation of the system is presented, and the speed is compared with the alternatives of handwriting, synthesiser keyboard input, and optical music recognition. Evaluations suggest that the gesture-based system could be approximately three times as fast as other methods of music data entry reported in the literature.
|
[Bainbridge1996] | David Bainbridge and Tim Bell. An extensible optical music recognition system. Australian Computer Science Communications, 18: 308-317, 1996. [ bib | .html ] |
[CapellaScan] | capella-software AG. Capella Scan. https://www.capella-software.com, 1996. [ bib | http ] |
[Dan1996] |
Lee Sau Dan.
Automatic Optical Music recognition.
Technical report, The University of Waikato, New Zealand, 1996.
[ bib |
.ps.gz ]
In this project, the topic of automatic optical music recognition was studied. It is the conversion of an optically sampled image of a musical score into a representation that can be conveniently stored in computer storage and retrieved for various purposes. It is analogous to optical character recognition. Optical character recognition recognizes text characters in the input images and outputs the text in a machine-readable format. Similarly, an optical music recognition system recognizes the symbols on a musical score and outputs the results in a binary format. Subsequent processing on this output can provide a wide variety of applications, such as reprinting and archiving.
|
[Fujinaga1996] | Ichiro Fujinaga. Exemplar-based learning in adaptive optical music recognition system. In International Computer Music Conference, pages 55-56, Hong Kong, 1996a. ISBN 962-85092-1-7. [ bib | http ] |
[Fujinaga1996a] | Ichiro Fujinaga. Adaptive optical music recognition. PhD thesis, McGill University, 1996b. [ bib | .pdf ] |
[Homenda1996] |
Wladyslaw Homenda.
Automatic recognition of printed music and its conversion into
playable music data.
Control and Cybernetics, 25 (2): 353-367, 1996.
[ bib |
.pdf ]
The paper describes MIDISCAN, a recognition system for printed music notation. Music notation recognition is a challenging problem in both fields: pattern recognition and knowledge representation. Music notation symbols, though well characterized by their features, are arranged in an elaborate way in real music notation, which makes the recognition task very difficult and still open for new ideas, as for example, fuzzy set application in skew correction and stave location. On the other hand, the aim of the system, i.e. conversion of acquired printed music into playable MIDI format, requires special representation of music data. The problems of pattern recognition and knowledge representation in the context of music processing are discussed in this paper.
|
[Kopec1996] |
Gary E. Kopec, Philip A. Chou, and David A. Maltz.
Markov source model for printed music decoding.
Journal of Electronic Imaging, 5, 1996.
[ bib |
DOI |
.pdf ]
A Markov source model is described for a simple subset of printed music notation that was developed as an extended example of the document image decoding (DID) approach to document image analysis. The model is based on the Adobe Sonata music symbol set and a finite-state language of textual music messages. The music message language is defined and several important aspects of message imaging are discussed. Aspects of music notation that appear problematic for a finite-state representation are identified. Finally, an example of music image decoding and resynthesis using the model is presented. Development of the model was greatly facilitated by the duality between image synthesis and image decoding that is fundamental to the DID paradigm.
|
[Miyao1996] |
Hidetoshi Miyao and Yasuaki Nakano.
Note symbol extraction for printed piano scores using neural
networks.
IEICE Transactions on Information and Systems, E79-D (5):
548-554, 1996.
[ bib |
http ]
In the traditional note symbol extraction processes, extracted candidates of note elements were identified using complex if-then rules based on the note formation rules and they needed subtle adjustment of parameters through many experiments. The purpose of our system is to avoid the tedious tasks and to present an accurate and high-speed extraction of note heads, stems and flags according to the following procedure. (1) We extract head and flag candidates based on the stem positions. (2) To identify heads and flags from the candidates, we use a couple of three-layer neural networks. To make the networks learn, we give the position information and reliability factors of candidates to the input units. (3) With the weights learned by the net, the head and flag candidates are recognized. As an experimental result, we obtained a high extraction rate of more than 99% for thirteen printed piano scores on A4 sheets which have various difficulties. Using a workstation (SPARC Station 10), it took about 90 seconds on average. It means that our system can analyze piano scores 5 times or more as fast as the manual work. Therefore, our system can execute the task without the traditional tedious work, and can recognize them quickly and accurately.
|
[Modayur1996] | Bharath R. Modayur. Music Score Recognition - A Selective Attention Approach using Mathematical Morphology. Technical report, Electrical Engineering Department, University of Washington, Seattle, 1996. [ bib | http ] |
[Ng1996] |
Kia Ng and Roger Boyle.
Recognition and reconstruction of primitives in music scores.
Image and Vision Computing, 14 (1): 39-46, 1996.
ISSN 0262-8856.
[ bib |
DOI |
http ]
Music recognition bears similarities and differences to OCR. In this paper we identify some of the problems peculiar to musical scores, and propose an approach which succeeds in a wide range of non-trivial cases. The composer customarily proceeds by writing notes, then stems, beams, ties and slurs — we have inverted this approach by segmenting and then subsegmenting scores to recapture the component parts of symbols. In this paper, we concentrate on the strategy of recognizing sub-segmented primitives, and the reassembly process which reconstructs low level graphical primitives back to musical symbols. The sub-segmentation process proves to be worthwhile, since many primitives complement each other and high level musical theory can be employed to enhance the recognition process.
|
[Reed1996] |
K. Todd Reed and J. R. Parker.
Automatic Computer Recognition of Printed Music.
In 13th International Conference on Pattern Recognition, pages
803-807, 1996.
ISBN 081867282X.
[ bib |
DOI ]
This paper provides an overview of the implementation of Lemon, a complete optical music recognition system. Among the techniques employed by the implementation are: template matching, the Hough transform, line adjacency graphs, character profiles, and graph grammars. Experimental results, including comparisons with commercial systems, are provided.
|
[Yadid-Pecht1996] |
Orly Yadid-Pecht, Moty Gerner, Lior Dvir, Eliyahu Brutman, and Uri Shimony.
Recognition of handwritten musical notes by a modified Neocognitron.
Machine Vision and Applications, 9 (2): 65-72, 1996.
ISSN 1432-1769.
[ bib |
DOI |
http ]
A neural network for recognition of handwritten musical notes, based on the well-known Neocognitron model, is described. The Neocognitron has been used for the “what” pathway (symbol recognition), while contextual knowledge has been applied for the “where” (symbol placement). This way, we benefit from dividing the process for dealing with this complicated recognition task. Also, different degrees of intrusiveness in “learning” have been incorporated in the same network: More intrusive supervised learning has been implemented in the lower neuron layers and less intrusive in the upper one. This way, the network adapts itself to the handwriting of the user. The network consists of a 13x49 input layer and three pairs of “simple” and “complex” neuron layers. It has been trained to recognize 20 symbols of unconnected notes on a musical staff and was tested with a set of unlearned input notes. Its recognition rate for the individual unseen notes was up to 93%, averaging 80% for all categories. These preliminary results indicate that a modified Neocognitron could be a good candidate for identification of handwritten musical notes.
|
[Baumann1995] | Stephan Baumann. A Simplified Attributed Graph Grammar for High-Level Music Recognition. In 3rd International Conference on Document Analysis and Recognition, pages 1080-1083. IEEE, 1995. ISBN 0-8186-7128-9. [ bib | DOI ] |
[Baumann1995a] | Stephan Baumann and Karl Tombre. Report of the line drawing and music recognition working group. In A. Lawrence Spitz and Andreas Dengel, editors, Document Analysis Systems, pages 1080-1083, 1995. [ bib | DOI ] |
[Coueasnon1995] |
Bertrand Coüasnon, Pascal Brisset, and Igor Stéphan.
Using Logic Programming Languages For Optical Music Recognition.
In 3rd International Conference on the Practical Application of
Prolog, 1995.
[ bib |
http ]
Optical Music Recognition is a particular form of document analysis in which there is much knowledge about document structure. Indeed there exists an important set of rules for musical notation, but current systems do not fully use them. We propose a new solution using a grammar to guide the segmentation of the graphical objects and their recognition. The grammar is essentially a description of the relations (relative position and size, adjacency, etc) between the graphical objects. Inspired by Definite Clause Grammar techniques, the grammar can be directly implemented in λProlog, a higher-order dialect of Prolog. Moreover, the translation from the grammar into λProlog code can be done automatically. Our approach is justified by the first encouraging results obtained with a prototype for music score recognition.
|
[Coueasnon1995a] |
Bertrand Coüasnon and Jean Camillerapp.
A Way to Separate Knowledge From Program in Structured Document
Analysis: Application to Optical Music Recognition.
In 3rd International Conference on Document Analysis and
Recognition, pages 1092-1097, 1995.
[ bib |
DOI ]
Optical Music Recognition is a form of document analysis for which a priori knowledge is particularly important. Musical notation is governed by a substantial set of rules, but current systems fail to use them adequately. In complex scores, existing systems cannot overcome the well-known segmentation problems of document analysis, due mainly to the high density of music information. This paper proposes a new method of recognition which uses a grammar in order to formalize the syntactic rules and represent the context. However, where objects touch, there is a discrepancy between the way the existing knowledge (grammar) will describe an object and the way it is recognized, since touching objects have to be segmented first. Following a description of the grammar, this paper shall go on to propose the use of an operator to modify the way the grammar parses the image so that the system can deal with certain touching objects (e.g. where an accidental touches a notehead).
|
[Coueasnon1995b] |
Bertrand Coüasnon and Bernard Rétif.
Using a grammar for a reliable full score recognition system.
In International Computer Music Conference, pages 187-194,
1995.
[ bib |
.pdf ]
Optical Music Recognition needs to be reliable so that users do not have to detect and correct errors by checking the entire recognized score. Reliability can be reached by improving the recognition quality (on segmentation problems) and by making the system able to detect its own recognition errors. This is possible only by using as much of the musical knowledge as possible. Therefore, we propose a grammar to formalize the musical knowledge on full scores with polyphonic staves. We then show how this grammar can help detect most errors in note duration. The presented system is in an implementation phase but is already able to deal with full scores and to point out errors.
|
[Homenda1995] |
Wladyslaw Homenda.
Optical pattern recognition for printed music notation.
In Symposium on OE/Aerospace Sensing and Dual Use Photonics,
1995.
[ bib |
DOI |
http ]
The paper presents problems related to automated recognition of printed music notation. Music notation recognition is a challenging problem in two fields: pattern recognition and knowledge representation. Music notation symbols, though well characterized by their features, are arranged in an elaborate way in real music notation, which makes the recognition task very difficult and still open to new ideas. On the other hand, the aim of the system, i.e. applying the acquired printed music in further processing, requires a special representation of music data. Due to the complexity of music and music notation, music representation is one of the key issues in music notation recognition and music processing. The problems of pattern recognition and knowledge representation in the context of music processing are discussed in this paper. MIDISCAN, the computer system for music notation recognition and music processing, is presented.
|
[Miyao1995] |
Hidetoshi Miyao and Yasuaki Nakano.
Head and stem extraction from printed music scores using a neural
network approach.
In 3rd International Conference on Document Analysis and
Recognition, pages 1074-1079, 1995.
ISBN 0-8186-7128-9.
[ bib |
DOI ]
In an automatic music score recognition system, it is very important to extract heads and stems of notes, since these symbols are the most ubiquitous in a score and musically important. The purpose of our system is to present an accurate and high-speed extraction of note heads (except whole notes) and stems according to the following procedure. (1) We extract all regions which are considered as candidates for stems or heads. (2) To identify heads from the candidates, we use a three-layer neural network. (3) The weights for the network are learned by the back propagation method. In the learning, the network learns the spatial constraints between heads and surroundings rather than the shapes of heads. (4) After the learning process is completed, we use this network to identify a number of test head candidates. (5) The stem candidates touching the detected heads are extracted as true stems. As an experimental result, we obtained high recognition rates of 99.0% and 99.2% for stems and note heads, respectively. It took between 40 and 100 seconds to process a printed piano score on an A4 sheet using a workstation. Therefore, our system can analyze it at least 10 times as fast as manual methods.
|
[Ng1995] |
Kia Ng, Roger Boyle, and David Cooper.
Low- and high-level approaches to optical music score recognition.
In IEE Colloquium on Document Image Processing and Multimedia
Environments, pages 31-36, 1995.
[ bib |
DOI ]
The computer has become an increasingly important device in music. It can not only generate sound but is also able to perform time consuming and repetitive tasks, such as transposition and part extraction, with speed and accuracy. However, a score must be represented in a machine readable format before any operation can be carried out. Current input methods, such as using an electronic keyboard, are time consuming and require human intervention. Optical music recognition (OMR) provides an interesting, efficient and automatic method to transform paper-based music scores into a machine representation. The authors outline the techniques for pre-processing and discuss the heuristic and musical rules employed to enhance recognition. A spin-off application that makes use of the intermediate results to enhance stave lines is also presented. The authors concentrate on the techniques used for time-signature detection, discuss the application of frequently-found rhythmical patterns to clarify the results of OMR, and propose possible enhancements using such knowledge. They believe that domain-knowledge enhancement is essential for complex document analysis and recognition. Other possible areas of development include melodic, harmonic and stylistic analysis to improve recognition results further.
|
[PoulaindAndecy1995] | Vincent Poulain d'Andecy, Jean Camillerapp, and Ivan Leplumey. Analyse de Partitions Musicales. Traitement du Signal, 12 (6): 653-661, 1995. [ bib | http ] |
[Seales1995] |
W. Brent Seales and Arcot Rajasekar.
Interpreting music manuscripts: A logic-based, object-oriented
approach.
In Roland T. Chin, Horace H. S. Ip, Avi C. Naiman, and Ting-Chuen
Pong, editors, Image Analysis Applications and Computer Graphics,
pages 181-188, Berlin, Heidelberg, 1995. Springer Berlin Heidelberg.
ISBN 978-3-540-49298-6.
[ bib |
DOI ]
This paper presents a complete framework for recognizing classes of machine-printed musical manuscripts. Our framework is designed around the decomposition of a manuscript into objects such as staves and bars which are processed with a knowledge base module that encodes rules in Prolog. Object decomposition focuses the recognition problem, and the rule base provides a powerful and flexible way to encode the rules of a particular manuscript class. Our rule-base registers notes and stems, eliminates false-positives and correctly labels notes according to their position on the staff. We present results that show 99% accuracy at detecting note-heads and 95% accuracy in finding stems.
|
[Yoda1995] |
Ikushi Yoda, Kazuhiko Yamamoto, and Hiromitsu Yamada.
Automatic Construction of Recognition Procedures for Musical Notes by
Genetic Algorithm.
In A. Lawrence Spitz and Andreas Dengel, editors, Document
Analysis Systems, 1995.
[ bib |
DOI ]
|
[Bainbridge1994] | David Bainbridge. A complete optical music recognition system: Looking to the future. Technical report, University of Canterbury, 1994a. [ bib | http ] |
[Bainbridge1994a] | David Bainbridge. Optical music recognition: Progress report 1. Technical report, Department of Computer Science, University of Canterbury, 1994b. [ bib | http ] |
[Carter1994] |
Nicholas Paul Carter.
Conversion of the Haydn symphonies into electronic form using
automatic score recognition: a pilot study.
In International Symposium on Electronic Imaging: Science and
Technology, pages 2181 - 2181 - 12, 1994.
[ bib |
DOI |
http ]
As part of the development of an automatic recognition system for printed music scores, a series of "real-world" tasks are being undertaken. The first of these involves the production of a new edition of an existing 104-page, engraved, chamber-music score for Oxford University Press. The next substantial project, which is described here, has begun with a pilot study with a view to conversion of the 104 Haydn symphonies from a printed edition into machine-readable form. The score recognition system is based on a structural decomposition approach which provides advantages in terms of speed and tolerance of significant variations in font, scale, rotation and noise. Inevitably, some editing of the output data files is required, partially due to the limited vocabulary of symbols supported by the system and their permitted superimpositions. However, the possibility of automatically processing the bulk of the contents of over 600 pages of orchestral score in less than a day of compute time makes the conversion task manageable. The influence that this undertaking is having on the future direction of system development also is discussed.
|
[Coueasnon1994] | Bertrand Coüasnon and Jean Camillerapp. Using Grammars to Segment and Recognize Music Scores. In International Association for Pattern Recognition Workshop on Document Analysis Systems, pages 15-27, Kaiserslautern, Germany, 1994. [ bib | .ps ] |
[Essmayr1994] | Wolfgang Essmayr. Optische-Musik-Erkennung (OME), Erkennung von Notenschrift. Master's thesis, Johannes Kepler University Linz, Austria, 1994. [ bib | .ps.gz ] |
[Fahmy1994] |
Hoda M. Fahmy and Dorothea Blostein.
Graph-rewriting approach to discrete relaxation: application to music
recognition.
In International Symposium on Electronic Imaging: Science and
Technology, pages 2181 - 2181 - 12, 1994.
[ bib |
DOI ]
In image analysis, low-level recognition of the primitives plays a very important role. Once the primitives of the image are recognized, depending on the application, many types of analyses can take place. It is likely that associated with each object or primitive is a set of possible interpretations, herein referred to as the label set. The low-level recognizer may associate a probability with each label in the label set. We can use the constraints of the application domain to reduce the ambiguity in the object's identity. This process is variously termed constraint satisfaction, labeling, or relaxation. In this paper, we focus on the discrete form of relaxation. Our contribution lies in the development of a graph-rewriting approach which does not assume the degree of localness is high. We apply our approach to the recognition of music notation, where non-local interactions between primitives must be used in order to reduce ambiguity in the identity of the primitives. We use graph-rewriting rules to express not only binary constraints, but also higher-order notational constraints.
|
[PoulaindAndecy1994] | Vincent Poulain d'Andecy, Jean Camillerapp, and Ivan Leplumey. Kalman filtering for segment detection: application to music scores analysis. In 12th International Conference on Pattern Recognition. IEEE Comput. Soc. Press, 1994a. [ bib | DOI ] |
[PoulaindAndecy1994a] | Vincent Poulain d'Andecy, Jean Camillerapp, and Ivan Leplumey. Détecteur robuste de segments; Application à l'analyse de partitions musicales. In Actes 9 ème Congrés AFCET Reconnaissance des Formes et Intelligence Artificielle, 1994b. [ bib ] |
[Roth1994] | Martin Roth. An approach to recognition of printed music. Technical report, Swiss Federal Institute of Technology, 1994. [ bib | DOI ] |
[Armand1993] |
Jean-Pierre Armand.
Musical score recognition: A hierarchical and recursive approach.
In 2nd International Conference on Document Analysis and
Recognition, pages 906-909, 1993.
[ bib |
DOI ]
Musical scores for live music show specific characteristics: large format, orchestral score, bad quality of (photo) copies. Moreover, such music is generally handwritten. The author addresses the music recognition problem for such scores, and shows a dedicated filtering that has been developed, both for segmentation and correction of copy defects. The recognition process involves the evaluation of geometrical and topographical parameters. The whole process (filtering + recognition) is recursively applied on images and sub-images, in a knowledge-based way.
|
[Baumann1993] | Stephan Baumann. Document recognition of printed scores and transformation into MIDI. Technical report, Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, 1993. [ bib | DOI ] |
[Clarke1993] |
Alastair T. Clarke, B. Malcom Brown, and M. P. Thorne.
Recognizing musical text.
In Machine Vision Applications, Architectures, and Systems
Integration, 1993.
[ bib |
DOI ]
This paper reports on some recent developments in a software product that recognizes printed music notation. There are a number of computer systems available which assist in the task of printing music; however the full potential of these systems cannot be realized until the musical text has been entered into the computer. It is this problem that we address in this paper. The software we describe, which uses computationally inexpensive methods, is designed to analyze a music score, previously read by a flat bed scanner, and to extract the musical information that it contains. The paper discusses the methods used to recognize the musical text: these involve sampling the image at strategic points and using this information to estimate the musical symbol. It then discusses some hard problems that have been encountered during the course of the research; for example the recognition of chords and note clusters. It also reports on the progress that has been made in solving these problems and concludes with a discussion of work that needs to be undertaken over the next five years in order to transform this research prototype into a commercial product.
|
[Fahmy1993] |
Hoda M. Fahmy and Dorothea Blostein.
Graph Grammar Processing of Uncertain Data.
In Advances in Structural and Syntactic Pattern Recognition,
pages 373-382. World Scientific, 1993a.
[ bib |
DOI ]
Graph grammars may be used to extract the information content from diagrams where there is uncertainty about symbol identity. The input to the graph grammar is derived from the output of a symbol recognizer. We propose a way in which uncertainty can be represented by a graph and a method which extracts the information content of the diagram. We consider the application of graph grammars to the recognition of diagrams such as music scores.
|
[Fahmy1993a] |
Hoda M. Fahmy and Dorothea Blostein.
A graph grammar programming style for recognition of music notation.
Machine Vision and Applications, 6 (2): 83-99, 1993b.
ISSN 1432-1769.
[ bib |
DOI |
http ]
Graph grammars are a promising tool for solving picture processing problems. However, the application of graph grammars to diagram recognition has been limited to rather simple analysis of local symbol configurations. This paper introduces the Build-Weed-Incorporate programming style for graph grammars and shows its application in determining the meaning of complex diagrams, where the interaction among physically distant symbols is semantically important. Diagram recognition can be divided into two stages: symbol recognition and high-level recognition. Symbol recognition has been studied extensively in the literature. In this work we assume the existence of a symbol recognizer and use a graph grammar to assemble the diagram's information content from the symbols and their spatial relationships. The Build-Weed-Incorporate approach is demonstrated by a detailed discussion of a graph grammar for high-level recognition of music notation.
|
[Fujinaga1993] |
Ichiro Fujinaga.
Optical music recognition system which learns.
In Enabling Technologies for High-Bandwidth Applications,
1993.
[ bib |
DOI ]
This paper describes an optical music recognition system composed of a database and three interdependent processes: a recognizer, an editor, and a learner. Given a scanned image of a musical score, the recognizer locates, separates, and classifies symbols into musically meaningful categories. This classification is based on the k-nearest neighbor method using a subset of the database that contains features of symbols classified in previous recognition sessions. Output of the recognizer is corrected by a musically trained human operator using a music notation editor. The editor provides both visual and high-quality audio feedback of the output. Editorial corrections made by the operator are passed to the learner which then adds the newly acquired data to the database. The learner's main task, however, involves selecting a subset of the database and reweighing the importance of the features to improve accuracy and speed for subsequent sessions. Good preliminary results have been obtained with everything from professionally engraved scores to hand-written manuscripts.
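The k-nearest neighbor classification with weighted features mentioned in this abstract can be illustrated with a minimal sketch. This is not Fujinaga's implementation: the feature vectors, the weights, and the function name `weighted_knn` are hypothetical, chosen only to show how re-weighted features change which stored symbols count as "nearest".

```python
import math
from collections import Counter


def weighted_knn(query, examples, weights, k=3):
    """Classify `query` by majority vote among its k nearest labelled examples.

    examples: list of (feature_vector, label) pairs (the "database")
    weights:  per-feature importance weights (hypothetical values; in the
              abstract, a learner adjusts these between sessions)
    """
    def distance(a, b):
        # Weighted Euclidean distance between two feature vectors.
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    nearest = sorted(examples, key=lambda ex: distance(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy database: (width, height) features mapped to a symbol label.
database = [
    ((1.0, 1.0), "notehead"),
    ((1.1, 0.9), "notehead"),
    ((0.2, 3.0), "stem"),
    ((0.3, 3.2), "stem"),
    ((0.25, 2.8), "stem"),
]

label = weighted_knn((0.9, 1.2), database, weights=(1.0, 1.0), k=3)
```

In this toy setup, corrections made in an editor would simply be appended to `database`, and a learner would adjust `weights`; how the original system selects its database subset is not specified here.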
|
[Leplumey1993] |
Ivan Leplumey, Jean Camillerapp, and G. Lorette.
A robust detector for music staves.
In 2nd International Conference on Document Analysis and
Recognition, pages 902-905, 1993.
[ bib |
DOI ]
A method for the automatic recognition of music staves based on a prediction-and-check technique is presented in order to extract staves. It can detect lines with some curvature, discontinuities, and inclination. Lines are asserted to be a part of a staff if they can be grouped by five, thus completing the staff. This last phase also identifies additional staff lines.
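The final grouping step described in this abstract (lines belong to a staff if they can be grouped by five) can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' prediction-and-check detector: lines are reduced to a single y-coordinate, and the `tolerance` threshold and the function name `group_into_staves` are hypothetical.

```python
def group_into_staves(line_ys, tolerance=0.25):
    """Group line positions into staves of five evenly spaced lines.

    line_ys:   y-coordinates of detected horizontal lines
    tolerance: allowed relative deviation of each gap from the first gap
               (hypothetical threshold, not taken from the paper)
    Returns (staves, leftovers), where leftovers are lines that do not
    fit into any group of five.
    """
    ys = sorted(line_ys)
    staves, leftovers = [], []
    i = 0
    while i < len(ys):
        candidate = ys[i:i + 5]
        if len(candidate) == 5:
            gaps = [b - a for a, b in zip(candidate, candidate[1:])]
            # Accept the group only if all four gaps are roughly equal.
            if all(abs(g - gaps[0]) <= tolerance * gaps[0] for g in gaps):
                staves.append(candidate)
                i += 5
                continue
        leftovers.append(ys[i])
        i += 1
    return staves, leftovers


staves, leftovers = group_into_staves(
    [10, 20, 30, 40, 50, 75, 120, 130, 140, 150, 160]
)
```

Here the isolated line at y = 75 ends up in `leftovers`; a real detector would also have to handle curvature, gaps, and inclination, which this sketch ignores.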
|
[Modayur1993] |
Bharath R. Modayur, Visvanathan Ramesh, Robert M. Haralick, and Linda G.
Shapiro.
MUSER: A prototype musical score recognition system using
mathematical morphology.
Machine Vision and Applications, 6 (2): 140-150, 1993.
ISSN 1432-1769.
[ bib |
DOI ]
Music representation utilizes a fairly rich repertoire of symbols. These symbols appear on a score sheet with relatively little shape distortion, differing from the prototype symbol shapes mainly by a positional translation and scale change. The prototype system we describe in this article is aimed at recognizing printed music notation from digitized music score images. The recognition system is composed of two parts: a low-level vision module that uses morphological algorithms for symbol detection and a high-level module that utilizes prior knowledge of music notation to reason about spatial positions and spatial sequences of these symbols. The high-level module also employs verification procedures to check the veracity of the output of the morphological symbol recognizer. The system produces an ASCII representation of music scores that can be input to a music-editing system. Mathematical morphology provides us the theory and the tools to analyze shapes. This characteristic of mathematical morphology lends itself well to analyzing and subsequently recognizing music scores that are rich in well-defined musical symbols. Since morphological operations can be efficiently implemented in machine vision systems that have special hardware support, the recognition task can be performed in near real-time. The system achieves accuracy in excess of 95% on the sample scores processed so far with a peak accuracy of 99.7% for the quarter and eighth notes, demonstrating the efficacy of morphological techniques for shape extraction.
|
[Randriamahefa1993] |
R. Randriamahefa, J. P. Cocquerez, C. Fluhr, F. Pepin, and S. Philipp.
Printed music recognition.
In 2nd International Conference on Document Analysis and
Recognition, pages 898-901, 1993.
[ bib |
DOI ]
The different steps to recognize printed music are described. The first step is to detect and to eliminate the staff lines. A robust method is used that finds regions containing only staff lines and then links the staff-line pieces between these regions. After staff-line elimination, symbols are isolated and a representation called an attributed graph is constructed for each symbol. Thinning, polygonalization, spurious segment cleaning, and segment fusion are performed. A first classification, separating all notes with black heads from others, is performed. To recognize notes with black heads (beamed groups or quarter notes), a straightforward structural approach using this representation is sufficient and efficient in most cases. In the ambiguous cases (chord or black head linked to two stems), an ellipse matching method is used. To recognize half notes and bar lines, a structural method using the graph is used.
|
[Baumann1992] |
Stephan Baumann and Andreas Dengel.
Transforming Printed Piano Music into MIDI.
In Advances in Structural and Syntactic Pattern Recognition,
pages 363-372. World Scientific, 1992.
[ bib |
DOI ]
This paper describes a recognition system for transforming printed piano music into the international standard MIDI for acoustic output generation. Because the system is adapted for processing musical scores, it follows a top-down strategy in order to take advantage of the hierarchical structuring. Applying a decision tree classifier and various musical rules, the system achieves a recognition rate of 80 to 100%, depending on the musical complexity of the input. The resulting symbolic representation in terms of so-called MIDI-EVENTs can be easily understood by musical devices such as synthesizers, expanders, or keyboards.
|
[Blostein1992] |
Dorothea Blostein and Henry S. Baird.
A Critical Survey of Music Image Analysis.
In Structured Document Image Analysis, pages 405-434.
Springer Berlin Heidelberg, 1992.
ISBN 978-3-642-77281-8.
[ bib |
DOI ]
The research literature concerning the automatic analysis of images of printed and handwritten music notation, for the period 1966 through 1990, is surveyed and critically examined.
|
[Blostein1992a] |
Dorothea Blostein and Nicholas Paul Carter.
Recognition of Music Notation: SSPR'90 Working Group Report.
In Structured Document Image Analysis, pages 573-574.
Springer Berlin Heidelberg, 1992.
ISBN 978-3-642-77281-8.
[ bib |
DOI |
http ]
This report summarizes the discussions of the Working Group on the Recognition of Music Notation, of the IAPR 1990 Workshop on Syntactic and Structural Pattern Recognition, Murray Hill, NJ, 13-15 June 1990. The participants were: D. Blostein, N. Carter, R. Haralick, T. Itagaki, H. Kato, H. Nishida, and R. Siromoney. The discussion was moderated by Nicholas Carter and recorded by Dorothea Blostein.
|
[Bulis1992] | Alex Bulis, Roy Almog, Moti Gerner, and Uri Shimony. Computerized recognition of hand-written musical notes. In International Computer Music Conference, pages 110-112, 1992. [ bib | http ] |
[Carter1992] |
Nicholas Paul Carter and Richard A. Bacon.
Automatic Recognition of Printed Music.
In Structured Document Image Analysis, pages 456-465.
Springer Berlin Heidelberg, Berlin, Heidelberg, 1992.
ISBN 978-3-642-77281-8.
[ bib |
DOI |
http ]
There is a need for an automatic recognition system for printed music scores. The work presented here forms the basis of an omnifont, size-independent system with significant tolerance of noise and rotation of the original image. A structural decomposition technique is used based on an original transformation of the line adjacency graph. An example of output is given in the form of a data file and its score reconstruction.
|
[Carter1992a] |
Nicholas Paul Carter.
A New Edition of Walton's Façade Using Automatic Score
Recognition.
In Advances in Structural and Syntactic Pattern Recognition,
pages 352-362. World Scientific, 1992a.
[ bib |
DOI ]
The availability of an automatic recognition system for printed music will facilitate applications such as musicological analysis, point-of-sale printing, creation of large format or braille scores and computer-based production of new editions. An example of the last of these possibilities is described here. A score-reading system is under development which makes use of a structural decomposition technique that is intended to be tolerant of significant variation in font, size, notation and noise in the source images. A description is given of the first "real-world" task to be undertaken using the system, i.e. the production of a new edition of Façade by William Walton. Sample output files and their corresponding reconstructions are given together with a discussion of the problems involved and the implications for future work.
|
[Carter1992b] |
Nicholas Paul Carter.
Segmentation and preliminary recognition of madrigals notated in
white mensural notation.
Machine Vision and Applications, 5 (3): 223-229, 1992b.
ISSN 1432-1769.
[ bib |
DOI |
http ]
An automatic music score-reading system will facilitate applications including computer-based editing of new editions, production of databases for musicological research, and creation of braille or large-format scores for the blind or partially-sighted. The work described here deals specifically with initial processing of images containing early seventeenth century madrigals notated in white mensural notation. The problems of segmentation involved in isolating the musical symbols from the word-underlay and decorative graphics are compounded by the poor quality of the originals which present a significant challenge to any recognition system. The solution described takes advantage of structural decomposition techniques based on a novel transformation of the line adjacency graph which have been developed during work on a score-reading system for conventional music notation.
|
[Itagaki1992] |
Takebumi Itagaki, Masayuki Isogai, Shuji Hashimoto, and Sadamu Ohteru.
Automatic Recognition of Several Types of Musical Notation.
In Structured Document Image Analysis, pages 466-476.
Springer Berlin Heidelberg, Berlin, Heidelberg, 1992.
ISBN 978-3-642-77281-8.
[ bib |
DOI |
http ]
This paper describes recent progress towards systems for automatic recognition of several different types of musical notation, including printed sheet music, Braille music, and dance notation.
|
[Kato1992] |
Hirokazu Kato and Seiji Inokuchi.
A Recognition System for Printed Piano Music Using Musical Knowledge
and Constraints.
In Structured Document Image Analysis, pages 435-455.
Springer Berlin Heidelberg, Berlin, Heidelberg, 1992.
ISBN 978-3-642-77281-8.
[ bib |
DOI |
http ]
We describe a recognition system for printed piano music, which presents challenging problems in both image pattern matching and semantic analysis. In music notation, the shape of symbols is simple, but confusing connections and overlaps among symbols occur. In order to deal with these difficulties, proper knowledge is required, so our system adopts a top-down approach based on bar-unit recognition to use musical knowledge and constraints effectively. Recognition results, described with a symbolic playable data format, exceed 90% correct on beginner's piano music.
|
[Martin1992] |
Philippe Martin and Camille Bellisant.
Neural Networks for the Recognition of Engraved Musical Scores.
International Journal of Pattern Recognition and Artificial
Intelligence, 06 (01): 193-208, 1992.
[ bib |
DOI ]
The image analysis levels of a recognition system for engraved musical scores are described. Recognizing musical score images requires an accurate segmentation stage to isolate symbols from staff lines. This symbols/staves segregation is achieved by the use of inscribed line (chord) information. This information, processed by a multilayer perceptron, allows an efficient segmentation in terms of the remaining connected components. Some of these components are then classified, using another network, according to a coding of their skeleton graph. Special attention is paid to the design of the networks: the architectures are adapted to the specificities of each task. Multilayer perceptrons are employed here together with other more classical image analysis techniques which are also presented.
|
[Martin1992a] | Philippe Martin. Artificial neural networks : application to optical musical score recognition. Theses, Université Joseph-Fourier - Grenoble I, 1992. [ bib | http ] |
[Ng1992] |
Kia Ng and Roger Boyle.
Segmentation of Music Primitives.
In David Hogg and Roger Boyle, editors, BMVC92, pages
472-480, London, 1992. Springer London.
ISBN 978-1-4471-3201-1.
[ bib |
DOI ]
In this paper, low-level knowledge-directed pre-processing and segmentation of music scores are presented. We discuss some of the problems that have been overlooked by existing research but have proved to be major obstacles for robust optical music recognisers that help enter music into a computer, including sub-segmentation of interconnected primitives and identification of nonstraight stave lines, and present solutions to these problems. We conclude that, with knowledge, a significant improvement in low-level segmentation can be achieved.
|
[Sicard1992] |
Etienne Sicard.
An efficient method for the recognition of printed music.
In 11th International Conference on Pattern Recognition, pages
573-576, 1992.
[ bib |
DOI ]
Deals with the recognition mechanisms of printed music scores. The techniques for extracting linear features, keys, noteheads and other musical figures from a digitized image are presented. Experimental results are given to show the effectiveness of the proposed methodology, with a discussion of its performance and limits. Applications to fully automated music score extraction, printed or handwritten, are also discussed.
|
[Stevens1992] |
Catherine Stevens and Cyril Latimer.
A comparison of connectionist models of music recognition and human
performance.
Minds and Machines, 2 (4): 379-400, 1992.
ISSN 1572-8641.
[ bib |
DOI |
http ]
Current artificial neural network or connectionist models of music cognition embody feature-extraction and feature-weighting principles. This paper reports two experiments which seek evidence for similar processes mediating recognition of short musical compositions by musically trained and untrained listeners. The experiments are cast within a pattern recognition framework based on the vision-audition analogue wherein music is considered an auditory pattern consisting of local and global features. Local features such as inter-note interval, and global features such as melodic contour, are derived from a two-dimensional matrix in which music is represented as a series of frequencies plotted over time.
|
[Wolman1992] | Amnon Wolman, James Choi, Shahab Asgharzadeh, and Jason Kahana. Recognition of Handwritten Music Notation. In International Computer Music Conference, 1992. [ bib ] |
[Bainbridge1991] | David Bainbridge. Preliminary experiments in musical score recognition, 1991. [ bib ] |
[Blostein1991] | Dorothea Blostein and Lippold Haken. Justification of Printed Music. Communications of the ACM, 34 (3): 88-99, 1991. ISSN 0001-0782. [ bib | DOI ] |
[McGee1991] | William McGee and Paul Merkley. The Optical Scanning of Medieval Music. Computers and the Humanities, 25 (1): 47-53, 1991. ISSN 1572-8412. [ bib | DOI | http ] |
[Ruttenberg1991] | Alan Ruttenberg. Optical Reading of Typeset Music. Master's thesis, Massachusetts Institute of Technology, Boston, MA, 1991. [ bib | .pdf ] |
[Blostein1990] |
Dorothea Blostein and Lippold Haken.
Template matching for rhythmic analysis of music keyboard input.
In 10th International Conference on Pattern Recognition, pages
767-770, 1990.
[ bib |
DOI ]
A system that recognizes common rhythmic patterns through template matching is described. The use of template matching gives the user the unusual ability to modify the set of templates used for analysis. This modification effects a tradeoff between the temporal accuracy required of the input and the complexity of the recognizable rhythm patterns that happen to be common in a particular piece of music. The evolving implementation of this algorithm has received heavy use over a six-year period and has proven itself as a practical and reliable input method for fast music transcription. It is concluded that templates demonstrably provide the necessary temporal context for accurate rhythm recognition.
|
[Diener1990] | Glendon Ross Diener. Modeling music notation: A three-dimensional approach. PhD thesis, Stanford University, Palo Alto, CA, 1990. [ bib | .ps.Z ] |
[Hewlett1990] | Walter B. Hewlett and Eleanor Selfridge-Field, editors. Computing in Musicology: A Directory of Research, volume 6. Center for Computer, 1990. [ bib | .pdf ] |
[Katayose1990] |
H. Katayose, T. Fukuoka, K. Takami, and S. Inokuchi.
Expression extraction in virtuoso music performances.
In 10th International Conference on Pattern Recognition, pages
780-784 vol.1, 1990.
[ bib |
DOI ]
An approach to music interpretation by computers is discussed. A rule-based music interpretation system is being developed that generates sophisticated performance from a printed music score. The authors describe the function of learning how to play music, which is the most important process in music interpretation. The target to be learned is expression rules and grouping strategy: expression rules are used to convert dynamic marks and motives into concrete performance data, and grouping strategy is used to extract motives from sequences of notes. They are learned from a given virtuoso performance. The delicate control of attack timing and of the duration and strength of the notes is extracted by the music transcription function. The performance rules are learned by investigating how the same or similar musical primitives are played in a performance. As for the grouping strategy, the system analyzes how the player grouped music and registers dominant note sequences to extract motives.
|
[Clarke1989] |
Alastair T. Clarke, B. Malcolm Brown, and M. P. Thorne.
Coping with some really rotten problems in automatic music
recognition.
Microprocessing and Microprogramming, 27 (1): 547-550, 1989.
ISSN 0165-6074.
Fifteenth EUROMICRO Symposium on Microprocessing and
Microprogramming.
[ bib |
DOI |
http ]
This paper describes some of the problems encountered, and some of the techniques that have been used and implemented, during the development of an Optical Character Recognition system for printed music. It focuses on the recognition of chords and clusters, subdivision into single “lines” of music, and translation into musical code. Whereas other, mainframe based, music recognition systems have rarely attacked these problems, our methods have given some considerable success with an IBM PC.
|
[Bacon1988] |
Richard A. Bacon and Nicholas Paul Carter.
Recognising music automatically.
Physics Bulletin, 39 (7): 265, 1988.
[ bib |
http ]
Recognising characters typed in at a keyboard is a familiar task to most computers and one at which they excel, except that they (usually) insist on recognising what we have typed, rather than what we meant to type. A number of programs now on the market, however, go rather beyond merely recognising keystrokes on a keyboard, to actually recognising printed words on paper.
|
[Carter1988] |
Nicholas Paul Carter, Richard A. Bacon, and T. Messenger.
The acquisition, representation and reconstruction of printed music
by computer: A review.
Computers and the Humanities, 22 (2): 117-136, 1988.
ISSN 1572-8412.
[ bib |
DOI |
http ]
Material published on the subject of Acquisition, Representation and Reconstruction of printed music by computer is reviewed.
|
[Clarke1988] |
Alastair T. Clarke, B. Malcolm Brown, and M. P. Thorne.
Using a micro to automate data acquisition in music publishing.
Microprocessing and Microprogramming, 24 (1): 549-553, 1988.
ISSN 0165-6074.
Supercomputers: Technology and Applications.
[ bib |
DOI |
http ]
With the number of computer applications involving music information growing, and the transition from traditional music printing methods to computer typesetting that is being faced by music publishers, there is an increasing need for an efficient and accurate method of getting musical information into computers. This paper describes some of the technical problems encountered in developing a system, based upon the IBM PC and a low-cost scanning device, to automatically recognise the printed music notation on a sheet of music that is fed through the scanner.
|
[Fujinaga1988] |
Ichiro Fujinaga.
Optical Music Recognition using Projections.
Master's thesis, McGill University, 1988.
[ bib |
.pdf ]
This research examines the feasibility of implementing an optical music score recognition system on a microcomputer. Projection technique is the principal method employed in the recognition process, assisted by some of the structural rules governing musical notation. Musical examples, excerpted mostly from solo repertoire for monophonic instruments and representing various publishers, are used as samples to develop a computer program that recognizes a set of musical symbols. A final test of the system is undertaken, involving additional samples of monophonic music which were not used in the development stage. With these samples, an average recognition rate of 70% is attained without any operator intervention. On an IBM-AT-compatible microcomputer, the total processing time including the scanning operation is about two minutes per page.
|
[Roach1988] |
J. W. Roach and J. E. Tatem.
Using domain knowledge in low-level visual processing to interpret
handwritten music: an experiment.
Pattern Recognition, 21 (1): 33-44, 1988.
ISSN 0031-3203.
[ bib |
DOI |
http ]
Turning handwritten scores into engraved scores consumes a significant portion of music publishing companies' budgets. Pattern recognition is the major bottleneck holding up automation of this process. Human beings who know music can easily read a handwritten score, but without musical knowledge, even people cannot correctly perceive the markings in a handwritten score. This paper reports an experiment in which knowledge of music, a highly structured domain, is applied to extract primitive musical features. This experiment shows that if the domain of image processing is well defined, significant improvements in low-level segmentations can be achieved.
|
[Kato1987] |
Ichiro Kato, Sadamu Ohteru, Katsuhiko Shirai, Toshiaki Matsushima, Seinosuke
Narita, Shigeki Sugano, Tetsunori Kobayashi, and Eizo Fujisawa.
The robot musician 'wabot-2' (waseda robot-2).
Robotics, 3 (2): 143-155, 1987.
ISSN 0167-8493.
Special Issue: Sensors.
[ bib |
DOI |
http ]
The wabot-2 is an anthropomorphic robot playing keyboard instruments, developed by the study group of Waseda University's Science and Engineering Department. The wabot-2 is equipped with hands tapping softly on keys, with legs handling bass keys and expression pedal, with eyes reading a score, and with a mouth and ears to converse with humans. Based on wabot-2, wasubot has been developed by Sumitomo Electric Industries Ltd., whose artistic skill has been demonstrated in performing music at the Japanese Government Pavilion in Expo'85. The present paper summarizes the wabot-2's motion, visual and vocal subsystems as well as its supervisory system and singing voice-tracking subsystem.
|
[Kim1987] | W. J. Kim, M. J. Chung, and Z. Bien. Recognition system for a printed music score. In TENCON 87- Computers and Communications Technology Toward 2000, pages 573-577, 1987. [ bib | http ] |
[Sugano1987] |
Shigeki Sugano and Ichiro Kato.
WABOT-2: Autonomous robot with dexterous finger-arm-Finger-arm
coordination control in keyboard performance.
In IEEE International Conference on Robotics and Automation,
pages 90-97, 1987.
[ bib |
DOI ]
Advanced robots will have to not only have 'hard' functions but also have 'soft' functions. Therefore, the purpose of this study is to realize 'soft' functions of robots such as dexterity, speediness and intelligence by the development of an anthropomorphic intelligent robot playing keyboard instrument. This paper describes the development of keyboard playing robot WABOT-2(WAseda roBOT-2) with a focus on the mechanisms of arm-and-hand which has 21 degrees of freedom in total, their hierarchically structured control computer system, the information processing method at the high level computer and finger-arm coordination control which realizes the autonomous movement of WABOT-2.
|
[Roads1986] | Curtis Roads. The Tsukuba Musical Robot. Computer Music Journal, 10 (2): 39-43, 1986. ISSN 01489267, 15315169. [ bib | http ] |
[Matsushima1985] |
T. Matsushima, I. Sonomoto, T. Harada, K. Kanamori, and S. Ohteru.
Automated High Speed Recognition of Printed Music (WABOT-2 Vision
System).
In International Conference on Advanced Robotics, pages
477-482, 1985.
[ bib |
http ]
Concerns the intelligent robot WABOT-2, which can play an electronic piano, using ten fingers and feet, while reading printed music. It can hold a conversation with a man using an artificial voice. The paper reports on its vision system, which can recognize not only a printed score but also fine hand-written score or instant lettering score. The resulting musical robot vision performance is sufficient to permit the reading of one sheet of commercially available printed music for an electric piano with three parts. Pertinent data can be recognized in about 15 seconds, with 100% accuracy.
|
[Byrd1984] | Donald Byrd. Music Notation by Computer. PhD thesis, Indiana University, 1984. [ bib | http ] |
[Andronico1982] | Alfio Andronico and Alberto Ciampa. On Automatic Pattern Recognition and Acquisition of Printed Music. In International Computer Music Conference, Venice, Italy, 1982. Michigan Publishing. [ bib | http ] |
[Kassler1972] |
Michael Kassler.
Optical Character-Recognition of Printed Music : A Review of Two
Dissertations. Automatic Recognition of Sheet Music by Dennis Howard Pruslin
; Computer Pattern Recognition of Standard Engraved Music Notation by David
Stewart Prerau.
Perspectives of New Music, 11 (1): 250-254, 1972.
[ bib |
http ]
Stable URL: http://www.jstor.org/stable/832471
|
[Prerau1971] |
David S. Prerau.
Computer pattern recognition of printed music.
In Fall Joint Computer Conference, pages 153-162, 1971.
[ bib ]
The standard notation used to specify most instrumental and vocal music forms a conventionalized, two-dimensional, visual pattern class. This paper discusses computer recognition of the music information specified by a sample of this standard notation. A sample of printed music notation is scanned optically, and a digitized version of the music sample is fed into the computer. The digitized sample may be considered the data-set sensed by the computer. The computer performs the recognition and then produces an output in the Ford-Columbia music representation. Ford-Columbia is an alphanumeric language isomorphic to standard music notation. It is therefore capable of representing the music information specified by the original sample.
|
[Prerau1970] | David S. Prerau. Computer pattern recognition of standard engraved music notation. PhD thesis, Massachusetts Institute of Technology, 1970. [ bib ] |
[Pruslin1966] | Dennis Howard Pruslin. Automatic Recognition of Sheet Music. PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA, 1966. [ bib ] |
[RISM] | Robert Eitner. Répertoire International des Sources Musicales. http://www.rism.info, 1952. [ bib | http ] |