OLAC Record oai:lindat.mff.cuni.cz:11234/1-5835 |
Metadata | ||
Title: | Smashcima | |
Bibliographic Citation: | http://hdl.handle.net/11234/1-5835 | |
Creator: | Mayer, Jiří | |
Pecina, Pavel | ||
Hajič jr., Jan | ||
Date (W3CDTF): | 2024-12-31T20:42:03Z | |
Date Available: | 2024-12-31T20:42:03Z | |
Description: | Smashcima is a library and framework for synthesizing images of handwritten music notation, used to create synthetic training data for optical music recognition (OMR) models. It is primarily intended to be used as part of OMR workflows, especially with domain adaptation in mind. The target user is therefore a machine-learning, document processing, library sciences, or computational musicology researcher with minimal Python programming skills. Smashcima is the only tool that simultaneously: synthesizes handwritten music notation; produces not only raster images but also segmentation masks, classification labels, bounding boxes, and more; synthesizes entire pages as well as individual symbols; synthesizes background paper textures; also synthesizes polyphonic and pianoform music images; accepts just MusicXML as input; and is written in Python, which simplifies its adoption and extensibility. Smashcima therefore brings a unique new capability to OMR: synthesizing a near-realistic image of handwritten sheet music from just a MusicXML file. Unlike notation editors, which work with a fixed set of fonts and layout rules, it can adapt handwriting styles from existing OMR datasets to arbitrary music (beyond the music encoded in those datasets) and randomize layout to simulate the imprecision of handwriting, while guaranteeing the semantic correctness of the rendered output. Crucially, the rendered image is accompanied by the positions of all visual elements of the music notation, so that both object detection-based and sequence-to-sequence OMR pipelines can use Smashcima as a synthesizer of training data. (In combination with the LMX canonical linearization of MusicXML, one can imagine the endless possibilities of running Smashcima on inputs from a MusicXML generator.) | |
Identifier (URI): | http://hdl.handle.net/11234/1-5835 | |
Language: | No linguistic content | |
Language (ISO639): | zxx | |
Publisher: | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) | |
Rights: | Apache License 2.0 | |
http://opensource.org/licenses/Apache-2.0 | ||
Subject: | Optical music recognition | |
synthetic data | ||
handwritten notation | ||
Type: | toolService | |
Type (DCMI): | Software | |
OLAC Info |
||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-5835 | |
DateStamp: | 2024-12-31 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Mayer, Jiří; Pecina, Pavel; Hajič jr., Jan. 2024. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL). | |
Terms: | dcmi_Software iso639_zxx |
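
The GetRecord entries above refer to OAI-PMH requests for this record in the OLAC and simple DC formats. As a minimal sketch of how such a request URL is typically assembled, the following uses the standard OAI-PMH `verb`, `identifier`, and `metadataPrefix` parameters. The endpoint `BASE_URL` and the helper name `get_record_url` are assumptions for illustration only, since the record does not state the repository's base URL.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11234/1-5835"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" requests the OLAC format, "oai_dc" the simple Dublin Core format.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")
print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the repository's actual OAI-PMH endpoint would be required before issuing a real request.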