# DEPRECATION WARNING

This vignette is deprecated. Its content has been moved to the EMU-SDMS manual, where it has also been expanded and updated. See in particular the chapter "The emuDB Format".

# Introduction

This document describes the emuDB format that is used by the emuR package and shows how to create and interact with this format. The emuDB format is meant as a simple, general purpose way of storing speech databases that may contain complex, rich, hierarchical annotations as well as derived and complementary speech data. These different components are described throughout this document, together with examples of how to generate and manipulate them. This document is meant as a practical guide / reference document to the emuDB format. The examples given below can be executed in any R session with the emuR package installed and may of course be adapted to your personal needs.

First let us have a look at the general structure of an emuDB. Whenever a name of the form _XXX is used below, it stands for a varying prefix (or base name) before the _, followed by the obligatory string XXX. For example, _bndl stands for bundle folder names such as rec1_bndl or rec2_bndl. The extension .json denotes a text file in JSON format.

# Database design

The database structure is basically a set of files and folders that adhere to a certain structure and naming convention (see Figure below).

The database root directory must contain a single _DBconfig.json file which, as the name implies, contains the configuration options of the database such as its level definitions, how these levels are linked in the database hierarchy and what is displayed in the EMU-webApp. The database root folder also contains arbitrarily named session folders ending with _ses, e.g. 0000_ses. These session folders can be used to logically group the recordings of a database. All files belonging to a single recording are contained in a so called bundle folder described below. A possible grouping into sessions could for instance be that all recordings of a speaker AAA are contained in one session called AAA_ses.

Each session folder can contain any number of _bndl folders, e.g. rec1_bndl rec2_bndl ... rec9_bndl. All the files belonging to a recording, i.e. all files describing the same timeline of events, are stored in the corresponding bundle folder. This must include the actual recording (.wav) and can contain optional derived / complementary signal files in the SSFF format, such as formants (.fms) or the fundamental frequency (.f0), both of which can be generated using the wrassp package. Each bundle folder must also contain the annotation file (_annot.json) of that bundle. This file contains the actual annotations including the hierarchical linking information. JSON schema files are provided to ensure the syntactic integrity of the database (see the dist/schemaFiles/ directory of the EMU-webApp GitHub repository). The following restrictions apply:

• the root folder of the emuDB must contain exactly one _DBconfig.json file. The prefix of the _DBconfig.json file must match the value of the name field within that file (which specifies the official name of the emuDB), and the root folder of the emuDB must have the same prefix as well. It is recommended, though not obligatory, that the root folder has the suffix _emuDB.
• all session folders must be named using the suffix _ses. Their prefixes can be chosen by the database maintainer.
• all bundle folders must be named using suffix _bndl. Their prefixes can be chosen by the database maintainer; prefixes must be unique within a session but not across sessions.
• all files within a bundle that belong to the bundle have to have the same basename as the _bndl folder prefix, e.g. the signal file in bundle rec1_bndl must have the name rec1.wav to be recognized by the emuDB system.
• the (single) obligatory annotation file within a bundle basename_bndl must have the same prefix as its bundle: basename_annot.json.

Files that do not follow this naming convention will simply be ignored by the database interaction functions of the emuR package (for instance additional multiple audio channels stored in individual audio files).
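The naming rules above can be tried out without emuR at all. The following base R sketch builds a throw-away folder skeleton and then checks which files in the bundle folder match the bundle's base name. The names demo_emuDB, 0000_ses, rec1_bndl and notes.txt are made up for illustration, the files are empty placeholders rather than valid emuDB content, and the matching logic is a simplified assumption about the convention, not emuR's actual implementation.

```r
# Build a throw-away folder skeleton that follows the emuDB naming rules.
# All names are hypothetical; the files are empty placeholders.
root <- file.path(tempfile("layoutDemo"), "demo_emuDB")
bndl <- file.path(root, "0000_ses", "rec1_bndl")
dir.create(bndl, recursive = TRUE)
invisible(file.create(
  file.path(root, "demo_DBconfig.json"),
  file.path(bndl, c("rec1.wav", "rec1_annot.json", "notes.txt"))
))

# Simplified rule: a file counts as part of the bundle if it is named
# <prefix>.<ext> or is the annotation file <prefix>_annot.json
prefix  <- sub("_bndl$", "", basename(bndl))
files   <- list.files(bndl)
belongs <- startsWith(files, paste0(prefix, ".")) |
  files == paste0(prefix, "_annot.json")
print(data.frame(file = files, belongs = belongs))
```

In this sketch notes.txt is flagged as not belonging to the bundle, which mirrors the statement that files outside the naming convention are ignored.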

Optional files that may also be included in the database root directory are the _bundleList.json files. These files specify which annotator is assigned to which bundles. These files are used by EMU-websocket-protocol servers that implement user management to assign the correct bundles to the annotators. The serve() function implemented in the emuR package DOES NOT support user management which means that these files will simply be ignored by this function.

For more detailed information about the file formats used see the File descriptions section of this document. Let us now have a look at creating a new emuDB.

# Creating an emuDB

There are multiple ways of creating emuDBs. The two main strategies are either to convert existing databases or file collections to the new format, or to create new databases from scratch. Refer to the emuR_intro vignette (command: vignette("emuR_intro", package="emuR")) for how existing databases can be converted; the following describes the latter strategy.

## Creating an emuDB from scratch

To create an emuDB from scratch simply call:

# load the package
library(emuR)

create_emuDB(name = 'fromScratchDB',
             targetDir = tempdir(),
             verbose = FALSE)

This will create an empty emuDB that does not have any ssffTrackDefinitions or levelsDefinitions as well as not containing any sessions or bundles. Adding these to the emuDB is described in the next section.

# Editing a database

The initial step in manipulating or generally interacting with a database is to load the database in question into your current R session.

# generate path to the empty fromScratchDB created above
dbPath = file.path(tempdir(), 'fromScratchDB_emuDB')
dbHandle = load_emuDB(dbPath, verbose = FALSE)
print(dbHandle)
## [1] "<emuDBhandle> (dbName = 'fromScratchDB', basePath = '/private/var/folders/yk/8z9tn7kx6hbcg_9n4c1sld980000gn/T/RtmpUNaIbs/fromScratchDB_emuDB')"

This will load the database into its cached form for quick access to the data. Note that if a large emuDB has never been loaded and no cache has previously been generated, this can take a while to complete. Once a cache is present, only altered annotation files have to be updated, which reduces load times dramatically. As you can see, the load_emuDB() function returns a database handle. This emuDBhandle is used to reference the loaded database in most database interaction functions of the emuR package.

Next, let us look at some actual database manipulation functions. The general naming convention for the prefixes of database manipulation functions for loaded databases is:

• add_XXX adds a new instance of XXX; set_XXX sets the current instance of XXX
• list_XXX lists the current instances of XXX; get_XXX gets the current instance of XXX
• remove_XXX removes existing instances of XXX
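As a quick way to see this convention in practice, the sketch below groups the functions exported by emuR according to these prefixes. It only runs if emuR is installed, and the match is purely by name, so it also catches functions such as get_trackdata() that follow the prefix pattern without being configuration functions.

```r
# Group emuR's exported functions by the prefixes named above
# (skipped when emuR is not installed)
if (requireNamespace("emuR", quietly = TRUE)) {
  exported <- getNamespaceExports("emuR")
  managers <- grep("^(add|set|list|get|remove)_", exported, value = TRUE)
  # split by everything before the first underscore
  byPrefix <- split(managers, sub("_.*$", "", managers))
  print(lapply(byPrefix, sort))
}
```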

## Level definitions

Unlike other systems, the EMU Speech Database Management System requires the user to formally define the structure of the database. Levels are an essential structural element of any emuDB. A level is a more general term for what is often referred to as a “tier”. It is more general in the sense that people usually expect tiers to contain time information. Levels can either contain time information if they are of the type “EVENT” or of the type “SEGMENT” but are timeless if they are of the type “ITEM”. Generally speaking, every unit of annotation is referred to as an “ITEM” in the context of an emuDB and “EVENT”s and “SEGMENT”s are special instances of these containing time information in the form of sample values.

The EMU system generally distinguishes between the actual representations of a structural element which are contained within the database and their formal definitions. An example of an actual representation would be a level contained in an annotation file that contains “SEGMENT”s that annotate a recording. The corresponding formal definition would be this level’s level definition, which specifies and validates the level’s existence within the database.

NOTE: if instances are mentioned in the course of this document, the actual representations are meant. Formal definitions are referred to as such.

As the already loaded ‘fromScratchDB’ does not contain any formal definitions of structural elements including levels we will begin by adding such a formal definition in the form of a new level definition:

add_levelDefinition(dbHandle,
                    name = 'Phonetic',
                    type = 'SEGMENT')

To check if this action was successful we can simply list the current level definitions by calling:

list_levelDefinitions(dbHandle)
##       name    type nrOfAttrDefs attrDefNames
## 1 Phonetic SEGMENT            1    Phonetic;

Alternatively, a summary of the emuDB provides this as well as additional information:

summary(dbHandle)
## Name:     fromScratchDB
## UUID:     7e4e80ec-092e-4785-97ce-3226cfc8a361
## Directory:    /private/var/folders/yk/8z9tn7kx6hbcg_9n4c1sld980000gn/T/RtmpUNaIbs/fromScratchDB_emuDB
## Session count: 0
## Bundle count: 0
## Annotation item count:  0
## Label count:  0
##
## Database configuration:
##
## SSFF track definitions:
## NULL
##
## Level definitions:
##       name    type nrOfAttrDefs attrDefNames
## 1 Phonetic SEGMENT            1    Phonetic;
##
## NULL

Let us add a further level definition that will contain the orthographic word transcriptions for the words uttered in our recordings. This level will be of the type “ITEM” meaning that elements contained within the level are sequentially ordered but do not contain any time information:

# add
add_levelDefinition(dbHandle,
                    name = 'Word',
                    type = 'ITEM')
# list
list_levelDefinitions(dbHandle)
##       name    type nrOfAttrDefs attrDefNames
## 1 Phonetic SEGMENT            1    Phonetic;
## 2     Word    ITEM            1        Word;

Finally, we could remove one of the level definitions with the function remove_levelDefinition(). We will not invoke it here, as we still wish to use these level definitions.

NOTE: If there are actual instances of annotation items (“SEGMENT”s, “EVENT”s or “ITEM”s) present in the emuDB it will not be possible to remove the level definition. These items would have to be removed first.
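To show what such a removal looks like without touching ‘fromScratchDB’, the following sketch creates a separate throw-away database, adds a level definition and removes it again. The database name levelDemo and the level name Tone are made up for illustration, and the block only runs if emuR is installed.

```r
if (requireNamespace("emuR", quietly = TRUE)) {
  library(emuR)
  # separate scratch database (hypothetical name) in a fresh folder
  scratchDir <- tempfile("levelDemo")
  dir.create(scratchDir)
  create_emuDB(name = "levelDemo", targetDir = scratchDir, verbose = FALSE)
  h <- load_emuDB(file.path(scratchDir, "levelDemo_emuDB"), verbose = FALSE)

  add_levelDefinition(h, name = "Tone", type = "EVENT")
  levelsBefore <- list_levelDefinitions(h)$name

  # the database has no bundles, so no annotation items block the removal
  remove_levelDefinition(h, name = "Tone")
  levelsAfter <- list_levelDefinitions(h)$name

  print(levelsBefore)
  print(levelsAfter)
}
```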

### Attribute definitions

Each level definition can contain multiple attributes, the most common and currently only supported attribute being a label ("type": "STRING"). Thus it is possible to have multiple parallel labels in a single level. This means that a single annotation item instance can contain multiple labels while sharing other properties such as the start and duration information. This can be quite useful when modeling certain types of data. An illustrative example of this would be the ‘Phonetic’ level created above. It is often the case that databases contain both the phonetic transcript using IPA UTF-8 symbols as well as using the Speech Assessment Methods Phonetic Alphabet (SAMPA). This is an ideal use case for multiple attribute definitions within a single level:

# list
list_attributeDefinitions(dbHandle,
                          levelName = 'Phonetic')
##       name    level   type hasLabelGroups hasLegalLabels
## 1 Phonetic Phonetic STRING          FALSE          FALSE

Even though we have not added a single attribute definition to the ‘Phonetic’ level definition, it already contains the obligatory attribute definition that has the same name as its level. This indicates that it is the primary attribute of that level. To follow the above example, let us now add a further attribute definition to the level definition that will contain the SAMPA versions of our annotations.

# add
add_attributeDefinition(dbHandle,
                        levelName = 'Phonetic',
                        name = 'SAMPA')
## NULL
# list
list_attributeDefinitions(dbHandle,
                          levelName = 'Phonetic')
##       name    level   type hasLabelGroups hasLegalLabels
## 1 Phonetic Phonetic STRING          FALSE          FALSE
## 2    SAMPA Phonetic STRING          FALSE          FALSE

#### Label groups

A further optional field is the labelGroups field. It contains specifications of groups of labels that can be referenced by a name given to the group while querying the emuDB. Say we wish to reference all the long vowels in our Phonetic attribute definition with the name ‘long’ and all our short vowels with the name ‘short’. Let us now update our emuDB to contain these label groups:

# add long vowels
add_attrDefLabelGroup(dbHandle,
                      levelName = 'Phonetic',
                      attributeDefinitionName = 'Phonetic',
                      labelGroupName = 'long',
                      labelGroupValues = c('iː', 'uː'))

# add short vowels
add_attrDefLabelGroup(dbHandle,
                      levelName = 'Phonetic',
                      attributeDefinitionName = 'Phonetic',
                      labelGroupName = 'short',
                      labelGroupValues = c('i', 'u', 'ə'))

# list
list_attrDefLabelGroups(dbHandle,
                        levelName = 'Phonetic',
                        attributeDefinitionName = 'Phonetic')
##    name  values
## 1  long  iː; uː
## 2 short i; u; ə

NOTE: It is also possible to define label groups for the entire DB. For more information on this see the R documentation for the add/list/remove_labelGroups functions.

INFO: For users who are familiar with or transitioning from the legacy EMU system, the label groups correspond to the unfavorably named ‘Legal Labels’ entries of the GTemplate Editor (i.e. legal entries in the .tpl file) of the legacy system. In the new system the legalLabels entries specify the legal / allowed label values of an attribute definition, while the label groups specify groups of labels that can be referenced by the names given to the groups while performing queries.
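Because ‘fromScratchDB’ does not contain any annotations yet, the effect of a label group on a query is easier to see on annotated data. The sketch below uses the ae demo database that create_emuRdemoData() generates. The group name mySibilants and the temporary folder are made up for illustration, and the block only runs if emuR is installed.

```r
if (requireNamespace("emuR", quietly = TRUE)) {
  library(emuR)
  # write the demo data to a fresh folder to avoid clashes with other examples
  demoDir <- tempfile("labelGroupDemo")
  dir.create(demoDir)
  create_emuRdemoData(dir = demoDir)
  ae <- load_emuDB(file.path(demoDir, "emuR_demoData", "ae_emuDB"),
                   verbose = FALSE)

  # define a label group (hypothetical name) on the Phonetic attribute
  sibilants <- c("s", "z", "S", "Z")
  add_attrDefLabelGroup(ae,
                        levelName = "Phonetic",
                        attributeDefinitionName = "Phonetic",
                        labelGroupName = "mySibilants",
                        labelGroupValues = sibilants)

  # the group name can now be used in place of a label in a query
  sl <- query(ae, "Phonetic == mySibilants")
  print(table(sl$labels))
}
```

The query returns every segment whose label is a member of the group, which saves spelling out the individual labels in each query.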

## File handling

Up until now we have defined the structure of our database. An essential part that is missing is of course the recordings that we wish to analyze. To import audio files, referred to as media files in the context of an emuDB, into the database one simply has to do the following:

# get path to folder containing wav files
# (in this case wav files that come with the wrassp package)
fp = system.file('extdata', package='wrassp')
# import media files into emuDB session called filesFromWrassp
import_mediaFiles(dbHandle,
                  dir = fp,
                  targetSessionName = 'filesFromWrassp',
                  verbose = FALSE)
# list session
list_sessions(dbHandle)
##              name
## 1 filesFromWrassp
# list bundles
list_bundles(dbHandle)
##           session   name
## 1 filesFromWrassp lbo001
## 2 filesFromWrassp lbo002
## 3 filesFromWrassp lbo003
## 4 filesFromWrassp lbo004
## 5 filesFromWrassp lbo005
## 6 filesFromWrassp lbo006
## 7 filesFromWrassp lbo007
## 8 filesFromWrassp lbo008
## 9 filesFromWrassp lbo009

We have now added a new session called ‘filesFromWrassp’ to the ‘fromScratchDB’ containing a new bundle for each of our imported media files. These bundles adhere to the structure we have specified above. Note however that the levels in the annotation files (_annot.json) that were created during the import are still empty. These will have to be created manually at a later stage using the EMU-webApp. To list the files that are part of the emuDB call:

# show head of list_files
head(list_files(dbHandle))
## # A tibble: 6 x 4
##   session     bundle file        absolute_file_path
##   <chr>       <chr>  <chr>       <chr>
## 1 filesFromW… lbo001 lbo001.wav  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 2 filesFromW… lbo001 lbo001_ann… /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 3 filesFromW… lbo002 lbo002.wav  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 4 filesFromW… lbo002 lbo002_ann… /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 5 filesFromW… lbo003 lbo003.wav  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 6 filesFromW… lbo003 lbo003_ann… /private/var/folders/yk/8z9tn7kx6hbcg_9n4…

The emuR package also provides a mechanism for adding files to preexisting bundle folders as this can be quite tedious to perform manually due to the nested folder structure of an emuDB. Let us create a set of files that contain the zero-crossing-rate values of the wav files we added above and for the sake of demonstration save them to a different location to then re-add them to the database.

# list all wav files in new emuDB
wavFilePaths = list.files(dbPath,
                          pattern = "wav$",
                          full.names = TRUE,
                          recursive = TRUE)
# create folder to store zcr values in
outDirPath = file.path(tempdir(), 'zcranaVals')
dir.create(outDirPath)
# calculate zero-crossing-rate files
# using zcrana function of wrassp package
library(wrassp)
zcrana(listOfFiles = wavFilePaths,
       outputDirectory = outDirPath)
# add zcr files to emuDB
add_files(dbHandle,
          dir = outDirPath,
          fileExtension = 'zcr',
          targetSessionName = 'filesFromWrassp')
# show head of list_files to check if files were added
head(list_files(dbHandle))
## # A tibble: 6 x 4
##   session     bundle file        absolute_file_path
##   <chr>       <chr>  <chr>       <chr>
## 1 filesFromW… lbo001 lbo001.wav  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 2 filesFromW… lbo001 lbo001.zcr  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 3 filesFromW… lbo001 lbo001_ann… /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 4 filesFromW… lbo002 lbo002.wav  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 5 filesFromW… lbo002 lbo002.zcr  /private/var/folders/yk/8z9tn7kx6hbcg_9n4…
## 6 filesFromW… lbo002 lbo002_ann… /private/var/folders/yk/8z9tn7kx6hbcg_9n4…

## SSFF track definitions

A further important structural element of any emuDB is its set of so-called ssffTracks (often simply referred to as tracks). These ssffTracks reference data that is stored in the Simple Signal File Format (SSFF) in the corresponding _bndl folders. The two main types of data are:

• complementary data that was acquired during the recording such as data acquired during electromagnetic articulographic (EMA) or electropalatography (EPG) recordings;
• derived data, i.e. data that was calculated from the original audio signal such as formant values and their bandwidths or the short-term Root Mean Square amplitude of the signal.

Let us now add an ssffTrackDefinition to our database and calculate the SSFF files at the same time:

# add track and calculate SSFF files by specifying
# one of the signal processing functions the wrassp package provides
# (in this case the forest (formant estimation) function)
add_ssffTrackDefinition(dbHandle,
                        name = 'formantValues',
                        columnName = 'fm',
                        fileExtension = 'fms',
                        onTheFlyFunctionName = 'forest')
# list
list_ssffTrackDefinitions(dbHandle)
head(list_files(dbHandle))

INFO: to see the fileExtension and columnName defaults produced by the various signal processing functions of the wrassp package see ?wrasspOutputInfos. For a list of all the available signal processing functions that the wrassp package provides see ?wrassp.
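As a small illustration of this lookup, the sketch below reads the defaults for the forest function from the wrasspOutputInfos object. It assumes that the entry for a function exposes the fields ext and tracks, and it only runs if wrassp is installed.

```r
if (requireNamespace("wrassp", quietly = TRUE)) {
  # defaults for the formant estimation function used above
  forestInfo <- wrassp::wrasspOutputInfos$forest
  print(forestInfo$ext)     # default file extension of the SSFF output
  print(forestInfo$tracks)  # column (track) names written to the file
}
```

The values reported here are the ones to pass as fileExtension and columnName when defining a track for files produced by that function.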

As you might have noticed the .zcr files we added in the previous section are listed as being part of the bundles but have no ssffTrackDefinition associated with them. Let’s fix that by adding another ssffTrackDefinition to the database:

# add
add_ssffTrackDefinition(dbHandle,
                        name = 'zeroCrossing',
                        columnName = 'zcr',
                        fileExtension = 'zcr')
# list
list_ssffTrackDefinitions(dbHandle)
##            name columnName fileExtension
## 1 formantValues         fm           fms
## 2  zeroCrossing        zcr           zcr

INFO: as the get_trackdata() function can apply signal processing functions and calculate all necessary values in real time, it is seldom necessary to define ssffTracks for tracks produced by the wrassp package. For complementary data, as well as data that has to be manipulated manually (e.g. manual formant corrections), defining ssffTracks remains necessary. Also, if you wish to display SSFF data in the EMU-webApp it is necessary to pre-calculate the ssffTracks, as the web application cannot perform real-time calculations.
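The on-the-fly route can be sketched on the annotated ae demo database, since ‘fromScratchDB’ has no segments to query yet. The folder name is made up for illustration, the query for "n" segments is just an example, and the block only runs if emuR is installed.

```r
if (requireNamespace("emuR", quietly = TRUE)) {
  library(emuR)
  # write the demo data to a fresh folder to avoid clashes with other examples
  demoDir <- tempfile("trackDemo")
  dir.create(demoDir)
  create_emuRdemoData(dir = demoDir)
  ae <- load_emuDB(file.path(demoDir, "emuR_demoData", "ae_emuDB"),
                   verbose = FALSE)

  # pick some segments and compute formants for them on the fly,
  # without relying on an ssffTrackDefinition
  nasals <- query(ae, "Phonetic == n")
  formants <- get_trackdata(ae,
                            seglist = nasals,
                            onTheFlyFunctionName = "forest",
                            verbose = FALSE)
  print(head(formants))
}
```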

Note also that there are currently two special ssffTrackDefinitions. They are special in the sense that if they have either the name “FORMANTS” or the name “EPG”, the EMU-webApp will expect the corresponding SSFF files to be formatted in a specific way and will also display them differently from the other tracks. If the track is named “FORMANTS” and this track is assigned to be overlaid on the spectrogram, the EMU-webApp will frequency-align the formant contours to the spectrogram and will permit these contours to be manually corrected. If the track is called “EPG” and the EMU-webApp is configured to display this track in the twoDimCanvases, it will display an EPG plot of the data (see the File descriptions section of this document for more information on twoDimCanvases).

## Configuring the EMU-webApp

Before we can start manually annotating our speech database we have to configure our ‘fromScratchDB’ to contain information about how the database is to be displayed by the EMU-webApp. The EMU-webApp subdivides different ways to look at an emuDB into so called perspectives. These perspectives, between which you can switch in the web application, contain information on what levels are displayed, which ssffTracks are drawn, and so on. Let us list the current perspectives of our database:

# list
list_perspectives(dbHandle)
##      name signalCanvasesOrder levelCanvasesOrder
## 1 default          OSCI; SPEC

As you can see there is already a perspective available called ‘default’. This perspective was automatically added to the emuDB during the import of our mediaFiles. It currently only displays the oscillogram (“OSCI”) followed by the spectrogram (“SPEC”). “OSCI” and “SPEC” can be viewed as predefined tracks that are always available to the EMU-webApp. Using the add/remove_perspective() functions we could now add and remove as many additional perspectives to the database as we like. For now we will maintain the ‘default’ perspective and add the order in which we would like to display our levels.

# get order array of levels of default perspective
get_levelCanvasesOrder(dbHandle,
                       perspectiveName = 'default')
## NULL
# set order array of levels of default perspective
set_levelCanvasesOrder(dbHandle,
                       perspectiveName = 'default',
                       order = c('Phonetic'))

# get order array of levels of default perspective
get_levelCanvasesOrder(dbHandle,
                       perspectiveName = 'default')
## [1] "Phonetic"

As you can see we only added the “Phonetic” and not the “Word” level to be displayed in the “default” perspective as only levels of the type “SEGMENT” or “EVENT” are allowed to be displayed. All “ITEM” levels can be viewed by clicking the “showHierarchy” button in the top menu bar of the EMU-webApp and choosing an appropriate path through the hierarchy.

As the final configuration step let us also add the ssffTracks we defined and calculated above to the “default” perspective:

# get order array of signals of default perspective
get_signalCanvasesOrder(dbHandle,
                        perspectiveName = 'default')
## [1] "OSCI" "SPEC"
# set order array of signals of default perspective
set_signalCanvasesOrder(dbHandle,
                        perspectiveName = 'default',
                        order = c("OSCI", "SPEC", "formantValues", "zeroCrossing"))
# get order array of signals of default perspective
get_signalCanvasesOrder(dbHandle,
                        perspectiveName = 'default')
## [1] "OSCI"          "SPEC"          "formantValues" "zeroCrossing"

We have now completed the configuration of the ‘fromScratchDB’ emuDB. By calling serve(dbHandle) we can now start a server in our R session and connect the EMU-webApp to our database to visualize and annotate the emuDB.

INFO: the EMU-webApp is highly configurable and only a small subset of the configuration options are available through the emuR package. More complex visualization configurations can be achieved by manually editing the _DBconfig.json file and reloading the database. For a comprehensive list of all the available fields in the _DBconfig.json and their meanings see the File descriptions section of this document.
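The add/remove_perspective() functions mentioned earlier in this section were not demonstrated on ‘fromScratchDB’. The following sketch shows them on a separate throw-away database. The database name perspDemo and the perspective name formantCorrection are made up for illustration, and the block only runs if emuR is installed.

```r
if (requireNamespace("emuR", quietly = TRUE)) {
  library(emuR)
  # separate scratch database (hypothetical name) in a fresh folder
  scratchDir <- tempfile("perspectiveDemo")
  dir.create(scratchDir)
  create_emuDB(name = "perspDemo", targetDir = scratchDir, verbose = FALSE)
  h <- load_emuDB(file.path(scratchDir, "perspDemo_emuDB"), verbose = FALSE)

  # add a second perspective, inspect the list, then remove it again
  add_perspective(h, name = "formantCorrection")
  withExtra <- list_perspectives(h)$name
  remove_perspective(h, name = "formantCorrection")
  withoutExtra <- list_perspectives(h)$name

  print(withExtra)
  print(withoutExtra)
}
```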

# Autobuilding

Autobuilding is a process that lets the emuDB maintainer semi-automatically build hierarchical structures from preexisting annotations by linking annotational units together. To have some preexisting annotations to play with, let us convert a TextGridCollection and load the newly created emuDB into the current R session.

# create demo data in folder provided by the tempdir() function
create_emuRdemoData(dir = tempdir())
# get the path to a folder containing .wav & .TextGrid files that is part of the demo data
path2folder = file.path(tempdir(), "emuR_demoData", "TextGrid_collection")
# convert this TextGridCollection to the emuDB format
convert_TextGridCollection(path2folder, dbName = "myTGcolDB",
                           targetDir = tempdir(), verbose = FALSE)
dbHandle = load_emuDB(file.path(tempdir(), "myTGcolDB_emuDB"), verbose = FALSE)

By inspecting the emuDB we can see that it has eleven levelDefinitions but no linkDefinitions. This means that it will not be possible to perform hierarchical queries on this emuDB, as there is no explicit hierarchical information in the database.

# list levels
list_levelDefinitions(dbHandle)
##            name    type nrOfAttrDefs  attrDefNames
## 1     Utterance SEGMENT            1    Utterance;
## 2  Intonational SEGMENT            1 Intonational;
## 3  Intermediate SEGMENT            1 Intermediate;
## 4          Word SEGMENT            1         Word;
## 5        Accent SEGMENT            1       Accent;
## 6          Text SEGMENT            1         Text;
## 7      Syllable SEGMENT            1     Syllable;
## 8       Phoneme SEGMENT            1      Phoneme;
## 9      Phonetic SEGMENT            1     Phonetic;
## 10         Tone   EVENT            1         Tone;
## 11         Foot SEGMENT            1         Foot;
# list ssffTracks
list_ssffTrackDefinitions(dbHandle)
## NULL

As it is a very laborious task to manually link ITEMs together using the EMU-webApp and the hierarchical information is already implicitly contained in the time information of the SEGMENTs and EVENTs of each level (see figure below), the emuR package provides a function to build these hierarchical structures from this information.

For the sake of brevity let’s focus on just three of the eleven levels. We will use the autobuild_linkFromTimes() function to build a hierarchical structure in which “Syllable” dominates “Phoneme” and “Phoneme” dominates “Phonetic”.

The convertSuperlevel argument of the autobuild_linkFromTimes() function, which we will set to TRUE in the example below, tells the function to convert the super level to a level of type ITEM. This is a risky procedure, because all time information will be deleted from the “Syllable” level. The function therefore automatically creates a backup of the level called “Syllable-autobuildBackup”. Before we can invoke the autobuild function, however, we must first add a linkDefinition to our emuDB that specifies the type of relationship between our levels:

# add linkDefinition
add_linkDefinition(dbHandle,
                   type = "ONE_TO_MANY",
                   superlevelName = "Syllable",
                   sublevelName = "Phoneme")

# list
list_linkDefinitions(dbHandle)
##          type superlevelName sublevelName
## 1 ONE_TO_MANY       Syllable      Phoneme
# invoke autobuild function
autobuild_linkFromTimes(dbHandle,
                        superlevelName = "Syllable",
                        sublevelName = "Phoneme",
                        convertSuperlevel = TRUE,
                        verbose = FALSE)

# list
list_levelDefinitions(dbHandle)
##                        name    type nrOfAttrDefs              attrDefNames
## 1                 Utterance SEGMENT            1                Utterance;
## 2              Intonational SEGMENT            1             Intonational;
## 3              Intermediate SEGMENT            1             Intermediate;
## 4                      Word SEGMENT            1                     Word;
## 5                    Accent SEGMENT            1                   Accent;
## 6                      Text SEGMENT            1                     Text;
## 7                  Syllable    ITEM            1                 Syllable;
## 8                   Phoneme SEGMENT            1                  Phoneme;
## 9                  Phonetic SEGMENT            1                 Phonetic;
## 10                     Tone   EVENT            1                     Tone;
## 11                     Foot SEGMENT            1                     Foot;
## 12 Syllable-autobuildBackup SEGMENT            1 Syllable-autobuildBackup;

As we can see we have now converted the original “Syllable” level to the type ITEM and the backup level was added to the emuDB. Let us now perform the same procedure for the “Phoneme” and “Phonetic” levels:

# add linkDefinition
add_linkDefinition(dbHandle,
                   type = "MANY_TO_MANY",
                   superlevelName = "Phoneme",
                   sublevelName = "Phonetic")

# list
list_linkDefinitions(dbHandle)
##           type superlevelName sublevelName
## 1  ONE_TO_MANY       Syllable      Phoneme
## 2 MANY_TO_MANY        Phoneme     Phonetic
# invoke autobuild function
autobuild_linkFromTimes(dbHandle,
                        superlevelName = "Phoneme",
                        sublevelName = "Phonetic",
                        convertSuperlevel = TRUE,
                        verbose = FALSE)

# list
list_levelDefinitions(dbHandle)
##                        name    type nrOfAttrDefs              attrDefNames
## 1                 Utterance SEGMENT            1                Utterance;
## 2              Intonational SEGMENT            1             Intonational;
## 3              Intermediate SEGMENT            1             Intermediate;
## 4                      Word SEGMENT            1                     Word;
## 5                    Accent SEGMENT            1                   Accent;
## 6                      Text SEGMENT            1                     Text;
## 7                  Syllable    ITEM            1                 Syllable;
## 8                   Phoneme    ITEM            1                  Phoneme;
## 9                  Phonetic SEGMENT            1                 Phonetic;
## 10                     Tone   EVENT            1                     Tone;
## 11                     Foot SEGMENT            1                     Foot;
## 12 Syllable-autobuildBackup SEGMENT            1 Syllable-autobuildBackup;
## 13  Phoneme-autobuildBackup SEGMENT            1  Phoneme-autobuildBackup;

This time we chose to add a linkDefinition of the type MANY_TO_MANY between the two levels. This is because reduction processes can cause multiple phonemes to be produced as a single phone, while insertion processes can cause a single phoneme to be produced as multiple phones. We have now created the hierarchical structure described above that we were aiming for.
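The payoff of the autobuilt links is that hierarchical queries become possible. The following self-contained sketch repeats the Syllable/Phoneme step on a fresh copy of the demo data and then queries across the new link. The database name hierDemoDB and the temporary folder are made up for illustration, the query assumes that the “Syllable” level carries the label "S", and the block only runs if emuR is installed.

```r
if (requireNamespace("emuR", quietly = TRUE)) {
  library(emuR)
  # fresh folder so that nothing clashes with the databases created above
  demoDir <- tempfile("hierDemo")
  dir.create(demoDir)
  create_emuRdemoData(dir = demoDir)
  convert_TextGridCollection(
    file.path(demoDir, "emuR_demoData", "TextGrid_collection"),
    dbName = "hierDemoDB", targetDir = demoDir, verbose = FALSE)
  db <- load_emuDB(file.path(demoDir, "hierDemoDB_emuDB"), verbose = FALSE)

  # link the two levels and derive the links from the time information
  add_linkDefinition(db, type = "ONE_TO_MANY",
                     superlevelName = "Syllable", sublevelName = "Phoneme")
  autobuild_linkFromTimes(db,
                          superlevelName = "Syllable",
                          sublevelName = "Phoneme",
                          convertSuperlevel = TRUE,
                          verbose = FALSE)

  # all Phoneme items dominated by a Syllable item labeled "S"
  strongPhonemes <- query(db, "[Phoneme =~ .* ^ Syllable == S]")
  print(head(strongPhonemes))
}
```

Without the linkDefinition and the autobuilt links, a dominance query of this kind would have no hierarchical information to work with.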

# remove the newly generated emuDB and emuR_demoData as we will not be needing them
# throughout the rest of this vignette
unlink(file.path(tempdir(), 'emuR_demoData'), recursive = TRUE)
unlink(file.path(tempdir(), 'myTGcolDB_emuDB'), recursive = TRUE)

# File descriptions

## _DBconfig.json

The DBconfig file, as mentioned above, contains the configuration options of the database. People familiar with the legacy EMU system will recognize this as the replacement file for the legacy template (.tpl) file. By convention, variables / strings written entirely in capital letters indicate a constant that usually has a special meaning. This also applies to such strings found in the DBconfig ("STRING", "ITEM", "SEGMENT", "EVENT", "OSCI", … ).

The _DBconfig.json file contains the following fields:

• "name" specifying the name of the database
• "UUID" a unique ID given to each database
• "mediafileExtension" the main mediafileExtension (currently only uncompressed mono 16-bit .wav files are supported in every component of the EMU system. This is also the recommended audio format for the EMU-SDMS.)
• "ssffTrackDefinitions" an array of definitions defining the SSFF tracks of the database. Each ssffTrackDefinition consists of:
• "name" the name of the ssffTrackDefinition
• "columnName" the name of the column of the associated SSFF file. For more information on the columns that the various functions of the wrassp package produce, see the track fields of the wrasspOutputInfos object that is part of the wrassp package. Further, although the SSFF file format is a binary file format, it has a plain text header, which means that if you open an SSFF file in the text editor of your choice you will be able to see the columns contained within it. Another way of accessing column information about a specific SSFF file is to use the wrassp function res = read.AsspDataObj('/path/2/SSFF/file') to read the file from the file system. names(res) will then give you the names of the columns present in this file. NOTE: In the context of the SSFF file format the term column is used, whereas in the context of the EMU system the term track / ssffTrack is used. Both refer to the same data.
• "fileExtension" the file extension of the associated SSFF file (also see ?wrasspOutputInfos for the default extensions produced by the wrassp functions)
• "levelDefinitions" array of definitions defining the levels of the database. A level is a more general term for what is often referred to as a tier. It is more general in the sense that people quite often expect tiers to contain time information. Levels, however, can either contain time information if they are of the type "EVENT" or of the type "SEGMENT", or be timeless if they are of the type "ITEM". Each levelDefinition consists of:
• "name" the name of the levelDefinition
• "type" specifying the type of the level (either "ITEM" | "EVENT" | "SEGMENT")
• "attributeDefinitions" an array of definitions defining the attributes of the level. Each attributeDefinition consists of:
• "name" the name of the "attributeDefinition"
• "type" specifying the type of the attribute (currently only "STRING" permitted)
• "labelGroups" an (optional) array containing label group definitions. These can be used as a shorthand notation for querying certain groups of labels.
• "name" name of label group. This will be the value used in a query to refer to this group.
• "values" array of strings representing the labels
• "legalLabels" (optional) array of strings specifying which labels are valid / legal for this attribute definition. The EMU-webApp adheres to this set of values and will not let the annotator enter any values other than the ones specified in this field. This can be used to ensure consistent label sets within levels.
• "anagestConfig" if specified (optional) this will convert the level into a special type of level for labeling articulatory data. This will also serve as a marker for the EMU-webApp to treat this level differently. This optional field may only be set for levels of the type "EVENT".
• "verticalPosSsffTrackName" name of ssffTrack containing the vertical position data
• "velocitySsffTrackName" name of ssffTrack containing the velocity data
• "autoLinkLevelName" name of level that will be used to link the created events to
• "multiplicationFactor" factor to multiply with (either -1 | 1)
• "threshold" a value between 0 and 1 defining the threshold
• "gestureOnOffsetLabels" array containing two strings that specify the on- and offset labels
• "maxVelocityOnOffsetLabels" array containing two strings that specify the on- and offset labels
• "constrictionPlateauBeginEndLabels" array containing two strings that specify the begin- and end labels
• "maxConstrictionLabel" string specifying the maximum constriction label
• "linkDefinitions" an array of definitions defining the links between levels of the database. The combination of all link definitions specifies the hierarchy of the database. Each linkDefinition consists of:
• "type" specifying the type of link (either "ONE_TO_MANY" | "MANY_TO_MANY" | "ONE_TO_ONE").
• "superlevelName" specifies the name of the super-level
• "sublevelName" specifies the name of the sub-level
• "labelGroups" an (optional) array containing label group definitions. These can be used as a shorthand notation for querying certain groups of labels. Compared to the "labelGroups" that can be defined within an attributeDefinition the labelGroups defined here are globally defined for the entire database.
• "name" name of label group
• "values" array of strings containing labels
• "EMUwebAppConfig" specifies the configuration options intended for the EMU-webApp, i.e. how the database is to be displayed. This field can contain all the configuration options that are specified in the EMU-webApp’s configuration schema (see the dist/schemaFiles/emuwebappConfigSchema.json file of the EMU-webApp GitHub repository). The "EMUwebAppConfig" contains the following fields:
• "main" main behavior options
• "autoConnect": auto connect to the "serverUrl" on initial load of the webApp to automatically load a database (mainly used for development).
• "serverUrl": default server URL that is displayed in the connect modal (and used if "autoConnect" is set to true). The default: "ws://localhost:17890" points to the server started by the serve() function of the emuR package.
• "serverTimeoutInterval": the maximum amount of time the EMU-webApp waits (in milliseconds) for the server to respond.
• "comMode": communication mode that the EMU-webApp is in. Currently the only option that is available is "WS" (websocket).
• "catchMouseForKeyBinding": check if mouse has to be in labeler for key bindings to work
• "keyMappings" keyboard shortcut definitions. For the sake of brevity not every key-code is shown (see schema for extensive list)
• "toggleSideBarLeft" integer value that represents the key-code that toggles the left side bar (== bundleList side bar)
• "toggleSideBarRight" integer value that represents the key-code that toggles the right side bar (== perspective side bar)
• "spectrogramSettings" specifies the default settings of the spectrogram. The possible settings are:
• "windowSizeInSecs" specifies the window size in seconds
• "rangeFrom" specifies the lowest frequency (in Hz) that will be displayed by the spectrogram
• "rangeTo" specifies the highest frequency (in Hz) that will be displayed by the spectrogram
• "dynamicRange" specifies the dynamic range for the maximum (in dB)
• "window" specifies the window type (either "BARTLETT" | "BARTLETTHANN" | "BLACKMAN" | "COSINE" | "GAUSS" | "HAMMING" | "HANN" | "LANCZOS" | "RECTANGULAR" | "TRIANGULAR")
• "preEmphasisFilterFactor" specifies the preemphasis factor (in formula: s’(k) = s(k) - preEmphasisFilterFactor * s(k-1) )
• "transparency" specifies the transparency of the spectrogram (integer from 0 to 255)
• "drawHeatMapColors" (optional) should the spectrogram be drawn using heat-map colors (either true or false)
• "heatMapColorAnchors" (optional) specify the heat-map color anchors (array of the form [[255, 0, 0], [0, 255, 0], [0, 0, 255]])
• "perspectives" array containing perspective configurations. Each "perspective" consists of:
• "name" name of perspective
• "signalCanvases" configuration options for the signalCanvases
• "order" array specifying the order in which the ssffTracks are to be displayed. Note that the ssffTrack names “OSCI” and “SPEC” are always available in addition to the ssffTracks defined in the database.
• "assign" array of configuration options to assign one ssffTrack to another effectively creating a visual overlay of one track over another. Each array element consists of:
• "signalCanvasName" name of signal specified in the "order" array
• "ssffTrackName" name of ssffTrack to overlay onto "signalCanvasName"
• "minMaxValLims" array of configuration options to limit the y-axis range that is displayed for a specified SSFF track
• "ssffTrackName": name specifying which ssffTrack should be limited
• "minVal": minimum value which defines the lower y-axis limit
• "maxVal": maximum value which defines the upper y-axis limit
• "contourLims" array containing contour limit values that specify an index range that is to be displayed. As a track / column can contain multi-dimensional data (e.g. 4 formant values per time stamp / 256 DFT values per time stamp / …) it is possible to specify an index range that specifies which values should be displayed (e.g. display formant 2 through 4).
• "ssffTrackName" name specifying which ssffTrack should be limited
• "minContourIdx" minimum contour index to display (starts at index 0)
• "maxContourIdx" maximum contour index to display
• "contourColors" array to specify colors of individual contours. This overrides the default of automatically calculating distinct colors for each contour
• "ssffTrackName" name of the ssffTrack for which the colors are defined
• "colors" array of rgb strings (e.g. ["rgb(238,130,238)", "rgb(127,255,212)"]) to specify the color of the contour (first value = first contour color and so on)
• "levelCanvases" configuration options for the levelCanvases
• "order" array specifying order in which the levels are to be displayed. Note that only levels of the type “EVENT” or “SEGMENT” can be displayed as "levelCanvases"
• "twoDimCanvases" configuration options for the 2D canvas
• "order" array specifying the order in which the twoDimDrawingDefinitions are to be displayed. Note that currently only a single twoDimDrawingDefinition can be displayed, so this array may currently only contain a single element.
• "twoDimDrawingDefinitions" array containing two-dimensional drawing definitions. Each two-dimensional drawing definition consists of:
• "dots" array containing dot definitions. Each dot definition consists of:
• "name" name of dot
• "xSsffTrack" ssffTrackName of track that contains the x axis values
• "xContourNr" contour number of track that contains the x axis values
• "ySsffTrack" ssffTrackName of track that contains the y axis values
• "yContourNr" contour number of track that contains the y axis values
• "color" rgb color string specifying color given to dot
• "connectLines" array specifying which of the dots specified in the "dots" definition array should be connected by a line
• "fromDot" dot from which the line should start
• "toDot" dot to which the line should go
• "color" rgb string defining the color of the line
• "staticDots" array containing static dot definitions
• "name" name of static dots
• "xNameCoordinate" x coordinate specifying the location where name should be drawn
• "yNameCoordinate" y coordinate specifying the location where name should be drawn
• "xCoordinates" array of x coordinates (e.g. [300, 300, 900, 900, 300])
• "yCoordinates" array of y coordinates (e.g. [880, 2540, 2540, 880, 880])
• "connect" boolean value that specifies whether to connect the static dots with lines
• "color" rgb string specifying color of static dots
• "staticContours" array containing static contour definitions
• "name" name of static contour
• "xSsffTrack" ssffTrackName of track that contains the x axis values
• "xContourNr" contour number of track that contains the x axis values
• "ySsffTrack" ssffTrackName of track that contains the y axis values
• "yContourNr" contour number of track that contains the y axis values
• "connect" boolean value that specifies whether to connect the points of the static contour with lines
• "color" rgb string specifying color of static contour
• "labelCanvasConfig" configuration options for the label canvases
• "addTimeMode" mode to add / subtract time to boundaries
• "addTimeValue": number of samples added to / subtracted from boundaries
• "newSegmentName" value given to default label if a new SEGMENT is added (default is "" == empty string)
• "newEventName" value given to default label if a new EVENT is added (default is "" == empty string)
• "restrictions"
• "playback" boolean value specifying whether to allow audio playback
• "correctionTool" boolean value specifying whether the correction tool is available
• "editItemSize" boolean value specifying whether to allow changing the size of an ITEM (i.e. move boundaries)
• "editItemName" boolean value specifying whether to allow changing the label of an ITEM
• "deleteItemBoundary" boolean value specifying whether to allow deletion of boundaries
• "deleteItem" boolean value specifying whether to allow the deletion of entire ITEMs
• "deleteLevel" boolean value specifying whether to allow the deletion of entire levels
• "addItem" boolean value specifying whether to allow the adding of new ITEMs
• "drawCrossHairs" boolean value specifying whether to draw the cross hairs on signal canvases
• "drawSampleNrs" boolean value specifying whether to draw the sample numbers in the OSCI canvas if zoomed in close enough to see samples (mainly for debugging / development purposes)
• "drawZeroLine" boolean value specifying whether to draw zero value line in OSCI canvas
• "bundleComments" boolean value specifying whether to allow the annotator to add comments to bundles she / he has annotated. A bundle comment field will show up in the bundle list side bar for each bundle if this is set to true. Note that the server has to support saving these comments which the serve() function of the emuR package doesn’t.
• "bundleFinishedEditing" boolean value specifying whether to allow the annotator to mark when she / he has finished annotating a bundle. A finished editing toggle button will show up in the bundle list side bar for each bundle if this is set to true. Note that the server has to support saving this status, which the serve() function of the emuR package doesn’t.
• "showPerspectivesSidebar" boolean value specifying whether to show the perspectives side bar
• "activeButtons" specifications of which top-/bottom-menu buttons should be active / displayed by the EMU-webApp
• "addLevelSeg" boolean value specifying whether to show the add SEGMENT level button in the top menu bar
• "addLevelEvent" boolean value specifying whether to show the add EVENT level button in the top menu bar
• "renameSelLevel" boolean value specifying whether to allow the user to rename the currently selected level
• "downloadTextGrid" boolean value specifying whether to allow the user to download the current annotation as a TextGrid file by displaying a download TextGrid button in the top menu bar
• "downloadAnnotation" boolean value specifying whether to allow the user to download the current annotation as an annotJSON file by displaying a download annotJSON button in the top menu bar
• "specSettings" boolean value specifying whether to show the spec. settings button in the top menu bar
• "connect" boolean value specifying whether to display the connect button in the top menu bar
• "clear" boolean value specifying whether to display the clear button in the top menu bar
• "deleteSingleLevel" boolean value specifying whether to allow the user to delete a level containing time information
• "resizeSingleLevel" boolean value specifying whether to allow the user to resize a level
• "saveSingleLevel" boolean value specifying whether to allow the user to download a single level in the ESPS/waves+ format
• "resizeSignalCanvas" boolean value specifying whether to allow the user to resize the signalCanvases ("OSCI", "SPEC", …)
• "openDemoDB" boolean value specifying whether to show the open demoDB button
• "saveBundle" boolean value specifying whether to show the save button in bundle list side bar for each bundle
• "openMenu" boolean value specifying whether the open bundle list side bar button (== ☰) is displayed
• "showHierarchy" boolean value specifying whether to show the “show hierarchy” button
• "editEMUwebAppConfig" boolean value specifying whether to show the “edit EMUwebAppConfig” button
• "demoDBs" array of strings specifying which demoDBs to display in the open demo drop-down menu. Currently available demo databases are ["ae", "ema", "epgdorsal"]
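
To make the nesting of the fields listed above more tangible, the following is a minimal sketch (not a schema-validated configuration) that builds a small DBconfig as a nested R list and serializes it to JSON. It assumes the jsonlite package is installed. The database name, the UUID, the level names "Word" and "Phonetic", the label group "vowels" and the perspective name "default" are placeholders invented for this illustration, and only a small subset of the optional fields is shown. In practice a DBconfig would normally be created and modified with emuR functions such as create_emuDB() and add_levelDefinition() rather than written by hand.

```r
library(jsonlite)

# Hypothetical minimal configuration; every name and the UUID are placeholders.
dbConfig <- list(
  name = "myDB",
  UUID = "00000000-0000-0000-0000-000000000000",
  mediafileExtension = "wav",
  ssffTrackDefinitions = list(),  # no SSFF tracks defined in this sketch
  levelDefinitions = list(
    list(
      name = "Word",
      type = "ITEM",  # timeless level
      attributeDefinitions = list(list(name = "Word", type = "STRING"))
    ),
    list(
      name = "Phonetic",
      type = "SEGMENT",  # level with time information
      attributeDefinitions = list(
        list(
          name = "Phonetic",
          type = "STRING",
          # label group local to this attribute definition
          labelGroups = list(list(name = "vowels", values = c("a", "i", "u")))
        )
      )
    )
  ),
  linkDefinitions = list(
    list(type = "ONE_TO_MANY", superlevelName = "Word", sublevelName = "Phonetic")
  ),
  EMUwebAppConfig = list(
    perspectives = list(
      list(
        name = "default",
        signalCanvases = list(
          order = c("OSCI", "SPEC"),  # always available track names
          assign = list(),
          contourLims = list()
        ),
        # unnamed list so that a single element is still written as an array
        levelCanvases = list(order = list("Phonetic")),
        twoDimCanvases = list(order = list())
      )
    ),
    activeButtons = list(saveBundle = TRUE, showHierarchy = TRUE)
  )
)

# auto_unbox = TRUE writes length-one vectors as JSON scalars
json <- toJSON(dbConfig, auto_unbox = TRUE, pretty = TRUE)
cat(json)

# Read the JSON back as a nested list to inspect individual fields
roundTrip <- fromJSON(json, simplifyVector = FALSE)
roundTrip$levelDefinitions[[2]]$type
```

Building the configuration as a nested list keeps JSON arrays (unnamed lists) and JSON objects (named lists) visually distinct, which mirrors the array / field distinction used in the descriptions above.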

## _annot.json

The _annot.json files contain the actual annotation information as well as the hierarchical linking information. Legacy EMU users should note that all the information that used to be split into several ESPS/waves+ label files as well as a .hlb file is now contained in this single file.

The _annot.json file contains the following fields:

• "name" specifies the name of the annotation file (has to be equal to the bundle folder prefix as well as the _annot.json prefix)
• "annotates" specifies the (relative) media file path that this _annot.json file annotates
• "sampleRate" specifies the sample rate of the annotation (should be the same as the sample rate of the file listed in "annotates")
• "levels" contains an array of level annotation information. Each element consists of:
• "name" specifying the name of the level
• "items" array containing the annotational units (i.e. items) of the level
• "id" unique ID of item (only unique within _annot.json file / bundle not globally for the emuDB)
• "sampleStart" contains start sample value of “SEGMENT” item.
• "sampleDur" contains sample duration value of “SEGMENT” item. Note that the EMU-webApp supports neither overlapping “SEGMENT”s nor “SEGMENT” sequences containing gaps. This implies that each sample is explicitly and unambiguously associated with a single “SEGMENT”, which means that the "sampleStart" value of a following “SEGMENT” has to be "sampleStart" + "sampleDur" + 1 of the previous “SEGMENT”.
• "samplePoint" contains sample point value of “EVENT” items
• "labels" array containing labels that belong to this item. Each element consists of:
• "name" specifying the "attributeDefinition" that this label is for
• "value" specifying the label value
• "links" array containing links between two items. These links have to adhere to the links specified in "linkDefinitions" of the corresponding emuDB. Each link consists of:
• "fromID" ID value of item to link from (i.e. item in super-level)
• "toID" ID value of item to link to (i.e. item in sub-level)
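
As an illustration of the fields listed above, the following sketch parses a small hand-written annotation and checks two of the stated constraints: that consecutive “SEGMENT” items are contiguous (the next "sampleStart" equals the previous "sampleStart" + "sampleDur" + 1) and that every link refers to item IDs that exist in the bundle. It assumes the jsonlite package is installed. The bundle name rec1, the level names, the labels and the helper function segmentsAreContiguous() are hypothetical and only restricted to the fields described in this section; real _annot.json files may contain additional fields.

```r
library(jsonlite)

# Hypothetical annotation of a bundle called "rec1"
annotJson <- '{
  "name": "rec1",
  "annotates": "rec1.wav",
  "sampleRate": 16000,
  "levels": [
    {"name": "Word", "items": [
      {"id": 1, "labels": [{"name": "Word", "value": "hi"}]}
    ]},
    {"name": "Phonetic", "items": [
      {"id": 2, "sampleStart": 100, "sampleDur": 799,
       "labels": [{"name": "Phonetic", "value": "h"}]},
      {"id": 3, "sampleStart": 900, "sampleDur": 1099,
       "labels": [{"name": "Phonetic", "value": "aI"}]}
    ]}
  ],
  "links": [
    {"fromID": 1, "toID": 2},
    {"fromID": 1, "toID": 3}
  ]
}'

annot <- fromJSON(annotJson, simplifyVector = FALSE)

# Hypothetical helper: TRUE if each SEGMENT starts exactly one sample
# after the last sample of the preceding SEGMENT
segmentsAreContiguous <- function(items) {
  n <- length(items)
  if (n < 2) return(TRUE)
  starts <- vapply(items, function(it) as.numeric(it$sampleStart), numeric(1))
  durs <- vapply(items, function(it) as.numeric(it$sampleDur), numeric(1))
  all(starts[-1] == starts[-n] + durs[-n] + 1)
}

# Pick the level named "Phonetic" and check its items
phonetic <- Filter(function(l) l$name == "Phonetic", annot$levels)[[1]]
contiguous <- segmentsAreContiguous(phonetic$items)

# Collect all item IDs of the bundle and check that every link resolves
allIds <- unlist(lapply(annot$levels, function(l) {
  vapply(l$items, function(it) as.numeric(it$id), numeric(1))
}))
linkIds <- unlist(lapply(annot$links, function(lk) c(lk$fromID, lk$toID)))
linksResolve <- all(linkIds %in% allIds)

print(c(contiguous = contiguous, linksResolve = linksResolve))
```

Here the first segment covers samples 100 to 899 (100 + 799), so the second segment has to start at sample 900 for the sequence to be free of gaps and overlaps.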