The parsed corpus of Southern Dutch Dialects (GCND) is a linguistically annotated corpus based on existing dialect recordings from the 1960s and 1970s (Voices from the past), supplemented with existing recordings from the Meertens Institute and a number of new recordings. The corpus provides audio-aligned transcriptions in two layers, one closer to the dialect and one closer to Standard Dutch; both are part-of-speech tagged and syntactically annotated. The corpus is meant to facilitate large-scale research into the syntactic particularities of the southern Dutch dialects.
The GCND is a medium-size infrastructure project from the Research Foundation – Flanders (FWO).