- in experimental corpora - 211 annotated occurrences of 30 nouns for which CSK was acquired
- on experimental corpora - 131 annotated occurrences of 30 nouns for which CSK was acquired
Preliminary Frequency Data for 30 nouns
(NOTE: This is old data from the preliminary work presented at the 2009 UMSLLS Workshop at NAACL. The data has since been greatly expanded.)
The preliminary CSK database (as presented at UMSLLS 2009) is available to researchers upon request as a MySQL dump. It contains information such as the links between the search phrases, web queries, web results, and syntactic parses used to produce the frequencies given above.
The newest, greatly expanded CSKB will be made available soon. It is currently part of a work in preparation. This database will contain information about 60 nouns, including everything listed for the preliminary version, the results of the revised information-theoretic concept analysis, and a Perl interface for accessing the data more easily.