Corpus of Contemporary American English - Queries


  • The interface is the same as the BYU-BNC interface for the 100 million word British National Corpus, the 100 million word TIME Magazine corpus, and the 400 million word Corpus of *Historical* American English (COHA), 1810s–2000s (see links below)
  • Queries by word, phrase, alternates, substring, part of speech, lemma, synonyms (see below), and customized lists (see below)
  • The corpus is tagged by CLAWS, the same tagger that was used for the BNC and the TIME corpus
  • Chart listings (totals for all matching forms in each genre or year, 1990–present, as well as for sub-genres) and table listings (frequency for each matching form in each genre or year)
  • Full collocates searching (up to ten words left and right of node word)
  • Re-sortable concordances, showing the most common words/strings to the left and right of the searched word
  • Comparisons between genres or time periods (e.g. collocates of 'chair' in fiction or academic, nouns with 'break the ' in newspapers or academic, adjectives that occur primarily in sports magazines, or verbs that are more common 2005–2010 than previously)
  • One-step comparisons of collocates of related words, to study semantic or cultural differences between words (e.g. comparison of collocates of 'small' and 'little', or 'Democrats' and 'Republicans', or 'men' and 'women', or 'rob' vs 'steal')
  • Users can include semantic information from a 60,000 entry thesaurus directly as part of the query syntax (e.g. frequency and distribution of synonyms of 'beautiful', synonyms of 'strong' occurring in fiction but not academic, synonyms of 'clean' + noun ('clean the floor', 'washed the dishes'))
  • Users can also create their own 'customized' word lists, and then re-use these as part of subsequent queries (e.g. lists related to a particular semantic category (clothes, foods, emotions), or a user-defined part of speech)
  • Note that the corpus is only available through the web interface, due to copyright restrictions.
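
The collocate features listed above (words within a fixed span left and right of a node word, and one-step comparison of the collocates of two related words) can be illustrated with a small sketch. This is not the corpus's own implementation, which is reachable only through the web interface; the function name `collocates`, the `span` parameter, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def collocates(tokens, node, span=4):
    """Count words found within `span` tokens left or right of each `node` hit.

    Hypothetical helper for illustration; the corpus interface itself
    allows up to ten words on each side of the node word.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - span)
        hi = min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts


# Toy text invented for this example, not drawn from the corpus.
text = "the small house had a small garden and the little boy fed a little dog"
tokens = text.split()

small = collocates(tokens, "small", span=2)
little = collocates(tokens, "little", span=2)

# Collocates that appear with one word but not the other.
only_small = sorted(set(small) - set(little))
only_little = sorted(set(little) - set(small))
```

On real data the comparison would normally rank collocates by relative frequency rather than by simple set difference; the set difference is used here only to keep the sketch short.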
