StringNet is an English lexico-grammatical knowledge base consisting of multiword patterns of word behavior. These are represented by what we call hybrid n-grams and their relations to each other. Currently, StringNet contains about two billion hybrid n-grams extracted from the British National Corpus (BNC), each hybrid n-gram linked to all of its tokens attested in the BNC. The design and motivation of (an earlier version of) StringNet are described in Wible and Tsao (2010).

What are Hybrid n-grams?
The multiword patterns that we call hybrid n-grams are sequences of grams, each of which may be (1) a specific word form (e.g., ‘trying’ but not ‘tried’ or ‘tries’ or ‘try’); (2) a lexeme (e.g., try, including its various forms: trying, tried, etc.); or (3) a part of speech (POS), marked off in brackets. A POS may be specific, such as [V-ing], or more general, such as [verb], which subsumes more specific POSs such as [V-ing]. An example hybrid n-gram is:
Important: Click on this hybrid n-gram anywhere and wait a bit. A pop-up shows all the forms attested in that slot. Try it above.

What about the latest version, StringNet 4.0?
Like all previous versions, StringNet 4.0 takes an English word (or words) as a query and responds with a ranked list of multiword and lexico-grammatical patterns in which that word is conventionally used (or in which those words conventionally co-occur), along with concordances for each pattern. As a ‘net’, StringNet 4.0 still links each pattern to its related patterns: to its more abstract counterparts (its parents) and to its more specific counterparts (its children). So it links ‘consider yourself lucky’ to its parents ‘consider [pron reflx] lucky’, ‘[verb] yourself lucky’ and ‘consider yourself [adj]’, for example. Click on the following links beside any pattern to find these related patterns: parents, children, expand, contract. Also, clicking on any word or slot in any pattern displays its paradigm, a list of the substitutable words attested for that slot in that exact context.
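As a rough illustration of the parent links described above, the sketch below generates the three parents of ‘consider yourself lucky’ by abstracting one gram at a time to a POS slot. It is only a toy model: the `POS_OF` lexicon and the `parents` and `show` helpers are hypothetical names invented for this example, not part of StringNet, which derives its patterns from the BNC.

```python
# Toy lexicon mapping a word to its POS slot.
# Hypothetical and hand-written for this example only.
POS_OF = {
    "consider": "[verb]",
    "yourself": "[pron reflx]",
    "lucky": "[adj]",
}


def parents(pattern):
    """Return the patterns that abstract exactly one gram of `pattern`."""
    result = []
    for i, gram in enumerate(pattern):
        slot = POS_OF.get(gram)
        if slot is not None:  # grams with no lexicon entry are left as they are
            result.append(pattern[:i] + (slot,) + pattern[i + 1:])
    return result


def show(pattern):
    """Render a pattern (a tuple of grams) as a display string."""
    return " ".join(pattern)


pattern = ("consider", "yourself", "lucky")
parent_strings = [show(p) for p in parents(pattern)]
for s in parent_strings:
    print(s)
```

Each pattern is a tuple of grams, so a slot such as `[pron reflx]` stays a single gram even though it contains a space.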
The site at http://nav.stringnet.org is the user interface for querying and navigating StringNet. It takes queries of one or more words submitted to its query box and provides a list of patterns in which the query word is conventionally used (or, in the case of multi-word queries, patterns in which the query words conventionally co-occur). For example, a query of ‘take’ yields ‘take place [prep]’, ‘take part in’, ‘take advantage of’, and many others. Each hybrid n-gram listed in the search results is accompanied by a variety of related links and information, and that is what makes StringNet a net.

The Navigable Links among Patterns that Make StringNet a Net (New)
The two figures below illustrate the links available between and among patterns that show up in the search results.
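Each of the two billion patterns in StringNet is indexed (linked) to other related patterns by four basic types of relations.
1. Parents: more abstract versions of itself
2. Children: more specific versions of itself
3. Contracted: shorter versions of itself
4. Expanded: longer versions of itself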
For example, for a query of the word ‘step’, the first pattern listed in the results is “step by step” and the second is “take the unprecedented step of [v-ing]”.
Here are examples of patterns that are related in these four ways to the hybrid n-gram “take the unprecedented step of [v-ing]”:
|Some parents of it:||“[verb] the unprecedented step of [v-ing]”, “take the [adj] step of [v-ing]”|
|A child of it:||“took the unprecedented step of [v-ing]”|
|Contracted (shorter version):||“take the step of [v-ing]”|
|Expanded (longer version):||“[noun] take the unprecedented step of [v-ing]”|
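The four relations in the table can be sketched as simple checks on sequences of grams. The code below is a minimal sketch under stated assumptions, not StringNet's actual indexing: the `ABSTRACTS_TO` table and all function names are hypothetical, a parent is assumed to differ from its child in exactly one gram, and a contracted pattern is assumed to drop exactly one gram.

```python
# Hypothetical abstraction table: each gram maps to the gram directly above it.
ABSTRACTS_TO = {
    "took": "take",            # word form -> lexeme
    "take": "[verb]",          # lexeme -> POS slot
    "unprecedented": "[adj]",  # word -> POS slot
}


def is_above(general, specific):
    """True if `general` is reachable from `specific` by abstraction steps."""
    current = ABSTRACTS_TO.get(specific)
    while current is not None:
        if current == general:
            return True
        current = ABSTRACTS_TO.get(current)
    return False


def is_parent(candidate, pattern):
    """Same length, with exactly one gram made more abstract (an assumption)."""
    if len(candidate) != len(pattern):
        return False
    diffs = [(c, p) for c, p in zip(candidate, pattern) if c != p]
    return len(diffs) == 1 and is_above(diffs[0][0], diffs[0][1])


def is_contraction(candidate, pattern):
    """`candidate` equals `pattern` with exactly one gram removed."""
    if len(candidate) != len(pattern) - 1:
        return False
    return any(pattern[:i] + pattern[i + 1:] == candidate
               for i in range(len(pattern)))


def relation(candidate, pattern):
    """Name how `candidate` relates to `pattern` (both space-separated strings)."""
    c, p = tuple(candidate.split()), tuple(pattern.split())
    if is_parent(c, p):
        return "parent"
    if is_parent(p, c):
        return "child"
    if is_contraction(c, p):
        return "contracted"
    if is_contraction(p, c):
        return "expanded"
    return "unrelated"


base = "take the unprecedented step of [v-ing]"
for other in [
    "[verb] the unprecedented step of [v-ing]",
    "take the [adj] step of [v-ing]",
    "took the unprecedented step of [v-ing]",
    "take the step of [v-ing]",
    "[noun] take the unprecedented step of [v-ing]",
]:
    print(relation(other, base), "<-", other)
```

Splitting on spaces works here only because none of the slots in these examples contains a space; a slot such as `[pron reflx]` would need a proper tokenizer.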
David Wible is Distinguished Professor of Learning and Instruction at National Central University (NCU) in Taiwan and Dean of the College of Liberal Arts.
Nai-Lung Tsao is a research assistant at the Graduate Institute of Learning and Instruction at National Central University in Taiwan.
We have developed StringNet as one of a suite of forthcoming tools for various aspects of second language vocabulary learning, teaching, and materials development. Our ongoing research and development has been supported by grants from Taiwan's National Science Council.