Software for n-gram retrieval and analysis. Using the Risk Ratio or other collocability metrics, the software
analyzes a corpus and determines which n-grams are characteristic of a given word or lemma.
Engrammer is a single-purpose tool allowing for the extraction of n-grams containing
a specific word or lemma, the rest of the slots being open. It was developed with
the following questions in mind:
- what lexical patterns is a given word involved in; i.e. which n-grams are disproportionately
collocated with a given word form/lemma?
- what are the contexts of these n-grams?
- what other words/lemmata collocate with these specific n-grams and what are
The tool runs as a standalone desktop application with no software dependencies,
so installation is straightforward. Corpora do not need to be uploaded via the Internet;
all data are stored locally, which suits users working with custom corpora under copyright.
We have aimed to make the tool easy to operate, focussing on a user-friendly
graphical interface and ensuring a quick real-time response even for medium-sized
corpora (e.g. the BNC). Its collocation metrics are effect size measures rather than
null hypothesis significance tests.
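
To make the idea of an effect-size collocation metric concrete, here is a minimal Python sketch of a risk ratio for n-grams with one fixed slot. It is an illustration only and does not reproduce Engrammer's actual implementation or its exact formula. The function names `ngrams` and `risk_ratios`, the `_` placeholder for the masked slot, and the definition used (relative frequency of a context frame when the slot holds the target word, divided by its relative frequency when the slot holds any other word) are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def risk_ratios(tokens, target, n=2, slot=0):
    """Risk ratio of each context frame for `target` at position `slot`.

    A frame is an n-gram with the slot position masked by "_".
    Assumed definition (not taken from Engrammer):
        RR = P(frame | slot holds target) / P(frame | slot holds another word)
    """
    with_target = Counter()
    without_target = Counter()
    for gram in ngrams(tokens, n):
        frame = gram[:slot] + ("_",) + gram[slot + 1:]
        if gram[slot] == target:
            with_target[frame] += 1
        else:
            without_target[frame] += 1

    n_with = sum(with_target.values())
    n_without = sum(without_target.values())

    result = {}
    for frame, a in with_target.items():
        c = without_target[frame]
        if c == 0:
            # Frame never occurs without the target: ratio is unbounded.
            result[frame] = float("inf")
        else:
            result[frame] = (a / n_with) / (c / n_without)
    return result
```

For the toy token sequence `a b c b a b c d` with target `a` in the first slot of a bigram, the frame `("_", "b")` accounts for 2 of 2 bigrams starting with `a` and 1 of 5 bigrams starting with another word, so `risk_ratios(...)` gives it a ratio of 5.0. Unlike a p-value, this number describes the strength of the association and does not grow merely because the corpus is larger.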
Crucially, Engrammer is designed to accommodate a range of typologically different
The application provides a graphical user interface (documentation (not yet)
The application is freeware (licence
Download: Engrammer (Win 64bit) 6 MB.