Differences
This shows you the differences between two versions of the page.
download [2013/03/13 10:57] – eros | download [2021/09/13 10:20] (current) – [Use the corpus directly (no download necessary)] eros | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Corpora availability ====== | + | ===== Use the corpus directly (no download necessary) |
- | ===== Free web interfaces | + | * The wacky corpora are available on our **official corpus repository** here: http:// |
+ | |||
+ | Other free web interfaces: | ||
* the Jožef Stefan Institute hosts a web interface where many of our corpora can be used directly for free: http:// | * the Jožef Stefan Institute hosts a web interface where many of our corpora can be used directly for free: http:// | ||
- | * the University of Lancaster hosts ItWaC and a sample of UkWaC (registration is required but the service is free): http:// | + | * the University of Lancaster hosts (among other corpora) |
+ | * the Charles University in Prague also hosts DeWaC, FrWaC, ItWaC and UkWaC (here again registration is required but the service is free): http:// | ||
===== Download ===== | ===== Download ===== | ||
- | | + | **NB**: when you download the corpora, you need to use your own tools to consult them. If you don't know what this means, then you probably don't want to download them and should use an online tool instead (see the section "Free Web Interfaces" |
+ | |||
+ | | ||
* the semantically and syntactically annotated Italian Wikipedia is available for direct download from here: | * the semantically and syntactically annotated Italian Wikipedia is available for direct download from here: | ||
Line 20: | Line 25: | ||
* [[Frequency lists]] | * [[Frequency lists]] | ||
* [[Keyword lists: ukWaC vs. the BNC]] | * [[Keyword lists: ukWaC vs. the BNC]] | ||
- | |||
- | ===== Corpus building tools ===== | ||
- | |||
- | * [[http:// | ||
- | * {{: | ||
- | * **NB**: if you're looking for the **PotaModule** (a Perl module that is intended to perform " | ||
- |