

  • M. Baroni, S. Bernardini, A. Ferraresi and E. Zanchetta (to appear) The WaCky Wide Web: A Collection of Very Large Linguistically Processed Web-Crawled Corpora (PDF)
  • A. Ferraresi, E. Zanchetta, M. Baroni and S. Bernardini (2008) Introducing and evaluating ukWaC, a very large web-derived corpus of English, in S. Evert, A. Kilgarriff and S. Sharoff (eds.) Proceedings of the 4th Web as Corpus Workshop (WAC-4) – Can we beat Google?, Marrakech, 1 June 2008 (PDF) (Webpage)
  • A. Ferraresi (2007) Building a very large corpus of English obtained by Web crawling: ukWaC, Master's thesis, University of Bologna (PDF)
  • M. Baroni and S. Bernardini (eds.) (2006) Wacky! Working papers on the Web as Corpus, Bologna: GEDIT (Webpage)
  • Last modified: 2008/11/11 14:13
  • by eros