Differences
This shows you the differences between two versions of the page.
start [2009/01/14 15:52] – eros | start [2022/12/05 11:57] (current) – eros | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== WaCky ====== | + | ====== WaCky - The Web-As-Corpus Kool Yinitiative |
- | Welcome to WaCky! | + | Welcome to WaCky! |
We are a community of linguists and information technology specialists who got together to develop a set of tools (and interfaces to existing tools) that will allow linguists to crawl a section of the web, process the data, and index and search it. | We are a community of linguists and information technology specialists who got together to develop a set of tools (and interfaces to existing tools) that will allow linguists to crawl a section of the web, process the data, and index and search it. | ||
Line 7: | Line 7: | ||
We try to keep everything very laid-back and flexible (minimal constraint on data representation, | We try to keep everything very laid-back and flexible (minimal constraint on data representation, | ||
- | We built a few [[corpora]] you can [[download]]; in the near future we'll have a web interface for direct online use of the corpora. While we wait for that (and the documentation), we have described in great detail the procedure we followed to create our corpora in the paper {{papers:wacky_2008.pdf|" | + | We built a few [[corpora]] you can [[download|download or use directly]]; we described in great detail the procedure we followed to create our first corpora |
- | The project | + | M. Baroni, S. Bernardini, A. Ferraresi and E. Zanchetta. 2009. The WaCky Wide Web: A Collection of Very Large Linguistically Processed Web-Crawled Corpora. //Language Resources and Evaluation// |
- | The old version | + | There, we also present a qualitative evaluation |
+ | |||
+ | The project (including this website) is currently being sponsored by the [[LiMiNe]] project. | ||
[[staff_only: | [[staff_only: | ||
+ |