This is the first book of its kind to provide a practical and student-friendly guide to corpus linguistics that explains the nature of electronic data and how it can be collected and analyzed.
- Designed to equip readers with the technical skills necessary to analyze and interpret language data, both written and (orthographically) transcribed
- Introduces a number of easy-to-use, yet powerful, free analysis resources consisting of standalone programs and web interfaces for use with Windows, Mac OS X, and Linux
- Each section includes practical exercises, a list of sources and further reading, and illustrated step-by-step introductions to analysis tools
- Requires only a basic knowledge of computer concepts to develop the specific linguistic analysis skills needed for understanding and analyzing corpus data
List of Figures
List of Tables
Acknowledgements
1 Introduction
1.1 Linguistic Data Analysis
1.1.1 What’s data?
1.1.2 Forms of data
1.1.3 Collecting and analysing data
1.2 Outline of the Book
1.3 Conventions Used in this Book
1.4 A Note for Teachers
1.5 Online Resources
2 What’s Out There?
2.1 What’s a Corpus?
2.2 Corpus Formats
2.3 Synchronic vs. Diachronic Corpora
2.3.1 ‘Early’ synchronic corpora
2.3.2 Mixed corpora
2.3.3 Examples of diachronic corpora
2.4 General vs. Specific Corpora
2.4.1 Examples of specific corpora