EBookClubs

Read Books & Download eBooks Full Online

EBookClubs

Read Books & Download eBooks Full Online

Book Building a National Corpus

Download or read book Building a National Corpus written by Dawn Knight and published by Springer Nature. This book was released on 2021-10-08 with total page 192 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book aims to provide a micro-level, working model of a methodological approach and practical guidelines for building a corpus, informed by the work on the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes - the National Corpus of Contemporary Welsh). It focuses specifically on the development of detailed design frames for corpora across communicative modes (spoken, written and e-language), and the practical processes involved in the planning, collection, transcription, collation and (re)presentation of language data. The book is designed to be of significant value and relevance to those interested in critically engaging with corpus methodology. Although Welsh is the language under discussion, the processes and approaches discussed in the building of CorCenCC can be applied to a lesser or greater extent to other language contexts. This book provides a working model, and an account of how to build a corpus dataset from which step by step guidelines for creating other linguistic corpora in any language can be easily extrapolated. It will be of value to students and scholars of minority languages and corpus linguistics.

Book Developing Linguistic Corpora

Download or read book Developing Linguistic Corpora written by Martin Wynne and published by Oxbow Books Limited. This book was released on 2005 with total page 100 pages. Available in PDF, EPUB and Kindle. Book excerpt: A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

Book Overcoming Challenges in Corpus Construction

Download or read book Overcoming Challenges in Corpus Construction written by Robbie Love and published by Routledge. This book was released on 2020-01-06 with total page 176 pages. Available in PDF, EPUB and Kindle. Book excerpt: This volume offers a critical examination of the construction of the Spoken British National Corpus 2014 (Spoken BNC2014) and points the way forward toward a more informed understanding of corpus linguistic methodology more broadly. The book begins by situating the creation of this second corpus, a compilation of new, publicly-accessible Spoken British English from the 2010s, within the context of the first, created in 1994, talking through the need to balance backward capability and optimal practice for today’s users. Chapters subsequently use the Spoken BNC2014 as a focal point around which to discuss the various considerations taken into account in corpus construction, including design, data collection, transcription, and annotation. The volume concludes by reflecting on the successes and limitations of the project, as well as the broader utility of the corpus in linguistic research, both in current examples and future possibilities. This exciting new contribution to the literature on linguistic methodology is a valuable resource for students and researchers in corpus linguistics, applied linguistics, and English language teaching.

Book English Corpus Linguistics

Download or read book English Corpus Linguistics written by Charles F. Meyer and published by Cambridge University Press. This book was released on 2002-06-13 with total page 188 pages. Available in PDF, EPUB and Kindle. Book excerpt: English Corpus Linguistics is a step-by-step guide to creating and analyzing linguistic corpora. It begins with a discussion of the role that corpus linguistics plays in linguistic theory, demonstrating that corpora have proven to be very useful resources for linguists who believe that their theories and descriptions of English should be based on real rather than contrived data. Charles F. Meyer goes on to describe how to plan the creation of a corpus, how to collect and computerize data for inclusion in a corpus, how to annotate the data that are collected, and how to conduct a corpus analysis of a completed corpus. The book concludes with an overview of the challenges that corpus linguists face to make both the creation and analysis of corpora much easier undertakings than they currently are. Clearly organized and accessibly written, this book will appeal to students of linguistics and English language.

Book Essential Speech and Language Technology for Dutch

Download or read book Essential Speech and Language Technology for Dutch written by Peter Spyns and published by Springer Science & Business Media. This book was released on 2013-02-26 with total page 414 pages. Available in PDF, EPUB and Kindle. Book excerpt: The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered, but, even more importantly, a description of the resources (data and tools) for Dutch that have been created are now available for both academia and industry worldwide. The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summaries) generation, web mining, information extraction, and text to speech to name the most important ones. The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions of almost all the HLT research groups in Flanders and the Netherlands, hence offers a view of their recent research activities. Targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch. It concerns researchers active in larger networks such as the CLARIN, META-NET, FLaReNet and participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICling, LREC, CLIN and DIR ( both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters are interesting for human language technology policy makers and even for science policy makers in general.

Book The Routledge Handbook of Corpus Linguistics

Download or read book The Routledge Handbook of Corpus Linguistics written by Anne O'Keeffe and published by Routledge. This book was released on 2010-04-05 with total page 1263 pages. Available in PDF, EPUB and Kindle. Book excerpt: The Routledge Handbook of Corpus Linguistics provides a timely overview of a dynamic and rapidly growing area with a widely applied methodology. Through the electronic analysis of large bodies of text, corpus linguistics demonstrates and supports linguistic statements and assumptions. In recent years it has seen an ever-widening application in a variety of fields: computational linguistics, discourse analysis, forensic linguistics, pragmatics and translation studies. Bringing together experts in the key areas of development and change, the handbook is structured around six themes which take the reader through building and designing a corpus to using a corpus to study literature and translation. A comprehensive introduction covers the historical development of the field and its growing influence and application in other areas. Structured around five headings for ease of reference, each contribution includes further reading sections with three to five key texts highlighted and annotated to facilitate further exploration of the topics. The Routledge Handbook of Corpus Linguistics is the ideal resource for advanced undergraduates and postgraduates.

Book Statistics in Corpus Linguistics

Download or read book Statistics in Corpus Linguistics written by Vaclav Brezina and published by Cambridge University Press. This book was released on 2018-09-20 with total page 317 pages. Available in PDF, EPUB and Kindle. Book excerpt: A comprehensive and accessible introduction to statistics in corpus linguistics, covering multiple techniques of quantitative language analysis and data visualisation.

Book Using Corpora in Discourse Analysis

Download or read book Using Corpora in Discourse Analysis written by Paul Baker and published by Bloomsbury Publishing. This book was released on 2023-08-24 with total page 218 pages. Available in PDF, EPUB and Kindle. Book excerpt: How can you carry out discourse analysis using corpus linguistics? What research questions should I ask? Which methods should you use and when? What is a collocational network or a key cluster? Introducing the major techniques, methods and tools for corpus-assisted analysis of discourse, this book answers these questions and more, showing readers how to best use corpora in their analyses of discourse. Using carefully tailored case studies, each chapter is devoted to a central technique, including frequency, concordancing and keywords, going step by step through the process of applying different analytical procedures. Introducing a wide range of different corpora, from holiday brochures to political debates, the book considers the key debates and latest advances in the field. Fully revised and updated, this new edition includes: - A new chapter on how to conduct research projects in corpus-based discourse analysis - Completely rewritten chapters on collocation and advanced techniques, using a corpus of jihadist propaganda texts and covering topics such as social media and visual analysis - Coverage of major tools, including CQPweb, AntConc, Sketch Engine and #LancsBox - Discussion of newer techniques including the derivation of lockwords and the comparison of multiple data sets for diachronic analysis With exercises, discussion questions and suggested further readings in each chapter, this book is an excellent guide to using corpus linguistics techniques to carry out discourse analysis.

Book Corpus Design and Construction in Minoritised Language Contexts   Cynllunio a Chreu Corpws mewn Cyd destunau Ieithoedd Lleiafrifoledig

Download or read book Corpus Design and Construction in Minoritised Language Contexts Cynllunio a Chreu Corpws mewn Cyd destunau Ieithoedd Lleiafrifoledig written by Dawn Knight and published by Springer Nature. This book was released on 2021-07-05 with total page 178 pages. Available in PDF, EPUB and Kindle. Book excerpt: This bilingual book provides a detailed overview of the project to construct a National Corpus of Contemporary Welsh (CorCenCC), addressing the conceptual and methodological challenges faced when developing language corpora for minoritised languages. A conceptual framework is presented for the user-driven design that underpinned the CorCenCC project, along with a detailed blueprint that can function as a scaffold for other researchers embarking on projects of this nature. This book will be of value to those working in language teaching, learning and assessment, language policy and planning, translation, corpus linguistics and language technology, and to anyone with an interest in Welsh and other minoritised languages. Mae'r llyfr dwyieithog hwn yn rhoi trosolwg manwl o'r prosiect i greu Corpws Cenedlaethol Cymraeg Cyfoes (CorCenCC), ac yn mynd i'r afael â'r heriau cysyniadol a methodolegol a wynebir wrth ddatblygu corpora iaith ar gyfer ieithoedd lleiafrifoledig. Cyflwynir fframwaith cysyniadol ar gyfer y cynllun wedi'i yrru gan ddefnyddwyr sy'n greiddiol i brosiect CorCenCC, ynghyd â glasbrint manwl a all weithredu fel sgaffald i ymchwilwyr eraill sy'n dechrau ar brosiectau o'r fath. Bydd y llyfr hwn o werth i'r rhai sy'n gweithio ym meysydd addysgu, dysgu ac asesu ieithoedd, polisi iaith a chynllunio ieithyddol, cyfieithu, ieithyddiaeth gorpws a thechnoleg iaith, ac unrhyw un â diddordeb yn y Gymraeg ac ieithoedd lleiafrifoledig eraill.

Book Understanding Corpus Linguistics

Download or read book Understanding Corpus Linguistics written by Danielle Barth and published by Routledge. This book was released on 2021-11-18 with total page 276 pages. Available in PDF, EPUB and Kindle. Book excerpt: This textbook introduces the fundamental concepts and methods of corpus linguistics for students approaching this topic for the first time, putting specific emphasis on the enormous linguistic diversity represented by approximately 7,000 human languages and broadening the scope of current concerns in general corpus linguistics. Including a basic toolkit to help the reader investigate language in different usage contexts, this book: Shows the relevance of corpora to a range of linguistic areas from phonology to sociolinguistics and discourse Covers recent developments in the application of corpus linguistics to the study of understudied languages and linguistic typology Features exercises, short problems, and questions Includes examples from real studies in over 15 languages plus multilingual corpora Providing the necessary corpus linguistics skills to critically evaluate and replicate studies, this book is essential reading for anyone studying corpus linguistics.

Book The BNC Handbook

Download or read book The BNC Handbook written by Guy Aston and published by . This book was released on 1998 with total page 284 pages. Available in PDF, EPUB and Kindle. Book excerpt: The authors explain how to use large language corpora in explanatory learning and English languages teaching and research. They focus on the largest corpus of spoken and written data compiled (the BNC) and on the search tool SARA.

Book Natural Language Annotation for Machine Learning

Download or read book Natural Language Annotation for Machine Learning written by James Pustejovsky and published by "O'Reilly Media, Inc.". This book was released on 2013 with total page 344 pages. Available in PDF, EPUB and Kindle. Book excerpt: Includes bibliographical references (p. 305-315) and index.

Book Glossary of Corpus Linguistics

Download or read book Glossary of Corpus Linguistics written by Paul Baker and published by Edinburgh University Press. This book was released on 2006-05-19 with total page 192 pages. Available in PDF, EPUB and Kindle. Book excerpt: This alphabetic guide provides definitions and discussion of key terms used in corpus linguistics. Corpus data is being used in a growing number of English and Linguistics departments which have no record of past research with corpus data. This is the first comprehensive glossary of the many specialist terms in corpus linguistics and will be useful for corpus linguists and non corpus linguists alike. Clearly written, by a team of experienced academics in the field, the glossary provides full coverage of both traditional and contemporary terminology.

Book History  Features  and Typology of Language Corpora

Download or read book History Features and Typology of Language Corpora written by Niladri Sekhar Dash and published by Springer. This book was released on 2018-02-01 with total page 293 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book discusses key issues of corpus linguistics like the definition of the corpus, primary features of a corpus, and utilization and limitations of corpora. It presents a unique classification scheme of language corpora to show how they can be studied from the perspective of genre, nature, text type, purpose, and application. A reference to parallel translation corpus is mandatory in the discussion of corpus generation, which the authors thoroughly address here, with a focus on Indian language corpora and English. Web-text corpus, a new development in corpus linguistics, is also discussed with elaborate reference to Indian web text corpora. The book also presents a short history of corpus generation and provides scenarios before and after the advent of computer-generated digital corpora. This book has several important features: it discusses many technical issues of the field in a lucid manner; contains extensive new diagrams and charts for easy comprehension; and presents discussions in simplified English to cater to the needs of non-native English readers. This is an important resource authored by academics who have many years of experience teaching and researching corpus linguistics. Its focus on Indian languages and on English corpora makes it applicable to students of graduate and postgraduate courses in applied linguistics, computational linguistics and language processing in South Asia and across countries where English is spoken as a first or second language.

Book Corpus Linguistics and the Web

Download or read book Corpus Linguistics and the Web written by and published by BRILL. This book was released on 2015-07-14 with total page 311 pages. Available in PDF, EPUB and Kindle. Book excerpt: Using the Web as Corpus is one of the recent challenges for corpus linguistics. This volume presents a current state-of-the-arts discussion of the topic. The articles address practical problems such as suitable linguistic search tools for accessing the www, the question of register variation, or they probe into methods for culling data from the web. The book also offers a wide range of case studies, covering morphology, syntax, lexis, as well as synchronic and diachronic variation in English. These case studies make use of the two approaches to the www in corpus linguistics – web-as-corpus and web-for-corpus-building. The case studies demonstrate that web data can provide useful additional evidence for a broad range of research questions.

Book Building and Exploring Web Corpora  WAC3   2007

Download or read book Building and Exploring Web Corpora WAC3 2007 written by Cédrick Fairon and published by Presses univ. de Louvain. This book was released on 2007 with total page 186 pages. Available in PDF, EPUB and Kindle. Book excerpt: WAC More and more people are using Web data for linguistic and NLP research. The Web as Corpusworkshop (WAC) provides a venue for exploring how we can use it effectively and the advancementsto which this could lead.This book is a collection of the talks presented at the 3 rd WAC in Louvain-la-Neuve (Belgium).The focus is on the description of Web corpus collection projects, the exploration of Web datacharacteristics from a linguistics/NLP perspective, and on the use of crawled Web data for NLPpurposes. CLEANEVAL Any use of Web data requires that it be cleaned in order to get rid of unwanted material including,for example, HTML markup, navigation bars, advertisements. To date there has been no sharingof resources or expertise in this particular domain and the cleaning has often been done minimally.Cleaneval was an exercise aimed at promoting collaboration and improving our understandingof the issues. Results and perspectives are presented in this book.

Book Corpus Linguistics

Download or read book Corpus Linguistics written by Tony McEnery and published by Cambridge University Press. This book was released on 2011-10-06 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: Corpus linguistics is the study of language data on a large scale - the computer-aided analysis of very extensive collections of transcribed utterances or written texts. This textbook outlines the basic methods of corpus linguistics, explains how the discipline of corpus linguistics developed and surveys the major approaches to the use of corpus data. It uses a broad range of examples to show how corpus data has led to methodological and theoretical innovation in linguistics in general. Clear and detailed explanations lay out the key issues of method and theory in contemporary corpus linguistics. A structured and coherent narrative links the historical development of the field to current topics in 'mainstream' linguistics. Practical tasks and questions for discussion at the end of each chapter encourage students to test their understanding of what they have read and an extensive glossary provides easy access to definitions of technical terms used in the text.