EBookClubs

Read Books & Download eBooks Full Online

Book Using Large Corpora

Download or read book Using Large Corpora written by Armstrong-Warwick Armstrong and published by MIT Press. This book was released on 1994 with total page 364 pages. Available in PDF, EPUB and Kindle. Book excerpt: Using Large Corpora identifies new data-oriented methods for organizing and analyzing large corpora and describes the potential results that the use of large corpora offers. Today, large corpora consisting of hundreds of millions or even billions of words, along with new empirical and statistical methods for organizing and analyzing these data, promise new insights into the use of language. Already, the data extracted from these large corpora reveal that language use is more flexible and complex than most rule-based systems have tried to account for, providing a basis for progress in the performance of Natural Language Processing systems. The research described shows that the new methods may offer solutions to key issues of acquisition (automatically identifying and coding information), coverage (accounting for all of the phenomena in a given domain), robustness (accommodating real data that may be corrupt or not accounted for in the model), and extensibility (applying the model and data to a new domain, text, or problem). There are chapters on lexical issues, issues in syntax, and translation topics, as well as discussions of the statistics-based vs. rule-based debate. ACL-MIT Series in Natural Language Processing.

Book Natural Language Processing Using Very Large Corpora

Download or read book Natural Language Processing Using Very Large Corpora written by S. Armstrong and published by Springer Science & Business Media. This book was released on 2013-04-17 with total page 314 pages. Available in PDF, EPUB and Kindle. Book excerpt: ABOUT THIS BOOK This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It is not meant as an introduction to this field; for readers who need one, several entry-level texts are available, including those of (Church and Mercer, 1993; Charniak, 1993; Jelinek, 1997). This book captures the essence of a series of highly successful workshops held in the last few years. The response in 1993 to the initial Workshop on Very Large Corpora (Columbus, Ohio) was so enthusiastic that we were encouraged to make it an annual event. The following year, we staged the Second Workshop on Very Large Corpora in Kyoto. As a way of managing these annual workshops, we then decided to register a special interest group called SIGDAT with the Association for Computational Linguistics. The demand for international forums on corpus-based NLP has been expanding so rapidly that in 1995 SIGDAT was led to organize not only the Third Workshop on Very Large Corpora (Cambridge, Mass.) but also a complementary workshop entitled From Texts to Tags (Dublin). Obviously, the success of these workshops was in some measure a reflection of the growing popularity of corpus-based methods in the NLP community. But first and foremost, it was due to the fact that the workshops attracted so many high-quality papers.

Book Using Corpora in Discourse Analysis

Download or read book Using Corpora in Discourse Analysis written by Paul Baker and published by Bloomsbury Publishing. This book was released on 2023-08-24 with total page 218 pages. Available in PDF, EPUB and Kindle. Book excerpt: How can you carry out discourse analysis using corpus linguistics? What research questions should you ask? Which methods should you use and when? What is a collocational network or a key cluster? Introducing the major techniques, methods and tools for corpus-assisted analysis of discourse, this book answers these questions and more, showing readers how to best use corpora in their analyses of discourse. Using carefully tailored case studies, each chapter is devoted to a central technique, including frequency, concordancing and keywords, going step by step through the process of applying different analytical procedures. Introducing a wide range of different corpora, from holiday brochures to political debates, the book considers the key debates and latest advances in the field. Fully revised and updated, this new edition includes:
- A new chapter on how to conduct research projects in corpus-based discourse analysis
- Completely rewritten chapters on collocation and advanced techniques, using a corpus of jihadist propaganda texts and covering topics such as social media and visual analysis
- Coverage of major tools, including CQPweb, AntConc, Sketch Engine and #LancsBox
- Discussion of newer techniques including the derivation of lockwords and the comparison of multiple data sets for diachronic analysis
With exercises, discussion questions and suggested further readings in each chapter, this book is an excellent guide to using corpus linguistics techniques to carry out discourse analysis.

Book The Handbook of Historical Linguistics, Volume II

Download or read book The Handbook of Historical Linguistics Volume II written by Richard D. Janda and published by John Wiley & Sons. This book was released on 2020-09-15 with total page 640 pages. Available in PDF, EPUB and Kindle. Book excerpt: An entirely new follow-up volume providing a detailed account of numerous additional issues, methods, and results that characterize current work in historical linguistics. This brand-new, second volume of The Handbook of Historical Linguistics is a complement to the well-established first volume first published in 2003. It includes extended content allowing uniquely comprehensive coverage of the study of language(s) over time. Though it adds fresh perspectives on several topics previously treated in the first volume, this Handbook focuses on extensions of diachronic linguistics beyond those key issues. This Handbook provides readers with studies of language change whose perspectives range from comparisons of large open vs. small closed corpora, via creolistics and linguistic contact in general, to obsolescence and endangerment of languages. Written by leading scholars in their respective fields, new chapters are offered on matters such as the origin of language, evidence from language for reconstructing human prehistory, invocations of language present in studies of language past, benefits of linguistic fieldwork for historical investigation, ways in which not only biological evolution but also field biology can serve as heuristics for research into the rise and spread of linguistic innovations, and more. 
Moreover, it:
- offers novel and broadened content complementing the earlier volume so as to provide the fullest available overview of a wholly engrossing field
- includes 23 all-new contributed chapters, treating some familiar themes from fresh perspectives but mostly covering entirely new topics
- features expanded discussion of material from language families other than Indo-European
- provides a multiplicity of views from numerous specialists in linguistic diachrony
The Handbook of Historical Linguistics, Volume II is an ideal book for undergraduate and graduate students in linguistics, researchers and professional linguists, as well as all those interested in the history of particular languages and the history of language more generally.

Book Advances in Empirical Translation Studies

Download or read book Advances in Empirical Translation Studies written by Meng Ji and published by Cambridge University Press. This book was released on 2019-06-13 with total page 285 pages. Available in PDF, EPUB and Kindle. Book excerpt: Introduces the integration of theoretical and applied translation studies for socially-oriented and data-driven empirical translation research.

Book Web Corpus Construction

    Book Details:
  • Author : Roland Schäfer
  • Publisher : Morgan & Claypool Publishers
  • Release : 2013-07-01
  • ISBN : 1627053123
  • Pages : 197 pages

Download or read book Web Corpus Construction written by Roland Schäfer and published by Morgan & Claypool Publishers. This book was released on 2013-07-01 with total page 197 pages. Available in PDF, EPUB and Kindle. Book excerpt: The World Wide Web constitutes the largest existing source of texts written in a great variety of languages. A feasible and sound way of exploiting this data for linguistic research is to compile a static corpus for a given language. There are several advantages of this approach: (i) Working with such corpora obviates the problems encountered when using Internet search engines in quantitative linguistic research (such as non-transparent ranking algorithms). (ii) Creating a corpus from web data is virtually free. (iii) The size of corpora compiled from the WWW may exceed by several orders of magnitude the size of language resources offered elsewhere. (iv) The data is locally available to the user, and it can be linguistically post-processed and queried with the tools preferred by her/him. This book addresses the main practical tasks in the creation of web corpora up to giga-token size. Among these tasks are the sampling process (i.e., web crawling) and the usual cleanups including boilerplate removal and removal of duplicated content. Linguistic processing and problems with linguistic processing coming from the different kinds of noise in web corpora are also covered. Finally, the authors show how web corpora can be evaluated and compared to other corpora (such as traditionally compiled corpora).

Book Exploring Linguistic Science

Download or read book Exploring Linguistic Science written by Allison Burkette. This book was released on 2018-03-15 with total page 253 pages. Available in PDF, EPUB and Kindle. Book excerpt: Introduces students to the scientific study of language, using the basic principles of complexity theory.

Book Corpora and Language Learners

Download or read book Corpora and Language Learners written by Guy Aston and published by John Benjamins Publishing. This book was released on 2004-01-01 with total page 326 pages. Available in PDF, EPUB and Kindle. Book excerpt: Corpus-aided language pedagogy is one of the central application areas of corpus methodologies, and a test bed for theories of language and learning. This volume provides an overview of current trends, offering methodological and theoretical position statements along with results from empirical studies. The relationship between corpora and learning is examined from complementary perspectives: the study of learner language, the didactic use of corpus findings, and the interaction between corpora and their users. Reflections on current theory and technology open and close the volume. With its focus on the learner and the learning setting, Corpora and Language Learners is addressed to corpus linguists with an interest in learner language, applied linguists wishing to expand their understanding of corpora and their pedagogic potential, and language teachers wishing to critically assess the relevance of work in this field. This volume grew out of selected presentations at the 5th Teaching and Language Corpora conference in Bertinoro, Italy.

Book Exploring English with Online Corpora

Download or read book Exploring English with Online Corpora written by Wendy Anderson and published by Bloomsbury Publishing. This book was released on 2017-09-16 with total page 242 pages. Available in PDF, EPUB and Kindle. Book excerpt: This is an essential guide to using digital resources in the study of English language and linguistics. Assuming no prior experience, it introduces the fundamentals of online corpora and equips readers with the skills needed to search and interpret corpus data. Later chapters focus on specific elements of linguistic analysis, namely vocabulary, grammar, discourse and pronunciation. Examples from five major online corpora illustrate key issues to consider in corpus analysis, while case studies and activities help students get to grips with the wide range of resources that are available and select those that best suit their needs. Perfect for students of corpus linguistics and applied linguistics, this engaging and accessible guide opens the door to an ever-expanding world of online resources. It is also ideal for anyone who is curious about how the English language works and has a desire to explore its many written and spoken forms. New to this Edition:
- Fully revised and updated throughout, incorporating the latest developments in corpus linguistics
- Expanded material on corpora in teaching, contextualising corpus texts and critical discourse analysis

Book Corpus-based Language Studies

Download or read book Corpus-based Language Studies written by Tony McEnery and published by Taylor & Francis. This book was released on 2006 with total page 412 pages. Available in PDF, EPUB and Kindle. Book excerpt: Covering the major approaches to the use of corpus data, this work gathers together influential readings from leading names in the discipline, including Biber, Widdowson, Sinclair, Carter and McCarthy.

Book Learning with Corpora

Download or read book Learning with Corpora written by Guy Aston and published by Athelstan. This book was released on 2001 with total page 290 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book covers the use of corpora in language learning and translation. Chapters include: Learning with corpora: an overview; Corpora and their uses in language research; Corpus-based description in teaching and learning; The pedagogic use of spoken corpora; The learner as researcher; Integrating corpus work into an academic reading course; Swimming in words; Going to the Clochemerle; 'Spoilt for choice': a learner explores general language corpora.

Book Working with Portuguese Corpora

Download or read book Working with Portuguese Corpora written by Tony Berber Sardinha and published by A&C Black. This book was released on 2014-04-10 with total page 347 pages. Available in PDF, EPUB and Kindle. Book excerpt: Although Portuguese is one of the main world languages and researchers have been working on Portuguese electronic text collections for decades (e.g. Kelly, 1970; Biderman, 1978; Bacelar do Nascimento et al., 1984; see Berber Sardinha, 2005), this is the first volume in English that encapsulates the exciting and cutting-edge corpus linguistic work being done with Portuguese language corpora on different continents. The book includes chapters by leading corpus linguists dealing with Portuguese corpora across the world, and their contributions explore various methods and how they are applicable to a wide range of language issues. The book is divided into six sections, each covering a key issue in Corpus Linguistics: lexis and grammar, lexicography, language teaching and terminology, translation, corpus building and sharing, and parsing and annotation. Together these sections present the reader with a broad picture of the field.

Book Creating and Using English Language Corpora

Download or read book Creating and Using English Language Corpora written by Fries and published by BRILL. This book was released on 2023-11-20 with total page 224 pages. Available in PDF, EPUB and Kindle.

Book Linguistic Analysis of Large Corpora with Local Grammars

Download or read book Linguistic Analysis of Large Corpora with Local Grammars written by Clemens Marschner. This book was released on 2010 with total page 208 pages. Available in PDF, EPUB and Kindle.

Book Developing Linguistic Corpora

Download or read book Developing Linguistic Corpora written by Martin Wynne and published by Oxbow Books Limited. This book was released on 2005 with total page 100 pages. Available in PDF, EPUB and Kindle. Book excerpt: A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

Book Building and Using Comparable Corpora for Multilingual Natural Language Processing

Download or read book Building and Using Comparable Corpora for Multilingual Natural Language Processing written by Serge Sharoff and published by Springer Nature. This book was released on 2023-08-23 with total page 138 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book provides a comprehensive overview of methods to build comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic followed by a comparison to parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models, such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, it explains how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.