Statistical Models for Hierarchical Phrase based Machine Translation

Book Details:

Author : Matthias Huck
Publisher :
Release : 2018
ISBN :
Pages : pages

Computers

Syntax based Statistical Machine Translation

Book Details:

Author : Philip Williams
Publisher : Springer Nature
Release : 2022-05-31
ISBN : 3031021649
Pages : 190 pages

Book excerpt: This unique book provides a comprehensive introduction to the most popular syntax-based statistical machine translation models, filling a gap in the current literature for researchers and developers in human language technologies. While phrase-based models have previously dominated the field, syntax-based approaches have proved a popular alternative, as they elegantly solve many of the shortcomings of phrase-based models. The heart of this book is a detailed introduction to decoding for syntax-based models. The book begins with an overview of synchronous context-free grammar (SCFG) and synchronous tree-substitution grammar (STSG) along with their associated statistical models. It also describes how three popular instantiations (Hiero, SAMT, and GHKM) are learned from parallel corpora. It introduces and details hypergraphs and associated general algorithms, as well as algorithms for decoding with both tree and string input. Special attention is given to efficiency, including search approximations such as beam search and cube pruning, data structures, and parsing algorithms. The book consistently highlights the strengths (and limitations) of syntax-based approaches, including their ability to generalize phrase-based translation units, their modeling of specific linguistic phenomena, and their function of structuring the search space.

Improvements in Hierarchical Phrase based Statistical Machine Translation

Book Details:

Author : Baskaran Sankaran
Publisher :
Release : 2013
ISBN :
Pages : 133 pages

Book excerpt: Hierarchical phrase-based translation (Hiero) is a statistical machine translation (SMT) model that encodes translation as a synchronous context-free grammar derivation between source and target language strings (Chiang, 2005; Chiang, 2007). Hiero models are more powerful than phrase-based models in capturing complex source-target reordering as well as discontiguous phrases, while being easier to estimate and decode with than their full syntax-based counterparts. In this thesis, we propose improvements to two broad aspects of the Hiero translation pipeline: i) learning Hiero translation models and estimating their parameters and ii) parameter tuning for the discriminative log-linear models that are used to decode with such features. We use our own open-source implementation of Hiero called Kriya (Sankaran et al., 2012b) for all the experiments in this thesis. This thesis contains the following specific contributions: We propose a Bayesian model for learning Hiero grammars as an alternative to the heuristic method usually used in Hiero. Our model learns a peaked distribution of grammars, which consistently performs better than the heuristically extracted grammars across several language pairs (Sankaran et al., 2013a). We propose a novel unified-cascade framework for jointly learning alignments and the Hiero translation rules by removing the disconnect between the alignments and the extracted synchronous context-free grammar. This is the first joint training framework proposed for Hiero: we iterate a two-step inference that learns, in alternating iterations, the phrase alignments and then the Hiero rules that are consistent with those alignments. We extend our Bayesian model for extracting compact Hiero translation rules using arity-1 grammars, resulting in up to 57% reduction in model size while retaining the translation performance (Sankaran et al., 2011; Sankaran et al., 2012a). We propose several novel approaches for parameter tuning of discriminative log-linear models for SMT which can be used for jointly optimizing towards multiple evaluation metrics. We show that our methods for multi-objective tuning for SMT yield substantial gains in translation quality measured through automatic as well as human evaluations (Sankaran et al., 2013b; Duh et al., 2013).
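
To make the idea of translation as a synchronous context-free grammar derivation more concrete, here is a minimal Python sketch of applying a single Hiero-style rule. It is not taken from the thesis or from Kriya: the rule, the romanized toy words, and the names `apply_rule` and `fillers` are all hypothetical and chosen only for illustration. Integers stand for linked nonterminal slots, and the same slot may appear in a different position on the target side, which is how such rules express reordering.

```python
from typing import Dict, List, Tuple, Union

Symbol = Union[str, int]               # str = terminal word, int = nonterminal slot
Rule = Tuple[List[Symbol], List[Symbol]]  # (source side, target side)
Filler = Tuple[List[str], List[str]]      # (source words, target words) for one slot


def apply_rule(rule: Rule, fillers: Dict[int, Filler]) -> Tuple[List[str], List[str]]:
    """Expand both sides of a synchronous rule with the given slot fillers."""
    src_side, tgt_side = rule

    def expand(side: List[Symbol], pick: int) -> List[str]:
        out: List[str] = []
        for sym in side:
            if isinstance(sym, int):
                # Linked nonterminal: splice in the sub-derivation for this slot.
                out.extend(fillers[sym][pick])
            else:
                out.append(sym)
        return out

    return expand(src_side, 0), expand(tgt_side, 1)


# Hypothetical toy rule: X -> <X1 de X2, the X2 of X1>
rule: Rule = ([1, "de", 2], ["the", 2, "of", 1])
# Hypothetical sub-derivations already built for each slot.
fillers = {1: (["aozhou"], ["Australia"]), 2: (["shoudu"], ["capital"])}

src, tgt = apply_rule(rule, fillers)
print(" ".join(src))
print(" ".join(tgt))
```

The two slots swap order between the source and target sides, which a purely contiguous phrase pair could only capture by memorizing the whole phrase.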

CCG augmented Hierarchical Phrase based Statistical Machine Translation

Book Details:

Author : Hala Almaghout
Publisher :
Release : 2012
ISBN :
Pages : pages

Language Arts & Disciplines

Linguistically Motivated Statistical Machine Translation

Book Details:

Author : Deyi Xiong
Publisher : Springer
Release : 2015-02-11
ISBN : 9812873562
Pages : 159 pages

Book excerpt: This book provides a wide variety of algorithms and models to integrate linguistic knowledge into Statistical Machine Translation (SMT). It helps advance conventional SMT to linguistically motivated SMT by enhancing the following three essential components: translation, reordering and bracketing models. It also serves the purpose of promoting the in-depth study of the impacts of linguistic knowledge on machine translation. Finally, it provides a systematic introduction to Bracketing Transduction Grammar (BTG) based SMT, one of the state-of-the-art SMT formalisms, as well as a case study of linguistically motivated SMT on a BTG-based platform.

Left to Right Hierarchical Phrase based Machine Translation

Book Details:

Author : Maryam Siahbani
Publisher :
Release : 2016
ISBN :
Pages : 82 pages

Book excerpt: Hierarchical phrase-based translation (Hiero for short) models statistical machine translation (SMT) using a lexicalized synchronous context-free grammar (SCFG) extracted from word aligned bitexts. The standard decoding algorithm for Hiero uses a CKY-style dynamic programming algorithm with time complexity O(n^3) for source input with n words. Scoring target language strings using a language model in CKY-style decoding requires two histories per hypothesis, making it significantly slower than phrase-based translation, which only keeps one history per hypothesis. In addition, the size of a Hiero SCFG grammar is typically much larger than phrase-based models when extracted from the same data, which also slows down decoding. In this thesis we address these issues in Hiero by adopting a new translation model and decoding algorithm called Left-to-Right hierarchical phrase-based translation (LR-Hiero for short). LR-Hiero uses a constrained form of lexicalized SCFG rules to encode translation, where the target side is constrained to be prefix-lexicalized. LR-Hiero uses a decoding algorithm with time complexity O(n^2) that generates the target language output in a left-to-right manner, which keeps only one history per hypothesis, resulting in faster decoding for Hiero grammars. The thesis contains the following contributions: (i) We propose a novel dynamic programming algorithm for the rule extraction phase. Unlike traditional Hiero rule extraction, which performs a brute-force search, LR-Hiero rule extraction is linear in the number of rules. (ii) We propose an augmented version of the LR-decoding algorithm previously proposed by Watanabe et al. (ACL 2006). Our modified LR-decoding algorithm addresses issues related to decoding time and translation quality and is shown to be more efficient than the CKY decoding algorithm in our experimental results. (iii) We extend our LR-decoding algorithm to capture all hierarchical phrasal alignments that are reachable in CKY-style decoding algorithms. (iv) We introduce a lexicalized reordering model to LR-Hiero that significantly improves the translation quality. (v) We apply LR-Hiero to the task of simultaneous translation; this is the first attempt to use Hiero models in simultaneous translation. We show that we can perform online segmentation on the source side to improve latency and maintain translation quality.
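
The prefix-lexicalized constraint mentioned in this excerpt can be illustrated with a small filter over rule target sides. The sketch below is only an assumption-laden reading of that one phrase: it treats a target side as prefix-lexicalized when it begins with a terminal word, and the thesis may impose further restrictions that are not modeled here. The symbol convention (`X1`, `X2` for nonterminals), the function names, and the toy rules are all hypothetical.

```python
from typing import List, Tuple


def is_nonterminal(sym: str) -> bool:
    # Assumed convention for this sketch: nonterminals are written X1, X2, ...
    return len(sym) > 1 and sym[0] == "X" and sym[1:].isdigit()


def is_prefix_lexicalized(target_side: List[str]) -> bool:
    """True if the target side is non-empty and starts with a terminal word."""
    return bool(target_side) and not is_nonterminal(target_side[0])


# Hypothetical toy rules as (source side, target side) pairs.
rules: List[Tuple[List[str], List[str]]] = [
    (["X1", "de", "X2"], ["the", "X2", "of", "X1"]),  # starts with "the": kept
    (["X1", "de", "X2"], ["X2", "of", "X1"]),         # starts with X2: dropped
    (["shoudu"], ["capital"]),                        # fully lexical: kept
]

lr_rules = [rule for rule in rules if is_prefix_lexicalized(rule[1])]
print(len(lr_rules), "of", len(rules), "rules start with a target-side terminal")
```

Because every kept rule emits at least one target word before any nonterminal, a decoder can extend the output strictly left to right, which is consistent with the single language-model history per hypothesis described in the excerpt.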

Computers

Statistical Machine Translation

Book Details:

Author : Philipp Koehn
Publisher : Cambridge University Press
Release : 2010
ISBN : 0521874157
Pages : 447 pages

Book excerpt: The dream of automatic language translation is now closer thanks to recent advances in the techniques that underpin statistical machine translation. This class-tested textbook from an active researcher in the field provides a clear and careful introduction to the latest methods and explains how to build machine translation systems for any two languages. It introduces the subject's building blocks from linguistics and probability, then covers the major models for machine translation: word-based, phrase-based, and tree-based, as well as machine translation evaluation, language modeling, discriminative training and advanced methods to integrate linguistic annotation. The book also reports the latest research, presents the major outstanding challenges, and enables novices as well as experienced researchers to make novel contributions to this exciting area. Ideal for students at undergraduate and graduate level, or for anyone interested in the latest developments in machine translation.

Phrase Based Statistical Machine Translation

Book Details:

Author : Richard Zens
Publisher :
Release : 2008
ISBN :
Pages : 151 pages

Computers

Neural Machine Translation

Book Details:

Author : Philipp Koehn
Publisher : Cambridge University Press
Release : 2020-06-18
ISBN : 1108497322
Pages : 409 pages

Book excerpt: Learn how to build machine translation systems with deep learning from the ground up, from basic concepts to cutting-edge research.

Use of Source Language Context in Statistical Machine Translation

Book Details:

Author : Rejwanul Haque
Publisher : LAP Lambert Academic Publishing
Release : 2012-02
ISBN : 9783847340973
Pages : 228 pages

Book excerpt: The translation features typically used in state-of-the-art statistical machine translation (SMT) model dependencies between the source and target phrases, but not among the phrases in the source language themselves. A swathe of research has demonstrated that integrating source context modelling directly into log-linear phrase-based SMT (PB-SMT) and hierarchical PB-SMT (HPB-SMT) can positively influence the weighting and selection of target phrases, and thus improve translation quality. In this book we present novel approaches to incorporate source-language contextual modelling into the state-of-the-art SMT models in order to enhance the quality of lexical selection. We investigate the effectiveness of a range of contextual features, including lexical features of neighbouring words, part-of-speech tags, supertags, sentence-similarity features, dependency information, and semantic roles. We explored a series of language pairs featuring typologically different languages, and examined the scalability of our research to larger amounts of training data.

Business & Economics

Machine Translation

Book Details:

Author : Pushpak Bhattacharyya
Publisher : CRC Press
Release : 2015-02-04
ISBN : 1439897204
Pages : 227 pages

Book excerpt: This book compares and contrasts the principles and practices of rule-based machine translation (RBMT), statistical machine translation (SMT), and example-based machine translation (EBMT). Presenting numerous examples, the text introduces language divergence as the fundamental challenge to machine translation, emphasizes and works out word alignment, explores IBM models of machine translation, covers the mathematics of phrase-based SMT, provides complete walk-throughs of the working of interlingua-based and transfer-based RBMT, and analyzes EBMT, showing how translation parts can be extracted and recombined to automatically translate a new input.

Computers

Machine Translation with Minimal Reliance on Parallel Resources

Book Details:

Author : George Tambouratzis
Publisher : Springer
Release : 2017-08-09
ISBN : 3319631071
Pages : 92 pages

Book excerpt: This book provides a unified view on a new methodology for Machine Translation (MT). This methodology extracts information from widely available resources (extensive monolingual corpora) while only assuming the existence of a very limited parallel corpus, thus having a unique starting point to Statistical Machine Translation (SMT). In this book, a detailed presentation of the methodology principles and system architecture is followed by a series of experiments, where the proposed system is compared to other MT systems using a set of established metrics including BLEU, NIST, Meteor and TER. Additionally, free-to-use code is available that allows the creation of new MT systems. The volume is addressed to both language professionals and researchers. Prerequisites for the readers are very limited and include a basic understanding of machine translation as well as of the basic tools of natural language processing.

Education

Proceedings of the 2022 5th International Conference on Humanities Education and Social Sciences ICHESS 2022

Book Details:

Author : Augustin Holl
Publisher : Springer Nature
Release : 2023-01-13
ISBN : 2494069890
Pages : 3270 pages

Book excerpt: This is an open access book. ICHESS started in 2018, and the proceedings of the last four sessions have all been successfully published. The 2022 5th International Conference on Humanities Education and Social Sciences (ICHESS 2022) was held on October 14-16, 2022 in Chongqing, China. ICHESS brings together innovative academics and industrial experts in the field of Humanities Education and Social Sciences in a common forum. The primary goal of the conference is to promote research and developmental activities in Humanities Education and Social Sciences; another goal is to promote scientific information interchange between researchers, developers, engineers, students, and practitioners working all around the world. The conference is held every year, making it a platform for people to share views and experiences in Humanities Education and Social Sciences and related areas.

Computers

Handbook of Natural Language Processing and Machine Translation

Book Details:

Author : Joseph Olive
Publisher : Springer Science & Business Media
Release : 2011-03-02
ISBN : 1441977139
Pages : 956 pages

Book excerpt: This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the breakthrough GALE (Global Autonomous Language Exploitation) program within the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation. The most fundamental contrast between GALE and its predecessor programs was its holistic integration of previously separate or sequential processes. In earlier language research programs, each of the individual processes was performed separately and sequentially: speech recognition, language recognition, transcription, translation, and content summarization. The GALE program employed a distinctly new approach by executing these processes simultaneously. Speech and language recognition algorithms now aid translation and transcription processes and vice versa. This combination of previously distinct processes has produced significant research and performance breakthroughs and has fundamentally changed the natural language processing and machine translation fields. This comprehensive handbook provides an exhaustive exploration into these latest technologies in natural language, speech and signal processing, and machine translation, providing researchers, practitioners and students with an authoritative reference on the topic.

A Phrase Based Joint Probability for Statistical Machine Translation

Book Details:

Author :
Publisher :
Release : 2002
ISBN :
Pages : 8 pages

Book excerpt: We present a joint probability model for statistical machine translation, which automatically learns word and phrase equivalents from bilingual corpora. Translations produced with parameters estimated using the joint model are more accurate than translations produced using IBM Model 4.

Aligning the Foundations of Hierarchical Statistical Machine Translation

Book Details:

Author : Gideon Maillette de Buy Wenniger
Publisher :
Release : 2016
ISBN : 9789402801934
Pages : 0 pages

Book excerpt: "Statistical machine translation (SMT) plays an important role in the automatic translation of the large and increasing volume of documents that has become globally available. The results of SMT are often still lacking in various aspects including word order. This thesis focuses on the improvement of hierarchical SMT, in particular Hiero. Hiero rules lack nonterminal labels. This gives them little context and makes their combination into full translations poorly coordinated, and strongly dependent on the language model. In this thesis, bilingual labels are added to Hiero rules. These bilingual labels lead to more coherent translations with better word order, as demonstrated by extensive experiments on three language pairs. The proposed labels require no syntactic information, and use only the information from word alignments. This distinguishes them from various types of syntactic labels earlier proposed in the literature. Bilingual labels are based on a newly proposed framework called hierarchical alignment trees (HATs). HATs are bilingual trees that represent the hierarchical translation equivalence structure induced from word alignments. HATs maximally decompose word alignments into phrase pairs, and provide an explicit description of the local reordering taking place within each phrase pair. The last part of the thesis is concerned with the complexity of empirical translation equivalence. Given a word alignment and a grammar, it studies the question of what it means for the grammar to cover the word alignment. HATs play a key role in answering this question exactly and efficiently, and are applied to characterize alignment complexity for various language pairs."--Author's summary.