EBookClubs

Read Books & Download eBooks Full Online

Book Integrative Modeling for Genome wide Regulation of Gene Expression

Download or read book Integrative Modeling for Genome wide Regulation of Gene Expression written by Zhengqing Ouyang and published by Stanford University. This book was released on 2010 with total page 135 pages. Available in PDF, EPUB and Kindle. Book excerpt: High-throughput genomics is generating massive amounts of genome-wide data. With proper modeling methodologies, we can expect to achieve a more comprehensive understanding of the regulatory mechanisms of biological systems. This work presents integrative approaches for the modeling and analysis of gene regulatory systems. In mammals, gene expression regulation is combinatorial in nature, with diverse roles of regulators on target genes. Microarrays (such as Exon Arrays) and RNA-Seq can be used to quantify the whole spectrum of RNA transcripts. ChIP-Seq is being used for the identification of transcription factor (TF) binding sites and histone modification marks. RNA interference (RNAi), coupled with gene expression profiles, allows perturbations of gene regulatory systems. Our approaches extract useful information from those genome-wide measurements for effectively modeling the logic of gene expression regulation. We present a model for predicting gene expression from ChIP-Seq signals, based on quantitative modeling of regulator-gene association strength, principal component analysis, and regression-based model selection. We demonstrate the combinatorial regulation of TFs, and their power for explaining genome-wide gene expression variation. We also illustrate the roles of covalent histone modification marks in predicting gene expression and their regulation by TFs. We present a dynamical model of gene expression profiling, and derive the perturbed behaviors of the ordinary differential equation (ODE) system. Based on that, we present a regularized multivariate regression method for inferring the gene regulatory network of a stable cell type. 
We model the sparsity and stability of the network by a regularization approach. We applied the approaches to both a simulation data set and the RNAi perturbation data in mouse embryonic stem cells.

Book Transcriptomics and Gene Regulation

Download or read book Transcriptomics and Gene Regulation written by Jiaqian Wu and published by Springer. This book was released on 2015-11-17 with total page 190 pages. Available in PDF, EPUB and Kindle. Book excerpt: This volume focuses on modern computational and statistical tools for translational gene expression and regulation research to improve prognosis, diagnostics, prediction of severity, and therapies for human diseases. It introduces some of the state-of-the-art technologies as well as computational and statistical tools for translational bioinformatics in the areas of gene transcription and regulation, including tools for next-generation sequencing analyses, alternative splicing, the modeling of signaling pathways, network analyses for predicting disease genes, and protein and gene expression data integration in complex human diseases. The book is particularly useful for researchers and students in the fields of molecular biology, clinical biology and bioinformatics, as well as physicians. Dr. Jiaqian Wu is assistant professor in the Vivian L. Smith Department of Neurosurgery and Center for Stem Cell and Regenerative Medicine, University of Texas Health Science Centre, Houston, TX, USA.

Book In silico Modeling Gene Expression Utilizing Genomic Activity and 3D Contact Information

Download or read book In silico Modeling Gene Expression Utilizing Genomic Activity and 3D Contact Information written by Jordan Hughey and published by . This book was released on 2022 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Large scale genetic studies for numerous traits have implicated genetic variants across the genome. This is well illustrated through the recent genome wide association study (GWAS) of 1.2 million individuals leading to the discovery of 406 loci associated with tobacco and alcohol use (1). Yet, there still lies a knowledge gap of the functional mechanisms and biological etiology behind the variant-phenotype associations found. With the advances in modern genomic technologies, datasets often gather a multitude of biological measures, including DNA genotypes, RNA expression and epigenetic information (2). Association studies using these integrative datasets will not only implicate associated genes, but also reveal underlying mechanisms for diseases (3). Here, we look to understand the mechanisms behind gene regulation that influence complex traits. This will be accomplished by evaluating current transcriptome wide association study (TWAS) methods and developing a new multi-omic TWAS framework. Transcriptome wide association studies (TWAS) are a popular approach to multi-omic integrative methods. To date, TWAS methods rely on statistical models to predict gene expression due to a lack of individual level expression data available. These predicted expression values are further correlated to the phenotype of interest to see which genes (and their expression) affect the trait. Mainstream TWAS methods use genetic variants that are within a 1Mb distance from the gene as predictors for the gene's expression. This heuristic definition of cis-regulatory region is often not optimal and fails to pinpoint the true set of eQTLs. 
Alternatively, we propose using chromatin conformation data and enhancer activity marks to select the genetic variants used for transcriptome prediction. This uses molecular and spatial knowledge to provide a biologically informed method for selecting genetic variants for expression prediction model training. Furthermore, we apply our approach to a cross-tissue framework for gene expression prediction, as well as an ensemble approach to produce optimal models from either cross-tissue or single tissue measures based on gene context. We applied this suite of frameworks to 13 human tissues in the Genotype-Tissue Expression project and compared our findings to previous methods. Our approach resulted in an average gain of 52% and 14% more significant imputed models and an average of 44% and 5% improvement in prediction accuracy when compared to two widely-used methods: PrediXcan and UTMOST, respectively. Finally, we apply our expression prediction models to the genome-wide association results of the largest smoking and drinking use cohort to highlight our method's advantages for analyzing complex traits. We show that the improved expression prediction accuracy from multi-omic integration leads to increased power to detect gene-trait associations. Ultimately, this dissertation highlights the use of integrative approaches for genomic association studies. This dissertation provides a foundation for future epigenetic integration for association studies, and emphasizes that multi-omic approaches will substantially improve our understanding of the mechanisms behind complex traits.
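
Illustration (not part of the book excerpt): the 1Mb cis-window heuristic that the excerpt attributes to mainstream TWAS methods can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions. The `Variant` type, the function name `cis_window_variants`, the example coordinates, and the choice to measure the window from the gene body rather than from the transcription start site are all hypothetical and are not taken from the dissertation, PrediXcan, or UTMOST.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (an assumption for illustration)."""
    variant_id: str
    chrom: str
    pos: int  # 1-based base-pair position


def cis_window_variants(variants, gene_chrom, gene_start, gene_end,
                        window_bp=1_000_000):
    """Keep variants on the gene's chromosome within window_bp of the gene body.

    Assumption: the window is measured from the gene body; some tools measure
    it from the transcription start site instead.
    """
    lo = max(1, gene_start - window_bp)
    hi = gene_end + window_bp
    return [v for v in variants
            if v.chrom == gene_chrom and lo <= v.pos <= hi]


# Made-up example data: a gene at chr1:2,000,000-2,050,000.
variants = [
    Variant("var_a", "chr1", 500_000),    # more than 1 Mb upstream
    Variant("var_b", "chr1", 2_600_000),  # inside the window
    Variant("var_c", "chr1", 3_050_001),  # one base pair past the window
    Variant("var_d", "chr2", 2_010_000),  # different chromosome
]
selected = cis_window_variants(variants, "chr1", 2_000_000, 2_050_000)
print([v.variant_id for v in selected])
```

The excerpt's proposal would replace this purely distance-based filter with one informed by chromatin conformation data and enhancer activity marks; that selection step is not sketched here because the excerpt does not describe it in enough detail.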

Book Approaches in Integrative Bioinformatics

Download or read book Approaches in Integrative Bioinformatics written by Ming Chen and published by Springer Science & Business Media. This book was released on 2014-01-18 with total page 385 pages. Available in PDF, EPUB and Kindle. Book excerpt: Approaches in Integrative Bioinformatics provides a basic introduction to biological information systems, as well as guidance for the computational analysis of systems biology. This book also covers a range of issues and methods that reveal the multitude of omics data integration types and the relevance that integrative bioinformatics has today. Topics include biological data integration and manipulation, modeling and simulation of metabolic networks, transcriptomics and phenomics, and virtual cell approaches, as well as a number of applications of network biology. It helps to illustrate the value of integrative bioinformatics approaches to the life sciences. This book is intended for researchers and graduate students in the field of Bioinformatics. Professor Ming Chen is the Director of the Bioinformatics Laboratory at the College of Life Sciences, Zhejiang University, Hangzhou, China. Professor Ralf Hofestädt is the Chair of the Department of Bioinformatics and Medical Informatics, Bielefeld University, Germany.

Book The Analysis of Regulatory DNA  Current Developments  Knowledge and Applications Uncovering Gene Regulation

Download or read book The Analysis of Regulatory DNA Current Developments Knowledge and Applications Uncovering Gene Regulation written by Kenneth Berendzen and published by Bentham Science Publishers. This book was released on 2013-10-29 with total page 225 pages. Available in PDF, EPUB and Kindle. Book excerpt: A major goal of integrative research is understanding regulatory networks to such an extent as to allow researchers to model developmental and stress responses. Regulatory networks of living systems include complex and vast interactions between proteins, metabolites, RNA, various signaling molecules and DNA. One aspect of systems biology is understanding the dynamics of protein-DNA interactions affecting gene expression that are caused by transcription factors (TFs) and chromatin remodeling factors. This e-book provides a resource summarizing current knowledge of eukaryotic transcription and explores cis-elements and methods for their analysis, prediction and discovery. The book also presents an overview of exploring gene regulatory networks, chromatin, and miRNAs. Information about state-of-the-art techniques for the determination of TF - cis-element interactions in vivo and in silico gives cutting-edge insights into how genomic-scale research is being approached. The Analysis of Regulatory DNA provides readers with both the necessary background knowledge and provocative, up-to-date insights aimed at sparking new and vibrant experimental designs for understanding and predicting cis-elements in the eukaryotic genome.

Book Computational Modeling Of Gene Regulatory Networks   A Primer

Download or read book Computational Modeling Of Gene Regulatory Networks A Primer written by Hamid Bolouri and published by World Scientific Publishing Company. This book was released on 2008-08-13 with total page 341 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book serves as an introduction to the myriad computational approaches to gene regulatory modeling and analysis, and is written specifically with experimental biologists in mind. Mathematical jargon is avoided and explanations are given in intuitive terms. In cases where equations are unavoidable, they are derived from first principles or, at the very least, an intuitive description is provided. Extensive examples and a large number of model descriptions are provided for use in both classroom exercises as well as self-guided exploration and learning. As such, the book is ideal for self-learning and also as the basis of a semester-long course for undergraduate and graduate students in molecular biology, bioengineering, genome sciences, or systems biology.

Book Genome scale Integrative Modelling of Gene Expression and Metabolic Networks

Download or read book Genome scale Integrative Modelling of Gene Expression and Metabolic Networks written by Delali Anku Adiamah and published by . This book was released on 2012 with total page 212 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book An Integrated Approach to Reconstructing Genome scale Transcriptional Regulatory Networks

Download or read book An Integrated Approach to Reconstructing Genome scale Transcriptional Regulatory Networks written by and published by . This book was released on 2015 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall were also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). 
In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.

Book Gene Regulation

    Book Details:
  • Author : Minou Bina
  • Publisher : Humana Press
  • Release : 2013-02-23
  • ISBN : 9781627032858
  • Pages : 401 pages

Download or read book Gene Regulation written by Minou Bina and published by Humana Press. This book was released on 2013-02-23 with total page 401 pages. Available in PDF, EPUB and Kindle. Book excerpt: In this volume of Methods in Molecular Biology™, expert investigators offer comprehensive, complementary, and cutting-edge technologies for studies of gene regulation. The chapters of Gene Regulation: Methods and Protocols are organized to provide an integrated and coherent view of control systems and their associated components. The protocols are broad in their scope. They include molecular, biochemical, and spectroscopic techniques as well as high throughput strategies. Written in the highly successful Methods in Molecular Biology™ series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and key tips on troubleshooting and avoiding known pitfalls. Comprehensive and broad in their scope, the protocols are useful to researchers in many disciplines including molecular biology, genomics, biochemistry, biomedicine, nutrition, and agricultural sciences.

Book Dynamic regulation of DNA methylation in human T cell biology

Download or read book Dynamic regulation of DNA methylation in human T cell biology written by Antonio Lentini and published by Linköping University Electronic Press. This book was released on 2019-03-19 with total page 65 pages. Available in PDF, EPUB and Kindle. Book excerpt: T helper cells play a central role in orchestrating immune responses in humans. Upon encountering a foreign antigen, T helper cells are activated, followed by a differentiation process where the cells are specialised to help combat the infection. Dysregulation of T helper cell activation, differentiation and function has been implicated in numerous diseases, including autoimmunity and cancer. Whereas gene-regulatory networks help drive T-cell differentiation, acquisition of stable cell states requires heritable epigenetic signals, such as DNA methylation. Indeed, the establishment of DNA methylation patterns is a key part of appropriate T-cell differentiation but how this is regulated over time remains unknown. Methylation can be directly attached to cytosine residues in DNA to form 5-methylcytosine (5mC) but the removal of DNA methylation requires multiple enzymatic reactions, commonly initiated by the conversion into 5-hydroxymethylcytosine (5hmC), thus creating a highly complex regulatory system. This thesis aimed to investigate how DNA methylation is dynamically regulated during T-cell differentiation. To this end, we employed large-scale profiling techniques combining gene expression as well as genome-wide 5mC and 5hmC measurements to construct a time-series model of epigenetic regulation of differentiation. This revealed that early T-cell activation was accompanied by extensive genome-wide deposition of 5hmC which resulted in demethylation upon proliferation. Early DNA methylation remodelling through 5hmC was not only indicative of demethylation events during T-cell differentiation but also marked changes persisting long-term in memory T-cell subsets. 
These results suggest that priming of epigenetic landscapes in T-cells is initiated during early activation events, preceding any establishment of a stable lineage, which are then maintained throughout the cells' lifespan. The regions undergoing remodelling were also highly enriched for genetic variants in autoimmune diseases which we show to be functional through disruption of protein binding. These variants could potentially disrupt gene-regulatory networks and the establishment of epigenetic priming, highlighting the complex interplay between genetic and epigenetic layers. In the course of this work, we discovered that a commonly used technique to study genome-wide DNA modifications, DNA immunoprecipitation (DIP)-seq, had a false discovery rate between 50% and 99% depending on the modification and cell type being assayed. This represented inherent technical errors related to the use of antibodies resulting in off-target binding of repetitive sequences lacking any DNA modifications. These sequences are common in mammalian genomes, making robust detection of rare DNA modifications very difficult due to the high background signals. However, off-target binding could easily be controlled for using a non-specific antibody control which greatly improved data quality and biological insight of the data. Although future studies are advised to use alternative methods where available, error correction is an acceptable alternative which will help fuel new discoveries through the removal of extensive background signals. Taken together, this thesis shows how integrative use of high-resolution epigenomic data can be used to study complex biological systems over time as well as how these techniques can be systematically characterised to identify and correct errors resulting in improved detection.

Book Computational Methods for Integrative Inference of Genome scale Gene Regulatory Networks

Download or read book Computational Methods for Integrative Inference of Genome scale Gene Regulatory Networks written by Alireza Fotuhi Siahpirani and published by . This book was released on 2019 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Inference of transcriptional regulatory networks is an important field of research in systems biology, and many computational methods have been developed to infer regulatory networks from different types of genomic data. One of the most popular classes of computational network inference methods is expression based network inference. Given the mRNA levels of genes, these methods reconstruct a network between regulatory genes (called transcription factors) and potential target genes that best explains the input data. However, it has been shown that the networks that are inferred using only expression have low agreement with experimentally validated physical regulatory interactions. In recent years, many methods have been developed to improve the accuracy of these computational methods by incorporating additional data types. In this dissertation, we describe our contributions towards advancing the state of the art in this field. Our first contribution is the development of a prior-based network inference method, MERLIN-P. MERLIN-P uses both expression of genes and prior knowledge of interactions between regulatory genes and their potential targets, and infers a network that is supported by both expression and prior knowledge. Using a logistic function, MERLIN-P can incorporate and combine multiple sources of prior knowledge. The inferred networks in yeast outperform state of the art expression based network inference methods, and perform better than or on par with the state of the art prior based method. Our second contribution is the development of a method to estimate transcription factor activity from a noisy prior network, NCA+LASSO. 
Network Component Analysis (NCA) is a computational method that, given expression of target genes and a (potentially incomplete and noisy) network structure that describes the connection of regulatory genes to these target genes, estimates unobserved activity of the regulators (transcription factor activities, TFA). It has been shown that using TFA can improve the quality of inferred networks. However, our prior knowledge in new contexts could be incomplete and noisy, and we do not know to what extent the presence of noise in the input network affects the quality of estimated TFA. We first show how the presence of noise in the input prior network can decrease the quality of estimated TFA, and then show that by adding a regularization term, we can improve the quality of the estimated TFA. We show that using estimated TFA instead of just expression of TFs in network inference improves the agreement of inferred networks with experimentally validated physical interactions, for all state of the art methods, including MERLIN-P. Our final contribution is the development of a multi-task inference method, Dynamic Regulatory Module Network (DRMN), that simultaneously infers regulatory networks for related cell lines, while taking into account the expected similarity of the cell lines. Many biological contexts are hierarchically related, and leveraging the similarity of these contexts could help us infer more accurate regulatory programs in each context. However, the small number of measurements in each context makes the inference of regulatory networks challenging. By inferring regulatory programs at module level (groups of co-expressed genes), DRMN is able to handle the small number of measurements, while the use of multi-task learning allows for incorporation of the hierarchical relationship of contexts. 
DRMN first infers modules of co-expressed genes in each cell line, then infers a regulatory network for each module, and iteratively updates the inferred modules to reflect both co-expression and co-regulation, and updates the inferred networks to reflect the updated modules. We assess the accuracy of the inferred networks by predicting the expression of held-out genes, and show that the resulting modules and networks provide insight into the process of differentiation between these related cell lines. For all the developed methods, we validate our results by comparing to known experimentally validated networks, and show that our results provide useful insight into the biological processes under consideration. Specifically, in chapter 2, we evaluated our inferred networks based on both network structure and predictive power, identified TFs whose target sets all tested methods fail to recover, and explored potential reasons that can explain this failure. Additionally, we used our method to infer stress specific networks, and evaluated predictions using stress specific knock-down experiments. In chapter 3, we evaluated our inferred networks based on both network structure and predictive power, and furthermore used our inferred networks to identify potential regulators that could be important for pluripotency state in mESC. We tested the effect of these regulators using shRNA experiments, and experimentally validated some of their predicted targets. Finally, in chapter 4, we evaluated our inferred models based on their predictive power and ability to predict gene expression in held-out data.
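
Illustration (not part of the book excerpt): the excerpt says that MERLIN-P combines multiple sources of prior knowledge using a logistic function. The Python sketch below shows what such a combination could look like in general. It is only a hedged sketch: the function name `edge_prior`, the two evidence sources, and the weight and bias values are hypothetical, and the exact parameterization used by MERLIN-P is not given in the excerpt.

```python
import math


def edge_prior(support, weights, bias):
    """Logistic combination of prior-evidence scores for one candidate edge.

    support: per-source evidence scores for a regulator -> target edge
    weights: one weight per evidence source (hypothetical values below)
    bias:    offset that sets the prior probability when no source
             supports the edge
    """
    z = bias + sum(w * s for w, s in zip(weights, support))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for two evidence sources: (motif match, ChIP binding).
WEIGHTS = (1.5, 2.5)
BIAS = -2.0

p_none = edge_prior((0, 0), WEIGHTS, BIAS)   # no prior support
p_motif = edge_prior((1, 0), WEIGHTS, BIAS)  # motif evidence only
p_both = edge_prior((1, 1), WEIGHTS, BIAS)   # both sources agree

print(f"none={p_none:.3f} motif={p_motif:.3f} both={p_both:.3f}")
```

With a negative bias, an edge with no prior support gets a low prior probability, and each additional supporting source raises it; a network inference method could then weigh this prior against how well the edge explains the expression data.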

Book Systems Biology

    Book Details:
  • Author : Bernhard Palsson
  • Publisher : Cambridge University Press
  • Release : 2015-01-26
  • ISBN : 1107038855
  • Pages : 551 pages

Download or read book Systems Biology written by Bernhard Palsson and published by Cambridge University Press. This book was released on 2015-01-26 with total page 551 pages. Available in PDF, EPUB and Kindle. Book excerpt: The first comprehensive single-authored textbook on genome-scale models and the bottom-up approach to systems biology.

Book Computational Methods for Integrative Annotation of the Human Regulatory Genome

Download or read book Computational Methods for Integrative Annotation of the Human Regulatory Genome written by Tevfik Umut Dincer and published by . This book was released on 2023 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Deciphering the complex regulatory programs controlling gene expression is key to gaining insight into countless biological processes. However, a comprehensive characterization of the regulatory elements controlling expression across diverse cell types remains elusive. Analysis of DNA sequence provides insights into potential regulatory regions but cannot provide functional evidence of regulation on its own. Biochemical assays like ChIP-seq and ATAC-seq map epigenetic marks and regions of open chromatin associated with regulatory activity in a wide variety of cell and tissue types across the genome, but do not directly measure regulatory activity. Functional characterization assays like massively parallel reporter assays or CRISPR interference screens offer more direct evidence of regulatory activity but may have limited genomic coverage and cell type availability. Computational methods integrating these diverse data types can enable the prediction and interpretation of regulatory elements across the genome. Here, I present integrative modeling approaches that combine epigenomic, functional, and DNA sequence data for the comprehensive annotation of the human regulatory genome. First, we introduce ChromActivity, a computational method for annotating the regulatory genome across hundreds of cell and tissue types. ChromActivity integrates epigenomic data across over a hundred human cell and tissue types with a diverse set of functional characterization datasets to generate genomewide annotations of regulatory activity. ChromActivity provides annotations featuring discrete states reflecting combinatorial activity patterns and also continuous activity scores reflecting predicted regulatory element activities. 
Next, we present SHARPR-seq, a computational method for integrating DNA sequence information to extend the Sharpr-MPRA high-resolution regulatory activity mapping framework. SHARPR-seq improves upon the SHARPR method in multiple evaluation metrics, enabling improved functional dissection of regulatory elements controlling gene expression. These integrative modeling approaches demonstrate the utility of combining complementary data types to provide a more comprehensive understanding of the human regulatory landscape.

Book Detection  Annotation and Prioritization of Human Regulatory Variants in the Genetics Study

Download or read book Detection Annotation and Prioritization of Human Regulatory Variants in the Genetics Study written by Jun Mulin Li and published by . This book was released on 2017-01-26 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: This dissertation, "Detection, Annotation and Prioritization of Human Regulatory Variants in the Genetics Study" by Jun, Mulin, Li, 李俊, was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to Creative Commons: Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting in order to facilitate the ease of printing and reading of the dissertation. All rights not granted by the above license are retained by the author. Abstract: Interpreting human regulatory variants in the noncoding genomic region is critical to understand the regulatory mechanisms of disease pathogenesis and promote personalized medicine. Recent studies showed that the associated SNPs detected by genome wide association study (GWAS) are significantly enriched in those regions that harbor functional elements, such as transcriptional factor binding sites (TFBSs), chromatin with histone modifications, DNase I hypersensitive sites (DHSs), expression quantitative trait loci (eQTLs) and microRNA (miRNA) binding sites. With the accumulation of functional genomics data, computational methods have been developed to annotate, predict and prioritize noncoding regulatory variants regarding different biological processes. However, evaluating the regulatory effect of genetic variants requires systematic consideration at both the transcriptional and post-transcriptional levels. In this dissertation, we designed a set of computational methods to predict and prioritize regulatory variants that affect gene regulation with comprehensive evaluations. 
We first constructed an integrative database that collects all disease-associated variants from genome-wide association studies (GWAS). Given the GWAS variants for a particular disease or trait, we developed a pipeline, GWAS3D, to systematically analyze the probability of genetic variants affecting regulatory pathways and underlying disease associations by integrating chromatin state, long-range chromosome interaction, sequence motif, and conservation information. We demonstrated that GWAS3D can identify a functional regulatory variant that was experimentally validated to affect enhancer function. Detection and prioritization of regulatory variants in a particular cell type or tissue is challenging and requires systematic consideration of chromatin states under the corresponding condition. Prediction based on cell type-specific functional genomic data can improve the chance and accuracy of regulatory variant discovery. By combining results from multiple methods and epigenome profiles, we developed a Bayesian approach to measure the regulatory potential of genetic variants in a cell type-specific manner. This model can also measure the ensemble effect of chromatin marks around a variant locus and estimate the regulatory probability of a genetic variant in a specific cell environment. We showed that this integrative and condition-dependent strategy significantly improves the prediction performance for functional regulatory variants. Last, we sought to investigate whether genetic variants in miRNA binding sites can affect the function of competing endogenous RNAs (ceRNAs) and subsequent disease development. Using RNA-seq data on human individuals from different populations, we revealed the genome-wide association between DNA polymorphism and ceRNA regulation. We found that regulatory variants can simultaneously affect gene expression in both cis and trans through the ceRNA mechanism. 
We prioritized these variants with their associated ceRNAs according to different criteria and evaluated their collective effect on the ceRNA regulatory network. DOI: 10.5353/th_b5689295 Subjects: Human genetics - Variation Genomics - Data processing

Book Methods for Integrative Analysis of Genomic Data

Download or read book Methods for Integrative Analysis of Genomic Data written by Paul T. Manser. This book was released in 2014 with a total of 184 pages. Available in PDF, EPUB and Kindle. Book excerpt: In recent years, the development of new genomic technologies has allowed for the investigation of many regulatory epigenetic marks besides expression levels on a genome-wide scale. As the price of these technologies continues to decrease, study sizes will increase, and several different assays are already beginning to be used on the same samples. It is therefore desirable to develop statistical methods that integrate multiple data types and can handle the increased computational burden of incorporating large data sets. Furthermore, it is important to develop sound quality control and normalization methods, as technical errors can compound when integrating multiple genomic assays. DNA methylation is a commonly studied epigenetic mark, and the Infinium HumanMethylation450 BeadChip has become a popular microarray that provides genome-wide coverage and is affordable enough to scale to larger study sizes. It employs a complex array design that has complicated efforts to develop normalization methods. We propose a novel normalization method that uses a set of stable methylation sites from housekeeping genes as empirical controls to fit a local regression hypersurface to signal intensities. We demonstrate that our method performs favorably compared to other popular methods for the array. We also discuss an approach to estimating cell-type admixtures, which are a frequent biological confound in these studies. For data integration we propose a gene-centric procedure that uses canonical correlation and subsequent permutation testing to examine correlation or other measures of association and co-localization of epigenetic marks on the genome. 
Specifically, a likelihood ratio test for general association between data modalities is performed after an initial dimension reduction step. Canonical scores are then regressed against covariates of interest using linear mixed effects models. Lastly, permutation testing is performed on weighted correlation matrices to test for co-localization of relationships to physical locations in the genome. We demonstrate these methods on a set of developmental brain samples from the BrainSpan consortium and find substantial relationships between DNA methylation, gene expression, and alternative promoter usage primarily in genes related to axon guidance. We perform a second integrative analysis on another set of brain samples from the Stanley Medical Research Institute.
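
The idea of testing association between two data modalities through canonical correlation can be illustrated with a small sketch. The code below is not the dissertation's procedure: it swaps in a plain sample-label permutation test on the first canonical correlation, uses synthetic data, and omits the dimension reduction and mixed-model steps. All function and variable names are hypothetical.

```python
import numpy as np

def first_canonical_correlation(X, Y):
    """Largest canonical correlation between the column spaces of X and Y.

    Assumes more samples (rows) than features (columns) in each matrix.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the centered data; the singular values of
    # Qx^T Qy are the canonical correlations.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(min(s[0], 1.0))

def permutation_pvalue(X, Y, n_perm=200, seed=0):
    """Permutation p-value for the first canonical correlation.

    Rows of Y are shuffled to break the sample pairing (an illustrative
    null model, assumed here rather than taken from the source).
    """
    rng = np.random.default_rng(seed)
    observed = first_canonical_correlation(X, Y)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        if first_canonical_correlation(X, Y[perm]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic stand-ins for two assays measured on the same 60 samples.
rng = np.random.default_rng(1)
n_samples = 60
latent = rng.normal(size=(n_samples, 1))
methylation = latent @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(n_samples, 4))
expression = latent @ rng.normal(size=(1, 3)) + 0.5 * rng.normal(size=(n_samples, 3))

rho, p_value = permutation_pvalue(methylation, expression)
print(f"first canonical correlation = {rho:.3f}, permutation p = {p_value:.4f}")
```

Shuffling sample labels is the simplest null model and is used here only for illustration; a real analysis would need a permutation scheme that respects covariates and repeated measures.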

Book Applying Integrative Computational Models to Study the Evolution of Gene Regulation

Download or read book Applying Integrative Computational Models to Study the Evolution of Gene Regulation written by Dan Xie. This book was released in 2011. Available in PDF, EPUB and Kindle. Book excerpt: Gene regulatory networks dynamically control the expression levels of all genes and are key to explaining various phenotypes and biological processes. Advances in high-throughput measurement technologies, such as microarrays and next-generation sequencing, have enabled us to globally scrutinize various cell properties related to gene regulation and to build statistical models that make quantitative predictions. The evolutionary process has left all kinds of traces in current biological systems. Studying the evolution of gene regulatory networks in comparable cell types across species is an efficient way to unravel such evolutionary traces and helps us better understand the regulatory mechanism. The two main themes of my research are: analysing various "omics" data in the evolutionary context to identify conservation and changes in gene regulatory networks; and building computational models that incorporate different "omics" data for the annotation of genomes and the prediction of evolution in gene regulation. The second chapter of my thesis describes a computational algorithm for de novo prediction of transcription factor binding site motifs in multiple species. The algorithm, named "GibbsModule", uses three information sources to improve its prediction power: (1) co-expressed genes sharing the same set of motifs; (2) binding sites co-localizing to form modules; and (3) the conservation of motif use across species. We developed a Gibbs sampling procedure to incorporate the three information sources. GibbsModule outperformed the existing algorithms on several synthetic and real datasets. 
When applied to study the binding regions of KLF in embryonic stem cells, GibbsModule discovered a new functional motif. We also used ChIP followed by qPCR to demonstrate that the binding affinity of GibbsModule-predicted binding sites is stronger than that of non-predicted motifs. Both genome sequence and gene expression carry information about gene regulation. Therefore, we can learn more about gene regulatory networks by jointly analysing sequence and expression data. In the third chapter of my thesis, we first introduce a comparative study of the pre-implantation process of embryos in three mammalian species: human, mouse, and cow. We measured time-course expression profiles of the embryos during early development and analysed them together with genome sequence data and ChIP-seq data. We observed that a large portion of homologous genes changed expression, suggesting a prevalent rewiring of gene regulation. We associated the changes in gene expression with different types of cis-changes in the genome sequences. In particular, we found that about 10% of species-specific transposons carry multiple functional binding sites, which are likely to explain the evolution of gene expression. The second part of this chapter presents a phylogenetic model that incorporates changes in motif use and gene expression to infer the rewiring of gene regulatory networks. Epigenetic modifications, including histone modifications and DNA methylation, are known to be associated with gene regulation. In chapter four, we studied the evolution of epigenomes in pluripotent stem cells of humans, mice, and pigs. We observed the conservation of epigenomes in different categories of genomic regions. We found evidence of positive and negative selection on the evolution of epigenomes. Linear regression models showed that the evolution of epigenomes can largely explain the evolution of gene expression. 
In the second part of this chapter, we introduce a statistical model that describes the evolution of genomes considering both DNA sequences and epigenetic modifications. Based on this evolutionary model, we improved the current alignment algorithm using information on the distribution of epigenetic modifications.
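
For readers unfamiliar with Gibbs sampling for motif discovery, the following minimal sketch shows the basic single-species sampler that assumes one motif occurrence per sequence. It is not GibbsModule: it uses none of the three information sources named in the excerpt (co-expression, module co-localization, cross-species conservation). The function name, parameters, uniform background, and planted-motif test data are all hypothetical choices made for illustration.

```python
import numpy as np

BASES = "ACGT"

def gibbs_motif_search(seqs, width, n_iter=200, pseudocount=1.0, seed=0):
    """Sample one motif start position per sequence with a basic Gibbs sampler.

    Assumes every sequence contains only A/C/G/T, is at least `width` long,
    and that the background base distribution is uniform.
    """
    rng = np.random.default_rng(seed)
    encoded = [np.array([BASES.index(b) for b in s]) for s in seqs]
    # Start from random site positions.
    starts = [int(rng.integers(0, len(e) - width + 1)) for e in encoded]
    cols = np.arange(width)
    for _ in range(n_iter):
        for i, e in enumerate(encoded):
            # Position weight matrix built from the sites of all other sequences.
            counts = np.full((width, 4), pseudocount)
            for j, other in enumerate(encoded):
                if j != i:
                    site = other[starts[j]:starts[j] + width]
                    counts[cols, site] += 1
            pwm = counts / counts.sum(axis=1, keepdims=True)
            # Likelihood ratio (motif vs. uniform background) for each start.
            n_pos = len(e) - width + 1
            weights = np.empty(n_pos)
            for p in range(n_pos):
                window = e[p:p + width]
                weights[p] = np.prod(pwm[cols, window] / 0.25)
            # Resample this sequence's site in proportion to those ratios.
            starts[i] = int(rng.choice(n_pos, p=weights / weights.sum()))
    return starts

# Synthetic sequences with a planted motif (illustrative data only).
rng = np.random.default_rng(42)
motif = "TGACGTCA"
seqs = []
for _ in range(8):
    background = "".join(rng.choice(list(BASES), size=30))
    pos = int(rng.integers(0, 30 - len(motif) + 1))
    seqs.append(background[:pos] + motif + background[pos + len(motif):])

starts = gibbs_motif_search(seqs, width=len(motif))
sites = [s[p:p + len(motif)] for s, p in zip(seqs, starts)]
print(sites)
```

Because the sampler is stochastic, it may settle on a shifted or different site in some runs; practical tools usually run several restarts and keep the alignment with the best score.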