
Dense Subgraph Mining in Probabilistic Graphs

Book Details:

Author : Fatemeh Esfahani
Publisher :
Release : 2021
ISBN :
Pages :

Book excerpt: In this dissertation we consider the problem of mining cohesive (dense) subgraphs in probabilistic graphs, where each edge has a probability of existence. Mining probabilistic graphs has become the focus of interest in analyzing many real-world datasets, such as social, trust, communication, and biological networks, due to the intrinsic uncertainty present in them. Studying cohesive subgraphs can reveal important information about connectivity, centrality, and robustness of the network, with applications in areas such as bioinformatics and social networks. In deterministic graphs, there exist various definitions of cohesive substructures, including cliques, quasi-cliques, k-cores and k-trusses. In this regard, k-core and k-truss decompositions are popular tools for finding cohesive subgraphs. In deterministic graphs, a k-core is the largest subgraph in which each vertex has at least k neighbors, and a k-truss is the largest subgraph whose edges are each contained in at least k triangles (or k-2 triangles, depending on the definition). The k-core and k-truss decompositions of deterministic graphs have been thoroughly studied in the literature. However, in the probabilistic context, the computation is challenging and state-of-the-art approaches do not scale to large graphs. The main challenge is the efficient computation of the tail probabilities of vertex degrees and of the triangle counts of edges in probabilistic graphs. We employ a special version of the central limit theorem (CLT) to obtain the tail probabilities efficiently. Based on our CLT approach we propose peeling algorithms for core and truss decomposition of a probabilistic graph that scale to very large graphs and offer significant improvement over state-of-the-art approaches.
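The tail probability mentioned in this excerpt can be illustrated with a small sketch. It assumes edges exist independently, so a vertex degree follows a Poisson-binomial distribution. The function names `degree_tail_exact` and `degree_tail_normal` are hypothetical, and the plain normal approximation with a continuity correction stands in for the "special version" of the CLT, which the excerpt does not specify.

```python
import math


def degree_tail_exact(probs, k):
    """Exact Pr[degree >= k] when incident edges exist independently.

    Dynamic programming over the Poisson-binomial distribution:
    dist[j] is the probability that exactly j of the edges seen so far exist.
    """
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)  # this edge is absent
            nxt[j + 1] += mass * p      # this edge is present
        dist = nxt
    return sum(dist[k:])


def degree_tail_normal(probs, k):
    """Normal approximation of Pr[degree >= k] with a continuity correction.

    Assumption: an ordinary CLT-style approximation, not necessarily the
    variant used in the dissertation.
    """
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:
        # Every edge is certain or impossible, so the degree is deterministic.
        return 1.0 if mean >= k else 0.0
    z = (k - 0.5 - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

The exact routine costs time quadratic in the number of incident edges, while the approximation needs only one pass to accumulate the mean and variance, which is the kind of saving that makes peeling feasible on large graphs.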
Moreover, we propose a second algorithm for probabilistic core decomposition that can handle graphs not fitting in memory by processing them sequentially one vertex at a time. In terms of truss decomposition, we design a second method which is based on progressive tightening of the estimate of the truss value of each edge based on h-index computation and novel use of dynamic programming. We provide extensive experimental results to show the efficiency of the proposed algorithms. Another contribution of this thesis is mining cohesive subgraphs using the recent notion of nucleus decomposition introduced by Sariyuce et al. Nucleus decomposition is based on higher order structures such as cliques nested in other cliques. Nucleus decomposition can reveal interesting subgraphs that can be missed by core and truss decompositions. In this dissertation, we present nucleus decomposition for probabilistic graphs. The major questions we address are: How can nucleus decomposition be meaningfully defined in probabilistic graphs? How hard is computing nucleus decomposition in probabilistic graphs? Can we devise efficient algorithms for exact or approximate nucleus decomposition in large graphs? We present three natural definitions of nucleus decomposition in probabilistic graphs: local, global, and weakly-global. We show that the local version is in PTIME, whereas global and weakly-global are #P-hard and NP-hard, respectively. We present an efficient and exact dynamic programming approach for the local case. Further, we present statistical approximations that can scale to bigger datasets without much loss of accuracy. For global and weakly-global decompositions we complement our intractability results by proposing efficient algorithms that give approximate solutions based on search space pruning and Monte-Carlo sampling. Extensive experiments show the scalability and efficiency of our algorithms.
Compared to probabilistic core and truss decompositions, nucleus decomposition performs significantly better in terms of density and clustering metrics.
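The deterministic k-core definition quoted in this excerpt can be made concrete with a minimal peeling sketch. It covers only the deterministic case, not the probabilistic algorithms of the dissertation. The function name `core_numbers` is hypothetical, and the linear scan for the minimum-degree vertex is chosen for brevity rather than speed.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph.

    Peeling: repeatedly remove a vertex of minimum remaining degree. The
    largest minimum degree seen so far is the core number of that vertex.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # O(n) scan per step; bucket queues make this linear overall.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core
```

For a triangle on vertices 1, 2, 3 with a pendant vertex 4 attached to vertex 3, the triangle vertices get core number 2 and the pendant vertex gets core number 1.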

Computers

On Uncertain Graphs

Book Details:

Author : Arijit Khan
Publisher : Springer Nature
Release : 2022-05-31
ISBN : 3031018605
Pages : 80 pages

Book excerpt: Large-scale, highly interconnected networks, which are often modeled as graphs, pervade both our society and the natural world around us. Uncertainty, on the other hand, is inherent in the underlying data due to a variety of reasons, such as noisy measurements, lack of precise information needs, inference and prediction models, or explicit manipulation, e.g., for privacy purposes. Therefore, uncertain, or probabilistic, graphs are increasingly used to represent noisy linked data in many emerging application scenarios, and they have recently become a hot topic in the database and data mining communities. Many classical algorithms such as reachability and shortest path queries become #P-complete and, thus, more expensive over uncertain graphs. Moreover, various complex queries and analytics are also emerging over uncertain networks, such as pattern matching, information diffusion, and influence maximization queries. In this book, we discuss the sources of uncertain graphs and their applications, uncertainty modeling, as well as the complexities and algorithmic advances on uncertain graph processing in the context of both classical and emerging graph queries and analytics. We emphasize the current challenges and highlight some future research directions.
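Because exact reachability over uncertain graphs is #P-complete, a common workaround is to estimate the probability by sampling possible worlds. The sketch below assumes independent edge probabilities and a directed graph. The function name `estimate_reachability` and its parameters are hypothetical, and it is not taken from the book.

```python
import random


def estimate_reachability(edges, source, target, samples=2000, seed=0):
    """Monte Carlo estimate of Pr[target is reachable from source].

    `edges` is a list of (u, v, p) triples: the directed edge u -> v exists
    independently with probability p. Each sample is one possible world.
    """
    rng = random.Random(seed)
    out = {}
    for u, v, p in edges:
        out.setdefault(u, []).append((v, p))
    hits = 0
    for _ in range(samples):
        seen = {source}
        stack = [source]
        while stack and target not in seen:
            u = stack.pop()
            for v, p in out.get(u, ()):
                # Each vertex is expanded once, so each edge is examined at
                # most once per world and can be sampled lazily here.
                if v not in seen and rng.random() < p:
                    seen.add(v)
                    stack.append(v)
        if target in seen:
            hits += 1
    return hits / samples
```

The standard error of the estimate shrinks with the square root of `samples`, so halving the error takes roughly four times as many sampled worlds.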

Computers

Graph Mining

Book Details:

Author : Deepayan Chakrabarti
Publisher : Morgan & Claypool Publishers
Release : 2012-10-01
ISBN : 160845116X
Pages : 209 pages

Book excerpt: What does the Web look like? How can we find patterns, communities, outliers, in a social network? Which are the most central nodes in a network? These are the questions that motivate this work. Networks and graphs appear in many diverse settings, for example in social networks, computer-communication networks (intrusion detection, traffic management), protein-protein interaction networks in biology, document-text bipartite graphs in text retrieval, person-account graphs in financial fraud detection, and others. In this work, first we list several surprising patterns that real graphs tend to follow. Then we give a detailed list of generators that try to mirror these patterns. Generators are important, because they can help with "what if" scenarios, extrapolations, and anonymization. Then we provide a list of powerful tools for graph analysis, and specifically spectral methods (Singular Value Decomposition (SVD)), tensors, and case studies like the famous "PageRank" algorithm and the "HITS" algorithm for ranking web search results. Finally, we conclude with a survey of tools and observations from related fields like sociology, which provide complementary viewpoints.
Table of Contents: Introduction / Patterns in Static Graphs / Patterns in Evolving Graphs / Patterns in Weighted Graphs / Discussion: The Structure of Specific Graphs / Discussion: Power Laws and Deviations / Summary of Patterns / Graph Generators / Preferential Attachment and Variants / Incorporating Geographical Information / The RMat / Graph Generation by Kronecker Multiplication / Summary and Practitioner's Guide / SVD, Random Walks, and Tensors / Tensors / Community Detection / Influence/Virus Propagation and Immunization / Case Studies / Social Networks / Other Related Work / Conclusions
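As an illustration of the PageRank case study mentioned in this excerpt, the following is a minimal power-iteration sketch. The function name `pagerank`, the damping value of 0.85, and the uniform handling of dangling nodes are assumptions made for this example. They are common conventions and not details taken from the book.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration for PageRank on a directed graph.

    `links` maps each node to the list of nodes it points to. The rank of
    dangling nodes (no out-links) is spread uniformly over all nodes.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        # Teleportation term plus the redistributed dangling mass.
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u, targets in links.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    nxt[v] += share
        rank = nxt
    return rank
```

A fixed iteration count keeps the sketch short. A practical version would stop once the change between successive rank vectors falls below a tolerance.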

Analyzing Probabilistic Graphs

Book Details:

Author : Michalis Potamias
Publisher :
Release : 2012
ISBN :
Pages : 224 pages

Book excerpt: Large probabilistic graphs appear in many diverse application domains such as social, biological, and mobile ad-hoc networks. Similar to standard graphs, probabilistic graphs may be weighted or unweighted, and directed or undirected; the difference is that their components are also associated with uncertainty. This thesis focuses on analyzing graphs whose edges are labeled with probability values. Assuming that the probabilistic graph is known a priori, we revisit well known graph mining problems. In particular, we study the problems of defining distance functions between two nodes, answering k-nearest neighbors queries, and clustering the probabilistic graph into partitions. Contrary to mining tasks, in learning tasks the probabilistic graph is unknown; it is the objective of the analysis. In this thesis we propose models and design algorithms to infer probabilistic graphs. In particular we infer probabilistic graphs that explain the observed spread of information in social networks. We analyze probabilistic graphs both theoretically and experimentally. The theoretical analysis consists of defining analytical tasks, studying their computational complexity, and designing algorithms to address them. In the experimental analysis, we apply our techniques to synthetic data as well as to real-world data from biological and online social networks. This analysis shows the computational efficiency and the analytical efficacy of the proposed techniques.

Computers

Efficient Frequent Subtree Mining Beyond Forests

Book Details:

Author : P. Welke
Publisher : IOS Press
Release : 2020-06-02
ISBN : 164368079X
Pages : 190 pages

Book excerpt: A common paradigm in distance-based learning is to embed the instance space into a feature space equipped with a metric and define the dissimilarity between instances by the distance of their images in the feature space. Frequent connected subgraphs are sometimes used to define such feature spaces if the instances are graphs, but identifying the set of frequent connected subgraphs and subsequently computing embeddings for graph instances is computationally intractable. As a result, existing frequent subgraph mining algorithms either restrict the structural complexity of the instance graphs or require exponential delay between the output of subsequent patterns, meaning that distance-based learners lack an efficient way to operate on arbitrary graph data. This book presents a mining system that gives up the demand on the completeness of the pattern set, and instead guarantees a polynomial delay between subsequent patterns. To complement this, efficient methods devised to compute the embedding of arbitrary graphs into the Hamming space spanned by the pattern set are described. As a result, a system is proposed that allows the efficient application of distance-based learning methods to arbitrary graph databases. In addition to an introduction and conclusion, the book is divided into chapters covering: preliminaries; related work; probabilistic frequent subtrees; boosted probabilistic frequent subtrees; and fast computation, with a further two chapters on Hamiltonian path for cactus graphs and Poisson binomial distribution.

Technology & Engineering

Mining Graph Data

Book Details:

Author : Diane J. Cook
Publisher : John Wiley & Sons
Release : 2006-12-18
ISBN : 0470073039
Pages : 501 pages

Book excerpt: This text takes a focused and comprehensive look at mining data represented as a graph, with the latest findings and applications in both theory and practice provided. Even if you have minimal background in analyzing graph data, with this book you’ll be able to represent data as graphs, extract patterns and concepts from the data, and apply the methodologies presented in the text to real datasets. There is a misprint with the link to the accompanying Web page for this book. For those readers who would like to experiment with the techniques found in this book or test their own ideas on graph data, the Web page for the book should be http://www.eecs.wsu.edu/MGD.

Finding Hierarchical and Overlapping Dense Subgraphs Using Nucleus Decompositions

Book Details:

Author :
Publisher :
Release : 2014
ISBN :
Pages : 12 pages


Computers

Managing and Mining Graph Data

Book Details:

Author : Charu C. Aggarwal
Publisher : Springer Science & Business Media
Release : 2010-02-02
ISBN : 1441960457
Pages : 623 pages

Book excerpt: Managing and Mining Graph Data is a comprehensive survey book in graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, chemical and biological data. The chapters are written by well known researchers in the field, and provide a broad perspective of the area. This is the first comprehensive survey book in the emerging topic of graph data processing. Managing and Mining Graph Data is designed for a varied audience composed of professors, researchers and practitioners in industry. This volume is also suitable as a reference book for advanced-level database students in computer science and engineering.

Computers

On Uncertain Graphs

Book Details:

Author : Arijit Khan
Publisher : Morgan & Claypool Publishers
Release : 2018-07-23
ISBN : 1681730383
Pages : 96 pages


Computers

Graph Representation Learning

Book Details:

Author : William L. Hamilton
Publisher : Springer Nature
Release : 2022-06-01
ISBN : 3031015886
Pages : 141 pages

Book excerpt: Graph-structured data is ubiquitous throughout the natural and social sciences, from telecommunication networks to quantum chemistry. Building relational inductive biases into deep learning architectures is crucial for creating systems that can learn, reason, and generalize from this kind of data. Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of convolutional neural networks to graph-structured data, and neural message-passing approaches inspired by belief propagation. These advances in graph representation learning have led to new state-of-the-art results in numerous domains, including chemical synthesis, 3D vision, recommender systems, question answering, and social network analysis. This book provides a synthesis and overview of graph representation learning. It begins with a discussion of the goals of graph representation learning as well as key methodological foundations in graph theory and network analysis. Following this, the book introduces and reviews methods for learning node embeddings, including random-walk-based methods and applications to knowledge graphs. It then provides a technical synthesis and introduction to the highly successful graph neural network (GNN) formalism, which has become a dominant and fast-growing paradigm for deep learning with graph data. The book concludes with a synthesis of recent advancements in deep generative models for graphs, a nascent but quickly growing subset of graph representation learning.

Computers

Cohesive Subgraph Computation over Large Sparse Graphs

Book Details:

Author : Lijun Chang
Publisher : Springer
Release : 2018-12-24
ISBN : 3030035999
Pages : 107 pages

Book excerpt: This book is considered the first extended survey on algorithms and techniques for efficient cohesive subgraph computation. With the rapid development of information technology, huge volumes of graph data have accumulated. The availability of rich graph data not only brings great opportunities for realizing the value of data in key applications, but also brings great challenges in computation. Using a consistent terminology, the book gives an excellent introduction to the models and algorithms for the problem of cohesive subgraph computation. The material is organized from introductory content to more advanced topics, and well-designed source code is provided for most algorithms described in the book. This is a timely book for researchers who are interested in this topic and in efficient data structure design for large sparse graph processing. It is also a guide for new researchers getting to know the area of cohesive subgraph computation.

Computers

Frequent Pattern Mining

Book Details:

Author : Charu C. Aggarwal
Publisher : Springer
Release : 2014-08-29
ISBN : 3319078216
Pages : 480 pages

Book excerpt: This comprehensive reference consists of 18 chapters from prominent researchers in the field. Each chapter is self-contained, and synthesizes one aspect of frequent pattern mining. An emphasis is placed on simplifying the content, so that students and practitioners can benefit from the book. Each chapter contains a survey describing key research on the topic, a case study and future directions. Key topics include: Pattern Growth Methods, Frequent Pattern Mining in Data Streams, Mining Graph Patterns, Big Data Frequent Pattern Mining, Algorithms for Data Clustering and more. Advanced-level students in computer science, researchers and practitioners from industry will find this book an invaluable reference.

Computers

Advances in Internet Data Web Technologies

Book Details:

Author : Leonard Barolli
Publisher : Springer Nature
Release : 2022-02-01
ISBN : 3030959031
Pages : 478 pages

Book excerpt: This book presents original contributions to the theories and practices of emerging Internet, data, and Web technologies and their applicability in businesses, engineering, and academia. The Internet has become the most proliferative platform for emerging large-scale computing paradigms. Among these, data and Web technologies are the two most prominent paradigms, in a variety of forms such as Data Centers, Cloud Computing, Mobile Cloud, Mobile Web Services, and so on. These technologies altogether create a digital ecosystem whose cornerstone is the data cycle, from capturing to processing, analysis, and visualization. The investigation of various research and development issues in this digital ecosystem is boosted by the ever-increasing needs of real-life applications, which are based on storing and processing large amounts of data. As a key feature, the book addresses advances in the life cycle exploitation of data generated from the digital ecosystem data technologies that create value for the knowledge and businesses toward a collective intelligence approach. Researchers, software developers, practitioners, and students interested in the field of data and Web technologies will find this book useful as a reference for their work.

Mathematics

The Probabilistic Method

Book Details:

Author : Noga Alon
Publisher : John Wiley & Sons
Release : 2015-11-02
ISBN : 1119062071
Pages : 396 pages

Book excerpt: Praise for the Third Edition: “Researchers of any kind of extremal combinatorics or theoretical computer science will welcome the new edition of this book.” (MAA Reviews)
Maintaining a standard of excellence that establishes The Probabilistic Method as the leading reference on probabilistic methods in combinatorics, the Fourth Edition continues to feature a clear writing style, illustrative examples, and illuminating exercises. The new edition includes numerous updates to reflect the most recent developments and advances in discrete mathematics and the connections to other areas in mathematics, theoretical computer science, and statistical physics. Emphasizing the methodology and techniques that enable problem-solving, The Probabilistic Method, Fourth Edition begins with a description of tools applied to probabilistic arguments, including basic techniques that use expectation and variance as well as the more advanced applications of martingales and correlation inequalities. The authors explore where probabilistic techniques have been applied successfully and also examine topical coverage such as discrepancy and random graphs, circuit complexity, computational geometry, and derandomization of randomized algorithms.
Written by two well-known authorities in the field, the Fourth Edition features: additional exercises throughout, with hints and solutions to select problems in an appendix, to help readers obtain a deeper understanding of the best methods and techniques; new coverage of topics such as the Local Lemma, the Six Standard Deviations result in Discrepancy Theory, Property B, and graph limits; and updated sections to reflect major developments on the newest topics, discussions of the hypergraph container method, and many new references and improved results. The Probabilistic Method, Fourth Edition is an ideal textbook for upper-undergraduate and graduate-level students majoring in mathematics, computer science, operations research, and statistics. The Fourth Edition is also an excellent reference for researchers and combinatorists who use probabilistic methods, discrete mathematics, and number theory.
Noga Alon, PhD, is Baumritter Professor of Mathematics and Computer Science at Tel Aviv University. He is a member of the Israel National Academy of Sciences and Academia Europaea. A coeditor of the journal Random Structures and Algorithms, Dr. Alon is the recipient of the Polya Prize, the Gödel Prize, the Israel Prize, and the EMET Prize.
Joel H. Spencer, PhD, is Professor of Mathematics and Computer Science at the Courant Institute of New York University. He is the cofounder and coeditor of the journal Random Structures and Algorithms and is a Sloan Foundation Fellow. Dr. Spencer has written more than 200 published articles and is the coauthor of Ramsey Theory, Second Edition, also published by Wiley.

Computers

Solving Large Scale Learning Tasks Challenges and Algorithms

Book Details:

Author : Stefan Michaelis
Publisher : Springer
Release : 2016-07-02
ISBN : 3319417061
Pages : 397 pages

Book excerpt: In celebration of Prof. Morik's 60th birthday, this Festschrift covers research areas that Prof. Morik worked in and presents various researchers with whom she collaborated. The 23 refereed articles in this Festschrift volume provide challenges and solutions from theoreticians and practitioners on data preprocessing, modeling, learning, and evaluation. Topics include data-mining and machine-learning algorithms, feature selection and feature generation, optimization as well as efficiency of energy and communication.

Computers

Random Graphs and Complex Networks

Book Details:

Author : Remco van der Hofstad
Publisher : Cambridge University Press
Release : 2017
ISBN : 110717287X
Pages : 341 pages

Book excerpt: This classroom-tested text is the definitive introduction to the mathematics of network science, featuring examples and numerous exercises.

Finding Homogeneous Collections of Dense Subgraphs Using Constraint based Data Mining Approaches

Book Details:

Author : Pierre-Nicolas Mougel
Publisher :
Release : 2012
ISBN :
Pages : 0 pages

Book excerpt: The work presented in this thesis deals with data mining approaches for the analysis of attributed graphs. An attributed graph is a graph where properties, encoded by means of attributes, are associated with each vertex. In such data, our objective is the discovery of subgraphs formed by several dense groups of vertices that are homogeneous with respect to the attributes. More precisely, we define the constraint-based extraction of collections of densely connected subgraphs whose vertices share enough attributes. To this aim, we propose two new classes of patterns along with sound and complete algorithms to compute them efficiently using constraint-based approaches. The first class of patterns, named Maximal Homogeneous Clique Set (MHCS), contains patterns satisfying constraints on the number of dense subgraphs, on the size of these subgraphs, and on the number of shared attributes. The second class of patterns, named Collection of Homogeneous k-clique Percolated components (CoHoP), is based on a relaxed notion of density in order to handle missing values. Both approaches are used for the analysis of scientific collaboration networks and protein-protein interaction networks. The extracted patterns exhibit structures useful in a decision support process. Indeed, in a scientific collaboration network, the analysis of such structures might give hints to propose new collaborations between researchers working on the same subjects. In a protein-protein interaction network, the analysis of the extracted patterns can be used to study the relationships between modules of proteins involved in similar biological situations.
A performance analysis on real and synthetic data, with respect to different attributed graph characteristics, shows that the proposed approaches scale well to large datasets.