Download or read book Individual and Collective Graph Mining written by Danai Koutra and published by Springer Nature. This book was released on 2022-06-01 with total page 197 pages. Available in PDF, EPUB and Kindle. Book excerpt: Graphs naturally represent information ranging from links between web pages, to communication in email networks, to connections between neurons in our brains. These graphs often span billions of nodes and interactions between them. Within this deluge of interconnected data, how can we find the most important structures and summarize them? How can we efficiently visualize them? How can we detect anomalies that indicate critical events, such as an attack on a computer system, disease formation in the human brain, or the fall of a company? This book presents scalable, principled discovery algorithms that combine globality with locality to make sense of one or more graphs. In addition to fast algorithmic methodologies, we also contribute graph-theoretical ideas and models, and real-world applications in two main areas: Individual Graph Mining: We show how to interpretably summarize a single graph by identifying its important graph structures. We complement summarization with inference, which leverages information about few entities (obtained via summarization or other methods) and the network structure to efficiently and effectively learn information about the unknown entities. Collective Graph Mining: We extend the idea of individual-graph summarization to time-evolving graphs, and show how to scalably discover temporal patterns. Apart from summarization, we claim that graph similarity is often the underlying problem in a host of applications where multiple graphs occur (e.g., temporal anomaly detection, discovery of behavioral patterns), and we present principled, scalable algorithms for aligning networks and measuring their similarity. 
The methods that we present in this book leverage techniques from diverse areas, such as matrix algebra, graph theory, optimization, information theory, machine learning, finance, and social science, to solve real-world problems. We present applications of our exploration algorithms to massive datasets, including a Web graph of 6.6 billion edges, a Twitter graph of 1.8 billion edges, brain graphs with up to 90 million edges, collaboration, peer-to-peer networks, browser logs, all spanning millions of users and interactions.
Download or read book Individual and Collective Graph Mining written by Danai Koutra and published by Morgan & Claypool Publishers. This book was released on 2017-10-26 with total page 208 pages. Available in PDF, EPUB and Kindle.
Download or read book Graph Mining written by Deepayan Chakrabarti and published by Morgan & Claypool Publishers. This book was released on 2012-10-01 with total page 209 pages. Available in PDF, EPUB and Kindle. Book excerpt: What does the Web look like? How can we find patterns, communities, and outliers in a social network? Which are the most central nodes in a network? These are the questions that motivate this work. Networks and graphs appear in many diverse settings, for example in social networks, computer-communication networks (intrusion detection, traffic management), protein-protein interaction networks in biology, document-text bipartite graphs in text retrieval, person-account graphs in financial fraud detection, and others. In this work, first we list several surprising patterns that real graphs tend to follow. Then we give a detailed list of generators that try to mirror these patterns. Generators are important, because they can help with "what if" scenarios, extrapolations, and anonymization. Then we provide a list of powerful tools for graph analysis, and specifically spectral methods (Singular Value Decomposition (SVD)), tensors, and case studies like the famous "PageRank" algorithm and the "HITS" algorithm for ranking web search results. Finally, we conclude with a survey of tools and observations from related fields like sociology, which provide complementary viewpoints. 
Table of Contents: Introduction / Patterns in Static Graphs / Patterns in Evolving Graphs / Patterns in Weighted Graphs / Discussion: The Structure of Specific Graphs / Discussion: Power Laws and Deviations / Summary of Patterns / Graph Generators / Preferential Attachment and Variants / Incorporating Geographical Information / The RMat / Graph Generation by Kronecker Multiplication / Summary and Practitioner's Guide / SVD, Random Walks, and Tensors / Tensors / Community Detection / Influence/Virus Propagation and Immunization / Case Studies / Social Networks / Other Related Work / Conclusions
Download or read book Exploiting the Power of Group Differences written by Guozhu Dong and published by Springer Nature. This book was released on 2022-05-31 with total page 135 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book presents pattern-based problem-solving methods for a variety of machine learning and data analysis problems. The methods are all based on techniques that exploit the power of group differences. They make use of group differences represented using emerging patterns (aka contrast patterns), which are patterns that match significantly different numbers of instances in different data groups. A large number of applications outside of the computing discipline are also included. Emerging patterns (EPs) are useful in many ways. EPs can be used as features, as simple classifiers, as subpopulation signatures/characterizations, and as triggering conditions for alerts. EPs can be used in gene ranking for complex diseases since they capture multi-factor interactions. The length of EPs can be used to detect anomalies, outliers, and novelties. Emerging/contrast pattern based methods for clustering analysis and outlier detection do not need distance metrics, avoiding pitfalls of the latter in exploratory analysis of high dimensional data. EP-based classifiers can achieve good accuracy even when the training datasets are tiny, making them useful for exploratory compound selection in drug design. EPs can serve as opportunities in opportunity-focused boosting and are useful for constructing powerful conditional ensembles. EP-based methods often produce interpretable models and results. In general, EPs are useful for classification, clustering, outlier detection, gene ranking for complex diseases, prediction model analysis and improvement, and so on. EPs are useful for many tasks because they represent group differences, which have extraordinary power. 
Moreover, EPs represent multi-factor interactions, whose effective handling is of vital importance and is a major challenge in many disciplines. Based on the results presented in this book, one can clearly say that patterns are useful, especially when they are linked to issues of interest. We believe that many effective ways to exploit group differences' power still remain to be discovered. Hopefully this book will inspire readers to discover such new ways, besides showing them existing ways, to solve various challenging problems.
Download or read book Managing and Mining Graph Data written by Charu C. Aggarwal and published by Springer Science & Business Media. This book was released on 2010-02-02 with total page 623 pages. Available in PDF, EPUB and Kindle. Book excerpt: Managing and Mining Graph Data is a comprehensive survey book in graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, chemical and biological data. The chapters are written by well known researchers in the field, and provide a broad perspective of the area. This is the first comprehensive survey book in the emerging topic of graph data processing. Managing and Mining Graph Data is designed for a varied audience composed of professors, researchers and practitioners in industry. This volume is also suitable as a reference book for advanced-level database students in computer science and engineering.
Download or read book Multidimensional Mining of Massive Text Data written by Chao Zhang and published by Springer Nature. This book was released on 2022-06-01 with total page 183 pages. Available in PDF, EPUB and Kindle. Book excerpt: Unstructured text, as one of the most important data forms, plays a crucial role in data-driven decision making in domains ranging from social networking and information retrieval to scientific research and healthcare informatics. In many emerging applications, people's information need from text data is becoming multidimensional—they demand useful insights along multiple aspects from a text corpus. However, acquiring such multidimensional knowledge from massive text data remains a challenging task. This book presents data mining techniques that turn unstructured text data into multidimensional knowledge. We investigate two core questions. (1) How does one identify task-relevant text data with declarative queries in multiple dimensions? (2) How does one distill knowledge from text data in a multidimensional space? To address the above questions, we develop a text cube framework. First, we develop a cube construction module that organizes unstructured data into a cube structure, by discovering latent multidimensional and multi-granular structure from the unstructured text corpus and allocating documents into the structure. Second, we develop a cube exploitation module that models multiple dimensions in the cube space, thereby distilling from user-selected data multidimensional knowledge. Together, these two modules constitute an integrated pipeline: leveraging the cube structure, users can perform multidimensional, multigranular data selection with declarative queries; and with cube exploitation algorithms, users can extract multidimensional patterns from the selected data for decision making. The proposed framework has two distinctive advantages when turning text data into multidimensional knowledge: flexibility and label-efficiency. 
First, it enables acquiring multidimensional knowledge flexibly, as the cube structure allows users to easily identify task-relevant data along multiple dimensions at varied granularities and further distill multidimensional knowledge. Second, the algorithms for cube construction and exploitation require little supervision; this makes the framework appealing for many applications where labeled data are expensive to obtain.
Download or read book Outlier Detection Techniques and Applications written by N. N. R. Ranga Suri and published by Springer. This book was released on 2019-01-10 with total page 227 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book, drawing on recent literature, highlights several methodologies for the detection of outliers and explains how to apply them to solve several interesting real-life problems. The detection of objects that deviate from the norm in a data set is an essential task in data mining due to its significance in many contemporary applications. More specifically, the detection of fraud in e-commerce transactions and discovering anomalies in network data have become prominent tasks, given recent developments in the field of information and communication technologies and security. Accordingly, the book sheds light on specific state-of-the-art algorithmic approaches such as the community-based analysis of networks and characterization of temporal outliers present in dynamic networks. It offers a valuable resource for young researchers working in data mining, helping them understand the technical depth of the outlier detection problem and devise innovative solutions to address related challenges.
Download or read book Mining Structures of Factual Knowledge from Text written by Xiang Ren and published by Springer Nature. This book was released on 2022-05-31 with total page 183 pages. Available in PDF, EPUB and Kindle. Book excerpt: The real-world data, though massive, is largely unstructured, in the form of natural-language text. It is challenging but highly desirable to mine structures from massive text data, without extensive human annotation and labeling. In this book, we investigate the principles and methodologies of mining structures of factual knowledge (e.g., entities and their relationships) from massive, unstructured text corpora. Departing from many existing structure extraction methods that have heavy reliance on human annotated data for model training, our effort-light approach leverages human-curated facts stored in external knowledge bases as distant supervision and exploits rich data redundancy in large text corpora for context understanding. This effort-light mining approach leads to a series of new principles and powerful methodologies for structuring text corpora, including (1) entity recognition, typing and synonym discovery, (2) entity relation extraction, and (3) open-domain attribute-value mining and information extraction. This book introduces this new research frontier and points out some promising research directions.
Download or read book Detecting Fake News on Social Media written by Kai Shu and published by Springer Nature. This book was released on 2022-05-31 with total page 121 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the past decade, social media has become increasingly popular for news consumption due to its easy access, fast dissemination, and low cost. However, social media also enables the wide propagation of "fake news," i.e., news with intentionally false information. Fake news on social media can have significant negative societal effects. Therefore, fake news detection on social media has recently become an emerging research area that is attracting tremendous attention. This book, from a data mining perspective, introduces the basic concepts and characteristics of fake news across disciplines, reviews representative fake news detection methods in a principled way, and illustrates challenging issues of fake news detection on social media. In particular, we discuss the value of news content and social context, and important extensions to handle early detection, weakly-supervised detection, and explainable detection. The concepts, algorithms, and methods described in this lecture can help harness the power of social media to build effective and intelligent fake news detection systems. This book is an accessible introduction to the study of detecting fake news on social media. It is essential reading for students, researchers, and practitioners to understand, manage, and excel in this area. This book is supported by additional materials, including lecture slides, the complete set of figures, key references, datasets, tools used in this book, and the source code of representative algorithms. The readers are encouraged to visit the book website for the latest information: http://dmml.asu.edu/dfn/
Download or read book Querying Graphs written by Angela Bonifati and published by Morgan & Claypool Publishers. This book was released on 2018-10-01 with total page 186 pages. Available in PDF, EPUB and Kindle. Book excerpt: Graph data modeling and querying arises in many practical application domains such as social and biological networks where the primary focus is on concepts and their relationships and the rich patterns in these complex webs of interconnectivity. In this book, we present a concise unified view on the basic challenges which arise over the complete life cycle of formulating and processing queries on graph databases. To that purpose, we present all major concepts relevant to this life cycle, formulated in terms of a common and unifying ground: the property graph data model—the pre-dominant data model adopted by modern graph database systems. We aim especially to give a coherent and in-depth perspective on current graph querying and an outlook for future developments. Our presentation is self-contained, covering the relevant topics from: graph data models, graph query languages and graph query specification, graph constraints, and graph query processing. We conclude by indicating major open research challenges towards the next generation of graph data management systems.
Download or read book Correlation Clustering written by Francesco Bonchi and published by Springer Nature. This book was released on 2022-05-31 with total page 133 pages. Available in PDF, EPUB and Kindle. Book excerpt: Given a set of objects and a pairwise similarity measure between them, the goal of correlation clustering is to partition the objects into a set of clusters to maximize the similarity of the objects within the same cluster and minimize the similarity of the objects in different clusters. In most of the variants of correlation clustering, the number of clusters is not a given parameter; instead, the optimal number of clusters is automatically determined. Correlation clustering is perhaps the most natural formulation of clustering: as it just needs a definition of similarity, its broad generality makes it applicable to a wide range of problems in different contexts, and, particularly, makes it naturally suitable to clustering structured objects for which feature vectors can be difficult to obtain. Despite its simplicity, generality, and wide applicability, correlation clustering has so far received much more attention from an algorithmic-theory perspective than from the data-mining community. The goal of this lecture is to show how correlation clustering can be a powerful addition to the toolkit of a data-mining researcher and practitioner, and to encourage further research in the area.
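The objective described in the Correlation Clustering excerpt can be illustrated with a minimal Python sketch. It assumes one common binary formulation, in which every pair of objects is labeled either similar or dissimilar and a clustering is scored by how many pairs it gets wrong. The function name `disagreements`, the toy objects, and the binary simplification are assumptions made here for illustration; they are not taken from the book.

```python
from itertools import combinations


def disagreements(labels, similar):
    """Count the pairs that a clustering places 'wrongly'.

    labels:  dict mapping each object to a cluster id
    similar: set of frozenset pairs judged similar; every other pair is
             treated as dissimilar (a simplifying assumption of this sketch)
    """
    cost = 0
    for a, b in combinations(sorted(labels), 2):
        same_cluster = labels[a] == labels[b]
        is_similar = frozenset((a, b)) in similar
        # A pair disagrees when similar objects are split apart
        # or dissimilar objects are grouped together.
        if same_cluster != is_similar:
            cost += 1
    return cost


# Toy data: {a, b, c} are mutually similar, and so are {d, e}.
similar = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]}

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
one_cluster = {name: 0 for name in two_groups}
singletons = {name: i for i, name in enumerate(two_groups)}

for name, labels in [("two_groups", two_groups),
                     ("one_cluster", one_cluster),
                     ("singletons", singletons)]:
    print(name, disagreements(labels, similar))
```

The number of clusters is never passed in: candidate partitions of any size are compared by the same cost, which is the sense in which the number of clusters is determined automatically.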
Download or read book Improving E Commerce Web Applications Through Business Intelligence Techniques written by Sreedhar, G. and published by IGI Global. This book was released on 2018-02-02 with total page 379 pages. Available in PDF, EPUB and Kindle. Book excerpt: As the Internet becomes increasingly interconnected with modern society, the transition to online business has developed into a prevalent form of commerce. While there exist various advantages and disadvantages to online business, it plays a major role in contemporary business methods. Improving E-Commerce Web Applications Through Business Intelligence Techniques provides emerging research on the core areas of e-commerce web applications. While highlighting the use of data mining, search engine optimization, and online marketing to advance online business, readers will learn how the role of online commerce is becoming more prevalent in modern business. This book is an important resource for vendors, website developers, online customers, and scholars seeking current research on the development and use of e-commerce.
Download or read book Tenth International Conference on Applications and Techniques in Cyber Intelligence ICATCI 2022 written by Jemal H. Abawajy and published by Springer Nature. This book was released on 2023-03-29 with total page 775 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book presents innovative ideas, cutting-edge findings, and novel techniques, methods, and applications in a broad range of cybersecurity and cyberthreat intelligence areas. As our society becomes smarter, there is a corresponding need to secure our cyberfuture. The book describes approaches and findings that are of interest to business professionals and governments seeking to secure our data and underpin infrastructures, as well as to individual users.
Download or read book Graph Representation Learning written by William L. Hamilton and published by Springer Nature. This book was released on 2022-06-01 with total page 141 pages. Available in PDF, EPUB and Kindle. Book excerpt: Graph-structured data is ubiquitous throughout the natural and social sciences, from telecommunication networks to quantum chemistry. Building relational inductive biases into deep learning architectures is crucial for creating systems that can learn, reason, and generalize from this kind of data. Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of convolutional neural networks to graph-structured data, and neural message-passing approaches inspired by belief propagation. These advances in graph representation learning have led to new state-of-the-art results in numerous domains, including chemical synthesis, 3D vision, recommender systems, question answering, and social network analysis. This book provides a synthesis and overview of graph representation learning. It begins with a discussion of the goals of graph representation learning as well as key methodological foundations in graph theory and network analysis. Following this, the book introduces and reviews methods for learning node embeddings, including random-walk-based methods and applications to knowledge graphs. It then provides a technical synthesis and introduction to the highly successful graph neural network (GNN) formalism, which has become a dominant and fast-growing paradigm for deep learning with graph data. The book concludes with a synthesis of recent advancements in deep generative models for graphs, a nascent but quickly growing subset of graph representation learning.
Download or read book Databases Theory and Applications written by Hua Wang and published by Springer. This book was released on 2014-07-04 with total page 251 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book constitutes the refereed proceedings of the 25th Australasian Database Conference, ADC 2014, held in Brisbane, QLD, Australia, in July 2014. The 15 full papers presented together with 6 short papers and 2 keynotes were carefully reviewed and selected from 38 submissions. A large variety of subjects are covered, including hot topics such as data warehousing; database integration; mobile databases; cloud, distributed, and parallel databases; high dimensional and temporal data; image/video retrieval and databases; database performance and tuning; privacy and security in databases; query processing and optimization; semi-structured data and XML; spatial data processing and management; stream and sensor data management; uncertain and probabilistic databases; web databases; graph databases; web service management; and social media data management.
Download or read book Mining of Massive Datasets written by Jure Leskovec and published by Cambridge University Press. This book was released on 2014-11-13 with total page 480 pages. Available in PDF, EPUB and Kindle. Book excerpt: Now in its second edition, this book focuses on practical algorithms for mining data from even the largest datasets.
Download or read book Practical Graph Mining with R written by Nagiza F. Samatova and published by CRC Press. This book was released on 2013-07-15 with total page 498 pages. Available in PDF, EPUB and Kindle. Book excerpt: Discover Novel and Insightful Knowledge from Data Represented as a Graph: Practical Graph Mining with R presents a "do-it-yourself" approach to extracting interesting patterns from graph data. It covers many basic and advanced techniques for the identification of anomalous or frequently recurring patterns in a graph, the discovery of groups or clusters of nodes that share common patterns of attributes and relationships, the extraction of patterns that distinguish one category of graphs from another, and the use of those patterns to predict the category of new graphs. Hands-On Application of Graph Data Mining: Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks. Develops Intuition through Easy-to-Follow Examples and Rigorous Mathematical Foundations: Every algorithm and example is accompanied with R code. This allows readers to see how the algorithmic techniques correspond to the process of graph data analysis and to use the graph mining techniques in practice. The text also gives a rigorous, formal explanation of the underlying mathematics of each technique. Makes Graph Mining Accessible to Various Levels of Expertise: Assuming no prior knowledge of mathematics or data mining, this self-contained book is accessible to students, researchers, and practitioners of graph data mining. 
It is suitable as a primary textbook for graph mining or as a supplement to a standard data mining course. It can also be used as a reference for researchers in computer, information, and computational science as well as a handy guide for data analytics practitioners.