2024 | Book

Association Analysis Techniques and Applications in Bioinformatics

About this book

Advances in experimental technologies have given rise to tremendous amounts of biological data. This not only offers valuable sources of data to help understand biological evolution and functional mechanisms, but also poses challenges for accurate and effective data analysis.

This book offers an essential introduction to the theoretical and practical aspects of association analysis, including data pre-processing, data mining methods and algorithms, and tools that are widely applied in computational biology. It covers significant recent advances in the field, both foundational and application-oriented, helping readers understand the basic principles and emerging techniques used to discover interesting association patterns in diverse and heterogeneous biological data, such as structure-function correlations and complex networks of gene/protein regulation.

The main results and approaches are described in an easy-to-follow way and accompanied by sufficient references and suggestions for future research. This carefully edited monograph is intended to provide investigators in the fields of data mining, machine learning, artificial intelligence, and bioinformatics with an in-depth guide to the role of association analysis in computational biology. It is also useful as a general source of information on association analysis, and as a companion course book and self-study material for graduate students and researchers in both computer science and bioinformatics.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
Bioinformatics is viewed as the product of combining traditional biology and modern information technology. Unlike traditional statistical methods, bioinformatics represents a new and evolving cross-cutting scientific field. It is a comprehensive interdisciplinary study that uses mathematics and computer technology to store and analyze biological data, and uses computational methods to mine, process, and summarize large amounts of biological data, aiming to simulate and solve complex biological problems and to discover the connections and patterns hidden in these large datasets.
Qingfeng Chen
Chapter 2. Association Analysis: Basic Concepts and Algorithms
Abstract
In 1993, Agrawal et al. pioneered the theory of mining association rules from large databases, which is used to identify interesting links between items in market basket transaction data. Market basket analysis is a typical application of association analysis.
Qingfeng Chen
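As an illustration of the market basket setting described in the Chapter 2 abstract, the following minimal sketch computes the two standard measures of an association rule, support and confidence. The transactions and the helper names (`count`, `support`, `confidence`) are made up for this example and are not taken from the book.

```python
# Toy market basket transactions (hypothetical data for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the whole itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing the antecedent that also contain the consequent."""
    n_antecedent = count(antecedent, transactions)
    if n_antecedent == 0:
        return 0.0  # the rule never applies in this dataset
    n_both = count(set(antecedent) | set(consequent), transactions)
    return n_both / n_antecedent


if __name__ == "__main__":
    # Rule {diapers} -> {beer}
    print(support({"diapers", "beer"}, TRANSACTIONS))
    print(confidence({"diapers"}, {"beer"}, TRANSACTIONS))
```

In this toy dataset, three of the five baskets contain both diapers and beer, and three of the four baskets with diapers also contain beer, so the rule {diapers} -> {beer} has support 3/5 and confidence 3/4.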
Chapter 3. Complex Networks: Basic Concepts, Construction, and Learning Methods
Abstract
This chapter introduces an approach to modeling called complex networks. A complex network is an abstract model for understanding real-world complex systems. It represents the entities in a complex system as nodes and the relationships between entities as connections.
Qingfeng Chen
Chapter 4. Computational Linguistics and Biological Sequences in Artificial Intelligence
Abstract
Researchers generally regard nucleic acid as a language rich in information. Nucleic acid language can describe the structure and processes of life, and it shows as much diversity as natural language while sharing many characteristics with it. Many existing studies therefore apply results and methods from language theory to the study of biological sequences, and computational linguistics has brought many new breakthroughs to this area. Association analysis is a data mining technique for discovering frequent patterns and association rules in a dataset. In computational linguistics, it can uncover association rules in text data, helping us better understand the semantic and grammatical rules of natural languages. In biological sequence analysis, it can identify association rules in the genome and proteome to reveal interactions and functional relationships between genes or proteins. These results help us interpret the data, draw conclusions, and promote further development in both fields. Applying association analysis to the study of biological sequences from a computational linguistics perspective is therefore a promising research field.
Qingfeng Chen
Chapter 5. Non-Coding RNA Function and Structure
Abstract
In eukaryotic genomes, approximately 90% of genes are transcribed genes, of which only 1–2% encode proteins, while the majority of transcribed genes are non-coding RNAs. Non-coding RNAs represent one of the most rapidly advancing frontiers in the field of life sciences. They continuously enhance our understanding of the essence of life, lead the deepening of life sciences, and are poised to make significant breakthroughs in modern life science and technology, providing novel ideas and techniques for genetic breeding; intervention, prevention, and treatment of major human diseases; as well as drug research.
Qingfeng Chen
Chapter 6. The Associations Between Non-coding RNA and Disease
Abstract
Association analysis is used to discover intriguing associations hidden in datasets. In this chapter, we present the application of association analysis in bioinformatics to discover potential non-coding RNA-disease associations.
Qingfeng Chen
Chapter 7. Protein Structure Prediction
Abstract
Proteins are essential components of living organisms. They are composed of a linear sequence of amino acids. However, proteins only exhibit activity and perform their specific biological functions when they fold into a particular spatial structure. Correlation analysis techniques can be used to study the interactions between proteins. However, in order to truly understand the function of a protein, it is necessary to have a clear understanding of its accurate spatial structure. The spatial structure of a protein allows us to comprehend how it carries out its corresponding function, which is crucial in the fields of medicine, pharmacology, and biology. Additionally, understanding the structures of known proteins can provide a reliable theoretical basis for designing new proteins. Therefore, this chapter primarily focuses on research related to protein structure prediction and introduces the topic of protein hotspot prediction.
Qingfeng Chen
Chapter 8. Gene Sequence Assembly and Application
Abstract
Sequencing and gene sequence assembly are the first steps in genome-wide association studies (GWAS): gene fragments obtained by sequencing technologies are analyzed by algorithms and finally assembled into gene chains ready for analysis. Some advanced association analysis technologies can also obtain statistics on the insertion and variation patterns of gene chains during this process.
Qingfeng Chen
Chapter 9. Biological Pathway Identification
Abstract
Genome-wide association studies (GWAS) have become an essential method for revealing the genetic mechanisms of complex diseases. In the past decade, research on GWAS methods has gradually advanced from the initial single-locus, single-trait analysis to multi-locus, multi-trait association analysis, but the results can still explain only a small portion of the genetic power. Therefore, the methodological study of GWAS is of great importance.
Qingfeng Chen
Chapter 10. Fusion and Radiomics Study of Multimodal Medical Images
Abstract
With the tremendous technological advances in the medical field, various medical imaging devices have emerged to produce images that enhance clinical medical diagnosis, and different medical imaging modalities are now widely used in clinical applications of diseases.
Qingfeng Chen
Chapter 11. Bioinformatics Research Based on Evolutionary Computation
Abstract
Evolutionary computation-based association analysis has achieved significant progress in the field of data mining. This research approach leverages the strengths of evolutionary computation in global search and optimization to improve the efficiency and accuracy of association rule mining. The key to evolutionary computation methods lies in transforming association analysis problems into optimization problems, so that the best association rules can be sought within the space of all candidate rules. To achieve this, researchers need to define fitness functions that evaluate the quality of association rules, for example using support and confidence measures. Evolutionary computation algorithms also require settings for population initialization, selection, mutation, and other operations to explore the search space effectively.
Qingfeng Chen
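The idea in the Chapter 11 abstract, treating rule mining as an optimization problem with a fitness function, can be sketched as follows. This is a toy illustration under stated assumptions, not the method from the book: the transactions are made up, the fitness (support times confidence) is one arbitrary choice among many, and the search uses only truncation selection and mutation, without crossover. All names (`fitness`, `random_rule`, `mutate`, `evolve`) are hypothetical.

```python
import random

# Toy transactions (hypothetical data for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
ITEMS = sorted(set().union(*TRANSACTIONS))


def fitness(rule, transactions=TRANSACTIONS):
    """Assumed fitness: support(A and c) * confidence(A -> c), a value in [0, 1]."""
    antecedent, consequent = rule
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_ac = sum(1 for t in transactions if antecedent <= t and consequent in t)
    if n_a == 0:
        return 0.0
    return (n_ac / len(transactions)) * (n_ac / n_a)


def random_rule(rng):
    """Rule = (frozenset antecedent of 1-2 items, single consequent item)."""
    consequent = rng.choice(ITEMS)
    others = [i for i in ITEMS if i != consequent]
    return frozenset(rng.sample(others, rng.randint(1, 2))), consequent


def mutate(rule, rng):
    """Swap one antecedent item for an unused item, keeping the rule valid."""
    antecedent, consequent = rule
    candidates = [i for i in ITEMS if i != consequent and i not in antecedent]
    if not candidates:
        return rule
    dropped = rng.choice(sorted(antecedent))
    return (antecedent - {dropped}) | {rng.choice(candidates)}, consequent


def evolve(generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    population = [random_rule(rng) for _ in range(pop_size)]  # initialization
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # truncation selection, keeps the best
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

Because the best rules always survive into the next generation, the top fitness in the population never decreases. A realistic miner would add crossover, variable-length rules, and minimum support and confidence thresholds.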
Chapter 12. Relationship Prediction Based on Complex Network
Abstract
Complex networks are network structures composed of a large number of nodes and complex relationships between nodes. Various complex network topologies exist in fields such as biological sciences, social sciences, and information sciences. Nodes represent various entities such as social individuals, network users, and network sites, while the links between nodes represent communication or relationships between the objects represented by the nodes.
Qingfeng Chen
Chapter 13. Summary and Prospect
Abstract
A popular saying holds that “the twenty-first century is the century of life sciences,” yet more than 20 years into the twenty-first century, we have yet to see a breakthrough in the biological sciences as revolutionary as the steam engine or electricity. Computer science, on the other hand, has developed rapidly in the last decade, with new technologies emerging and combining with other disciplines to form cross-disciplinary fields. Bioinformatics is a product of this background, and AlphaFold, which appeared in 2018, is one of its more groundbreaking results. From this we can foresee that biology can hardly reach the height of the “century of life sciences” through research within its own discipline alone; to achieve this ambitious goal, it must build on its intersection with other disciplines. Bioinformatics has such potential, and correlation analysis is an important technique for solving its core problems. Against this background, this book provides a detailed introduction to the application of correlation analysis and other techniques in bioinformatics.
Qingfeng Chen
Backmatter
Metadata
Title
Association Analysis Techniques and Applications in Bioinformatics
Written by
Qingfeng Chen
Copyright year
2024
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-9982-51-6
Print ISBN
978-981-9982-50-9
DOI
https://doi.org/10.1007/978-981-99-8251-6
