Big Data Analysis and Artificial Intelligence for Medical Sciences
1st Edition, June 2024
432 Pages, Hardcover
Practical Approach Book
Carpentieri, Bruno (Editor)
An overview of the current state of the art in the use of artificial intelligence in medicine and biology
Big Data Analysis and Artificial Intelligence for Medical Sciences demonstrates the efforts made in computational biology and the medical sciences to design and implement robust, accurate, and efficient computer algorithms that model the behavior of complex biological systems much faster than traditional modeling approaches based solely on theory.
With chapters written by international experts in medical and biological research, Big Data Analysis and Artificial Intelligence for Medical Sciences includes information on:
* Studies conducted by the authors that draw on years of interdisciplinary collaboration with clinicians, computer scientists, mathematicians, and engineers
* Differences between traditional computational approaches to data processing (those of mathematical biology) and the experiment-data-theory-model-validation cycle
* Existing approaches to the use of big data in the healthcare industry, such as IBM's Watson Oncology, Microsoft's Hanover, and Google's DeepMind
* Difficulties in the field that have arisen as a result of technological changes, and potential future directions these changes may take
A timely resource on the integration of artificial intelligence in medicine and biology, Big Data Analysis and Artificial Intelligence for Medical Sciences is of great benefit not only to professional scholars but also to MSc and PhD students eager to explore advances in the field.
Preface xix
1 Introduction 1
Bruno Carpentieri and Paola Lecca
1.1 Disease Diagnoses 4
1.2 Drug Development 6
1.3 Personalized Medicine 6
1.4 Gene Editing 7
Author Biographies 9
References 9
2 Fuzzy Logic for Knowledge-Driven and Data-Driven Modeling in Biomedical Sciences 17
Paolo Cazzaniga, Simone Spolaor, Caro Fuchs, Marco S. Nobile and Daniela Besozzi
2.1 Introduction 17
2.2 Fuzzy Logic 18
2.2.1 Fuzzy Sets 19
2.2.2 Linguistic Variables 19
2.2.3 Fuzzy Rules 20
2.2.4 Fuzzy Inference Systems 21
2.2.5 Simpful 22
2.3 Knowledge-Driven Modeling 22
2.3.1 Dynamic Fuzzy Modeling 23
2.3.2 Application 1: Maximizing Cancer Cells Death with Minimal Drug Combinations 25
2.3.3 FuzzX: A Hybrid Mechanistic-Fuzzy Modeling and Simulation Engine 27
2.3.4 Application 2: Analyzing Oscillatory Regimes in Signal Transduction Pathways 29
2.4 Data-Driven Modeling 30
2.4.1 pyFUME: Automatic Generation of Fuzzy Inference Systems 31
2.4.2 Application 3: Assessing Tremor Severity in Neurological Disorders 33
2.5 Discussion 35
Author Biographies 36
References 37
3 Application of Machine Learning Algorithms to Diagnosis and Prognosis of Chronic Wounds 43
Mai Dabas and Amit Gefen
3.1 Background 43
3.1.1 Chronic Wounds 43
3.1.2 Implementation of AI Methodologies in Wound Care and Management 43
3.2 Clinical Visual Assessment of Wounds Supported by Artificial Intelligence 44
3.2.1 Predicting the Formation and Progress of Wounds Based on Electronic Health Records 46
3.2.2 Predicting the Formation and Evolution of Wounds Based on a Dynamic Evaluation of Wound Characteristics and Relevant Physiological Measures 48
3.2.3 Feasible Implementation of AI Solutions for Wound Care Delivery and Management 49
3.2.4 Types of Data Modalities for Diagnosis, Detection, and Prediction of Chronic Wounds 50
3.3 Smartphone and Tablet Use in Wound Diagnosis and Management 51
3.4 Conclusions 53
Acronyms 54
Author Biographies 55
References 55
4 Deep Learning Techniques for Gene Identification in Cancer Prevention 59
Eleonora Lusito
4.1 The Next-Generation Era of Cancer Investigation 59
4.1.1 Cancer at Its First Definitions 59
4.1.2 Attempts to Sequence Nucleic Acids Over the Years 60
4.1.3 From the First to the Third-Generation Sequencing 61
4.1.4 Applications of NGS in Clinical Oncology 62
4.2 Deep Learning Approaches for Genomic Variants Identification in Cancer 63
4.2.1 Cancer Causing Factors 63
4.2.2 The Contribution of Germline Alterations to Cancer 64
4.2.3 Somatic Mutations and Cancer 64
4.2.4 Calling Variants from Sequence Data 65
4.2.5 Computational Approaches for Variant Discovery 65
4.2.6 Convolutional Neural Networks (CNNs): Basic Principles 66
4.2.7 Application of CNNs to Variant Calling 67
4.2.8 A Typical CNN Architecture for Variant Calling 68
4.2.9 The Activation Function 69
4.2.10 Dropout and L1-L2 Regularization 71
4.2.11 Advantages of Deep Learning Over the Existing Techniques 72
4.2.12 Residual Neural Networks (ResNet)-Inspired CNN in Genomic Variants Detection 73
4.3 Deep Learning in Cancer Transcriptomics 74
4.3.1 Gene Expression and Cancer 74
4.3.2 Analytical Approaches to Deal with Gene Expression Data 76
4.3.3 Stacked Denoising Autoencoders (SDAEs) for Dimensionality Reduction 76
4.3.4 The Variational Autoencoder (VAE) 79
4.3.5 VAEs to Integrate Gene Expression and Methylation Data 81
4.3.5.1 DNA Methylation: the Epigenetic Regulation of Gene Expression 81
4.3.5.2 Preprocessing Input Data of Different Sources 82
4.3.5.3 A VAE Architecture for Multimodal Data 82
4.4 Conclusions 84
Acronyms 86
Author Biographies 87
References 87
5 Deep Learning for Network Biology 97
Eleonora Lusito
5.1 Types of Interactions Between Genes and Their Products 97
5.2 Deep Learning Methods with Graph-Input Data 99
5.2.1 Graph Embedding 99
5.2.1.1 Random Walk-Based Graph Embedding 100
5.2.1.2 Proximity-Based Graph Embedding 101
5.2.2 Graph Convolutional Networks (GCNs) 102
5.3 Applications of GNNs to Infer Biological and Pharmacological Interactions 104
5.3.1 Proteomics 104
5.3.2 Drug Development and Repurposing 104
5.3.3 Drug-Drug Interaction Prediction 105
5.3.4 Disease Classification and Outcome Prediction 106
Author Biography 107
References 107
6 Deep Learning-Based Reduced Order Models for Cardiac Electrophysiology 115
Stefania Fresca, Luca Dedè and Andrea Manzoni
6.1 Overview of Cardiac Physiology 115
6.1.1 Atrial Tachycardia and Atrial Fibrillation 117
6.1.2 Mathematical Models for Cardiac Electrophysiology 118
6.2 Reduced Order Modeling 121
6.2.1 Problem Formulation 123
6.2.2 Nonlinear Dimensionality Reduction 123
6.3 Decreasing Complexity in Cardiac Electrophysiology 124
6.3.1 POD-Enhanced Deep Learning-Based ROMs 125
6.3.1.1 POD-DL-ROM Architecture and Algorithms 128
6.4 Numerical Results 130
6.4.1 Test 1: Two-Dimensional Slab with Figure of Eight Reentry 131
6.4.2 Test 2: Three-Dimensional Left Ventricle Geometry 133
6.4.3 Test 3: Left Atrium Surface by Varying the Stimuli Location 135
6.4.4 Test 4: Reentry Breakup 137
6.5 Conclusions 139
Author Biographies 140
References 140
7 The Potential of Microbiome Big Data in Precision Medicine: Predicting Outcomes Through Machine Learning 149
Silvia Turroni and Simone Rampelli
7.1 The Gut Microbiome: A Major Player in Human Physiology and Pathophysiology 149
7.2 Machine Learning Applied to Microbiome Research 151
7.2.1 Case Study 1: Obesity 151
7.2.2 Case Study 2: Cancer 153
7.2.3 Case Study 3: Personalized Nutrition 154
7.2.4 Case Study 4: Exploiting the Meta-Community Theory for New Machine Learning Approaches 155
7.3 Conclusions and Perspectives 155
Author Biographies 156
References 156
8 Predictive Patient Stratification Using Artificial Intelligence and Machine Learning 161
Thanh-Phuong Nguyen, Thanh T. Giang, Quang T. Pham and Dang H. Tran
8.1 Overview of Artificial Intelligence for Patient Stratification 161
8.2 A RPCA and MKL Combination Model for Patient Stratification 164
8.2.1 Robust Principal Component Analysis 164
8.2.2 Dimensionality Reduction and Features Extraction Based on RPCA 166
8.2.3 Predictive Model Construction Based on Multiple Kernel Learning 168
8.2.4 Materials 169
8.2.4.1 Cancer Patient Datasets 169
8.2.4.2 Alzheimer Disease Patient Datasets 170
8.2.5 Experiment Design 171
8.2.5.1 Experiment of Stratifying Cancer Patients 171
8.2.5.2 Experiment of Stratifying Alzheimer Disease Patients 171
8.2.6 Results and Discussions 171
8.2.6.1 Application of Stratifying Cancer Patients 172
8.2.7 Application of Stratifying Alzheimer Disease Patients 174
8.3 Conclusion 175
Author Biographies 175
References 176
9 Hybrid Data-Driven and Numerical Modeling of Articular Cartilage 181
Seyed Shayan Sajjadinia, Bruno Carpentieri and Gerhard A. Holzapfel
9.1 Introduction 181
9.2 Knee and Cartilage 182
9.2.1 Main Joint Substructures 182
9.2.2 Load-Bearing Cartilage Phases 183
9.3 Physics-Based Modeling 185
9.3.1 Numerical Modeling 185
9.3.2 Constitutive Modeling 188
9.4 AI-Enhanced Modeling 191
9.4.1 Deep Learning 191
9.4.2 Surrogate Modeling 192
9.5 Discussion and Conclusion 194
Author Biographies 194
References 195
10 A Hybrid of Differential Evolution and Minimization of Metabolic Adjustment for Succinic and Ethanol Production 205
Zhang N. Hor, Mohd S. Mohamad, Yee W. Choon, Muhammad A. Remli and Hairudin A. Majid
10.1 Introduction 205
10.2 Method 206
10.2.1 Differential Evolution (DE) 206
10.2.2 Mutation 206
10.2.3 Crossover 207
10.2.4 Selection 208
10.2.5 Minimization of Metabolic Adjustment 208
10.2.6 A Hybrid of Differential Evolution and Minimization of Metabolic Adjustment 209
10.3 Experiments and Discussion 209
10.3.1 Dataset 209
10.3.2 Parameter Setting 209
10.3.3 Experimental Results 210
10.3.4 Comparative Analysis 214
10.4 Conclusion 214
Acknowledgment 215
Author Biographies 215
References 216
11 Analysis Pipelines and a Platform Solution for Next-Generation Sequencing Data 219
Víctor Duarte, Alesandro Gómez and Juan M. Corchado
11.1 Introduction 219
11.2 NGS Data Analysis Pipeline and State of the Art Tools 220
11.2.1 Quality Assessment 220
11.2.2 Alignment 221
11.2.3 Post-Alignment and Pre-Variant Calling Processing 222
11.2.4 Variant Calling 223
11.2.5 Variant Annotation 228
11.3 Nanopore Sequencing Data Analysis 229
11.3.1 Base-Calling 230
11.3.2 Quality Control and Preprocessing 230
11.3.3 Error Correction 231
11.3.4 Alignment 231
11.3.5 Variant Calling 231
11.4 Machine Learning Approaches in Variant Calling 232
11.5 Next-Generation Sequencing Data Analysis Frameworks 233
11.6 DeepNGS 235
11.6.1 Pipeline 235
11.6.2 DeepNGS Main Features 236
11.6.2.1 Power and Speed 236
11.6.2.2 Optimized Workflow 236
11.6.2.3 Intuitive Design and Interactive Charts 237
11.6.2.4 Extended Information 237
11.6.2.5 Artificial Intelligence and Machine Learning 237
11.7 Conclusions 240
Author Biographies 241
References 241
12 Artificial Intelligence: From Drug Discovery to Clinical Pharmacology 253
Paola Lecca
12.1 Artificial Intelligence and the Druggable Genome 253
12.2 Feature-Based Methods 257
12.3 Similarity/Distance-Based Methods 257
12.4 Matrix Factorization 258
12.4.1 Causal K-Nearest-Neighborhood 261
12.4.2 Causal Random Forests 263
12.4.3 Causal Support Vector Machine 264
12.5 Opportunities and Challenges 265
Author Biography 266
References 266
13 Using AI to Steer Brain Regeneration: The Enhanced Regenerative Medicine Paradigm 273
Gabriella Panuccio, Narayan P. Subramaniyam, Angel Canal-Alonso, Juan M. Corchado and Carlo Ierna
13.1 The Challenge of Brain Regeneration 273
13.2 The Enhanced Regenerative Medicine Paradigm 274
13.3 The Case of Epilepsy 276
13.4 AI to Understand Epilepsy 279
13.4.1 Commonly Applied Learning Algorithms for Basic Neuroscience and Clinical Application in Epilepsy 282
13.4.2 Seizure and Epilepsy Type Classification 284
13.4.3 Seizure Onset Zone Localization 284
13.4.4 Seizure Detection 285
13.4.5 Seizure Prediction 285
13.4.6 Signal Feature Extraction for Seizure Detection and Prediction 288
13.4.7 Network Interactions and Evolving Dynamics in the Epileptic Brain: The Eye of AI 290
13.5 Artificial Intelligence to Guide Graft-Host Dynamics in Epilepsy 292
13.6 Challenges and Limitations 294
13.6.1 From AI to Explainable AI 295
13.7 A Philosophical Perspective on Enhanced Brain Regeneration 297
Acknowledgments 299
Acronyms 299
Author Biographies 300
References 300
14 Towards Better Ways to Assess Predictive Computing in Medicine: On Reliability, Robustness, and Utility 309
Federico Cabitza and Andrea Campagner
14.1 Introduction 309
14.2 On Ground Truth Reliability 311
14.2.1 Weighted Reliability 314
14.2.2 Example Application 316
14.3 On Utility Metrics to Evaluate ML Performance 318
14.3.1 Weighted Utility 318
14.3.2 Example Application 321
14.4 On the Replicability of Clinical ML Models 322
14.4.1 Dataset Size 323
14.4.2 Dataset Similarity 325
14.4.3 Meta-Validation Procedure 325
14.4.4 Example Application 328
14.5 Conclusions and Future Outlook 331
Author Biographies 332
References 333
15 Legal Aspects of AI in the Biomedical Field. The Role of Interpretable Models 339
Chiara Gallese
15.1 Introduction 339
15.2 Data Protection 340
15.3 Transparency Principle 343
15.3.1 Right of Explanation 343
15.3.2 Right of Information 348
15.3.3 Informed Consent Requirements 349
15.4 Accountability Principle 350
15.5 Non-discrimination Principle and Biases 351
15.6 High-Risk Systems and Human Oversight 353
15.7 Additional Requirements of the AI Act Proposal 354
15.8 Interpretability as a Standard 355
15.9 Conclusion 358
Author Biography 358
References 359
16 The Long Path to Usable AI 363
Barbara Di Camillo, Enrico Longato, Erica Tavazzi and Martina Vettoretti
16.1 Promises and Challenges of Artificial Intelligence in Healthcare 363
16.2 Deployment of Usable Artificial Intelligence Models 367
16.2.1 Case Study: Predicting the Cardiovascular Complications of Diabetes via a Deep Learning Approach 368
16.3 Potential and Challenges of Employing Longitudinal Clinical Data in AI 375
16.3.1 Case Study: Modeling the Progression of Amyotrophic Lateral Sclerosis Through a Dynamic Bayesian Network 378
16.3.2 Case Study: Investigating Amyotrophic Lateral Sclerosis Progression Trajectories Leveraging Process Mining 381
16.4 Enhancing the Applicability of AI Predictive Models by a Combined Model Approach: A Case Study on T2D Onset Prediction 386
16.4.1 The Problem of Type 2 Diabetes Prediction 386
16.4.2 Potential Applications of T2D Predictive Models 387
16.4.3 Barriers to the Adoption of T2D Predictive Models 387
16.4.4 Addressing Practical Issues by Combining Multiple T2D Predictive Models 388
16.4.5 The Combined Model Achieves High Prediction Performance with High Coverage 390
16.5 Conclusions and Future Outlook 391
Author Biography 392
References 393
Index 399
Paola Lecca is Assistant Professor in the Faculty of Engineering at the Free University of Bozen-Bolzano, Bozen-Bolzano, Italy.