Machine Learning: Science and Technology

ISSN: 2632-2153

OPEN ACCESS

Machine Learning: Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights.

Median submission to first decision before peer review: 3 days

Median submission to first decision after peer review: 49 days

Impact factor: 6.8

CiteScore: 7.1

The following article is Open access

Chemformer: a pre-trained transformer for computational chemistry

Ross Irwin et al 2022 Mach. Learn.: Sci. Technol. 3 015022

Transformer models coupled with the simplified molecular-input line-entry system (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present Chemformer, a Transformer-based model that can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-training can improve performance and significantly speed up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we publish state-of-the-art results for top-1 accuracy. We also improve on existing approaches for a molecular optimisation task and show that Chemformer can optimise on multiple discriminative tasks simultaneously. Models, datasets and code will be made available after publication.

https://doi.org/10.1088/2632-2153/ac3ffb

Ten years of generative adversarial nets (GANs): a survey of the state-of-the-art

Tanujit Chakraborty et al 2024 Mach. Learn.: Sci. Technol. 5 011001

Generative adversarial networks (GANs) have rapidly emerged as powerful tools for generating realistic and diverse data across various domains, including computer vision and other applied areas, since their inception in 2014. Consisting of a discriminative network and a generative network engaged in a minimax game, GANs have revolutionized the field of generative modeling. In February 2018, GAN secured the leading spot on the 'Top Ten Global Breakthrough Technologies List' issued by the Massachusetts Science and Technology Review. Over the years, numerous advancements have been proposed, leading to a rich array of GAN variants, such as conditional GAN, Wasserstein GAN, cycle-consistent GAN, and StyleGAN, among many others. This survey aims to provide a general overview of GANs, summarizing the latent architecture, validation metrics, and application areas of the most widely recognized variants. We also delve into recent theoretical developments, exploring the profound connection between the adversarial principle underlying GAN and Jensen–Shannon divergence while discussing the optimality characteristics of the GAN framework. The efficiency of GAN variants and their model architectures will be evaluated along with training obstacles as well as training solutions. In addition, a detailed discussion will be provided, examining the integration of GANs with newly developed deep learning frameworks such as transformers, physics-informed neural networks, large language models, and diffusion models. Finally, we reveal several issues as well as future research outlines in this field.

https://doi.org/10.1088/2632-2153/ad1f77
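
As a worked reference for the minimax game and the Jensen-Shannon connection mentioned in the abstract above, the standard GAN objective can be written out explicitly. The symbols ($D$ for the discriminator, $G$ for the generator, $p_{\text{data}}$, $p_g$ and $p_z$ for the data, generator and noise distributions) follow the usual convention in the GAN literature and are assumptions here, since the abstract itself introduces no notation.

```latex
% Minimax value function of the original GAN formulation
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Optimal discriminator for a fixed generator
D^{*}_{G}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

% Substituting D^{*}_{G} exposes the Jensen-Shannon divergence
V(D^{*}_{G}, G) = -\log 4 + 2\,\mathrm{JSD}\bigl(p_{\text{data}} \,\|\, p_g\bigr)
```

Because the Jensen-Shannon divergence is non-negative and vanishes only when the two distributions coincide, the last line attains its minimum, $-\log 4$, exactly when $p_g = p_{\text{data}}$.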

The MLIP package: moment tensor potentials with MPI and active learning

Ivan S Novikov et al 2021 Mach. Learn.: Sci. Technol. 2 025002

The subject of this paper is the technology (the 'how') of constructing machine-learning interatomic potentials, rather than science (the 'what' and 'why') of atomistic simulations using machine-learning potentials. Namely, we illustrate how to construct moment tensor potentials using active learning as implemented in the MLIP package, focusing on the efficient ways to automatically sample configurations for the training set, how expanding the training set changes the error of predictions, how to set up ab initio calculations in a cost-effective manner, etc. The MLIP package (short for Machine-Learning Interatomic Potentials) is available at https://mlip.skoltech.ru/download/.

https://doi.org/10.1088/2632-2153/abc9fe

Prediction of chemical reaction yields using deep learning

Philippe Schwaller et al 2021 Mach. Learn.: Sci. Technol. 2 015016

Artificial intelligence is driving one of the most important revolutions in organic chemistry. Multiple platforms, including tools for reaction prediction and synthesis planning based on machine learning, have successfully become part of organic chemists' daily laboratory work, assisting in domain-specific synthetic problems. Unlike reaction prediction and retrosynthetic models, the prediction of reaction yields has received less attention, in spite of the enormous potential of accurately predicting reaction conversion rates. Reaction yield models, which describe the percentage of the reactants converted to the desired products, could guide chemists and help them select high-yielding reactions and score synthesis routes, reducing the number of attempts. So far, yield predictions have been predominantly performed for high-throughput experiments using a categorical (one-hot) encoding of reactants, concatenated molecular fingerprints, or computed chemical descriptors. Here, we extend the application of natural language processing architectures to predict reaction properties given a text-based representation of the reaction, using an encoder transformer model combined with a regression layer. We demonstrate outstanding prediction performance on two high-throughput experiment reaction sets. An analysis of the yields reported in the open-source USPTO data set shows that their distribution differs depending on the mass scale, limiting the applicability of the data set to reaction yield prediction.

https://doi.org/10.1088/2632-2153/abc81d

Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation

Mario Krenn et al 2020 Mach. Learn.: Sci. Technol. 1 045024

The discovery of novel materials and functional molecules can help to solve some of society's most urgent challenges, ranging from efficient energy harvesting and storage to uncovering novel pharmaceutical drug candidates. Traditionally, matter engineering (generally denoted as inverse design) was based heavily on human intuition and high-throughput virtual screening. The last few years have seen the emergence of significant interest in computer-inspired designs based on evolutionary or deep learning methods. The major challenge here is that the standard string-based molecular representation, SMILES, shows substantial weaknesses in that task because large fractions of strings do not correspond to valid molecules. Here, we solve this problem at a fundamental level and introduce SELFIES (SELF-referencIng Embedded Strings), a string-based representation of molecules which is 100% robust. Every SELFIES string corresponds to a valid molecule, and SELFIES can represent every molecule. SELFIES can be directly applied in arbitrary machine learning models without the adaptation of the models; each of the generated molecule candidates is valid. In our experiments, the model's internal memory stores two orders of magnitude more diverse molecules than a similar test with SMILES. Furthermore, as all molecules are valid, it allows for explanation and interpretation of the internal working of the generative models.

https://doi.org/10.1088/2632-2153/aba947
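
To make the SMILES weakness described in the abstract above concrete, the following is a minimal sketch of why arbitrary strings often fail to be valid SMILES. It checks only two necessary syntax rules (balanced branch parentheses and paired ring-closure digits) and is not a full SMILES validator; the function name `smiles_syntax_ok` is hypothetical, two-digit `%nn` ring labels are deliberately not handled, and no chemistry (valence, aromaticity) is checked. It does not use or reproduce the SELFIES package.

```python
def smiles_syntax_ok(smiles: str) -> bool:
    """Check two necessary (not sufficient) SMILES syntax rules.

    1. Branch parentheses must be balanced.
    2. Every single-digit ring-closure label must be opened and closed.

    Digits inside bracket atoms such as [13CH4] are isotope, hydrogen-count
    or charge digits, so they are skipped. Two-digit %nn ring labels are
    NOT handled (simplifying assumption of this sketch).
    """
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring-closure digits that are currently open
    in_bracket = False   # True while scanning inside [...]
    for ch in smiles:
        if in_bracket:
            in_bracket = ch != "]"
            continue
        if ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # closing a branch that was never opened
                return False
        elif ch.isdigit():
            open_rings ^= {ch}   # toggle: first occurrence opens, second closes
    return depth == 0 and not open_rings and not in_bracket


if __name__ == "__main__":
    for s in ["c1ccccc1", "CC(=O)O", "[13CH4]", "C1CC", "CC(C", "CC)C"]:
        print(f"{s!r}: {smiles_syntax_ok(s)}")
```

A generative model that emits SMILES character by character has to learn these long-range pairing constraints on its own, which is one reason a large fraction of its outputs can be invalid. A representation in which every string decodes to a molecule removes that failure mode by construction.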

Closed-loop Koopman operator approximation

Steven Dahdah and James Richard Forbes 2024 Mach. Learn.: Sci. Technol. 5 025038

This paper proposes a method to identify a Koopman model of a feedback-controlled system given a known controller. The Koopman operator allows a nonlinear system to be rewritten as an infinite-dimensional linear system by viewing it in terms of an infinite set of lifting functions. A finite-dimensional approximation of the Koopman operator can be identified from data by choosing a finite subset of lifting functions and solving a regression problem in the lifted space. Existing methods are designed to identify open-loop systems. However, it is impractical or impossible to run experiments on some systems, such as unstable systems, in an open-loop fashion. The proposed method leverages the linearity of the Koopman operator, along with knowledge of the controller and the structure of the closed-loop (CL) system, to simultaneously identify the CL and plant systems. The advantages of the proposed CL Koopman operator approximation method are demonstrated in simulation using a Duffing oscillator and experimentally using a rotary inverted pendulum system. An open-source software implementation of the proposed method is publicly available, along with the experimental dataset generated for this paper.

https://doi.org/10.1088/2632-2153/ad45b0
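
The abstract above describes the general recipe of choosing a finite set of lifting functions and solving a regression problem in the lifted space. The sketch below illustrates only that open-loop baseline on a toy system; it is not the closed-loop method proposed in the paper and does not use the authors' software. The toy dynamics, the parameters `LAM` and `MU`, the lifting functions and all function names are assumptions chosen so that the lifted system is exactly linear.

```python
"""Minimal Koopman regression sketch in pure Python (no dependencies)."""

LAM, MU = 0.9, 0.5  # assumed toy parameters


def step(x):
    """One step of a hypothetical nonlinear system used for illustration."""
    x1, x2 = x
    return (LAM * x1, MU * x2 + (LAM**2 - MU) * x1**2)


def lift(x):
    """Lifting functions psi(x) = (x1, x2, x1^2)."""
    x1, x2 = x
    return [x1, x2, x1 * x1]


def solve(a, b):
    """Solve a @ z = b (matrix right-hand side) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: lifted data are not rich enough")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def fit_koopman(snapshots):
    """Least-squares matrix U such that lift(x_next) ~= U @ lift(x)."""
    n = len(lift(snapshots[0][0]))
    g = [[0.0] * n for _ in range(n)]   # sum of psi(x) psi(x)^T
    at = [[0.0] * n for _ in range(n)]  # sum of psi(x) psi(x_next)^T
    for x, x_next in snapshots:
        px, py = lift(x), lift(x_next)
        for i in range(n):
            for j in range(n):
                g[i][j] += px[i] * px[j]
                at[i][j] += px[i] * py[j]
    ut = solve(g, at)                   # normal equations: G U^T = A^T
    return [[ut[j][i] for j in range(n)] for i in range(n)]


def make_snapshots():
    """Pairs (x_k, x_{k+1}) sampled from a small grid of initial states."""
    grid = [(x1, x2) for x1 in (-1.0, -0.5, 0.5, 1.0) for x2 in (-1.0, 0.0, 1.0)]
    return [(x, step(x)) for x in grid]


if __name__ == "__main__":
    for row in fit_koopman(make_snapshots()):
        print([round(v, 6) for v in row])
```

For this particular toy system the three lifting functions span an invariant subspace, so the regression recovers an upper-triangular matrix with entries `LAM`, `MU`, `LAM**2 - MU` and `LAM**2` up to rounding. For general systems a finite set of lifting functions gives only an approximation.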

Deeptime: a Python library for machine learning dynamical models from time series data

Moritz Hoffmann et al 2022 Mach. Learn.: Sci. Technol. 3 015009

Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use but also easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. Deeptime can be found under https://deeptime-ml.github.io/.

https://doi.org/10.1088/2632-2153/ac3de0
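
For readers unfamiliar with the Markov state models mentioned in the abstract above, the core idea can be sketched in a few lines: count transitions between discrete states at a fixed lag time and row-normalise the counts. This is a plain-Python illustration of the simplest, non-reversible estimate. It does not use or reproduce the deeptime API, and the function name `transition_matrix` and the example trajectory are hypothetical.

```python
def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalised transition counts from a discrete trajectory.

    dtraj    : sequence of integer state indices in range(n_states)
    n_states : number of discrete states
    lag      : lag time in trajectory steps (sliding-window counting)
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1

    matrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            raise ValueError(f"state {state} has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


if __name__ == "__main__":
    # Hypothetical two-state trajectory, e.g. after clustering a time series.
    dtraj = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0]
    for row in transition_matrix(dtraj, n_states=2):
        print(row)
```

Each row of the result is the estimated probability of moving from one state to every other state after `lag` steps. Quantities such as relaxation times are then derived from the eigenvalues of this matrix.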

Rediscovering orbital mechanics with machine learning

Pablo Lemos et al 2023 Mach. Learn.: Sci. Technol. 4 045002

We present an approach for using machine learning to automatically discover the governing equations and unknown properties (in this case, masses) of real physical systems from observations. We train a 'graph neural network' to simulate the dynamics of our Solar System's Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to correctly infer an analytical expression for the force law implicitly learned by the neural network, which our results showed is equivalent to Newton's law of gravitation. The key assumptions our method makes are translational and rotational equivariance, and Newton's second and third laws of motion. It did not, however, require any assumptions about the masses of planets and moons or physical constants, but nonetheless, they, too, were accurately inferred with our method. Naturally, the classical law of gravitation has been known since Isaac Newton, but our results demonstrate that our method can discover unknown laws and hidden properties from observed data.

https://doi.org/10.1088/2632-2153/acfa63
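
For reference, the physical laws named in the abstract above can be stated explicitly. The notation (positions $\mathbf{x}_i$, masses $m_i$, gravitational constant $G$) is the textbook convention and is an assumption here; the abstract itself gives no formulas.

```latex
% Newton's law of gravitation: force on body i due to body j
\mathbf{F}_{ij} = -\,G\,\frac{m_i m_j}{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{3}}\,
                  (\mathbf{x}_i - \mathbf{x}_j)

% Newton's second law (equation of motion) and third law (pairwise symmetry)
m_i\,\ddot{\mathbf{x}}_i = \sum_{j \neq i} \mathbf{F}_{ij},
\qquad
\mathbf{F}_{ij} = -\,\mathbf{F}_{ji}
```

The force depends only on the relative position $\mathbf{x}_i - \mathbf{x}_j$, which is the translational and rotational equivariance listed among the method's assumptions.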

Deep learning in electron microscopy

Jeffrey M Ede 2021 Mach. Learn.: Sci. Technol. 2 011004

Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Next, we discuss the hardware and software needed to get started with deep learning and to interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.

https://doi.org/10.1088/2632-2153/abd614

Quantum machine learning for image classification

Arsenii Senokosov et al 2024 Mach. Learn.: Sci. Technol. 5 015040

Image classification, a pivotal task in multiple industries, faces computational challenges due to the burgeoning volume of visual data. This research addresses these challenges by introducing two quantum machine learning models that leverage the principles of quantum mechanics for effective computations. Our first model, a hybrid quantum neural network with parallel quantum circuits, enables the execution of computations even in the noisy intermediate-scale quantum era, where circuits with a large number of qubits are currently infeasible. This model demonstrated a record-breaking classification accuracy of 99.21% on the full MNIST dataset, surpassing the performance of known quantum–classical models, while having eight times fewer parameters than its classical counterpart. Also, the results of testing this hybrid model on Medical MNIST (classification accuracy over 99%) and on CIFAR-10 (classification accuracy over 82%) can serve as evidence of the generalizability of the model and highlight the efficiency of quantum layers in distinguishing common features of input data. Our second model introduces a hybrid quantum neural network with a Quanvolutional layer, reducing image resolution via a convolution process. The model matches the performance of its classical counterpart, having four times fewer trainable parameters, and outperforms a classical model with equal weight parameters. These models represent advancements in quantum machine learning research and illuminate the path towards more accurate image classification systems.

https://doi.org/10.1088/2632-2153/ad2aef

A multifidelity approach to continual learning for physical systems

Amanda Howard et al 2024 Mach. Learn.: Sci. Technol. 5 025042

We introduce a novel continual learning method based on multifidelity deep neural networks. This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset, limiting catastrophic forgetting. On its own the multifidelity continual learning method shows robust results that limit forgetting across several datasets. Additionally, we show that the multifidelity method can be combined with existing continual learning methods, including replay and memory aware synapses, to further limit catastrophic forgetting. The proposed continual learning method is especially suited for physical problems where the data satisfy the same physical laws on each domain, or for physics-informed neural networks, because in these cases we expect there to be a strong correlation between the output of the previous model and the model on the current training domain.

https://doi.org/10.1088/2632-2153/ad45b2

Exploiting data diversity in multi-domain federated learning

Hussain Ahmad Madni et al 2024 Mach. Learn.: Sci. Technol. 5 025041

Federated learning (FL) is an evolving machine learning technique that allows collaborative model training without sharing the original data among participants. In real-world scenarios, data residing at multiple clients are often heterogeneous in terms of resolution, magnification, scanner, or imaging protocol, which makes convergence of the global FL model challenging in collaborative training. Most existing FL methods consider data heterogeneity within one domain by assuming the same data variation at each client site. In this paper, we consider data heterogeneity in FL across different domains of heterogeneous data, raising the problems of domain shift, class imbalance, and missing data. We propose a method, multi-domain FL, as a solution to heterogeneous training data from multiple domains by training a robust vision transformer model. We use two loss functions, one for correctly predicting class labels and the other for encouraging similarity and dissimilarity over latent features, to optimize the global FL model. We perform various experiments using different convolution-based networks and non-convolutional Transformer architectures on multi-domain datasets. We evaluate the proposed approach on benchmark datasets and compare it with existing FL methods. Our results show that the proposed approach yields a more robust global FL model than the existing methods.

https://doi.org/10.1088/2632-2153/ad4768

A comprehensive machine learning-based investigation for the index-value prediction of 2G HTS coated conductor tapes

Shahin Alipour Bonab et al 2024 Mach. Learn.: Sci. Technol. 5 025040

Prediction of the index value, the so-called n-value, is of paramount importance for understanding the behaviour of superconductors, especially when modeling of superconductors is needed. This parameter depends on several physical quantities, including temperature and the magnetic field's density and orientation, and affects the behaviour of high-temperature superconducting devices made out of coated conductors in terms of losses and quench propagation. In this paper, a comprehensive analysis of many machine learning (ML) methods for estimating the n-value has been carried out. The results demonstrate that the cascade forward neural network (CFNN) excels in this scope. Despite needing considerably more training time than the other attempted models, it achieves the highest accuracy, with a root mean squared error (RMSE) of 0.48 and a 99.72% Pearson coefficient for goodness of fit (R-squared). In contrast, the ridge regression method had the worst predictions, with an RMSE of 4.92 and an R-squared of 37.29%. Random forest, boosting methods, and a simple feed-forward neural network can be considered middle-accuracy models with faster training times than CFNN. The findings of this study not only advance modeling of superconductors but also pave the way for applications and further research on ML plug-and-play codes for superconducting studies, including modeling of superconducting devices.

https://doi.org/10.1088/2632-2153/ad45b1
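
The two figures of merit quoted in the abstract above, RMSE and R-squared, can be computed as in the following minimal sketch. It assumes R-squared means the coefficient of determination, 1 - SS_res / SS_tot, which is the common definition but is not spelled out in the abstract. The function names and the example numbers are hypothetical and are not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical measured and predicted n-values (illustration only).
    measured = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print(f"RMSE = {rmse(measured, predicted):.4f}")
    print(f"R^2  = {100 * r_squared(measured, predicted):.2f}%")
```

RMSE is expressed in the same units as the predicted quantity, whereas R-squared is dimensionless, so the two are usually reported together when comparing regression models.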

Machine learning for efficient grazing-exit x-ray absorption near edge structure spectroscopy analysis: Bayesian optimization approach

Cafer Tufan Cakir et al 2024 Mach. Learn.: Sci. Technol. 5 025037

In materials science, traditional techniques for analyzing layered structures are essential for obtaining information about local structure, electronic properties and chemical states. While valuable, these methods often require high vacuum environments and have limited depth profiling capabilities. The grazing exit x-ray absorption near-edge structure (GE-XANES) technique addresses these limitations by providing depth-resolved insight at ambient conditions, facilitating in situ material analysis without special sample preparation. However, GE-XANES is limited by long data acquisition times, which hinders its practicality for various applications. To overcome this, we have incorporated Bayesian optimization (BO) into the GE-XANES data acquisition process. This innovative approach potentially reduces measurement time by a factor of 50. We have used a standard GE-XANES experiment, which serves as a reference, to validate the effectiveness and accuracy of the BO-informed experimental setup. Our results show that this optimized approach maintains data quality while significantly improving efficiency, making GE-XANES more accessible to a wider range of materials science applications.

https://doi.org/10.1088/2632-2153/ad4253

Learning a general model of single phase flow in complex 3D porous media

Javier E Santos et al 2024 Mach. Learn.: Sci. Technol. 5 025039

Modeling effective transport properties of 3D porous media, such as permeability, at multiple scales is challenging as a result of the combined complexity of the pore structures and fluid physics—in particular, confinement effects which vary across the nanoscale to the microscale. While numerical simulation is possible, the computational cost is prohibitive for realistic domains, which are large and complex. Although machine learning (ML) models have been proposed to circumvent simulation, none so far has simultaneously accounted for heterogeneous 3D structures, fluid confinement effects, and multiple simulation resolutions. By utilizing numerous computer science techniques to improve the scalability of training, we have for the first time developed a general flow model that accounts for the pore-structure and corresponding physical phenomena at scales from Angstrom to the micrometer. Using synthetic computational domains for training, our ML model exhibits strong performance (R² = 0.9) when tested on extremely diverse real domains at multiple scales.

https://doi.org/10.1088/2632-2153/ad45af

Manifold learning in atomistic simulations: a conceptual review

Jakub Rydzewski et al 2023 Mach. Learn.: Sci. Technol. 4 031001

Analyzing large volumes of high-dimensional data requires dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. Such practice is needed in atomistic simulations of complex systems where even thousands of degrees of freedom are sampled. An abundance of such data makes gaining insight into a specific physical problem strenuous. Our primary aim in this review is to focus on unsupervised machine learning methods that can be used on simulation data to find a low-dimensional manifold providing a collective and informative characterization of the studied process. Such manifolds can be used for sampling long-timescale processes and free-energy estimation. We describe methods that can work on datasets from standard and enhanced sampling atomistic simulations. Unlike recent reviews on manifold learning for atomistic simulations, we consider only methods that construct low-dimensional manifolds based on Markov transition probabilities between high-dimensional samples. We discuss these techniques from a conceptual point of view, including their underlying theoretical frameworks and possible limitations.

https://doi.org/10.1088/2632-2153/ace81a

Numerical and geometrical aspects of flow-based variational quantum Monte Carlo

James Stokes et al 2023 Mach. Learn.: Sci. Technol. 4 021001

This article aims to summarize recent and ongoing efforts to simulate continuous-variable quantum systems using flow-based variational quantum Monte Carlo techniques, focusing for pedagogical purposes on the example of bosons in the field amplitude (quadrature) basis. Particular emphasis is placed on the variational real- and imaginary-time evolution problems, carefully reviewing the stochastic estimation of the time-dependent variational principles and their relationship with information geometry. Some practical instructions are provided to guide the implementation of a PyTorch code. The review is intended to be accessible to researchers interested in machine learning and quantum information science.

https://doi.org/10.1088/2632-2153/acc8b9

Physics-AI symbiosis

Bahram Jalali et al 2022 Mach. Learn.: Sci. Technol. 3 041001

The phenomenal success of physics in explaining nature and engineering machines is predicated on low dimensional deterministic models that accurately describe a wide range of natural phenomena. Physics provides computational rules that govern physical systems and the interactions of the constituents therein. Led by deep neural networks, artificial intelligence (AI) has introduced an alternate data-driven computational framework, with astonishing performance in domains that do not lend themselves to deterministic models such as image classification and speech recognition. These gains, however, come at the expense of predictions that are inconsistent with the physical world as well as computational complexity, with the latter placing AI on a collision course with the expected end of the semiconductor scaling known as Moore's Law. This paper argues how an emerging symbiosis of physics and AI can overcome such formidable challenges, thereby not only extending AI's spectacular rise but also transforming the direction of engineering and physical science.

https://doi.org/10.1088/2632-2153/ac9215

Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations

April M Miksch et al 2021 Mach. Learn.: Sci. Technol. 2 031001

Recent advances in machine-learning interatomic potentials have enabled the efficient modeling of complex atomistic systems with an accuracy that is comparable to that of conventional quantum-mechanics based methods. At the same time, the construction of new machine-learning potentials can seem a daunting task, as it involves data-science techniques that are not yet common in chemistry and materials science. Here, we provide a tutorial-style overview of strategies and best practices for the construction of artificial neural network (ANN) potentials. We illustrate the most important aspects of (a) data collection, (b) model selection, (c) training and validation, and (d) testing and refinement of ANN potentials on the basis of practical examples. Current research in the areas of active learning and delta learning is also discussed in the context of ANN potentials. This tutorial review aims at equipping computational chemists and materials scientists with the required background knowledge for ANN potential construction and application, with the intention to accelerate the adoption of the method, so that it can facilitate exciting research that would otherwise be challenging with conventional strategies.

https://doi.org/10.1088/2632-2153/abfd96

Physics-inspired spatiotemporal-graph AI ensemble for the detection of higher order wave mode signals of spinning binary black hole mergers

Tian et al (accepted manuscript)

We present a new class of AI models for the detection of quasi-circular, spinning, non-precessing binary black hole mergers whose waveforms include the higher order gravitational wave modes $(\ell, |m|)=\{(2, 2), (2, 1), (3, 3), (3, 2), (4, 4)\}$, and mode mixing effects in the $\ell = 3, |m| = 2$ harmonics. These AI models combine hybrid dilated convolution neural networks to accurately model both short- and long-range temporal sequential information of gravitational waves; and graph neural networks to capture spatial correlations among gravitational wave observatories to consistently describe and identify the presence of a signal in a three detector network encompassing the Advanced LIGO and Virgo detectors. We first trained these spatiotemporal-graph AI models using synthetic noise, using 1.2 million modeled waveforms to densely sample this signal manifold, within 1.7 hours using 256 NVIDIA A100 GPUs in the Polaris supercomputer at the Argonne Leadership Computing Facility. This distributed training approach exhibited optimal classification performance, and strong scaling up to 512 NVIDIA A100 GPUs. With these AI ensembles we processed data from a three detector network, and found that an ensemble of 4 AI models achieves state-of-the-art performance for signal detection, and reports two misclassifications for every decade of searched data. We distributed AI inference over 128 GPUs in the Polaris supercomputer and 128 nodes in the Theta supercomputer, and completed the processing of a decade of gravitational wave data from a three detector network within 3.5 hours. Finally, we fine-tuned these AI ensembles to process the entire month of February 2020, which is part of the O3b LIGO/Virgo observation run, and found 6 gravitational waves, concurrently identified in Advanced LIGO and Advanced Virgo data, and zero false positives. This analysis was completed in one hour using one NVIDIA A100 GPU.

https://doi.org/10.1088/2632-2153/ad4c37

The following article is Open access

Semi-Supervised Segmentation of Abdominal Organs and Liver Tumor: Uncertainty Rectified Curriculum Labeling Meets X-Fuse

Lyu et al


Precise segmentation of liver tumors and associated organs holds immense value for surgical and radiological intervention, enabling anatomical localization for pre-operative planning and intra-operative guidance. Modern deep learning models for medical image segmentation have evolved from convolutional neural networks to transformer architectures, significantly improving global context understanding. However, accurate delineation, especially of hepatic lesions, remains an enduring challenge because models focus predominantly on spatial feature extraction, which fails to adequately characterize complex medical anatomies. Moreover, the relative paucity of expertly annotated medical imaging data restricts model exposure to diverse pathological presentations. In this paper, we present a three-phased cascaded segmentation framework featuring an X-Fuse model that integrates complementary information from the spatial and frequency domains in dual encoders to enrich latent feature representation. To enhance model generalizability, we build on the X-Fuse topology and take advantage of additional unlabeled pathological data: our proposed integration of curriculum pseudo-labeling with Jensen-Shannon variance-based uncertainty rectification promotes optimized pseudo-supervision in semi-supervised learning. We further introduce a tumor-focus augmentation technique, including training-free copy-paste and knowledge-based synthesis, that is simple yet effective and substantially improves model adaptability to diverse lesion morphologies.
Extensive experiments and modular evaluations on a holdout test set demonstrate that our methods significantly outperform existing state-of-the-art segmentation models in both supervised and semi-supervised settings, as measured by the Dice similarity coefficient, achieving superior delineation of bones (95.42%), liver (96.26%), and liver tumors (89.53%), with a 16.41% increase compared to V-Net in the supervised-only scenario without augmentation. Our method marks a significant step toward the realization of more reliable and robust AI-assisted diagnostic tools for liver tumor intervention. We have made the code publicly available.
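
For readers unfamiliar with the Dice similarity coefficient quoted above, here is a minimal sketch of the metric for two binary masks. The flat-list mask format and the empty-mask convention are assumptions made for illustration and are not taken from the authors' code.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 predicted voxels, 3 true voxels, 2 in common.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they do not overlap at all.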

https://doi.org/10.1088/2632-2153/ad4c38

The following article is Open access

Interpolation of Environmental Data Using Deep Learning and Model Inference

Ibebuchi et al


The temporal resolution of environmental data sets plays a major role in the granularity of the information that can be derived from the data. In most cases, different data sets must share a common temporal resolution so that they can be evaluated and applied consistently in making informed decisions. This study leverages deep learning with long short-term memory (LSTM) neural networks and model inference to enhance the temporal resolution of climate datasets, specifically temperature and precipitation, from daily to sub-daily scales. We trained our model to learn the relationship between daily and sub-daily data, subsequently applying this knowledge to increase the resolution of a separate dataset with a coarser (daily) temporal resolution. Our findings reveal a high degree of accuracy for temperature predictions, evidenced by a correlation of 0.99 and a mean absolute error of 0.21 °C between the actual and predicted sub-daily values. In contrast, the approach was less effective for precipitation, achieving an explained variance of only 37%, compared to 98% for temperature. Further, besides the sub-daily interpolation of the climate data sets, we adapted our approach to increase the resolution of the Normalized difference vegetation index of Landsat (from a 16-day to a 5-day interval) using the LSTM model pre-trained on the Sentinel 2 Normalized difference vegetation index, which exists at a relatively higher temporal resolution. The explained variance between the predicted Landsat and Sentinel 2 data is 70%, with a mean absolute error of 0.03. These results suggest that our method is particularly suitable for environmental datasets with less pronounced short-term variability, offering a promising tool for improving the resolution and utility of the data.
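
The two evaluation metrics quoted above can be sketched as follows. This assumes that "explained variance" means the squared Pearson correlation, which would be consistent with the reported correlation of 0.99 and 98% for temperature (0.99² ≈ 0.98); the paper may define it differently, and the sample values below are made up for illustration.

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical sub-daily temperatures (°C): observed versus model output.
actual = [10.0, 12.0, 15.0, 13.0]
predicted = [10.5, 11.5, 15.5, 12.5]

mae = mean_absolute_error(actual, predicted)         # 0.5
explained = pearson_r(actual, predicted) ** 2        # squared correlation
```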

https://doi.org/10.1088/2632-2153/ad4b94

The following article is Open access

Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems

Zeng et al


While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework's ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare it to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that, collectively, the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a "collective weight variable" incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.
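
One plausible way to write the training objective described above is sketched below. The symbols are assumptions introduced here for illustration: $\mathcal{E}$ and $\mathcal{D}$ denote the nonlinear encoder and decoder, $W_1, \dots, W_n$ the internal linear layers, $\theta$ all trainable weights, and $\lambda$ the weight-decay strength. The abstract does not state the exact form, so this should be read as a sketch rather than the authors' definition.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N} \sum_{i=1}^{N}
    \bigl\| x_i - \mathcal{D}\bigl( W_n \cdots W_1 \, \mathcal{E}(x_i) \bigr) \bigr\|_2^2
  + \frac{\lambda}{2} \, \| \theta \|_2^2
```

In the linear case, the product $W_n \cdots W_1$ plays the role of the "collective weight variable" mentioned in the abstract, and the $\lambda$ term supplies the decay along directions that the reconstruction error alone leaves unconstrained.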

https://doi.org/10.1088/2632-2153/ad4ba5

The following article is Open access

Learning the dynamics of a one-dimensional plasma model with graph neural networks

Carvalho et al


We explore the possibility of fully replacing a plasma physics kinetic simulator with a graph neural network-based simulator. We focus on this class of surrogate models given the similarity between their message-passing update mechanism and the traditional physics solver update, and the possibility of enforcing known physical priors in the graph construction and update. We show that our model learns the kinetic plasma dynamics of the one-dimensional plasma model, a predecessor of contemporary kinetic plasma simulation codes, and recovers a wide range of well-known kinetic plasma processes, including plasma thermalization, electrostatic fluctuations about thermal equilibrium, the drag on a fast sheet, and Landau damping. We compare the performance against the original plasma model in terms of run-time, conservation laws, and temporal evolution of key physical quantities. The limitations of the model are presented and possible directions for higher-dimensional surrogate models for kinetic plasmas are discussed.
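
The message-passing update mentioned above can be illustrated with a toy sketch. In a learned simulator the message and update functions would be neural networks; the hand-written functions, the scalar node states, and the step size below are hypothetical stand-ins and are not the authors' model.

```python
def message_passing_step(states, edges, message_fn, update_fn):
    """One round of message passing: sum incoming messages, then update each node."""
    aggregated = [0.0] * len(states)
    for sender, receiver in edges:
        aggregated[receiver] += message_fn(states[sender], states[receiver])
    return [update_fn(s, m) for s, m in zip(states, aggregated)]


def message_fn(sender_state, receiver_state):
    # Hypothetical pairwise interaction: antisymmetric in sender and receiver.
    return sender_state - receiver_state


def update_fn(state, aggregated_message):
    # Hypothetical explicit update with an assumed step size of 0.1.
    return state + 0.1 * aggregated_message


# Three nodes on a line, each connected to its neighbours in both directions.
states = [0.0, 1.0, 3.0]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
new_states = message_passing_step(states, edges, message_fn, update_fn)
```

Because the toy messages are antisymmetric and every edge appears in both directions, the sum of the node states is unchanged by the step. This is a simple example of building a conservation-style prior into the graph update.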

https://doi.org/10.1088/2632-2153/ad4ba6
