Machine Learning: Science and Technology

Purpose-led Publishing is a coalition of three not-for-profit publishers in the field of physical sciences: AIP Publishing, the American Physical Society and IOP Publishing.

Together, as publishers that will always put purpose above profit, we have defined a set of industry standards that underpin high-quality, ethical scholarly communications.

We are proudly declaring that science is our only shareholder.

ISSN: 2632-2153

OPEN ACCESS

Machine Learning: Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights.

Median submission to first decision before peer review 3 days

Median submission to first decision after peer review 49 days

Impact factor 6.8

Citescore 7.1

The following article is Open access

Chemformer: a pre-trained transformer for computational chemistry

Ross Irwin et al 2022 Mach. Learn.: Sci. Technol. 3 015022

Transformer models coupled with a simplified molecular line entry system (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present the Chemformer model—a Transformer-based model which can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-training can improve performance and significantly speed up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we publish state-of-the-art results for top-1 accuracy. We also improve on existing approaches for a molecular optimisation task and show that Chemformer can optimise on multiple discriminative tasks simultaneously. Models, datasets and code will be made available after publication.

https://doi.org/10.1088/2632-2153/ac3ffb

The following article is Open access

Ten years of generative adversarial nets (GANs): a survey of the state-of-the-art

Tanujit Chakraborty et al 2024 Mach. Learn.: Sci. Technol. 5 011001

Generative adversarial networks (GANs) have rapidly emerged as powerful tools for generating realistic and diverse data across various domains, including computer vision and other applied areas, since their inception in 2014. Consisting of a discriminative network and a generative network engaged in a minimax game, GANs have revolutionized the field of generative modeling. In February 2018, GAN secured the leading spot on the 'Top Ten Global Breakthrough Technologies List' issued by MIT Technology Review. Over the years, numerous advancements have been proposed, leading to a rich array of GAN variants, such as conditional GAN, Wasserstein GAN, cycle-consistent GAN, and StyleGAN, among many others. This survey aims to provide a general overview of GANs, summarizing the latent architecture, validation metrics, and application areas of the most widely recognized variants. We also delve into recent theoretical developments, exploring the profound connection between the adversarial principle underlying GAN and Jensen–Shannon divergence while discussing the optimality characteristics of the GAN framework. The efficiency of GAN variants and their model architectures will be evaluated along with training obstacles as well as training solutions. In addition, a detailed discussion will be provided, examining the integration of GANs with newly developed deep learning frameworks such as transformers, physics-informed neural networks, large language models, and diffusion models. Finally, we reveal several issues as well as future research outlines in this field.

https://doi.org/10.1088/2632-2153/ad1f77
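
The link between the adversarial objective and the Jensen–Shannon divergence mentioned in the abstract above can be illustrated numerically. The following is a minimal sketch, not code from the survey: it assumes discrete distributions given as plain lists, and it relies on the standard result from the original GAN analysis that, at the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_gen(x)), the minimax value equals 2·JSD(p_data, p_gen) − log 4. All function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def optimal_discriminator_value(p_data, p_gen):
    """GAN objective E_data[log D*] + E_gen[log(1 - D*)] at the optimal discriminator."""
    total = 0.0
    for pd, pg in zip(p_data, p_gen):
        if pd > 0:
            total += pd * math.log(pd / (pd + pg))
        if pg > 0:
            total += pg * math.log(pg / (pd + pg))
    return total


# Toy distributions over three outcomes (illustrative values only).
p_data = [0.5, 0.5, 0.0]
p_gen = [0.0, 0.5, 0.5]
value = optimal_discriminator_value(p_data, p_gen)
identity = 2 * js_divergence(p_data, p_gen) - math.log(4)
print(value, identity)
```

The two printed numbers agree up to floating-point rounding, and the divergence is zero exactly when the generator distribution matches the data distribution.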

The following article is Open access

The MLIP package: moment tensor potentials with MPI and active learning

Ivan S Novikov et al 2021 Mach. Learn.: Sci. Technol. 2 025002

The subject of this paper is the technology (the 'how') of constructing machine-learning interatomic potentials, rather than science (the 'what' and 'why') of atomistic simulations using machine-learning potentials. Namely, we illustrate how to construct moment tensor potentials using active learning as implemented in the MLIP package, focusing on the efficient ways to automatically sample configurations for the training set, how expanding the training set changes the error of predictions, how to set up ab initio calculations in a cost-effective manner, etc. The MLIP package (short for Machine-Learning Interatomic Potentials) is available at https://mlip.skoltech.ru/download/.

https://doi.org/10.1088/2632-2153/abc9fe

The following article is Open access

Prediction of chemical reaction yields using deep learning

Philippe Schwaller et al 2021 Mach. Learn.: Sci. Technol. 2 015016

Artificial intelligence is driving one of the most important revolutions in organic chemistry. Multiple platforms, including tools for reaction prediction and synthesis planning based on machine learning, have successfully become part of the organic chemists' daily laboratory, assisting in domain-specific synthetic problems. Unlike reaction prediction and retrosynthetic models, the prediction of reaction yields has received less attention in spite of the enormous potential of accurately predicting reaction conversion rates. Reaction yields models, describing the percentage of the reactants converted to the desired products, could guide chemists and help them select high-yielding reactions and score synthesis routes, reducing the number of attempts. So far, yield predictions have been predominantly performed for high-throughput experiments using a categorical (one-hot) encoding of reactants, concatenated molecular fingerprints, or computed chemical descriptors. Here, we extend the application of natural language processing architectures to predict reaction properties given a text-based representation of the reaction, using an encoder transformer model combined with a regression layer. We demonstrate outstanding prediction performance on two high-throughput experiment reactions sets. An analysis of the yields reported in the open-source USPTO data set shows that their distribution differs depending on the mass scale, limiting the data set applicability in reaction yields predictions.

https://doi.org/10.1088/2632-2153/abc81d

The following article is Open access

Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation

Mario Krenn et al 2020 Mach. Learn.: Sci. Technol. 1 045024

The discovery of novel materials and functional molecules can help to solve some of society's most urgent challenges, ranging from efficient energy harvesting and storage to uncovering novel pharmaceutical drug candidates. Traditionally, matter engineering (generally denoted as inverse design) was based heavily on human intuition and high-throughput virtual screening. The last few years have seen the emergence of significant interest in computer-inspired designs based on evolutionary or deep learning methods. The major challenge here is that the standard string-based molecular representation, SMILES, shows substantial weaknesses in that task because large fractions of strings do not correspond to valid molecules. Here, we solve this problem at a fundamental level and introduce SELFIES (SELF-referencIng Embedded Strings), a string-based representation of molecules which is 100% robust. Every SELFIES string corresponds to a valid molecule, and SELFIES can represent every molecule. SELFIES can be directly applied in arbitrary machine learning models without the adaptation of the models; each of the generated molecule candidates is valid. In our experiments, the model's internal memory stores two orders of magnitude more diverse molecules than a similar test with SMILES. Furthermore, as all molecules are valid, it allows for explanation and interpretation of the internal working of the generative models.

https://doi.org/10.1088/2632-2153/aba947

The following article is Open access

Closed-loop Koopman operator approximation

Steven Dahdah and James Richard Forbes 2024 Mach. Learn.: Sci. Technol. 5 025038

This paper proposes a method to identify a Koopman model of a feedback-controlled system given a known controller. The Koopman operator allows a nonlinear system to be rewritten as an infinite-dimensional linear system by viewing it in terms of an infinite set of lifting functions. A finite-dimensional approximation of the Koopman operator can be identified from data by choosing a finite subset of lifting functions and solving a regression problem in the lifted space. Existing methods are designed to identify open-loop systems. However, it is impractical or impossible to run experiments on some systems, such as unstable systems, in an open-loop fashion. The proposed method leverages the linearity of the Koopman operator, along with knowledge of the controller and the structure of the closed-loop (CL) system, to simultaneously identify the CL and plant systems. The advantages of the proposed CL Koopman operator approximation method are demonstrated in simulation using a Duffing oscillator and experimentally using a rotary inverted pendulum system. An open-source software implementation of the proposed method is publicly available, along with the experimental dataset generated for this paper.

https://doi.org/10.1088/2632-2153/ad45b0
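
The regression step described in the abstract above (choose a finite set of lifting functions, then solve a least-squares problem in the lifted space) can be sketched for the simpler open-loop case. This is not the paper's closed-loop method or its software. It is a minimal pure-Python illustration under stated assumptions: the toy system `step`, the lifting functions in `lift`, and all parameter values are hypothetical, chosen so that the lifted dynamics are exactly linear.

```python
def step(x, a=0.9, b=0.5, c=0.3):
    """One step of a toy nonlinear map (hypothetical example system)."""
    x1, x2 = x
    return (a * x1, b * x2 + c * x1 * x1)


def lift(x):
    """Hypothetical lifting functions: the state plus one quadratic monomial."""
    x1, x2 = x
    return [x1, x2, x1 * x1]


def solve(G, rhs):
    """Solve G y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(G, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (M[i][n] - s) / M[i][i]
    return y


def fit_koopman(snapshots):
    """Least-squares fit of K so that lift(x_next) is close to K times lift(x)."""
    n = len(lift(snapshots[0][0]))
    G = [[0.0] * n for _ in range(n)]  # sum of z z^T over all snapshot pairs
    A = [[0.0] * n for _ in range(n)]  # sum of z_next z^T over all snapshot pairs
    for x, x_next in snapshots:
        z, z_next = lift(x), lift(x_next)
        for i in range(n):
            for j in range(n):
                G[i][j] += z[i] * z[j]
                A[i][j] += z_next[i] * z[j]
    # Normal equations K G = A; G is symmetric, so row i of K solves G k_i = A[i].
    return [solve(G, A[i]) for i in range(n)]


# Collect snapshot pairs from a few short trajectories (illustrative data only).
initial_states = [(1.0, 0.5), (-0.7, 1.2), (0.3, -0.9), (1.5, 0.1), (-1.1, -0.4)]
snapshots = []
for x in initial_states:
    for _ in range(4):
        x_next = step(x)
        snapshots.append((x, x_next))
        x = x_next

K = fit_koopman(snapshots)
for row in K:
    print([round(v, 6) for v in row])
```

For this toy map the three lifting functions span an invariant subspace, so the fitted matrix reproduces the dynamics exactly up to rounding. For a general nonlinear system a finite set of lifting functions gives only an approximation.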

The following article is Open access

Quantum machine learning for image classification

Arsenii Senokosov et al 2024 Mach. Learn.: Sci. Technol. 5 015040

Image classification, a pivotal task in multiple industries, faces computational challenges due to the burgeoning volume of visual data. This research addresses these challenges by introducing two quantum machine learning models that leverage the principles of quantum mechanics for effective computations. Our first model, a hybrid quantum neural network with parallel quantum circuits, enables the execution of computations even in the noisy intermediate-scale quantum era, where circuits with a large number of qubits are currently infeasible. This model demonstrated a record-breaking classification accuracy of 99.21% on the full MNIST dataset, surpassing the performance of known quantum–classical models, while having eight times fewer parameters than its classical counterpart. Also, the results of testing this hybrid model on Medical MNIST (classification accuracy over 99%) and on CIFAR-10 (classification accuracy over 82%) can serve as evidence of the generalizability of the model and highlight the efficiency of quantum layers in distinguishing common features of input data. Our second model introduces a hybrid quantum neural network with a Quanvolutional layer, reducing image resolution via a convolution process. The model matches the performance of its classical counterpart, having four times fewer trainable parameters, and outperforms a classical model with equal weight parameters. These models represent advancements in quantum machine learning research and illuminate the path towards more accurate image classification systems.

https://doi.org/10.1088/2632-2153/ad2aef

The following article is Open access

Deeptime: a Python library for machine learning dynamical models from time series data

Moritz Hoffmann et al 2022 Mach. Learn.: Sci. Technol. 3 015009

Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use but also easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. Deeptime can be found under https://deeptime-ml.github.io/.

https://doi.org/10.1088/2632-2153/ac3de0

The following article is Open access

Multimodal protein representation learning and target-aware variational auto-encoders for protein-binding ligand generation

Nhat Khang Ngo and Truong Son Hy 2024 Mach. Learn.: Sci. Technol. 5 025021

Without knowledge of specific pockets, generating ligands based on the global structure of a protein target plays a crucial role in drug discovery as it helps reduce the search space for potential drug-like candidates in the pipeline. However, contemporary methods require optimizing tailored networks for each protein, which is arduous and costly. To address this issue, we introduce TargetVAE, a target-aware variational auto-encoder that generates ligands with desirable properties including high binding affinity and high synthesizability to arbitrary target proteins, guided by a multimodal deep neural network built based on geometric and sequence models, named Protein Multimodal Network (PMN), as the prior for the generative model. PMN unifies different representations of proteins (e.g. the primary structure as a sequence of amino acids, the 3D tertiary structure, and a residue-level graph) into a single representation. Our multimodal architecture learns from the entire protein structure and is able to capture its sequential, topological, and geometrical information by utilizing language modeling, graph neural networks, and geometric deep learning. We showcase the superiority of our approach by conducting extensive experiments and evaluations, including predicting protein-ligand binding affinity in the PDBbind v2020 dataset as well as the assessment of generative model quality, ligand generation for unseen targets, and docking score computation. Empirical results demonstrate the promising and competitive performance of our proposed approach. Our software package is publicly available at https://github.com/HySonLab/Ligand_Generation.

https://doi.org/10.1088/2632-2153/ad3ee4

The following article is Open access

Rediscovering orbital mechanics with machine learning

Pablo Lemos et al 2023 Mach. Learn.: Sci. Technol. 4 045002

We present an approach for using machine learning to automatically discover the governing equations and unknown properties (in this case, masses) of real physical systems from observations. We train a 'graph neural network' to simulate the dynamics of our Solar System's Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to correctly infer an analytical expression for the force law implicitly learned by the neural network, which our results showed is equivalent to Newton's law of gravitation. The key assumptions our method makes are translational and rotational equivariance, and Newton's second and third laws of motion. It did not, however, require any assumptions about the masses of planets and moons or physical constants, but nonetheless, they, too, were accurately inferred with our method. Naturally, the classical law of gravitation has been known since Isaac Newton, but our results demonstrate that our method can discover unknown laws and hidden properties from observed data.

https://doi.org/10.1088/2632-2153/acfa63

The following article is Open access

Unifying O(3) equivariant neural networks design with tensor-network formalism

Zimu Li et al 2024 Mach. Learn.: Sci. Technol. 5 025044

Many learning tasks, including learning potential energy surfaces from ab initio calculations, involve global spatial symmetries and permutational symmetry between atoms or general particles. Equivariant graph neural networks are a standard approach to such problems, with one of the most successful methods employing tensor products between various tensors that transform under the spatial group. However, as the number of different tensors and the complexity of relationships between them increase, maintaining parsimony and equivariance becomes increasingly challenging. In this paper, we propose using fusion diagrams, a technique widely employed in simulating SU(2)-symmetric quantum many-body problems, to design new spatial equivariant components for neural networks. This results in a diagrammatic approach to constructing novel neural network architectures. When applied to particles within a given local neighborhood, the resulting components, which we term 'fusion blocks,' serve as universal approximators of any continuous equivariant function defined on the neighborhood. We incorporate a fusion block into pre-existing equivariant architectures (Cormorant and MACE), leading to improved performance with fewer parameters on a range of challenging chemical problems. Furthermore, we apply group-equivariant neural networks to study non-adiabatic molecular dynamics of stilbene cis-trans isomerization. Our approach, which combines tensor networks with equivariant neural networks, suggests a potentially fruitful direction for designing more expressive equivariant neural networks.

https://doi.org/10.1088/2632-2153/ad4a04

The following article is Open access

Feature selection for high-dimensional neural network potentials with the adaptive group lasso

Johannes Sandberg et al 2024 Mach. Learn.: Sci. Technol. 5 025043

Neural network potentials are a powerful tool for atomistic simulations, allowing ab initio potential energy surfaces to be reproduced accurately with computational performance approaching that of classical force fields. A central component of such potentials is the transformation of atomic positions into a set of atomic features in the most efficient and informative way. In this work, a feature selection method is introduced for high-dimensional neural network potentials, based on the adaptive group lasso (AGL) approach. It is shown that the use of an embedded method, taking into account the interplay between features and their action in the estimator, is necessary to optimize the number of features. The method's efficiency is tested on three different monoatomic systems, including Lennard–Jones as a simple test case, Aluminium as a system characterized by predominantly radial interactions, and Boron as representative of a system with strongly directional components in the interactions. The AGL is compared with unsupervised filter methods and found to perform consistently better in reducing the number of features needed to reproduce the reference simulation data at a similar level of accuracy as the starting feature set. In particular, our results show the importance of taking into account model predictions in feature selection for interatomic potentials.

https://doi.org/10.1088/2632-2153/ad450e

The following article is Open access

A multifidelity approach to continual learning for physical systems

Amanda Howard et al 2024 Mach. Learn.: Sci. Technol. 5 025042

We introduce a novel continual learning method based on multifidelity deep neural networks. This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset, limiting catastrophic forgetting. On its own the multifidelity continual learning method shows robust results that limit forgetting across several datasets. Additionally, we show that the multifidelity method can be combined with existing continual learning methods, including replay and memory aware synapses, to further limit catastrophic forgetting. The proposed continual learning method is especially suited for physical problems where the data satisfy the same physical laws on each domain, or for physics-informed neural networks, because in these cases we expect there to be a strong correlation between the output of the previous model and the model on the current training domain.

https://doi.org/10.1088/2632-2153/ad45b2

The following article is Open access

Exploiting data diversity in multi-domain federated learning

Hussain Ahmad Madni et al 2024 Mach. Learn.: Sci. Technol. 5 025041

Federated learning (FL) is an evolving machine learning technique that allows collaborative model training without sharing the original data among participants. In real-world scenarios, data residing at multiple clients are often heterogeneous in terms of different resolutions, magnifications, scanners, or imaging protocols, and thus challenging for global FL model convergence in collaborative training. Most of the existing FL methods consider data heterogeneity within one domain by assuming the same data variation at each client site. In this paper, we consider data heterogeneity in FL with different domains of heterogeneous data by raising the problems of domain-shift, class-imbalance, and missing data. We propose a method, multi-domain FL, as a solution to heterogeneous training data from multiple domains by training a robust vision transformer model. We use two loss functions, one for correctly predicting class labels and the other for encouraging similarity and dissimilarity over latent features, to optimize the global FL model. We perform various experiments using different convolution-based networks and non-convolutional Transformer architectures on multi-domain datasets. We evaluate the proposed approach on benchmark datasets and compare it with the existing FL methods. Our results show the superiority of the proposed approach, which yields a more robust global FL model than the existing methods.

https://doi.org/10.1088/2632-2153/ad4768

The following article is Open access

A comprehensive machine learning-based investigation for the index-value prediction of 2G HTS coated conductor tapes

Shahin Alipour Bonab et al 2024 Mach. Learn.: Sci. Technol. 5 025040

Index-value, or so-called n-value, prediction is of paramount importance for understanding the behaviour of superconductors, especially when modeling of superconductors is needed. This parameter is dependent on several physical quantities including temperature and the magnetic field's density and orientation, and affects the behaviour of high-temperature superconducting devices made out of coated conductors in terms of losses and quench propagation. In this paper, a comprehensive analysis of many machine learning (ML) methods for estimating the n-value has been carried out. The results demonstrated that the cascade forward neural network (CFNN) excels in this scope. Despite needing considerably higher training time when compared to the other attempted models, it performs at the highest accuracy, with 0.48 root mean squared error (RMSE) and 99.72% Pearson coefficient for goodness of fit (R-squared). In contrast, the ridge regression method had the worst predictions with 4.92 RMSE and 37.29% R-squared. Also, random forest, boosting methods, and the simple feed forward neural network can be considered middle-accuracy models with faster training time than CFNN. The findings of this study not only advance modeling of superconductors but also pave the way for applications and further research on ML plug-and-play codes for superconducting studies including modeling of superconducting devices.

https://doi.org/10.1088/2632-2153/ad45b1
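
The two accuracy figures quoted in the abstract above, RMSE and R-squared, can be computed as in the following minimal sketch. It assumes the common coefficient-of-determination definition of R-squared (the abstract does not state which definition the authors used), and the n-value numbers are made up for illustration, not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between reference values and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical n-values and model predictions (illustrative only).
n_true = [20.0, 25.0, 30.0, 35.0]
n_pred = [21.0, 24.0, 31.0, 34.0]
print(rmse(n_true, n_pred), r_squared(n_true, n_pred))
```

RMSE is in the same units as the predicted quantity, whereas R-squared is dimensionless, which is why the abstract reports it as a percentage.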

The following article is Open access

Manifold learning in atomistic simulations: a conceptual review

Jakub Rydzewski et al 2023 Mach. Learn.: Sci. Technol. 4 031001

Analyzing large volumes of high-dimensional data requires dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. Such practice is needed in atomistic simulations of complex systems where even thousands of degrees of freedom are sampled. An abundance of such data makes gaining insight into a specific physical problem strenuous. Our primary aim in this review is to focus on unsupervised machine learning methods that can be used on simulation data to find a low-dimensional manifold providing a collective and informative characterization of the studied process. Such manifolds can be used for sampling long-timescale processes and free-energy estimation. We describe methods that can work on datasets from standard and enhanced sampling atomistic simulations. Unlike recent reviews on manifold learning for atomistic simulations, we consider only methods that construct low-dimensional manifolds based on Markov transition probabilities between high-dimensional samples. We discuss these techniques from a conceptual point of view, including their underlying theoretical frameworks and possible limitations.

https://doi.org/10.1088/2632-2153/ace81a

The following article is Open access

Numerical and geometrical aspects of flow-based variational quantum Monte Carlo

James Stokes et al 2023 Mach. Learn.: Sci. Technol. 4 021001

This article aims to summarize recent and ongoing efforts to simulate continuous-variable quantum systems using flow-based variational quantum Monte Carlo techniques, focusing for pedagogical purposes on the example of bosons in the field amplitude (quadrature) basis. Particular emphasis is placed on the variational real- and imaginary-time evolution problems, carefully reviewing the stochastic estimation of the time-dependent variational principles and their relationship with information geometry. Some practical instructions are provided to guide the implementation of a PyTorch code. The review is intended to be accessible to researchers interested in machine learning and quantum information science.

https://doi.org/10.1088/2632-2153/acc8b9

The following article is Open access

Physics-AI symbiosis

Bahram Jalali et al 2022 Mach. Learn.: Sci. Technol. 3 041001

The phenomenal success of physics in explaining nature and engineering machines is predicated on low dimensional deterministic models that accurately describe a wide range of natural phenomena. Physics provides computational rules that govern physical systems and the interactions of the constituents therein. Led by deep neural networks, artificial intelligence (AI) has introduced an alternate data-driven computational framework, with astonishing performance in domains that do not lend themselves to deterministic models such as image classification and speech recognition. These gains, however, come at the expense of predictions that are inconsistent with the physical world as well as computational complexity, with the latter placing AI on a collision course with the expected end of the semiconductor scaling known as Moore's Law. This paper argues how an emerging symbiosis of physics and AI can overcome such formidable challenges, thereby not only extending AI's spectacular rise but also transforming the direction of engineering and physical science.

https://doi.org/10.1088/2632-2153/ac9215

The following article is Open access

Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations

April M Miksch et al 2021 Mach. Learn.: Sci. Technol. 2 031001

Recent advances in machine-learning interatomic potentials have enabled the efficient modeling of complex atomistic systems with an accuracy that is comparable to that of conventional quantum-mechanics based methods. At the same time, the construction of new machine-learning potentials can seem a daunting task, as it involves data-science techniques that are not yet common in chemistry and materials science. Here, we provide a tutorial-style overview of strategies and best practices for the construction of artificial neural network (ANN) potentials. We illustrate the most important aspects of (a) data collection, (b) model selection, (c) training and validation, and (d) testing and refinement of ANN potentials on the basis of practical examples. Current research in the areas of active learning and delta learning is also discussed in the context of ANN potentials. This tutorial review aims at equipping computational chemists and materials scientists with the required background knowledge for ANN potential construction and application, with the intention to accelerate the adoption of the method, so that it can facilitate exciting research that would otherwise be challenging with conventional strategies.

https://doi.org/10.1088/2632-2153/abfd96

The following article is Open access

Transformer-Powered Surrogates Close the ICF Simulation-Experiment Gap with Extremely Limited Data

Olson et al

Recent advances in machine learning, specifically transformer architecture, have led to significant advancements in commercial domains. These powerful models have demonstrated superior capability to learn complex relationships and often generalize better to new data and problems. This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data. The proposed approach integrates transformer-based architecture with a novel graph-based hyper-parameter optimization technique. The resulting system not only effectively reduces simulation bias, but also achieves superior prediction accuracy compared to the prior method. We demonstrate the efficacy of our approach on inertial confinement fusion experiments, where only 10 shots of real-world data are available, as well as synthetic versions of these experiments.

https://doi.org/10.1088/2632-2153/ad4e03

The following article is Open access

STG-MTL: Scalable Task Grouping For Multi-Task Learning Using Data Maps

Sherif et al

Multi-Task Learning (MTL) is a powerful technique that has gained popularity due to its performance improvement over traditional Single-Task Learning (STL). However, MTL is often challenging because there is an exponential number of possible task groupings, and choosing the best one is difficult because some groupings can degrade performance through negative interference between tasks. As a result, existing solutions suffer from severe scalability issues that limit practical application. In our paper, we propose a new data-driven method that addresses these challenges and provides a scalable and modular solution for classification task grouping based on re-proposed data-driven features, Data Maps, which capture the training dynamics for each classification task during MTL training. Through a theoretical comparison with other techniques, we show that our approach has superior scalability. Our experiments show better performance and verify the method's effectiveness, even on an unprecedented number of tasks (up to 100 tasks on CIFAR100). As the first to work on such a number of tasks, we compare the resulting grouping with the one given in the CIFAR100 dataset and find them similar. Finally, we provide a modular implementation for easier integration and testing, with examples from multiple datasets and tasks.

https://doi.org/10.1088/2632-2153/ad4e04

The following article is Open access

Global System Errors to Simultaneously Improve the Identification of Subsystems with Mixed Data Gaussian Process Regression

LaMack et al

This paper explores the use of Gaussian Process Regression (GPR) for system identification in control engineering. It introduces two novel approaches that utilize the data from a measured global system error. The paper demonstrates these approaches by identifying a simulated system with three subsystems, a one degree of freedom mass with two antagonist muscles. The first approach uses this whole-system error data alone, achieving accuracy on the same order of magnitude as subsystem-specific data (9.28 ± 0.87 N vs. 6.96 ± 0.32 N of total model errors). This is significant, as it shows that the same data set can be used to identify unique subsystems, as opposed to requiring a set of data descriptive of only a single subsystem. The second approach demonstrated in this paper mixes traditional subsystem-specific data with the whole system error data, achieving up to 98.71% model improvement.

https://doi.org/10.1088/2632-2153/ad4e05

The following article is Open access

Journey over Destination: Dynamic Sensor Placement Enhances Generalization

Marcato et al

Reconstructing complex, high-dimensional global fields from limited data points is a challenge across various scientific and industrial domains. This is particularly important for recovering spatio-temporal fields using sensor data from, for example, laboratory-based scientific experiments, weather forecasting, or drone surveys. Given the prohibitive costs of specialized sensors and the inaccessibility of certain regions of the domain, achieving full field coverage is typically not feasible. Therefore, the development of machine learning algorithms trained to reconstruct fields given a limited dataset is of critical importance. In this study, we introduce a general approach that employs moving sensors to enhance data exploitation during the training of an attention based neural network, thereby improving field reconstruction. The training of sensor locations is accomplished using an end-to-end workflow, ensuring differentiability in the interpolation of field values associated to the sensors, and is simple to implement using differentiable programming. Additionally, we have incorporated a correction mechanism to prevent sensors from entering invalid regions within the domain. We evaluated our method using two distinct datasets; the results show that our approach enhances learning, as evidenced by improved test scores.

https://doi.org/10.1088/2632-2153/ad4e06

The following article is Open access

Enhancing Particle String Detection in Electrorheological Plasmas Using Asymmetrical Kernel Convolutional Networks

Klein et al

Under different plasma conditions and electric fields in a complex plasma, the plasma particles organize themselves in a string-like or chain-like manner. A phase transition from a string-like to an isotropic particle distribution is observed at different electrical conditions. The streaming of charged ions around plasma particles with the surrounding electric field gives the plasma its electrorheological properties. The visibility of individual particles in a complex plasma opens up the opportunity to examine properties and phase transitions of such electrorheological fluids in detail. Because of the limited one-dimensional symmetry, determining the configuration of a particle and recognizing strings in particle distributions is not always straightforward. Several approaches have already been used to analyse particle clouds while either considering each particle locally or considering the particle cloud as a whole without providing information about single particle configurations. This paper presents a new machine learning approach that takes advantage of particle distributions over the entire particle cloud and detects all string-like particles at once, using a convolutional neural network in the form of an encoder-decoder network with asymmetric kernel convolutions. This not only enhances the result quality but also accelerates the evaluation process, possibly enabling real-time analyses of electrorheological phase transitions, while achieving an accuracy of over 95% on manually labelled data.

https://doi.org/10.1088/2632-2153/ad4d3e
