Publications
See also my Google Scholar. Asterisks (*) indicate papers on which I am the sole or co-first author, or the sole or co-corresponding author. Frequent collaborators have their websites linked in the author lists below.
Preprints
- Eviatar Bach*, Dan Crisan, and Michael Ghil. Nov 2024
There is a history of simple forecast error growth models designed to capture the key properties of error growth in operational numerical weather prediction (NWP) models. We propose here such a scalar model that relies on the previous ones and incorporates multiplicative noise in a nonlinear stochastic differential equation (SDE). We analyze the properties of the SDE, including the shape of the error growth curve for small times and its stationary distribution, and prove well-posedness and positivity of solutions. We then fit this model to operational NWP error growth curves, showing good agreement with both the mean and probabilistic features of the error growth. These results suggest that the dynamic-stochastic error growth model proposed herein and similar ones could play a role in many other areas of the sciences that involve prediction.
- Eviatar Bach*, Ricardo Baptista, Daniel Sanz-Alonso, and Andrew Stuart. Oct 2024
The aim of these notes is to demonstrate the potential for ideas in machine learning to impact on the fields of inverse problems and data assimilation. The perspective is one that is primarily aimed at researchers from inverse problems and/or data assimilation who wish to see a mathematical presentation of machine learning as it pertains to their fields. As a by-product, we include a succinct mathematical treatment of various topics in machine learning.
- Enoch Luk, Eviatar Bach*, Ricardo Baptista, and Andrew Stuart. Jun 2024
Filtering – the task of estimating the conditional distribution of states of a dynamical system given partial, noisy, observations – is important in many areas of science and engineering, including weather and climate prediction. However, the filtering distribution is generally intractable to obtain for high-dimensional, nonlinear systems. Filters used in practice, such as the ensemble Kalman filter (EnKF), are biased for nonlinear systems and have numerous tuning parameters. Here, we present a framework for learning a parameterized analysis map – the map that takes a forecast distribution and observations to the filtering distribution – using variational inference. We show that this methodology can be used to learn gain matrices for filtering linear and nonlinear dynamical systems, as well as inflation and localization parameters for an EnKF. Future work will apply this framework to learn new filtering algorithms.
2024
- David Vishny, Matthias Morzfeld, Kyle Gwirtz, Eviatar Bach, Oliver R. A. Dunbar, and Daniel Hodyss. Journal of Advances in Modeling Earth Systems, Aug 2024
We synthesize knowledge from numerical weather prediction, inverse theory, and statistics to address the problem of estimating a high-dimensional covariance matrix from a small number of samples. This problem is fundamental in statistics, machine learning/artificial intelligence, and in modern Earth science. We create several new adaptive methods for high-dimensional covariance estimation, but one method, which we call Noise-Informed Covariance Estimation (NICE), stands out because it has three important properties: (a) NICE is conceptually simple and computationally efficient; (b) NICE guarantees symmetric positive semi-definite covariance estimates; and (c) NICE is largely tuning-free. We illustrate the use of NICE on a large set of Earth science–inspired numerical examples, including cycling data assimilation, inversion of geophysical field data, and training of feed-forward neural networks with time-averaged data from a chaotic dynamical system. Our theory, heuristics and numerical tests suggest that NICE may indeed be a viable option for high-dimensional covariance estimation in many Earth science problems.
- Gaston Manta, Eviatar Bach*, Stefanie Talento, Marcelo Barreiro, Sabrina Speich, and Michael Ghil. Scientific Reports, Jul 2024
This study analyzes coupled atmosphere–ocean variability in the South Atlantic Ocean. To do so, we characterize the spatio-temporal variability of annual mean sea-surface temperature (SST) and sea-level pressure (SLP) using Multichannel Singular Spectrum Analysis (M-SSA). We applied M-SSA to ERA5 reanalysis data (1959–2022) of South Atlantic SST and SLP, both individually and jointly, and identified a nonlinear trend, as well as two climate oscillations. The leading oscillation, with a period of 13 years, consists of a basin-wide southwest–northeast dipole and is observed both in the individual variables and in the coupled analysis. This mode is reminiscent of the already known South Atlantic Dipole, and it is probably related to the Pacific Decadal Oscillation and to El Niño–Southern Oscillation in the Pacific Ocean. The second oscillation has a 5-year period and also displays a dipolar structure. The main difference between the spatial structure of the decadal, 13-year, and the interannual, 5-year mode is that, in the first one, the SST cold tongue region in the southeast Atlantic’s Cape Basin is included in the pole closer to the equator. Together, these two oscillatory modes, along with the trend, capture almost 40% of the total interannual variability of the SST and SLP fields, and of their co-variability. These results provide further insights into the spatio-temporal evolution of SST and SLP variability in the South Atlantic, in particular as it relates to the South Atlantic Dipole and its predictability.
- Eviatar Bach*, V. Krishnamurthy, Safa Mote, Jagadish Shukla, A. Surjalal Sharma, Eugenia Kalnay, and Michael Ghil. Proceedings of the National Academy of Sciences, Apr 2024
Predicting the temporal and spatial patterns of South Asian monsoon rainfall within a season is of critical importance due to its impact on agriculture, water availability, and flooding. The monsoon intraseasonal oscillation (MISO) is a robust northward-propagating mode that determines the active and break phases of the monsoon and much of the regional distribution of rainfall. However, dynamical atmospheric forecast models predict this mode poorly. Data-driven methods for MISO prediction have shown more skill, but only predict the portion of the rainfall corresponding to MISO rather than the full rainfall signal. Here, we combine state-of-the-art ensemble precipitation forecasts from a high-resolution atmospheric model with data-driven forecasts of MISO. The ensemble members of the detailed atmospheric model are projected onto a lower-dimensional subspace corresponding to the MISO dynamics and are then weighted according to their distance from the data-driven MISO forecast in this subspace. We thereby achieve improvements in rainfall forecasts over India, as well as the broader monsoon region, at 10- to 30-d lead times, an interval that is generally considered to be a predictability gap. The temporal correlation of rainfall forecasts is improved by up to 0.28 in this time range. Our results demonstrate the potential of leveraging the predictability of intraseasonal oscillations to improve extended-range forecasts; more generally, they point toward a future of combining dynamical and data-driven forecasts for Earth system prediction.
- Eviatar Bach*, Tim Colonius, Isabel Scherl, and Andrew Stuart. Chaos: An Interdisciplinary Journal of Nonlinear Science, Mar 2024
We consider the problem of filtering dynamical systems, possibly stochastic, using observations of statistics. Thus, the computational task is to estimate a time-evolving density ρ(v, t) given noisy observations of the true density ρ†; this contrasts with the standard filtering problem based on observations of the state v. The task is naturally formulated as an infinite-dimensional filtering problem in the space of densities ρ. However, for the purposes of tractability, we seek algorithms in state space; specifically, we introduce a mean-field state-space model, and using interacting particle system approximations to this model, we propose an ensemble method. We refer to the resulting methodology as the ensemble Fokker–Planck filter (EnFPF). Under certain restrictive assumptions, we show that the EnFPF approximates the Kalman–Bucy filter for the Fokker–Planck equation, which is the exact solution to the infinite-dimensional filtering problem. Furthermore, our numerical experiments show that the methodology is useful beyond this restrictive setting. Specifically, the experiments show that the EnFPF is able to correct ensemble statistics, to accelerate convergence to the invariant density for autonomous systems, and to accelerate convergence to time-dependent invariant densities for non-autonomous systems. We discuss possible applications of the EnFPF to climate ensembles and to turbulence modeling.
- Zhengjie Xu, Yan Li, Yingzuo Qin, and Eviatar Bach. Solar Energy, Jan 2024
The rapid development of solar energy worldwide has attracted increasing attention due to its climatic and environmental impacts. Using MODIS data, we quantified the effects of solar farms (SFs) on albedo, vegetation (using enhanced vegetation index (EVI) as a proxy), and land surface temperature (LST) based on 116 large SFs across the world. The results show that the installation of SFs decreased the annual mean surface shortwave albedo by 0.016 ± 0.009 (mean ± 1 STD) and reduced the EVI by 0.015 ± 0.019 relative to the surrounding areas. SFs produced a strong cooling effect of −0.49 ± 0.43 K in the annual mean land surface temperature during the daytime and a weaker cooling effect of −0.21 ± 0.25 K during the nighttime. The greatest impacts on albedo and daytime LST were observed in barren land, followed by grassland and cropland, while the opposite order applied for vegetation impact. In terms of seasonal and latitudinal variations, the largest impact was observed at high latitudes in winter on albedo, at mid-latitudes in summer on vegetation, and at low latitudes in spring–summer transitions on daytime LST. Correlation analysis showed that the albedo and LST impacts were enhanced over large SFs with high capacity. The vegetation and LST impacts were both correlated with geographic and climatic factors and dependent on the type of SF (photovoltaic or concentrating solar power). Our global assessment provides observational evidence for the effects of SF construction on the environment and local climate, which can help the sustainable development of solar energy.
2023
- Ashesh Chattopadhyay, Ebrahim Nabizadeh, Eviatar Bach, and Pedram Hassanzadeh. Journal of Computational Physics, Mar 2023
Data assimilation (DA) is a key component of many forecasting models in science and engineering. DA allows one to estimate better initial conditions using an imperfect dynamical model of the system and noisy/sparse observations available from the system. Ensemble Kalman filter (EnKF) is a DA algorithm that is widely used in applications involving high-dimensional nonlinear dynamical systems. However, EnKF requires evolving large ensembles of forecasts using the dynamical model of the system. This often becomes computationally intractable, especially when the number of states of the system is very large, e.g., for weather prediction. With small ensembles, the estimated background error covariance matrix in the EnKF algorithm suffers from sampling error, leading to an erroneous estimate of the analysis state (initial condition for the next forecast cycle). In this work, we propose hybrid ensemble Kalman filter (H-EnKF), which is applied to a two-layer quasi-geostrophic turbulent flow as a test case. This framework utilizes a pre-trained deep learning-based data-driven surrogate that inexpensively generates and evolves a large data-driven ensemble of the states to accurately compute the background error covariance matrix with smaller sampling errors. The H-EnKF framework outperforms EnKF with only dynamical model or only the data-driven surrogate, and estimates a better initial condition without the need for any ad-hoc localization strategies. H-EnKF can be extended to any ensemble-based DA algorithm, e.g., particle filters, which are currently too expensive to use for high-dimensional systems.
- Eviatar Bach* and Michael Ghil. Journal of Advances in Modeling Earth Systems, Jan 2023
Data assimilation (DA) aims to optimally combine model forecasts and observations that are both partial and noisy. Multi-model DA generalizes the variational or Bayesian formulation of the Kalman filter, and we prove that it is also the minimum variance linear unbiased estimator. Here, we formulate and implement a multi-model ensemble Kalman filter (MM-EnKF) based on this framework. The MM-EnKF can combine multiple model ensembles for both DA and forecasting in a flow-dependent manner; it uses adaptive model error estimation to provide matrix-valued weights for the separate models and the observations. We apply this methodology to various situations using the Lorenz96 model for illustration purposes. Our numerical experiments include multiple models with parametric error, different resolved scales, and different fidelities. The MM-EnKF results in significant error reductions compared to the best model, as well as to an unweighted multi-model ensemble, with respect to both probabilistic and deterministic error metrics.
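The minimum-variance property mentioned above can be illustrated in the simplest possible setting. The sketch below is not the MM-EnKF, which uses matrix-valued, flow-dependent weights; it assumes scalar estimates with independent, unbiased errors of known variance, in which case the minimum variance linear unbiased combination is the inverse-variance weighted mean. The function name `combine_estimates` is hypothetical.

```python
def combine_estimates(estimates, variances):
    """Inverse-variance weighted mean of independent, unbiased scalar estimates.

    Assumption for illustration: errors are independent with known variances.
    Returns the combined estimate and its variance.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total


# Two models with equal error variance: the combined estimate is their
# average, and the variance is halved.
combined, combined_var = combine_estimates([1.0, 3.0], [1.0, 1.0])
print(combined, combined_var)
```

The combined variance is never larger than the smallest input variance, which is the scalar analogue of a multi-model estimate improving on the best single model.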
2022
- Oliver R. A. Dunbar, Ignacio Lopez-Gomez, Alfredo Garbuno-Iñigo, Daniel Zhengyu Huang, Eviatar Bach, and Jin-long Wu. Journal of Open Source Software, Dec 2022
- Ashesh Chattopadhyay, Mustafa Mustafa, Pedram Hassanzadeh, Eviatar Bach, and Karthik Kashinath. Geoscientific Model Development, Mar 2022
There is growing interest in data-driven weather prediction (DDWP), e.g., using convolutional neural networks such as U-NET that are trained on data from models or reanalysis. Here, we propose three components, inspired by physics, to integrate with commonly used DDWP models in order to improve their forecast accuracy. These components are (1) a deep spatial transformer added to the latent space of U-NET to capture rotation and scaling transformation in the latent space for spatiotemporal data, (2) a data-assimilation (DA) algorithm to ingest noisy observations and improve the initial conditions for next forecasts, and (3) a multi-time-step algorithm, which combines forecasts from DDWP models with different time steps through DA, improving the accuracy of forecasts at short intervals. To show the benefit and feasibility of each component, we use geopotential height at 500 hPa (Z500) from ERA5 reanalysis and examine the short-term forecast accuracy of specific setups of the DDWP framework. Results show that the spatial-transformer-based U-NET (U-STN) clearly outperforms the U-NET, e.g., improving the forecast skill by 45 %. Using a sigma-point ensemble Kalman (SPEnKF) algorithm for DA and U-STN as the forward model, we show that stable, accurate DA cycles are achieved even with high observation noise. This DDWP+DA framework substantially benefits from large (O(1000)) ensembles that are inexpensively generated with the data-driven forward model in each DA cycle. The multi-time-step DDWP+DA framework also shows promise; for example, it reduces the average error by factors of 2–3. These results show the benefits and feasibility of these three components, which are flexible and can be used in a variety of DDWP setups. Furthermore, while here we focus on weather forecasting, the three components can be readily adopted for other parts of the Earth system, such as ocean and land, for which there is a rapid growth of data and need for forecast and assimilation.
- Yingzuo Qin, Yan Li, Ru Xu, Chengcheng Hou, Alona Armstrong, Eviatar Bach, Yang Wang, and Bojie Fu. Environmental Research Letters, Feb 2022
The development of wind energy is essential for decarbonizing energy production. However, the construction of wind farms changes land surface temperature (LST) and vegetation by modifying land surface properties and disturbing land–atmosphere interactions. In this study, we used moderate resolution imaging spectroradiometer satellite data to quantify the impacts on local climate and vegetation of 319 wind farms in the United States. Our results indicated insignificant impacts on LST during the daytime but significant warming of 0.10 °C of annual mean nighttime LST averaged over all wind farms, and 0.36 °C for the 61% of wind farms with warming. The nighttime LST impacts exhibited seasonal variations, with stronger warming in winter and autumn, up to 0.18 °C, but weaker effects in summer and spring. We observed a decrease in peak normalized difference vegetation index (NDVI) for 59% of wind farms due to infrastructure construction, with an average reduction of 0.0067 compared to non-wind farm areas. The impacts of wind farms depended on wind farm size, with winter LST impacts for large and small wind farms ranging from 0.21 °C to 0.14 °C, and peak NDVI impacts ranging from -0.009 to -0.006. The LST impacts declined with the increasing distance from the wind farm, with detectable impacts up to 10 km. In contrast, the vegetation impacts on NDVI were only evident within the wind farm locations. Wind farms built in grassland and cropland showed larger warming effects but weaker vegetation impact than those built on forests. Furthermore, spatial correlation analyses with environmental factors suggest limited geographical controls on the heterogeneous wind farm impacts and highlight the important role of local factors. Our analyses based on a large sample offer new evidence for wind farm impacts with improved representativeness compared to previous studies. This knowledge is important to fully understand the climatic and environmental implications of energy system decarbonization.
2021
- Journal of Climate, Jul 2021
Oscillatory modes of the climate system are among its most predictable features, especially at intraseasonal time scales. These oscillations can be predicted well with data-driven methods, often with better skill than dynamical models. However, since the oscillations only represent a portion of the total variance, a method for beneficially combining oscillation forecasts with dynamical forecasts of the full system was not previously known. We introduce Ensemble Oscillation Correction (EnOC), a general method to correct oscillatory modes in ensemble forecasts from dynamical models. We compute the ensemble mean—or the ensemble probability distribution—with only the best ensemble members, as determined by their discrepancy from a data-driven forecast of the oscillatory modes. We also present an alternate method that uses ensemble data assimilation to combine the oscillation forecasts with an ensemble of dynamical forecasts of the system (EnOC-DA). The oscillatory modes are extracted with a time series analysis method called multichannel singular spectrum analysis (M-SSA), and forecast using an analog method. We test these two methods using chaotic toy models with significant oscillatory components and show that they robustly reduce error compared to the uncorrected ensemble. We discuss the applications of this method to improve prediction of monsoons as well as other parts of the climate system. We also discuss possible extensions of the method to other data-driven forecasts, including machine learning.
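The member-selection idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published EnOC implementation: the function names, the Euclidean discrepancy measure, and the `project` function (standing in for the M-SSA-based extraction of the oscillatory modes) are all hypothetical.

```python
import math


def select_best_members(ensemble, oscillation_forecast, project, n_best):
    """Keep the n_best members closest to a data-driven oscillation forecast.

    ensemble: list of state vectors (lists of floats)
    oscillation_forecast: target vector in the oscillation subspace
    project: hypothetical map from a full state to the oscillation subspace
    """
    def discrepancy(member):
        # Assumed measure: Euclidean distance in the oscillation subspace.
        return math.dist(project(member), oscillation_forecast)

    return sorted(ensemble, key=discrepancy)[:n_best]


def ensemble_mean(members):
    """Component-wise mean of a list of state vectors."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]


# Toy example: the first state component plays the role of the oscillation.
ensemble = [[1.0, 5.0], [2.0, 1.0], [2.5, 9.0], [8.0, 3.0]]
best = select_best_members(ensemble, [2.0], lambda state: [state[0]], n_best=2)
corrected_mean = ensemble_mean(best)
print(best, corrected_mean)
```

In this toy setting the corrected mean is computed only from the two members whose oscillatory component lies nearest the data-driven forecast; how many members to retain is a tuning choice left open here.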
- Eviatar Bach*. SoftwareX, Jan 2021
We introduce parasweep, a free and open-source utility for facilitating parallel parameter sweeps with computational models. Instead of requiring parameters to be passed by command-line, which can be error-prone and time-consuming, parasweep leverages the model’s existing configuration files using a template system, requiring minimal code changes. parasweep supports a variety of different sweep types, generating parameter sets accordingly and dispatching a parallel job for each set, with support for local execution as well as common high-performance computing (HPC) job schedulers. Post-processing is facilitated by providing a mapping between the parameter sets and the simulations. We demonstrate the usage of parasweep with an example.
2019
- Eviatar Bach*, Safa Motesharrei, Eugenia Kalnay, and Alfredo Ruiz-Barradas. Journal of Climate, Nov 2019
Due to the physical coupling between atmosphere and ocean, information about the ocean helps to better predict the future of the atmosphere, and in turn, information about the atmosphere helps to better predict the ocean. Here, we investigate the spatial and temporal nature of this predictability: where, for how long, and at what frequencies does the ocean significantly improve prediction of the atmosphere, and vice versa? We apply Granger causality, a statistical test to measure whether a variable improves prediction of another, to local time series of sea surface temperature (SST) and low-level atmospheric variables. We calculate the detailed spatial structure of the atmosphere-to-ocean and ocean-to-atmosphere predictability. We find that the atmosphere improves prediction of the ocean most in the extratropics, especially in regions of large SST gradients. This atmosphere-to-ocean predictability is weaker but longer-lived in the tropics, where it can last for several months in some regions. On the other hand, the ocean improves prediction of the atmosphere most significantly in the tropics, where this predictability lasts for months to over a year. However, we find a robust signature of the ocean on the atmosphere almost everywhere in the extratropics, an influence that has been difficult to demonstrate with model studies. We find that both the atmosphere-to-ocean and ocean-to-atmosphere predictability are maximal at low frequencies, and both are larger in the summer hemisphere. The patterns we observe generally agree with dynamical understanding and the results of the Kalnay dynamical rule, which diagnoses the direction of forcing between the atmosphere and ocean by considering the local phase relationship between simultaneous sea surface temperature and vorticity anomaly signals. We discuss applications to coupled data assimilation.
- Stephen G. Penny, Eviatar Bach, Kriti Bhargava, Chu-Chun Chang, Cheng Da, Luyu Sun, and Takuma Yoshida. Journal of Advances in Modeling Earth Systems, Jun 2019
Strongly coupled data assimilation (SCDA) views the Earth as one unified system. This allows observations to have an instantaneous impact across boundaries such as the air-sea interface when estimating the state of each individual component. Operational prediction centers are moving toward Earth system modeling for all forecast timescales, ranging from days to months. However, there have been few studies that examine fundamental aspects of SCDA and the transition from traditional approaches that apply data assimilation only to a single component, whether forecasts were derived from a coupled model or an uncoupled forced model. The SCDA approach is examined here in detail using numerical experiments with a simple coupled atmosphere-ocean quasi-geostrophic model. The impact of coupling is explored with respect to its impact on the Lyapunov spectrum and on data assimilation system stability. Different data assimilation methods are compared within the context of SCDA, including the 3-D and 4-D Variational methods, the ensemble Kalman filter, and the hybrid gain method. The impact of observing system coverage is also investigated. We find that SCDA is generally superior to weakly coupled or uncoupled approaches. Dynamically defined background error covariance estimates are essential for SCDA to achieve an accurate coupled state estimate as the observing system becomes sparser. As a clarification of seemingly contradictory findings from previous studies, it is shown that ocean observations can adequately constrain atmospheric state estimates provided that the analysis-observing frequency is sufficiently high and the ensemble size determining the background error covariance is sufficiently large.
2018
- Yan Li, Eugenia Kalnay, Safa Motesharrei, Jorge Rivas, Fred Kucharski, Daniel Kirk-Davidoff, Eviatar Bach, and Ning Zeng. Science, Sep 2018
Wind and solar farms offer a major pathway to clean, renewable energies. However, these farms would significantly change land surface properties, and, if sufficiently large, the farms may lead to unintended climate consequences. In this study, we used a climate model with dynamic vegetation to show that large-scale installations of wind and solar farms covering the Sahara lead to a local temperature increase and more than a twofold precipitation increase, especially in the Sahel, through increased surface friction and reduced albedo. The resulting increase in vegetation further enhances precipitation, creating a positive albedo–precipitation–vegetation feedback that contributes ~80% of the precipitation increase for wind farms. This local enhancement is scale dependent and is particular to the Sahara, with small impacts in other deserts.
- Eviatar Bach*, Valentina Radić, and Christian Schoof. Journal of Glaciology, Apr 2018
Simple models of glacier volume evolution are important in understanding features of glacier response to climate change, due to the scarcity of data adequate for running more complex models on a global scale. Two quantities of interest in a glacier’s response to climate changes are its response time and its volume sensitivity to changes in the equilibrium line altitude (ELA). We derive a simplified, computationally inexpensive model of glacier volume evolution based on a block model with volume–area–length scaling. After analyzing its steady-state properties, we apply the model to each mountain glacier worldwide and estimate regionally differentiated response times and sensitivities to ELA changes. We use a statistical method from the family of global sensitivity analysis methods to determine the glacier quantities, geometric and climatic, that most influence the model output. The response time is dominated by the climatic setting reflected in the mass-balance gradient in the ablation zone, followed by the surface slope, while volume sensitivity is mainly affected by glacier size, followed by the surface slope.