Relevant ArXiv eess Papers - 2025-08-06

An AI-driven EDA Algorithm-Empowered VCO and LDO Co-Design Method

Authors: Yijia Hao, Maarten Strackx, Miguel Gandara, Sandy Cochran, Bo Liu

Traditionally, the output noise and power supply rejection of low-dropout regulators (LDOs) are optimized to minimize power supply fluctuations, reducing their impact on the low-frequency noise of target voltage-controlled oscillators (VCOs). However, this sequential design approach does not fully address the trade-offs between high-frequency and LDO-induced low-frequency phase noise. To overcome this limitation, this paper presents a co-design method for low phase-noise LC-tank VCOs powered by LDOs. It is difficult to carry out the co-design using traditional manual design techniques. Hence, an efficient AI-driven EDA algorithm is used. To validate the proposed method, a 5.6 GHz LC-tank VCO with an integrated LDO is designed using a 65 nm CMOS process. Simulations show that the co-design method improves phase noise by 1.2 dB at a 1 MHz offset and reduces dynamic power consumption by 28.8%, with FoM increased by 2.4 dBc/Hz compared to the conventional sequential design method.
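
The FoM quoted above trades phase noise against oscillation frequency and power. The abstract does not say which definition the authors use, so the following is only a minimal sketch. It assumes the widely used oscillator figure of merit FoM = L(df) - 20*log10(f0/df) + 10*log10(P_DC / 1 mW), where more negative is better (some papers flip the sign). The function name and all numeric inputs are hypothetical placeholders, not values from the paper.

```python
import math


def vco_fom_dbc_hz(phase_noise_dbc_hz, f_osc_hz, f_offset_hz, p_dc_w):
    """Common oscillator figure of merit in dBc/Hz.

    Under this sign convention a more negative value is better.
    """
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_osc_hz / f_offset_hz)
            + 10.0 * math.log10(p_dc_w / 1e-3))


# Hypothetical operating points: same carrier, offset, and power,
# with phase noise at the 1 MHz offset lowered by 1.2 dB.
baseline = vco_fom_dbc_hz(-120.0, 5.6e9, 1e6, 5e-3)
improved = vco_fom_dbc_hz(-121.2, 5.6e9, 1e6, 5e-3)
fom_gain_db = baseline - improved  # positive means the FoM got better
```

With power held fixed, the FoM improves by exactly the phase-noise improvement; any power reduction adds 10*log10 of the power ratio on top of that.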

Integrating Upstream Supply Chains into Generation Expansion Planning

Authors: Boyu Yao, Andrey Bernstein, Yury Dvorkin

Rising electricity demand underscores the need for secure and reliable generation expansion planning that accounts for upstream supply chain constraints. Traditional models often overlook limitations in materials, manufacturing capacity, lead times for deployment, and field availability, which can delay the availability of planned resources and thus threaten system reliability. This paper introduces a multi-stage supply chain-constrained generation expansion planning (SC-GEP) model that optimizes long-term investments while capturing material availability, production limits, spatial and temporal constraints, and material reuse from retired assets. A decomposition algorithm efficiently solves the resulting MILP. A Maryland case study shows that supply chain constraints shift technology choices, amplify deployment delays caused by lead times, and prompt earlier investment in shorter lead-time, low-material-intensity options. In the low-demand scenario, supply chain constraints raise investment costs by $1.2 billion. Under high demand, persistent generation and reserve shortfalls emerge, underscoring the need to integrate upstream constraints into long-term planning.

Power System Voltage Stability Boundary: Computational Results and Applications

Authors: Zhenyao Li, Yifan Yao, Deqiang Gan

The objective of this paper is to report some computational results for the theory of DAE stability boundary, with the aim of advancing applications in power system voltage stability studies. Firstly, a new regularization transformation for standard differential-algebraic equations (DAEs) is proposed. Then the existence of anchor points on voltage stability boundary is examined, and an optimization method for computing the controlling pseudo-saddle is suggested. Subsequently, a local representation of the stable manifold of the pseudo-saddle on the stability boundary is presented, and a voltage stability margin expression is obtained. Finally, the proposed results are verified using several examples, demonstrating the accuracy and effectiveness of the suggested methods.

Grid-Forming Vector Current Control FRT Modes Under Symmetrical and Asymmetrical Faults

Authors: Ognjen Stanojev, Orcun Karaca, Mario Schweizer

Recent research has shown that operating grid-connected converters using the grid-forming vector current control (GFVCC) scheme offers significant benefits, including the simplicity and modularity of the control architecture, as well as enabling a seamless transition from PLL-based grid-following control to grid-forming. An important aspect of any grid-connected converter control strategy is the handling of grid-fault scenarios such as symmetrical and asymmetrical short-circuit faults. This paper presents several fault ride-through (FRT) strategies for GFVCC that enable the converter to provide fault current and stay synchronized to the grid while respecting the converter hardware limitations and retaining grid-forming behavior. The converter control scheme is extended in a modular manner to include negative-sequence loops, and the proposed FRT strategies address both symmetrical and asymmetrical faults. The proposed FRT strategies are analyzed through case studies, including infinite-bus setups and multi-unit grids.

AI-driven Wireless Positioning: Fundamentals, Standards, State-of-the-art, and Challenges

Authors: Guangjin Pan, Yuan Gao, Yilin Gao, Wenjun Yu, Zhiyong Zhong, Xiaoyu Yang, Xinyu Guo, Shugong Xu

Wireless positioning technologies hold significant value for applications in autonomous driving, extended reality (XR), unmanned aerial vehicles (UAVs), and more. With the advancement of artificial intelligence (AI), leveraging AI to enhance positioning accuracy and robustness has emerged as a field full of potential. Driven by the requirements and functionalities defined in the 3rd Generation Partnership Project (3GPP) standards, AI/machine learning (ML)-based cellular positioning is becoming a key technology to overcome the limitations of traditional methods. This paper presents a comprehensive survey of AI-driven cellular positioning. We begin by reviewing the fundamentals of wireless positioning and AI models, analyzing their respective challenges and synergies. We provide a comprehensive review of the evolution of 3GPP positioning standards, with a focus on the integration of AI/ML in current and upcoming standard releases. Guided by the 3GPP-defined taxonomy, we categorize and summarize state-of-the-art (SOTA) research into two major classes: AI/ML-assisted positioning and direct AI/ML-based positioning. The former includes line-of-sight (LOS)/non-line-of-sight (NLOS) detection, time of arrival (TOA)/time difference of arrival (TDOA) estimation, and angle prediction; the latter encompasses fingerprinting, knowledge-assisted learning, and channel charting. Furthermore, we review representative public datasets and conduct performance evaluations of AI-based positioning algorithms using these datasets. Finally, we conclude by summarizing the challenges and opportunities of AI-driven wireless positioning.

Relevant ArXiv eess Papers - 2025-08-05

Estimating Reliability of Electric Vehicle Charging Ecosystem using the Principle of Maximum Entropy

Authors: Himanshu Tripathi, Subash Neupane, Shahram Rahimi, Noorbakhsh Amiri Golilarz, Sudip Mittal, Mohammad Sepehrifar

This paper addresses the critical challenge of estimating the reliability of Electric Vehicle (EV) charging systems when facing risks such as overheating, unpredictable weather, and cyberattacks. Traditional methods for predicting failures often rely on past data or limiting assumptions, making them ineffective for new or less common threats that result in failure. To solve this issue, we utilize the Principle of Maximum Entropy (PME), a statistical tool that estimates risks even with limited information. PME works by balancing known constraints to create unbiased predictions without guessing missing details. Using the EV charging ecosystem as a case study, we show how PME models stress factors responsible for failure. Our findings reveal a critical insight: even minor, localized stress events can trigger disproportionately large drops in overall system reliability, similar to a domino effect. Our PME model demonstrates how high-impact components, such as the power grid, are more likely to fail as stress accumulates, creating network-wide tipping points. Beyond EVs, this approach applies to any complex system with incomplete data, such as smart grids, healthcare devices, or logistics networks. By mathematically establishing an inverse relationship between uncertainty (entropy) and reliability, our work quantifies how greater system unpredictability directly degrades robustness. This offers a universal tool to improve decision-making under unpredictable conditions. This work bridges advanced mathematics with real-world engineering, providing actionable insights for policymakers and industries to build safer, more efficient systems in our increasingly connected world.
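
The idea of "balancing known constraints without guessing missing details" can be illustrated with the textbook form of the Principle of Maximum Entropy. This is not the authors' reliability model. It is a minimal sketch that assumes a discrete set of stress levels and a single known constraint, the mean stress. The function names, the five stress levels, and the target means are all hypothetical.

```python
import numpy as np


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum-entropy pmf on `values` subject to a fixed mean.

    The solution has the Gibbs form p_i proportional to exp(lam * x_i).
    The mean increases monotonically with lam, so lam is found by bisection.
    """
    x = np.asarray(values, dtype=float)

    def pmf(lam):
        z = lam * x
        w = np.exp(z - z.max())  # shift the exponent for numerical stability
        return w / w.sum()

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pmf(mid) @ x < target_mean:
            lo = mid
        else:
            hi = mid
    return pmf(0.5 * (lo + hi))


def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability states."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


levels = np.arange(5)                       # hypothetical stress levels 0..4
p_uniform = maxent_distribution(levels, 2.0)  # mean at the midpoint
p_skewed = maxent_distribution(levels, 1.0)   # tighter constraint on the mean
```

When the constraint carries no information beyond symmetry (mean at the midpoint), the result is the uniform distribution; a more informative constraint lowers the entropy.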

Consumer-based Carbon Costs: Integrating Consumer Carbon Preferences in Electricity Markets

Authors: Wenqian Jiang, Aditya Rangarajan, Line Roald

An increasing share of consumers care about the carbon footprint of their electricity. This paper proposes to integrate consumer carbon preferences in the electricity market-clearing through consumer-based carbon costs. Specifically, consumers can submit not only bids for power but also assign a cost to the carbon emissions incurred by their electricity use. We start from a centralized market clearing that maximizes social welfare under consideration of generation costs, consumer utility and consumer carbon costs. We then derive an equivalent equilibrium formulation which incorporates a carbon allocation problem and gives rise to a set of carbon-adjusted electricity prices for both consumers and generators. We prove that the carbon-adjusted prices are higher for low-emitting generators and consumers with high carbon costs. Further, we prove that this new paradigm satisfies the same desirable market properties as standard electricity markets based on locational marginal prices, namely revenue adequacy and individual rationality, and demonstrate that a carbon tax on generators is equivalent to imposing a uniform carbon cost on consumers. Using a simplified three-bus system and the RTS-GMLC system, we illustrate that consumer-based carbon costs contribute to greener electricity market clearing both through generation redispatch and reductions in demand.

Centralized Dynamic State Estimation Algorithm for Detecting and Distinguishing Faults and Cyber Attacks in Power Systems

Authors: Emad Abukhousa, Syed Sohail Feroz Syed Afroz, Fahad Alsaeed, Abdulaziz Qwbaiban, A.P. Sakis Meliopoulos

As power systems evolve with increased integration of renewable energy sources, they become more complex and vulnerable to both cyber and physical threats. This study validates a centralized Dynamic State Estimation (DSE) algorithm designed to enhance the protection of power systems, particularly focusing on microgrids with substantial renewable energy integration. The algorithm, which uses a structured hypothesis testing framework, systematically identifies and differentiates anomalies caused by cyberattacks from those resulting from physical faults. This algorithm was evaluated through four case studies: a False Data Injection Attack (FDIA) via manipulation of Current Transformer (CT) ratios, a single line-to-ground (SLG) fault, and two combined scenarios involving both anomalies. Results from real-time simulations demonstrate that the algorithm effectively distinguishes between cyber-induced anomalies and physical faults, thereby significantly enhancing the reliability and security of energy systems. This research underscores the critical role of advanced diagnostic tools in protecting power systems against the growing prevalence of cyber-physical threats, enhancing the resilience of the grid and preventing potential blackouts by avoiding the mis-operation of protection relays.

Analog OFDM based on Real-Time Fourier Transformation

Authors: Xiaolu Yang, Oscar Céspedes Vicente, Christophe Caloz

This paper proposes an analog orthogonal frequency division multiplexing (OFDM) architecture based on the real-time Fourier transform (RTFT). The core enabling component is a linear-chirp phaser with engineered group velocity dispersion (GVD), which realizes RTFT and performs frequency-to-time mapping in the analog domain. In this architecture, the conventional digital fast Fourier transform (FFT) and inverse FFT (IFFT) processors are replaced by two linear-chirp phasers with opposite group delay dispersions. Theoretical analysis demonstrates that, under specific phaser conditions, the OFDM signal generated by the RTFT-based analog system is mathematically equivalent to that of a conventional digital OFDM system. This equivalence is further supported by simulation results, which confirm accurate symbol transmission and recovery, as well as robustness to multipath fading when a prefix is applied. Benefiting from the use of passive microwave components, the analog OFDM system offers ultra-fast processing with reduced power consumption. Overall, this work establishes a foundation for fully analog or hybrid analog-digital OFDM systems, offering a promising solution for next-generation high-speed, wideband, and energy-efficient wireless communication platforms.
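
For reference, the conventional digital OFDM chain that the abstract uses as its point of comparison can be sketched in a few lines. This shows only the digital IFFT/FFT baseline with a cyclic prefix, not the phaser-based analog system. The subcarrier count, prefix length, QPSK mapping, and the 3-tap channel are hypothetical choices, and the link is noise-free.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sub, cp_len = 64, 16                   # hypothetical subcarrier count and prefix length
channel = np.array([1.0, 0.5, 0.25j])    # hypothetical 3-tap multipath channel

# Map random bits to unit-energy QPSK symbols, one per subcarrier.
bits = rng.integers(0, 2, size=(n_sub, 2))
symbols = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

# Transmitter: IFFT to the time domain, then prepend the cyclic prefix.
time_signal = np.fft.ifft(symbols)
tx = np.concatenate([time_signal[-cp_len:], time_signal])

# Channel: linear convolution with the multipath impulse response.
rx = np.convolve(tx, channel)

# Receiver: drop the prefix, FFT back, and apply a one-tap equalizer.
rx_block = rx[cp_len:cp_len + n_sub]
freq_response = np.fft.fft(channel, n_sub)
recovered = np.fft.fft(rx_block) / freq_response
```

Because the prefix is longer than the channel memory, the linear convolution looks circular over the retained block, so each subcarrier sees a single complex gain and the transmitted symbols are recovered exactly.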

AudioGen-Omni: A Unified Multimodal Diffusion Transformer for Video-Synchronized Audio, Speech, and Song Generation

Authors: Le Wang, Jun Wang, Feng Deng, Chen Zhang, Di Zhang, Kun Gai

We present AudioGen-Omni - a unified approach based on multimodal diffusion transformers (MMDit), capable of generating high-fidelity audio, speech, and songs coherently synchronized with the input video. AudioGen-Omni introduces a novel joint training paradigm that seamlessly integrates large-scale video-text-audio corpora, enabling a model capable of generating semantically rich, acoustically diverse audio conditioned on multimodal inputs and adaptable to a wide range of audio generation tasks. AudioGen-Omni employs a unified lyrics-transcription encoder that encodes graphemes and phonemes from both sung and spoken inputs into dense frame-level representations. Dense frame-level representations are fused using an AdaLN-based joint attention mechanism enhanced with phase-aligned anisotropic positional infusion (PAAPI), wherein RoPE is selectively applied to temporally structured modalities to ensure precise and robust cross-modal alignment. By unfreezing all modalities and masking missing inputs, AudioGen-Omni mitigates the semantic constraints of text-frozen paradigms, enabling effective cross-modal conditioning. This joint training approach enhances audio quality, semantic alignment, and lip-sync accuracy, while also achieving state-of-the-art results on Text-to-Audio/Speech/Song tasks. With an inference time of 1.91 seconds for 8 seconds of audio, it offers substantial improvements in both efficiency and generality.

Relevant ArXiv eess Papers - 2025-08-04

Closed-form Expression for the Power Profile in Wideband Systems with Inter-channel Stimulated Raman Scattering

Authors: Lucas Alves Zischler, Chiara Lasagni, Paolo Serena, Alberto Bononi, Giammarco Di Sciullo, Divya A. Shaji, Antonio Mecozzi, Cristian Antonelli

Wideband systems experience significant inter-channel stimulated Raman scattering (ISRS) and channel-dependent losses. Due to the non-uniform attenuation profile, the combined effects of ISRS and fiber loss can only be accurately estimated using numerical methods. In this work, we present an approximate closed-form expression for the channels' power profile accounting for these combined effects. We validate the proposed expression against numerical solutions in the case of CLU transmission, showing high accuracy for both single-span and multi-span fiber-optic links. Additionally, we derive an inverse expression, formulated as a function of the output power, which can be utilized to target a desired optical signal-to-noise ratio (OSNR) profile through pre-emphasis of the launched channel powers.

Cyber-Physical Co-Simulation of Load Frequency Control under Load-Altering Attacks

Authors: Michał Forystek, Andrew D. Syrmakesis, Alkistis Kontou, Panos Kotsampopoulos, Nikos D. Hatziargyriou, Charalambos Konstantinou

Integrating Information and Communications Technology (ICT) devices into the power grid brings many benefits. However, it also exposes the grid to new potential cyber threats. Many control and protection mechanisms, such as Load Frequency Control (LFC), which maintains nominal frequency during load fluctuations, and Under Frequency Load Shedding (UFLS), which disconnects a portion of the load during an emergency, depend on information exchange through the communication network. The recently emerging Load Altering Attacks (LAAs) utilize a botnet of high-wattage devices to introduce load fluctuation. In their dynamic form (DLAAs), they manipulate the load in response to live grid frequency measurements for increased efficiency, posing a notable threat to grid stability. Recognizing the importance of communication networks in power grid cyber security research, this paper presents an open-source co-simulation environment that models the power grid with the corresponding communication network, implementing grid protective mechanisms. This setup allows the comprehensive analysis of the attacks in concrete LFC and UFLS scenarios.

Wind Power Scenario Generation based on the Generalized Dynamic Factor Model and Generative Adversarial Network

Authors: Young-ho Cho, Hao Zhu, Duehee Lee, Ross Baldick

For conducting resource adequacy studies, we synthesize multiple long-term wind power scenarios of distributed wind farms simultaneously by using the spatio-temporal features: spatial and temporal correlation, waveforms, marginal and ramp-rate distributions of waveforms, power spectral densities, and statistical characteristics. Generating the spatial correlation in scenarios requires the design of common factors for neighboring wind farms and antithetical factors for distant wind farms. The generalized dynamic factor model (GDFM) can extract the common factors through cross spectral density analysis, but it cannot closely imitate waveforms. The GAN can synthesize plausible samples representing the temporal correlation by verifying samples through a fake sample discriminator. To combine the advantages of GDFM and GAN, we use the GAN to provide a filter that extracts dynamic factors with temporal information from the observation data, and we then apply this filter in the GDFM to represent both spatial and frequency correlations of plausible waveforms. Numerical tests on the combination of GDFM and GAN have demonstrated performance improvements over competing alternatives in synthesizing wind power scenarios from Australia, better realizing plausible statistical characteristics of actual wind power compared to alternatives such as the GDFM with a filter synthesized from distributions of actual dynamic filters and the GAN with direct synthesis without dynamic factors.

DoF Analysis and Beamforming Design for Active IRS-aided Multi-user MIMO Wireless Communication in Rank-deficient Channels

Authors: Jinbing Jiang, Feng Shu, Xuehui Wang, Ke Yang, Chong Shen, Qi Zhang, Dongming Wang, Jiangzhou Wang

Due to its ability to significantly improve data rate, the intelligent reflecting surface (IRS) is a potentially crucial technique for future-generation wireless networks such as 6G. In this paper, we focus on the analysis of the degrees of freedom (DoF) in IRS-aided multi-user MIMO networks. Firstly, the DoF upper bound of an IRS-aided single-user MIMO network, i.e., the achievable maximum DoF of such a system, is derived, and the corresponding results are extended to the case of IRS-aided multi-user MIMO by using matrix rank inequalities. In particular, in seriously rank-deficient (also called low-rank) channels such as line-of-sight (LoS), the network DoF may double with the help of the IRS compared with the no-IRS case. To verify the rate performance gain from the augmented DoF, three closed-form beamforming methods are proposed to achieve the maximum DoF: null-space projection plus maximize transmit power and maximize receive power (NSP-MTP-MRP), Schmidt orthogonalization plus minimum mean square error (SO-MMSE), and two-layer leakage plus MMSE (TLL-MMSE). Simulation results show that the IRS does make a dramatic rate enhancement. For example, in a seriously rank-deficient channel, the sum-rate of the proposed TLL-MMSE aided by the IRS is about twice that of the no-IRS case. This means that the IRS may achieve a significant DoF improvement in such a channel.
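
The claim that the DoF may double in a LoS channel can be checked numerically by treating the DoF as the rank of the effective channel matrix. This is a minimal single-user sketch, not the paper's derivation or its beamforming methods. It assumes half-wavelength uniform linear arrays, pure LoS links, and unit-modulus IRS phase shifts; the array sizes and angles are hypothetical.

```python
import numpy as np


def steering(n, angle_rad):
    """Steering vector of an n-element half-wavelength uniform linear array."""
    return np.exp(1j * np.pi * np.arange(n) * np.sin(angle_rad))


n_tx, n_rx, n_irs = 8, 8, 32  # hypothetical array sizes

# Direct LoS link: an outer product of two steering vectors, hence rank one.
h_direct = np.outer(steering(n_rx, 0.3), steering(n_tx, -0.2).conj())

# Transmitter-to-IRS and IRS-to-receiver LoS links (each rank one).
h_tx_irs = np.outer(steering(n_irs, 0.7), steering(n_tx, 0.5).conj())
h_irs_rx = np.outer(steering(n_rx, -0.6), steering(n_irs, -0.4).conj())

# Unit-modulus phase shifts chosen to co-phase the cascaded path.
theta = np.diag(steering(n_irs, -0.4) * steering(n_irs, 0.7).conj())

# Effective channel: direct path plus the reflected path through the IRS.
h_eff = h_direct + h_irs_rx @ theta @ h_tx_irs

dof_direct = np.linalg.matrix_rank(h_direct)
dof_with_irs = np.linalg.matrix_rank(h_eff)
```

The reflected path contributes a second rank-one term whose arrival and departure directions differ from those of the direct path, so the rank of the sum rises from one to two.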

Relevant ArXiv eess Papers - 2025-08-01

DNN-based Methods of Jointly Sensing Number and Directions of Targets via a Green Massive H2AD MIMO Receiver

Authors: Bin Deng, Jiatong Bai, Feilong Zhao, Zuming Xie, Maolin Li, Yan Wang, Feng Shu

As a green MIMO structure, the heterogeneous hybrid analog-digital (H2AD) MIMO architecture has been shown to have great potential to replace the massive or extremely large-scale fully-digital MIMO in future wireless networks and to address the three challenging problems faced by the latter: high energy consumption, high circuit cost, and high complexity. However, how to intelligently sense the number and direction of multiple emitters via such a structure is still a hard open problem. To address this, we propose a two-stage sensing framework that jointly estimates the number and direction values of multiple targets. Specifically, three target number sensing methods are designed: an improved eigen-domain clustering (EDC) framework, an enhanced deep neural network (DNN) based on five key statistical features, and an improved one-dimensional convolutional neural network (1D-CNN) utilizing full eigenvalues. Subsequently, a low-complexity and high-accuracy DOA estimation is achieved via the introduced online micro-clustering (OMC-DOA) method. Furthermore, we derive the Cramér-Rao lower bound (CRLB) for the H2AD under multiple-source conditions as a theoretical performance benchmark. Simulation results show that the three developed methods achieve 100% accuracy in sensing the number of targets at moderate-to-high SNRs, while the improved 1D-CNN exhibits superior performance under extremely low SNR conditions. The introduced OMC-DOA outperforms existing clustering and fusion-based DOA methods in multi-source environments.

Terahertz for Radar applications and Wireless Communication

Authors: Sofiane Latreche, Hocine Bellahsene, Abdelmalik Taleb-Ahmed

Technological advancements in the design of electronic and optical materials have opened up the possibility of utilizing the latest available Radio Frequency spectrum, the Terahertz (THz) band. This band holds great promise for next-generation wireless systems, which are poised to seamlessly integrate a wide array of data-intensive and time-sensitive applications. In this article, we delve into the Terahertz band, providing insights into its properties and showcasing examples of its applications. We begin by exploring the specific characteristics of wireless communications and radar systems operating in the THz band. Subsequently, we analyze various effects and parameters unique to each of these. We scrutinize the application of Terahertz (THz) wireless and radar systems, delving into the modeling of various facets of radio frequency propagation within this domain. The interpretation of our findings will be presented at the conclusion of this study.

Experimentally-Driven Analysis of Stability in Connected Vehicle Platooning: Insights and Control Strategies

Authors: Niladri Dutta, Elham Abolfazli, Themistoklis Charalambous

This paper presents the development of a tangible platform for demonstrating the practical implementation of cooperative adaptive cruise control (CACC) systems, an enhancement to the standard adaptive cruise control (ACC) concept by means of Vehicle-to-Everything (V2X) communication. It involves a detailed examination of existing longitudinal controllers and their performance in homogeneous vehicle platoons. Moreover, extensive tests are conducted using multiple autonomous experimental vehicle platform topologies to verify the effectiveness of the controller. The outcomes from both simulations and field tests affirm the substantial benefits of the proposed CACC platooning approach in longitudinal vehicle platooning scenarios. This research is crucial due to a notable gap in the existing literature: while numerous studies focus on simulated vehicle platooning systems, there is a lack of research demonstrating these controllers on physical vehicle systems or robot platforms. This paper seeks to fill this gap by providing a practical demonstration of CACC systems in action, showcasing their potential for real-world application in intelligent transportation systems.

Foundation Models for Clean Energy Forecasting: A Comprehensive Review

Authors: Md Meftahul Ferdaus, Tanmoy Dam, Md Rasel Sarkar, Moslem Uddin, Sreenatha G. Anavatti

As global energy systems transition to clean energy, accurate renewable generation and renewable demand forecasting is imperative for effective grid management. Foundation Models (FMs) can help improve forecasting of renewable generation and demand because FMs can rapidly process complex, high-dimensional time-series data. This review paper focuses on FMs in the realm of renewable energy forecasting, primarily focusing on wind and solar. We present an overview of the architectures, pretraining strategies, finetuning methods, and types of data used in the context of renewable energy forecasting. We emphasize the role of models that are trained at a large scale, domain-specific Transformer architectures, where attention is paid to spatial-temporal correlations, the embedding of domain knowledge, and also the brief and intermittent nature of renewable generation. We assess recent FM-based advancements in forecast accuracy such as reconciling predictions over multiple time scales and quantifying uncertainty in renewable energy forecasting. We also review existing challenges and areas of improvement in long-term and multivariate time series forecasting. In this survey, a distinction between theory and practice is established regarding the use of FMs in the clean energy forecasting domain. Additionally, it critically assesses the strengths and weaknesses of FMs while advancing future research directions in this new and exciting area of forecasting.

Energy management and flexibility quantification in a discrete event distribution grid simulation

Authors: Sebastian Peter, Daniel Feismann, Johannes Bao, Thomas Oberließen, Christian Rehtanz

Distribution grid operation faces new challenges caused by a rising share of renewable energy sources and the introduction of additional types of loads to the grid. With the increasing adoption of distributed generation and emerging prosumer households, Energy Management Systems, which manage and apply flexibility of connected devices, are gaining popularity. While potentially beneficial to grid capacity, strategic energy management also adds to the complexity of distribution grid operation and planning processes. Novel approaches of time-series-based planning likewise face increasingly complex simulation scenarios and rising computational cost. Discrete event modelling helps facilitate simulations of such scenarios by restricting computation to the most relevant points in simulation time. We provide an enhancement of a discrete event distribution grid simulation software that offers fast implementation and testing of energy management algorithms, embedded into a feature-rich simulation environment. Physical models are specified using the Discrete Event System Specification. Furthermore, we contribute a communication protocol that makes use of the discrete event paradigm by only computing flexibility potential when necessary.

Asynchronous Grid Connections Providing Fast-Frequency Response: System Integration Study

Authors: Felix Wald, Amir Sajadi, Barry Mather, Giovanni De Carne

This paper presents an integration study for a power electronic-based fast-frequency response technology, an asynchronous grid connection operating as an aggregator for behind-the-meter resources and distributed generators. Both technical feasibility and techno-economic viability studies are presented. The dynamic performance of the fast-frequency response enabled by the asynchronous grid connection is validated with Power Hardware-in-the-Loop experiments and transferred to an IEEE 9-bus system in DigSilent PowerFactory for dynamic stability analysis. We demonstrate that droop-based control enhancements to the local distributed generators could allow their aggregation to provide grid-supporting functionalities and participate in the market for ancillary services. To this end, we performed a long-term simulation embedding the system within the ancillary service market framework of PJM. The fast-frequency response regulation is subsequently used to calculate the potential revenue and project the results on a 15-year investment horizon. Finally, the techno-economic analysis concludes with recommendations for enhancements to access the full potential of distributed generators on a technical and regulatory level.
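
The droop-based control mentioned above can be illustrated with a generic frequency-droop law. The abstract does not give the authors' controller, so this is a minimal sketch of a standard proportional droop with a deadband and a headroom limit. The function name and every parameter value (nominal frequency, droop percentage, deadband, ratings, headroom) are hypothetical.

```python
def droop_power_response(freq_hz, f_nominal_hz=60.0, droop=0.05,
                         p_rated_mw=1.0, deadband_hz=0.036, headroom_mw=0.2):
    """Active-power change requested from one unit for a measured frequency.

    A 5 % droop means a 5 % frequency deviation calls for 100 % of rated power.
    """
    deviation = freq_hz - f_nominal_hz
    if abs(deviation) <= deadband_hz:
        return 0.0
    # Subtract the deadband so the response is continuous at its edges.
    effective = deviation - deadband_hz if deviation > 0 else deviation + deadband_hz
    delta_p = -(effective / f_nominal_hz) / droop * p_rated_mw
    # Limit the response to the unit's available headroom in both directions.
    return max(-headroom_mw, min(headroom_mw, delta_p))


# Aggregated response of three hypothetical units to a 0.1 Hz under-frequency event.
aggregate_mw = sum(droop_power_response(59.9, p_rated_mw=p) for p in (0.5, 1.0, 2.0))
```

Summing the per-unit responses gives the aggregate fast-frequency response an aggregator could offer; the deadband keeps units idle during normal frequency fluctuations.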

PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions

Authors: Young-ho Cho, Min-Seung Ko, Hao Zhu

A sustainable electricity infrastructure requires the explicit integration of carbon emissions into power system modeling and optimization paradigms. However, existing open-source datasets for power system R&D lack generator-level carbon emission profiling, limiting the ability to benchmark and compare various carbon-aware grid operational strategies. To address this gap, this work introduces PGLib-CO2, an open-source extension to the widely adopted PGLib-OPF test case library. PGLib-CO2 enriches standard network cases with CO2 and CO2-equivalent emission intensity factors by expanding the fuel-type categorization used by PGLib-OPF, attaining a realistic generator-level carbon profiling. It is also packaged for both Python's pandapower and Julia, for a seamless, user-friendly integration of emission modeling into grid computation and optimization tasks. The dataset produced by PGLib-CO2 can support grid-based carbon accounting, emission metric evaluation, and integration into AC optimal power flow (OPF) and optimal load shifting (OLS) formulations. We demonstrate PGLib-CO2's utility through case studies that quantify cost-emission trade-offs and optimize a carbon-aware objective function. By standardizing carbon-enhanced test cases, PGLib-CO2 provides an open-source, reproducible foundation for benchmarking carbon-aware computation, facilitating future research in sustainable power system operation.

Harnessing Rydberg Atomic Receivers: From Quantum Physics to Wireless Communications

Authors: Yuanbin Chen, Xufeng Guo, Chau Yuen, Yufei Zhao, Yong Liang Guan, Chong Meng Samson See, Merouane Débbah, Lajos Hanzo

The intrinsic integration of Rydberg atomic receivers into wireless communication systems is proposed, by harnessing the principles of quantum physics in wireless communications. More particularly, we conceive a pair of Rydberg atomic receivers: one incorporates a local oscillator (LO) and is referred to as an LO-dressed receiver, while the other operates without an LO and is termed an LO-free receiver. The appropriate wireless model is developed for each configuration, elaborating on the receiver's responses to the radio frequency (RF) signal, on the potential noise sources, and on the signal-to-noise ratio (SNR) performance. The developed wireless model conforms to the classical RF framework, facilitating compatibility with established signal processing methodologies. Next, we investigate the associated distortion effects that might occur, specifically identifying the conditions under which distortion arises and demonstrating the boundaries of linear dynamic ranges. This provides critical insights into its practical implementations in wireless systems. Finally, extensive simulation results are provided for characterizing the performance of wireless systems, harnessing this pair of Rydberg atomic receivers. Our results demonstrate that LO-dressed systems achieve a significant SNR gain of approximately 40 to 50 dB over conventional RF receivers in the standard quantum limit regime. This SNR head-room translates into reduced symbol error rates, enabling efficient and reliable transmission with higher-order constellations.

Cross-layer Integrated Sensing and Communication: A Joint Industrial and Academic Perspective

Authors: Henk Wymeersch, Nuutti Tervo, Stefan Wänstedt, Sharief Saleh, Joerg Ahlendorf, Ozgur Akgul, Vasileios Tsekenis, Sokratis Barmpounakis, Liping Bai, Martin Beale, Rafael Berkvens, Nabeel Nisar Bhat, Hui Chen, Shrayan Das, Claude Desset, Antonio de la Oliva, Prajnamaya Dass, Jeroen Famaey, Hamed Farhadi, Gerhard P. Fettweis, Yu Ge, Hao Guo, Rreze Halili, Katsuyuki Haneda, Abdur Rahman Mohamed Ismail, Akshay Jain, Sylvaine Kerboeuf, Musa Furkan Keskin, Emad Ibrahim, Bilal Khan, Siddhartha Kumar, Stefan Köpsell, Apostolos Kousaridas, Pekka Kyösti, Simon Lindberg, Mohammad Hossein Moghaddam, Ahmad Nimr, Victor Pettersson, Aarno Pärssinen, Basuki Priyanto, Athanasios Stavridis, Tommy Svensson, Sonika Ujjwal

Integrated sensing and communication (ISAC) enables radio systems to simultaneously sense and communicate with their environment. This paper, developed within the Hexa-X-II project funded by the European Union, presents a comprehensive cross-layer vision for ISAC in 6G networks, integrating insights from physical-layer design, hardware architectures, AI-driven intelligence, and protocol-level innovations. We begin by revisiting the foundational principles of ISAC, highlighting synergies and trade-offs between sensing and communication across different integration levels. Enabling technologies (such as multiband operation, massive and distributed MIMO, non-terrestrial networks, reconfigurable intelligent surfaces, and machine learning) are analyzed in conjunction with hardware considerations including waveform design, synchronization, and full-duplex operation. To bridge implementation and system-level evaluation, we introduce a quantitative cross-layer framework linking design parameters to key performance and value indicators. By synthesizing perspectives from both academia and industry, this paper outlines how deeply integrated ISAC can transform 6G into a programmable and context-aware platform supporting applications from reliable wireless access to autonomous mobility and digital twinning.

Relevant ArXiv eess Papers - 2025-07-31

Optimal Planning for Enhancing the Resilience of Modern Distribution Systems Against Cyberattacks

Authors: Armita Khashayardoost, Ahmad Mohammad Saber, Deepa Kundur

The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.

Green One-Bit Quantized Precoding in Cell-Free Massive MIMO

Authors: Salih Gümüsbuğa, Ozan Alp Topal, Özlem Tuğfe Demir

Cell-free massive MIMO (multiple-input multiple-output) is expected to be one of the key technologies in sixth-generation (6G) and beyond wireless communications, offering enhanced spectral efficiency for cell-edge user equipments by employing joint transmission and reception with a large number of antennas distributed throughout the region. However, high-resolution RF chains associated with these antennas significantly increase power consumption. To address this issue, the use of low-resolution analog-to-digital and digital-to-analog converters (ADCs/DACs) has emerged as a promising approach to balance power efficiency and performance in massive MIMO networks. In this work, we propose a novel quantized precoding algorithm tailored for cell-free massive MIMO systems, where the proposed method dynamically deactivates unnecessary antennas based on the structure of each symbol vector, thereby enhancing energy efficiency. Simulation results demonstrate that our algorithm outperforms existing methods such as squared-infinity norm Douglas-Rachford splitting (SQUID) and regularized zero forcing (RZF), achieving superior performance while effectively reducing power consumption.

Assessing Value of Renewable-based VPP Versus Electrical Storage: Multi-market Participation Under Different Scheduling Regimes and Uncertainties

Authors: Hadi Nemati, Ignacio Egido, Pedro Sánchez-Martín, Álvaro Ortega

This paper compares the participation of Renewable-only Virtual Power Plants (RVPPs) and grid-scale Electrical Storage Systems (ESSs) in energy and reserve markets, evaluating their technical performance, market strategies, and economic outcomes. To ensure a fair comparison, scheduling is analyzed over representative sample days that capture seasonal operating regimes, and the associated uncertainties are explicitly modeled. Two-stage robust optimization frameworks are employed: the RVPP model addresses price, generation, and demand uncertainties, whereas the ESS model considers price uncertainty only. In addition, an algorithm is proposed for sizing the ESS so that its market performance matches that of the RVPP. Simulations cover both favorable and unfavorable scenarios, reflecting seasonal energy limits for dispatchable resources, varying forecast errors for nondispatchable resources, and alternative uncertainty-management strategies. The results provide operators with quantitative guidance on the relative value of each approach.

Measurement and Analysis of the Power Consumption of Hybrid-Amplified SCL-band Links

Authors: Ronit Sohanpal, Jiaqian Yang, Eric Sillekens, Henrique Buglia, Mingming Tan, Dini Pratiwi, Robert I. Killey, Polina Bayvel

We studied the power consumption of hybrid-amplified SCL-band links using commercial benchtop amplifiers and Raman pumps. We show a reduction in energy per bit for multi-span hybrid Raman amplified links of up to 26% versus lumped amplification.

Foundations for Energy-Aware Zero-Energy Devices: From Energy Sensing to Adaptive Protocols

Authors: Onel L. A. López, Mateen Ashraf, Samer Nasser, Gabriel M. de Jesus, Ritesh Kumar Singh, Miltiadis C. Filippou, Jeroen Famaey

Zero-energy devices (ZEDs) are key enablers of sustainable Internet of Things networks by operating solely on harvested ambient energy. Their limited and dynamic energy budget calls for protocols that are energy-aware and intelligently adaptive. However, designing effective energy-aware protocols for ZEDs requires theoretical models that realistically reflect device constraints. Indeed, existing approaches often oversimplify key aspects such as energy information (EI) acquisition, task-level variability, and energy storage dynamics, limiting their practical relevance and transferability. This article addresses this gap by offering a structured overview of the key modeling components, trade-offs, and limitations involved in energy-aware ZED protocol design. For this, we dissect EI acquisition methods and costs, characterize core operational tasks, analyze energy usage models and storage constraints, and review representative protocol strategies. Moreover, we offer design insights and guidelines on how ZED operation protocols can leverage EI, often illustrated through selected in-house examples. Finally, we outline key research directions to inspire more efficient and scalable protocol solutions for future ZEDs.

Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems

Authors: Muhammad Sharshar, Ahmad Mohammad Saber, Davor Svetinovic, Amr M. Youssef, Deepa Kundur, Ehab F. El-Saadany

The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate a human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error (MAE) of 0.075 pu in estimating attack magnitude, and an MAE of 2.19 seconds in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.

A reference frame-based microgrid primary control for ensuring global convergence to a periodic orbit

Authors: Xinyuan Jiang, Constantino M. Lagoa, Daning Huang, Yan Li

Power systems with a high penetration of renewable generation are vulnerable to frequency oscillation and voltage instability. Traditionally, the stability of power systems is considered either in terms of local stability or as an angle oscillator synchronization problem with the simplifying assumption that the dynamics of the amplitudes are on much shorter time scales. Without this assumption, however, the steady state being studied is essentially a limit cycle with the convergence of its orbit in question. In this paper, we present a method to analyze the orbital stability of a microgrid and propose a voltage controller for the inverter-interfaced renewable generators. The main hurdle to the problem lies in the constant terms in the rotating internal reference frames of each generator. We extend the shifted passivity of port-Hamiltonian systems to the analysis of limit cycles and prove that, if the system is shifted passive without considering these constant terms, then the periodic orbit is globally attractive. To the best of our knowledge, this is the first global stability result for non-nominal steady states of the microgrid in the full state space, which provides new insights into the synchronization phenomenon where the dissipativity of the system ensures convergence. The proposed controller is verified with a test microgrid, demonstrating its stability and transient smoothness compared to the standard droop control.

Mutual Coupling-Aware Channel Estimation and Beamforming for RIS-Assisted Communications

Authors: Pinjun Zheng, Simon Tarboush, Hadi Sarieddeen, Tareq Y. Al-Naffouri

This work studies the problems of channel estimation and beamforming for active reconfigurable intelligent surface (RIS)-assisted multiple-input multiple-output (MIMO) communication, incorporating the mutual coupling (MC) effect through an electromagnetically consistent model. We first demonstrate that MC can be incorporated into a compressed sensing (CS) formulation, albeit with an increase in the dimensionality of the sensing matrix. To overcome this increased complexity, we propose a two-stage strategy. Initially, a low-complexity MC-unaware CS estimation is performed to obtain a coarse channel estimate, which is then used to implement a dictionary reduction (DR) for the MC-aware estimation, effectively reducing the dimensionality of the sensing matrices. This method achieves estimation accuracy close to the direct MC-aware CS method with less overall computational complexity. Furthermore, we consider the joint optimization of RIS configuration, base station precoding, and user combining in a single-user MIMO system. We employ an alternating optimization strategy to optimize these three beamformers. The primary challenge lies in optimizing the RIS configuration, as the MC effect renders the problem non-convex and intractable. To address this, we propose a novel algorithm based on the successive convex approximation (SCA) and the Neumann series expansion. Within the SCA framework, we propose a surrogate function that rigorously satisfies both convexity and equal-gradient conditions to update the iteration direction. Numerical results validate our proposal, demonstrating that the proposed channel estimation and beamforming methods effectively manage the MC in RIS, achieving higher spectral efficiency compared to state-of-the-art approaches.

Coordinated vehicle dispatching and charging scheduling for an electric ride-hailing fleet under charging congestion and dynamic prices

Authors: Tai-Yu Ma, Richard D. Connors, Francesco Viti

Effective utilization of charging station capacity plays an important role in enhancing the profitability of ride-hailing systems using electric vehicles. Existing studies assume constant energy prices and uncapacitated charging stations or do not explicitly consider vehicle queueing at charging stations, resulting in over-optimistic charging infrastructure utilization. In this study, we develop a dynamic charging scheduling method (named CongestionAware) that anticipates vehicles' energy needs and coordinates their charging operations with real-time energy prices to avoid long waiting time at charging stations and increase the total profit of the system. A sequential mixed integer linear programming model is proposed to devise vehicles' day-ahead charging plans based on their experienced charging waiting times and energy consumption. The obtained charging plans are adapted within the day in response to vehicles' energy needs and charging station congestion. The developed charging policy is tested using NYC yellow taxi data in a Manhattan-like study area with a fleet size of 100 vehicles given the scenarios of 3000 and 4000 customers per day. The computational results show that our CongestionAware policy outperforms different benchmark policies with up to +15.06% profit and +19.16% service rate for 4000 customers per day. Sensitivity analysis is conducted with different system parameters and managerial insights are discussed.

Relevant ArXiv eess Papers - 2025-07-30

Simultaneous improvement of control and estimation for battery management systems

Authors: Mohammad S. Ramadan, Marfred Barrera, Mihai Anitescu, Sylvia Herbert

The state of charge of battery systems is an important metric typically estimated by observation models, represented by open-circuit voltage graphs. These observation models are often nonlinear in the state of charge, resulting in varying observability from a state estimation perspective. In this paper, we employ a stochastic optimal control (also known as dual control) approach to simultaneously satisfy the control objective in the state of charge of battery systems and improve estimation accuracy. This is achieved implicitly by prioritizing trajectories that pass through high-observability regions of the state space, thereby improving the quality of future measurements. We apply our algorithm to a numerical simulation of a multi-battery system and show a statistical improvement in both the control objective and the state estimation error.
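
The link between a nonlinear open-circuit-voltage curve and varying observability can be made concrete with a standard scalar measurement model. The notation below ($s$ for state of charge, $h$ for the open-circuit-voltage map, $\sigma^2$ for the measurement noise variance, $P$ for the prior estimation variance) is assumed here for illustration and may differ from the paper's.

```latex
% Scalar voltage measurement of the state of charge s_k
y_k = h(s_k) + v_k, \qquad v_k \sim \mathcal{N}(0, \sigma^2)

% Fisher information carried by one measurement about s
\mathcal{I}(s) = \frac{\bigl(h'(s)\bigr)^2}{\sigma^2}

% Linearised (EKF-style) variance update with H = h'(s)
P^{+} = P - \frac{P^2 H^2}{H^2 P + \sigma^2} = \frac{P\,\sigma^2}{H^2 P + \sigma^2}
```

Where the curve is flat, $h'(s) \approx 0$, so $\mathcal{I}(s) \approx 0$ and $P^{+} \approx P$: the measurement barely reduces uncertainty. Steeper regions of the curve give larger $H$ and a larger variance reduction, which is why steering trajectories through such regions can improve later estimates.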

Experimental Implementation and Validation of Predictor-Based CACC for Vehicular Platoons With Distinct Actuation Delays

Authors: Amirhossein Samii, Redmer de Haan, Nikolaos Bekiaris-Liberis

We provide experimental validation, in a pair of vehicles, of a recently introduced predictor-based cooperative adaptive cruise control (CACC) design, developed for achieving delay compensation in heterogeneous vehicular platoons subject to long actuation delays that may be distinct for each individual vehicle. We provide the explicit formulae of the implemented control design, accounting for the effect of zero-order hold and sampled measurements; we also obtain vehicle and string stability conditions numerically, by deriving the transfer functions relating the speeds of pairs of consecutive vehicles. We also present consistent simulation results for a platoon with a larger number of vehicles, under digital implementation of the controller. Both the simulation and experimental results confirm the effectiveness of the predictor-based CACC design in guaranteeing individual vehicle stability, string stability, and tracking, despite long/distinct actuation delays.

The impact of large-scale EV charging on the real-time operation of distribution systems: A comprehensive review

Authors: Zhe Yu, Chuang Yang, Qin Wang

With the large-scale integration of electric vehicles (EVs) in the distribution grid, the unpredictable nature of EV charging introduces considerable uncertainties to the grid's real-time operations. This can exacerbate load fluctuations, compromise power quality, and pose risks to the grid's stability and security. However, due to their dual role as controllable loads and energy storage devices, EVs have the potential to mitigate these fluctuations, balance the variability of renewable energy sources, and provide ancillary services that support grid stability. By leveraging the bidirectional flow of information and energy in smart grids, the adverse effects of EV charging can be minimized and even converted into beneficial outcomes through effective real-time management strategies. This paper explores the negative impacts of EV charging on the distribution system's real-time operations and outlines methods to transform these challenges into positive contributions. Additionally, it provides an in-depth analysis of the real-time management system for EV charging, focusing on state estimation and management strategies.

Deep Reinforcement Learning for Real-Time Green Energy Integration in Data Centers

Authors: Abderaouf Bahi, Amel Ourici

This paper explores the implementation of a Deep Reinforcement Learning (DRL)-optimized energy management system for e-commerce data centers, aimed at enhancing energy efficiency, cost-effectiveness, and environmental sustainability. The proposed system leverages DRL algorithms to dynamically manage the integration of renewable energy sources, energy storage, and grid power, adapting to fluctuating energy availability in real time. The study demonstrates that the DRL-optimized system achieves a 38% reduction in energy costs, significantly outperforming traditional Reinforcement Learning (RL) methods (28%) and heuristic approaches (22%). Additionally, it maintains a low SLA violation rate of 1.5%, compared to 3.0% for RL and 4.8% for heuristic methods. The DRL-optimized approach also results in an 82% improvement in energy efficiency, surpassing other methods, and a 45% reduction in carbon emissions, making it the most environmentally friendly solution. The system's cumulative reward of 950 reflects its superior performance in balancing multiple objectives. Through rigorous testing and ablation studies, the paper validates the effectiveness of the DRL model's architecture and parameters, offering a robust solution for energy management in data centers. The findings highlight the potential of DRL in advancing energy optimization strategies and addressing sustainability challenges.

Assessment of Quantitative Cyber-Physical Reliability of SCADA Systems in Autonomous Vehicle to Grid (V2G) Capable Smart Grids

Authors: Md Abdul Gaffar

The integration of electric vehicles (EVs) into power grids via Vehicle-to-Grid (V2G) technology is steadily increasing, which brings both advantages and disadvantages. V2G can increase grid reliability by providing distributed energy storage and ancillary services. On the other hand, it expands the cyber-physical attack surface of the national power grid, introducing new vulnerabilities in monitoring and supervisory control and data acquisition (SCADA) systems. This paper investigates the malicious activity enabled by Autonomous Vehicle to Grid (AV2G) communication infrastructures and assesses its impact on SCADA system reliability. It presents a quantitative reliability assessment using a Bayesian attack graph combined with probabilistic capacity outage modeling based on IEEE RTS-79 system data. Monte Carlo simulations show how AV2G-based attacks degrade system performance, highlighting the need for cybersecurity-hardening strategies in smart grid design.

Large Language Model Powered Automated Modeling and Optimization of Active Distribution Network Dispatch Problems

Authors: Xu Yang, Chenhui Lin, Yue Yang, Qi Wang, Haotian Liu, Haizhou Hua, Wenchuan Wu

The increasing penetration of distributed energy resources into active distribution networks (ADNs) has made effective ADN dispatch imperative. However, the numerous newly-integrated ADN operators, such as distribution system aggregators, virtual power plant managers, and end prosumers, often lack specialized expertise in power system operation, modeling, optimization, and programming. This knowledge gap renders reliance on human experts both costly and time-intensive. To address this challenge and enable intelligent, flexible ADN dispatch, this paper proposes a large language model (LLM) powered automated modeling and optimization approach. First, the ADN dispatch problems are decomposed into sequential stages, and a multi-LLM coordination architecture is designed. This framework comprises an Information Extractor, a Problem Formulator, and a Code Programmer, tasked with information retrieval, optimization problem formulation, and code implementation, respectively. Afterwards, tailored refinement techniques are developed for each LLM agent, greatly improving the accuracy and reliability of generated content. The proposed approach features a user-centric interface that enables ADN operators to derive dispatch strategies via simple natural language queries, eliminating technical barriers and increasing efficiency. Comprehensive comparisons and end-to-end demonstrations on various test cases validate the effectiveness of the proposed architecture and methods.

Robust Capacity Expansion Modelling for Renewable Energy Systems under Weather Uncertainty

Authors: Sebastian Kebrich, Felix Engelhardt, David Franzmann, Christina Büsing, Jochen Linßen, Heidi Heinrichs

Future greenhouse gas neutral energy systems will be dominated by renewable energy technologies whose energy output is subject to uncertain weather conditions. This work proposes an algorithm to do capacity expansion planning (CAPEX) under weather uncertainty. When faced with multiple possible weather years, the quality of a CAPEX solution derived on a single year's data is evaluated across all years, and the CAPEX optimisation problem is iteratively modified whenever supply gaps are detected. These modifications lead to solutions with sufficient back-up capacity to overcome periods of (cold) dark lulls, and sufficient total annual energy supply across all years. A computational study on an energy system model of Germany shows that the iterative algorithm finds solutions that guarantee security of supply for all considered weather years for an increase of 1.6-2.9% in total annual cost compared to initial solutions. Results also underline the importance of assessing the feasibility of energy system models using atypical time-series, including dark lull and cold period effects.
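
The evaluate-and-modify loop described above can be sketched as a small constraint-generation routine. This is only a toy under strong assumptions, not the authors' algorithm: the real CAPEX optimisation is replaced by a closed-form stand-in (`solve_backup`) that sizes a single back-up technology for a fixed renewable capacity, and the weather data in `YEARS` are invented.

```python
def residual(hour, renewable_mw):
    """Load left to cover after renewable output in one hour."""
    demand_mw, capacity_factor = hour
    return demand_mw - renewable_mw * capacity_factor


def solve_backup(hours, renewable_mw):
    """Toy stand-in for the CAPEX solve: least back-up covering the given hours."""
    return max(0.0, max(residual(h, renewable_mw) for h in hours))


def worst_gap(years, renewable_mw, backup_mw):
    """Return (gap, hour) for the largest unserved load across all years."""
    worst = (0.0, None)
    for year in years:
        for hour in year:
            gap = residual(hour, renewable_mw) - backup_mw
            if gap > worst[0] + 1e-9:
                worst = (gap, hour)
    return worst


def robust_backup(years, renewable_mw, design_year=0):
    """Start from one weather year, then add violated hours and re-solve."""
    hours = list(years[design_year])
    iterations = 0
    while True:
        backup_mw = solve_backup(hours, renewable_mw)
        _, hour = worst_gap(years, renewable_mw, backup_mw)
        if hour is None:               # no supply gap in any weather year
            return backup_mw, iterations
        hours.append(hour)             # modify the problem, then re-solve
        iterations += 1


# Invented data: each hour is (demand in MW, renewable capacity factor).
YEARS = [
    [(100.0, 0.5), (120.0, 0.3)],
    [(110.0, 0.1), (90.0, 0.6)],
    [(130.0, 0.2), (100.0, 0.4)],   # contains the hardest low-output hour
]

if __name__ == "__main__":
    print(robust_backup(YEARS, renewable_mw=100.0))
```

In this toy, the design based on the first year alone leaves a gap in the other two years, and one added hour is enough to close it. In a real model each re-solve would be a full optimisation in which all capacities, not just back-up, may change.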

Relevant ArXiv eess Papers - 2025-07-29

Computing Longitudinal Dynamic Derivatives of a VTOL Aircraft Using CFD Simulations and Forced-Oscillation Model

Authors: Ali Khosravani Nezhad, AmirReza Kosari, Rasoul Askari

This study presents a comprehensive evaluation of dynamic aerodynamic derivatives during aircraft transition phases using advanced CFD simulations and forced oscillation testing. Two case studies are examined: a three-dimensional fighter aircraft (Standard Dynamic Model, SDM) and a UT24 eVTOL model. The transition phase from vertical hover to forward cruise is analyzed with harmonic oscillation techniques to capture unsteady aerodynamic forces and moments. Grid sensitivity studies and multi-zone meshing strategies ensure simulation accuracy, while the ANSYS Fluent finite-volume solver and coupled pressure-velocity algorithms provide high-fidelity results. Dynamic derivatives are derived from variations in angle of attack, flight path, and rotational movements, with experimental and numerical data validating the approach. The findings offer valuable insights for robust control design and stability analysis, supporting future advancements in urban air mobility and aerospace engineering. Overall, this approach demonstrates substantial promise for optimizing aircraft performance during critical transition phases. These results pave the way for future innovations

Biogeography-Based Optimization of Fuzzy Controllers for Improved Quarter Car Suspension Performance

Authors: Lida Shahbandari, Mohammad Mansouri

This study proposes optimized Type-I and Type-II fuzzy controllers for automotive suspension systems to enhance ride comfort and stability under road disturbances (step/sine inputs), addressing the lack of systematic performance comparisons in existing literature. We integrate Biogeography-Based Optimization (BBO), Particle Swarm Optimization (PSO), and Genetic Algorithms (GA) to tune controller parameters for a quarter car model, with emphasis on BBO's underexplored efficacy. MATLAB Simulink simulations demonstrate that BBO-optimized Type-II fuzzy control reduces body displacement by 22% and acceleration by 18% versus baseline methods under step disturbances, while maintaining computational efficiency. The framework provides practical, high-performance solutions for modern vehicles, particularly electric and autonomous platforms where vibration attenuation and energy efficiency are critical.

SLENet: A Novel Multiscale CNN-Based Network for Detecting the Rats Estrous Cycle

Authors: Qinyang Wang, Hoileong Lee, Xiaodi Pu, Yuanming Lai, Yiming Ma

In clinical medicine, rats are commonly used as experimental subjects. However, their estrous cycle significantly impacts their biological responses, leading to differences in experimental results. Therefore, accurately determining the estrous cycle is crucial for minimizing interference. Manually identifying the estrous cycle in rats presents several challenges, including high costs, long training periods, and subjectivity. To address these issues, this paper proposes a classification network, Spatial Long-distance EfficientNet (SLENet). This network is designed based on EfficientNet, specifically modifying the Mobile Inverted Bottleneck Convolution (MBConv) module by introducing a novel Spatial Efficient Channel Attention (SECA) mechanism to replace the original Squeeze Excitation (SE) module. Additionally, a Non-local attention mechanism is incorporated after the last convolutional layer to enhance the network's ability to capture long-range dependencies. The dataset comprises 2,655 microscopic images of rat vaginal epithelial cells, with 531 images in the test set. Experimental results indicate that SLENet achieved an accuracy of 96.31%, outperforming the baseline EfficientNet model (94.2%). These findings provide practical value for optimizing experimental design in rat-based studies such as reproductive and pharmacological research. However, the study is limited to microscopy image data and does not consider other factors such as temporal patterns; incorporating multi-modal input is therefore necessary for future applications.

Multisession Longitudinal Dynamic MRI Incorporating Patient-Specific Prior Image Information Across Time

Authors: Jingjia Chen, Hersh Chandarana, Daniel K. Sodickson, Li Feng

Serial Magnetic Resonance Imaging (MRI) exams are often performed in clinical practice, offering shared anatomical and motion information across imaging sessions. However, existing reconstruction methods process each session independently without leveraging this valuable longitudinal information. In this work, we propose a novel concept of longitudinal dynamic MRI, which incorporates patient-specific prior images to exploit temporal correlations across sessions. This framework enables progressive acceleration of data acquisition and reduction of scan time as more imaging sessions become available. The concept is demonstrated using the 4D Golden-angle RAdial Sparse Parallel (GRASP) MRI, a state-of-the-art dynamic imaging technique. Longitudinal reconstruction is performed by concatenating multi-session time-resolved 4D GRASP datasets into an extended dynamic series, followed by a low-rank subspace-based reconstruction algorithm. A series of experiments were conducted to evaluate the feasibility and performance of the proposed method. Results show that longitudinal 4D GRASP reconstruction consistently outperforms standard single-session reconstruction in image quality, while preserving inter-session variations. The approach demonstrated robustness to changes in anatomy, imaging intervals, and body contour, highlighting its potential for improving imaging efficiency and consistency in longitudinal MRI applications. More generally, this work suggests a new context-aware imaging paradigm in which the more we see a patient, the faster we can image.

Wardropian Cycles make traffic assignment both optimal and fair by eliminating price-of-anarchy with Cyclical User Equilibrium for compliant connected autonomous vehicles

Authors: Michał Hoffmann, Michał Bujak, Grzegorz Jamróz, Rafał Kucharski

Connected and Autonomous Vehicles (CAVs) open the possibility for centralised routing with full compliance, making System Optimal traffic assignment attainable. However, as System Optimum makes some drivers better off than others, voluntary acceptance seems dubious. To overcome this issue, we propose a new concept of Wardropian cycles, which, in contrast to previous utopian visions, makes the assignment fair on top of being optimal, which amounts to satisfaction of both Wardrop's principles. Such cycles, represented as sequences of permutations to the daily assignment matrices, always exist and equalise, after a limited number of days, average travel times among travellers (like in User Equilibrium) while preserving everyday optimality of path flows (like in System Optimum). We propose exact methods to compute such cycles and reduce their length and within-cycle inconvenience to the users. As identification of optimal cycles turns out to be NP-hard in many aspects, we introduce a greedy heuristic efficiently approximating the optimal solution. Finally, we introduce and discuss a new paradigm of Cyclical User Equilibrium, which ensures stability of optimal Wardropian Cycles under unilateral deviations. We complement our theoretical study with large-scale simulations. In Barcelona, 670 vehicle-hours of Price-of-Anarchy are eliminated using cycles with a median length of 11 days, though 5% of cycles exceed 90 days. However, in Berlin, just five days of applying the greedy assignment rule significantly reduces initial inequity. In Barcelona, Anaheim, and Sioux Falls, less than 7% of the initial inequity remains after 10 days, demonstrating the effectiveness of this approach in improving traffic performance with more ubiquitous social acceptability.
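
The idea that permuting a fixed system-optimal assignment from day to day can equalise average travel times is easiest to see with a plain rotation. The sketch below is an assumed minimal example, not one of the paper's exact methods or its greedy heuristic: `path_times` lists the travel time of each slot in a hypothetical system-optimal flow pattern, and traveller `i` takes slot `(i + day) mod n` on each day.

```python
def rotation_cycle(path_times):
    """Daily assignments where traveller i takes slot (i + day) mod n."""
    n = len(path_times)
    return [
        [path_times[(i + day) % n] for i in range(n)]
        for day in range(n)
    ]


def average_times(cycle):
    """Average travel time experienced by each traveller over the cycle."""
    n_days = len(cycle)
    n_travellers = len(cycle[0])
    return [
        sum(day[i] for day in cycle) / n_days
        for i in range(n_travellers)
    ]


if __name__ == "__main__":
    # Hypothetical slot travel times in minutes for four travellers.
    cycle = rotation_cycle([10, 12, 15, 15])
    for day in cycle:
        print(day, "total:", sum(day))
    print("averages:", average_times(cycle))
```

Every day uses the same multiset of slots, so the daily total (and hence system optimality of the flows) is unchanged, while each traveller visits every slot once per cycle and ends with the same average. A rotation needs as many days as there are travellers, which is why shortening the cycle is the interesting problem.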

Coverage Probability and Average Rate Analysis of Hybrid Cellular and Cell-free Network

Authors: Zhuoyin Dai, Jingran Xu, Xiaoli Xu, Ruoguang Li, Yong Zeng, Jiangbin Lyu

Cell-free wireless networks deploy distributed access points (APs) to simultaneously serve user equipments (UEs) across the service region and are regarded as one of the most promising network architectural paradigms. Despite recent advances in the performance analysis and optimization of cell-free wireless networks, it remains an open question whether large-scale deployment of APs in existing wireless networks can cost-effectively achieve communication capacity growth. Besides, the realization of a cell-free network is considered to be a gradual long-term evolutionary process in which cell-free APs will be incrementally introduced into existing cellular networks, and form a hybrid communication network with the existing cellular base stations (BSs). Such a collaboration will bridge the gap between the established cellular network and the innovative cell-free network. Therefore, hybrid cellular and cell-free networks (HCCNs) emerge as a practical and feasible solution for advancing cell-free network development, and it is worthwhile to further explore their performance limits. This paper presents a stochastic geometry-based hybrid cellular and cell-free network model to analyze the distributions of signal and interference and reveal their mutual coupling. Specifically, in order to benefit the UEs from both the cellular BSs and the cell-free APs, a conjugate beamforming design is employed, and the aggregated signal is analyzed using moment matching. Then, the coverage probability of the hybrid network is characterized by deriving the Laplace transforms and their higher-order derivatives of interference components. Furthermore, the average achievable rate of the hybrid network over channel fading is derived based on the interference coupling analysis.

Radar and Acoustic Sensor Fusion using a Transformer Encoder for Robust Drone Detection and Classification

Authors: Gevindu Ganganath, Pasindu Sankalpa, Samal Punsara, Demitha Pasindu, Chamira U. S. Edussooriya, Ranga Rodrigo, Udaya S. K. P. Miriya Thanthrige

The use of drones in a wide range of applications is steadily increasing. However, this has also raised critical security concerns such as unauthorized drone intrusions into restricted zones. Therefore, robust and accurate drone detection and classification mechanisms are required despite significant challenges due to the small size of drones, low-altitude flight, and environmental noise. In this letter, we propose a multi-modal approach combining radar and acoustic sensing for detecting and classifying drones. We employ radar due to its long-range capabilities and robustness to different weather conditions. We utilize raw acoustic signals without converting them to other domains such as spectrograms or Mel-frequency cepstral coefficients. This enables us to use fewer parameters than the state-of-the-art approaches. Furthermore, we explore the effectiveness of the transformer encoder architecture in fusing these sensors. Experimental results obtained in outdoor settings verify the superior performance of the proposed approach compared to the state-of-the-art methods.

Feature Engineering for Wireless Communications and Networking: Concepts, Methodologies, and Applications

Authors: Jiacheng Wang, Changyuan Zhao, Zehui Xiong, Tao Xiang, Dusit Niyato, Xianbin Wang, Shiwen Mao, Dong In Kim

AI-enabled wireless communications have attracted tremendous research interest in recent years, particularly with the rise of novel paradigms such as low-altitude integrated sensing and communication (ISAC) networks. Within these systems, feature engineering plays a pivotal role by transforming raw wireless data into structured representations suitable for AI models. Hence, this paper offers a comprehensive investigation of feature engineering techniques in AI-driven wireless communications. Specifically, we begin with a detailed analysis of fundamental principles and methodologies of feature engineering. Next, we present its applications in wireless communication systems, with special emphasis on ISAC networks. Finally, we introduce a generative AI-based framework, which can reconstruct signal feature spectrum under malicious attacks in low-altitude ISAC networks. The case study shows that it can effectively reconstruct the signal spectrum, achieving an average structural similarity index improvement of 4%, thereby supporting downstream sensing and communication applications.

Dependability Theory-based Statistical QoS Provisioning of Fluid Antenna Systems

Authors: Irfan Muhammad, Priyadarshi Mukherjee, Wee Kiat New, Hirley Alves, Ioannis Krikidis, Kai-Kit Wong

Fluid antenna systems (FAS) have recently emerged as a promising technology for next-generation wireless networks, offering real-time spatial reconfiguration to enhance reliability, throughput, and energy efficiency. Nevertheless, existing studies often overlook the temporal dynamics of channel fading and their implications for mission-critical operations. In this paper, we propose a dependability-theoretic framework for statistical quality-of-service (QoS) provisioning of FAS under finite blocklength (FBL) constraints. Specifically, we derive new closed-form expressions for the level-crossing rate (LCR) and average fade duration (AFD) of an $N$-port FAS over Nakagami-$m$ fading channels. Leveraging these second-order statistics, we define two key dependability metrics, mission reliability and mean time-to-first-failure (MTTFF), to quantify the probability of uninterrupted operation over a defined mission duration. We further extend the classical effective capacity (EC) concept to incorporate mission reliability in the FBL regime, yielding a mission EC (mEC). To capture energy efficiency under bursty traffic and latency constraints, we also develop the mission effective energy efficiency (mEEE) metric and formulate its maximization as a non-convex fractional optimization problem. This problem is then solved via a modified Dinkelbach's method with an embedded line search. Extensive simulations uncover critical trade-offs among port count, QoS exponent, signal-to-noise ratio, and mission duration, offering insights for the design of ultra-reliable, low-latency, and energy-efficient industrial internet-of-things (IIoT) systems.
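
For readers unfamiliar with Dinkelbach's method, the following minimal sketch shows the classical iteration for maximising a ratio f(x)/g(x) with g(x) > 0. It is not the modified variant from the paper: the inner problem is solved by exhaustive search over a finite grid as a stand-in for the embedded line search, and the toy objective (a logarithmic rate divided by transmit power plus a constant) is purely hypothetical and unrelated to the mEEE expression.

```python
import math

def dinkelbach(f, g, candidates, tol=1e-9, max_iter=100):
    """Maximise f(x) / g(x) over a finite candidate set, assuming g(x) > 0."""
    lam = 0.0
    x_best = candidates[0]
    for _ in range(max_iter):
        # Inner problem: maximise the parametric objective f(x) - lam * g(x).
        x_best = max(candidates, key=lambda x: f(x) - lam * g(x))
        gap = f(x_best) - lam * g(x_best)
        if gap < tol:  # F(lam) ~ 0 means lam has reached the optimal ratio
            break
        lam = f(x_best) / g(x_best)  # Dinkelbach update of the ratio estimate
    return x_best, f(x_best) / g(x_best)

# Hypothetical toy problem: rate log(1 + p) per unit of consumed power p + 0.5.
rate = lambda p: math.log1p(p)
power = lambda p: p + 0.5
grid = [i / 1000 for i in range(10001)]  # candidate powers p in [0, 10]
p_star, ee_star = dinkelbach(rate, power, grid)
print(p_star, ee_star)
```

Each iteration replaces the fractional objective by a parametric difference, which is what makes the approach attractive when the ratio itself is non-convex but the inner problem remains tractable.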

DOA Estimation via Optimal Weighted Low-Rank Matrix Completion

Authors: Saeed Razavikia, Mohammad Bokaei, Arash Amini, Stefano Rini, Carlo Fischione

This paper presents a novel method for estimating the direction of arrival (DOA) for a non-uniform and sparse linear sensor array using weighted lifted-structure low-rank matrix completion. The proposed method uses a single snapshot sample in which a single array of data is observed. The method is rooted in a weighted lifted-structured low-rank matrix recovery framework. The method involves four key steps: (i) lifting the antenna samples to form a low-rank structure, (ii) designing left and right weight matrices to reflect the sample informativeness, (iii) estimating a noise-free uniform array output through completion of the weighted lifted samples, and (iv) obtaining the DOAs from the restored uniform linear array samples. We study the complexity of steps (i) to (iii) above, where we analyze the number of samples required for the array interpolation of step (iii) for DOA estimation. We demonstrate that the proposed choice of weight matrices achieves a near-optimal sample complexity. This complexity aligns with the problem's degrees of freedom, equivalent to the number of DOAs adjusted for logarithmic factors. Numerical evaluations show the proposed method's superiority against the non-weighted counterpart and atomic norm minimization-based methods. Notably, the proposed method achieves approximately a 10 dB reduction in normalized mean-squared error over the non-weighted method in low-noise conditions.

Comparative Analysis of Data-Driven Predictive Control Strategies

Authors: Sohrab Rezaei, Ali Khaki-Sedigh

This paper compares data-driven predictive control strategies by examining their theoretical foundations, assumptions, and applications. The three most widely recognized and consequential methods, Data-Enabled Predictive Control, Willems-Koopman Predictive Control, and Model-Free Adaptive Predictive Control, are considered. Each of these strategies is systematically reviewed, and the primary theories supporting it are outlined. Following this analysis, their fundamental assumptions are discussed, with emphasis on their influence on control effectiveness. A numerical example is presented as a benchmark for comparison to enable a rigorous performance evaluation.

Sequential Operation of Residential Energy Hubs

Authors: Darío Slaifstein (1), Gautham Ram Chandra Mouli (1), Laura Ramirez-Elizondo (1), Pavol Bauer (1) ((1) Delft University of Technology)

The operation of residential energy hubs with multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to different carrier dynamics, hybrid storage coordination and high-dimensional action-spaces. Energy management systems oversee their operation, deciding the set points of the primary control layer. This paper presents a novel 2-stage economic model predictive controller for electrified buildings including physics-based models of the battery degradation and thermal systems. The hierarchical control operates in the Dutch sequential energy markets. In particular, common assumptions regarding intra-day markets (auction and continuous-time) are discussed, as well as the coupling of the different storage systems. The best control policy is to co-optimize day-ahead and intra-day auctions in the first stage, to later follow intra-day auctions. If no intra-day prices are known at the time of the day-ahead auction, it is best to follow the continuous-time intra-day market in the summer and the intra-day auction in the winter. Additionally, this sequential operation increases battery degradation. Finally, under our controller the realized short-term flexibility of the thermal energy storage is marginal compared to the flexibility delivered by the static battery pack and electric vehicles with bidirectional charging.

VAE-GAN Based Price Manipulation in Coordinated Local Energy Markets

Authors: Biswarup Mukherjee, Li Zhou, S. Gokul Krishnan, Milad Kabirifar, Subhash Lakshminarayana, Charalambos Konstantinou

This paper introduces a model for coordinating prosumers with heterogeneous distributed energy resources (DERs), participating in the local energy market (LEM) that interacts with the market-clearing entity. The proposed LEM scheme utilizes a data-driven, model-free reinforcement learning approach based on the multi-agent deep deterministic policy gradient (MADDPG) framework, enabling prosumers to make real-time decisions on whether to buy, sell, or refrain from any action while facilitating efficient coordination for optimal energy trading in a dynamic market. In addition, we investigate a price manipulation strategy using a variational auto encoder-generative adversarial network (VAE-GAN) model, which allows utilities to adjust price signals in a way that induces financial losses for the prosumers. Our results show that under adversarial pricing, heterogeneous prosumer groups, particularly those lacking generation capabilities, incur financial losses. The same outcome holds across LEMs of different sizes. As the market size increases, trading stabilizes and fairness improves through emergent cooperation among agents.

A Hybrid Mean Field Framework for Aggregators Participating in Wholesale Electricity Markets

Authors: Jun He, Andrew L. Liu

The rapid growth of distributed energy resources (DERs), including rooftop solar and energy storage, is transforming the grid edge, where distributed technologies and customer-side systems increasingly interact with the broader power grid. DER aggregators, entities that coordinate and optimize the actions of many small-scale DERs, play a key role in this transformation. This paper presents a hybrid Mean-Field Control (MFC) and Mean-Field Game (MFG) framework for integrating DER aggregators into wholesale electricity markets. Unlike traditional approaches that treat market prices as exogenous, our model captures the feedback between aggregators' strategies and locational marginal prices (LMPs) of electricity. The MFC component optimizes DER operations within each aggregator, while the MFG models strategic interactions among multiple aggregators. To account for various uncertainties, we incorporate reinforcement learning (RL), which allows aggregators to learn optimal bidding strategies in dynamic market conditions. We prove the existence and uniqueness of a mean-field equilibrium and validate the framework through a case study of the Oahu Island power system. Results show that our approach reduces price volatility and improves market efficiency, offering a scalable and decentralized solution for DER integration in wholesale markets.

Should Top-Down Clustering Affect Boundaries in Unsupervised Word Discovery?

Authors: Simon Malan, Benjamin van Niekerk, Herman Kamper

We investigate the problem of segmenting unlabeled speech into word-like units and clustering these to create a lexicon. Prior work can be categorized into two frameworks. Bottom-up methods first determine boundaries and then cluster the fixed segmented words into a lexicon. In contrast, top-down methods incorporate information from the clustered words to inform boundary selection. However, it is unclear whether top-down information is necessary to improve segmentation. To explore this, we look at two similar approaches that differ in whether top-down clustering informs boundary selection. Our simple bottom-up strategy predicts word boundaries using the dissimilarity between adjacent self-supervised features, then clusters the resulting segments to construct a lexicon. Our top-down system is an updated version of the ES-KMeans dynamic programming method that iteratively uses K-means to update its boundaries. On the five-language ZeroSpeech benchmarks, both approaches achieve comparable state-of-the-art results, with the bottom-up system being nearly five times faster. Through detailed analyses, we show that the top-down influence of ES-KMeans can be beneficial (depending on factors like the candidate boundaries), but in many cases the simple bottom-up method performs just as well. For both methods, we show that the clustering step is a limiting factor. Therefore, we recommend that future work focus on improved clustering techniques and learning more discriminative word-like representations. A project code repository accompanies the paper.

Computer Audition: From Task-Specific Machine Learning to Foundation Models

Authors: Andreas Triantafyllopoulos, Iosif Tsangko, Alexander Gebhard, Annamaria Mesaros, Tuomas Virtanen, Björn Schuller

Foundation models (FMs) are increasingly spearheading recent advances on a variety of tasks that fall under the purview of computer audition -- the use of machines to understand sounds. They feature several advantages over traditional pipelines: among others, the ability to consolidate multiple tasks in a single model, the option to leverage knowledge from other modalities, and the readily-available interaction with human users. Naturally, these promises have created substantial excitement in the audio community, and have led to a wave of early attempts to build new, general-purpose foundation models for audio. In the present contribution, we give an overview of computational audio analysis as it transitions from traditional pipelines towards auditory foundation models. Our work highlights the key operating principles that underpin those models, and showcases how they can accommodate multiple tasks that the audio community previously tackled separately.

FMSD-TTS: Few-shot Multi-Speaker Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation

Authors: Yutong Liu, Ziyue Zhang, Ban Ma-bao, Yuqing Cai, Yongbin Yu, Renzeng Duojie, Xiangxiang Wang, Fan Gao, Cheng Huang, Nyima Tashi

Tibetan is a low-resource language with minimal parallel speech corpora spanning its three major dialects (Ü-Tsang, Amdo, and Kham), limiting progress in speech modeling. To address this issue, we propose FMSD-TTS, a few-shot, multi-speaker, multi-dialect text-to-speech framework that synthesizes parallel dialectal speech from limited reference audio and explicit dialect labels. Our method features a novel speaker-dialect fusion module and a Dialect-Specialized Dynamic Routing Network (DSDR-Net) to capture fine-grained acoustic and linguistic variations across dialects while preserving speaker identity. Extensive objective and subjective evaluations demonstrate that FMSD-TTS significantly outperforms baselines in both dialectal expressiveness and speaker similarity. We further validate the quality and utility of the synthesized speech through a challenging speech-to-speech dialect conversion task. Our contributions include: (1) a novel few-shot TTS system tailored for Tibetan multi-dialect speech synthesis, (2) the public release of a large-scale synthetic Tibetan speech corpus generated by FMSD-TTS, and (3) an open-source evaluation toolkit for standardized assessment of speaker similarity, dialect consistency, and audio quality.

A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation

Authors: Dennis Benders, Laura Ferranti, Johannes Köhler

Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.

Relevant ArXiv eess Papers - 2025-07-28

An Explainable Equity-Aware P2P Energy Trading Framework for Socio-Economically Diverse Microgrid

Authors: Abhijan Theja, Mayukha Pal

Fair and dynamic energy allocation in community microgrids remains a critical challenge, particularly when serving socio-economically diverse participants. Static optimization and cost-sharing methods often fail to adapt to evolving inequities, leading to participant dissatisfaction and unsustainable cooperation. This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL). At its core, the framework utilizes a bi-level optimization model grounded in Equity-regarding Welfare Maximization (EqWM) principles, which incorporate Rawlsian fairness to prioritize the welfare of the least advantaged participants. We introduce a Proximal Policy Optimization (PPO) agent that dynamically adjusts socio-economic weights in the optimization objective based on observed inequities in cost and renewable energy access. This RL-powered feedback loop enables the system to learn and adapt, continuously striving for a more equitable state. To ensure transparency, Explainable AI (XAI) is used to interpret the benefit allocations derived from a weighted Shapley value. Validated across six realistic scenarios, the framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains. The adaptive RL mechanism further reduces the Gini coefficient over time, showcasing a pathway to truly sustainable and fair energy communities.
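
The Gini coefficient mentioned above is a standard inequality index; a minimal sketch of its usual closed form for sorted non-negative values is given below. The abstract does not state which quantity the authors apply it to, so the per-participant cost vectors here are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Closed form for sorted data: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical per-participant energy costs before and after an equity adjustment.
before = [10.0, 20.0, 30.0, 140.0]
after = [40.0, 45.0, 55.0, 60.0]
print(gini(before), gini(after))
```

A lower value after the adjustment indicates that costs are spread more evenly across participants.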

Approximating CCCV charging using SOC-dependent tapered charging power constraints in long-term microgrid planning

Authors: Hassan Zahid Butt, Xingpeng Li

Traditional long-term microgrid planning models assume constant power charging for battery energy storage systems (BESS), overlooking efficiency losses that occur toward the end of charge due to rising internal resistance. While this issue can be mitigated at the cell level using constant current-constant voltage (CCCV) charging, it is impractical at the pack level in large-scale systems. However, battery management systems and inverter controls can emulate this effect by tapering charging power at high state-of-charge (SOC) levels, trading off charging speed for improved efficiency and reduced thermal stress. Ignoring this behavior in planning models can lead to undersized batteries and potential reliability issues. This paper proposes a tractable and scalable approach to approximate CCCV behavior using SOC-dependent tapered charging power (TCP) constraints. A MATLAB-based proof of concept demonstrates the energy delivery and efficiency benefits of tapering. The method is integrated into a long-term planning framework and evaluated under a synthetic load and solar profile. Results show tapering significantly affects BESS sizing, cost, and reliability under dynamic operating conditions that demand fast charging. These findings highlight tapering as a critical modeling factor for accurately capturing BESS performance in long-term microgrid planning.
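
To make the idea of an SOC-dependent charging limit concrete, the sketch below computes a tapered power cap. The linear taper shape, the 0.8 SOC threshold and the 10% power floor are illustrative assumptions, not values from the paper, and `max_charge_power` is a hypothetical helper name.

```python
def max_charge_power(soc, p_rated, soc_taper=0.8, p_floor_frac=0.1):
    """Upper bound on charging power (same unit as p_rated) at a given SOC.

    Below soc_taper the full rated power is allowed. Above it, the limit
    falls linearly to p_floor_frac * p_rated at SOC = 1 (assumed shape).
    """
    if not 0.0 <= soc <= 1.0:
        raise ValueError("soc must lie in [0, 1]")
    if soc <= soc_taper:
        return p_rated
    frac = (1.0 - soc) / (1.0 - soc_taper)  # 1 at the threshold, 0 at full charge
    return p_rated * (p_floor_frac + (1.0 - p_floor_frac) * frac)

# Hypothetical 100 kW battery: print the limit at a few SOC levels.
for soc in (0.5, 0.8, 0.9, 1.0):
    print(soc, max_charge_power(soc, 100.0))
```

Because this limit is the minimum of two functions that are affine in SOC, it can be imposed in an optimisation model as two linear inequalities on the charging power, without additional binary variables.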

Research on Sectionalizing Switches Placement Problem of Distribution System Automation Based on Multi-Objective Optimization Analysis

Authors: Selma Cheshmeh Khavar, Arya Abdollahi

Achieving high distribution reliability while minimizing operating costs is a central issue in distribution system optimization. Determining the optimal number and location of automation devices in the distribution network is essential from both the reliability and the economic points of view. To address these issues, this paper develops a multi-objective model in which the first objective places automation devices so as to minimize operating costs, while the second objective accounts for the improvement of reliability indices. A modified non-dominated sorting genetic algorithm is developed and presented to solve this multi-objective mixed-integer non-linear programming problem. The feasibility of the proposed algorithm is examined by applying it to two distribution feeders of the Tabriz distribution network: the third feeder of the Azar substation, which includes a distributed generation unit, and the first and third feeders of the ElGoli substation, which form a double-feed feeder.

Cell-based VSC Analysis Methodology: From Graph Laplacian to Converter Degrees of Freedom

Authors: Daniele Falchi, Eduardo Prieto-Araujo, Oriol Gomis-Bellmunt

Power-electronics-based converters are widely employed throughout the power system to interconnect multiple heterogeneous electrical layers. Furthermore, the intrinsic flexibility of the converter network topology is widely exploited to accommodate a given number of terminals and ports according to the specific application. In this regard, several converter arrangements can be encountered in power applications. Moreover, to properly establish both operation and control, the so-called degrees of freedom (DOFs) need to be assessed for each converter topology. Similarly to the well-known Clarke transformation, which clearly reveals the DOFs of the star-based topology, further transformations can be derived to describe the independent set of variables characterizing a given converter structure. Referring to the cell-based class of Voltage Source Converter (VSC) topologies, including the Modular Multilevel Converter (MMC), this article proposes a general methodology to determine the change-of-variable matrix transformation for several converter arrangements that are related to complete bipartite and multipartite graphs. The methodology relies on the spectral analysis of the graph Laplacian, which highlights the structural normal modes at the converter points of connection. Furthermore, for a complete characterization, the instantaneous power pattern formulations, based on the DOFs, are also introduced.

Modal-based prediction of power system frequency response and frequency nadir

Authors: Francisco Zelaya-Arrazabal, Sebastian Martinez-Lizana, Héctor Pulgar-Painemal

This paper introduces a novel approach for predicting system frequency response (SFR) and frequency nadir based on modal analysis. By decomposing the full system dynamic response, the method identifies dominant modes based on their participation in frequency behavior and derives a closed-form expression for the frequency trajectory. Unlike traditional approaches based on the Average System Frequency (ASF) model, this method captures the true system dynamics and avoids oversimplified representations. The dominant modes exhibit low sensitivity to system parameters, enabling robust and accurate estimations across diverse operating conditions. The proposed approach is tested on two benchmark systems as well as the Salvadoran transmission planning network, demonstrating its scalability, precision, and adaptability. This methodology represents a shift from observing a simplified average system frequency response to a more detailed analysis focusing on system dynamics.

A User-centric Game for Balancing V2G Benefits with Battery Degradation of Electric Vehicles

Authors: Arghya Mallick, Georgios Pantazis, Peyman Mohajerin Esfahani, Sergio Grammatico

We present a novel user-centric vehicle-to-grid (V2G) framework that enables electric vehicle (EV) users to balance the trade-off between financial benefits from V2G and battery health degradation based on individual preference signals.

Quasi Steady-State Frequency

Authors: Joan Gutierrez-Florensa, Alvaro Ortega, Lukas Sigrist, Federico Milano

Accurate frequency estimation is critical for the control, monitoring and protection of electrical power systems, in particular, of systems with a high penetration of power electronics. This paper introduces the novel concept of Quasi Steady-State (QSS) frequency as a quantity that fills the gap between stationary and instantaneous frequency. QSS frequency coincides with the fundamental frequency of an AC voltage in any stationary conditions, including unbalanced and non-sinusoidal, and is able to capture the time-varying fundamental frequency in transient conditions. The paper also proposes a metric borrowed from fluid dynamics, namely, the time derivative of the circulation, to define the scope of validity of the QSS frequency. Analytical examples as well as a case study based on a fully-fledged EMT model of the IEEE 39-bus system serve to illustrate, respectively, the properties of the QSS frequency and its behavior in transient conditions.

Collision-free Control Barrier Functions for General Ellipsoids via Separating Hyperplane

Authors: Zeming Wu, Lu Liu

This paper presents a novel collision avoidance method for general ellipsoids based on control barrier functions (CBFs) and separating hyperplanes. First, collision-free conditions for general ellipsoids are analytically derived using the concept of dual cones. These conditions are incorporated into the CBF framework by extending the system dynamics of controlled objects with separating hyperplanes, enabling efficient and reliable collision avoidance. The validity of the proposed collision-free CBFs is rigorously proven, ensuring their effectiveness in enforcing safety constraints. The proposed method requires only single-level optimization, significantly reducing computational time compared to state-of-the-art methods. Numerical simulations and real-world experiments demonstrate the effectiveness and practicality of the proposed algorithm.

Relevant ArXiv eess Papers - 2025-07-25

Safe Reinforcement Learning-based Automatic Generation Control

Authors: Amr S. Mohamed, Emily Nguyen, Deepa Kundur

Amidst the growing demand for implementing advanced control and decision-making algorithms to enhance the reliability, resilience, and stability of power systems, a crucial concern arises regarding the safety of employing machine learning techniques. While these methods can be applied to derive more optimal control decisions, they often lack safety assurances. This paper proposes a framework based on control barrier functions to facilitate safe learning and deployment of reinforcement learning agents for power system control applications, specifically in the context of automatic generation control. We develop the safety barriers and reinforcement learning framework necessary to establish trust in reinforcement learning as a safe option for automatic generation control, as a foundation for future detailed verification and application studies.

Carbon Emission Flow Tracing: Fast Algorithm and California Grid Study

Authors: Yuqing Shen, Yuanyuan Shi, Daniel Kirschen, Yize Chen

Power system decarbonization is at the focal point of the clean energy transition. While system operators and utility companies increasingly publicize system-level carbon emission information, it remains unclear how emissions from individual generators are transported through the grid and how they impact electricity users at specific locations. This paper presents a novel and computationally efficient approach for exact quantification of nodal average and marginal carbon emission rates, applicable to both AC and DC optimal power flow problems. The approach leverages graph-based topological sorting and directed cycle removal techniques, applied to directed graphs formed by generation dispatch and optimal power flow solutions. Our proposed algorithm efficiently identifies each generator's contribution to each node, capturing how emissions are spatially distributed under varying system conditions. To validate its effectiveness and reveal locational and temporal emission patterns in the real world, we simulate the 8,870-bus realistic California grid using actual CAISO data and the CATS model. Based on year-long hourly data on nodal loads and renewable generation, obtained or estimated from CAISO public data, our method accurately estimates power flow conditions, generation mixes, and systemwide emissions, and delivers fine-grained spatiotemporal emission analysis for every California county. Both our algorithm and the California study are open-sourced, providing a foundation for future research on grid emissions, planning, operations, and energy policy.
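
The role of topological sorting can be seen in a small sketch of nodal average emission rates. This is an illustration, not the authors' released algorithm: it assumes lossless, already acyclic directed flows, one aggregated generator per node, and the common proportional-sharing rule (power leaving a node carries the node's mixed emission rate). The function name and the three-bus data are hypothetical; `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

def nodal_carbon_intensity(gen, flows):
    """Average emission rate at each node under proportional sharing.

    gen:   {node: (output_mw, emission_rate)} for nodes with generation
    flows: {(from_node, to_node): mw} with mw > 0, forming an acyclic graph
    """
    nodes = set(gen) | {n for edge in flows for n in edge}
    preds = {n: set() for n in nodes}
    for u, v in flows:
        preds[v].add(u)
    intensity = {}
    # Visit upstream nodes first so every inflow already has a known rate.
    for n in TopologicalSorter(preds).static_order():
        mw, rate = gen.get(n, (0.0, 0.0))
        inflow_mw = sum(flows[(u, n)] for u in preds[n])
        inflow_co2 = sum(flows[(u, n)] * intensity[u] for u in preds[n])
        total = mw + inflow_mw
        intensity[n] = (mw * rate + inflow_co2) / total if total > 0 else 0.0
    return intensity

# Hypothetical 3-bus case: a fossil unit at A, a zero-carbon unit at B, load at C.
gen = {"A": (100.0, 0.8), "B": (50.0, 0.0)}   # (MW, tCO2/MWh)
flows = {("A", "C"): 60.0, ("B", "C"): 40.0}  # MW
rates = nodal_carbon_intensity(gen, flows)
print(rates)
```

If the flow graph contains a directed cycle, `static_order()` raises `graphlib.CycleError`, which is why a cycle-removal step such as the one the abstract mentions is needed before a pass of this kind.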

Towards Microgrid Resilience Enhancement via Mobile Power Sources and Repair Crews: A Multi-Agent Reinforcement Learning Approach

Authors: Yi Wang, Dawei Qiu, Fei Teng, Goran Strbac

Mobile power sources (MPSs) have been gradually deployed in microgrids as critical resources to coordinate with repair crews (RCs) towards resilience enhancement owing to their flexibility and mobility in handling the complex coupled power-transport systems. However, previous work solves the coordinated dispatch problem of MPSs and RCs in a centralized manner with the assumption that the communication network is still fully functioning after the event. Yet there is growing evidence that certain extreme events will damage or degrade communication infrastructure, which makes centralized decision making impractical. To fill this gap, this paper formulates the resilience-driven dispatch problem of MPSs and RCs in a decentralized framework. To solve this problem, a hierarchical multi-agent reinforcement learning method featuring a two-level framework is proposed, where the high-level action is used to switch decision-making between power and transport networks, and the low-level action constructed via a hybrid policy is used to compute continuous scheduling and discrete routing decisions in power and transport networks, respectively. The proposed method also uses an embedded function encapsulating system dynamics to enhance learning stability and scalability. Case studies based on IEEE 33-bus and 69-bus power networks are conducted to validate the effectiveness of the proposed method in load restoration.

Regional Frequency-Constrained Planning for the Optimal Sizing of Power Systems via Enhanced Input Convex Neural Networks

Authors: Yi Wang, Goran Strbac

Large renewable penetration has been witnessed in power systems, resulting in reduced levels of system inertia and increasing requirements for frequency response services. There have been plenty of studies developing frequency-constrained models for power system security. However, most existing literature only considers uniform frequency security, while neglecting frequency spatial differences in different regions. To fill this gap, this paper proposes a novel planning model for the optimal sizing problem of power systems, capturing regional frequency security and inter-area frequency oscillations. Specifically, regional frequency constraints are first extracted via an enhanced input convex neural network (ICNN) and then embedded into the original optimisation for frequency security, where a principled weight initialisation strategy is adopted to deal with the gradient vanishing issues of non-negative weights in traditional ICNNs and enhance its fitting ability. An adaptive genetic algorithm with sparsity calculation and local search is developed to separate the planning model into two stages and effectively solve it iteratively. Case studies have been conducted on three different power systems to verify the effectiveness of the proposed frequency-constrained planning model in ensuring regional system security and obtaining realistic investment decisions.

Two-Stage TSO-DSO Services Provision Framework for Electric Vehicle Coordination

Authors: Yi Wang, Dawei Qiu, Fei Teng, Goran Strbac

High renewable penetration has been witnessed in power systems, resulting in reduced system inertia and increasing requirements for frequency response services. Electric vehicles (EVs), owing to their vehicle-to-grid (V2G) capabilities, can provide cost-effective frequency services for transmission system operators (TSOs). However, EVs that are inherently connected to distribution networks may pose voltage security issues for distribution system operators (DSOs) when supporting TSO frequency. To coordinate both TSO frequency and DSO voltage, this paper proposes a two-stage service provision framework for multi-EVs. At stage one, EVs participate in day-ahead TSO-DSO interactions for frequency reserve schedules; at stage two, EVs make real-time dispatching behaviors in distribution networks for reserve delivery while supporting DSO voltage. Considering the potentially large EV number and environment complexity, a decentralized operation paradigm is introduced for real-time EV dispatches at stage two, while a communication-efficient reinforcement learning (RL) algorithm is proposed to reduce the communication overhead during large-scale multi-agent RL training without compromising policy performance. Case studies are carried out on a 6-bus transmission and 33-bus distribution network as well as a 69-bus distribution network to evaluate the effectiveness and scalability of the proposed method in enabling EVs for frequency service and voltage support.

Unit Commitment Framework for Nuclear Reactors with Reactivity Decline

Authors: Shiny Choudhury, Michael Davidson, George Tynan

Nuclear reactors are often modeled as inflexible, baseload generators with fixed downtimes and restrictive ramping limits. In practice, however, a reactor's operational flexibility is closely tied to its fuel cycle stage and the associated reactivity margin. A key physical constraint on power maneuverability is xenon poisoning, caused by an increase in neutron-absorbing xenon concentration following a power ramp-down. This can delay or even prevent a subsequent power ramp-up due to suppressed core reactivity. Additionally, if a reactor is shut down during periods of low reactivity, restart times can vary significantly due to these xenon transients, leading to longer downtimes. This work introduces a physics-informed, metaheuristic modeling approach that embeds fuel cycle dynamics directly within a unit commitment (UC) framework. The framework tracks reactivity margin, dynamically activates xenon-related constraints, and endogenously implements refueling outages based on the core conditions. By capturing intra-cycle reactivity evolution and the conditional onset of xenon poisoning, the formulation allows for operation-dependent nuclear dispatch that reflects both regulatory limits and physical behavior. When applied to a representative reactor fleet operating in distinct modes of operation -- ranging from baseload to part load -- the framework reveals that flexible operation can slow reactivity degradation and extend fuel cycles. The results show that fuel-cycle-aware flexibility modeling is critical for accurate scheduling of nuclear reactors and offers a tractable pathway to integrate nuclear power in energy system models.

Stability Constrained Voltage Control in Distribution Grids with Arbitrary Communication Infrastructure

Authors: Zhenyi Yuan, Jie Feng, Yuanyuan Shi, Jorge Cortés

We consider the problem of designing learning-based reactive power controllers that perform voltage regulation in distribution grids while ensuring closed-loop system stability. In contrast to existing methods, where the provably stable controllers are restricted to be decentralized, we propose a unified design framework that enables the controllers to take advantage of an arbitrary communication infrastructure on top of the physical power network. This allows the controllers to incorporate information beyond their local bus, covering existing methods as a special case and leading to less conservative constraints on the controller design. We then provide a design procedure to construct input convex neural network (ICNN) based controllers that satisfy the identified stability constraints by design under arbitrary communication scenarios, and train these controllers using supervised learning. Simulation results on the University of California, San Diego (UCSD) microgrid testbed illustrate the effectiveness of the framework and highlight the role of communication in improving control performance.

Integrated Learning and Optimization for Congestion Management and Profit Maximization in Real-Time Electricity Market

Authors: Imran Pervez, Ricardo Pinto Lima, Omar Knio

We develop novel integrated learning and optimization (ILO) methodologies to solve economic dispatch (ED) and DC optimal power flow (DCOPF) problems for better economic operation. The optimization problem for ED is formulated with load as an unknown parameter, while DCOPF has both load and the power transfer distribution factor (PTDF) matrix as unknown parameters. PTDF represents the incremental variations of real power on transmission lines which occur due to real power transfers between two regions. These values represent a linearized approximation of power flows over the transmission lines. We develop novel ILO formulations to address post-hoc penalties in electricity markets and line congestion problems using ED and DCOPF optimization formulations. Our proposed methodologies capture the real-time electricity market and line congestion behavior to train the regret function, which eventually trains the unknown loads at different buses and the line PTDF matrix to achieve the aforementioned post-hoc goals. The proposed methodology is compared to sequential learning and optimization (SLO), which trains load and PTDF forecasts for accuracy rather than economic operation. Our experiments demonstrate the superiority of ILO in minimizing the post-hoc penalties in electricity markets and minimizing line congestion, thereby improving economic operation by a noticeable amount.

Auto-SGCR: Automated Generation of Smart Grid Cyber Range Using IEC 61850 Standard Models

Authors: Muhammad M. Roomi, S. M. Suhail Hussain, Ee-Chien Chang, David M. Nicol, Daisuke Mashima

Digitalization of power grids has made them increasingly susceptible to cyber-attacks in the past decade. Iterative cybersecurity testing is indispensable to counter emerging attack vectors and to ensure dependability of critical infrastructure. Furthermore, such testing can be used to evaluate cybersecurity configuration, effectiveness of the cybersecurity measures against various attack vectors, as well as to train smart grid cybersecurity experts defending the system. Enabling extensive experiments narrows the gap between academic research and production environments. A high-fidelity cyber range is vital as it is often infeasible to conduct such experiments and training using a production environment. However, the design and implementation of a cyber range requires extensive domain knowledge of the physical and cyber aspects of the infrastructure. Furthermore, costs incurred for setup and maintenance of a cyber range are significant. Moreover, most existing smart grid cyber ranges are designed as one-off, proprietary systems, and are limited in terms of configurability, accessibility, portability, and reproducibility. To address these challenges, an automated Smart grid Cyber Range generation framework is presented in this paper. Initially, a human-/machine-friendly, XML-based modeling language called Smart Grid Modeling Language (SG-ML) was defined, which incorporates IEC 61850 System Configuration Language files. Subsequently, a toolchain to parse SG-ML model files and automatically instantiate a functional smart grid cyber range was developed. The developed SG-ML models can be easily shared and/or modified to reproduce or customize any cyber range. The application of Auto-SGCR is demonstrated through case studies with large-scale substation models. The toolchain along with example SG-ML models has been open-sourced.

Integrated Sensing and Edge AI: Realizing Intelligent Perception in 6G

Authors: Zhiyan Liu, Xu Chen, Hai Wu, Zhanwei Wang, Xianhao Chen, Dusit Niyato, Kaibin Huang

Sensing and edge artificial intelligence (AI) are envisioned as two essential and interconnected functions in sixth-generation (6G) mobile networks. On the one hand, sensing-empowered applications rely on powerful AI models to extract features and understand semantics from ubiquitous wireless sensors. On the other hand, the massive amount of sensory data serves as the fuel to continuously refine edge AI models. This deep integration of sensing and edge AI has given rise to a new task-oriented paradigm known as integrated sensing and edge AI (ISEA), which features a holistic design approach to communication, AI computation, and sensing for optimal sensing-task performance. In this article, we present a comprehensive survey of ISEA. We first provide technical preliminaries for sensing, edge AI, and new communication paradigms in ISEA. Then, we study several use cases of ISEA to demonstrate its practical relevance and introduce current standardization and industrial progress. Next, the design principles, metrics, tradeoffs, and architectures of ISEA are established, followed by a thorough overview of ISEA techniques, including digital air interface, over-the-air computation, and advanced signal processing. Its interplay with various 6G advancements, e.g., new physical-layer and networking techniques, is presented. Finally, we present future research opportunities in ISEA, including the integration of foundation models, convergence of ISEA and integrated sensing and communications (ISAC), ultra-low-latency ISEA, and practicality issues.

Relevant ArXiv eess Papers - 2025-07-24

Stable and Fair Benefit Allocation in Mixed-Energy Truck Platooning: A Coalitional Game Approach

Authors: Ting Bai, Karl Henrik Johansson, Jonas Mårtensson, Andreas A. Malikopoulos

This paper addresses the benefit allocation in a mixed-energy truck platoon composed of fuel-powered and electric trucks. The interactions among trucks during platoon formation are modeled as a coalitional game with transferable utility. We first design a stable payoff allocation scheme that accounts for truck heterogeneity in energy savings and platoon roles (leader or follower), establishing core-stability conditions to ensure that no subset of trucks has an incentive to deviate for greater benefit. To enhance payoff fairness, we then propose a closed-form, Shapley value-based allocation approach that is computationally efficient and independent of the platoon size. Sufficient conditions under which the allocation is both fair and core-stable are provided. In scenarios where the Shapley value falls outside the core, we develop an alternative allocation based on the stable payoff that minimizes the mean relative deviation from the Shapley value while preserving core stability. This deviation is further proved to be upper-bounded by $1$, showing a favorable trade-off between stability and fairness. Finally, extensive numerical studies validate the theoretical results and demonstrate the effectiveness of the proposed framework in facilitating stable, equitable, and sustainable cooperation in mixed-energy truck platooning.
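The Shapley value and the core-stability condition mentioned above can be illustrated on a toy game. The sketch below is a brute-force enumeration over player orderings, not the paper's closed-form allocation, and the saving function `platoon_saving` (one unit saved per follower) is a hypothetical stand-in for the paper's energy-saving model.

```python
from itertools import combinations, permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


def in_core(players, value, payoff, tol=1e-9):
    """Core check: payoffs split v(N) exactly and no coalition gets less than its own value."""
    grand = value(frozenset(players))
    if abs(sum(payoff.values()) - grand) > tol:
        return False
    for size in range(1, len(players) + 1):
        for subset in combinations(players, size):
            if sum(payoff[p] for p in subset) < value(frozenset(subset)) - tol:
                return False
    return True


def platoon_saving(coalition):
    # Hypothetical characteristic function: every truck behind the leader saves 1 unit.
    return float(max(len(coalition) - 1, 0))


trucks = ["A", "B", "C"]
phi = shapley_values(trucks, platoon_saving)
stable = in_core(trucks, platoon_saving, phi)
```

In this symmetric toy game each truck receives the same share and the allocation lies in the core; the enumeration grows factorially with platoon size, which is the cost a closed-form expression avoids.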

Fast Distribution Grid Topology Estimation via Subset Sum

Authors: Yueyao Xu, Yize Chen

Faced with increasing penetration of distributed energy resources and fast development of distribution grid energy management, topology identification of the distribution grid becomes an important and fundamental task. As the underlying grid topology is usually unknown or incomplete to the utilities, it is increasingly necessary to efficiently identify the distribution grid network topology using limited measurements. Fast and accurate topology identification can help achieve the tasks of load monitoring, operation and control of the power distribution system, as well as outage detection. In this paper, we propose a novel and ultra-fast topology identification method. By adapting the subset sum method with a hierarchical structure, the overall grid topology can be inferred from fewer samples of smart meter power measurements. Such techniques can be applied in real time under scenarios with fast topology change, and the proposed hierarchical algorithm is also robust against measurement noise.
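The core idea of matching an aggregate measurement to a subset of smart meter readings can be sketched with the textbook subset-sum dynamic program. This is only an illustration: the paper's hierarchical structure and noise handling are not reproduced, and the meter readings (integer watts) and the function name `find_subset` are assumptions made for the example.

```python
def find_subset(readings, target):
    """Return indices of readings (non-negative integers) that sum to target, or None."""
    # reachable maps an achievable partial sum to one index list that produces it.
    reachable = {0: []}
    for idx, reading in enumerate(readings):
        updates = {}
        for partial, members in reachable.items():
            total = partial + reading
            if total <= target and total not in reachable and total not in updates:
                updates[total] = members + [idx]
        reachable.update(updates)
    return reachable.get(target)


# Hypothetical smart meter readings in watts and an upstream aggregate measurement.
meter_readings = [120, 340, 275, 90, 410]
upstream_measurement = 805
downstream_meters = find_subset(meter_readings, upstream_measurement)
```

Here the meters at indices 0, 2 and 4 are the only combination that reproduces the upstream measurement, which suggests they sit below that measurement point. With noisy real-valued data an exact match would need to be replaced by a tolerance.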

Impact of Communication Delay and Sampling on Small-Signal Stability of IBR-rich Power Systems

Authors: Saugat Ghimire, Vaithianathan "Mani" Venkatasubramanian, Gilles Torresan

The growing adoption of inverter-based resources (IBRs) has introduced unprecedented dynamics in power systems, resulting in oscillations across a broad spectrum of frequencies. Communication delay between the plant-level control and the inverter-level control in IBR plants has been recognized as one of the causes of such oscillations and a factor that impacts the system's stability. The control signals from the plant-level controller also experience sampling, with the sampled values held constant by the hold elements for the duration of the sampling period. This also has a bearing on the response of IBR plants. In this paper, we analyze the impacts of communication delay and sampling of control signals between plant-level control and inverter-level control of grid-following IBR plants on the small-signal stability of power systems. The underlying fundamentals of communication delay and sampling are revisited to explain the observed responses. Our findings emphasize the unique effects of communication delay and sampling period on the stability of IBR-rich power systems and suggest strategies to mitigate their detrimental impacts. The work also highlights the need for more accurate approaches for small-signal stability analysis of such systems.
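For orientation, the two effects named in the abstract have standard textbook frequency-domain models. The symbols $T_d$ (communication delay) and $T_s$ (sampling period) are notation assumed here, not taken from the paper, and the paper's own modelling may differ.

```latex
\[
G_d(s) = e^{-sT_d}, \qquad
G_{\mathrm{ZOH}}(s) = \frac{1 - e^{-sT_s}}{sT_s} \approx e^{-sT_s/2}
\]
\[
\angle\, G_d(j\omega)\, G_{\mathrm{ZOH}}(j\omega) = -\,\omega\left(T_d + \frac{T_s}{2}\right),
\qquad 0 < \omega < \frac{2\pi}{T_s}
\]
```

Under these assumptions a sampler followed by a hold element behaves, in terms of phase, like an additional delay of half a sampling period, so both effects erode phase margin in proportion to frequency.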

Transient Stability-Driven Planning for the Optimal Sizing of Resilient AC/DC Hybrid Microgrids

Authors: Yi Wang, Goran Strbac

This paper proposes a transient stability-driven planning framework for the optimal sizing problem of resilient AC/DC hybrid microgrids (HMGs) under different types of contingencies, capturing frequency and voltage stability requirements as well as the frequency-voltage coupling dynamics of AC/DC interlinking converters (ICs). The planning model is formulated into a defender-attacker-defender (DAD) architecture, which can be further merged into two levels, i.e., upper-level and lower-level problems, and then iteratively solved by an enhanced genetic algorithm with sparsity calculation and local search. Regarding the operation stage, a novel transient stability-constrained optimal power flow (TSC-OPF) algorithm is proposed for static and transient operations of HMGs, capturing the governor dynamics and automatic voltage regulators of conventional generators as well as the droop control dynamics of inverter-based resources (IBRs) for frequency control and voltage control, respectively. Furthermore, a Lyapunov optimisation approach is developed to capture the time-coupling property of energy storage systems (ESs) and then allow the TSC-OPF to be solved on an hourly basis with a second-scale resolution, achieving the co-optimisation of static and transient stability requirements. Case studies have been conducted to verify the effectiveness of the proposed planning framework in obtaining cost-effective investment decisions for various resources while respecting transient stability requirements under different contingencies.

Dispatch-Aware Deep Neural Network for Optimal Transmission Switching: Toward Real-Time and Feasibility Guaranteed Operation

Authors: Minsoo Kim, Jip Kim

Optimal transmission switching (OTS) improves optimal power flow (OPF) by selectively opening transmission lines, but its mixed-integer formulation increases computational complexity, especially on large grids. To deal with this, we propose a dispatch-aware deep neural network (DA-DNN) that accelerates DC-OTS without relying on pre-solved labels. DA-DNN predicts line states and passes them through a differentiable DC-OPF layer, using the resulting generation cost as the loss function so that all physical network constraints are enforced throughout training and inference. In addition, we adopt a customized weight-bias initialization that keeps every forward pass feasible from the first iteration, which allows stable learning on large grids. Once trained, the proposed DA-DNN produces a provably feasible topology and dispatch pair in the same time as solving the DCOPF, whereas conventional mixed-integer solvers become intractable. As a result, the proposed method successfully captures the economic advantages of OTS while maintaining scalability.

Integrating Grid impedance estimation method into Advanced Angle Estimation Kalman Filter in GFL inverter

Authors: Phuoc Sang Nguyen, Ghavameddin Nourbakhsh, Gerard Ledwich

The growing integration of power electronic converter-interfaced distributed energy resources into modern power systems presents significant challenges for system monitoring, protection, and control. Grid impedance plays a critical role in the operation and stability assessment of grid-connected inverter systems. This study presents a real-time grid impedance estimation method based on the Discrete Fourier Transform. The proposed method is integrated with the Advanced Angle Estimation Kalman Filter using a Linear Quadratic Regulator current controller (AAEKF-LQR), assisting the use of impedance information for accurate instantaneous phase angle estimation. Simulation results confirm that the proposed impedance estimation method interacts effectively with the AAEKF-LQR controller, maintaining stable system performance under weak grid conditions. The approach also demonstrates the ability to deliver fast and accurate impedance estimation during operational variations in grid conditions, thereby supporting stable inverter operation.
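One common way to use a Discrete Fourier Transform for impedance estimation is to take the ratio of the voltage and current phasors at a chosen frequency bin. The abstract does not state that the paper does exactly this, so the sketch below is an assumption-laden illustration: the signals are synthetic, the voltage is the drop across a series R-L impedance, and all names and parameter values are hypothetical.

```python
import cmath
import math


def dft_bin(samples, k):
    """Single DFT bin k of a real sequence, normalised by the number of samples."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)) / n


def estimate_impedance(voltage, current, k):
    """Impedance at bin k as the ratio of voltage and current phasors."""
    return dft_bin(voltage, k) / dft_bin(current, k)


# Synthetic test signal: one 50 Hz cycle sampled 100 times (assumed values).
f0, n_samples = 50.0, 100
r_true, l_true = 0.5, 1e-3                      # assumed series resistance and inductance
z_true = complex(r_true, 2 * math.pi * f0 * l_true)
theta = [2 * math.pi * i / n_samples for i in range(n_samples)]
current = [10.0 * math.cos(t) for t in theta]
# Voltage drop across the impedance: Re{ Z * 10 * exp(j*theta) }.
voltage = [10.0 * (z_true.real * math.cos(t) - z_true.imag * math.sin(t)) for t in theta]

z_est = estimate_impedance(voltage, current, k=1)
```

Because the window covers exactly one fundamental period, bin 1 isolates the 50 Hz component and the ratio recovers the assumed impedance. With a non-integer number of cycles, spectral leakage would bias the estimate.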

Power Allocation and RIS Elements Optimisation for Reconfigurable Intelligent Surfaces assisted RSMA

Authors: Abdullah Qayyum, Maziar Nekovee

This paper proposes power allocation and the number of reconfigurable intelligent surfaces (RIS) elements optimisation in a RIS-assisted rate splitting multiple access (RSMA) system. The optimised RIS-RSMA (ORIS-RSMA) method determines the optimal number of RIS elements and the power allocation factors for both common and private parts of a message. Additionally, it maximises the sum rate while ensuring that a target common rate is satisfied. The performance of the proposed ORIS-RSMA is compared to that of the conventional RIS-RSMA and RSMA. Simulation results show that ORIS-RSMA achieves a higher sum rate.

Model Predictive Control for Unlocking Energy Flexibility of Heat Pump and Thermal Energy Storage Systems: Experimental Results

Authors: Weihong Tang, Yun Li, Shalika Walker, Tamas Keviczky

Increasing penetration of renewable energy sources (RES) and electrification of energy systems necessitate the engagement of demand-side management (DSM) to help alleviate congestion in the electricity grid. Heat pump and thermal energy storage (HPTES) systems, being energy-efficient solutions, are becoming popular in modern buildings and are promising contributors to DSM due to their significant share in household electricity consumption. For typical HPTES systems, this paper presents a systematic design framework covering a control-oriented modeling process and energy-flexible model predictive control (MPC) design. The proposed MPC-based DSM strategy offers an innovative solution for efficient DSM by following a two-step DSM framework. In the first step, flexibility assessment is performed to quantitatively evaluate the flexibility potential of the HPTES system by solving a mixed-integer economic MPC problem. In the second step, flexibility exploitation is achieved through reacting to feasible demand response (DR) requests while respecting system constraints. Both numerical simulations and real-world experiments are performed based on a real HPTES installation to showcase the viability and effectiveness of the proposed design.

A Joint Planning Model for Fixed and Mobile Electric Vehicle Charging Stations Considering Flexible Capacity Strategy

Authors: Zhe Yu, Xue Hu, Qin Wang

The widespread adoption of electric vehicles (EVs) has significantly increased demand on both transportation and power systems, posing challenges to their stable operation. To support the growing need for EV charging, both fixed charging stations (FCSs) and mobile charging stations (MCSs) have been introduced, serving as key interfaces between the power grid and traffic network. Recognizing the importance of collaborative planning across these sectors, this paper presents a two-stage joint planning model for FCSs and MCSs, utilizing an improved alternating direction method of multipliers (ADMM) algorithm. The primary goal of the proposed model is to transform the potential negative impacts of large-scale EV integration into positive outcomes, thereby enhancing social welfare through collaboration among multiple stakeholders. In the first stage, we develop a framework for evaluating FCS locations, incorporating assessments of EV hosting capacity and voltage stability. The second stage introduces a joint planning model for FCSs and MCSs, aiming to minimize the overall social costs of the EV charging system while maintaining a reliable power supply. To solve the planning problem, we employ a combination of mixed-integer linear programming, queueing theory, and sequential quadratic programming. The improved ADMM algorithm couples the siting and sizing decisions consistently by introducing coupling constraints, and supports a distributed optimization framework that coordinates the interests of EV users, MCS operators, and distribution system operators. Additionally, a flexible capacity planning strategy that accounts for the multi-period development potential of EVCS is proposed to reduce both the complexity and the investment required for FCS construction. Finally, a case study with comparative experiments demonstrates the effectiveness of the proposed models and solution methods.

Safe Trajectory Sets for Online Operation of Power Systems under Uncertainty

Authors: Florian Klein-Helmkamp, Tina Möllemann, Irina Zettl, Steffen Kortmann, Andreas Ulbig

Flexibility provision from active distribution grids requires efficient and robust methods of optimization and control suitable to online operation. In this paper we introduce conditions for the safe operation of feedback optimization based controllers. We use the feasible operating region of a controlled system as bounds for safe system states and evaluate the trajectories of the controller based on the projection of the full system state onto the two-dimensional PQ-plane. We demonstrate the defined conditions for an exemplary sub-transmission system. We show that the proposed method is suitable to evaluate controller performance and robustness for systems subject to disturbances.

Channel Estimation for RIS-Assisted mmWave Systems via Diffusion Models

Authors: Yang Wang, Yin Xu, Cixiao Zhang, Zhiyong Chen, Mingzeng Dai, Haiming Wang, Bingchao Liu, Dazhi He, Meixia Tao

Reconfigurable intelligent surface (RIS) has been recognized as a promising technology for next-generation wireless communications. However, the performance of RIS-assisted systems critically depends on accurate channel state information (CSI). To address this challenge, this letter proposes a novel channel estimation method for RIS-aided millimeter-wave (mmWave) systems based on diffusion models (DMs). Specifically, the forward diffusion process of the original signal is formulated to model the received signal as a noisy observation within the framework of DMs. Subsequently, the channel estimation task is formulated as the reverse diffusion process, and a sampling algorithm based on denoising diffusion implicit models (DDIMs) is developed to enable effective inference. Furthermore, a lightweight neural network, termed BRCNet, is introduced to replace the conventional U-Net, significantly reducing the number of parameters and computational complexity. Extensive experiments conducted under various scenarios demonstrate that the proposed method consistently outperforms existing baselines.

RIS-aided Latent Space Alignment for Semantic Channel Equalization

Authors: Tomás Hüttebräucker, Mario Edoardo Pandolfo, Simone Fiorellino, Emilio Calvanese Strinati, Paolo Di Lorenzo

Semantic communication systems introduce a new paradigm in wireless communications, focusing on transmitting the intended meaning rather than ensuring strict bit-level accuracy. These systems often rely on Deep Neural Networks (DNNs) to learn and encode meaning directly from data, enabling more efficient communication. However, in multi-user settings where interacting agents are trained independently (without shared context or joint optimization), divergent latent representations across AI-native devices can lead to semantic mismatches, impeding mutual understanding even in the absence of traditional transmission errors. In this work, we address semantic mismatch in Multiple-Input Multiple-Output (MIMO) channels by proposing a joint physical and semantic channel equalization framework that leverages the presence of Reconfigurable Intelligent Surfaces (RIS). The semantic equalization is implemented as a sequence of transformations: (i) a pre-equalization stage at the transmitter; (ii) propagation through the RIS-aided channel; and (iii) a post-equalization stage at the receiver. We formulate the problem as a constrained Minimum Mean Squared Error (MMSE) optimization and propose two solutions: (i) a linear semantic equalization chain, and (ii) a non-linear DNN-based semantic equalizer. Both methods are designed to operate under semantic compression in the latent space and adhere to transmit power constraints. Through extensive evaluations, we show that the proposed joint equalization strategies consistently outperform conventional, disjoint approaches to physical and semantic channel equalization across a broad range of scenarios and wireless channel conditions.

Relevant ArXiv eess Papers - 2025-07-23

Fast Feeder Reconfiguration via Mesh Adaptive Direct Search in Black-Box Distribution System Environments

Authors: Junyuan Zheng, Wenlong Shi, Zhaoyu Wang

Feeder reconfiguration is a critical operational strategy in power distribution systems. However, existing optimization approaches typically rely on explicit mathematical formulations and analytical models, which are often infeasible in practical utility environments characterized by heterogeneous, proprietary, and black-box simulation modules. To address this challenge, this paper proposes a fast feeder reconfiguration framework based on Mesh Adaptive Direct Search (MADS). The proposed approach requires only performance metric evaluations through simulation modules used for power flow, protection, and voltage regulation analysis. A bi-objective formulation is adopted to jointly minimize active power loss and operational constraint violations. A Pareto-based frontier filter is integrated into the MADS algorithm to efficiently guide the search toward high-quality configurations while systematically pruning dominated solutions. The approach adaptively refines the search space around promising candidates using local polling strategies and convergence aware updates. Case studies on the IEEE-123 node test feeder demonstrate that the proposed approach achieves near-optimal configurations with significantly fewer evaluations compared to heuristic methods.
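The pruning of dominated solutions in a bi-objective search can be sketched as a plain Pareto filter. The code below is a generic illustration, not the MADS frontier filter from the paper; the candidate tuples (active power loss in kW, aggregate constraint violation) are hypothetical values standing in for black-box simulation outputs.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical evaluations: (active power loss in kW, constraint violation measure).
candidates = [(95.0, 0.0), (90.0, 0.2), (100.0, 0.0), (90.0, 0.5), (85.0, 0.4)]
front = pareto_front(candidates)
```

In this example the configurations (100.0, 0.0) and (90.0, 0.5) are discarded because another candidate is at least as good in both objectives. The quadratic comparison is acceptable when each evaluation already costs a full simulation run.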

Analytical Framework for Power System Strength

Authors: Ignacio Ponce, Federico Milano

This paper proposes a general framework to evaluate power system strength. The formulation features twelve indicators, grouped in three dynamical orders, that quantify the resistance of bus voltage phasors and their first and second order rates of change to sudden current injection changes. To quantify such changes, the paper introduces a novel finite differentiation technique, which we name the Delta operator, that is able to properly capture "jumps" of algebraic variables and utilizes the recently developed concept of complex frequency. The paper also shows how the proposed framework can be systematically applied to any system device, and provides a variety of examples based on models of synchronous machines, converters and loads. Numerical results in a benchmark system validate the exactness of the formulation.

Arbitrage Tactics in the Local Markets via Hierarchical Multi-agent Reinforcement Learning

Authors: Haoyang Zhang, Mina Montazeri, Philipp Heer, Koen Kok, Nikolaos G. Paterakis

Strategic bidding tactics employed by prosumers in local markets, including the Local Electricity Market (LEM) and Local Flexibility Market (LFM), have attracted significant attention due to their potential to enhance economic benefits for market participants through optimized energy management and bidding. While existing research has explored strategic bidding in a single market with multi-agent reinforcement learning (MARL) algorithms, arbitrage opportunities across local markets remain unexplored. This paper introduces a hierarchical MARL (HMARL) algorithm designed to enable aggregator arbitrage across multiple local markets. The strategic behavior of these aggregators in local markets is modeled as a two-stage Markov game: the first stage involves the LEM, while the second stage encompasses both the LFM and the balancing market. To solve this two-stage Markov game, the HMARL framework assigns two sub-agents to each aggregator, a primary sub-agent and a secondary sub-agent. Without the arbitrage strategy, these sub-agents operate in silos, with the primary sub-agent focusing on first-stage profits and the secondary sub-agent on second-stage profits, each employing independent MARLs. On the contrary, when implementing the arbitrage strategy with the proposed HMARL, the sub-agents communicate and coordinate to perform arbitrage across multiple local markets, enhancing overall efficiency. The case study, conducted under a scenario where all aggregators employ the arbitrage strategy, shows that despite higher initial costs in the LEM, this strategy generates substantial savings in the LFM and the balancing market, resulting in a total profit increase of $40.6\%$ on average. This highlights the capability of the proposed HMARL to address the two-stage Markov game and facilitate arbitrage across local markets, thereby enhancing profitability for participants.

Reconfigurable Intelligent Surface-Enabled Green and Secure Offloading for Mobile Edge Computing Networks

Authors: Tong-Xing Zheng, Xinji Wang, Xin Chen, Di Mao, Jia Shi, Cunhua Pan, Chongwen Huang, Haiyang Ding, Zan Li

This paper investigates a multi-user uplink mobile edge computing (MEC) network, where the users offload partial tasks securely to an access point under the non-orthogonal multiple access policy with the aid of a reconfigurable intelligent surface (RIS) against a multi-antenna eavesdropper. We formulate a non-convex optimization problem of minimizing the total energy consumption subject to a secure offloading requirement, and we build an efficient block coordinate descent framework to iteratively optimize the number of local computation bits and transmit power at the users, the RIS phase shifts, and the multi-user detection matrix at the access point. Specifically, we successively adopt successive convex approximation, semidefinite programming, and semidefinite relaxation to solve the problem with perfect eavesdropper's channel state information (CSI), and we then employ S-procedure and penalty convex-concave to achieve a robust design for the imperfect CSI case. We provide extensive numerical results to validate the convergence and effectiveness of the proposed algorithms. We demonstrate that RIS plays a significant role in realizing a secure and energy-efficient MEC network, and deploying a well-designed RIS can save energy consumption by up to 60\% compared to that without RIS. We further reveal the impacts of various key factors on the secrecy energy efficiency, including RIS element number and deployment position, user number, task scale and duration, and CSI imperfection.

Multi-RIS-Empowered Communication Systems: Capacity Analysis and Optimization

Authors: Aris L. Moustakas, George C. Alexandropoulos

In this chapter, using statistical physics methods, asymptotic closed-form expressions for the mean and variance of the mutual information for a multi-antenna transmitter-receiver pair in the presence of multiple Reconfigurable Intelligent Surfaces (RISs) are presented. While nominally valid in the large-system limit, it is shown that the derived Gaussian approximation for the mutual information can be quite accurate, even for modest-sized antenna arrays and metasurfaces. The above results are particularly useful when fast-fading conditions are present, which renders channel estimation challenging. The derived analysis indicates that, when the channel close to an RIS is correlated, for instance due to small angle spread which is reasonable for wireless systems with increasing carrier frequencies, the communication link benefits significantly from statistical RIS optimization, resulting in gains that are surprisingly higher than the nearly uncorrelated case. More importantly, the presented novel asymptotic properties of the correlation matrices of the impinging and outgoing signals at the RISs can be deployed to optimize the metasurfaces without brute-force numerical optimization. The numerical investigation demonstrates that, when the desired reflection from any of the RISs departs significantly from geometrical optics, the metasurfaces can be optimized to provide robust communication links, without significant need for their optimal placement.

Integrating and Comparing Radiality Constraints for Optimized Distribution System Reconfiguration

Authors: Pablo Cortes, Alejandra Tabares, Fredy Franco

The reconfiguration of electrical power distribution systems is a crucial optimization problem aimed at minimizing power losses by altering the system topology through the operation of interconnection switches. This problem, typically modelled as a mixed-integer nonlinear program, demands high computational resources for large-scale networks and requires specialized radiality constraints to maintain the tree-like structure of distribution networks. This paper presents a comprehensive analysis that integrates and compares the computational burden associated with different radiality constraint formulations proposed in the specialized literature for the reconfiguration of distribution systems. By using consistent hardware and software setups, we evaluate the performance of these constraints across several well-known test cases. Our findings reveal significant differences in computational efficiency depending on the chosen set of radiality constraints, providing valuable insights for optimizing reconfiguration strategies in practical distribution networks.

Relevant ArXiv eess Papers - 2025-07-21

Heatwave-driven air conditioning adoption could increase German electricity demand by 14 GW in the near future

Authors: Leo Semmelmann, Frederik vom Scheidt

Intensifying heatwaves driven by climate change are accelerating the adoption of mobile air conditioning (AC) systems. A rapid mass adoption of such AC systems could create additional stress on electricity grids and the power system. This study presents a novel method to estimate the electricity demand from AC systems both at system level and at high temporal and spatial granularity. We apply the method to a near-future heatwave scenario in Germany in which household AC adoption increases from the current 19% to 35% during a heatwave similar to the one of July 2025. We analyze the effects for 196,428 grid cells of one square kilometer across Germany by combining weather data, census data, socio-demographic assumptions, mobility patterns, and temperature-dependent AC activation functions. We find that the electricity demand of newly purchased mobile AC systems could increase the peak load by over 14 GW (23%), with urban hot-spots reaching 5.8 MW per square kilometer. The temporal pattern creates a pronounced afternoon peak that coincides with lower photovoltaic generation, potentially exacerbating power system stability challenges. Our findings underscore the urgency for proactive energy system planning to manage emerging demand peaks.

Smart fault detection in satellite electrical power system

Authors: Niloofar Nobahari, Alireza Rezaee

This paper presents a new approach for detecting faults in the electrical power system of satellites operating in Low Earth Orbit (LEO) without an Attitude Determination and Control Subsystem (ADCS). Components of these systems are prone to faults, such as line-to-line faults in the photovoltaic subsystem, open circuits and short circuits in the DC-to-DC converter, and ground faults in batteries. Previous research has largely focused on detecting faults in individual components, such as photovoltaic arrays or converter systems, and has therefore given limited attention to the satellite electrical power system as a whole. Our approach addresses this gap by utilizing a Multi-Layer Perceptron (MLP) neural network model, which leverages input data such as solar radiation and surface temperature to predict current and load outputs. Faults are classified using machine learning approaches such as Principal Component Analysis (PCA) and K-Nearest Neighbors (KNN). The model presented achieves over 99% accuracy in identifying faults across multiple subsystems, marking a notable advancement over previous approaches by offering a complete diagnostic solution for the entire satellite power system. This thorough method boosts system reliability and helps lower the chances of mission failure.

Solving Optimal Power Flow on a Data-Budget: Feature Selection on Smart Meter Data

Authors: Vassilis Kekatos, Ridley Annin, Manish K. Singh, Junjie Qin

How much data is needed to optimally schedule distributed energy resources (DERs)? Does the distribution system operator (DSO) have to know load demands at each bus of the feeder to solve an optimal power flow (OPF)? This work exploits redundancies in OPF's structure and data to minimize the communication of such a data deluge, and explores the trade-off between data compression and the grid's performance. We propose an OPF data distillation framework involving two steps: The DSO first collects OPF data from only a subset of nodes. It subsequently reconstructs the complete OPF data from the partial ones, and feeds them into the OPF solver. Selecting and reconstructing OPF data may be performed to maximize the fidelity of the reconstructed data or the associated OPF solutions. Under the first objective, OPF data distillation is posed as a sparsity-regularized convex problem. Under the second objective, it is posed as a sparsity-regularized bilevel program. Both problems are solved using proximal gradient algorithms. The second objective is superior in approximating OPF solutions at the expense of increased complexity. Numerical tests show that it enhances the fidelity and feasibility of the reconstructed OPF solutions, which can be approximated reasonably well even from partial data.
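The proximal gradient idea mentioned in the abstract above can be illustrated on the simplest sparsity-regularized convex problem, the lasso. The following is a minimal sketch under stated assumptions, not the authors' OPF data distillation formulation: the objective 0.5*||A x - b||^2 + lam*||x||_1, the function names, and the step-size choice (the inverse of the squared Frobenius norm of A, a safe upper bound on the Lipschitz constant) are all illustrative choices made here.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.|: shrink v toward zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def proximal_gradient_lasso(A, b, lam, iters=500):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 by proximal gradient (ISTA).

    A is a list of rows, b a list of targets. Hypothetical helper for
    illustration only; it is not taken from the paper.
    """
    m, n = len(A), len(A[0])
    # Step size 1/L, with L bounded above by the squared Frobenius norm of A.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - b.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth least-squares term: A^T r.
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the l1 proximal (soft-thresholding) step.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x
```

With an identity matrix for `A`, the minimizer is the soft-thresholded version of `b`, so small entries are set exactly to zero. This is the mechanism by which a sparsity penalty can select a subset of entries, here vector coefficients rather than feeder nodes.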

Mixed-integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices

Authors: Mohamad Charara (Polytechnique Montréal, GERAD & MILA, Canada), Martin De Montigny (Hydro-Québec, Canada), Nivine Abou Daher (Hydro-Québec, Canada), Hanane Dagdougui (Polytechnique Montréal, GERAD & MILA, Canada), Antoine Lesage-Landry (Polytechnique Montréal, GERAD & MILA, Canada)

With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.

Relevant ArXiv eess Papers - 2025-07-18

Model Predictive Black Start for Dynamic Formation of DER-Led Microgrids with Inrush Current Impacts

Authors: Cong Bai, Salish Maharjan, Zhaoyu Wang

Black start (BS) of the distribution system (DS) with high penetration of distributed energy resources (DERs) requires advanced control frameworks to ensure secure and efficient restoration. This paper proposes a model predictive black start (MPBS) framework incorporating an inrush current feasibility module to dynamically generate real-time feasible and optimal restoration sequences. Short-term forecasts of DER output and transmission grid (TG) availability are utilized to construct adaptive cranking paths. The inrush current feasibility module analytically estimates the transient inrush current caused by energizing no-load distribution transformers (DTs). To mitigate excessive inrush current and avoid potential misoperations of protection devices, an emergency operation-inspired voltage control strategy and a switch blocking mechanism are developed. The proposed inrush model is validated against electromagnetic transient (EMT) simulations in PowerFactory with estimation accuracies exceeding 90%. Case studies on a modified IEEE 123-node feeder demonstrate that the MPBS framework prevents misoperations of fuses and reclosers, reduces unnecessary DER energy consumption, and enhances load restoration efficiency during DER-led BS processes.

Joint Price and Power MPC for Peak Power Reduction at Workplace EV Charging Stations

Authors: Thibaud Cambronne, Samuel Bobick, Wente Zeng, Scott Moura

Demand charge often constitutes a significant portion of electricity costs for commercial electric vehicle charging station operators. This paper explores control methods to reduce peak power consumption at workplace EV charging stations in a joint price and power optimization framework. We optimize a menu of price options to incentivize users to select controllable charging service. Using this framework, we propose several solutions to achieve a reduction in both demand charge and overall operator costs. Through a Monte Carlo simulation, we find that model predictive control using a time series forecast can significantly reduce station operator costs.

A Stackelberg Game of Demand Response from the Aggregator's Perspective

Authors: Seangleng Khe, Parin Chaipunya, Athikom Bangviwat

In this paper, we investigate the modeling of demand response activities between a single aggregator and multiple participating consumers. The model incorporates the bilevel structure that naturally occurs in the information structure and decision sequence, where the aggregator assumes the role of a leader and the participating consumers play the role of followers. The proposed model is demonstrated to be effective in load control, helping the aggregator to meet the target reduction while the consumers pay lower electricity bills.

Leveraging Asynchronous Cross-border Market Data for Improved Day-Ahead Electricity Price Forecasting in European Markets

Authors: Maria Margarida Mascarenhas, Jilles De Blauwe, Mikael Amelin, Hussain Kazmi

Accurate short-term electricity price forecasting is crucial for strategically scheduling demand and generation bids in day-ahead markets. While data-driven techniques have shown considerable prowess in achieving high forecast accuracy in recent years, they rely heavily on the quality of input covariates. In this paper, we investigate whether asynchronously published prices as a result of differing gate closure times (GCTs) in some bidding zones can improve forecasting accuracy in other markets with later GCTs. Using a state-of-the-art ensemble of models, we show significant improvements of 22% and 9% in forecast accuracy in the Belgian (BE) and Swedish bidding zones (SE3) respectively, when including price data from interconnected markets with earlier GCT (Germany-Luxembourg, Austria, and Switzerland). This improvement holds for both general as well as extreme market conditions. Our analysis also yields further important insights: frequent model recalibration is necessary for maximum accuracy but comes at substantial additional computational costs, and using data from more markets does not always lead to better performance - a fact we delve deeper into with interpretability analysis of the forecast models. Overall, these findings provide valuable guidance for market participants and decision-makers aiming to optimize bidding strategies within increasingly interconnected and volatile European energy markets.

Transient-Stability-Aware Frequency Provision in IBR-Rich Grids via Information Gap Decision Theory and Deep Learning

Authors: Amin Masoumi, Mert Korkali

This paper introduces a framework to address the critical loss of transient stability caused by reduced inertia in grids with high inverter-based resource (IBR) penetration. The proposed method integrates a predictive deep learning (DL) model with information gap decision theory (IGDT) to create a risk-averse dispatch strategy. By reformulating the conventional virtual inertia scheduling (VIS) problem, the framework uses early predictions of post-fault dynamics to proactively redispatch resources, ensuring the system's center of inertia remains stable under worst-case contingencies. Validated on the IEEE 39-bus system with 70% IBR penetration, the proposed approach prevents system collapse where a conventional VIS strategy fails, ensuring frequency stability at a cost increase of only 5%.

A Generalized Stability Analysis Method with Dynamic Phasors for LV AC Microgrids

Authors: Bülent Dağ

Representing inductive coupling lines with conventional static phasors is the main reason why existing phasor-based simplified stability analysis methods are inadequate for microgrids with such lines. In the literature, dynamic phasors have been proposed for the dynamic modelling of inductive lines while preserving the simplified structure of the analysis method. In this study, a generalized stability analysis method for LV AC microgrids composed of droop-controlled inverters is presented. The proposed method incorporates dynamic phasors for inductive coupling lines into the existing phasor-based stability analysis method. The results show that the stability analysis method with dynamic phasors successfully predicts the instability boundaries of LV AC microgrids.

A Roadmap for Climate-Relevant Robotics Research

Authors: Alan Papalia, Charles Dawson, Laurentiu L. Anton, Norhan Magdy Bayomi, Bianca Champenois, Jung-Hoon Cho, Levi Cai, Joseph DelPreto, Kristen Edwards, Bilha-Catherine Githinji, Cameron Hickert, Vindula Jayawardana, Matthew Kramer, Shreyaa Raghavan, David Russell, Shide Salimi, Jingnan Shi, Soumya Sudhakar, Yanwei Wang, Shouyi Wang, Luca Carlone, Vijay Kumar, Daniela Rus, John E. Fernandez, Cathy Wu, George Kantor, Derek Young, Hanumant Singh

Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.

Relevant ArXiv eess Papers - 2025-07-17

A Deep Reinforcement Learning Method for Multi-objective Transmission Switching

Authors: Ding Lin, Jianhui Wang, Tianqiao Zhao, Meng Yue

Transmission switching is a well-established approach primarily applied to minimize operational costs through strategic network reconfiguration. However, exclusive focus on cost reduction can compromise system reliability. While multi-objective transmission switching can balance cost savings with reliability improvements, feasible solutions become exceedingly difficult to obtain as system scale grows, due to the inherent nonlinearity and high computational demands involved. This paper proposes a deep reinforcement learning (DRL) method for multi-objective transmission switching. The method incorporates a dueling-based actor-critic framework to evaluate the relative impact of each line switching decision within the action space, which improves decision quality and enhances both system reliability and cost efficiency. Numerical studies on the IEEE 118-bus system verify the effectiveness and efficiency of the proposed approach compared to two benchmark DRL algorithms.
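The abstract above does not spell out the dueling-based architecture, so the following is only a sketch of the standard dueling aggregation rule, on which such designs are commonly built: a state value is combined with mean-centred per-action advantages so that each action is scored relative to the others. The function name and the interpretation of actions as candidate line switching decisions are assumptions made here for illustration.

```python
def dueling_q_values(state_value, advantages):
    """Combine a state value V(s) and advantages A(s, a) into Q(s, a).

    Uses the standard dueling aggregation Q = V + A - mean(A). Subtracting
    the mean makes the split between V and A identifiable, so each entry
    expresses how much better or worse an action is than the average one.
    Hypothetical helper; not taken from the paper.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


# Illustrative numbers only: three candidate actions (for example, three
# lines that could be switched) scored relative to one another.
q = dueling_q_values(10.0, [1.0, 2.0, 3.0])
best_action = max(range(len(q)), key=lambda i: q[i])
```

The resulting values rank actions by relative impact, while their mean stays equal to the state value.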

Reconfigurable Battery Systems for Enhanced Fast Charging in Electric Vehicles

Authors: Jonathan Olivares, Tyler Depe, Rakeshkumar Mahto

The adoption of electric vehicles (EVs) is rapidly growing as a key solution to reducing greenhouse gas emissions. However, prolonged charging times remain a significant barrier to widespread EV usage, especially for individuals without access to fast charging infrastructure. This paper explores the potential of reconfigurable battery systems to reduce EV charging times without compromising battery life. We propose innovative battery pack configurations that dynamically adjust the arrangement of cells to optimize charging performance. Simulations were conducted using MATLAB and Simulink to compare the efficiency of various battery configurations, focusing on charging times, state of charge (SOC), voltage, and current under different conditions. The results demonstrate that connecting more batteries in series through reconfigurability in battery packs can significantly reduce charging times while maintaining operational safety. This study offers insights into how reconfigurable battery designs can provide a practical solution for faster, more efficient home-based EV charging, making EV ownership more accessible and sustainable.

Mixed-integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices

Authors: Mohamad Charara (Polytechnique Montréal, GERAD & MILA, Canada), Martin De Montigny (Hydro-Québec, Canada), Nivine Abou Daher (Hydro-Québec, Canada), Hanane Dagdougui (Polytechnique Montréal, GERAD & MILA, Canada), Antoine Lesage-Landry (Polytechnique Montréal, GERAD & MILA, Canada)

With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.

Embracing Fairness in Consumer Electricity Markets using an Automatic Market Maker

Authors: Shaun Sweeney, Chris King, Mark O'Malley, Robert Shorten

As consumer flexibility becomes expected, it is important that the market mechanisms which attain that flexibility are perceived as fair. We set out fairness issues in energy markets today, and propose a market design to address them. Consumption is categorised as either essential or flexible with different prices and reliability levels for each. Prices are generated by an Automatic Market Maker (AMM) based on instantaneous scarcity and resource is allocated using a novel Fair Play algorithm. We empirically show the performance of the system over 1 year for 101 UK households and benchmark its performance against more classical approaches.

Energy-Efficient and Intelligent ISAC in V2X Networks with Spiking Neural Networks-Driven DRL

Authors: Chen Shang, Jiadong Yu, Dinh Thai Hoang

Integrated sensing and communication (ISAC) is emerging as a key enabler for vehicle-to-everything (V2X) systems. However, designing efficient beamforming schemes for ISAC signals to achieve accurate sensing and enhance communication performance in the dynamic and uncertain environments of V2X networks presents significant challenges. While artificial intelligence technologies offer promising solutions, the energy-intensive nature of neural networks imposes substantial burdens on communication infrastructures. To address these challenges, this work proposes an energy-efficient and intelligent ISAC system for V2X networks. Specifically, we first leverage a Markov Decision Process framework to model the dynamic and uncertain nature of V2X networks. This framework allows the roadside unit to develop beamforming schemes relying solely on its current sensing information, eliminating the need for numerous pilot signals and extensive CSI acquisition. We then introduce an advanced deep reinforcement learning (DRL) algorithm, enabling the joint optimization of beamforming and power allocation to guarantee both communication rate and sensing accuracy in dynamic and uncertain V2X scenarios. To alleviate the energy demands of neural networks, we integrate spiking neural networks (SNNs) into the DRL algorithm. The event-driven, sparse spike-based processing of SNNs significantly improves energy efficiency while maintaining strong performance. Extensive simulation results validate the effectiveness of the proposed scheme with lower energy consumption, superior communication performance, and improved sensing accuracy.

Relevant ArXiv eess Papers - 2025-07-16

Learning to Quantize and Precode in Massive MIMO Systems for Energy Reduction: a Graph Neural Network Approach

Authors: Thomas Feys, Liesbet Van der Perre, François Rottenberg

Massive MIMO systems are moving toward increased numbers of radio frequency chains, higher carrier frequencies and larger bandwidths. As such, digital-to-analog converters (DACs) are becoming a bottleneck in terms of hardware complexity and power consumption. In this work, non-linear precoding for coarsely quantized downlink massive MIMO is studied. Given the NP-hard nature of this problem, a graph neural network (GNN) is proposed that directly outputs the precoded quantized vector based on the channel matrix and the intended transmit symbols. The model is trained in a self-supervised manner, by directly maximizing the achievable rate. To overcome the non-differentiability of the objective function, which is introduced by the non-differentiable DAC functions, a straight-through Gumbel-softmax estimation of the gradient is proposed. The proposed method achieves a significant increase in achievable sum rate under coarse quantization. For instance, in the single-user case, the proposed method with one-bit DACs can achieve the same sum rate as maximum ratio transmission (MRT) with 3-bit DACs. This reduces the DACs' power consumption by a factor of 4-7 and 3 for baseband and RF DACs, respectively. This, however, comes at the cost of increased digital signal processing power consumption. When accounting for this, the reduction in overall power consumption holds for a system bandwidth up to 3.5 MHz for baseband DACs, while the RF DACs can maintain a power reduction of 2.9 for higher bandwidths. Notably, indirect effects that further reduce the power consumption, such as reduced fronthaul consumption and reductions in other components, are not considered in this analysis.

Real-Time Foreign Object Recognition Based on Improved Wavelet Scattering Deep Network and Edge Computing

Authors: He Zhichao, Shen Xiangyu, Zhang Yong, Xie Nan

The increasing penetration of new energy in power systems places higher requirements on the operation and maintenance of substations and transmission lines. Using unmanned aerial vehicles (UAVs) to identify foreign objects in real time can quickly and effectively eliminate potential safety hazards. However, due to limited computational power, captured images cannot be processed in real time locally on the UAV's edge devices. To overcome this problem, a lightweight model based on an improved wavelet scattering deep network is proposed. The model contains an improved wavelet scattering network that extracts the scattering coefficients and modulus coefficients of a single image channel, replacing the convolutional and pooling layers of a convolutional neural network. The following three fully connected layers, which constitute a simplified multilayer perceptron (MLP), classify the extracted features. Experiments show that the model constructed with a biorthogonal wavelet basis can recognize and classify foreign objects on edge devices such as the Raspberry Pi and Jetson Nano, with accuracy higher than 90% and inference time below 7 ms for 720p (1280×720) images. Further experiments demonstrate that the recognition accuracy of our model is 1.1% higher than that of YOLOv5s and 0.3% higher than that of YOLOv8s.

Moving Beyond Marginal Carbon Intensity: A Poor Metric for Both Carbon Accounting and Grid Flexibility

Authors: Philipp Wiesner, Odej Kao

Marginal Carbon Intensity (MCI) has been promoted as an effective metric for carbon-aware computing. Although it is already considered impractical for carbon accounting purposes, many still view it as valuable when optimizing for grid flexibility by incentivizing electricity usage during curtailment periods. In this statement paper, we argue that MCI is neither reliable nor actionable for either purpose. We outline its fundamental limitations, including non-observability, reliance on opaque predictive models, and the lack of verifiability. Moreover, MCI fails to reflect curtailment caused by high-carbon sources and offers no insight into the quantity of available excess power. We advocate moving beyond MCI and instead call for research on more actionable metrics, such as direct reporting of excess power, explicit modeling of energy storage and grid stability, and integration with emerging granular renewable energy certificate markets.

Joint Power Allocation and Reflecting-Element Activation for Energy Efficiency Maximization in IRS-Aided Communications Under CSI Uncertainty

Authors: Christos N. Efrem, Ioannis Krikidis

We study the joint power allocation and reflecting element (RE) activation to maximize the energy efficiency (EE) in communication systems assisted by an intelligent reflecting surface (IRS), taking into account imperfections in channel state information (CSI). The robust optimization problem is mixed integer, i.e., the optimization variables are continuous (transmit power) and discrete (binary states of REs). In order to solve this challenging problem we develop two algorithms. The first one is an alternating optimization (AO) method that attains a suboptimal solution with low complexity, based on the Lambert W function and a dynamic programming (DP) algorithm. The second one is a branch-and-bound (B&B) method that uses AO as its subroutine and is formally guaranteed to achieve a globally optimal solution. Both algorithms do not require any external optimization solver for their implementation. Furthermore, numerical results show that the proposed algorithms outperform the baseline schemes, AO achieves near-optimal performance in most cases, and B&B has low computational complexity on average.
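To show why the Lambert W function appears in energy efficiency maximization, the sketch below solves the textbook single-link case: maximize log2(1 + g*p) / (p + P_c) over the transmit power p, for a fixed effective channel gain g and circuit power P_c. Setting the derivative to zero gives p* = (exp(W((g*P_c - 1)/e) + 1) - 1) / g. This is not the paper's robust joint problem with reflecting-element activation and CSI uncertainty. The names `gain` and `circuit_power`, the absence of a power cap, and the Newton-based evaluation of W are assumptions made here.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch W0(z) for z > -1/e, computed with Newton's method.

    Solves w * exp(w) = z. The start point log(1 + z) lies on or to the
    right of the root, where the function is increasing and convex.
    """
    if z <= -1.0 / math.e:
        raise ValueError("this sketch requires z > -1/e")
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def ee_optimal_power(gain, circuit_power):
    """Power p maximizing log2(1 + gain * p) / (p + circuit_power).

    Hypothetical single-link helper with no maximum-power constraint;
    it assumes gain > 0 and circuit_power > 0.
    """
    w = lambert_w0((gain * circuit_power - 1.0) / math.e)
    return (math.exp(w + 1.0) - 1.0) / gain


def energy_efficiency(p, gain, circuit_power):
    """Rate in bits per channel use divided by total consumed power."""
    return math.log2(1.0 + gain * p) / (p + circuit_power)
```

A quick sanity check is to evaluate `energy_efficiency` slightly below and slightly above `ee_optimal_power(gain, circuit_power)`; both values should be no larger than the value at the returned power.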

A Feed-Forward Artificial Intelligence Pipeline for Sustainable Desalination under Climate Uncertainties: UAE Insights

Authors: Obumneme Nwafor, Chioma Nwafor, Amro Zakaria, Nkechi Nwankwo

The United Arab Emirates (UAE) relies heavily on seawater desalination to meet over 90% of its drinking water needs. Desalination processes are highly energy intensive and account for approximately 15% of the UAE's electricity consumption, contributing to over 22% of the country's energy-related CO2 emissions. Moreover, these processes face significant sustainability challenges in the face of climate uncertainties such as rising seawater temperatures, salinity, and aerosol optical depth (AOD). AOD greatly affects the operational and economic performance of solar-powered desalination systems through photovoltaic soiling, membrane fouling, and water turbidity cycles. This study proposes a novel pipelined two-stage predictive modelling architecture: the first stage forecasts AOD using satellite-derived time series and meteorological data; the second stage uses the predicted AOD and other meteorological factors to predict desalination performance efficiency losses. The framework achieved 98% accuracy, and SHAP (SHapley Additive exPlanations) was used to reveal key drivers of system degradation. Furthermore, this study proposes a dust-aware rule-based control logic for desalination systems based on predicted values of AOD and solar efficiency. This control logic is used to adjust the desalination plant feed water pressure, adapt maintenance scheduling, and regulate energy source switching. To enhance the practical utility of the research findings, the predictive models and rule-based controls were packaged into an interactive dashboard for scenario and predictive analytics. This provides a management decision-support system for climate-adaptive planning.

Demo: Secure Edge Server for Network Slicing and Resource Allocation in Open RAN

Authors: Adhwaa Alchaab, Ayman Younis, Dario Pompili

Next-Generation Radio Access Networks (NGRAN) aim to support diverse vertical applications with strict security, latency, and Service-Level Agreement (SLA) requirements. These demands introduce challenges in securing the infrastructure, allocating resources dynamically, and enabling real-time reconfiguration. This demo presents SnSRIC, a secure and intelligent network slicing framework that mitigates a range of Distributed Denial-of-Service (DDoS) attacks in Open RAN environments. SnSRIC incorporates an AI-driven xApp that dynamically allocates Physical Resource Blocks (PRBs) to active users while enforcing slice-level security. The system detects anomalous behavior, distinguishes between benign and malicious devices, and uses the E2 interface to throttle rogue signaling while maintaining service continuity for legitimate users.

Orthogonality Analysis in LoRa Uplink Satellite Communications Affected by Doppler Effect

Authors: Jikang Deng, Fatma Benkhelifa, Mohamed-Slim Alouini

This paper provides, for the first time, analytical expressions for the Long-Range (LoRa) waveform and cross-correlation in both continuous and discrete time domains under the Doppler effect in satellite communication. We propose the concept and formulas of the shared visibility window for satellites toward two ground devices. Our analysis covers cross-correlation results with varying spreading factors (SFs) for no-Doppler and with-Doppler cases. We find that the maximum cross-correlation with different SFs and the mean cross-correlation are immune to the Doppler effect. However, the maximum cross-correlation with the same SFs is only immune to high Doppler shift, with its value fluctuating between 0.6 and 1 under high Doppler rate. We interpret this fluctuation by introducing the relationship between transmission start time and cross-correlation. We provide a parameter analysis for orbit height, ground device distance, and inclination angle. Additionally, we analyze the bit error rate (BER) for LoRa signals and observe worse performance under high Doppler shift or interference with the same SF. Increasing the SNR or the SIR improves the BER only when the Doppler effect is below a frequency threshold. Notably, under the Doppler effect, the behavior of the BER no longer aligns with that of the maximum cross-correlation. Finally, our results lead to two recommendations: 1) To mitigate Doppler impact on cross-correlation, we recommend utilizing low SFs, high orbit height, short ground device distance, and the transmission start time with high Doppler shift; 2) To mitigate Doppler impact on BER, we recommend employing low SFs, high bandwidth, and transmission start time with high Doppler rate. These conflicting recommendations regarding transmission start time highlight the necessity of Doppler shift compensation techniques to help operate LoRa in space properly.

Conformal Lyapunov Optimization: Optimal Resource Allocation under Deterministic Reliability Constraints

Authors: Francesco Binucci, Osvaldo Simeone, Paolo Banelli

This paper introduces conformal Lyapunov optimization (CLO), a novel resource allocation framework for networked systems that optimizes average long-term objectives, while satisfying deterministic long-term reliability constraints. Unlike traditional Lyapunov optimization (LO), which addresses resource allocation tasks under average long-term constraints, CLO provides formal worst-case deterministic reliability guarantees. This is achieved by integrating the standard LO optimization framework with online conformal risk control (O-CRC), an adaptive update mechanism controlling long-term risks. The effectiveness of CLO is verified via experiments for hierarchical edge inference targeting image segmentation tasks in a networked computing architecture. Specifically, simulation results confirm that CLO can control reliability constraints, measured via the false negative rate of all the segmentation decisions made in the network, while at the same time minimizing the weighted sum of energy consumption and precision loss, with the latter accounting for the rate of false positives.

Gain and phase type multipliers for feedback robustness

Authors: Axel Ringh, Xin Mao, Wei Chen, Li Qiu, Sei Zhen Khong

It is known that the stability of a feedback interconnection of two linear time-invariant systems implies that the graphs of the open-loop systems are quadratically separated. This separation is defined by an object known as the multiplier. The theory of integral quadratic constraints shows that the converse also holds under certain conditions. This paper establishes that if the feedback is robustly stable against certain structured uncertainty, then there always exists a multiplier that takes a corresponding form. In particular, if the feedback is robustly stable to certain gain-type uncertainty, then there exists a corresponding multiplier that is of phase-type, i.e., its diagonal blocks are zeros. These results build on the notion of phases of matrices and systems, which was recently introduced in the field of control. Similarly, if the feedback is robustly stable to certain phase-type uncertainty, then there exists a gain-type multiplier, i.e., its off-diagonal blocks are zeros. The results are meaningfully instructive in the search for a valid multiplier for establishing robust closed-loop stability, and cover the well-known small-gain and the recent small-phase theorems.

AoI-Energy-Spectrum Optimization in Post-Disaster Powered Communication Intelligent Network via Hierarchical Heterogeneous Graph Neural Network

Authors: Hanjian Liu, Jinsong Gui, Xiaoheng Deng

This paper designs a post-disaster powered communication intelligent network (PDPCIN) to address communication disruptions caused by ground base station (GBS) failures within the post-disaster area. PDPCIN employs unmanned aerial vehicles (UAVs) to provide wireless data collection (WDC) and wireless energy transmission (WET) for affected areas and leverages low earth orbit satellites (LEO SATs) to relay UAV data to the nearest surviving GBS. To ensure basic post-disaster communication while co-optimizing age of information (AoI), energy efficiency, and spectrum efficiency, an intelligent synchronization-UAV (IS-UAV) architecture, an AoI-based four thresholds updating (AFTU) mechanism, and a dynamic multi-LEO access (DMLA) strategy are proposed. However, three key challenges remain: time-varying task-resource imbalances, complex topology caused by multi-device scheduling, and nonlinear coupling in multidimensional metric optimization, making system optimization NP-hard. Therefore, this paper proposes a hierarchical heterogeneous graph neural network (HHGNN) framework. It models heterogeneous device nodes and their communication relations as a hierarchical heterogeneous graph structure, integrating our defined graph sensing, exchange, and mask layers to handle the network's input, feature propagation, and output. To search for an appropriate number of single-LEO SATs, we propose a single-LEO SAT density optimization (S-LSDO) algorithm. Finally, we compare the proposed scheme with state-of-the-art benchmarks to validate its superior collaborative optimization of AoI, energy efficiency, and spectrum efficiency. Based on this, we derive the expressions for the expected values of AoI and stagnant AoI proportion.

Relevant ArXiv eess Papers - 2025-07-15

Counterfactual optimization for fault prevention in complex wind energy systems

Authors: Emilio Carrizosa, Martina Fischetti, Roshell Haaker, Juan Miguel Morales

Machine Learning models are increasingly used in businesses to detect faults and anomalies in complex systems. In this work, we take this approach a step further: beyond merely detecting anomalies, we aim to identify the optimal control strategy that restores the system to a safe state with minimal disruption. We frame this challenge as a counterfactual problem: given a Machine Learning model that classifies system states as either good or anomalous, our goal is to determine the minimal adjustment to the system's control variables (i.e., its current status) that is necessary to return it to the good state. To achieve this, we leverage a mathematical model that finds the optimal counterfactual solution while respecting system specific constraints. Notably, most counterfactual analysis in the literature focuses on individual cases where a person seeks to alter their status relative to a decision made by a classifier, such as for loan approval or medical diagnosis. Our work addresses a fundamentally different challenge: optimizing counterfactuals for a complex energy system, specifically an offshore wind turbine oil type transformer. This application not only advances counterfactual optimization in a new domain but also opens avenues for broader research in this area. Our tests on real world data provided by our industrial partner show that our methodology easily adapts to user preferences and brings savings in the order of 3 million euros per year in a typical farm.

Modelling and Control of a Buck Converter Using State-Space Averaging and Classical Feedback Techniques

Authors: Sampson E. Nwachukwu

This study presents the modeling, control design, and performance analysis of a DC-DC buck converter using state-space averaging techniques. Buck converters are essential in modern power electronics for regulating DC voltages in renewable energy and electric vehicle systems. The paper first introduces the basic operation of buck converters and emphasizes the need for voltage regulation through closed-loop control systems. A state-space averaged model is derived to simplify the nonlinear switched dynamics, enabling a more effective analysis and controller design. The small-signal transfer function from the duty cycle to the output voltage is obtained to support control development. In addition, Proportional-Integral (PI) control based on the frequency-domain method is explored. The PI controller is tuned to achieve various phase margins and is evaluated through Bode plots, step responses, and performance metrics, revealing trade-offs between overshoot, settling time, and steady-state error. A complete simulation of the controlled buck converter verifies its ability to maintain a stable output voltage across wide input voltage variations. The results validate the effectiveness of state-space averaging in control design and highlight the robustness of feedback systems in power electronic converters.
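
To make the state-space averaged model and the duty-to-output transfer function concrete, here is a minimal sketch for an ideal, lossless buck converter in continuous conduction mode with a resistive load. The textbook model used here and every component value in the usage note are assumptions for illustration; the abstract does not give the paper's parameters or its exact model.

```python
import math


def averaged_derivatives(i_l, v_c, duty, v_in, inductance, capacitance, load_res):
    """Averaged state derivatives (di_L/dt, dv_C/dt) of an ideal CCM buck converter."""
    di_l = (duty * v_in - v_c) / inductance
    dv_c = (i_l - v_c / load_res) / capacitance
    return di_l, dv_c


def buck_gvd(freq_hz, v_in, inductance, capacitance, load_res):
    """Small-signal duty-to-output response G_vd(s) evaluated at s = j*2*pi*f.

    G_vd(s) = V_in / (L*C*s^2 + (L/R)*s + 1) for the ideal averaged model.
    """
    s = 1j * 2.0 * math.pi * freq_hz
    den = inductance * capacitance * s ** 2 + (inductance / load_res) * s + 1.0
    return v_in / den


def gain_db_and_phase_deg(response):
    """Convert a complex frequency response into Bode magnitude (dB) and phase (deg)."""
    return 20.0 * math.log10(abs(response)), math.degrees(math.atan2(response.imag, response.real))
```

For example, with the hypothetical values `v_in=12.0`, `inductance=100e-6`, `capacitance=100e-6`, and `load_res=5.0`, evaluating `buck_gvd` over a range of frequencies and passing each result to `gain_db_and_phase_deg` gives the plant's Bode data, which is the starting point for choosing PI gains that meet a target phase margin.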

Vertex-Guided Redundant Constraints Identification for Unit Commitment

Authors: Xuan He, Yuxin Pan, Yize Chen, Danny H.K. Tsang

The power system Unit Commitment (UC) problem determines the generator commitment schedule and dispatch decisions to realize the reliable and economic operation of power networks. The growing penetration of stochastic renewables and demand behaviors makes it necessary to solve the UC problem in a timely manner. It is possible to derive lightweight, faster-to-solve UC models via constraint screening to eliminate redundant constraints. However, the screening process remains computationally cumbersome due to the need to solve numerous linear programming (LP) problems. To reduce the number of LPs to solve, we introduce a novel perspective on such classic LP-based screening. Our key insight lies in the principle that redundant constraints will be satisfied by all vertices of the screened feasible region. Using the UC decision variables' bounds tightened by solving far fewer LPs, we build an outer approximation for the UC feasible region as the screened region. A matrix operation is then designed and applied to the outer approximation's vertices to identify all redundant constraints on-the-fly. Adjustments for the outer approximation are further explored to improve screening efficiency by considering the load operating range and cutting planes derived from UC cost and discrete unit status prediction. Extensive simulations are performed on a set of testbeds up to 2,383 buses to substantiate the effectiveness of the proposed schemes. Compared to classic LP-based screening, our schemes can achieve up to 8.8x acceleration while finding the same redundant constraints.
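
The vertex principle can be illustrated with a toy check. A linear function attains its maximum over a polytope at a vertex, so if a constraint holds at every vertex of an outer approximation, it holds everywhere inside it and is redundant. The sketch below assumes the outer approximation is a plain box given by variable bounds and enumerates its vertices directly. This brute-force enumeration and the function name `redundant_rows` are illustrative assumptions, not the matrix operation proposed in the paper.

```python
from itertools import product


def redundant_rows(a_rows, b_vals, lower, upper, tol=1e-9):
    """Indices i where a_rows[i] . x <= b_vals[i] holds at every vertex of the box.

    The box is lower[j] <= x[j] <= upper[j]; it has 2**n vertices, so this
    brute-force check is only practical for a handful of variables.
    """
    vertices = list(product(*zip(lower, upper)))
    redundant = []
    for i, (row, b) in enumerate(zip(a_rows, b_vals)):
        if all(sum(a * x for a, x in zip(row, v)) <= b + tol for v in vertices):
            redundant.append(i)
    return redundant
```

For the box 0 <= x <= 1, 0 <= y <= 2, the constraints x + y <= 4 and -x <= 0.5 are flagged as redundant, while x + y <= 2.5 and x - y <= 0.5 are kept because at least one vertex violates each of them.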

Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem

Authors: Zhentong Shao, Jingtao Qin, Nanpeng Yu

This paper proposes a neural stochastic optimization method for efficiently solving the two-stage stochastic unit commitment (2S-SUC) problem under high-dimensional uncertainty scenarios. The proposed method approximates the second-stage recourse problem using a deep neural network trained to map commitment decisions and uncertainty features to recourse costs. The trained network is subsequently embedded into the first-stage UC problem as a mixed-integer linear program (MILP), allowing for explicit enforcement of operational constraints while preserving the key uncertainty characteristics. A scenario-embedding network is employed to enable dimensionality reduction and feature aggregation across arbitrary scenario sets, serving as a data-driven scenario reduction mechanism. Numerical experiments on IEEE 5-bus, 30-bus, and 118-bus systems demonstrate that the proposed neural two-stage stochastic optimization method achieves solutions with an optimality gap of less than 1%, while enabling orders-of-magnitude speedup compared to conventional MILP solvers and decomposition-based methods. Moreover, the model's size remains constant regardless of the number of scenarios, offering significant scalability for large-scale stochastic unit commitment problems.

Electric Vehicle Public Charging Equity Considerations: A Systematic Review

Authors: Boyou Chen, Kaihan Zhang, Austin Moore, Bochen Jia, Mengqiu Cao

Public electric vehicle (EV) charging infrastructure is crucial for accelerating EV adoption and reducing transportation emissions; however, disparities in infrastructure access have raised significant equity concerns. This systematic review synthesizes existing knowledge and identifies gaps regarding equity in EV public charging research. Following structured review protocols, 91 peer-reviewed studies from Scopus and Google Scholar were analyzed, focusing explicitly on equity considerations. The findings indicate that current research on EV public charging equity mainly adopted geographic information systems (GIS), network optimization, behavioral modeling, and hybrid analytical frameworks, yet lacks consistent normative frameworks for assessing equity outcomes. Equity assessments highlight four key dimensions: spatial accessibility, cost burdens, reliability and usability, and user awareness and trust. Socio-economic disparities, particularly income, housing tenure, and ethnicity, frequently exacerbate inequitable access, disproportionately disadvantaging low-income, renter, and minority populations. Additionally, infrastructure-specific choices, including charger reliability, strategic location, and pricing strategies, significantly influence adoption patterns and equity outcomes. However, existing literature primarily reflects North American, European, and Chinese contexts, revealing substantial geographical and methodological limitations. This review suggests the need for more robust normative evaluations of equity, comprehensive demographic data integration, and advanced methodological frameworks, thereby guiding targeted, inclusive, and context-sensitive infrastructure planning and policy interventions.

Survey on Methods for Detection, Classification and Location of Faults in Power Systems Using Artificial Intelligence

Authors: Juan A. Martinez-Velasco, Alexandre Serrano-Fontova, Ricard Bosch-Tous, Pau Casals-Torrens

Components of electrical power systems are susceptible to failures caused by lightning strikes, aging or human errors. These faults can cause equipment damage, affect system reliability, and result in expensive repair costs. As electric power systems are becoming more complex, traditional protection methods face limitations and shortcomings. Faults in power systems can occur at any time and anywhere, can be caused by a natural disaster or an accident, and their occurrence can hardly be predicted or avoided; therefore, it is crucial to accurately estimate the fault location and quickly restore service. The development of methods capable of accurately detecting, locating and removing faults is essential (i.e. fast isolation of faults is necessary to maintain the system stability at transmission levels; accurate and fast detection and location of faults are essential for increasing reliability and customer satisfaction at distribution levels). This has motivated the development of new and more efficient methods. Methods developed to detect and locate faults in power systems can be divided into two categories, conventional and artificial intelligence-based techniques. Although artificial intelligence (AI) techniques offer tremendous potential, they are challenging and time consuming to apply (e.g. many AI techniques require training data for processing). This paper presents a survey of the application of AI techniques to fault diagnosis (detection, classification and location of faults) of lines and cables of power systems at both transmission and distribution levels. The paper provides a short introduction to AI concepts, a brief summary of the application of AI techniques to power system analysis and design, and a discussion on AI-based fault diagnosis methods.

Deep Learning-Based Beamforming Design Using Target Beam Patterns

Authors: Hongpu Zhang, Shu Sun, Hangsong Yan, Jianhua Mo

This paper proposes a deep learning-based beamforming design framework that directly maps a target beam pattern to optimal beamforming vectors across multiple antenna array architectures, including digital, analog, and hybrid beamforming. The proposed method employs a lightweight encoder-decoder network where the encoder compresses the complex beam pattern into a low-dimensional feature vector and the decoder reconstructs the beamforming vector while satisfying hardware constraints. To address training challenges under diverse and limited channel state information (CSI) conditions, a two-stage training process is introduced, which consists of offline pre-training for robust feature extraction using an auxiliary module, followed by online training of the decoder with a composite loss function that ensures alignment between the synthesized and target beam patterns in terms of the main lobe shape and side lobe suppression. Simulation results based on NYUSIM-generated channels show that the proposed method can achieve spectral efficiency close to that of fully digital beamforming under limited CSI and outperforms representative existing methods.

Optimal Battery Placement in Power Grid

Authors: Ruotong Sun, Ermin Wei, Lihui Yi

We study the optimal placement of an unlimited-capacity battery in power grids under a centralized market model, where the independent system operator (ISO) aims to minimize total generation costs through load shifting. The optimal battery placement is not well understood in the existing literature, especially regarding the influence of network topology on minimizing generation costs. Our work starts with decomposing the Mixed-Integer Linear Programming (MILP) problem into a series of Linear Programming (LP) formulations. For power grids with sufficiently large generation capacity or tree topologies, we derive analytical cost expressions demonstrating that, under reasonable assumptions, the weighted degree is the only topological factor for optimal battery placement. We also discuss the minor impact of higher-order topological conditions on tree-topology networks. To characterize the localized nature of a single battery's impact, we establish that the relative cost-saving benefit of a single battery decreases as the network scales. Furthermore, we design a low-complexity algorithm for weakly-cyclic networks. Numerical experiments show that our algorithm is not only approximately 100 times faster than commercial solvers but also maintains high accuracy even when some theoretical assumptions are relaxed.

Intrinsic frequency distribution characterises neural dynamics

Authors: Ryohei Fukuma, Yoshinobu Kawahara, Okito Yamashita, Kei Majima, Haruhiko Kishima, Takufumi Yanagisawa

Decomposing multivariate time series with certain basic dynamics is crucial for understanding, predicting and controlling nonlinear spatiotemporally dynamic systems such as the brain. Dynamic mode decomposition (DMD) is a method for decomposing nonlinear spatiotemporal dynamics into several basic dynamics (dynamic modes; DMs) with intrinsic frequencies and decay rates. In particular, unlike Fourier transform-based methods, which are used to decompose a single-channel signal into the amplitudes of sinusoidal waves with discrete frequencies at a regular interval, DMD can derive the intrinsic frequencies of a multichannel signal on the basis of the available data; furthermore, it can capture nonstationary components such as alternations between states with different intrinsic frequencies. Here, we propose the use of the distribution of intrinsic frequencies derived from DMDs (DM frequencies) to characterise neural activities. The distributions of DM frequencies in the electroencephalograms of healthy subjects and patients with dementia or Parkinson's disease in a resting state were evaluated. By using the distributions, these patients were distinguished from healthy subjects with significantly greater accuracy than when using amplitude spectra derived by discrete Fourier transform. This finding suggests that the distribution of DM frequencies exhibits distinct behaviour from amplitude spectra, and therefore, the distribution may serve as a new biomarker by characterising the nonlinear spatiotemporal dynamics of electrophysiological signals.

Cyclic Multichannel Wiener Filter for Acoustic Beamforming

Authors: Giovanni Bologni, Richard Heusdens, Richard C. Hendriks

Acoustic beamforming models typically assume wide-sense stationarity of speech signals within short time frames. However, voiced speech is better modeled as a cyclostationary (CS) process, a random process whose mean and autocorrelation are $T_1$-periodic, where $\alpha_1=1/T_1$ corresponds to the fundamental frequency of vowels. Higher harmonic frequencies are found at integer multiples of the fundamental. This work introduces a cyclic multichannel Wiener filter (cMWF) for speech enhancement derived from a cyclostationary model. This beamformer exploits spectral correlation across the harmonic frequencies of the signal to further reduce the mean-squared error (MSE) between the target and the processed input. The proposed cMWF is optimal in the MSE sense and reduces to the MWF when the target is wide-sense stationary. Experiments demonstrate considerable improvements in scale-invariant signal-to-distortion ratio (SI-SDR) on synthetic data but also indicate high sensitivity to the accuracy of the estimated fundamental frequency $\alpha_1$, which limits effectiveness on real data.

Pinching-Antenna Systems for Physical Layer Security

Authors: Kaidi Wang, Zhiguo Ding, Naofal Al-Dhahir

This letter investigates the potential of pinching-antenna systems for enhancing physical layer security. By pre-installing multiple pinching antennas at discrete positions along a waveguide, the capability of the considered system to perform amplitude and phase adjustment is validated through the formulation of a secrecy rate maximization problem. Specifically, amplitude control is applied to enhance the signal quality at the legitimate user, while phase alignment is designed to degrade the received signal quality at the eavesdropper. This cooperation among pinching antennas is modeled as a coalitional game, and a corresponding antenna activation algorithm is proposed. The individual impact of each antenna is quantified based on the Shapley value and marginal contribution, providing a fair and efficient method for performance evaluation. Simulation results show that the considered pinching-antenna system achieves significant improvements in secrecy rate, and that the Shapley value based algorithm outperforms conventional coalition value based solutions.
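
The Shapley value mentioned above is the average marginal contribution of a player over all orders in which the players could join the coalition. A minimal sketch of the exact computation follows. The characteristic function in the usage note (per-antenna gains plus a pairwise bonus) is a made-up toy game, not the secrecy-rate-based coalition value from the letter, and exact enumeration is only feasible for a few players.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all join orders.

    `value` maps a frozenset of players to a number, with value(frozenset()) = 0.
    The cost grows as n!, so this is only meant for small games.
    """
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += value(joined) - value(coalition)
            coalition = joined
    return {p: t / len(orders) for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical game: individual gains, plus a bonus when 'a' and 'b' cooperate."""
    gains = {"a": 1.0, "b": 2.0, "c": 3.0}
    bonus = 1.5 if {"a", "b"} <= coalition else 0.0
    return sum(gains[p] for p in coalition) + bonus
```

Calling `shapley_values(["a", "b", "c"], toy_value)` splits the pairwise bonus evenly between "a" and "b" and leaves "c" with its individual gain, and the values sum to the grand coalition's worth.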

Harmonics to the Rescue: Why Voiced Speech is Not a WSS Process

Authors: Giovanni Bologni, Richard Heusdens, Richard C. Hendriks

Speech processing algorithms often rely on statistical knowledge of the underlying process. Despite many years of research, however, the debate on the most appropriate statistical model for speech still continues. Speech is commonly modeled as a wide-sense stationary (WSS) process. However, the use of the WSS model for spectrally correlated processes is fundamentally wrong, as WSS implies spectral uncorrelation. In this paper, we demonstrate that voiced speech can be more accurately represented as a cyclostationary (CS) process. By employing the CS rather than the WSS model for processes that are inherently correlated across frequency, it is possible to improve the estimation of cross-power spectral densities (PSDs), source separation, and beamforming. We illustrate how the correlation between harmonic frequencies of CS processes can enhance system identification, and validate our findings using both simulated and real speech data.

Natural Language-based Assessment of L2 Oral Proficiency using LLMs

Authors: Stefano Bannò, Rao Ma, Mengjie Qian, Siyuan Tang, Kate Knill, Mark Gales

Natural language-based assessment (NLA) is an approach to second language assessment that uses instructions - expressed in the form of can-do descriptors - originally intended for human examiners, aiming to determine whether large language models (LLMs) can interpret and apply them in ways comparable to human assessment. In this work, we explore the use of such descriptors with an open-source LLM, Qwen 2.5 72B, to assess responses from the publicly available S&I Corpus in a zero-shot setting. Our results show that this approach - relying solely on textual information - achieves competitive performance: while it does not outperform state-of-the-art speech LLMs fine-tuned for the task, it surpasses a BERT-based model trained specifically for this purpose. NLA proves particularly effective in mismatched task settings, is generalisable to other data types and languages, and offers greater interpretability, as it is grounded in clearly explainable, widely applicable language descriptors.

A new time-stepping strategy and boundary treatment to improve recent 2d traffic model

Authors: Friedemann Kemm

We show how a recently published 2d model for traffic flow can be further improved. Besides other improvements and simplifications, we present not only a method to compute the necessary time step restrictions, but also a subcycling for the inflow and outflow. This drastically reduces computational cost on large domains with coarse grids, i.e., for simulations of a whole region instead of a small part of a city or town.

DepViT-CAD: Deployable Vision Transformer-Based Cancer Diagnosis in Histopathology

Authors: Ashkan Shakarami, Lorenzo Nicole, Rocco Cappellesso, Angelo Paolo Dei Tos, Stefano Ghidoni

Accurate and timely cancer diagnosis from histopathological slides is vital for effective clinical decision-making. This paper introduces DepViT-CAD, a deployable AI system for multi-class cancer diagnosis in histopathology. At its core is MAViT, a novel Multi-Attention Vision Transformer designed to capture fine-grained morphological patterns across diverse tumor types. MAViT was trained on expert-annotated patches from 1008 whole-slide images, covering 11 diagnostic categories, including 10 major cancers and non-tumor tissue. DepViT-CAD was validated on two independent cohorts: 275 WSIs from The Cancer Genome Atlas and 50 routine clinical cases from pathology labs, achieving diagnostic sensitivities of 94.11% and 92%, respectively. By combining state-of-the-art transformer architecture with large-scale real-world validation, DepViT-CAD offers a robust and scalable approach for AI-assisted cancer diagnostics. To support transparency and reproducibility, software and code will be made publicly available at GitHub.

ASDKit: A Toolkit for Comprehensive Evaluation of Anomalous Sound Detection Methods

Authors: Takuya Fujimura, Kevin Wilkinghoff, Keisuke Imoto, Tomoki Toda

In this paper, we introduce ASDKit, a toolkit for the anomalous sound detection (ASD) task. Our aim is to facilitate ASD research by providing an open-source framework that collects and carefully evaluates various ASD methods. First, ASDKit provides training and evaluation scripts for a wide range of ASD methods, all handled within a unified framework. For instance, it includes the autoencoder-based official DCASE baseline, representative discriminative methods, and self-supervised learning-based methods. Second, it supports comprehensive evaluation on the DCASE 2020-2024 datasets, enabling careful assessment of ASD performance, which is highly sensitive to factors such as datasets and random seeds. In our experiments, we re-evaluate various ASD methods using ASDKit and identify consistently effective techniques across multiple datasets and trials. We also demonstrate that ASDKit reproduces state-of-the-art-level performance on the considered datasets.

A SUMO-Based Digital Twin for Evaluation of Conventional and Electric Vehicle Networks

Authors: Haomiaomiao Wang, Conor Fennell, Swati Poojary, Mingming Liu

Digital twins are increasingly applied in transportation modelling to replicate real-world traffic dynamics and evaluate mobility and energy efficiency. This study presents a SUMO-based digital twin that simulates mixed ICEV-EV traffic on a major motorway segment, leveraging multi-sensor data fusion from inductive loops, GPS probes, and toll records. The model is validated under both complete and partial information scenarios, achieving 93.1% accuracy in average speed estimation and 97.1% in average trip length estimation. Statistical metrics, including KL Divergence and Wasserstein Distance, demonstrate strong alignment between simulated and observed traffic patterns. Furthermore, CO2 emissions were overestimated by only 0.8-2.4%, and EV power consumption underestimated by 1.0-5.4%, highlighting the model's robustness even with incomplete vehicle classification information.
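
For readers unfamiliar with the two statistical metrics named above, the sketch below shows how they can be computed in the simplest setting: KL divergence between two histograms over the same bins, and the 1-Wasserstein distance between two equal-size one-dimensional samples. The function names and the equal-size restriction are assumptions of this example; the abstract does not describe how the study computed these metrics.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same bins.

    Assumes q[i] > 0 wherever p[i] > 0; bins with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size empirical 1-D samples.

    For equal sample sizes this is the mean absolute difference of sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

Both metrics are zero when the simulated and observed distributions coincide and grow as they diverge. KL divergence compares bin probabilities, whereas the Wasserstein distance also accounts for how far apart the mismatched values are.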

Enhanced Throughput and Seamless Handover Solutions for Urban 5G-Vehicle C-Band Integrated Satellite-Terrestrial Networks

Authors: Hung Nguyen-Kha, Vu Nguyen Ha, Eva Lagunas, Symeon Chatzinotas, Joel Grotz

This paper investigates downlink transmission in 5G Integrated Satellite-Terrestrial Networks (ISTNs) supporting automotive users (UEs) in urban environments, where base stations (BSs) and Low Earth Orbit (LEO) satellites (LSats) cooperate to serve moving UEs over shared C-band frequency carriers. Urban settings, characterized by dense obstructions, together with UE mobility and the dynamic movement and coverage of LSats, pose significant challenges to user association and resource allocation. To address these challenges, we formulate a multi-objective optimization problem designed to improve both throughput and seamless handover (HO). In particular, the formulated problem balances sum-rate (SR) maximization and connection change (CC) minimization through a weighted trade-off by jointly optimizing power allocation and BS-UE/LSat-UE associations over a given time window. This is a mixed-integer, non-convex problem that is inherently difficult to solve. To solve this problem efficiently, we propose an iterative algorithm based on the Successive Convex Approximation (SCA) technique. Furthermore, we introduce a practical prediction-based algorithm capable of providing efficient solutions in real-world implementations. In particular, the simulations use a realistic 3D map of London and UE routes obtained from the Google Navigator application to ensure practical examination. With these realistic data, the simulation results offer valuable insights into link budget assessment in urban areas, where buildings affect transmission links through blockage, reflection, and diffraction. Furthermore, the numerical results demonstrate the effectiveness of our proposed algorithms in terms of SR and the CC-number compared to the greedy and benchmark algorithms.

Improved Sum-of-Squares Stability Verification of Neural-Network-Based Controllers

Authors: Alvaro Detailleur, Guillaume Ducard, Christopher Onder

This work presents several improvements to the closed-loop stability verification framework using semialgebraic sets and convex semidefinite programming to examine neural-network-based control systems regulating nonlinear dynamical systems. First, the utility of the framework is greatly expanded: two semialgebraic functions mimicking common, smooth activation functions are presented and compatibility with control systems incorporating Recurrent Equilibrium Networks (RENs) and thereby Recurrent Neural Networks (RNNs) is established. Second, the validity of the framework's state-of-the-art stability analyses is established via an alternate proof. Third, based on this proof, two new optimization problems simplifying the analysis of local stability properties are presented. To simplify the analysis of a closed-loop system's Region of Attraction (RoA), the first problem explicitly parameterizes a class of candidate Lyapunov functions larger than in previous works. The second problem utilizes the unique guarantees available under the condition of invariance to further expand the set of candidate Lyapunov functions and directly determine whether an invariant set forms part of the system's RoA. These contributions are successfully demonstrated in two numerical examples and suggestions for future research are provided.

Domain-Adaptive Diagnosis of Lewy Body Disease with Transferability Aware Transformer

Authors: Xiaowei Yu, Jing Zhang, Tong Chen, Yan Zhuang, Minheng Chen, Chao Cao, Yanjun Lyu, Lu Zhang, Li Su, Tianming Liu, Dajiang Zhu

Lewy Body Disease (LBD) is a common yet understudied form of dementia that imposes a significant burden on public health. It shares clinical similarities with Alzheimer's disease (AD), as both progress through stages of normal cognition, mild cognitive impairment, and dementia. A major obstacle in LBD diagnosis is data scarcity, which limits the effectiveness of deep learning. In contrast, AD datasets are more abundant, offering potential for knowledge transfer. However, LBD and AD data are typically collected from different sites using different machines and protocols, resulting in a distinct domain shift. To effectively leverage AD data while mitigating domain shift, we propose a Transferability Aware Transformer (TAT) that adapts knowledge from AD to enhance LBD diagnosis. Our method utilizes structural connectivity (SC) derived from structural MRI as training data. Built on the attention mechanism, TAT adaptively assigns greater weights to disease-transferable features while suppressing domain-specific ones, thereby reducing domain shift and improving diagnostic accuracy with limited LBD data. The experimental results demonstrate the effectiveness of TAT. To the best of our knowledge, this is the first study to explore domain adaptation from AD to LBD under conditions of data scarcity and domain shift, providing a promising framework for domain-adaptive diagnosis of rare diseases.

Less Stress, More Privacy: Stress Detection on Anonymized Speech of Air Traffic Controllers

Authors: Janaki Viswanathan, Alexander Blatt, Konrad Hagemann, Dietrich Klakow

Air traffic control (ATC) demands multi-tasking under time pressure with high consequences of an error. This can induce stress. Detecting stress is a key point in maintaining the high safety standards of ATC. However, processing ATC voice data entails privacy restrictions, e.g. the General Data Protection Regulation (GDPR) law. Anonymizing the ATC voice data is one way to comply with these restrictions. In this paper, different architectures for stress detection for anonymized ATCO speech are evaluated. Our best networks reach a stress detection accuracy of 93.6% on an anonymized version of the Speech Under Simulated and Actual Stress (SUSAS) dataset and an accuracy of 80.1% on our anonymized ATC simulation dataset. This shows that privacy does not have to be an impediment in building well-performing deep-learning-based models.

CovertAuth: Joint Covert Communication and Authentication in MmWave Systems

Authors: Yulin Teng, Keshuang Han, Pinchang Zhang, Xiaohong Jiang, Yulong Shen, Fu Xiao

Beam alignment (BA) is a crucial process in millimeter-wave (mmWave) communications, enabling precise directional transmission and efficient link establishment. However, due to characteristics like omnidirectional exposure and the broadcast nature of the BA phase, it is particularly vulnerable to eavesdropping and identity impersonation attacks. To this end, this paper proposes a novel secure framework named CovertAuth, designed to enhance the security of the BA phase against such attacks. In particular, to combat eavesdropping attacks, the closed-form expressions of successful BA probability and covert transmission rate are first derived. Then, a covert communication problem aimed at jointly optimizing beam training budget and transmission power is formulated to maximize covert communication rate, subject to the covertness requirement. An alternating optimization algorithm combined with successive convex approximation is employed to iteratively achieve optimal results. To combat impersonation attacks, the mutual coupling effect of antenna array impairments is explored as a device feature to design a weighted-sum energy detector based physical layer authentication scheme. Moreover, theoretical models for authentication metrics like detection and false alarm probabilities are also provided to conduct performance analysis. Based on these models, an optimization problem is constructed to determine the optimal weight value that maximizes authentication accuracy. Finally, simulation results demonstrate that CovertAuth presents improved detection accuracy under the same covertness requirement compared to existing works.

MQFQ-Sticky: Fair Queueing For Serverless GPU Functions

Authors: Alexander Fuerst, Siddharth Anil, Vishakha Dixit, Purushottam (Puru) Kulkarni, Prateek Sharma

Hardware accelerators like GPUs are now ubiquitous in data centers, but are not fully supported by common cloud abstractions such as Functions as a Service (FaaS). Many popular and emerging FaaS applications such as machine learning and scientific computing can benefit from GPU acceleration. However, FaaS frameworks (such as OpenWhisk) are not capable of providing this acceleration because of the impedance mismatch between GPUs and the FaaS programming model, which requires virtualization and sandboxing of each function. The challenges are amplified due to the highly dynamic and heterogeneous FaaS workloads. This paper presents the design and implementation of a FaaS system for providing GPU acceleration in a black-box manner (without modifying function code). Running small functions in containerized sandboxes is challenging due to limited GPU concurrency and high cold-start overheads, resulting in heavy queueing of function invocations. We show how principles from I/O scheduling, such as fair queuing and anticipatory scheduling, can be translated to function scheduling on GPUs. We develop MQFQ-Sticky, an integrated fair queueing and GPU memory management approach, which balances the tradeoffs between locality, fairness, and latency. Empirical evaluation on a range of workloads shows that it reduces function latency by 2x to 20x compared to existing GPU and CPU queueing policies.
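
The fair queueing principle mentioned in this abstract can be sketched as follows. This is not MQFQ-Sticky: it is a simplified virtual-finish-time scheme in the spirit of classic self-clocked fair queueing, with no GPU memory management or locality handling. The class name `FairQueue`, the flow labels, and the unit costs are hypothetical.

```python
import heapq

class FairQueue:
    """Simplified virtual-finish-time fair queueing over per-function flows."""

    def __init__(self):
        self.virtual_time = 0.0
        self.last_finish = {}  # flow -> finish tag of its latest request
        self.heap = []         # entries: (finish_tag, arrival_seq, flow)
        self.seq = 0

    def enqueue(self, flow, cost, weight=1.0):
        # A request starts where its flow's previous request finished,
        # or at the current virtual time if the flow was idle.
        start = max(self.virtual_time, self.last_finish.get(flow, 0.0))
        finish = start + cost / weight
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow))
        self.seq += 1

    def dequeue(self):
        # Serve the request with the smallest finish tag; ties go to the
        # earlier arrival. Virtual time advances to the served tag.
        finish, _, flow = heapq.heappop(self.heap)
        self.virtual_time = finish
        return flow

fq = FairQueue()
for _ in range(3):
    fq.enqueue("A", cost=1.0)  # a bursty function
fq.enqueue("B", cost=1.0)      # a single late arrival from another function
order = [fq.dequeue() for _ in range(4)]
# order == ['A', 'B', 'A', 'A']: B is not stuck behind A's whole burst
```

Under plain FIFO the request from B would wait for all three requests from A. Ordering by finish tags interleaves the flows so that each receives service in proportion to its weight.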

On the Gradient Domination of the LQG Problem

Authors: Kasra Fallah, Leonardo F. Toso, James Anderson

We consider solutions to the linear quadratic Gaussian (LQG) regulator problem via policy gradient (PG) methods. Although PG methods have demonstrated strong theoretical guarantees in solving the linear quadratic regulator (LQR) problem, despite its nonconvex landscape, their theoretical understanding in the LQG setting remains limited. Notably, the LQG problem lacks gradient dominance in the classical parameterization, i.e., with a dynamic controller, which hinders global convergence guarantees. In this work, we study PG for the LQG problem by adopting an alternative parameterization of the set of stabilizing controllers and employing a lifting argument. We refer to this parameterization as a history representation of the control input as it is parameterized by past input and output data from the previous p time-steps. This representation enables us to establish gradient dominance and approximate smoothness for the LQG cost. We prove global convergence and per-iteration stability guarantees for policy gradient LQG in model-based and model-free settings. Numerical experiments on an open-loop unstable system are provided to support the global convergence guarantees and to illustrate convergence under different history lengths of the history representation.

Behavioral Exploration: Learning to Explore via In-Context Adaptation

Authors: Andrew Wagenmaker, Zhiyuan Zhou, Sergey Levine

Developing autonomous agents that quickly explore an environment and adapt their behavior online is a canonical challenge in robotics and machine learning. While humans are able to achieve such fast online exploration and adaptation, often acquiring new information and skills in only a handful of interactions, existing algorithmic approaches tend to rely on random exploration and slow, gradient-based behavior updates. How can we endow autonomous agents with such capabilities on par with humans? Taking inspiration from recent progress on both in-context learning and large-scale behavioral cloning, in this work we propose behavioral exploration: training agents to internalize what it means to explore and adapt in-context over the space of "expert" behaviors. To achieve this, given access to a dataset of expert demonstrations, we train a long-context generative model to predict expert actions conditioned on a context of past observations and a measure of how "exploratory" the expert's behaviors are relative to this context. This enables the model to not only mimic the behavior of an expert, but also, by feeding its past history of interactions into its context, to select expert behaviors different from those previously selected, thereby allowing for fast online adaptation and targeted, "expert-like" exploration. We demonstrate the effectiveness of our method in both simulated locomotion and manipulation settings, as well as on real-world robotic manipulation tasks, illustrating its ability to learn adaptive, exploratory behavior.

Imitation Learning in Continuous Action Spaces: Mitigating Compounding Error without Interaction

Authors: Thomas T. Zhang, Daniel Pfrommer, Nikolai Matni, Max Simchowitz

We study the problem of imitating an expert demonstrator in a continuous state-and-action dynamical system. While imitation learning in discrete settings such as autoregressive language modeling has seen immense success and popularity in recent years, imitation in physical settings such as autonomous driving and robot learning has proven comparably more complex due to the compounding errors problem, often requiring elaborate set-ups to perform stably. Recent work has demonstrated that even in benign settings, exponential compounding errors are unavoidable when learning solely from expert-controlled trajectories, suggesting the need for more advanced policy parameterizations or data augmentation. To this end, we present minimal interventions that provably mitigate compounding errors in continuous state-and-action imitation learning. When the system is open-loop stable, we prescribe "action chunking," i.e., predicting and playing sequences of actions in open-loop; when the system is possibly unstable, we prescribe "noise injection," i.e., adding noise during expert demonstrations. These interventions align with popular choices in modern robot learning, though the benefits we derive are distinct from the effects they were designed to target. Our results draw insights and tools from both control theory and reinforcement learning; however, our analysis reveals novel considerations that do not naturally arise when either literature is considered in isolation.

Continuous-Time Signal Decomposition: An Implicit Neural Generalization of PCA and ICA

Authors: Shayan K. Azmoodeh, Krishna Subramani, Paris Smaragdis

We generalize the low-rank decomposition problem, such as principal and independent component analysis (PCA, ICA), to continuous-time vector-valued signals and provide a model-agnostic implicit neural signal representation framework to learn numerical approximations to solve the problem. Modeling signals as continuous-time stochastic processes, we unify the approaches to both the PCA and ICA problems in the continuous setting through a contrast function term in the network loss, enforcing the desired statistical properties of the source signals (decorrelation, independence) learned in the decomposition. This extension to a continuous domain allows the application of such decompositions to point clouds and irregularly sampled signals where standard techniques are not applicable.

Transformer based Collaborative Reinforcement Learning for Fluid Antenna System (FAS)-enabled 3D UAV Positioning

Authors: Xiaoren Xu, Hao Xu, Dongyu Wei, Walid Saad, Mehdi Bennis, Mingzhe Chen

In this paper, a novel three-dimensional (3D) positioning framework of fluid antenna system (FAS)-enabled unmanned aerial vehicles (UAVs) is developed. In the proposed framework, a set of controlled UAVs cooperatively estimate the real-time 3D position of a target UAV. Here, the active UAV transmits a measurement signal to the passive UAVs via the reflection from the target UAV. Each passive UAV estimates the distance of the active-target-passive UAV link and selects an antenna port to share the distance information with the base station (BS) that calculates the real-time position of the target UAV. As the target UAV is moving due to its task operation, the controlled UAVs must optimize their trajectories and select optimal antenna ports, aiming to estimate the real-time position of the target UAV. We formulate this problem as an optimization problem to minimize the target UAV positioning error via optimizing the trajectories of all controlled UAVs and antenna port selection of passive UAVs. Here, an attention-based recurrent multi-agent reinforcement learning (AR-MARL) scheme is proposed, which enables each controlled UAV to use the local Q function to determine its trajectory and antenna port while optimizing the target UAV positioning performance without knowing the trajectories and antenna port selections of other controlled UAVs. Unlike current MARL methods, the proposed method uses a recurrent neural network (RNN) that incorporates historical state-action pairs of each controlled UAV, and an attention mechanism to analyze the importance of these historical state-action pairs, thus improving the global Q function approximation accuracy and the target UAV positioning accuracy. Simulation results show that the proposed AR-MARL scheme can reduce the average positioning error by up to 17.5% and 58.5% compared to the VD-MARL scheme and the proposed method without FAS.

RoHOI: Robustness Benchmark for Human-Object Interaction Detection

Authors: Di Wen, Kunyu Peng, Kailun Yang, Yufan Chen, Ruiping Liu, Junwei Zheng, Alina Roitberg, Rainer Stiefelhagen

Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. However, models trained on clean datasets degrade in real-world conditions due to unforeseen corruptions, leading to inaccurate prediction. To address this, we introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Despite advances, current models struggle with environmental variability, occlusion, and noise. Our benchmark, RoHOI, includes 20 corruption types based on HICO-DET and V-COCO datasets and a new robustness-focused metric. We systematically analyze existing models in the related field, revealing significant performance drops under corruptions. To improve robustness, we propose a Semantic-Aware Masking-based Progressive Learning (SAMPL) strategy to guide the model to be optimized based on holistic and partial cues, dynamically adjusting the model's optimization to enhance robust feature learning. Extensive experiments show our approach outperforms state-of-the-art methods, setting a new standard for robust HOI detection. Benchmarks, datasets, and code will be made publicly available at this https URL.

Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition

Authors: Bingshen Mu, Kun Wei, Pengcheng Guo, Lei Xie

Despite substantial improvements in ASR, performance tends to degrade when faced with adverse conditions such as speaker accents. Generative error correction (GER) leverages the rich linguistic knowledge and exceptional reasoning ability of LLMs, significantly outperforming typical LM methods. However, it lacks specificity in accented speech scenarios. In this study, we leverage GER to improve the accuracy of transcription predictions by addressing the two primary features of accented speech recognition. To fully leverage pronunciation information, we propose the multi-modal GER, which integrates pronunciation information from the speech modality, and the multi-granularity GER, which incorporates fine-grained phoneme-level information related to pronunciation. These two methods enable the LLM to utilize the pronunciation information of accented speech and the semantic information from word-level hypotheses for accurate transcription predictions through LoRA fine-tuning. On the one hand, we employ a three-stage training strategy to train separate multi-modal GER models for each accent to obtain mono-accent LoRA experts. By adopting our proposed HDMoLE method, which incorporates hierarchical routing and dynamic thresholds within the mixture of LoRA experts, we effectively merge multiple mono-accent LoRA experts within a single multi-modal GER to overcome the challenges posed by accent diversity. On the other hand, multi-granularity GER leverages the N-best word-level and phoneme-level hypotheses generated by the HDMoLE model to predict the final accented speech transcriptions. Experimental results on the multi-accent English dataset demonstrate the efficacy of our proposed methods. Our methods achieve a remarkable relative WER reduction of 67.35% compared to the Whisper-large-v3 baseline.

Towards Spatial Audio Understanding via Question Answering

Authors: Parthasaarathy Sudarsanam, Archontis Politis

In this paper, we introduce a novel framework for spatial audio understanding of first-order ambisonic (FOA) signals through a question answering (QA) paradigm, aiming to extend the scope of sound event localization and detection (SELD) towards spatial scene understanding and reasoning. First, we curate and release fine-grained spatio-temporal textual descriptions for the STARSS23 dataset using a rule-based approach, and further enhance linguistic diversity using large language model (LLM)-based rephrasing. We also introduce a QA dataset aligned with the STARSS23 scenes, covering various aspects such as event presence, localization, spatial, and temporal relationships. To increase language variety, we again leverage LLMs to generate multiple rephrasings per question. Finally, we develop a baseline spatial audio QA model that takes FOA signals and natural language questions as input and provides answers regarding various occurrences, temporal, and spatial relationships of sound events in the scene formulated as a classification task. Despite being trained solely with scene-level question answering supervision, our model achieves performance that is comparable to a fully supervised sound event localization and detection model trained with frame-level spatiotemporal annotations. The results highlight the potential of language-guided approaches for spatial audio understanding and open new directions for integrating linguistic supervision into spatial scene analysis.

Controllable Patching for Compute-Adaptive Surrogate Modeling of Partial Differential Equations

Authors: Payel Mukhopadhyay, Michael McCabe, Ruben Ohana, Miles Cranmer

Patch-based transformer surrogates have become increasingly effective for modeling spatiotemporal dynamics, but the fixed patch size is a major limitation for budget-conscious deployment in production. We introduce two lightweight, architecture-agnostic modules, the Convolutional Kernel Modulator (CKM) and Convolutional Stride Modulator (CSM), that enable dynamic patch size control at inference in patch-based models, without retraining or accuracy loss. Combined with a cyclic patch-size rollout, our method mitigates patch artifacts and improves long-term stability for video-like prediction tasks. Applied to a range of challenging 2D and 3D PDE benchmarks, our approach improves rollout fidelity and runtime efficiency. To our knowledge, this is the first framework to enable inference-time patch-size tunability in patch-based PDE surrogates. Its plug-and-play design makes it broadly applicable across architectures, establishing a general foundation for compute-adaptive modeling in PDE surrogate tasks.

ClaritySpeech: Dementia Obfuscation in Speech

Authors: Dominika Woszczyk, Ranya Aloufi, Soteris Demetriou

Dementia, a neurodegenerative disease, alters speech patterns, creating communication barriers and raising privacy concerns. Current speech technologies, such as automatic speech transcription (ASR), struggle with dementia and atypical speech, further challenging accessibility. This paper presents a novel dementia obfuscation in speech framework, ClaritySpeech, integrating ASR, text obfuscation, and zero-shot text-to-speech (TTS) to correct dementia-affected speech while preserving speaker identity in low-data environments without fine-tuning. Results show a 16% and 10% drop in mean F1 score across various adversarial settings and modalities (audio, text, fusion) for ADReSS and ADReSSo, respectively, maintaining 50% speaker similarity. We also find that our system improves WER (from 0.73 to 0.08 for ADReSS and 0.15 for ADReSSo) and speech quality from 1.65 to ~2.15, enhancing privacy and accessibility.

DAA*: Deep Angular A Star for Image-based Path Planning

Authors: Zhiwei Xu

Path smoothness is often overlooked in path imitation learning from expert demonstrations. In this paper, we introduce a novel learning method, termed deep angular A* (DAA*), by incorporating the proposed path angular freedom (PAF) into A* to improve path similarity through adaptive path smoothness. The PAF aims to explore the effect of move angles on path node expansion by finding the trade-off between their minimum and maximum values, allowing for high adaptiveness for imitation learning. DAA* improves path optimality by closely aligning with the reference path through joint optimization of path shortening and smoothing, which correspond to heuristic distance and PAF, respectively. Through comprehensive evaluations on 7 datasets, including 4 maze datasets, 2 video-game datasets, and a real-world drone-view dataset containing 2 scenarios, we demonstrate remarkable improvements of our DAA* over neural A* in path similarity between the predicted and reference paths with a shorter path length when the shortest path is plausible, improving by 9.0% SPR, 6.9% ASIM, and 3.9% PSIM. Furthermore, when jointly learning pathfinding with both path loss and path probability map loss, DAA* significantly outperforms the state-of-the-art TransPath by 6.7% SPR, 6.5% PSIM, and 3.7% ASIM. We also discuss the minor trade-off between path optimality and search efficiency where applicable.

Voice Conversion for Lombard Speaking Style with Implicit and Explicit Acoustic Feature Conditioning

Authors: Dominika Woszczyk, Manuel Sam Ribeiro, Thomas Merritt, Daniel Korzekwa

Text-to-Speech (TTS) systems in Lombard speaking style can improve the overall intelligibility of speech, useful for hearing loss and noisy conditions. However, training those models requires a large amount of data and the Lombard effect is challenging to record due to speaker and noise variability and tiring recording conditions. Voice conversion (VC) has been shown to be a useful augmentation technique to train TTS systems in the absence of recorded data from the target speaker in the target speaking style. In this paper, we are concerned with Lombard speaking style transfer. Our goal is to convert speaker identity while preserving the acoustic attributes that define the Lombard speaking style. We compare voice conversion models with implicit and explicit acoustic feature conditioning. We observe that our proposed implicit conditioning strategy achieves an intelligibility gain comparable to the model conditioned on explicit acoustic features, while also preserving speaker similarity.

BENYO-S2ST-Corpus-1: A Bilingual English-to-Yoruba Direct Speech-to-Speech Translation Corpus

Authors: Emmanuel Adetiba, Abdultaofeek Abayomi, Raymond J. Kala, Ayodele H. Ifijeh, Oluwatobi E. Dare, Olabode Idowu-Bismark, Gabriel O. Sobola, Joy N. Adetiba, Monsurat Adepeju Lateef, Heather Cole-Lewis

There is a major shortage of Speech-to-Speech Translation (S2ST) datasets for high resource-to-low resource language pairs such as English-to-Yoruba. Thus, in this study, we curated the Bilingual English-to-Yoruba Speech-to-Speech Translation Corpus Version 1 (BENYO-S2ST-Corpus-1). The corpus is based on a hybrid architecture we developed for large-scale direct S2ST corpus creation at reduced cost. To achieve this, we leveraged non-speech-to-speech Standard Yoruba (SY) real-time audios and transcripts in the YORULECT Corpus as well as the corresponding Standard English (SE) transcripts. The YORULECT Corpus is small-scale (1,504 samples), and it does not have paired English audios. Therefore, we generated the SE audios using pre-trained AI models (i.e. Facebook MMS). We also developed an audio augmentation algorithm named AcoustAug based on three latent acoustic features to generate augmented audios from the raw audios of the two languages. BENYO-S2ST-Corpus-1 has 12,032 audio samples per language, for a total of 24,064 samples. The total audio duration for the two languages is 41.20 hours. This size is quite significant. Beyond building S2ST models, BENYO-S2ST-Corpus-1 can be used to build pretrained models or improve existing ones. The created corpus and Coqui framework were used to build a pretrained Yoruba TTS model (named YoruTTS-0.5) as a proof of concept. YoruTTS-0.5 gave an F0 RMSE value of 63.54 after 1,000 epochs, which indicates moderate fundamental pitch similarity with the reference real-time audio. Ultimately, the corpus architecture in this study can be leveraged by researchers and developers to curate datasets for multilingual high-resource-to-low-resource African languages. This will bridge the huge digital divides in translations among high and low-resource language pairs. BENYO-S2ST-Corpus-1 and YoruTTS-0.5 are publicly available at (this https URL).

C-ZUPT: Stationarity-Aided Aerial Hovering

Authors: Daniel Engelsman, Itzik Klein

Autonomous systems across diverse domains have underscored the need for drift-resilient state estimation. Although satellite-based positioning and cameras are widely used, they often suffer from limited availability in many environments. As a result, positioning must rely solely on inertial sensors, leading to rapid accuracy degradation over time due to sensor biases and noise. To counteract this, alternative update sources, referred to as information aiding, serve as anchors of certainty. Among these, the zero-velocity update (ZUPT) is particularly effective in providing accurate corrections during stationary intervals, though it is restricted to surface-bound platforms. This work introduces a controlled ZUPT (C-ZUPT) approach for aerial navigation and control, independent of surface contact. By defining an uncertainty threshold, C-ZUPT identifies quasi-static equilibria to deliver precise velocity updates to the estimation filter. Extensive validation confirms that these opportunistic, high-quality updates significantly reduce inertial drift and control effort. As a result, C-ZUPT mitigates filter divergence and enhances navigation stability, enabling more energy-efficient hovering and substantially extending sustained flight, both key advantages for resource-constrained aerial systems.
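
A generic zero-velocity update can be sketched with a one-dimensional Kalman filter on velocity. This illustrates the classic ZUPT correction only, not the C-ZUPT method itself: the decision that the platform is quasi-static is assumed to have been made elsewhere, and the bias, noise values, and function names are illustrative assumptions.

```python
def propagate(v, P, accel, dt, q):
    """Inertial prediction: integrate measured acceleration and grow the
    velocity variance P by process noise q per unit time."""
    return v + accel * dt, P + q * dt

def zupt_update(v, P, R):
    """Kalman update with a zero-velocity pseudo-measurement z = 0,
    whose measurement noise variance is R."""
    K = P / (P + R)  # Kalman gain
    return v + K * (0.0 - v), (1.0 - K) * P

# The platform is truly at rest, but an uncompensated accelerometer
# bias (illustrative value) makes the velocity estimate drift.
v, P = 0.0, 0.01
bias = 0.05  # m/s^2
for _ in range(100):
    v, P = propagate(v, P, accel=bias, dt=0.01, q=0.001)
drifted_v, drifted_P = v, P

# Once stationarity has been declared, the update pulls the estimate
# back toward zero and shrinks its variance.
v, P = zupt_update(v, P, R=1e-4)
```

The smaller the pseudo-measurement noise `R` is relative to the accumulated variance `P`, the closer the gain is to 1 and the harder the estimate is pulled back to zero.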

Reliable Task Offloading in MEC through Transmission Diversity and Jamming-Aware Scheduling

Authors: Ghazal Asemian, Mohammadreza Amini, Burak Kantarci

Mobile Edge Computing (MEC) enables low-latency applications by bringing computation closer to the user, but dynamic task arrivals and communication threats like jamming complicate reliable task offloading and resource allocation. In this paper, we formulate a dynamic MEC framework considering the transmission diversity that jointly addresses task scheduling and resource block (RB) assignment in the presence of jamming. First, we define and evaluate key network metrics, including dropped task ratio and bandwidth utilization, while maintaining service continuity by accounting for the existing commitments of the edge server to previously offloaded tasks. Then, we propose a jamming-aware offloading and RB allocation framework that leverages transmission diversity and optimal scheduling across distributed gNBs. The proposed solution is compared to a similar scenario without transmission diversity and two baseline strategies of first-come-first-served (FCFS) and shortest task first (STF). The proposed algorithm effectively mitigates the impact of jamming while enhancing resource utilization and minimizing task drop rates, making it highly suitable for mission-critical MEC applications. At a signal-to-jamming-and-noise ratio (SJNR) of 4 dB, the proposed method achieves a 0.26 task drop rate, outperforming the scenario without transmission diversity with a task drop rate of 0.50 and STF and FCFS strategies with 0.52 and 0.63 task drop rates, respectively.

Acoustic Wave Modeling Using 2D FDTD: Applications in Unreal Engine For Dynamic Sound Rendering

Authors: Bilkent Samsurya

Accurate sound propagation simulation is essential for delivering immersive experiences in virtual applications, yet industry methods for acoustic modeling often do not account for the full breadth of acoustic wave phenomena. This paper proposes a novel two-dimensional (2D) finite-difference time-domain (FDTD) framework that simulates sound propagation as a wave-based model in Unreal Engine, with an emphasis on capturing lower-frequency wave phenomena and embedding occlusion, diffraction, reflection and interference in the generated impulse responses. The process begins by discretizing the scene geometry into a 2D grid via a top-down projection, from which obstacle masks and boundary conditions are derived. A Python-based FDTD solver injects a sine sweep at a source position, and virtual quadraphonic microphone arrays record pressure field responses at pre-defined listener positions. Deconvolution of the pressure responses yields multi-channel impulse responses that retain spatial directionality, which are then integrated into Unreal Engine's audio pipeline for dynamic playback. Benchmark tests confirm agreement with analytical expectations, and the paper outlines hybrid extensions aimed at commercial viability.
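The grid-plus-obstacle-mask idea described above can be illustrated with a minimal sketch. It is not the paper's solver: the function name `simulate_fdtd_2d`, the grid size, the Courant number, the rigid-wall treatment, and the short zero-mean pulse (used here in place of the sine sweep) are all assumptions chosen to keep the example small.

```python
import math

def simulate_fdtd_2d(mask, src, mic, steps, courant=0.5):
    """Leapfrog FDTD for the 2D scalar wave equation on a uniform grid.

    mask[y][x] is True for obstacle cells (treated here as rigid walls).
    src and mic are (y, x) cells; returns the pressure recorded at mic.
    courant = c * dt / dx must stay below 1/sqrt(2) for stability in 2D.
    """
    h, w = len(mask), len(mask[0])
    prev = [[0.0] * w for _ in range(h)]
    curr = [[0.0] * w for _ in range(h)]
    c2 = courant ** 2
    recorded = []
    for n in range(steps):
        nxt = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    continue
                p = curr[y][x]
                lap = 0.0
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    if inside and not mask[yy][xx]:
                        lap += curr[yy][xx] - p
                    # Walls and grid edges mirror the cell's own value,
                    # so they add nothing to the Laplacian (rigid reflection).
                nxt[y][x] = 2.0 * p - prev[y][x] + c2 * lap
        # Assumed excitation: a short zero-mean pulse instead of a sine sweep.
        t = (n - 8) / 3.0
        nxt[src[0]][src[1]] += -t * math.exp(-t * t)
        prev, curr = curr, nxt
        recorded.append(curr[mic[0]][mic[1]])
    return recorded

# Hypothetical 21 x 21 scenes: an empty room and one split by a full-width wall.
open_room = [[False] * 21 for _ in range(21)]
walled = [[x == 12 for x in range(21)] for _ in range(21)]
free_response = simulate_fdtd_2d(open_room, (10, 10), (10, 15), 40)
blocked_response = simulate_fdtd_2d(walled, (10, 10), (10, 15), 40)
```

In this toy setup the listener records nothing until the wavefront has had time to travel from the source, and a wall that fully separates the two leaves the recording silent. A realistic scene would have gaps that let diffracted energy through, which is the behavior the wave-based model is meant to capture.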

Joint Access Point Activation and Power Allocation for Cell-Free Massive MIMO Aided ISAC Systems

Authors: Nguyen Xuan Tung, Le Tung Giang, Trinh Van Chien, Hoang Trong Minh, Lajos Hanzo

Cell-free massive multiple-input multiple-output (MIMO)-aided integrated sensing and communication (ISAC) systems are investigated where distributed access points jointly serve users and sensing targets. We demonstrate that only a subset of access points (APs) has to be activated for both tasks, while deactivating redundant APs is essential for power savings. This motivates joint active AP selection and power control for optimizing energy efficiency. The resultant problem is a mixed-integer nonlinear program (MINLP). To address this, we propose a model-based Branch-and-Bound approach as a strong baseline to guide a semi-supervised heterogeneous graph neural network (HetGNN) for selecting the best active APs and the power allocation. Comprehensive numerical results demonstrate that the proposed HetGNN reduces power consumption by 20-25% and runs nearly 10,000 times faster than model-based benchmarks.

Unmanned Aerial Vehicle (UAV) Data-Driven Modeling Software with Integrated 9-Axis IMU-GPS Sensor Fusion and Data Filtering Algorithm

Authors: Azfar Azdi Arfakhsyad, Aufa Nasywa Rahman, Larasati Kinanti, Ahmad Ataka Awwalur Rizqi, Hannan Nur Muhammad

Unmanned Aerial Vehicles (UAVs) have emerged as versatile platforms, driving the demand for accurate modeling to support developmental testing. This paper proposes data-driven modeling software for UAVs. It emphasizes the use of cost-effective sensors to obtain orientation and location data, which are then processed with data filtering algorithms and sensor fusion techniques to improve data quality and produce a precise model visualization in the software. The UAV's orientation is obtained from processed Inertial Measurement Unit (IMU) data and represented using quaternions to avoid the gimbal lock problem. The UAV's location is determined by combining data from the Global Positioning System (GPS), which provides stable geographic coordinates but a slower update frequency, and the accelerometer, which has a higher update frequency but yields unstable position estimates when integrated because of accumulated error. By combining data from these two sensors, the software is able to calculate and continuously update the UAV's real-time position during its flight operations. The results show that the software renders UAV orientation and position with a high degree of accuracy and fluidity.
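The abstract does not say which fusion algorithm the software uses, so the following is only a minimal one-dimensional sketch of the general idea: dead-reckon with the fast accelerometer and nudge the position toward each slower GPS fix. The function name `fuse_position`, the blending `gain`, and the sample rates are assumptions made for illustration.

```python
def fuse_position(accel, gps, dt, gain=0.2):
    """Blend integrated accelerometer data with sparse GPS fixes (1D).

    accel: acceleration samples in m/s^2, one every dt seconds.
    gps:   dict mapping a sample index to a GPS position in metres.
    gain:  fraction of the GPS-minus-estimate error removed per fix.
    """
    pos, vel = 0.0, 0.0
    track = []
    for k, a in enumerate(accel):
        # Fast path: integrate acceleration twice (drifts if a is biased).
        vel += a * dt
        pos += vel * dt
        # Slow path: when a GPS fix arrives, pull the estimate toward it.
        if k in gps:
            pos += gain * (gps[k] - pos)
        track.append(pos)
    return track

# A stationary vehicle whose accelerometer has a constant 0.1 m/s^2 bias.
biased_accel = [0.1] * 200
dead_reckoned = fuse_position(biased_accel, {}, dt=0.1)
gps_fixes = {k: 0.0 for k in range(9, 200, 10)}  # one fix per second
fused = fuse_position(biased_accel, gps_fixes, dt=0.1)
```

With the bias left uncorrected, the dead-reckoned track drifts away from the true position, while the fused track is pulled back at every fix. This sketch corrects position only; a Kalman-style filter would typically also estimate velocity and sensor bias.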

GenAI-based Multi-Agent Reinforcement Learning towards Distributed Agent Intelligence: A Generative-RL Agent Perspective

Authors: Hang Wang, Junshan Zhang

Multi-agent reinforcement learning faces fundamental challenges that conventional approaches have failed to overcome: exponentially growing joint action spaces, non-stationary environments where simultaneous learning creates moving targets, and partial observability that constrains coordination. Current methods remain reactive, employing stimulus-response mechanisms that fail when facing novel scenarios. We argue for a transformative paradigm shift from reactive to proactive multi-agent intelligence through generative AI-based reinforcement learning. This position advocates reconceptualizing agents not as isolated policy optimizers, but as sophisticated generative models capable of synthesizing complex multi-agent dynamics and making anticipatory decisions based on predictive understanding of future interactions. Rather than responding to immediate observations, generative-RL agents can model environment evolution, predict other agents' behaviors, generate coordinated action sequences, and engage in strategic reasoning accounting for long-term dynamics. This approach leverages pattern recognition and generation capabilities of generative AI to enable proactive decision-making, seamless coordination through enhanced communication, and dynamic adaptation to evolving scenarios. We envision this paradigm shift will unlock unprecedented possibilities for distributed intelligence, moving beyond individual optimization toward emergent collective behaviors representing genuine collaborative intelligence. The implications extend across autonomous systems, robotics, and human-AI collaboration, promising solutions to coordination challenges intractable under traditional reactive frameworks.

SC-TSE: Speaker Consistency-Aware Target Speaker Extraction

Authors: Shu Wu, Anbin Qi, Yanzhang Xie, Xiang Xie

Target Speaker Extraction (TSE) uses a reference cue to extract the target speech from a mixture. In TSE systems relying on audio cues, the speaker embedding from the enrolled speech is crucial to performance. However, these embeddings may suffer from speaker identity confusion. Unlike previous studies that focus on improving speaker embedding extraction, we improve TSE performance from the perspective of speaker consistency. In this paper, we propose a speaker consistency-aware target speaker extraction method that incorporates a centroid-based speaker consistency loss. This approach enhances TSE performance by ensuring speaker consistency between the enrolled and extracted speech. In addition, we integrate conditional loss suppression into the training process. The experimental results validate the effectiveness of our proposed methods in improving TSE performance. A speech demo is available online.

Introducing Meta-Fiber into Stacked Intelligent Metasurfaces for MIMO Communications: A Low-Complexity Design with only Two Layers

Authors: Hong Niu, Jiancheng An, Tuo Wu, Jiangong Chen, Yufei Zhao, Yong Liang Guan, Marco Di Renzo, Merouane Debbah, George K. Karagiannidis, H. Vincent Poor, Chau Yuen

Stacked intelligent metasurfaces (SIMs), which integrate multiple programmable metasurface layers, have recently emerged as a promising technology for advanced wave-domain signal processing. SIMs benefit from flexible spatial degrees of freedom (DoF) while reducing the requirement for costly radio-frequency (RF) chains. However, current state-of-the-art SIM designs face challenges such as complex phase shift optimization and energy attenuation from multiple layers. To address these issues, we propose incorporating meta-fibers into SIMs, with the aim of reducing the number of layers and enhancing the energy efficiency. First, we introduce a meta-fiber-connected 2-layer SIM that exhibits the same flexible signal processing capabilities as conventional multi-layer structures, and explain its operating principle. Subsequently, we formulate and solve the optimization problem of minimizing the mean square error (MSE) between the SIM channel and the desired channel matrices. Specifically, by designing the phase shifts of the meta-atoms associated with the transmitting-SIM and receiving-SIM, a non-interference system with parallel subchannels is established. In order to reduce the computational complexity, a closed-form expression for each phase shift at each iteration of an alternating optimization (AO) algorithm is proposed. We show that the proposed algorithm is applicable to conventional multi-layer SIMs. The channel capacity bound and computational complexity are analyzed to provide design insights. Finally, numerical results demonstrate that the proposed two-layer SIM with meta-fiber achieves over a 25% improvement in channel capacity while reducing the total number of meta-atoms by 59% as compared with a conventional seven-layer SIM.

Ensemble Confidence Calibration for Sound Event Detection in Open-environment

Authors: Yuanjian Chen, Han Yin

Sound event detection (SED) has made strong progress in controlled environments with clear event categories. However, real-world applications often take place in open environments. In such cases, current methods often produce predictions with too much confidence and lack proper ways to measure uncertainty. This limits their ability to adapt and perform well in new situations. To solve this problem, we are the first to use ensemble methods in SED to improve robustness against out-of-domain (OOD) inputs. We propose a confidence calibration method called Energy-based Open-World Softmax (EOW-Softmax), which helps the system better handle uncertainty in unknown scenes. We further apply EOW-Softmax to sound occurrence and overlap detection (SOD) by adjusting the prediction. In this way, the model becomes more adaptable while keeping its ability to detect overlapping events. Experiments show that our method improves performance in open environments. It reduces overconfidence and increases the ability to handle OOD situations.

Wi-Fi: Twenty-Five Years and Counting

Authors: Giovanni Geraci, Francesca Meneghello, Francesc Wilhelmi, David Lopez-Perez, Iñaki Val, Lorenzo Galati Giordano, Carlos Cordeiro, Monisha Ghosh, Edward Knightly, Boris Bellalta

Today, Wi-Fi is over 25 years old. Yet, despite sharing the same branding name, today's Wi-Fi boasts entirely new capabilities that were not even on the roadmap 25 years ago. This article aims to provide a holistic and comprehensive technical and historical tutorial on Wi-Fi, beginning with IEEE 802.11b (Wi-Fi 1) and looking forward to IEEE 802.11bn (Wi-Fi 8). This is the first tutorial article to span these eight generations. Rather than a generation-by-generation exposition, we describe the key mechanisms that have advanced Wi-Fi. We begin by discussing spectrum allocation and coexistence, and detailing the IEEE 802.11 standardization cycle. Second, we provide an overview of the physical layer and describe key elements that have enabled data rates to increase by over 1,000x. Third, we describe how Wi-Fi Medium Access Control has been enhanced from the original Distributed Coordination Function to now include capabilities spanning from frame aggregation to wideband spectrum access. Fourth, we describe how Wi-Fi 5 first broke the one-user-at-a-time paradigm and introduced multi-user access. Fifth, given the increasing use of mobile, battery-powered devices, we describe Wi-Fi's energy-saving mechanisms over the generations. Sixth, we discuss how Wi-Fi was enhanced to seamlessly aggregate spectrum across 2.4 GHz, 5 GHz, and 6 GHz bands to improve throughput, reliability, and latency. Finally, we describe how Wi-Fi enables nearby Access Points to coordinate in order to improve performance and efficiency. In the Appendix, we further discuss Wi-Fi developments beyond 802.11bn, including integrated mmWave operations, sensing, security and privacy extensions, and the adoption of AI/ML.

THAI Speech Emotion Recognition (THAI-SER) corpus

Authors: Jilamika Wongpithayadisai, Chompakorn Chaksangchaichot, Soravitt Sangnark, Patawee Prakrankamanant, Krit Gangwanpongpun, Siwa Boonpunmongkol, Premmarin Milindasuta, Dangkamon Na-Pombejra, Sarana Nutanong, Ekapol Chuangsuwanich

We present the first sizeable corpus of Thai speech emotion recognition, THAI-SER, containing 41 hours and 36 minutes (27,854 utterances) from 100 recordings made in different recording environments: Zoom and two studio setups. The recordings contain both scripted and improvised sessions, acted by 200 professional actors (112 females and 88 males, aged 18 to 55) and directed by professional directors. There are five primary emotions: neutral, angry, happy, sad, and frustrated, assigned to the actors when recording utterances. The utterances are annotated with an emotional category using crowdsourcing. To control annotation quality, we also design an extensive filtering and quality control scheme to ensure that the majority agreement score remains above 0.71. We evaluate our annotated corpus using two metrics: inter-annotator reliability and human recognition accuracy. The inter-annotator reliability score was calculated using Krippendorff's alpha; after filtering, our corpus achieved an alpha score of 0.692, higher than the recommended 0.667. For human recognition accuracy, our corpus scored up to 0.772 post-filtering. We also provide results for a model trained on the corpus and evaluated in both in-corpus and cross-corpus setups. The corpus is publicly available under a Creative Commons BY-SA 4.0 license, along with our code for the experiments.
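For readers unfamiliar with the reliability metric mentioned above, the sketch below computes Krippendorff's alpha for nominal labels from a coincidence matrix. The abstract does not state which variant or tooling the authors used, so the nominal distance, the function name, and the toy labels are assumptions.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list of labels per utterance; units with fewer than two
    labels carry no agreement information and are skipped.
    Undefined (division by zero) if only one category ever appears.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        # Every ordered pair of labels within a unit, weighted by 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b) / (n - 1)
    return 1.0 - observed / expected

# Toy annotations (hypothetical): two annotators, two utterances each.
alpha_agree = krippendorff_alpha_nominal([["happy", "happy"], ["sad", "sad"]])
alpha_disagree = krippendorff_alpha_nominal([["happy", "sad"], ["happy", "sad"]])
```

Alpha equals 1 when annotators always agree and drops toward or below 0 as agreement approaches or falls under what chance would produce, which is why a threshold such as 0.667 is used as a minimum for tentative conclusions.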

humancompatible.interconnect: Testing Properties of Repeated Uses of Interconnections of AI Systems

Authors: Rodion Nazarov, Anthony Quinn, Robert Shorten, Jakub Marecek

Artificial intelligence (AI) systems often interact with multiple agents. The regulation of such AI systems often requires that a priori guarantees of fairness and robustness be satisfied. With stochastic models of agents' responses to the outputs of AI systems, such a priori guarantees require non-trivial reasoning about the corresponding stochastic systems. Here, we present an open-source PyTorch-based toolkit for the use of stochastic control techniques in modelling interconnections of AI systems and properties of their repeated uses. It models robustness and fairness desiderata in a closed-loop fashion, and provides a priori guarantees for these interconnections. The PyTorch-based toolkit removes much of the complexity associated with the provision of fairness guarantees for closed-loop models of multi-agent systems.

Prompt2DEM: High-Resolution DEMs for Urban and Open Environments from Global Prompts Using a Monocular Foundation Model

Authors: Osher Rafaeli, Tal Svoray, Ariel Nahlieli

High-resolution elevation estimations are essential to understand catchment and hillslope hydrology, study urban morphology and dynamics, and monitor the growth, decline, and mortality of terrestrial ecosystems. Various deep learning approaches (e.g., super-resolution techniques, monocular depth estimation) have been developed to create high-resolution Digital Elevation Models (DEMs). However, super-resolution techniques are limited by the upscaling factor, and monocular depth estimation lacks global elevation context, making its conversion to a seamless DEM restricted. The recently introduced technique of prompt-based monocular depth estimation has opened new opportunities to extract estimates of absolute elevation in a global context. We present here a framework for the estimation of high-resolution DEMs as a new paradigm for absolute global elevation mapping. It is exemplified using low-resolution Shuttle Radar Topography Mission (SRTM) elevation data as prompts and high-resolution RGB imagery from the National Agriculture Imagery Program (NAIP). The approach fine-tunes a vision transformer encoder with LiDAR-derived DEMs and employs a versatile prompting strategy, enabling tasks such as DEM estimation, void filling, and updating. Our framework achieves a 100x resolution gain (from 30-m to 30-cm), surpassing prior methods by an order of magnitude. Evaluations across three diverse U.S. landscapes show robust generalization, capturing urban structures and fine-scale terrain features with < 5 m MAE relative to LiDAR, improving over SRTM by up to 18%. Hydrological analysis confirms suitability for hazard and environmental studies. We demonstrate scalability by applying the framework to large regions in the U.S. and Israel. All code and pretrained models are publicly available at: this https URL.

Curvature-adaptive gigapixel microscopy at submicron resolution and centimeter scale

Authors: Xi Yang, Haitao Chen, Lucas Kreiss, Clare B. Cook, Genevieve Kuczewski, Mark Harfouche, Martin O. Bohlen, Roarke Horstmeyer

Large-area microscopy with submicron resolution is limited by tradeoffs between field of view (FOV), resolution, and imaging speed. Samples are rarely flat across centimeter-scale FOV, which often requires existing solutions to use mechanical scanning to ensure focused capture at reduced throughput. Here, we present PANORAMA, a single-shot, re-imaging microscope that achieves seamless, gigapixel imaging over a 16.3 × 18.8 mm² FOV at 0.84 µm resolution without mechanical scanning. By using a telecentric photolithography lens, a large-aperture tube lens, and a flat micro-camera array with adaptive per-camera focus control, PANORAMA maintains submicron focus across flat, curved or uneven samples that span centimeters. This approach improves imaging throughput and adaptability, enabling gigapixel multi-modal microscopy of large flat and non-flat samples in one shot, thus broadening its applications in biomedical and materials imaging.

IteraOptiRacing: A Unified Planning-Control Framework for Real-time Autonomous Racing for Iterative Optimal Performance

Authors: Yifan Zeng, Yihan Li, Suiyi He, Koushil Sreenath, Jun Zeng

This paper presents IteraOptiRacing, a unified planning-control strategy for competing with other racing cars in autonomous racing environments. The strategy is based on the Iterative Linear Quadratic Regulator for Iterative Tasks (i2LQR), which can improve lap time performance in the presence of surrounding racing obstacles. By iteratively using the ego car's historical data, the strategy considers both obstacle avoidance for multiple moving cars and time cost optimization, resulting in collision-free and time-optimal generated trajectories. The algorithm's constant low computation burden and suitability for parallel computing enable real-time operation in competitive racing scenarios. To validate its performance, simulations in a high-fidelity simulator are conducted with multiple randomly generated dynamic agents on the track. Results show that the proposed strategy outperforms existing methods across all randomly generated autonomous racing scenarios, enabling enhanced maneuvering for the ego racing car.

Signed Graph Learning: Algorithms and Theory

Authors: Abdullah Karaaslanli, Bisakh Banerjee, Tapabrata Maiti, Selin Aviyente

Real-world data is often represented through the relationships between data samples, forming a graph structure. In many applications, it is necessary to learn this graph structure from the observed data. Current graph learning research has primarily focused on unsigned graphs, which consist only of positive edges. However, many biological and social systems are better described by signed graphs that account for both positive and negative interactions, capturing similarity and dissimilarity between samples. In this paper, we develop a method for learning signed graphs from a set of smooth signed graph signals. Specifically, we employ the net Laplacian as a graph shift operator (GSO) to define smooth signed graph signals as the outputs of a low-pass signed graph filter defined by the net Laplacian. The signed graph is then learned by formulating a non-convex optimization problem where the total variation of the observed signals is minimized with respect to the net Laplacian. The proposed problem is solved using the alternating direction method of multipliers (ADMM), and a fast algorithm is introduced that reduces the per-iteration ADMM complexity from quadratic to linear in the number of nodes. Furthermore, theoretical proofs of convergence for the algorithm and a bound on the estimation error of the learned net Laplacian as a function of sample size, number of nodes, and graph topology are provided. Finally, the proposed method is evaluated on simulated data and a gene regulatory network inference problem and compared to existing signed graph learning methods.
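To make the smoothness notion concrete, the sketch below builds a net Laplacian and evaluates the total variation of a signal on a three-node signed graph. It assumes the usual definition, L = D - A with D holding the signed row sums of the adjacency matrix A; the function names and the toy graph are illustrative and not taken from the paper.

```python
def net_laplacian(adjacency):
    """Net Laplacian L = D - A, where D[i][i] is the signed row sum of A."""
    n = len(adjacency)
    return [[(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j]
             for j in range(n)] for i in range(n)]

def total_variation(laplacian, signal):
    """Quadratic form x^T L x, equal to 0.5 * sum_ij A_ij * (x_i - x_j)^2."""
    n = len(signal)
    return sum(signal[i] * laplacian[i][j] * signal[j]
               for i in range(n) for j in range(n))

# Node 0 is positively linked to node 1 and negatively linked to node 2.
A = [[0, 1, -1],
     [1, 0, 0],
     [-1, 0, 0]]
L = net_laplacian(A)
tv_consistent = total_variation(L, [1, 1, -1])   # agrees with the edge signs
tv_inconsistent = total_variation(L, [1, -1, 1])  # contradicts the edge signs
```

The signal that matches the edge signs (similar across the positive edge, dissimilar across the negative one) gets the lower total variation, here -4 against 4. Note that the net Laplacian is not positive semidefinite in general, so the quadratic form can be negative.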

MB-RIRs: a Synthetic Room Impulse Response Dataset with Frequency-Dependent Absorption Coefficients

Authors: Enric Gusó, Joanna Luberadzka, Umut Sayin, Xavier Serra

We investigate the effects of four strategies for improving the ecological validity of synthetic room impulse response (RIR) datasets for monaural Speech Enhancement (SE). We implement three features on top of the traditional image source method-based (ISM) shoebox RIRs: multiband absorption coefficients, source directivity and receiver directivity. We additionally consider mesh-based RIRs from the SoundSpaces dataset. We then train a DeepFilterNet3 model for each RIR dataset and evaluate the performance on a test set of real RIRs both objectively and subjectively. We find that RIRs which use frequency-dependent acoustic absorption coefficients (MB-RIRs) can obtain +0.51 dB of SDR and a +8.9 MUSHRA score when evaluated on real RIRs. The MB-RIRs dataset is publicly available for free download.

Knowing When to Quit: Probabilistic Early Exits for Speech Separation

Authors: Kenny Falkær Olsen, Mads Østergaard, Karl Ulbæk, Søren Føns Nielsen, Rasmus Malik Høegh Lindrup, Bjørn Sand Jensen, Morten Mørup

In recent years, deep learning-based single-channel speech separation has improved considerably, in large part driven by increasingly compute- and parameter-efficient neural network architectures. Most such architectures are, however, designed with a fixed compute and parameter budget, and consequently cannot scale to varying compute demands or resources, which limits their use in embedded and heterogeneous devices such as mobile phones and hearables. To enable such use-cases we design a neural network architecture for speech separation capable of early-exit, and we propose an uncertainty-aware probabilistic framework to jointly model the clean speech signal and error variance which we use to derive probabilistic early-exit conditions in terms of desired signal-to-noise ratios. We evaluate our methods on both speech separation and enhancement tasks, and we show that a single early-exit model can be competitive with state-of-the-art models trained at many compute and parameter budgets. Our framework enables fine-grained dynamic compute-scaling of speech separation networks while achieving state-of-the-art performance and interpretable exit conditions.

Active Probing with Multimodal Predictions for Motion Planning

Authors: Darshan Gadginmath, Farhad Nawaz, Minjun Sung, Faizan M Tariq, Sangjae Bae, David Isele, Fabio Pasqualetti, Jovin Dsa

Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at this https URL.

Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems

Authors: Vindula Jayawardana, Sirui Li, Yashar Farid, Cathy Wu

Autonomous vehicles (AVs) are becoming increasingly popular, with their applications now extending beyond just a mode of transportation to serving as mobile actuators of a traffic flow to control flow dynamics. This contrasts with traditional fixed-location actuators, such as traffic signals, and is referred to as Lagrangian traffic control. However, designing effective Lagrangian traffic control policies for AVs that generalize across traffic scenarios is a major challenge. Real-world traffic environments are highly diverse, and developing policies that perform robustly across them is difficult. The problem is further compounded by the joint complexity of the multi-agent nature of traffic systems, mixed motives among participants, and conflicting optimization objectives subject to strict physical and external constraints. To address these challenges, we introduce Multi-Residual Mixture of Expert Learning (MRMEL), a novel framework for Lagrangian traffic control that augments a given suboptimal nominal policy with a learned residual while explicitly accounting for the structure of the traffic scenario space. In particular, taking inspiration from residual reinforcement learning, MRMEL augments a suboptimal nominal AV control policy by learning a residual correction, but at the same time dynamically selects the most suitable nominal policy from a pool of nominal policies conditioned on the traffic scenarios and modeled as a mixture of experts. We validate MRMEL using a case study in cooperative eco-driving at signalized intersections in Atlanta, Dallas Fort Worth, and Salt Lake City, with real-world data-driven traffic scenarios. The results show that MRMEL consistently yields superior performance, achieving an additional 4%-9% reduction in aggregate vehicle emissions relative to the strongest baseline in each setting.

UavNetSim-v1: A Python-based Simulation Platform for UAV Communication Networks

Authors: Zihao Zhou, Zipeng Dai, Linyi Huang, Cui Yang, Youjun Xiang, Jie Tang, Kai-kit Wong

In unmanned aerial vehicle (UAV) networks, communication protocols and algorithms are essential for cooperation and collaboration between UAVs. Simulation provides a cost-effective solution for prototyping, debugging, and analyzing protocols and algorithms, avoiding the prohibitive expenses of field experiments. In this paper, we present "UavNetSim-v1", an open-source Python-based simulation platform designed for rapid development, testing, and evaluation of protocols and algorithms in UAV networks. "UavNetSim-v1" provides most of the functionalities developers may need, including routing/medium access control (MAC) protocols, topology control algorithms and mobility/energy models, while maintaining ease of use. Furthermore, the platform supports comprehensive performance evaluation and features an interactive visualization interface for in-depth algorithm analysis. In short, "UavNetSim-v1" lends itself to both rapid prototyping and educational purposes, and can serve as a lightweight yet powerful alternative to mature network simulators for UAV communication research.

Optimal Design of Satellite Constellation Configurations with Mixed Integer Linear Programming

Authors: David O. Williams Rogers, Dongshik Won, Dongwook Koh, Kyungwoo Hong, Hang Woon Lee

Designing satellite constellation systems involves complex multidisciplinary optimization in which coverage serves as a primary driver of overall system cost and performance. Among the various design considerations, constellation configuration -- how satellites are placed and distributed in space relative to each other -- predominantly determines the resulting coverage. In constellation configuration design, coverage can be considered either as an objective or a constraint, driven by mission objectives. State-of-the-art literature addresses each situation on a case-by-case basis, applying a unique set of assumptions, modeling, and solution methods. Although such a problem-based methodology is valuable, users often face implementation challenges when performing trade-off studies across different mission scenarios, as each scenario must be handled distinctly. In response, we propose a unifying framework consisting of five mixed-integer linear program formulations that are of practical significance, extensible to more complex mission narratives using additional constraints, and capable of obtaining provably optimal constellation configurations. It can handle various metrics and mission scenarios, such as percent coverage, average or maximum revisit times, fixed number of satellites, spatiotemporally varying coverage requirements, and ground-, aerial-, or space-based, static or mobile targets. The paper presents several add-ons, case studies, and comparative analyses to demonstrate the versatility of the proposed framework.

SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation

Authors: Youliang Zhang, Zhaoyang Li, Duomin Wang, Jiahe Zhang, Deyu Zhou, Zixin Yin, Xili Dai, Gang Yu, Xiu Li

The rapid development of large-scale models has catalyzed significant breakthroughs in the digital human domain. These advanced methodologies offer high-fidelity solutions for avatar driving and rendering, leading academia to focus on the next major challenge: audio-visual dyadic interactive virtual humans. To facilitate research in this emerging area, we present the SpeakerVid-5M dataset, the first large-scale, high-quality dataset designed for audio-visual dyadic interactive virtual human generation. Totaling over 8,743 hours, SpeakerVid-5M contains more than 5.2 million video clips of human portraits. It covers diverse scales and interaction types, including monadic talking, listening, and dyadic conversations. Crucially, the dataset is structured along two key dimensions: interaction type and data quality. First, it is categorized into four types (dialogue branch, single branch, listening branch and multi-turn branch) based on the interaction scenario. Second, it is stratified into a large-scale pre-training subset and a curated, high-quality subset for Supervised Fine-Tuning (SFT). This dual structure accommodates a wide array of 2D virtual human tasks. In addition, we provide an autoregressive (AR)-based video chat baseline trained on this data, accompanied by a dedicated set of metrics and test data to serve as a benchmark, VidChatBench, for future work. Both the dataset and the corresponding data processing code will be publicly released. Project page: this https URL

TolerantECG: A Foundation Model for Imperfect Electrocardiogram

Authors: Huynh Nguyen Dang, Thang Pham, Ngan Le, Van Nguyen

The electrocardiogram (ECG) is an essential and effective tool for diagnosing heart diseases. However, its effectiveness can be compromised by noise or unavailability of one or more leads of the standard 12-lead recordings, resulting in diagnostic errors or uncertainty. To address these challenges, we propose TolerantECG, a foundation model for ECG signals that is robust to noise and capable of functioning with arbitrary subsets of the standard 12-lead ECG. TolerantECG training combines contrastive and self-supervised learning frameworks to jointly learn ECG signal representations alongside their corresponding knowledge-retrieval-based text report descriptions and corrupted or lead-missing signals. Comprehensive benchmarking results demonstrate that TolerantECG consistently ranks as the best or second-best performer across various ECG signal conditions and class levels in the PTB-XL dataset, and achieves the highest performance on the MIT-BIH Arrhythmia Database.

ASTAR-NTU solution to AudioMOS Challenge 2025 Track1

Authors: Fabian Ritter-Gutierrez, Yi-Cheng Lin, Jui-Chiang Wei, Jeremy H.M. Wong, Nancy F. Chen, Hung-yi Lee

Evaluation of text-to-music systems is constrained by the cost and limited availability of expert assessors. AudioMOS 2025 Challenge Track 1 was created to automatically predict music impression (MI) as well as text alignment (TA) between the prompt and the generated musical piece. This paper reports our winning system, which uses a dual-branch architecture with pre-trained MuQ and RoBERTa models as audio and text encoders. A cross-attention mechanism fuses the audio and text representations. For training, we reframe the MI and TA prediction as a classification task. To incorporate the ordinal nature of MOS scores, one-hot labels are converted to a soft distribution using a Gaussian kernel. On the official test set, a single model trained with this method achieves a system-level Spearman's Rank Correlation Coefficient (SRCC) of 0.991 for MI and 0.952 for TA, corresponding to a relative improvement of 21.21% in MI SRCC and 31.47% in TA SRCC over the challenge baseline.
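
The Gaussian-kernel label softening mentioned above can be illustrated with a minimal sketch. The bin centers, the kernel width `sigma`, and the function name are assumptions made for illustration; the abstract does not state the values or the implementation used by the authors.

```python
import math

def gaussian_soft_label(score, bin_centers, sigma=0.5):
    """Turn a scalar MOS score into a soft distribution over discrete bins.

    Each bin gets a weight from a Gaussian centered on the true score, so
    neighboring bins receive more mass than distant ones (ordinal structure).
    """
    weights = [math.exp(-0.5 * ((c - score) / sigma) ** 2) for c in bin_centers]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 5-point MOS scale; the real bin layout is not given in the abstract.
bins = [1.0, 2.0, 3.0, 4.0, 5.0]
soft = gaussian_soft_label(4.0, bins, sigma=0.5)
```

A classifier trained with cross-entropy against `soft` instead of a one-hot vector is penalized less for predicting an adjacent score than a distant one.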

Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference

Authors: Jiaming Cheng, Duong Tung Nguyen

This letter investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers (DCs) over time. Each DC features on-site renewable generation and faces dynamic electricity prices and spatiotemporal variability in renewable availability. The central question is: how can inference workloads be optimally distributed to the DCs to minimize energy consumption, carbon emissions, and water usage while enhancing user experience? This letter proposes a novel optimization model for LLM service providers to reduce operational costs and environmental impacts. Numerical results validate the efficacy of the proposed approach.

The Man Behind the Sound: Demystifying Audio Private Attribute Profiling via Multimodal Large Language Model Agents

Authors: Lixu Wang, Kaixiang Yao, Xinfeng Li, Dong Yang, Haoyang Li, Xiaofeng Wang, Wei Dong

Our research uncovers a novel privacy risk associated with multimodal large language models (MLLMs): the ability to infer sensitive personal attributes from audio data -- a technique we term audio private attribute profiling. This capability poses a significant threat, as audio can be covertly captured without direct interaction or visibility. Moreover, compared to images and text, audio carries unique characteristics, such as tone and pitch, which can be exploited for more detailed profiling. However, two key challenges exist in understanding MLLM-employed private attribute profiling from audio: (1) the lack of audio benchmark datasets with sensitive attribute annotations and (2) the limited ability of current MLLMs to infer such attributes directly from audio. To address these challenges, we introduce AP^2, an audio benchmark dataset that consists of two subsets collected and composed from real-world data, and both are annotated with sensitive attribute labels. Additionally, we propose Gifts, a hybrid multi-agent framework that leverages the complementary strengths of audio-language models (ALMs) and large language models (LLMs) to enhance inference capabilities. Gifts employs an LLM to guide the ALM in inferring sensitive attributes, then forensically analyzes and consolidates the ALM's inferences, overcoming severe hallucinations of existing ALMs in generating long-context responses. Our evaluations demonstrate that Gifts significantly outperforms baseline approaches in inferring sensitive attributes. Finally, we investigate model-level and data-level defense strategies to mitigate the risks of audio private attribute profiling. Our work validates the feasibility of audio-based privacy attacks using MLLMs, highlighting the need for robust defenses, and provides a dataset and framework to facilitate future research.

Analyzing the Crowding-Out Effect of Investment Herding on Consumption: An Optimal Control Theory Approach

Authors: Huisheng Wang, H. Vicky Zhao

Investment herding, a phenomenon where households mimic the decisions of others rather than relying on their own analysis, has significant effects on financial markets and household behavior. Excessive investment herding may reduce investments and lead to a depletion of household consumption, which is called the crowding-out effect. While existing research has qualitatively examined the impact of investment herding on consumption, quantitative studies in this area remain limited. In this work, we investigate the optimal investment and consumption decisions of households under the impact of investment herding. We formulate an optimization problem to model how investment herding influences household decisions over time. Based on the optimal control theory, we solve for the analytical solutions of optimal investment and consumption decisions. We theoretically analyze the impact of investment herding on household consumption decisions and demonstrate the existence of the crowding-out effect. We further explore how parameters, such as interest rate, excess return rate, and volatility, influence the crowding-out effect. Finally, we conduct a real data test to validate our theoretical analysis of the crowding-out effect. This study is crucial to understanding the impact of investment herding on household consumption and offering valuable insights for policymakers seeking to stimulate consumption and mitigate the negative effects of investment herding on economic growth.

Learning-Aided Iterative Receiver for Superimposed Pilots: Design and Experimental Evaluation

Authors: Xinjie Li, Xingyu Zhou, Yixiao Cao, Jing Zhang, Chao-Kai Wen, Xiao Li, Shi Jin

The superimposed pilot transmission scheme offers substantial potential for improving spectral efficiency in MIMO-OFDM systems, but it presents significant challenges for receiver design due to pilot contamination and data interference. To address these issues, we propose an advanced iterative receiver based on joint channel estimation, detection, and decoding, which refines the receiver outputs through iterative feedback. The proposed receiver incorporates two adaptive channel estimation strategies to enhance robustness under time-varying and mismatched channel conditions. First, a variational message passing (VMP) method and its low-complexity variant (VMP-L) are introduced to perform inference without relying on time-domain correlation. Second, a deep learning (DL) based estimator is developed, featuring a convolutional neural network with a despreading module and an attention mechanism to extract and fuse relevant channel features. Extensive simulations under multi-stream and high-mobility scenarios demonstrate that the proposed receiver consistently outperforms conventional orthogonal pilot baselines in both throughput and block error rate. Moreover, over-the-air experiments validate the practical effectiveness of the proposed design. Among the methods, the DL based estimator achieves a favorable trade-off between performance and complexity, highlighting its suitability for real-world deployment in dynamic wireless environments.

TGLD: A Trust-Aware Game-Theoretic Lane-Changing Decision Framework for Automated Vehicles in Heterogeneous Traffic

Authors: Jie Pan, Tianyi Wang, Yangyang Wang, Junfeng Jiao, Christian Claudel

Automated vehicles (AVs) face a critical need to adopt socially compatible behaviors and cooperate effectively with human-driven vehicles (HVs) in heterogeneous traffic environments. However, most existing lane-changing frameworks overlook HVs' dynamic trust levels, limiting their ability to accurately predict human driver behaviors. To address this gap, this study proposes a trust-aware game-theoretic lane-changing decision (TGLD) framework. First, we formulate a multi-vehicle coalition game, incorporating fully cooperative interactions among AVs and partially cooperative behaviors from HVs informed by real-time trust evaluations. Second, we develop an online trust evaluation method to dynamically estimate HVs' trust levels during lane-changing interactions, guiding AVs to select context-appropriate cooperative maneuvers. Lastly, social compatibility objectives are considered by minimizing disruption to surrounding vehicles and enhancing the predictability of AV behaviors, thereby ensuring human-friendly and context-adaptive lane-changing strategies. A human-in-the-loop experiment conducted in a highway on-ramp merging scenario validates our TGLD approach. Results show that AVs can effectively adjust strategies according to different HVs' trust levels and driving styles. Moreover, incorporating a trust mechanism significantly improves lane-changing efficiency, maintains safety, and contributes to transparent and adaptive AV-HV interactions.

Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction

Authors: Hiroki Sakamoto, Kazuhiro Sato

Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this study, we propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques from control theory to their linear SSM components. In experiments, the LRA benchmark results show that the model compression based on our proposed method outperforms an existing method using the Balanced Truncation, while successfully reducing the number of parameters in the SSMs to $1/32$ without sacrificing the performance of the original models.

Unscented Kalman Filter with a Nonlinear Propagation Model for Navigation Applications

Authors: Amit Levy, Itzik Klein

The unscented Kalman filter is a nonlinear estimation algorithm commonly used in navigation applications. The prediction of the mean and covariance matrix is crucial to the stable behavior of the filter. This prediction is done by propagating the sigma points according to the dynamic model at hand. In this paper, we introduce an innovative method to propagate the sigma points according to the nonlinear dynamic model of the navigation error state vector. This improves the filter accuracy and navigation performance. We demonstrate the benefits of our proposed approach using real sensor data recorded by an autonomous underwater vehicle during several scenarios.
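
As background for the sigma-point propagation described above, here is a minimal sketch of the standard unscented transform for a scalar state. It is not the authors' error-state propagation method; the scaling parameters `alpha`, `beta`, `kappa` and the example functions are assumptions chosen for illustration.

```python
import math

def unscented_transform_1d(mean, var, f, alpha=1.0, beta=0.0, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using sigma points."""
    n = 1                                    # state dimension
    lam = alpha ** 2 * (n + kappa) - n       # scaling parameter lambda
    spread = math.sqrt((n + lam) * var)
    sigma_points = [mean, mean + spread, mean - spread]

    # Weights for the mean; the covariance weights differ only at the center point.
    w_mean = [lam / (n + lam)] + [1.0 / (2.0 * (n + lam))] * 2
    w_cov = list(w_mean)
    w_cov[0] += 1.0 - alpha ** 2 + beta

    propagated = [f(x) for x in sigma_points]
    new_mean = sum(w * y for w, y in zip(w_mean, propagated))
    new_var = sum(w * (y - new_mean) ** 2 for w, y in zip(w_cov, propagated))
    return new_mean, new_var

# A linear model is recovered exactly; a nonlinear one shifts the predicted mean.
lin_mean, lin_var = unscented_transform_1d(1.0, 0.04, lambda x: 2.0 * x + 1.0)
sin_mean, sin_var = unscented_transform_1d(1.0, 0.04, math.sin)
```

In a full filter, `f` would be the dynamic model at hand and the state would be a vector, with the scalar square root replaced by a matrix square root of the covariance.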

Fast-Response Variable-Frequency Series-Capacitor Buck VRM Through Integrated Control Approaches

Authors: Guanyu Qian, Haoxian Yan, Xiaofan Cui

Fast-response voltage regulation is essential for data-center Voltage Regulation Modules (VRMs) powering Artificial Intelligence (AI) workloads, which exhibit both small-amplitude fluctuations and abrupt full-load steps. This paper introduces a control scheme that integrates a linear controller and a nonlinear controller for variable-frequency Series-Capacitor Buck (SCB) converters. First, an accurate small-signal model is derived via a Switching-Synchronized Sampled State-Space (5S) framework, yielding discrete-time transfer functions and root-locus insights for direct digital design. A critical concern for SCB converters is series-capacitor oscillation during heavy load steps if the strict switching sequence is not maintained. To accelerate large-signal transients, a time-optimal control strategy based on Pontryagin's Maximum Principle (PMP) relaxes the switching constraints to compute time-optimal switching sequences. A transition logic is then proposed to integrate the high-bandwidth small-signal controller and the large-signal controller. Simulations demonstrate a rapid output voltage recovery under a heavy load step-up, over ten times faster than a linear controller-only design. Preliminary hardware tests indicate stable rejection of heavy load disturbances with zero steady-state error.

DualDub: Video-to-Soundtrack Generation via Joint Speech and Background Audio Synthesis

Authors: Wenjie Tian, Xinfa Zhu, Haohe Liu, Zhixian Zhao, Zihao Chen, Chaofan Ding, Xinhan Di, Junjie Zheng, Lei Xie

While recent video-to-audio (V2A) models can generate realistic background audio from visual input, they largely overlook speech, an essential part of many video soundtracks. This paper proposes a new task, video-to-soundtrack (V2ST) generation, which aims to jointly produce synchronized background audio and speech within a unified framework. To tackle V2ST, we introduce DualDub, a unified framework built on a multimodal language model that integrates a multimodal encoder, a cross-modal aligner, and dual decoding heads for simultaneous background audio and speech generation. Specifically, our proposed cross-modal aligner employs causal and non-causal attention mechanisms to improve synchronization and acoustic harmony. Besides, to handle data scarcity, we design a curriculum learning strategy that progressively builds the multimodal capability. Finally, we introduce DualBench, the first benchmark for V2ST evaluation with a carefully curated test set and comprehensive metrics. Experimental results demonstrate that DualDub achieves state-of-the-art performance, generating high-quality and well-synchronized soundtracks with both speech and background audio.

Improved Differential Evolution for Enhancing the Aggregated Channel Estimation of RIS-Aided Cell-Free Massive MIMO

Authors: Trinh Van Chien, Nguyen Hoang Viet, Symeon Chatzinotas, Lajos Hanzo

Cell-Free Massive multiple-input multiple-output (MIMO) systems are investigated with the support of a reconfigurable intelligent surface (RIS). The RIS phase shifts are designed for improved channel estimation in the presence of spatial correlation. Specifically, we formulate the channel estimate and estimation error expressions using linear minimum mean square error (LMMSE) estimation for the aggregated channels. An optimization problem is then formulated to minimize the average normalized mean square error (NMSE) subject to practical phase shift constraints. To circumvent the problem of inherent nonconvexity, we then conceive an enhanced version of the differential evolution algorithm that is capable of avoiding local minima by introducing an augmentation operator applied to some high-performing Differential Evolution (DE) individuals. Numerical results indicate that our proposed algorithm can significantly improve the channel estimation quality of the state-of-the-art benchmarks.

REACT: Real-time Entanglement-Aware Coverage Path Planning for Tethered Underwater Vehicles

Authors: Abdelhakim Amer, Mohit Mehindratta, Yury Brodskiy, Bilal Wehbe, Erdal Kayacan

Inspection of complex underwater structures with tethered underwater vehicles is often hindered by the risk of tether entanglement. We propose REACT (real-time entanglement-aware coverage path planning for tethered underwater vehicles), a framework designed to overcome this limitation. REACT comprises a fast geometry-based tether model using the signed distance field (SDF) map for accurate, real-time simulation of taut tether configurations around arbitrary structures in 3D. This model enables an efficient online replanning strategy by enforcing a maximum tether length constraint, thereby actively preventing entanglement. By integrating REACT into a coverage path planning framework, we achieve safe and optimal inspection paths, previously challenging due to tether constraints. The complete REACT framework's efficacy is validated in a pipe inspection scenario, demonstrating safe, entanglement-free navigation and full-coverage inspection. Simulation results show that REACT achieves complete coverage while maintaining tether constraints and completing the total mission 20% faster than conventional planners, despite a longer inspection time due to proactive avoidance of entanglement that eliminates extensive post-mission disentanglement. Real-world experiments confirm these benefits, where REACT completes the full mission, while the baseline planner fails due to physical tether entanglement.

Low-Power Wake-Up Signal Design in 3GPP 5G-Advanced Release 19

Authors: Sebastian Wagner

The Low-Power Wake-Up Signal (LP-WUS) and Low-Power Synchronization Signal (LP-SS), introduced in 3GPP 5G-Advanced Release 19, represent a major step forward in enabling power-efficient IoT communications. This paper presents a comprehensive overview of the LP-WUS and LP-SS procedures in the RRC_IDLE and RRC_INACTIVE states, and outlines key physical layer design choices. The LP-WUS is designed to be detected by a low-power energy detector (ED), allowing the main radio (MR) to remain switched off. This architecture enables power savings of up to 80% compared to conventional 5G paging mechanisms.

Spatial Lifting for Dense Prediction

Authors: Mingzhi Xu, Yizhe Zhang

We present Spatial Lifting (SL), a novel methodology for dense prediction tasks. SL operates by lifting standard inputs, such as 2D images, into a higher-dimensional space and subsequently processing them using networks designed for that higher dimension, such as a 3D U-Net. Counterintuitively, this dimensionality lifting allows us to achieve good performance on benchmark tasks compared to conventional approaches, while reducing inference costs and significantly lowering the number of model parameters. The SL framework produces intrinsically structured outputs along the lifted dimension. This emergent structure facilitates dense supervision during training and enables robust, near-zero-additional-cost prediction quality assessment at test time. We validate our approach across 19 benchmark datasets (13 for semantic segmentation and 6 for depth estimation), demonstrating competitive dense prediction performance while reducing the model parameter count by over 98% (in the U-Net case) and lowering inference costs. Spatial Lifting introduces a new vision modeling paradigm that offers a promising path toward more efficient, accurate, and reliable deep networks for dense prediction tasks in vision.

Exogeneous PpIX model for brain tumour assessment

Authors: John Raschke, Jean Pierre Ndabakuranye, Bobbi Fleiss, Arman Ahnood

Reliable in-vitro models are used for optoelectronic device development such as fluorescence detection devices for fluorescence-guided surgery of gliomas. A common approach is based on inducing gliomas in animal models. This is followed by a dose of 5-ALA to induce Protoporphyrin IX (PpIX), which fluoresces, in the glioma. Although these approaches excel in capturing key biomolecular and physiological features of the tumour, they are inherently indeterministic. This limits the scope of their use for preclinical device development, where consistent and controllable tumour reproduction across multiple animals is needed. Approaches using fluorescence markers in gelatine provide a simple replication but fail to capture the complexities of in-vivo models. In this study, we introduce an exogenous brain tumour model for assessing PpIX fluorescence detection. The model was developed by injecting a PpIX solution into the cortical region of a resected adult rat brain; the injection site simulated a tumoral region with elevated PpIX concentration. The tumoral region had a gradient of concentrations, with a peak at the centre and a decrease towards the margins, akin to in-vivo gliomas. The fluorescence profile was compared to in-vivo conditions using 5-ALA and correlated well with other reported works, achieving a correlation of $R^2 > 0.93$. The model's validity was tested by examining the effect of the solvent, DMSO, on the Autofluorescence (AF) of the brain sample, and the short-term effect of storage on AF was analysed. Examinations confirmed the solvent did not alter AF, and the brain sample should be stored in Hanks' Balanced Salt Solution and refrigerated to maintain moisture and preserve AF. The model accurately replicated surgical fluorescence conditions and offers a suitable alternative to glioma induction, benefiting the development of fluorescence detection devices across design iterations.

Polygonal Obstacle Avoidance Combining Model Predictive Control and Fuzzy Logic

Authors: Michael Schröder, Eric Schöneberg, Daniel Görges, Hans D. Schotten

In practice, navigation of mobile robots in confined environments is often done using a spatially discrete cost-map to represent obstacles. Path following is a typical use case for model predictive control (MPC), but formulating constraints for obstacle avoidance is challenging in this case. Typically the cost and constraints of an MPC problem are defined as closed-form functions and typical solvers work best with continuously differentiable functions. This is contrary to spatially discrete occupancy grid maps, in which a grid's value defines the cost associated with occupancy. This paper presents a way to overcome this compatibility issue by re-formulating occupancy grid maps to continuously differentiable functions to be embedded into the MPC scheme as constraints. Each obstacle is defined as a polygon -- an intersection of half-spaces. Any half-space is a linear inequality representing one edge of a polygon. Using AND and OR operators, the combined set of all obstacles and therefore the obstacle avoidance constraints can be described. The key contribution of this paper is the use of fuzzy logic to re-formulate such constraints that include logical operators as inequality constraints which are compatible with standard MPC formulation. The resulting MPC-based trajectory planner is successfully tested in simulation. This concept is also applicable outside of navigation tasks to implement logical or verbal constraints in MPC.
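
The idea of replacing logical AND and OR over half-spaces with smooth fuzzy operators can be sketched as follows. The sigmoid membership function, the product t-norm for AND, the probabilistic sum for OR, and the `sharpness` parameter are assumptions for illustration; the abstract does not say which fuzzy operators the authors use.

```python
import math

def half_space_membership(point, normal, offset, sharpness=10.0):
    """Smooth degree in (0, 1) to which normal . point <= offset holds."""
    margin = offset - (normal[0] * point[0] + normal[1] * point[1])
    z = max(-60.0, min(60.0, sharpness * margin))   # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-z))

def polygon_membership(point, half_spaces, sharpness=10.0):
    """Fuzzy AND (product t-norm) over all edges of one convex polygon."""
    degree = 1.0
    for normal, offset in half_spaces:
        degree *= half_space_membership(point, normal, offset, sharpness)
    return degree

def obstacle_membership(point, polygons, sharpness=10.0):
    """Fuzzy OR (probabilistic sum) over all polygons in the map."""
    free = 1.0
    for half_spaces in polygons:
        free *= 1.0 - polygon_membership(point, half_spaces, sharpness)
    return 1.0 - free

# Hypothetical unit square [0, 1] x [0, 1] written as four half-spaces.
unit_square = [((1.0, 0.0), 1.0), ((-1.0, 0.0), 0.0),
               ((0.0, 1.0), 1.0), ((0.0, -1.0), 0.0)]
inside = obstacle_membership((0.5, 0.5), [unit_square])
outside = obstacle_membership((2.0, 0.5), [unit_square])
```

Because `obstacle_membership` is smooth in `point`, an inequality such as `obstacle_membership(p, polygons) <= 0.1` could in principle be passed to a gradient-based solver as a constraint, which is the compatibility property the abstract is after.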

DQLoRA: A Lightweight Domain-Aware Denoising ASR via Adapter-guided Distillation

Authors: Yiru Yang

We present a demo of DQLoRA, an Adapter-Guided Distillation framework for robust speech recognition under low-resource and noisy conditions. Our method employs a frozen Whisper model as the teacher to provide semantic supervision, and a lightweight Wav2Vec2 student equipped with QLoRA-based Adapters. Training is conducted on the FLEURS dataset augmented with DNS-style noise. The student is optimized by jointly minimizing CTC loss and KL-based distillation loss, enabling efficient adaptation while preserving recognition accuracy.

Streamlined Airborne Software Development for Large UAVs: From Unified Data Collection to Automated Code Generation

Authors: Viktor Sinitsyn, Nils Schlautmann, Florian Schwaiger, Florian Holzapfel

The aerospace industry has experienced significant transformations over the last decade, driven by technological advancements and innovative solutions in goods and personal transportation. This evolution has spurred the emergence of numerous start-ups that now face challenges traditionally encountered by established aerospace companies. Among these challenges is the efficient processing of digital intra-device communication interfaces for onboard equipment - a critical component for ensuring seamless system integration and functionality. Addressing this challenge requires solutions that emphasize clear and consistent interface descriptions, automation of processes, and reduced labor-intensive efforts. This paper presents a novel process and toolchain designed to streamline the development of digital interfaces and onboard software, which our team has successfully applied in several completed projects. The proposed approach focuses on automation and flexibility while maintaining compliance with design assurance requirements.

Convergence of Agnostic Federated Averaging

Authors: Herlock (SeyedAbolfazl) Rahimi, Dionysis Kalogerias

Federated learning (FL) enables decentralized model training without centralizing raw data. However, practical FL deployments often face a key realistic challenge: clients participate intermittently in server aggregation and with unknown, possibly biased participation probabilities. Most existing convergence results either assume full-device participation, or rely on knowledge of (in fact uniform) client availability distributions -- assumptions that rarely hold in practice. In this work, we characterize the optimization problem that consistently adheres to the stochastic dynamics of the well-known agnostic Federated Averaging (FedAvg) algorithm under random (and variably-sized) client availability, and rigorously establish its convergence for convex, possibly nonsmooth losses, achieving a standard rate of order $\mathcal{O}(1/\sqrt{T})$, where $T$ denotes the aggregation horizon. Our analysis provides the first convergence guarantees for agnostic FedAvg under general, non-uniform, stochastic client participation, without knowledge of the participation distribution. We also empirically demonstrate that agnostic FedAvg in fact outperforms common (and suboptimal) weighted aggregation FedAvg variants, even with server-side knowledge of participation weights.
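
A minimal sketch of one server aggregation step follows, assuming that "agnostic" means the server takes a plain average over whichever clients reported in the round, with no reweighting by participation probability. That reading, the function name, and the behavior when no client reports are assumptions; the abstract does not spell out the update rule.

```python
def agnostic_fedavg_round(global_model, client_models):
    """Average the models of whichever clients reported this round.

    Assumption: if no client is available, the server keeps its current model.
    Models are plain lists of floats to keep the sketch dependency-free.
    """
    if not client_models:
        return list(global_model)
    count = len(client_models)
    dim = len(global_model)
    return [sum(model[i] for model in client_models) / count for i in range(dim)]

# Two of an unknown number of clients happen to report in this round.
current = [0.0, 0.0]
updated = agnostic_fedavg_round(current, [[1.0, 2.0], [3.0, 4.0]])
unchanged = agnostic_fedavg_round(current, [])
```

The number of reporting clients can differ from round to round, which matches the variably-sized availability described in the abstract.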

The Reconfigurable Earth Observation Satellite Scheduling Problem

Authors: Brycen D. Pearl, Joseph M. Miller, Hang Woon Lee

Earth observation satellites (EOS) play a pivotal role in capturing and analyzing planetary phenomena, ranging from natural disasters to societal development. The EOS scheduling problem (EOSSP), which optimizes the schedule of EOS, is often solved with respect to nadir-directional EOS systems, thus restricting the observation time of targets and, consequently, the effectiveness of each EOS. This paper leverages state-of-the-art constellation reconfigurability to develop the reconfigurable EOS scheduling problem (REOSSP), wherein EOS are assumed to be maneuverable, forming a more optimal constellation configuration at multiple opportunities during a schedule. This paper develops a novel mixed-integer linear programming formulation for the REOSSP to optimally solve the scheduling problem for given parameters. Additionally, since the REOSSP can be computationally expensive for large-scale problems, a rolling horizon procedure (RHP) solution method is developed. The performance of the REOSSP is benchmarked against the EOSSP, which serves as a baseline, through a set of random instances where problem characteristics are varied and a case study in which Hurricane Sandy is used to demonstrate realistic performance. These experiments demonstrate the value of constellation reconfigurability in its application to the EOSSP, yielding solutions that improve performance, while the RHP enhances computational runtime for large-scale REOSSP instances.

Evaluating Fake Music Detection Performance Under Audio Augmentations

Authors: Tomasz Sroka, Tomasz Wężowicz, Dominik Sidorczuk, Mateusz Modrzejewski

With the rapid advancement of generative audio models, distinguishing between human-composed and generated music is becoming increasingly challenging. As a response, models for detecting fake music have been proposed. In this work, we explore the robustness of such systems under audio augmentations. To evaluate model generalization, we constructed a dataset consisting of both real and synthetic music generated using several systems. We then apply a range of audio transformations and analyze how they affect classification accuracy. We test the performance of a recent state-of-the-art musical deepfake detection model in the presence of audio augmentations. The performance of the model decreases significantly even with the introduction of light augmentations.

Radif corpus: a symbolic dataset for non-metric iranian classical music

Authors: Maziar Kanani, Sean O Leary, James McDermott

Non-metric music forms the core of the repertoire in Iranian classical music. Dastgahi music serves as the underlying theoretical system for both Iranian art music and certain folk traditions. At the heart of Iranian classical music lies the radif, a foundational repertoire that organizes melodic material central to performance and pedagogy. In this study, we introduce the first digital corpus representing the complete non-metrical radif repertoire, covering all 13 existing components of this repertoire. We provide MIDI files (about 281 minutes in total) and data spreadsheets describing notes, note durations, intervals, and hierarchical structures for 228 pieces of music. We faithfully represent the tonality including quarter-tones, and the non-metric aspect. Furthermore, we provide supporting basic statistics, and measures of complexity and similarity over the corpus. Our corpus provides a platform for computational studies of Iranian classical music. Researchers might employ it in studying melodic patterns, investigating improvisational styles, or for other tasks in music information retrieval, music theory, and computational (ethno)musicology.

RAPNet: A Receptive-Field Adaptive Convolutional Neural Network for Pansharpening

Authors: Tao Tang, Chengxu Yang

Pansharpening refers to the process of integrating a high resolution panchromatic (PAN) image with a lower resolution multispectral (MS) image to generate a fused product, which is pivotal in remote sensing. Despite the effectiveness of CNNs in addressing this challenge, they are inherently constrained by the uniform application of convolutional kernels across all spatial positions, overlooking local content variations. To overcome this issue, we introduce RAPNet, a new architecture that leverages content-adaptive convolution. At its core, RAPNet employs the Receptive-field Adaptive Pansharpening Convolution (RAPConv), designed to produce spatially adaptive kernels responsive to local feature context, thereby enhancing the precision of spatial detail extraction. Additionally, the network integrates the Pansharpening Dynamic Feature Fusion (PAN-DFF) module, which incorporates an attention mechanism to achieve an optimal balance between spatial detail enhancement and spectral fidelity. Comprehensive evaluations on publicly available datasets confirm that RAPNet delivers superior performance compared to existing approaches, as demonstrated by both quantitative metrics and qualitative assessments. Ablation analyses further substantiate the effectiveness of the proposed adaptive components.

AudioMAE++: learning better masked audio representations with SwiGLU FFNs

Authors: Sarthak Yadav, Sergios Theodoridis, Zheng-Hua Tan

Masked Autoencoders (MAEs) trained on audio spectrogram patches have emerged as a prominent approach for learning self-supervised audio representations. While several recent papers have evaluated key aspects of training MAEs on audio data, the majority of these approaches still leverage vanilla transformer building blocks, whereas the transformer community has seen steady integration of newer architectural advancements. In this work, we propose AudioMAE++, a revamped audio masked autoencoder with two such enhancements, namely macaron-style transformer blocks with gated linear units. When pretrained on the AudioSet dataset, the proposed AudioMAE++ models outperform existing MAE based approaches on 10 diverse downstream tasks, demonstrating excellent performance on audio classification and speech-based benchmarks. The proposed AudioMAE++ models also demonstrate excellent scaling characteristics, outperforming directly comparable standard MAE baselines with up to 4x more parameters.

Neural Architecture Search generated Phase Retrieval Net for Real-time Off-axis Quantitative Phase Imaging

Authors: Xin Shu, Mengxuan Niu, Yi Zhang, Wei Luo, Renjie Zhou

In off-axis Quantitative Phase Imaging (QPI), artificial neural networks have been recently applied for phase retrieval with aberration compensation and phase unwrapping. However, the involved neural network architectures are largely unoptimized and inefficient with low inference speed, which hinders the realization of real-time imaging. Here, we propose a Neural Architecture Search (NAS) generated Phase Retrieval Net (NAS-PRNet) for accurate and fast phase retrieval. NAS-PRNet is an encoder-decoder style neural network, automatically found from a large neural network architecture search space through NAS. By modifying the differentiable NAS scheme from SparseMask, we learn the optimized skip connections through gradient descent. Specifically, we implement MobileNet-v2 as the encoder and define a synthesized loss that incorporates phase reconstruction loss and network sparsity loss. NAS-PRNet achieves high-fidelity phase retrieval, with a peak Signal-to-Noise Ratio (PSNR) of 36.7 dB and a Structural SIMilarity (SSIM) of 86.6% as tested on interferograms of biological cells. Notably, NAS-PRNet achieves phase retrieval in only 31 ms, representing a 15x speedup over the most recent Mamba-UNet with only a slightly lower phase retrieval accuracy.

Unmixing Optical Signals from Undersampled Volumetric Measurements by Filtering the Pixel Latent Variables

Authors: Catherine Bouchard, Andréanne Deschênes, Vincent Boulanger, Jean-Michel Bellavance, Julia Chabbert, Alexy Pelletier-Rioux, Flavie Lavoie-Cardinal, Christian Gagné

The development of signal unmixing algorithms is essential for leveraging multimodal datasets acquired through a wide array of scientific imaging technologies, including hyperspectral or time-resolved acquisitions. In experimental physics, enhancing the spatio-temporal resolution or expanding the number of detection channels often leads to diminished sampling rate and signal-to-noise ratio, significantly affecting the efficacy of signal unmixing algorithms. We propose Latent Unmixing, a new approach which applies bandpass filters to the latent space of a multidimensional convolutional neural network to disentangle overlapping signal components. It enables better isolation and quantification of individual signal contributions, especially in the context of undersampled distributions. Using multidimensional convolution kernels to process all dimensions simultaneously enhances the network's ability to extract information from adjacent pixels, and time or spectral bins. This approach enables more effective separation of components in cases where individual pixels do not provide clear, well-resolved information. We showcase the method's practical use in experimental physics through two test cases that highlight the versatility of our approach: fluorescence lifetime microscopy and mode decomposition in optical fibers. The latent unmixing method extracts valuable information from complex signals that cannot be resolved by standard methods. It opens up new possibilities in optics and photonics for multichannel separation at an increased sampling rate.

Enhancing Hemodynamic Parameter Estimations: Nonlinear Blood Behavior in 4D Flow MRI

Authors: Hernán Mella, Felipe Galarce, Tetsuro Sekine, Julio Sotelo, Ernesto Castillo

Hemodynamic parameters are often estimated assuming a constant Newtonian viscosity, even though blood exhibits shear-thinning behavior. This article investigates the influence of blood rheology and hematocrit (Hct) percentage on the estimation of Wall Shear Stress (WSS), rate of viscous Energy Loss ($\dot{E}_L$) at different points in the cardiac cycle, and the Oscillatory Shear Index (OSI). We focus on a hematocrit-dependent power-law non-Newtonian model, considering a wide range of Hct values at physiological temperature, with rheological parameters obtained from previously reported experimental data. In all cases, we systematically compared WSS, $\dot{E}_L$, and OSI using both Newtonian and power-law models, underscoring the crucial role of blood rheology in accurately assessing cardiovascular diseases. Our results show that, in in-silico experiments, differences in WSS and $\dot{E}_L$ across a wide range of Hct values can reach as high as 190\% and 113\% at systole, and as low as -72\% and -74\% at diastole, respectively. In in-vivo data, differences in WSS and $\dot{E}_L$ can reach up to -45\% and -60\% at systole, and range from -69\% to 73\% at diastole. This study enhances our understanding of the impact of blood rheology on hemodynamic parameter estimations using both in-silico and in-vivo aortic 4D Flow MRI data.
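The comparison between a constant Newtonian viscosity and a power-law model can be illustrated with a minimal sketch. It assumes the standard power-law form, apparent viscosity = K * (shear rate)^(n - 1), and a wall shear stress equal to viscosity times wall shear rate. The values of K, n, and the Newtonian viscosity below are illustrative placeholders, not the hematocrit-dependent parameters used in the paper.

```python
def power_law_viscosity(shear_rate, consistency, flow_index):
    """Apparent viscosity (Pa*s) of a power-law fluid: K * shear_rate**(n - 1)."""
    if shear_rate <= 0.0:
        raise ValueError("shear_rate must be positive")
    return consistency * shear_rate ** (flow_index - 1.0)


def wall_shear_stress(shear_rate, viscosity):
    """Wall shear stress (Pa) as viscosity times wall shear rate."""
    return viscosity * shear_rate


# Placeholder parameters (assumptions for illustration only).
K = 0.017              # consistency index, Pa*s^n
N = 0.7                # flow index; n < 1 means shear-thinning
MU_NEWTONIAN = 0.0035  # constant viscosity, Pa*s


def wss_relative_difference(shear_rate):
    """Percent difference of power-law WSS relative to Newtonian WSS."""
    wss_power_law = wall_shear_stress(
        shear_rate, power_law_viscosity(shear_rate, K, N)
    )
    wss_newtonian = wall_shear_stress(shear_rate, MU_NEWTONIAN)
    return 100.0 * (wss_power_law - wss_newtonian) / wss_newtonian


if __name__ == "__main__":
    for rate in (10.0, 100.0, 1000.0):
        print(rate, round(wss_relative_difference(rate), 1))
```

With these placeholder values the sign of the difference flips between low and high shear rates, which is the qualitative reason a shear-thinning model can both raise and lower the estimates relative to the Newtonian assumption.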

Input-Output Extension of Underactuated Nonlinear Systems

Authors: Mirko Mizzoni, Amr Afifi, Antonio Franchi

This letter proposes a method to integrate auxiliary actuators that enhance the task-space capabilities of commercial underactuated systems, while leaving the internal certified low-level controller untouched. The additional actuators are combined with a feedback-linearizing outer-loop controller, enabling full-pose tracking. We provide conditions under which legacy high-level commands and new actuator inputs can be cohesively coordinated to achieve decoupled control of all degrees of freedom. A comparative study with a standard quadrotor (originally not designed for physical interaction) demonstrates that the proposed modified platform remains stable under contact, while the baseline system diverges. Additionally, simulation results under parameter uncertainty illustrate the robustness of the proposed approach.

Advancing Automatic Photovoltaic Defect Detection using Semi-Supervised Semantic Segmentation of Electroluminescence Images

Authors: Abhishek Jha, Yogesh Rawat, Shruti Vyas

Photovoltaic (PV) systems allow us to tap into abundant solar energy; however, they require regular maintenance to sustain high efficiency and prevent degradation. Traditional manual health checks using Electroluminescence (EL) imaging are expensive and logistically challenging, which makes automated defect detection essential. Current automation approaches require extensive manual expert labeling, which is time-consuming, expensive, and prone to errors. We propose PV-S3 (Photovoltaic-Semi-supervised Semantic Segmentation), a Semi-Supervised Learning approach for semantic segmentation of defects in EL images that reduces reliance on extensive labeling. PV-S3 is an artificial intelligence (AI) model trained using a few labeled images along with numerous unlabeled images. We introduce a novel Semi Cross-Entropy loss function to deal with class imbalance. We evaluate PV-S3 on multiple datasets and demonstrate its effectiveness and adaptability. With merely 20% labeled samples, we achieve an absolute improvement of 9.7% in mean Intersection-over-Union (mIoU), 13.5% in Precision, 29.15% in Recall, and 20.42% in F1-Score over the prior state-of-the-art supervised method (which uses 100% labeled samples) on the University of Central Florida-Electroluminescence (UCF-EL) dataset (the largest dataset available for semantic segmentation of EL images), improving performance while reducing annotation costs by 80%. For more details, visit our GitHub repository: this https URL.

Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation

Authors: Sen Zhan, Nikolaos G. Paterakis, Wouter van den Akker, Anne van der Molen, Johan Morren, Han Slootweg

We investigate the real-time voltage regulation problem in distribution systems employing online feedback optimization (OFO) with short-range communication between physical neighbours. OFO needs neither an accurate grid model nor the estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties and disturbances, which renders it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, making them susceptible to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. The strategy preserves end-users' privacy by keeping voltage data local. Numerical study results demonstrate that the proposed design achieves effective voltage regulation and outperforms other distributed and local approaches.
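The general idea of feedback optimization, using measured voltages in place of an accurate grid model, can be sketched on a single bus. This is a generic dual-ascent loop and not the distributed nested design proposed in the paper. The function `measure_voltage`, the linear voltage model, and all numeric values are hypothetical stand-ins chosen for illustration. The controller deliberately uses a sensitivity (0.05) that differs from the one in the simulated grid (0.06) to show that the measurement feedback absorbs the mismatch.

```python
def measure_voltage(q, v_no_control=1.06, true_sensitivity=0.06):
    """Hypothetical stand-in for the physical grid: bus voltage (p.u.)
    as a function of the reactive power set-point q (p.u.)."""
    return v_no_control + true_sensitivity * q


def ofo_voltage_control(steps=60, v_max=1.05, model_sensitivity=0.05,
                        step_size=200.0, q_min=-1.0, q_max=1.0):
    """Minimize 0.5 * q**2 subject to v <= v_max using measured voltages.

    The dual variable integrates the measured constraint violation; the
    set-point minimizes the Lagrangian under the controller's (inexact)
    sensitivity and is clipped to the inverter limits.
    """
    dual = 0.0
    q = 0.0
    for _ in range(steps):
        v_meas = measure_voltage(q)  # feedback from the grid
        dual = max(0.0, dual + step_size * (v_meas - v_max))
        q = min(q_max, max(q_min, -model_sensitivity * dual))
    return q, measure_voltage(q)


if __name__ == "__main__":
    q_final, v_final = ofo_voltage_control()
    print(q_final, v_final)
```

In this toy setting the loop settles at the voltage limit even though the controller's sensitivity is wrong, because the update is driven by the measured voltage rather than a predicted one.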

MGA-Net: A Novel Mask-Guided Attention Neural Network for Precision Neonatal Brain Imaging

Authors: Bahram Jafrasteh, Simon Pedro Lubian-Lopez, Emiliano Trimarco, Macarena Roman Ruiz, Carmen Rodriguez Barrios, Yolanda Marin Almagro, Isabel Benavente-Fernandez

In this study, we introduce MGA-Net, a novel mask-guided attention neural network, which extends the U-net model for precision neonatal brain imaging. MGA-Net is designed to extract the brain from other structures and reconstruct high-quality brain images. The network employs a common encoder and two decoders: one for brain mask extraction and the other for brain region reconstruction. A key feature of MGA-Net is its high-level mask-guided attention module, which leverages features from the brain mask decoder to enhance image reconstruction. To enable the same encoder and decoder to process both MRI and ultrasound (US) images, MGA-Net integrates sinusoidal positional encoding. This encoding assigns distinct positional values to MRI and US images, allowing the model to effectively learn from both modalities. Consequently, features learned from a single modality can aid in learning a modality with less available data, such as US. We extensively validated the proposed MGA-Net on diverse and independent datasets from varied clinical settings and neonatal age groups. The metrics used for assessment included the DICE similarity coefficient, recall, and accuracy for image segmentation; structural similarity for image reconstruction; and root mean squared error for total brain volume estimation from 3D ultrasound images. Our results demonstrate that MGA-Net significantly outperforms traditional methods, offering superior performance in brain extraction and segmentation while achieving high precision in image reconstruction and volumetric analysis. Thus, MGA-Net represents a robust and effective preprocessing tool for MRI and 3D ultrasound images, marking a significant advance in neuroimaging that enhances both research and clinical diagnostics in the neonatal period. The code is available at this https URL.

Detecting $\sim$10 mK Face Temperature Change Based on Lock-in Thermography Referencing Heartbeat

Authors: Nanami Kotani, Yasuaki Monnai

Infrared thermography, which has spread widely, particularly during the COVID-19 period, has been effectively used for research on health monitoring and emotion estimation. Nevertheless, detecting minute temperature changes with thermography is challenging because the measurement is disturbed not only by noise but also by the ambient temperature surrounding the object. In this study, we demonstrate the detection of face temperature variation by implementing lock-in thermography using heartbeat signals as a reference. This allows us to detect minute temperature changes, as low as $\sim$10 mK, on the forehead with a commercially available thermal camera. The proposed approach enables stable measurement of body temperature variation, showing potential for non-contact emotion estimation.
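The lock-in principle behind this result can be sketched on synthetic data: multiplying a noisy trace by sine and cosine references at the heartbeat frequency and averaging rejects everything that is not synchronous with the reference. The sketch below assumes a single fixed reference frequency and a sinusoidal temperature modulation, which simplifies a real heartbeat signal. The sampling rate, noise level, baseline, and function names are illustrative assumptions, not values from the paper.

```python
import math
import random


def lock_in_amplitude(samples, reference_hz, sample_rate_hz):
    """Estimate the amplitude of the component of `samples` at reference_hz
    by quadrature demodulation followed by averaging."""
    n = len(samples)
    mean = sum(samples) / n  # remove the static baseline
    i_sum = 0.0
    q_sum = 0.0
    for k, x in enumerate(samples):
        phase = 2.0 * math.pi * reference_hz * k / sample_rate_hz
        i_sum += (x - mean) * math.sin(phase)
        q_sum += (x - mean) * math.cos(phase)
    # Each quadrature average carries half the amplitude.
    return 2.0 * math.hypot(i_sum / n, q_sum / n)


def synthetic_trace(amplitude, noise_std, reference_hz, sample_rate_hz,
                    duration_s, baseline=34.0, seed=0):
    """Baseline temperature plus a small modulation and Gaussian noise."""
    rng = random.Random(seed)
    n = int(round(duration_s * sample_rate_hz))
    return [
        baseline
        + amplitude * math.sin(
            2.0 * math.pi * reference_hz * k / sample_rate_hz + 0.5
        )
        + rng.gauss(0.0, noise_std)
        for k in range(n)
    ]


if __name__ == "__main__":
    # 10 mK modulation at 1.2 Hz buried in 30 mK noise, 30 frames/s, 300 s.
    trace = synthetic_trace(0.010, 0.030, 1.2, 30.0, 300.0)
    print(lock_in_amplitude(trace, 1.2, 30.0))
```

Averaging over many heartbeat periods is what lets a modulation smaller than the per-frame noise be recovered; longer records narrow the effective detection bandwidth further.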

HiFAKES: Synthetic High-Frequency NILM Data for NILM Models Diagnostics and Generalization Testing

Authors: Ilia Kamyshev, Sahar Moghimian, Henni Ouerdane

Monitoring electricity consumption at the appliance level is crucial for increasing energy efficiency in residential and commercial buildings. Using a single meter, non-intrusive load monitoring (NILM) breaks household consumption down to the appliance level, providing comprehensive insights into end-user electricity behavior. NILM models are trained on a household's total power consumption paired with submetered appliance labels. When sampled at high frequencies ($\geq$ 1 kHz), these datasets capture the full waveform characteristics, significantly improving disaggregation accuracy and model generalization. Nevertheless, such datasets are scarce, collected from a limited number of households, and rarely include labels for power estimation, which complicates their use for model training, evaluation, or debugging. We propose HiFAKES, a pre-trained synthetic data generator that can instantly generate unlimited amounts of fully labeled high-frequency NILM data, including aggregated and submetered current signatures. The data is ready-to-use and annotated for load identification (classification) and power estimation (regression). It allows simulating seen and completely unseen scenarios of appliances' behavior with full control over the number of appliance classes, operational modes, class similarity, brand diversity, and the number of concurrently running devices. We propose a structured methodology to test the generalization of NILM models on simulated unseen households. The reliability of the HiFAKES synthetic data is assessed using a domain-agnostic 3-dimensional metric. The generated signatures achieve high realism (93\% authenticity), closely resemble real-world data (84\% fidelity), and include a reasonable portion of unseen signatures (5\%).

Quantifying the Dunkelflaute: An analysis of variable renewable energy droughts in Europe

Authors: Martin Kittel, Wolf-Peter Schill

Variable renewable energy droughts, also called "Dunkelflaute", emerge as a challenge for climate-neutral energy systems based on variable renewables. Drawing on 38 historic weather years and an advanced identification method, we characterize European drought events for on- and offshore wind power, solar photovoltaics, and renewable technology portfolios. We show that their characteristics heavily depend on the chosen drought threshold, questioning the usefulness of single-threshold analyses. Applying a multi-threshold framework, we quantify how the complementarity of wind and solar power temporally and spatially alleviates drought frequency, duration, and severity within (portfolio effect) and across countries (balancing effect). We identify the most extreme droughts and show how these drive major discharging periods of long-duration storage in a fully renewable European energy system, based on a policy-relevant decarbonization scenario. Such events comprise sequences of shorter droughts of varying severity. The most extreme event occurred in winter 1996/97 and lasted 55 days in a perfectly interconnected setting. While the average renewable availability during this period was still 47% of its long-run mean, we argue that system planners must consider such events when planning for storage and other flexibility technologies. Methodologically, we conclude that using single calendar years is not suitable for modeling weather-resilient energy scenarios.

A Machine Learning-Based Reference Governor for Nonlinear Systems With Application to Automotive Fuel Cells

Authors: Mostafaali Ayubirad, Hamid R. Ossareh

The prediction-based nonlinear reference governor (PRG) is an add-on algorithm to enforce constraints on pre-stabilized nonlinear systems by modifying, whenever necessary, the reference signal. The implementation of PRG carries a heavy computational burden, as it may require multiple numerical simulations of the plant model at each sample time. To this end, this paper proposes an alternative approach based on machine learning, where we first use a regression neural network (NN) to approximate the input-output map of the PRG from a set of training data. During the real-time operation, at each sample time, we use the trained NN to compute a nominal reference command, which may not be constraint admissible due to training errors and limited data. We adopt a novel sensitivity-based approach to minimally adjust the nominal reference while ensuring constraint enforcement. We thus refer to the resulting control strategy as the modified neural network reference governor (MNN-RG), which is significantly more computationally efficient than the PRG. The computational and theoretical properties of MNN-RG are presented. Finally, the effectiveness and limitations of the proposed method are studied by applying it as a load governor for constraint management in automotive fuel cell systems through simulation-based case studies.

A System Parameterization for Direct Data-Driven Estimator Synthesis

Authors: Felix Brändle, Frank Allgöwer

This paper introduces a novel parameterization to characterize unknown linear time-invariant systems using noisy data. The presented parameterization describes exactly the set of all systems consistent with the available data. We then derive verifiable conditions under which the consistency constraint reduces the set to the true system and under which it has no impact. Furthermore, we demonstrate how to use this parameterization to perform a direct data-driven estimator synthesis with guarantees on the $H_{\infty}$-norm. Lastly, we conduct numerical experiments to compare our approach to existing methods.

A New 8/14 Two-Phase Switched Reluctance Motor

Authors: Gholamreza Davarpanah, Hossein Shirzad, Jawad Faiz

Despite their simple and robust structure, low cost, and simple cooling system, switched reluctance motors (SRMs) face the challenge of low mean torque. A possible solution is to change the structure of SRMs. This article introduces an innovative combination of rotor and stator tooth numbers for a two-phase switched reluctance motor (TPSRM), with eight teeth on the stator and fourteen teeth on the rotor. Owing to its unique design, which provides a short path for the main flux, it requires less magnetomotive force. This leads to lower core and copper losses, resulting in increased efficiency. Each stator tooth in a phase develops a positive torque during rotation of the rotor, which increases the mean torque of the proposed TPSRM. Current hysteresis control (CHC) is simulated by 2D FEM for the proposed 8/14 TPSRM and the conventional 8/12 TPSRM under the same mechanical load on the shaft to obtain a current hysteresis reference of 15 A at the nominal speed of 600 rpm. To verify the novelty and advantages of the suggested TPSRM, it is compared with the conventional 8/12 TPSRM in terms of mean and peak torque, torque density, and core and copper losses. Lastly, the proposed 8/14 TPSRM is shown to have better performance than the conventional 8/12 TPSRM.

Differentially Private Gradient-Tracking-Based Distributed Stochastic Optimization over Directed Graphs

Authors: Jialong Chen, Jimin Wang, Ji-Feng Zhang

This paper proposes a differentially private gradient-tracking-based distributed stochastic optimization algorithm over directed graphs. In particular, privacy noises are incorporated into each agent's state and tracking variable to mitigate information leakage, after which the perturbed states and tracking variables are transmitted to neighbors. We design two novel schemes for the step-sizes and the sampling number within the algorithm. The sampling parameter-controlled subsampling method employed by both schemes enhances the differential privacy level, and ensures a finite cumulative privacy budget even over infinite iterations. The algorithm achieves both almost sure and mean square convergence for nonconvex objectives. Furthermore, when nonconvex objectives satisfy the Polyak-Lojasiewicz condition, Scheme (S1) achieves a polynomial mean square convergence rate, and Scheme (S2) achieves an exponential mean square convergence rate. The trade-off between privacy and convergence is presented. The effectiveness of the algorithm and its superior performance compared to existing works are illustrated through numerical examples of distributed training on the benchmark datasets "MNIST" and "CIFAR-10".

WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in the Spatial-Frequency Domain

Authors: Jilan Cheng, Guoli Long, Zeyu Zhang, Zhenjia Qi, Hanyu Wang, Libin Lu, Shuihua Wang, Yudong Zhang, Jin Hong

Retinal diseases are a leading cause of vision impairment and blindness, with timely diagnosis being critical for effective treatment. Optical Coherence Tomography (OCT) has become a standard imaging modality for retinal disease diagnosis, but OCT images often suffer from issues such as speckle noise, complex lesion shapes, and varying lesion sizes, making interpretation challenging. In this paper, we propose a novel framework, WaveNet-SF, to enhance retinal disease detection by integrating the spatial-domain and frequency-domain learning. The framework utilizes wavelet transforms to decompose OCT images into low- and high-frequency components, enabling the model to extract both global structural features and fine-grained details. To improve lesion detection, we introduce a Multi-Scale Wavelet Spatial Attention (MSW-SA) module, which enhances the model's focus on regions of interest at multiple scales. Additionally, a High-Frequency Feature Compensation (HFFC) block is incorporated to recover edge information lost during wavelet decomposition, suppress noise, and preserve fine details crucial for lesion detection. Our approach achieves state-of-the-art (SOTA) classification accuracies of 97.82% and 99.58% on the OCT-C8 and OCT2017 datasets, respectively, surpassing existing methods. These results demonstrate the efficacy of WaveNet-SF in addressing the challenges of OCT image analysis and its potential as a powerful tool for retinal disease diagnosis.

Guided Neural Schrödinger bridge for Brain MR image synthesis with Limited Data

Authors: Hanyeol Yang, Sunggyu Kim, Mi Kyung Kim, Yongseon Yoo, Yu-Mi Kim, Min-Ho Shin, Insung Chung, Sang Baek Koh, Hyeon Chang Kim, Jong-Min Lee

Multi-modal brain MRI provides essential complementary information for clinical diagnosis. However, acquiring all modalities in practice is often constrained by time and cost. To address this, various methods have been proposed to generate missing modalities from available ones. Traditional approaches can be broadly categorized into two main types: paired and unpaired methods. While paired methods for synthesizing missing modalities achieve high accuracy, obtaining large-scale paired datasets is typically impractical. In contrast, unpaired methods, though scalable, often fail to preserve critical anatomical features, such as lesions. In this paper, we propose Fully Guided Schrödinger Bridge (FGSB), a novel framework designed to overcome these limitations by enabling high-fidelity generation with extremely limited paired data. Furthermore, when provided with lesion-specific information such as expert annotations, segmentation tools, or simple intensity thresholds for critical regions, FGSB can generate missing modalities while preserving these significant lesions with reduced data requirements. Our model comprises two stages: 1) Generation Phase: Iteratively refines synthetic images using a paired target image and Gaussian noise. 2) Training Phase: Learns optimal transformation pathways from source to target modality by mapping all intermediate states, ensuring consistent and high-fidelity synthesis. Experimental results across multiple datasets demonstrate that FGSB achieves performance comparable to large-data-trained models while using only two subjects. Incorporating lesion-specific priors further improves the preservation of clinical features.

A Framework for Fractional Matrix Programming Problems with Applications in FBL MU-MIMO

Authors: Mohammad Soleymani, Eduard Jorswieck, Robert Schober, Lajos Hanzo

An efficient framework is conceived for fractional matrix programming (FMP) optimization problems (OPs), namely for minimization and maximization. In each generic OP, either the objective or the constraints are functions of multiple arbitrary continuous-domain fractional functions (FFs). This ensures the framework's versatility, enabling it to solve a broader range of OPs than classical FMP solvers, like Dinkelbach-based algorithms. Specifically, the generalized Dinkelbach algorithm can only solve multiple-ratio FMP problems. By contrast, our framework solves OPs associated with a sum or product of multiple FFs as the objective or constraint functions. Additionally, our framework provides a single-loop solution, while most FMP solvers require twin-loop algorithms. Many popular performance metrics of wireless communications are FFs. For instance, latency has a fractional structure, and minimizing the sum delay leads to an FMP problem. Moreover, the mean square error (MSE) and energy efficiency (EE) metrics have fractional structures. Thus, optimizing EE-related metrics such as the sum or geometric mean of EEs and enhancing the metrics related to spectral-versus-energy-efficiency tradeoff yield FMP problems. Furthermore, both the signal-to-interference-plus-noise ratio and the channel dispersion are FFs. In this paper, we also develop resource allocation schemes for multi-user multiple-input multiple-output (MU-MIMO) systems, using finite block length (FBL) coding, demonstrating attractive practical applications of FMP by optimizing the aforementioned metrics.
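For context, the classical Dinkelbach iteration that the abstract cites as a baseline can be sketched for a single scalar ratio. This is the textbook method, not the authors' framework: it maximizes f(x)/g(x) by repeatedly maximizing f(x) - lambda * g(x) and updating lambda to the achieved ratio. The energy-efficiency example, the candidate power grid, and the channel gain and circuit power values are hypothetical and chosen only for illustration.

```python
import math


def dinkelbach(candidates, numerator, denominator, tol=1e-12, max_iter=100):
    """Maximize numerator(x) / denominator(x) over a finite candidate set.

    Assumes denominator(x) > 0 for every candidate.
    """
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        # Inner problem: maximize the parametric (non-fractional) objective.
        best = max(
            candidates, key=lambda x: numerator(x) - lam * denominator(x)
        )
        gap = numerator(best) - lam * denominator(best)
        # Outer update: lambda becomes the ratio achieved by the maximizer.
        lam = numerator(best) / denominator(best)
        if gap < tol:
            break
    return best, lam


def rate(p):
    """Achievable rate in bit/s/Hz for transmit power p (placeholder gain 10)."""
    return math.log2(1.0 + 10.0 * p)


def consumed_power(p):
    """Transmit power plus a placeholder static circuit power of 0.5."""
    return p + 0.5


POWERS = [0.01 * k for k in range(1, 501)]

if __name__ == "__main__":
    p_opt, ee_opt = dinkelbach(POWERS, rate, consumed_power)
    print(p_opt, ee_opt)
```

The inner maximization nested inside the lambda update is the twin-loop structure the abstract contrasts with its single-loop solution.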

Comprehensive Review of Deep Unfolding Techniques for Next-Generation Wireless Communication Systems

Authors: Sukanya Deka, Kuntal Deka, Nhan Thanh Nguyen, Sanjeev Sharma, Vimal Bhatia, Nandana Rajatheva

The application of machine learning in wireless communications has been extensively explored, with deep unfolding emerging as a powerful model-based technique. Deep unfolding enhances interpretability by transforming complex iterative algorithms into structured layers of deep neural networks (DNNs). This approach seamlessly integrates domain knowledge with deep learning (DL), leveraging the strengths of both methods to simplify complex signal processing tasks in communication systems. To provide a solid foundation, we first present a brief overview of DL and deep unfolding. We then explore the applications of deep unfolding in key areas, including signal detection, channel estimation, beamforming design, decoding for error-correcting codes, sensing and communication, power allocation, and security. Each section focuses on a specific task, highlighting its significance in emerging 6G technologies and reviewing recent advancements in deep unfolding-based solutions. Finally, we discuss the challenges associated with developing deep unfolding techniques and propose potential improvements to enhance their applicability across diverse wireless communication scenarios.

Balancing service provision by EV aggregator in different TSO-DSO coordination schemes

Authors: Hang Nguyen, Phuong Nguyen, Koen Kok

The increasing penetration of Distributed Energy Resources (DERs) in the distribution system has led to the emergence of a new market actor: the aggregator. The aggregator serves as a facilitator, enabling flexibility asset owners to access different markets. Among aggregators, electric vehicle (EV) aggregators are gaining more attention due to their expanding use and potential to provide services in various types of markets, particularly in the reserve market. Currently, the transmission system operator (TSO) indirectly utilizes these resources under the management of the distribution system operators (DSOs), which can negatively impact the distribution grid. Conversely, adjustments from DSOs can impact service provision to the TSO due to a lack of TSO usage information. These factors highlight the importance of evaluating the service provision from aggregators under different TSO-DSO coordination schemes. This paper focuses on the provision of flexibility from EV aggregators for balancing service under the TSO-DSO hybrid-managed coordination scheme and compares it with the DSO-managed scheme. The behavior of aggregators reacting to price fluctuations and TSO requests under different coordination schemes and simulation scenarios is thoroughly evaluated. Additionally, their impact on the grid is analyzed through the DSO's congestion management process and validated using data from a real part of the Dutch distribution network. Results show that the hybrid-managed coordination scheme benefits the aggregator more than the DSO-managed scheme, and that the EV aggregator gains more profit in winter than in summer because more upward regulation service is needed.

Advances in Anti-Deception Jamming Strategies for Radar Systems: A Survey

Authors: Helena Calatrava, Shuo Tang, Pau Closas

Deception jamming has long been a significant threat to radar systems, interfering with search, acquisition, and tracking by introducing false information that diverts attention from the targets of interest. As deception strategies become more sophisticated, the vulnerability of radar systems to these attacks continues to escalate. This paper offers a comprehensive review of the evolution of anti-deception jamming techniques, starting with legacy solutions and progressing to the latest advancements. Current research is categorized into three key areas: prevention strategies, which hinder the ability of jammers to alter radar processing; detection strategies, which alert the system to deception and may classify the type of attack; and mitigation strategies, which aim to reduce or suppress the impact of jamming. Additionally, key avenues for further research are highlighted, with a particular emphasis on distributed, cognitive, and AI-enabled radar systems. We envision this paper as a gateway to the existing literature on anti-deception jamming, a critical area for safeguarding radar systems against evolving threats.

Comprehensive Evaluation of OCT-based Automated Segmentation of Retinal Layer, Fluid and Hyper-Reflective Foci: Impact on Clinical Assessment of Diabetic Retinopathy Severity

Authors: S. Chen, D. Ma, M. Raviselvan, S. Sundaramoorthy, K. Popuri, M. J. Ju, M. V. Sarunic, D. Ratra, M. F. Beg

Diabetic retinopathy (DR) is a leading cause of vision loss, requiring early and accurate assessment to prevent irreversible damage. Spectral Domain Optical Coherence Tomography (SD-OCT) enables high-resolution retinal imaging, but automated segmentation performance varies, especially in cases with complex fluid and hyperreflective foci (HRF) patterns. This study proposes an active-learning-based deep learning pipeline for automated segmentation of retinal layers, fluid, and HRF, using four state-of-the-art models: U-Net, SegFormer, SwinUNETR, and VM-UNet, trained on expert-annotated SD-OCT volumes. Segmentation accuracy was evaluated with five-fold cross-validation, and retinal thickness was quantified using a K-nearest neighbors algorithm and visualized with Early Treatment Diabetic Retinopathy Study (ETDRS) maps. SwinUNETR achieved the highest overall accuracy (DSC = 0.7719; NSD = 0.8149), while VM-UNet excelled in specific layers. Structural differences were observed between non-proliferative and proliferative DR, with layer-specific thickening correlating with visual acuity impairment. The proposed framework enables robust, clinically relevant DR assessment while reducing the need for manual annotation, supporting improved disease monitoring and treatment planning.
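Assuming DSC above denotes the Dice similarity coefficient, as is conventional in segmentation work, the metric can be sketched for flattened binary masks as twice the overlap divided by the total number of foreground pixels in both masks. The function name and the convention of returning 1.0 for two empty masks are choices made for this illustration, not details from the paper.

```python
def dice_coefficient(prediction, truth):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(prediction) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(prediction, truth) if p and t)
    total = sum(1 for p in prediction if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly; this convention is an assumption.
    return 1.0 if total == 0 else 2.0 * overlap / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    annotated = [1, 0, 0, 0, 1, 1]
    print(dice_coefficient(predicted, annotated))
```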

Handling Domain Shifts for Anomalous Sound Detection: A Review of DCASE-Related Work

Authors: Kevin Wilkinghoff, Takuya Fujimura, Keisuke Imoto, Jonathan Le Roux, Zheng-Hua Tan, Tomoki Toda

When detecting anomalous sounds in complex environments, one of the main difficulties is that trained models must be sensitive to subtle differences in monitored target signals, while many practical applications also require them to be insensitive to changes in acoustic domains. Examples of such domain shifts include changing the type of microphone or the location of acoustic sensors, which can have a much stronger impact on the acoustic signal than subtle anomalies themselves. Moreover, users typically aim to train a model only on source domain data, which they may have a relatively large collection of, and they hope that such a trained model will be able to generalize well to an unseen target domain by providing only a minimal number of samples to characterize the acoustic signals in that domain. In this work, we review and discuss recent publications focusing on this domain generalization problem for anomalous sound detection in the context of the DCASE challenges on acoustic machine condition monitoring.

COVID-19 Pneumonia Diagnosis Using Medical Images: Deep Learning-Based Transfer Learning Approach

Authors: Anjali Dharmik

SARS-CoV-2, the causative agent of COVID-19, remains a global health concern due to its high transmissibility and evolving variants. Although vaccination efforts and therapeutic advancements have mitigated disease severity, emerging mutations continue to challenge diagnostics and containment strategies. As of mid-February 2025, global test positivity has risen to 11%, marking the highest level in over six months despite widespread immunization efforts. Newer variants demonstrate enhanced host cell binding, increasing both infectivity and diagnostic complexity. This study evaluates the effectiveness of deep transfer learning in delivering rapid, accurate, and mutation-resilient COVID-19 diagnosis from medical imaging, with a focus on scalability and accessibility. We developed an automated detection system using state-of-the-art CNNs, including VGG16, ResNet50, ConvNetXtTiny, MobileNet, NASNetMobile, and DenseNet121 among others, to detect COVID-19 from chest X-ray and CT images. Among all the models evaluated, DenseNet121 emerged as the best-performing architecture for COVID-19 diagnosis using CT and X-ray images. It achieved an impressive accuracy of 98%, with 96.9% precision, 98.9% recall, 97.9% F1-score and 99.8% AUC score, indicating a high degree of consistency and reliability in both detecting positive and negative cases. The confusion matrix showed minimal false positives and false negatives, underscoring the model's robustness in real-world diagnostic scenarios.

Goal-Oriented Remote Tracking Through Correlated Observations in Pull-based Communications

Authors: Abolfazl Zakeri, Mohammad Moltafet, Marian Codreanu

We address the real-time remote tracking problem in a status update system comprising two sensors, two independent information sources, and a remote monitor. The status updating follows a pull-based communication, where the monitor commands/pulls the sensors for status updates, i.e., the actual state of the sources. We consider that the observations are correlated, meaning that the data sent by each sensor could also include the state of the other source due to, e.g., inter-sensor communication or proximity-based monitoring. The effectiveness of data communication is measured by a generic distortion, capturing the underlying application goal. We provide optimal command/pulling policies for the monitor that minimize the average weighted sum distortion and transmission cost. Since the monitor cannot fully observe the exact state of each source, we propose a partially observable Markov decision process (POMDP) and reformulate it as a belief MDP problem. We then effectively truncate the infinite belief space and transform it into a finite-state MDP problem, which is solved via relative value iteration. Simulation results show the effectiveness of the derived policy over age-based and deep-Q network baseline policies.
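As a rough illustration of the last solution step, the sketch below runs relative value iteration on a tiny average-cost MDP. It is not the paper's model: the two states, the transition probabilities, the transmission cost of 0.4, and all names (`relative_value_iteration`, `P`, `cost`) are hypothetical stand-ins for the truncated belief MDP.

```python
def relative_value_iteration(P, cost, ref=0, tol=1e-10, max_iter=10_000):
    """Average-cost relative value iteration.

    P[a][s][t] is the probability of moving from state s to t under action a;
    cost[s][a] is the one-step cost. Returns (gain, bias, policy).
    """
    n_states, n_actions = len(cost), len(P)
    h = [0.0] * n_states
    for _ in range(max_iter):
        # One Bellman backup for every state-action pair.
        q = [[cost[s][a] + sum(P[a][s][t] * h[t] for t in range(n_states))
              for a in range(n_actions)] for s in range(n_states)]
        backed_up = [min(row) for row in q]
        gain = backed_up[ref]                  # estimate of the optimal average cost
        new_h = [v - gain for v in backed_up]  # re-centre so the values stay bounded
        delta = max(abs(x - y) for x, y in zip(new_h, h))
        h = new_h
        if delta < tol:
            break
    policy = [min(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return gain, h, policy


# Hypothetical toy: state 0 = estimate correct, state 1 = estimate wrong.
# Action 0 = stay idle, action 1 = pull an update (extra transmission cost 0.4).
P = [
    [[0.7, 0.3], [0.0, 1.0]],  # idle: a correct estimate goes wrong w.p. 0.3
    [[1.0, 0.0], [1.0, 0.0]],  # pull: the next estimate is correct
]
cost = [[0.0, 0.4], [1.0, 1.4]]  # distortion (0 or 1) plus transmission cost

gain, bias, policy = relative_value_iteration(P, cost)
print(round(gain, 4), policy)
```

In this toy the resulting policy idles while the estimate is correct and pulls once it is wrong; subtracting the reference-state value at every sweep is what distinguishes the relative variant from plain value iteration.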

Anatomically and Metabolically Informed Diffusion for Unified Denoising and Segmentation in Low-Count PET Imaging

Authors: Menghua Xia, Kuan-Yin Ko, Der-Shiun Wang, Ming-Kai Chen, Qiong Liu, Huidong Xie, Liang Guo, Wei Ji, Jinsong Ouyang, Reimund Bayerlein, Benjamin A. Spencer, Quanzheng Li, Ramsey D. Badawi, Georges El Fakhri, Chi Liu

Positron emission tomography (PET) image denoising, along with lesion and organ segmentation, are critical steps in PET-aided diagnosis. However, existing methods typically treat these tasks independently, overlooking inherent synergies between them as correlated steps in the analysis pipeline. In this work, we present the anatomically and metabolically informed diffusion (AMDiff) model, a unified framework for denoising and lesion/organ segmentation in low-count PET imaging. By integrating multi-task functionality and exploiting the mutual benefits of these tasks, AMDiff enables direct quantification of clinical metrics, such as total lesion glycolysis (TLG), from low-count inputs. The AMDiff model incorporates a semantic-informed denoiser based on diffusion strategy and a denoising-informed segmenter utilizing nnMamba architecture. The segmenter constrains denoised outputs via a lesion-organ-specific regularizer, while the denoiser enhances the segmenter by providing enriched image information through a denoising revision module. These components are connected via a warming-up mechanism to optimize multi-task interactions. Experiments on multi-vendor, multi-center, and multi-noise-level datasets demonstrate the superior performance of AMDiff. For test cases below 20% of the clinical count levels from participating sites, AMDiff achieves TLG quantification biases of -21.60%, outperforming its ablated versions which yield biases of -30.83% (without the lesion-organ-specific regularizer) and -35.63% (without the denoising revision module).

Kernel-based error bounds of bilinear Koopman surrogate models for nonlinear data-driven control

Authors: Robin Strässer, Manuel Schaller, Julian Berberich, Karl Worthmann, Frank Allgöwer

We derive novel deterministic bounds on the approximation error of data-based bilinear surrogate models for unknown nonlinear systems. The surrogate models are constructed using kernel-based extended dynamic mode decomposition to approximate the Koopman operator in a reproducing kernel Hilbert space. Unlike previous methods that require restrictive assumptions on the invariance of the dictionary, our approach leverages kernel-based dictionaries that allow us to control the projection error via pointwise error bounds, overcoming a significant limitation of existing theoretical guarantees. The derived state- and input-dependent error bounds allow for direct integration into Koopman-based robust controller designs with closed-loop guarantees for the unknown nonlinear system. Numerical examples illustrate the effectiveness of the proposed framework.

D2SA: Dual-Stage Distribution and Slice Adaptation for Efficient Test-Time Adaptation in MRI Reconstruction

Authors: Lipei Zhang, Rui Sun, Zhongying Deng, Yanqi Cheng, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero

Variations in magnetic resonance imaging (MRI) scanners and acquisition protocols cause distribution shifts that degrade reconstruction performance on unseen data. Test-time adaptation (TTA) offers a promising solution to address these discrepancies. However, previous single-shot TTA approaches are inefficient due to repeated training and suboptimal distributional models. Self-supervised learning methods may risk over-smoothing in scarce data scenarios. To address these challenges, we propose a novel Dual-Stage Distribution and Slice Adaptation (D2SA) via MRI implicit neural representation (MR-INR) to improve MRI reconstruction performance and efficiency, which features two stages. In the first stage, an MR-INR branch performs patient-wise distribution adaptation by learning shared representations across slices and modelling patient-specific shifts with mean and variance adjustments. In the second stage, single-slice adaptation refines the output from frozen convolutional layers with a learnable anisotropic diffusion module, preventing over-smoothing and reducing computation. Experiments across five MRI distribution shifts demonstrate that our method can integrate well with various self-supervised learning (SSL) frameworks, improving performance and accelerating convergence under diverse conditions.

Algorithm Design and Prototype Validation for Reconfigurable Intelligent Sensing Surface: Forward-Only Transmission

Authors: Cheng Luo, Luping Xiang, Jie Hu, Kun Yang

Sensing-assisted communication schemes have recently garnered significant research attention. In this work, we design a dual-function reconfigurable intelligent surface (RIS), integrating both active and passive elements, referred to as the reconfigurable intelligent sensing surface (RISS), to enhance communication. By leveraging sensing results from the active elements, we propose communication enhancement and robust interference suppression schemes for both near-field and far-field models, implemented through the passive elements. These schemes remove the need for base station (BS) feedback for RISS control, simplifying the communication process by replacing traditional channel state information (CSI) feedback with real-time sensing from the active elements. The proposed schemes are theoretically analyzed and then validated using software-defined radio (SDR). Experimental results demonstrate the effectiveness of the sensing algorithms in real-world scenarios, such as direction of arrival (DOA) estimation and radio frequency (RF) identification recognition. Moreover, the RISS-assisted communication system shows strong performance in communication enhancement and interference suppression, particularly in near-field models.

Reinforcement learning for robust dynamic metabolic control

Authors: Sebastián Espinel-Ríos, River Walser, Dongda Zhang

Dynamic metabolic control allows key metabolic fluxes to be modulated in real time, enhancing bioprocess flexibility and expanding available optimization degrees of freedom. This is achieved, e.g., via targeted modulation of metabolic enzyme expression. However, identifying optimal dynamic control policies is challenging due to the generally high-dimensional solution space and the need to manage metabolic burden and cytotoxic effects arising from inducible enzyme expression. The task is further complicated by stochastic dynamics, which reduce bioprocess reproducibility. We propose a reinforcement learning framework to derive optimal policies by allowing an agent (the controller) to interact with a surrogate dynamic model. To promote robustness, we apply domain randomization, enabling the controller to generalize across uncertainties. When transferred to an experimental system, the agent can in principle continue fine-tuning the policy. Our framework provides an alternative to conventional model-based control such as model predictive control, which requires model differentiation with respect to decision variables, often impractical for complex stochastic, nonlinear, stiff, and piecewise-defined dynamics. In contrast, our approach relies on forward integration of the model, thereby simplifying the task. We demonstrate the framework in two Escherichia coli bioprocesses: dynamic control of acetyl-CoA carboxylase for fatty-acid synthesis and of adenosine triphosphatase for lactate synthesis.

Selective Variable Convolution Meets Dynamic Content-Guided Attention for Infrared Small Target Detection

Authors: Yirui Chen, Yiming Zhu, Yuxin Jing, Tianpei Zhang, Jufeng Zhao

Infrared Small Target Detection (IRSTD) systems aim to identify small targets in complex backgrounds. Due to the convolution operation in Convolutional Neural Networks (CNNs), applying traditional CNNs to IRSTD presents challenges, since the feature extraction of small targets is often insufficient, resulting in the loss of critical features. To address these issues, we propose a dynamic content-guided attention multiscale feature aggregation network (DCGANet), which adheres to the attention principle of 'coarse-to-fine' and achieves high detection accuracy. First, we propose a selective variable convolution (SVC) module that integrates the benefits of standard convolution, irregular deformable convolution, and multi-rate dilated convolution. This module is designed to expand the receptive field and enhance non-local features, thereby effectively improving the discrimination between targets and backgrounds. Second, the core component of DCGANet is a two-stage content-guided attention module. This module employs a two-stage attention mechanism to initially direct the network's focus to salient regions within the feature maps and subsequently determine whether these regions correspond to targets or background interference. By retaining the most significant responses, this mechanism effectively suppresses false alarms. Additionally, we propose an Adaptive Dynamic Feature Fusion (ADFF) module to substitute for static feature cascading. This dynamic feature fusion strategy enables DCGANet to adaptively integrate contextual features, thereby enhancing its ability to discriminate true targets from false alarms. DCGANet has achieved new benchmarks across multiple datasets.

Prime and Co-prime Integer Matrices

Authors: Xiang-Gen Xia, Guangpu Guo

This paper investigates prime and co-prime integer matrices and their properties. It characterizes all pairwise co-prime integer matrices that are also prime integer matrices. This provides a simple way to construct families of pairwise co-prime integer matrices, which may have applications in multidimensional co-prime sensing and the multidimensional Chinese remainder theorem.

Embracing Diffraction: A Paradigm Shift in Wireless Sensing and Communication

Authors: Anurag Pallaprolu, Winston Hurst, Yasamin Mostofi

Wireless signals are integral to modern society, enabling both communication and, increasingly, environmental sensing. While various propagation models exist, ranging from empirical methods to full-wave simulations, the phenomenon of electromagnetic diffraction is often treated as a secondary effect or a correction factor. This paper positions diffraction as a fundamentally important and underutilized mechanism that is rich with information about the physical environment. Specifically, diffraction-inducing elements generate distinct signatures that are rich with information about their underlying properties such as their geometries. We then argue that by understanding and exploiting these relationships, diffraction can be harnessed strategically. We introduce a general optimization framework to formalize this concept, illustrating how diffraction can be leveraged for both inverse problems (sensing scene details such as object geometries from measured fields) and design problems (shaping radio frequency (RF) fields for communication objectives by configuring diffracting elements). Focusing primarily on edge diffraction and Keller's Geometrical Theory of Diffraction (GTD), we discuss specific applications in RF sensing for scene understanding and in communications for RF field programming, drawing upon recent work. Overall, this paper lays out a vision for systematically incorporating diffraction into the design and operation of future wireless systems, paving the way for enhanced sensing capabilities and more robust communication strategies.

GPS-Aided Deep Learning for Beam Prediction and Tracking in UAV mmWave Communication

Authors: Vendi Ardianto Nugroho, Byung Moo Lee

Millimeter-wave (mmWave) communication enables high data rates for cellular-connected Unmanned Aerial Vehicles (UAVs). However, robust beam management remains challenging due to significant path loss and the dynamic mobility of UAVs, which can destabilize the UAV-base station (BS) link. This research presents a GPS-aided deep learning (DL) model that simultaneously predicts current and future optimal beams for UAV mmWave communications, maintaining a Top-1 prediction accuracy exceeding 70% and an average power loss below 0.6 dB across all prediction steps. These outcomes stem from a proposed data set splitting method ensuring balanced label distribution, paired with a GPS preprocessing technique that extracts key positional features, and a DL architecture that maps sequential position data to beam index predictions. The model reduces overhead by approximately 93% (requiring the training of 2 ~ 3 beams instead of 32 beams) with 95% beam prediction accuracy guarantees, and ensures 94% to 96% of predictions exhibit mean power loss not exceeding 1 dB.

Robust Stability Analysis of Positive Lure System with Neural Network Feedback

Authors: Hamidreza Montazeri Hedesh, Moh. Kamalul Wafi, Bahram Shafai, Milad Siami

This paper investigates the robustness of the Lur'e problem under positivity constraints, drawing on results from the positive Aizerman conjecture and robustness properties of Metzler matrices. Specifically, we consider a control system of Lur'e type in which not only the linear part includes parametric uncertainty but also the nonlinear sector bound is unknown. We investigate tools from positive linear systems to effectively solve the problems in complicated and uncertain nonlinear systems. By leveraging the positivity characteristic of the system, we derive an explicit formula for the stability radius of Lur'e systems. Furthermore, we extend our analysis to systems with neural network (NN) feedback loops. Building on this approach, we also propose a refinement method for sector bounds of NNs. This study introduces a scalable and efficient approach for robustness analysis of both Lur'e and NN-controlled systems. Finally, the proposed results are supported by illustrative examples.

Real-Time High-Accuracy Digital Wireless Time, Frequency, and Phase Calibration For Coherent Distributed Antenna Arrays

Authors: Jason M. Merlo, Samuel Wagner, John Lancaster, Jeffrey A. Nanzer

This work presents a fully-digital high-accuracy real-time calibration procedure for frequency and time alignment of open-loop wirelessly coordinated coherent distributed antenna array (CDA) modems, enabling radio frequency (RF) phase coherence of spatially separated commercial off-the-shelf (COTS) software-defined radios (SDRs) without cables or external references such as global navigation satellite system (GNSS). Building on previous work using high-accuracy spectrally-sparse time of arrival (ToA) waveforms and a multi-step ToA refinement process, a high-accuracy two-way time transfer (TWTT)-based time-frequency coordination approach is demonstrated. Due to the two-way nature of the high-accuracy TWTT approach, the time and frequency estimates are Doppler and multi-path tolerant, so long as the channel is reciprocal over the synchronization epoch. This technique is experimentally verified using COTS SDRs in a lab environment in static and dynamic scenarios and with significant multipath scatterers. Time, frequency, and phase stability were evaluated by beamforming over coaxial cables to an oscilloscope which achieved time and phase precisions of ~60 ps-70 ps, with median coherent gains above 99 % using optimized coordination parameters, and a beamforming frequency root-mean-square error (RMSE) of 3.73 ppb in a dynamic scenario. Finally, experiments were conducted to compare the performance of this technique with previous works using an analog continuous-wave two-tone (CWTT) frequency reference technique in both static and dynamic settings.

CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation

Authors: Xinlei Yu, Changmiao Wang, Hui Jin, Ahmed Elazab, Gangyong Jia, Xiang Wan, Changqing Zou, Ruiquan Ge

Multi-organ medical segmentation is a crucial component of medical image processing, essential for doctors to make accurate diagnoses and develop effective treatment plans. Despite significant progress in this field, current multi-organ segmentation models often suffer from inaccurate details, dependence on geometric prompts and loss of spatial information. Addressing these challenges, we introduce a novel model named CRISP-SAM2 with CRoss-modal Interaction and Semantic Prompting based on SAM2. This model represents a promising approach to multi-organ medical segmentation guided by textual descriptions of organs. Our method begins by converting visual and textual inputs into cross-modal contextualized semantics using a progressive cross-attention interaction mechanism. These semantics are then injected into the image encoder to enhance the detailed understanding of visual information. To eliminate reliance on geometric prompts, we use a semantic prompting strategy, replacing the original prompt encoder to sharpen the perception of challenging targets. In addition, a similarity-sorting self-updating strategy for memory and a mask-refining process are applied to further adapt to medical imaging and enhance localized details. Comparative experiments conducted on seven public datasets indicate that CRISP-SAM2 outperforms existing models. Extensive analysis also demonstrates the effectiveness of our method, thereby confirming its superior performance, especially in addressing the limitations mentioned earlier. Our code is available online.

Human-CLAP: Human-perception-based contrastive language-audio pretraining

Authors: Taisei Takano, Yuki Okamoto, Yusuke Kanamori, Yuki Saito, Ryotaro Nagase, Hiroshi Saruwatari

Contrastive language-audio pretraining (CLAP) is widely used for audio generation and recognition tasks. For example, CLAPScore, which utilizes the similarity of CLAP embeddings, has been a major metric for the evaluation of the relevance between audio and text in text-to-audio. However, the relationship between CLAPScore and human subjective evaluation scores remains unclear. We show that CLAPScore has a low correlation with human subjective evaluation scores. Additionally, we propose a human-perception-based CLAP called Human-CLAP by training a contrastive language-audio model using the subjective evaluation score. In our experiments, the results indicate that our Human-CLAP improved the Spearman's rank correlation coefficient (SRCC) between the CLAPScore and the subjective evaluation scores by more than 0.25 compared with the conventional CLAP.
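For readers unfamiliar with the metric, here is a minimal sketch of how an SRCC between an automatic score and human ratings can be computed: rank both lists (averaging tied ranks) and take the Pearson correlation of the ranks. The function names and the four example scores are hypothetical and are not taken from the paper.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for four audio-text pairs.
clap_scores = [0.10, 0.40, 0.35, 0.80]
human_scores = [2, 3, 1, 5]
srcc = spearman(clap_scores, human_scores)
print(srcc)
```

Because only ranks matter, any monotone rescaling of either score leaves the SRCC unchanged, which is why it suits comparisons between scores on different scales.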

Multi Source COVID-19 Detection via Kernel-Density-based Slice Sampling

Authors: Chia-Ming Lee, Bo-Cheng Qiu, Ting-Yao Chen, Ming-Han Sun, Fang-Ying Lin, Jung-Tse Tsai, I-An Tsai, Yu-Fan Lin, Chih-Chung Hsu

We present our solution for the Multi-Source COVID-19 Detection Challenge, which classifies chest CT scans from four distinct medical centers. To address multi-source variability, we employ the Spatial-Slice Feature Learning (SSFL) framework with Kernel-Density-based Slice Sampling (KDS). Our preprocessing pipeline combines lung region extraction, quality control, and adaptive slice sampling to select eight representative slices per scan. We compare EfficientNet and Swin Transformer architectures on the validation set. The EfficientNet model achieves an F1-score of 94.68%, compared to the Swin Transformer's 93.34%. The results demonstrate the effectiveness of our KDS-based pipeline on multi-source data and highlight the importance of dataset balance in multi-institutional medical imaging evaluation.

MEGANet-W: A Wavelet-Driven Edge-Guided Attention Framework for Weak Boundary Polyp Detection

Authors: Zhe Yee Tan

Colorectal polyp segmentation is critical for early detection of colorectal cancer, yet weak and low-contrast boundaries significantly limit automated accuracy. Existing deep models either blur fine edge details or rely on handcrafted filters that perform poorly under variable imaging conditions. We propose MEGANet-W, a Wavelet-Driven Edge-Guided Attention Network that injects directional, parameter-free Haar wavelet edge maps into each decoder stage to recalibrate semantic features. Our two main contributions are: (1) a two-level Haar wavelet head for multi-orientation edge extraction; and (2) Wavelet Edge-Guided Attention (WEGA) modules that fuse wavelet cues with boundary and input branches. On five public polyp datasets, MEGANet-W consistently outperforms existing methods, improving mIoU by up to 2.3% and mDice by 1.2%, while introducing no additional learnable parameters.
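To make "parameter-free Haar wavelet edge maps" concrete, the following sketch computes a single level of the 2D Haar transform on a small grayscale image and combines the three detail bands into an edge-magnitude map. It is only an illustration under simplifying assumptions: one level instead of the paper's two-level head, plain nested lists instead of tensors, and hypothetical function names.

```python
import math


def haar_level(img):
    """One level of the orthonormal 2D Haar transform.

    img is a list of equal-length rows with even height and width.
    Returns (approx, row_diff, col_diff, diag), each half the input size.
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for i in range(0, len(img), 2):
        ra, rr, rc, rd = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ra.append((a + b + c + d) / 2)  # local average (low-pass)
            rr.append((a + b - c - d) / 2)  # top minus bottom: horizontal edges
            rc.append((a - b + c - d) / 2)  # left minus right: vertical edges
            rd.append((a - b - c + d) / 2)  # diagonal detail
        approx.append(ra)
        row_diff.append(rr)
        col_diff.append(rc)
        diag.append(rd)
    return approx, row_diff, col_diff, diag


def edge_magnitude(img):
    """Combine the three detail bands into one edge-strength map."""
    _, rows, cols, diags = haar_level(img)
    return [[math.sqrt(x * x + y * y + z * z) for x, y, z in zip(r, c, d)]
            for r, c, d in zip(rows, cols, diags)]


# Hypothetical 4x4 image with a vertical edge between columns 0 and 1.
image = [[0, 1, 1, 1] for _ in range(4)]
edges = edge_magnitude(image)
print(edges)
```

The filters are fixed sums and differences, so nothing here is learned; a second level would be obtained by applying `haar_level` again to the `approx` band.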

Musical Source Separation Bake-Off: Comparing Objective Metrics with Human Perception

Authors: Noah Jaffe, John Ashley Burgoyne

Music source separation aims to extract individual sound sources (e.g., vocals, drums, guitar) from a mixed music recording. However, evaluating the quality of separated audio remains challenging, as commonly used metrics like the source-to-distortion ratio (SDR) do not always align with human perception. In this study, we conducted a large-scale listener evaluation on the MUSDB18 test set, collecting approximately 30 ratings per track from seven distinct listener groups. We compared several objective energy-ratio metrics, including legacy measures (BSSEval v4, SI-SDR variants), and embedding-based alternatives (Frechet Audio Distance using CLAP-LAION-music, EnCodec, VGGish, Wave2Vec2, and HuBERT). While SDR remains the best-performing metric for vocal estimates, our results show that the scale-invariant signal-to-artifacts ratio (SI-SAR) better predicts listener ratings for drums and bass stems. Frechet Audio Distance (FAD) computed with the CLAP-LAION-music embedding also performs competitively--achieving Kendall's tau values of 0.25 for drums and 0.19 for bass--matching or surpassing energy-based metrics for those stems. However, none of the embedding-based metrics, including CLAP, correlate positively with human perception for vocal estimates. These findings highlight the need for stem-specific evaluation strategies and suggest that no single metric reliably reflects perceptual quality across all source types. We release our raw listener ratings to support reproducibility and further research.

MetaH2: A Snapshot Metasurface HDR Hyperspectral Camera

Authors: Yuxuan Liu, Qi Guo

We present a metasurface camera that jointly performs high-dynamic range (HDR) and hyperspectral imaging in a snapshot. The system integrates exposure bracketing and computed tomography imaging spectrometry (CTIS) by simultaneously forming multiple spatially multiplexed projections with unique power ratios and chromatic aberrations on a photosensor. The measurements are subsequently processed through a deep reconstruction model to generate an HDR image and a hyperspectral datacube. Our simulation studies show that the proposed system achieves higher reconstruction accuracy than previous snapshot hyperspectral imaging methods on benchmark datasets. We assemble a working prototype and demonstrate snapshot reconstruction of 60 dB dynamic range and 10 nm spectral resolution from 600 nm to 700 nm on real-world scenes from a monochrome photosensor.

Faster Reinforcement Learning by Freezing Slow States

Authors: Yijia Wang, Daniel R. Jiang

We study infinite horizon Markov decision processes (MDPs) with "fast-slow" structure, where some state variables evolve rapidly ("fast states") while others change more gradually ("slow states"). This structure commonly arises in practice when decisions must be made at high frequencies over long horizons, and where slowly changing information still plays a critical role in determining optimal actions. Examples include inventory control under slowly changing demand indicators or dynamic pricing with gradually shifting consumer behavior. Modeling the problem at the natural decision frequency leads to MDPs with discount factors close to one, making them computationally challenging. We propose a novel approximation strategy that "freezes" slow states during phases of lower-level planning and subsequently applies value iteration to an auxiliary upper-level MDP that evolves on a slower timescale. Freezing states for short periods of time leads to easier-to-solve lower-level problems, while a slower upper-level timescale allows for a more favorable discount factor. On the theoretical side, we analyze the regret incurred by our frozen-state approach, which leads to simple insights on how to trade off regret versus computational cost. Empirically, we benchmark our new frozen-state methods on three domains: (i) inventory control with fixed order costs, (ii) a gridworld problem with spatial tasks, and (iii) dynamic pricing with reference-price effects. We demonstrate that the new methods produce high-quality policies with significantly less computation, and we show that simply omitting slow states is often a poor heuristic.
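A back-of-the-envelope sketch of why a discount factor close to one is costly and why a slower upper-level timescale helps. It assumes, as an illustration not stated in the abstract, that an upper-level step spans T fast steps and is therefore discounted by gamma**T, and it uses the standard fact that value iteration shrinks its error by a factor gamma per sweep. The values of `gamma_fast`, `T`, and `eps` are hypothetical.

```python
import math


def sweeps_needed(gamma, eps):
    """Smallest k with gamma**k <= eps, i.e. sweeps until the error shrinks by eps."""
    return math.ceil(math.log(eps) / math.log(gamma))


gamma_fast = 0.999            # hypothetical discount at the natural decision frequency
T = 100                       # hypothetical number of fast steps per upper-level step
gamma_slow = gamma_fast ** T  # assumed effective discount of the upper-level MDP
eps = 1e-3                    # target error reduction

fast_sweeps = sweeps_needed(gamma_fast, eps)
slow_sweeps = sweeps_needed(gamma_slow, eps)
print(gamma_slow, fast_sweeps, slow_sweeps)
```

The count ignores the cost of the lower-level planning problems solved with the slow state frozen, so it only indicates the direction of the saving, not its size.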

MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian

Authors: Willy Fitra Hendria

Multimodal learning on video and text has seen significant progress, particularly in tasks like text-to-video retrieval, video-to-text retrieval, and video captioning. However, most existing methods and datasets focus exclusively on English. Despite Indonesian being one of the most widely spoken languages, multimodal research in Indonesian remains under-explored, largely due to the lack of benchmark datasets. To address this gap, we introduce the first public Indonesian video-text dataset by translating the English captions in the MSVD dataset into Indonesian. Using this dataset, we evaluate neural network models which were developed for the English video-text dataset on three tasks, i.e., text-to-video retrieval, video-to-text retrieval, and video captioning. Most existing models rely on feature extractors pretrained on English vision-language datasets, raising concerns about their applicability to Indonesian, given the scarcity of large-scale pretraining resources in the language. We apply a cross-lingual transfer learning approach by leveraging English-pretrained extractors and fine-tuning models on our Indonesian dataset. Experimental results demonstrate that this strategy improves performance across all tasks and metrics. We release our dataset publicly to support future research and hope it will inspire further progress in Indonesian multimodal learning.

Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding

Authors: Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

Neural compression has brought tremendous progress in designing lossy compressors with good rate-distortion (RD) performance at low complexity. Thus far, neural compression design involves transforming the source to a latent vector, which is then rounded to integers and entropy coded. While this approach has been shown to be optimal on a few specific sources, we show that it can be highly sub-optimal on synthetic sources whose intrinsic dimensionality is greater than one. With integer rounding in the latent space, the quantization regions induced by neural transformations remain square-like and fail to match those of optimal vector quantization. We demonstrate that this phenomenon is due to the choice of scalar quantization in the latent space, and not the transform design. By employing lattice quantization instead, we propose Lattice Transform Coding (LTC) and show that it approximately recovers optimal vector quantization at reasonable complexity. On real-world sources, LTC improves upon standard neural compressors. LTC also provides a framework that can integrate structurally (near) optimal information-theoretic designs into lossy compression; examples include block coding, which yields coding gain over optimal one-shot coding and approaches the asymptotically-achievable rate-distortion function, as well as nested lattice quantization for low complexity fixed-rate coding.
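The gap between integer rounding and lattice quantization can be seen even without a learned transform. The sketch below quantizes uniformly distributed 2D points with the integer grid and with a hexagonal lattice rescaled to the same cell area, then compares the mean squared error per dimension. It isolates only the quantizer; the hexagonal lattice, the uniform source, the sample count, and all function names are illustrative assumptions rather than the paper's setup.

```python
import math
import random

SQRT3 = math.sqrt(3.0)
SCALE = math.sqrt(2.0 / SQRT3)  # rescales the hexagonal lattice to unit cell area


def nearest_square(x, y):
    """Integer rounding: nearest point of the square lattice Z^2."""
    return float(round(x)), float(round(y))


def nearest_hex(x, y):
    """Nearest point of the lattice generated by (1, 0) and (1/2, sqrt(3)/2)."""
    # The lattice is a rectangular grid plus a copy shifted by (1/2, sqrt(3)/2).
    ax, ay = round(x), round(y / SQRT3) * SQRT3
    bx = round(x - 0.5) + 0.5
    by = round(y / SQRT3 - 0.5) * SQRT3 + SQRT3 / 2
    da = (x - ax) ** 2 + (y - ay) ** 2
    db = (x - bx) ** 2 + (y - by) ** 2
    return (ax, ay) if da <= db else (bx, by)


def nearest_hex_unit_area(x, y):
    qx, qy = nearest_hex(x / SCALE, y / SCALE)
    return qx * SCALE, qy * SCALE


def mse_per_dim(quantize, points):
    total = 0.0
    for x, y in points:
        qx, qy = quantize(x, y)
        total += (x - qx) ** 2 + (y - qy) ** 2
    return total / (2 * len(points))


rng = random.Random(0)
points = [(rng.uniform(0, 50), rng.uniform(0, 50)) for _ in range(50_000)]
square_mse = mse_per_dim(nearest_square, points)
hex_mse = mse_per_dim(nearest_hex_unit_area, points)
print(square_mse, hex_mse)
```

Both lattices have the same point density here, so the lower error of the hexagonal one comes purely from the shape of its cells; the estimates should land near the known normalized second moments of 1/12 and 5/(36*sqrt(3)).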

Regret Analysis of Policy Optimization over Submanifolds for Linearly Constrained Online LQG

Authors: Ting-Jui Chang, Shahin Shahrampour

Recent advancement in online optimization and control has provided novel tools to study online linear quadratic regulator (LQR) problems, where cost matrices are time-varying and unknown in advance. In this work, we study the online linear quadratic Gaussian (LQG) problem over the manifold of stabilizing controllers that are linearly constrained to impose physical conditions such as sparsity. By adopting a Riemannian perspective, we propose the online Newton on manifold (ONM) algorithm, which generates an online controller on-the-fly based on the second-order information of the cost function sequence. To quantify the algorithm performance, we use the notion of regret, defined as the sub-optimality of the algorithm cumulative cost against a (locally) minimizing controller sequence. We establish a regret bound in terms of the path-length of the benchmark minimizer sequence, and we further verify the effectiveness of ONM via simulations.

Frenet-Serret Frame-based Decomposition for Part Segmentation of 3D Curvilinear Structures

Authors: Leslie Gu, Jason Ken Adhinarta, Mikhail Bessmeltsev, Jiancheng Yang, Yongjie Jessica Zhang, Wenjie Yin, Daniel Berger, Jeff Lichtman, Hanspeter Pfister, Donglai Wei

Accurately segmenting 3D curvilinear structures in medical imaging remains challenging due to their complex geometry and the scarcity of diverse, large-scale datasets for algorithm development and evaluation. In this paper, we use dendritic spine segmentation as a case study and address these challenges by introducing a novel Frenet-Serret Frame-based Decomposition, which decomposes 3D curvilinear structures into a globally $C^2$ continuous curve that captures the overall shape, and a cylindrical primitive that encodes local geometric properties. This approach leverages Frenet-Serret Frames and arc length parameterization to preserve essential geometric features while reducing representational complexity, facilitating data-efficient learning, improved segmentation accuracy, and generalization on 3D curvilinear structures. To rigorously evaluate our method, we introduce two datasets: CurviSeg, a synthetic dataset for 3D curvilinear structure segmentation that validates our method's key properties, and DenSpineEM, a benchmark for dendritic spine segmentation, which comprises 4,476 manually annotated spines from 70 dendrites across three public electron microscopy datasets, covering multiple brain regions and species. Our experiments on DenSpineEM demonstrate exceptional cross-region and cross-species generalization: models trained on the mouse somatosensory cortex subset achieve 91.9% Dice, maintaining strong performance in zero-shot segmentation on both mouse visual cortex (94.1% Dice) and human frontal lobe (81.8% Dice) subsets. Moreover, we test the generalizability of our method on the IntrA dataset, where it achieves 77.08% Dice (5.29% higher than prior art) on intracranial aneurysm segmentation. These findings demonstrate the potential of our approach for accurately analyzing complex curvilinear structures across diverse medical imaging fields.

On the Capacity of Correlated Phase-Noise Channels: An Electro-Optic Frequency Comb Example

Authors: Mohammad Farsi, Hamdi Joudeh, Gabriele Liga, Alex Alvarado, Magnus Karlsson, Erik Agrell

The capacity of a discrete-time channel with correlated phase noises is investigated. In particular, the electro-optic frequency comb system is considered, where the phase noise of each subchannel is a combination of two independent Wiener phase-noise sources. Capacity upper and lower bounds are derived for this channel and are compared with lower bounds obtained by numerically evaluating the achievable information rates using quadrature amplitude modulation constellations. Capacity upper and lower bounds are provided for the high signal-to-noise ratio (SNR) regime. The multiplexing gain (pre-log) is shown to be $M-1$, where $M$ represents the number of subchannels. A constant gap between the asymptotic upper and lower bounds is observed, which depends on the number of subchannels $M$. For the specific case of $M=2$, capacity is characterized up to a term that vanishes as the SNR grows large.

Robust Instance Optimal Phase-Only Compressed Sensing

Authors: Junren Chen, Michael K. Ng, Jonathan Scarlett

Phase-only compressed sensing (PO-CS) concerns the recovery of sparse signals from the phases of complex measurements. Recent results show that sparse signals in the standard sphere $\mathbb{S}^{n-1}$ can be exactly recovered from complex Gaussian phases by a linearization procedure, which recasts PO-CS as linear compressed sensing and then applies (quadratically constrained) basis pursuit to obtain $\mathbf{x}^\sharp$. This paper focuses on the instance optimality and robustness of $\mathbf{x}^{\sharp}$. First, we strengthen the nonuniform instance optimality of Jacques and Feuillen (2021) to a uniform one over the entire signal space. We show the existence of some universal constant $C$ such that $\|\mathbf{x}^\sharp-\mathbf{x}\|_2\le Cs^{-1/2}\sigma_{\ell_1}(\mathbf{x},\Sigma^n_s)$ holds for all $\mathbf{x}$ in the unit Euclidean sphere, where $\sigma_{\ell_1}(\mathbf{x},\Sigma^n_s)$ is the $\ell_1$ distance of $\mathbf{x}$ to its closest $s$-sparse signal. This is achieved by showing the new sensing matrices corresponding to all approximately sparse signals simultaneously satisfy RIP. Second, we investigate the estimator's robustness to noise and corruption. We show that dense noise with entries bounded by some small $\tau_0$, appearing either prior or posterior to retaining the phases, increments $\|\mathbf{x}^\sharp-\mathbf{x}\|_2$ by $O(\tau_0)$. This is near-optimal (up to log factors) for any algorithm. On the other hand, adversarial corruption, which changes an arbitrary $\zeta_0$-fraction of the measurements to any phase-only values, increments $\|\mathbf{x}^\sharp-\mathbf{x}\|_2$ by $O(\sqrt{\zeta_0\log(1/\zeta_0)})$. The developments are then combined to yield a robust instance optimal guarantee that resembles the standard one in linear compressed sensing.

Re-boosting Self-Collaboration Parallel Prompt GAN for Unsupervised Image Restoration

Authors: Xin Lin, Yuyan Zhou, Jingtong Yue, Chao Ren, Kelvin C.K. Chan, Lu Qi, Ming-Hsuan Yang

Unsupervised restoration approaches based on generative adversarial networks (GANs) offer a promising solution without requiring paired datasets. Yet, these GAN-based approaches struggle to surpass the performance of conventional unsupervised GAN-based frameworks without significantly modifying model structures or increasing the computational complexity. To address these issues, we propose a self-collaboration (SC) strategy for existing restoration models. This strategy utilizes information from the previous stage as feedback to guide subsequent stages, achieving significant performance improvement without increasing the framework's inference complexity. The SC strategy comprises a prompt learning (PL) module and a restorer ($Res$). It iteratively replaces the previous less powerful fixed restorer $\overline{Res}$ in the PL module with a more powerful $Res$. The enhanced PL module generates better pseudo-degraded/clean image pairs, leading to a more powerful $Res$ for the next iteration. Our SC can significantly improve the $Res$'s performance by over 1.5 dB without adding extra parameters or computational complexity during inference. Meanwhile, existing self-ensemble (SE) and our SC strategies enhance the performance of pre-trained restorers from different perspectives. As SE increases computational complexity during inference, we propose a re-boosting module to the SC (Reb-SC) to improve the SC strategy further by incorporating SE into SC without increasing inference time. This approach further enhances the restorer's performance by approximately 0.3 dB. Extensive experimental results on restoration tasks demonstrate that the proposed model performs favorably against existing state-of-the-art unsupervised restoration methods. Source code and trained models are publicly available at: this https URL.

Screen Them All: High-Throughput Pan-Cancer Genetic and Phenotypic Biomarker Screening from H&E Whole Slide Images

Authors: Yi Kan Wang, Ludmila Tydlitatova, Jeremy D. Kunz, Gerard Oakley, Bonnie Kar Bo Chow, Ran A. Godrich, Matthew C. H. Lee, Hamed Aghdam, Alican Bozkurt, Michal Zelechowski, Chad Vanderbilt, Christopher Kanan, Juan A. Retamero, Peter Hamilton, Razik Yousfi, Thomas J. Fuchs, David S. Klimstra, Siqi Liu

Molecular assays are standard of care for detecting genomic alterations in cancer prognosis and therapy selection but are costly, tissue-destructive and time-consuming. Artificial intelligence (AI) applied to routine hematoxylin and eosin (H&E)-stained whole slide images (WSIs) offers a fast and economical alternative for screening molecular biomarkers. We introduce OmniScreen, a high-throughput AI-based system leveraging Virchow2 embeddings extracted from 60,529 cancer patients with paired 489-gene MSK-IMPACT targeted biomarker panel and WSIs. Unlike conventional approaches that train separate models for each biomarker, OmniScreen employs a unified model to predict a broad range of clinically relevant biomarkers across cancers, including low-prevalence targets impractical to model individually. OmniScreen reliably identifies therapeutic targets and shared phenotypic features across common and rare tumors. We investigate the biomarker prediction probabilities and accuracies of OmniScreen in relation to tumor area, cohort size, histologic subtype alignment, and pathway-level morphological patterns. These findings underscore the potential of OmniScreen for routine clinical screening.

LSTP-Nav: Lightweight Spatiotemporal Policy for Map-free Multi-agent Navigation with LiDAR

Authors: Xingrong Diao, Zhirui Sun, Jianwei Peng, Jiankun Wang

Safe and efficient multi-agent navigation in dynamic environments remains inherently challenging, particularly when real-time decision-making is required on resource-constrained platforms. Ensuring collision-free trajectories while adapting to uncertainties without relying on pre-built maps further complicates real-world deployment. To address these challenges, we propose LSTP-Nav, a lightweight end-to-end policy for multi-agent navigation that enables map-free collision avoidance in complex environments by directly mapping raw LiDAR point clouds to motion commands. At the core of this framework lies LSTP-Net, an efficient network that processes raw LiDAR data using a GRU architecture, enhanced with attention mechanisms to dynamically focus on critical environmental features while minimizing computational overhead. Additionally, a novel HS reward optimizes collision avoidance by incorporating angular velocity, prioritizing obstacles along the predicted heading, and enhancing training stability. To narrow the sim-to-real gap, we develop PhysReplay-Simlab, a physics-realistic multi-agent simulator that employs localized replay to mine near-failure experiences. Relying solely on LiDAR, LSTP-Nav achieves efficient zero-shot sim-to-real transfer on a CPU-only robotic platform, enabling robust navigation in dynamic environments while maintaining computation frequencies above 40 Hz. Extensive experiments demonstrate that LSTP-Nav outperforms baselines with a 9.58\% higher success rate and a 12.30\% lower collision rate, underscoring its practicality and robustness for real-world applications.

Enhancing Underwater Imaging with 4-D Light Fields: Dataset and Method

Authors: Yuji Lin, Junhui Hou, Xianqiang Lyu, Qian Zhao, Deyu Meng

In this paper, we delve into the realm of 4-D light fields (LFs) to enhance underwater imaging plagued by light absorption, scattering, and other challenges. Contrasting with conventional 2-D RGB imaging, 4-D LF imaging excels in capturing scenes from multiple perspectives, thereby indirectly embedding geometric information. This intrinsic property is anticipated to effectively address the challenges associated with underwater imaging. By leveraging both explicit and implicit depth cues present in 4-D LF images, we propose a progressive, mutually reinforcing framework for underwater 4-D LF image enhancement and depth estimation. Specifically, our framework explicitly utilizes estimated depth information alongside implicit depth-related dynamic convolutional kernels to modulate output features. The entire framework decomposes this complex task, iteratively optimizing the enhanced image and depth information to progressively achieve optimal enhancement results. More importantly, we construct the first 4-D LF-based underwater image dataset for quantitative evaluation and supervised training of learning-based methods, comprising 75 underwater scenes and 3675 high-resolution 2K pairs. To craft vibrant and varied underwater scenes, we build underwater environments with various objects and adopt several types of degradation. Through extensive experimentation, we showcase the potential and superiority of 4-D LF-based underwater imaging vis-a-vis traditional 2-D RGB-based approaches. Moreover, our method effectively corrects color bias and achieves state-of-the-art performance. The dataset and code will be publicly available at this https URL.

Enabling Advanced Land Cover Analytics: An Integrated Data Extraction Pipeline for Predictive Modeling with the Dynamic World Dataset

Authors: Victor Radermecker, Andrea Zanon, Nancy Thomas, Annita Vapsi, Saba Rahimi, Rama Ramakrishnan, Daniel Borrajo

Understanding land cover holds considerable potential for a myriad of practical applications, particularly as data accessibility transitions from being exclusive to governmental and commercial entities to now including the broader research community. Nevertheless, although the data is accessible to any community member interested in exploration, there exists a formidable learning curve and no standardized process for accessing, pre-processing, and leveraging the data for subsequent tasks. In this study, we democratize this data by presenting a flexible and efficient end-to-end pipeline for working with the Dynamic World dataset, a cutting-edge near-real-time land use/land cover (LULC) dataset. This includes a pre-processing and representation framework which tackles noise removal, efficient extraction of large amounts of data, and re-representation of LULC data in a format well suited for several downstream tasks. To demonstrate the power of our pipeline, we use it to extract data for an urbanization prediction problem and build a suite of machine learning models with excellent performance. This task is easily generalizable to the prediction of any type of land cover and our pipeline is also compatible with a series of other downstream tasks.

Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge

Authors: Ruiyang Qin, Dancheng Liu, Gelei Xu, Zheyu Yan, Chenhui Xu, Yuting Hu, Shaocong Wang, X. Sharon Hu, Jinjun Xiong, Yiyu Shi

The combination of Large Language Models (LLM) and Automatic Speech Recognition (ASR), when deployed on edge devices (called edge ASR-LLM), can serve as a powerful personalized assistant to enable audio-based interaction for users. Compared to text-based interaction, edge ASR-LLM allows accessible and natural audio interactions. Unfortunately, existing ASR-LLM models are mainly trained in high-performance computing environments and produce substantial model weights, making them difficult to deploy on edge devices. More importantly, to better serve users' personalized needs, the ASR-LLM must be able to learn from each distinct user, given that audio input often contains highly personalized characteristics that necessitate personalized on-device training. Since individually fine-tuning the ASR or LLM often leads to suboptimal results due to modality-specific limitations, end-to-end training ensures seamless integration of audio features and language understanding (cross-modal alignment), ultimately enabling a more personalized and efficient adaptation on edge devices. However, due to the complex training requirements and substantial computational demands of existing approaches, cross-modal alignment between ASR audio and LLM can be challenging on edge devices. In this work, we propose a resource-efficient cross-modal alignment framework that bridges ASR and LLMs on edge devices to handle personalized audio input. Our framework enables efficient ASR-LLM alignment on resource-constrained devices like NVIDIA Jetson Orin (8GB RAM), achieving 50x training time speedup while improving the alignment quality by more than 50\%. To the best of our knowledge, this is the first work to study efficient ASR-LLM alignment on resource-constrained edge devices.

Kernel-based Koopman approximants for control: Flexible sampling, error analysis, and stability

Authors: Lea Bold, Friedrich M. Philipp, Manuel Schaller, Karl Worthmann

Data-driven techniques for analysis, modeling, and control of complex dynamical systems are on the uptake. Koopman theory provides the theoretical foundation for the popular kernel extended dynamic mode decomposition (kEDMD). In this work, we propose a novel kEDMD scheme to approximate nonlinear control systems accompanied by an in-depth error analysis. Key features are regularization-based robustness and an adroit decomposition into micro and macro grids enabling flexible sampling. But foremost, we prove proportionality, i.e., explicit dependence on the distance to the (controlled) equilibrium, of the derived bound on the full approximation error. Leveraging this key property, we rigorously show that asymptotic stability of the data-driven surrogate (control) system implies asymptotic stability of the original (control) system and vice versa.

UAV Communications: Impact of Obstacles on Channel Characteristics

Authors: Kamal Shayegan

In recent years, Unmanned Aerial Vehicles (UAVs) have been utilized as effective platforms for carrying Wi-Fi Access Points (APs) and cellular Base Stations (BSs), enabling low-cost, agile, and flexible wireless networks with high Quality of Service (QoS). The next generation of wireless communications will rely on increasingly higher frequencies, which are easily obstructed by obstacles. One of the most critical concepts yet to be fully addressed is positioning the UAV at optimal coordinates while accounting for obstacles. To ensure a line of sight (LoS) between UAVs and user equipment (UE), improve QoS, and establish reliable wireless links with maximum coverage, obstacles must be integrated into the proposed placement algorithms. This paper introduces a simulation-based measurement approach for characterizing an air-to-ground (AG) channel in a simple scenario. By considering obstacles, we present a novel perspective on channel characterization. The results, in terms of throughput, packet delivery, packet loss, and delay, are compared using the proposed positioning approach.

AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment

Authors: Yuqin Cao, Xiongkuo Min, Yixuan Gao, Wei Sun, Guangtao Zhai

Many video-to-audio (VTA) methods have been proposed for dubbing silent AI-generated videos. An efficient quality assessment method for AI-generated audio-visual content (AGAV) is crucial for ensuring audio-visual quality. Existing audio-visual quality assessment methods struggle with unique distortions in AGAVs, such as unrealistic and inconsistent elements. To address this, we introduce AGAVQA-3k, the first large-scale AGAV quality assessment dataset, comprising $3,382$ AGAVs from $16$ VTA methods. AGAVQA-3k includes two subsets: AGAVQA-MOS, which provides multi-dimensional scores for audio quality, content consistency, and overall quality, and AGAVQA-Pair, designed for optimal AGAV pair selection. We further propose AGAV-Rater, an LMM-based model that can score AGAVs, as well as audio and music generated from text, across multiple dimensions, and select the best AGAV generated by VTA methods to present to the user. AGAV-Rater achieves state-of-the-art performance on AGAVQA-3k, Text-to-Audio, and Text-to-Music datasets. Subjective tests also confirm that AGAV-Rater enhances VTA performance and user experience. The dataset and code are available at this https URL.

RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining

Authors: Tengfei Zhang, Ziheng Zhao, Chaoyi Wu, Xiao Zhou, Ya Zhang, Yanfeng Wang, Weidi Xie

Developing advanced medical imaging retrieval systems is challenging due to the varying definitions of 'similar images' across different medical contexts. This challenge is compounded by the lack of large-scale, high-quality medical imaging retrieval datasets and benchmarks. In this paper, we propose a novel methodology that leverages dense radiology reports to define image-wise similarity ordering at multiple granularities in a scalable and fully automatic manner. Using this approach, we construct two comprehensive medical imaging retrieval datasets: MIMIC-IR for Chest X-rays and CTRATE-IR for CT scans, providing detailed image-image ranking annotations conditioned on diverse anatomical structures. Furthermore, we develop two retrieval systems, RadIR-CXR and model-ChestCT, which demonstrate superior performance in traditional image-image and image-report retrieval tasks. These systems also enable flexible, effective image retrieval conditioned on specific anatomical structures described in text, achieving state-of-the-art results on 77 out of 78 metrics.

Towards a Universal Image Degradation Model via Content-Degradation Disentanglement

Authors: Wenbo Yang, Zhongling Wang, Zhou Wang

Image degradation synthesis is highly desirable in a wide variety of applications ranging from image restoration to simulating artistic effects. Existing models are designed to generate one specific or a narrow set of degradations, which often require user-provided degradation parameters. As a result, they lack the generalizability to synthesize degradations beyond their initial design or adapt to other applications. Here we propose the first universal degradation model that can synthesize a broad spectrum of complex and realistic degradations containing both homogeneous (global) and inhomogeneous (spatially varying) components. Our model automatically extracts and disentangles homogeneous and inhomogeneous degradation features, which are later used for degradation synthesis without user intervention. A disentangle-by-compression method is proposed to separate degradation information from images. Two novel modules for extracting and incorporating inhomogeneous degradations are created to model inhomogeneous components in complex degradations. We demonstrate the model's accuracy and adaptability in film-grain simulation and blind image restoration tasks. The demo video, code, and dataset of this project will be released at this http URL.

Learning-Based Multiuser Scheduling in MIMO-OFDM Systems with Hybrid Beamforming

Authors: Pouya Agheli, Tugce Kobal, François Durand, Matthew Andrews

We investigate the multiuser scheduling problem in multiple-input multiple-output (MIMO) systems using orthogonal frequency division multiplexing (OFDM) and hybrid beamforming in which a base station (BS) communicates with multiple users over millimeter wave (mmWave) channels in the downlink. Improved scheduling is critical for enhancing spectral efficiency and the long-term performance of the system, as measured by the proportional fairness (PF) metric, in hybrid beamforming systems because of their limited multiplexing gain. Our objective is to maximize PF by properly designing the analog and digital precoders within the hybrid beamforming and selecting the users subject to the number of radio frequency (RF) chains. Leveraging the characteristics of mmWave channels, we apply a two-timescale protocol. On a long timescale, we assign an analog beam to each user. Scheduling the users and designing the digital precoder are done accordingly on a short timescale. To conduct scheduling, we propose combinatorial solutions, such as greedy and sorting algorithms, followed by a machine learning (ML) approach. Our numerical results highlight the trade-off between the performance and complexity of the proposed approaches. Consequently, we show that the choice of approach depends on the specific criteria within a given scenario.
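
To make the scheduling idea concrete, the following is a minimal sketch of the textbook proportional-fair rule applied with a sorting step: rank users by instantaneous rate divided by their running average throughput and serve as many as there are RF chains. It is not the paper's algorithm. The names `pf_select` and `update_avg`, the smoothing factor `beta`, and the toy rates are assumptions, and the sketch ignores inter-user interference and precoder design entirely.

```python
import numpy as np

def pf_select(inst_rate, avg_tput, num_rf_chains):
    """Pick the users with the largest PF metric (rate / average throughput)."""
    metric = inst_rate / np.maximum(avg_tput, 1e-9)  # guard against division by zero
    order = np.argsort(-metric, kind="stable")        # descending PF metric
    return np.sort(order[:num_rf_chains])

def update_avg(avg_tput, inst_rate, scheduled, beta=0.1):
    """Exponentially smoothed throughput; unscheduled users are served rate 0."""
    served = np.zeros_like(avg_tput)
    served[scheduled] = inst_rate[scheduled]
    return (1.0 - beta) * avg_tput + beta * served

inst_rate = np.array([4.0, 2.0, 1.0, 3.0])  # hypothetical achievable rates this slot
avg_tput = np.array([4.0, 1.0, 1.0, 1.0])   # hypothetical running averages
chosen = pf_select(inst_rate, avg_tput, num_rf_chains=2)
new_avg = update_avg(avg_tput, inst_rate, chosen)
print(chosen)   # users 1 and 3 have the largest rate-to-average ratios
print(new_avg)
```

User 0 has the highest instantaneous rate but is skipped because it has already been served well on average; that is the fairness mechanism the PF metric encodes.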

DeepGesture: A conversational gesture synthesis system based on emotions and semantics

Authors: Thanh Hoang-Minh

Along with the explosion of large language models, improvements in speech synthesis, advancements in hardware, and the evolution of computer graphics, the current bottleneck in creating digital humans lies in generating character movements that correspond naturally to text or speech inputs. In this work, we present DeepGesture, a diffusion-based gesture synthesis framework for generating expressive co-speech gestures conditioned on multimodal signals - text, speech, emotion, and seed motion. Built upon the DiffuseStyleGesture model, DeepGesture introduces novel architectural enhancements that improve semantic alignment and emotional expressiveness in generated gestures. Specifically, we integrate fast text transcriptions as semantic conditioning and implement emotion-guided classifier-free diffusion to support controllable gesture generation across affective states. To visualize results, we implement a full rendering pipeline in Unity based on BVH output from the model. Evaluation on the ZeroEGGS dataset shows that DeepGesture produces gestures with improved human-likeness and contextual appropriateness. Our system supports interpolation between emotional states and demonstrates generalization to out-of-distribution speech, including synthetic voices - marking a step forward toward fully multimodal, emotionally aware digital humans. Project page: this https URL

Robust Localization of Partially Fake Speech: Metrics, Models, and Out-of-Domain Evaluation

Authors: Hieu-Thi Luong, Inbal Rimon, Haim Permuter, Kong Aik Lee, Eng Siong Chng

Partial audio deepfake localization poses unique challenges and remains underexplored compared to full-utterance spoofing detection. While recent methods report strong in-domain performance, their real-world utility remains unclear. In this analysis, we critically examine the limitations of current evaluation practices, particularly the widespread use of Equal Error Rate (EER), which often obscures generalization and deployment readiness. We propose reframing the localization task as a sequential anomaly detection problem and advocate for the use of threshold-dependent metrics such as accuracy, precision, recall, and F1-score, which better reflect real-world behavior. Specifically, we analyze the performance of the open-source Coarse-to-Fine Proposal Refinement Framework (CFPRF), which achieves a 20-ms EER of 7.61% on the in-domain PartialSpoof evaluation set, but 43.25% and 27.59% on the LlamaPartialSpoof and Half-Truth out-of-domain test sets. Interestingly, our reproduced version of the same model performs worse on in-domain data (9.84%) but better on the out-of-domain sets (41.72% and 14.98%, respectively). This highlights the risks of over-optimizing for in-domain EER, which can lead to models that perform poorly in real-world scenarios. It also suggests that while deep learning models can be effective on in-domain data, they generalize poorly to out-of-domain scenarios, failing to detect novel synthetic samples and misclassifying unfamiliar bona fide audio. Finally, we observe that adding more bona fide or fully synthetic utterances to the training data often degrades performance, whereas adding partially fake utterances improves it.
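
The contrast between EER and threshold-dependent metrics can be illustrated with a small sketch. EER picks its own operating threshold per test set, so it is unchanged when scores merely shift, while F1 at a fixed deployed threshold is not. The helper names `eer` and `prf1`, the threshold 0.5, and the toy frame scores are assumptions for illustration and are unrelated to the CFPRF code or the datasets named above.

```python
import numpy as np

def eer(scores, labels):
    """Equal error rate: sweep thresholds, return the point where FAR and FRR meet."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, best_eer = np.inf, None
    for t in np.concatenate([np.unique(scores), [np.inf]]):
        pred = scores >= t                    # frame flagged as fake
        far = np.mean(pred[labels == 0])      # bona fide frames flagged as fake
        frr = np.mean(~pred[labels == 1])     # fake frames missed
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2.0
    return float(best_eer)

def prf1(scores, labels, threshold):
    """Precision, recall and F1 at one fixed decision threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pred = scores >= threshold
    tp = int(np.sum(pred & (labels == 1)))
    fp = int(np.sum(pred & (labels == 0)))
    fn = int(np.sum(~pred & (labels == 1)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

labels = [0, 0, 0, 0, 1, 1, 1, 1]                       # 1 = fake frame
in_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]    # hypothetical in-domain scores
out_scores = [-0.3, -0.2, -0.1, 0.2, 0.0, 0.3, 0.4, 0.5]  # same ranking, shifted down

print(eer(in_scores, labels), eer(out_scores, labels))   # identical EER
print(prf1(in_scores, labels, 0.5)[2], prf1(out_scores, labels, 0.5)[2])  # F1 drops
```

In this toy case both score sets have the same ranking and therefore the same EER, yet recall at the fixed threshold collapses on the shifted set, which is the kind of behavior a threshold-free metric cannot reveal.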

Mutual Information Bounds for Lossy Common Information

Authors: Anderson de Andrade

We show that the mutual information between the targets in a Gray-Wyner Network is a bound that separates Wyner's lossy common information and Gács-Körner lossy common information. The results are a generalization of the lossless case presented by Wyner (1975).

Hear-Your-Click: Interactive Object-Specific Video-to-Audio Generation

Authors: Yingshan Liang, Keyu Fan, Zhicheng Du, Yiran Wang, Qingyang Shi, Xinyu Zhang, Jiasheng Lu, Peiwu Qin

Video-to-audio (V2A) generation shows great potential in fields such as film production. Despite significant advances, current V2A methods relying on global video information struggle with complex scenes and generating audio tailored to specific objects. To address these limitations, we introduce Hear-Your-Click, an interactive V2A framework enabling users to generate sounds for specific objects by clicking on the frame. To achieve this, we propose Object-aware Contrastive Audio-Visual Fine-tuning (OCAV) with a Mask-guided Visual Encoder (MVE) to obtain object-level visual features aligned with audio. Furthermore, we tailor two data augmentation strategies, Random Video Stitching (RVS) and Mask-guided Loudness Modulation (MLM), to enhance the model's sensitivity to segmented objects. To measure audio-visual correspondence, we designed a new evaluation metric, the CAV score. Extensive experiments demonstrate that our framework offers more precise control and improves generation performance across various metrics. Project Page: this https URL

Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model

Authors: Koki Yamane, Yunhan Li, Masashi Konosu, Koki Inami, Junji Oaki, Sho Sakaino, Toshiaki Tsuji

In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.

Stream Function-Based Navigation for Complex Quadcopter Obstacle Avoidance

Authors: Sean Smith, Emmanuel Witrant, Ya-Jun Pan

This article presents a novel stream function-based navigational control system for obstacle avoidance, where obstacles are represented as two-dimensional (2D) rigid surfaces in inviscid, incompressible flows. The approach leverages the vortex panel method (VPM) and incorporates safety margins to control the stream function and flow properties around virtual surfaces, enabling navigation in complex, partially observed environments using real-time sensing. To address the limitations of the VPM in managing relative distance and avoiding rapidly accelerating obstacles at close proximity, the system integrates a model predictive controller (MPC) based on higher-order control barrier functions (HOCBF). This integration incorporates VPM trajectory generation, state estimation, and constraint handling into a receding-horizon optimization problem. The 2D rigid surfaces are enclosed using minimum bounding ellipses (MBEs), while an adaptive Kalman filter (AKF) captures and predicts obstacle dynamics, propagating these estimates into the MPC-HOCBF for rapid avoidance maneuvers. Evaluation is conducted using a PX4-powered Clover drone Gazebo simulator and real-time experiments involving a COEX Clover quadcopter equipped with a 360 degree LiDAR sensor.
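
As a minimal sketch of the underlying idea, consider the simplest closed-form case of inviscid, incompressible flow around an obstacle: uniform flow past a circular cylinder. This is not the vortex panel method used in the article, which handles general surfaces numerically; the function names, the radius `a`, and the free-stream speed `U` are assumptions for illustration. The obstacle boundary is the streamline where the stream function is zero, so a vehicle that follows the velocity field never crosses it.

```python
import numpy as np

def stream_function(x, y, U=1.0, a=1.0):
    """psi for uniform flow (speed U along +x) past a cylinder of radius a at the origin."""
    r2 = x * x + y * y
    return U * y * (1.0 - a * a / r2)

def velocity(x, y, U=1.0, a=1.0):
    """Velocity (u, v) = (d psi / d y, -d psi / d x), written in closed form."""
    r2 = x * x + y * y
    u = U * (1.0 - a * a * (x * x - y * y) / (r2 * r2))
    v = -2.0 * U * a * a * x * y / (r2 * r2)
    return u, v

# On the obstacle surface, psi vanishes and the flow is purely tangential.
theta = 0.7
xs, ys = np.cos(theta), np.sin(theta)   # a point on the unit cylinder
u_s, v_s = velocity(xs, ys, U=2.0, a=1.0)
print(stream_function(xs, ys, U=2.0, a=1.0))  # approximately 0
print(u_s * xs + v_s * ys)                     # normal velocity component, approximately 0

# Far from the obstacle, the field returns to the free stream.
print(velocity(100.0, 0.0, U=2.0, a=1.0))
```

Enlarging `a` beyond the physical obstacle radius is one simple way to picture a safety margin: the zero streamline, and with it every path that follows the flow, moves further from the obstacle.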

StreamUni: Achieving Streaming Speech Translation with a Unified Large Speech-Language Model

Authors: Shoutao Guo, Xiang Li, Mengge Liu, Wei Chen, Yang Feng

Streaming speech translation (StreamST) requires determining appropriate timing, known as policy, to generate translations while continuously receiving source speech inputs, balancing low latency with high translation quality. However, existing StreamST methods typically operate on sentence-level speech segments, referred to as simultaneous speech translation (SimulST). In practice, they require collaboration with segmentation models to accomplish StreamST, where the truncated speech segments constrain SimulST models to make policy decisions and generate translations based on limited contextual information. Moreover, SimulST models struggle to learn effective policies due to the complexity of speech inputs and cross-lingual generation. To address these challenges, we propose StreamUni, which achieves StreamST through a unified Large Speech-Language Model (LSLM). Specifically, StreamUni incorporates speech Chain-of-Thought (CoT) in guiding the LSLM to generate multi-stage outputs. Leveraging these multi-stage outputs, StreamUni simultaneously accomplishes speech segmentation, policy decision, and translation generation, completing StreamST without requiring massive policy-specific training. Additionally, we propose a streaming CoT training method that enhances low-latency policy decisions and generation capabilities using limited CoT data. Experiments demonstrate that our approach achieves state-of-the-art performance on StreamST tasks.

Token-based Audio Inpainting via Discrete Diffusion

Authors: Tali Dror, Iftach Shoham, Moshe Buchris, Oren Gal, Haim Permuter, Gilad Katz, Eliya Nachmani

Audio inpainting refers to the task of reconstructing missing segments in corrupted audio recordings. While prior approaches, including waveform- and spectrogram-based diffusion models, have shown promising results for short gaps, they often degrade in quality when gaps exceed 100 milliseconds (ms). In this work, we introduce a novel inpainting method based on discrete diffusion modeling, which operates over tokenized audio representations produced by a pre-trained audio tokenizer. Our approach models the generative process directly in the discrete latent space, enabling stable and semantically coherent reconstruction of missing audio. We evaluate the method on the MusicNet dataset using both objective and perceptual metrics across gap durations up to 300 ms. We further evaluate our approach on the MTG dataset, extending the gap duration to 500 ms. Experimental results demonstrate that our method achieves competitive or superior performance compared to existing baselines, particularly for longer gaps, offering a robust solution for restoring degraded musical recordings. Audio examples of our proposed method can be found at this https URL

Relevant ArXiv eess Papers - 2025-07-14

3D forest semantic segmentation using multispectral LiDAR and 3D deep learning

Authors: Narges Takhtkeshha, Lauris Bocaux, Lassi Ruoppa, Fabio Remondino, Gottfried Mandlburger, Antero Kukko, Juha Hyyppä

Conservation and decision-making regarding forest resources necessitate regular forest inventory. Light detection and ranging (LiDAR) in laser scanning systems has gained significant attention over the past two decades as a remote and non-destructive solution to streamline the labor-intensive and time-consuming procedure of forest inventory. Advanced multispectral (MS) LiDAR systems simultaneously acquire three-dimensional (3D) spatial and spectral information across multiple wavelengths of the electromagnetic spectrum. Consequently, MS-LiDAR technology enables the estimation of both the biochemical and biophysical characteristics of forests. Forest component segmentation is crucial for forest inventory. The synergistic use of spatial and spectral laser information has proven to be beneficial for achieving precise forest semantic segmentation. Thus, this study aims to investigate the potential of MS-LiDAR data, captured by the HeliALS system, providing high-density multispectral point clouds to segment forests into six components: ground, low vegetation, trunks, branches, foliage, and woody debris. Three point-wise 3D deep learning models and one machine learning model, including kernel point convolution, superpoint transformer, point transformer V3, and random forest, are implemented. Our experiments confirm the superior accuracy of the KPConv model. Additionally, various geometric and spectral feature vector scenarios are examined. The highest accuracy is achieved by feeding all three wavelengths (1550 nm, 905 nm, and 532 nm) as the initial features into the deep learning model, resulting in improvements of 33.73% and 32.35% in mean intersection over union (mIoU) and in mean accuracy (mAcc), respectively. This study highlights the excellent potential of multispectral LiDAR for improving the accuracy in fully automated forest component segmentation.

AI-Augmented Visible Light Communication: A Framework for Noise Mitigation and Secure Data Transmission

Authors: A. A. Nutfaji, Moustafa Hassan Elmallah

This paper presents a proposed AI Deep Learning model that addresses common challenges encountered in Visible Light Communication (VLC) systems. In this work, we run a Python simulation that models a basic VLC system primarily affected by Additive White Gaussian Noise (AWGN). A Deep Neural Network (DNN) is then trained to equalize the noisy received signal and improve signal integrity. The system evaluates and compares the Bit Error Rate (BER) before and after equalization to demonstrate the effectiveness of the proposed model. This paper starts by introducing the concept of visible light communication, then examines the VLC process and the challenges it faces, and then proposes our project, which helps overcome these challenges. We conclude with directions for future work, highlighting the areas most suitable for improvement.
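
As a rough illustration of the "before equalization" baseline described above, the following sketch estimates the BER of an on-off keying (OOK) link over an AWGN channel with a fixed decision threshold. The modulation, the SNR definition, the threshold, and the function name `simulate_ook_ber` are assumptions for illustration and are not taken from the paper; the DNN equalizer itself is not modelled.

```python
import math
import random


def simulate_ook_ber(snr_db, n_bits=20000, seed=0):
    """Estimate the bit error rate of OOK over AWGN with threshold detection."""
    rng = random.Random(seed)
    # Assumption: "on" level 1.0, "off" level 0.0, SNR defined as 1 / sigma^2.
    sigma = math.sqrt(1.0 / (10 ** (snr_db / 10)))
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        received = bit + rng.gauss(0.0, sigma)  # additive white Gaussian noise
        detected = 1 if received > 0.5 else 0   # midpoint decision threshold
        errors += detected != bit
    return errors / n_bits


if __name__ == "__main__":
    for snr in (0, 5, 10, 15):
        print(f"SNR {snr:>2} dB -> BER {simulate_ook_ber(snr):.4f}")
```

A learned equalizer would be evaluated by comparing its BER against this threshold-detector baseline at the same SNR values.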

A Generalized Stability Analysis Method with Dynamic Phasors for LV AC Microgrids

Authors: Bülent Dağ

Representing inductive coupling lines with conventional static phasors is the main reason why existing phasor-based simplified stability analysis methods are inadequate for microgrids with inductive coupling lines. In the literature, dynamic phasors have been proposed for the dynamic modelling of inductive lines to preserve the simplified structure of the analysis method. In this study, a generalized stability analysis method for LV AC microgrids composed of droop-controlled inverters is presented. The proposed analysis method is based on the inclusion of dynamic phasors for inductive coupling lines in the existing phasor-based stability analysis method. The results show that the stability analysis method with dynamic phasors successfully predicts the instability boundaries of LV AC microgrids.
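
The difference between the two line models can be written out for a single series RL line. The symbols below (R, L, nominal angular frequency \omega, and time-varying phasors V(t), I(t)) are notation assumed here for illustration, not taken from the paper.

```latex
% Time-domain RL line: v(t) = R\,i(t) + L\,\frac{di(t)}{dt},
% with i(t) = \operatorname{Re}\{ I(t)\, e^{j\omega t} \} and
%      v(t) = \operatorname{Re}\{ V(t)\, e^{j\omega t} \}.

% Static phasor model (line dynamics neglected):
V = (R + j\omega L)\, I

% Dynamic phasor model (line dynamics retained):
V(t) = (R + j\omega L)\, I(t) + L\, \frac{dI(t)}{dt}
```

The static model is the special case dI/dt = 0, which is why it misses the line dynamics that matter when the coupling is inductive.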

Large-Scale Processing and Validation of Grid Data for Assessing the Fair Spatial Distribution of PV Hosting Capacity

Authors: Ali Mohamed Ali, Yaser Raeisi, Plouton Grammatikos, Davide Pavanello, Pierre Roduit, Fabrizio Sossan

The integration of PV systems and increased electrification levels present significant challenges to the traditional design and operation of distribution grids. This paper presents a methodology for extracting, validating, and adapting grid data from a distribution system operator's (DSO) database to facilitate large-scale grid studies, including load flow and optimal power flow analyses. The validation process combines rule-based sanity checks and offline automated power flow analyses to ensure data consistency and detect potential errors in the grid database, allowing for their correction. As a practical application, the paper proposes a method to assess the PV hosting capacity of distribution grids, with a focus on ensuring fairness in their spatial distribution. By incorporating fairness criteria into the analyses, we quantify the costs (in terms of missed revenues from selling PV generation) associated with spatial fairness.

Energy Management for Renewable-Colocated Artificial Intelligence Data Centers

Authors: Siying Li, Lang Tong, Timothy D. Mount

We develop an energy management system (EMS) for artificial intelligence (AI) data centers with colocated renewable generation. Under a profit-maximizing framework, the EMS of a renewable-colocated data center (RCDC) co-optimizes AI workload scheduling, on-site renewable utilization, and electricity market participation. Within both wholesale and retail market participation models, the economic benefit of the RCDC operation is maximized. Empirical evaluations using real-world traces of electricity prices, data center power consumption, and renewable generation demonstrate significant profit gains from renewable and AI data center colocations.

Electricity-Aware Bid Format for Coordinated Heat and Electricity Market Clearing

Authors: Lesia Mitridati, Jalal Kazempour, Pascal Van Hentenryck

Coordination between heat and electricity markets is essential to achieve a cost-effective and efficient operation of the energy system. In the current sequential market practice, the heat market is cleared before the electricity market and has no insight into the impacts of heat dispatch on the electricity market. While preserving this sequential practice, this paper introduces an electricity-aware bid format for the coordination of heat and electricity systems. This novel market mechanism defines heat bids conditionally on the day-ahead electricity prices. Prior to clearing heat and electricity markets, the proposed bid selection mechanism selects the valid bids which minimize the heat system operating cost while anticipating heat and electricity market clearing. This mechanism is modeled as a trilevel optimization problem, which we recast as a mixed-integer linear program using a lexicographic function. We use a realistic case study based on the Danish electricity and heat system and show that the proposed bid selection mechanism yields a 4.5% reduction in the total operating cost of heat and electricity systems compared to the existing market-clearing procedure while reducing the financial losses of combined heat and power plants and heat pumps due to invalid bids by up to 20.3 million euros.

Wholesale Market Participation of DERA: Competitive DER Aggregation

Authors: Cong Chen, Ahmed S. Alahmed, Timothy D. Mount, Lang Tong

We consider the aggregation of distributed energy resources (DERs) by a profit-seeking aggregator participating directly in the wholesale market under distribution network access constraints. We propose a competitive DER aggregator (DERA) model that maximizes the DERA's profit while ensuring each aggregated customer gains no less surplus and pays no higher energy cost than under the regulated retail tariff. The DERA participates in the wholesale electricity market as virtual storage with optimized generation offers and consumption bids derived from our competitive aggregation model. Also derived are the DERA's bid curves for distribution network access and the DERA's profitability when competing with the regulated retail tariff. We show that, with the same distribution network access, the proposed DERA's wholesale market participation achieves the same welfare-maximizing outcome as when its customers participate directly in the wholesale market. Numerical studies compare the proposed DERA with existing methods in terms of customer surplus and DERA profit. We empirically evaluate how many DERAs can survive in the competition at long-run equilibrium, and assess the impacts of DER adoption levels and distribution network access on short-run market outcomes.

A Preventive-Corrective Scheme for Ensuring Power System Security During Active Wildfire Risks

Authors: Satyaprajna Sahoo, Anamitra Pal

The focus of this paper is on operating the electric power grid in a secure manner when wildfire risks are high. This is a challenging problem because of the uncertain ways in which the fires can impact the operation of the power system. To address this challenge, we propose a novel preventive-corrective coordinated decision-making scheme that quickly mitigates both static and dynamic insecurities given the risk of active wildfires in a region. The scheme utilizes a comprehensive contingency analysis tool for multi-asset outages that leverages: (i) a Feasibility Test algorithm which exhaustively desaturates overloaded cut-sets to prevent cascading line outages, and (ii) a data-driven transient stability analyzer which alleviates dynamic instabilities. This tool is then used to operate a coordinated unit commitment/optimal power flow model that is designed to adapt to varying risk levels associated with wildfires. Depending on the allowed risk, the model balances economical operation and grid robustness. The results obtained using the IEEE 118-bus system indicate that the proposed approach alleviates system vulnerabilities to wildfires while also minimizing operational cost.

Relevant ArXiv eess Papers - 2025-07-11

Multilayer GNN for Predictive Maintenance and Clustering in Power Grids

Authors: Muhammad Kazim, Harun Pirim, Chau Le, Trung Le, Om Prakash Yadav

Unplanned power outages cost the US economy over $150 billion annually, partly due to predictive maintenance (PdM) models that overlook spatial, temporal, and causal dependencies in grid failures. This study introduces a multilayer Graph Neural Network (GNN) framework to enhance PdM and enable resilience-based substation clustering. Using seven years of incident data from Oklahoma Gas & Electric (292,830 records across 347 substations), the framework integrates Graph Attention Networks (spatial), Graph Convolutional Networks (temporal), and Graph Isomorphism Networks (causal), fused through attention-weighted embeddings. Our model achieves a 30-day F1-score of 0.8935 +/- 0.0258, outperforming XGBoost and Random Forest by 3.2% and 2.7%, and single-layer GNNs by 10 to 15 percent. Removing the causal layer drops performance to 0.7354 +/- 0.0418. For resilience analysis, HDBSCAN clustering on HierarchicalRiskGNN embeddings identifies eight operational risk groups. The highest-risk cluster (Cluster 5, 44 substations) shows 388.4 incidents/year and 602.6-minute recovery time, while low-risk groups report fewer than 62 incidents/year. ANOVA (p < 0.0001) confirms significant inter-cluster separation. Our clustering outperforms K-Means and Spectral Clustering with a Silhouette Score of 0.626 and Davies-Bouldin index of 0.527. This work supports proactive grid management through improved failure prediction and risk-aware substation clustering.

Remote Renewable Energy Hubs: a Taxonomy

Authors: Victor Dachet, Antoine Dubois, Bardhyl Miftari, Raphaël Fonteneau, Damien Ernst

Serving the energy demand with renewable energy is hindered by its limited availability near load centres (i.e. places where the energy demand is high). To address this challenge, the concept of Remote Renewable Energy Hubs (RREH) emerges as a promising solution. RREHs are energy hubs located in areas with abundant renewable energy sources, such as sun in the Sahara Desert or wind in Greenland. In these hubs, renewable energy sources are used to synthesise energy molecules. To produce specific energy molecules, a tailored hub configuration must be designed, which means choosing a set of technologies that interact with each other as well as defining how they are integrated into their local environment. The plurality of technologies that may be employed in RREHs results in a large diversity of hubs. In order to characterize this diversity, we propose in this paper a taxonomy for accurately defining these hubs. This taxonomy makes it possible to better describe and compare hub designs as well as to identify new ones. Thus, it may guide policymakers and engineers in hub design, contributing to cost efficiency and/or improving local integration.

Ammonia, Methane, Hydrogen and Methanol Produced in Remote Renewable Energy Hubs: a Comparative Quantitative Analysis

Authors: Antoine Larbanois, Victor Dachet, Antoine Dubois, Raphaël Fonteneau, Damien Ernst

Remote renewable energy hubs (RREHs) for synthetic fuel production are engineering systems harvesting renewable energy where it is particularly abundant. They produce transportable synthetic fuels for export to distant load centers. This article aims to evaluate the production costs of different energy carriers, and includes a discussion of their advantages and disadvantages in terms of technical performance. To do so, we extend the study of Berger et al. (2021), which focuses on methane (CH4) as the energy carrier, and introduce three new carriers: ammonia (NH3), hydrogen (H2) and methanol (CH3OH). The four different RREHs are located in the Algerian Sahara desert and must supply the load center, Belgium, with a constant electro-fuel demand of 10 TWh per year. The modelling and optimisation of these systems are performed using the modelling language GBOML (Graph-Based Optimisation Modelling Language). Our findings reveal that the three new RREHs, each with its respective carrier (ammonia, hydrogen, and methanol), are all more cost-effective than the methane-based system. Ammonia demonstrates the most favourable cost-to-energy exported ratio.

Flying Base Stations for Offshore Wind Farm Monitoring and Control: Holistic Performance Evaluation and Optimization

Authors: Xinyi Lin, Peizheng Li, Adnan Aijaz

Ensuring reliable and low-latency communication in offshore wind farms is critical for efficient monitoring and control, yet remains challenging due to the harsh environment and lack of infrastructure. This paper investigates a flying base station (FBS) approach for wide-area monitoring and control in the UK Hornsea offshore wind farm project. By leveraging mobile, flexible FBS platforms in the remote and harsh offshore environment, the proposed system offers real-time connectivity for turbines without the need for deploying permanent infrastructure at the sea. We develop a detailed and practical end-to-end latency model accounting for five key factors: flight duration, connection establishment, turbine state information upload, computational delay, and control transmission, to provide a holistic perspective often missing in prior studies. Furthermore, we combine trajectory planning, beamforming, and resource allocation into a multi-objective optimization framework for the overall latency minimization, specifically designed for large-scale offshore wind farm deployments. Simulation results verify the effectiveness of our proposed method in minimizing latency and enhancing efficiency in FBS-assisted offshore monitoring across various power levels, while consistently outperforming baseline designs.

Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible

Authors: Samuel Chevalier, William A. Wheeler

What is the globally smallest load perturbation that renders DC-OPF infeasible? Reliably identifying such "adversarial attack" perturbations has useful applications in a variety of emerging grid-related contexts, including machine learning performance verification, cybersecurity, and operational robustness of power systems dominated by stochastic renewable energy resources. In this paper, we formulate the inherently nonconvex adversarial attack problem by applying a parameterized version of Farkas' lemma to a perturbed set of DC-OPF equations. Since the resulting formulation is very hard to globally optimize, we also propose a parameterized generation control policy which, when applied to the primal DC-OPF problem, provides solvability guarantees. Together, these nonconvex problems provide guaranteed upper and lower bounds on adversarial attack size; by combining them into a single optimization problem, we can efficiently "squeeze" these bounds towards a common global solution. We apply these methods on a range of small- to medium-sized test cases from PGLib, benchmarking our results against the best adversarial attack lower bounds provided by Gurobi 12.0's spatial Branch and Bound solver.
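
For reference, the standard (non-parameterized) form of Farkas' lemma for inequality systems is sketched below. The generic matrix A and vector b are placeholders assumed here. The paper's parameterized version and the exact way the load perturbation enters the DC-OPF constraints are not reproduced.

```latex
% Farkas' lemma (inequality form): for A \in \mathbb{R}^{m \times n} and
% b \in \mathbb{R}^{m}, exactly one of the following two systems has a solution.
\text{(i)}\quad \exists\, x \in \mathbb{R}^{n} :\; A x \le b
\qquad\text{or}\qquad
\text{(ii)}\quad \exists\, y \in \mathbb{R}^{m} :\; y \ge 0,\;\; A^{\top} y = 0,\;\; b^{\top} y < 0 .
```

Read this way, a vector y satisfying (ii) is a certificate that the linear feasibility problem (i) has no solution, which is the kind of certificate an infeasibility-seeking attack formulation can search for.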

Impact Assessment of Cyberattacks in Inverter-Based Microgrids

Authors: Kerd Topallaj, Colin McKerrell, Suraj Ramanathan, Ioannis Zografopoulos

In recent years, the evolution of modern power grids has been driven by the growing integration of remotely controlled grid assets. Although Distributed Energy Resources (DERs) and Inverter-Based Resources (IBRs) enhance operational efficiency, they also introduce cybersecurity risks. The remote accessibility of such critical grid components creates entry points for attacks that adversaries could exploit, posing threats to the stability of the system. To evaluate the resilience of energy systems under such threats, this study employs real-time simulation and a modified version of the IEEE 39-bus system that incorporates a Microgrid (MG) with solar-based IBR. The study assesses the impact of remote attacks on MG stability under different levels of IBR penetration through hardware-in-the-loop (HIL) simulations. Specifically, we analyze voltage, current, and frequency profiles before, during, and after cyberattack-induced disruptions. The results demonstrate that real-time HIL testing is a practical approach to uncover potential risks and develop robust mitigation strategies for resilient MG operations.

Sensing Rate Optimization for Multi-Band Cooperative ISAC Systems

Authors: Nemanja Stefan Perović, Mark F. Flanagan, Le-Nam Tran

Integrated sensing and communication (ISAC) has been recognized as one of the key technologies for future wireless networks, which potentially need to operate in multiple frequency bands to satisfy ever-increasing demands for both communication and sensing services. Motivated by this, we consider the sum sensing rate (SR) optimization for a cooperative ISAC system with linear precoding, where each base station (BS) works in a different frequency band. With this aim, we propose an optimization algorithm based on the semi-definite rank relaxation that introduces covariance matrices as optimization variables, and we apply the inner approximation (IA) method to deal with the nonconvexity of the resulting problem. Simulation results show that the proposed algorithm increases the SR by approximately 25% and 40% compared to the case of equal power distribution in a cooperative ISAC system with two and three BSs, respectively. Additionally, the algorithm converges in only a few iterations, while its most beneficial implementation scenario is in the low power regime.

Radiation Footprint Control in Cell-Free Cooperative ISAC: Optimal Joint BS Activation and Beamforming Coordination

Authors: Jie Chen, Xianbin Wang

Coordinated beamforming across distributed base stations (BSs) in cell-free wireless infrastructure can efficiently support integrated sensing and communication (ISAC) users by enhancing resource sharing and suppressing interference in the spatial domain. However, intensive coordination among distributed BSs within the ISAC-enabled network poses risks of generating substantial interference to other coexisting networks sharing the same spectrum, while also incurring elevated costs from energy consumption and signaling exchange. To address these challenges, this paper develops an interference-suppressed and cost-efficient cell-free ISAC network, which opportunistically and cooperatively orchestrates distributed radio resources to accommodate the competing demands of sensing and communication (S&C) services. Specifically, we conceive a radiation footprint control mechanism that autonomously suppresses interference across the entire signal propagation space to safeguard other networks without exchanging channel knowledge signaling. Then, we propose joint BS activation and beamforming coordination to dynamically activate appropriate BSs and orchestrate their spatial beams for service provisioning. Building upon this framework, we formulate a cost-efficient utility maximization problem that considers individual S&C demands and location-dependent radiation footprint constraints. Since this results in a non-convex optimization problem, we develop a monotonic optimization embedded branch-and-bound (MO-BRB) algorithm to find the optimal solution. Additionally, we apply a low-complexity iterative method to obtain near-optimal solutions. Finally, simulation results validate the effectiveness of the proposed algorithms.

Relevant ArXiv eess Papers - 2025-07-10

Voltage Regulation in Distribution Systems with Data Center Loads

Authors: Yize Chen, Baosen Zhang

The recent boom in foundation models and AI computing has raised growing concerns about the power and energy trajectories of large-scale data centers. This paper focuses on the voltage issues caused by the volatility and intensity of data center power demand, which also aligns with recent observations of more frequent voltage disturbances in power grids. To address these data center integration challenges, we propose a dynamic voltage control scheme that harnesses data centers' load regulation capabilities. By taking local voltage measurements and adjusting power injections at each data center bus through the dynamic voltage and frequency scaling (DVFS) scheme, we are able to maintain safe voltage magnitudes in a distributed fashion with higher data center computing load. Simulations using real large language model (LLM) inference load validate the effectiveness of our proposed mechanism. Both the LLM power data and the proposed control scheme are open sourced.

Effects of Net Metering Policies on Distributed Energy Resource Valuation and Operation

Authors: Lane D. Smith, Daniel S. Kirschen

Net energy metering has been a successful policy for increasing solar generation installations and reducing the costs of photovoltaic arrays for consumers. However, increased maturity of solar technologies and concerns over cost shifts created by net energy metering have recently caused the policy to change its incentives. What once favored behind-the-meter solar generation now is focused on compensating flexible operation. This paper explores the impacts that different net energy metering policies have on commercial consumers with various distributed energy resources. We show that the newest iteration of net energy metering is less beneficial for consumers with only solar generation and instead favors those that pair energy storage with solar. Though shiftable flexible demand offers consumers the ability to operate flexibly, the export prices offered by the latest net energy metering policy provide limited value to flexible demand.

Coordinated Fast Frequency Regulation in Dynamic Virtual Power Plants via Disturbance Estimation

Authors: Saif Ahmad, Seifeddine Ben Elghali, Hafiz Ahmed

In the context of dynamic virtual power plants (DVPPs), the integration of frequency containment reserve (FCR) and fast frequency control (FFC) enabled via local compensation of power imbalance represents a significant advancement in decentralized frequency regulation. However, they still have to cope with the limited power and energy capacities associated with commonly available storage solutions. This work combines a disturbance estimation based decentralized local control with distributed imbalance compensation in the event of local shortfall. The layered architecture facilitates fast local corrections in power setpoints while enabling coordination between neighbouring DVPP nodes to leverage the aggregated capacity, ensuring scalable and efficient operation suitable for renewable-heavy future grids. The proposed approach is validated on an illustrative 4-bus system with a high percentage of renewables.

Techno-economic analysis of decarbonized backup power systems using scenario-based stochastic optimization

Authors: Jonas Schweiger, Ruaridh Macdonald

In the context of growing concerns about power disruptions, grid reliability and the need for decarbonization, this study evaluates a broad range of clean backup power systems (BPSs) to replace traditional emergency diesel generators. A scenario-based stochastic optimization framework using actual load profiles and outage probabilities is proposed to assess the most promising options from a pool of 27 technologies. This framework allows a comparison of cost-effectiveness and environmental impact of individual technologies and hybrid BPSs across various scenarios. The results highlight the trade-off between total annual system cost and emissions. Significant emission reductions can be achieved at moderate cost increases but deep decarbonization levels incur higher costs. Primary and secondary batteries are included in optimal clean fuel-based systems across all decarbonization levels, combining cost-effective power delivery and long-term storage benefits. The findings highlight the often-overlooked importance of fuel replacement on both emissions and costs. Among the assessed technologies, ammonia generators and hydrogen fuel cells combined with secondary iron-air batteries emerge as cost-effective solutions for achieving decarbonization goals. To ensure a broad range of applicability, the study outlines the impact of emergency fuel purchases, varying demand patterns and demand response options on the optimal BPS. The research findings are valuable for optimizing the design of clean BPSs to economically meet the needs of many facility types and decarbonization targets.

Optimisation of Electrolyser Operation: Integrating External Heat

Authors: Matthias Derez, Alexander Hoogsteyn, Erik Delarue

Integrating external heat into electrolysers can reduce the electrical power demand for carbon-neutral hydrogen production. Efficient operation requires detailed models that incorporate heat availability and its effect on startup costs. This paper advances existing operational models by endogenously modelling startup costs and direct heat integration, based on a piecewise linear approximation of the electrochemical equations. We analyse the impact of low- and high-temperature heat integration on the efficiency and profitability of hydrogen production for solid oxide and proton exchange membrane electrolysis technologies.
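
A piecewise linear approximation of a nonlinear curve can be sketched as below. The curve used in the usage example (a simple quadratic) and the helper names `build_breakpoints` and `pwl_eval` are hypothetical stand-ins, not the electrochemical equations or the formulation used in the paper. In an optimization model, the same breakpoints would typically be encoded with additional variables and constraints rather than evaluated directly.

```python
from bisect import bisect_right


def build_breakpoints(f, lo, hi, n_segments):
    """Sample f at equally spaced breakpoints on [lo, hi]."""
    xs = [lo + (hi - lo) * k / n_segments for k in range(n_segments + 1)]
    return xs, [f(x) for x in xs]


def pwl_eval(xs, ys, x):
    """Evaluate the piecewise linear interpolant defined by (xs, ys) at x."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the approximated interval")
    # Index of the segment containing x; the last breakpoint maps to the last segment.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


if __name__ == "__main__":
    xs, ys = build_breakpoints(lambda x: x * x, 0.0, 1.0, 4)
    for x in (0.125, 0.5, 0.9):
        print(x, pwl_eval(xs, ys, x), x * x)
```

Adding segments reduces the approximation error at the cost of a larger model, which is the usual trade-off when such approximations are embedded in a mixed-integer program.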

Wireless Energy Transfer Beamforming Optimization for Intelligent Transmitting Surface

Authors: Osmel Martínez Rosabal, Onel Alcaraz López, Victoria Dala Pegorara Souto, Richard Demo Souza, Samuel Montejo-Sánchez, Robert Schober, Hirley Alves

Radio frequency (RF) wireless energy transfer (WET) is a promising technology for powering the growing ecosystem of Internet of Things (IoT) devices using power beacons (PBs). Recent research focuses on designing efficient PB architectures that can support numerous antennas. In this context, PBs equipped with intelligent surfaces present a promising approach, enabling physically large, reconfigurable arrays. Motivated by these advantages, this work aims to minimize the power consumption of a PB equipped with a passive intelligent transmitting surface (ITS) and a collocated digital beamforming-based feeder to charge multiple single-antenna devices. To model the PB's power consumption accurately, we consider power amplifier nonlinearities, ITS control power, and feeder-to-ITS air interface losses. The resulting optimization problem is highly nonlinear and nonconvex due to the high-power amplifier (HPA), the received power constraints at the devices, and the unit-modulus constraint imposed by the phase shifter configuration of the ITS. To tackle this issue, we apply successive convex approximation (SCA) to iteratively solve convex subproblems that jointly optimize the digital precoder and phase configuration. Given SCA's sensitivity to initialization, we propose an algorithm that ensures initialization feasibility while balancing convergence speed and solution quality. We compare the proposed ITS-equipped PB's power consumption against benchmark architectures featuring digital and hybrid analog-digital beamforming. Results demonstrate that the proposed architecture efficiently scales with the number of RF chains and ITS elements. We also show that nonuniform ITS power distribution influences beamforming and can shift a device between near- and far-field regions, even with a constant aperture.

Heterogeneous Graph Neural Networks for Short-term State Forecasting in Power Systems across Domains and Time Scales: A Hydroelectric Power Plant Case Study

Authors: Raffael Theiler, Olga Fink

Accurate short-term state forecasting is essential for efficient and stable operation of modern power systems, especially in the context of increasing variability introduced by renewable and distributed energy resources. As these systems evolve rapidly, it becomes increasingly important to reliably predict their states in the short term to ensure operational stability, support control decisions, and enable interpretable monitoring of sensor and machine behavior. Modern power systems often span multiple physical domains - including electrical, mechanical, hydraulic, and thermal - posing significant challenges for modeling and prediction. Graph Neural Networks (GNNs) have emerged as a promising data-driven framework for system state estimation and state forecasting in such settings. By leveraging the topological structure of sensor networks, GNNs can implicitly learn inter-sensor relationships and propagate information across the network. However, most existing GNN-based methods are designed under the assumption of homogeneous sensor relationships and are typically constrained to a single physical domain. This limitation restricts their ability to integrate and reason over heterogeneous sensor data commonly encountered in real-world energy systems, such as those used in energy conversion infrastructure. In this work, we propose the use of Heterogeneous Graph Attention Networks to address these limitations. Our approach models both homogeneous intra-domain and heterogeneous inter-domain relationships among sensor data from two distinct physical domains - hydraulic and electrical - which exhibit fundamentally different temporal dynamics. Experimental results demonstrate that our method significantly outperforms conventional baselines on average by 35.5% in terms of normalized root mean square error, confirming its effectiveness in multi-domain, multi-rate power system state forecasting.
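
The reported metric can be illustrated with a small sketch. Normalization conventions for RMSE vary; the version below divides by the range of the observed values, which is an assumption made here because the abstract does not state the convention used. The function names are likewise hypothetical.

```python
import math


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the observed values (assumed convention)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    span = max(y_true) - min(y_true)
    if span == 0:
        raise ValueError("observed values have zero range")
    return math.sqrt(mse) / span


def relative_improvement(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline."""
    return (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    observed = [0.0, 1.0, 2.0, 3.0, 4.0]
    forecast = [0.0, 1.0, 2.0, 3.0, 5.0]
    print(nrmse(observed, forecast))
```

Under this reading, a 35.5% improvement would correspond to `relative_improvement` returning 0.355 when averaged over the baselines.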

Manifolds in Power Systems Optimization

Authors: Lucca Rodrigues Pinto, Wilson de Souza Junior, Jaime Laelson Jacob, Luis Alfonso Gallego Pareja, Taufik Abrão

Manifold optimization (MO) is a powerful mathematical framework that can be applied to solving complex optimization problems with objective functions (OFs) and constraints on complex geometric structures, which is particularly useful in advanced power systems. We explore the application of MO techniques, which offer a robust framework for solving complex, non-convex optimization problems in electrical power distribution systems (EPDS) and electrical power transmission systems (EPTS), particularly for power flow analysis. This paper introduces the principles of MO and demonstrates its advantages over conventional methods by applying it to power flow optimization. For EPDS, a cost function derived from a backward-forward sweep (BFS) algorithm is optimized using the Manopt toolbox, yielding high accuracy and competitive computational times on 14-bus, 33-bus, and 69-bus systems when compared to established solvers. Similarly, for EPTS, MO applied via Manopt to 3-bus and 4-bus systems effectively solves power flow equations, matching traditional methods such as Newton-Raphson in performance. The study highlights that tools such as Manopt can mitigate implementation complexities, positioning MO as an efficient and accessible tool for power system analysis and potentially broader planning applications. The paper provides a comprehensive tutorial on MO, detailing its theoretical foundations, practical methodologies, and specific applications in power systems, particularly in power flow optimization.
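
For readers unfamiliar with the backward-forward sweep mentioned above, the following is a minimal sketch of the classical BFS iteration on a single radial feeder (a chain of buses fed from one source). It does not reproduce the paper's cost function or its Manopt-based optimization. The per-unit values, constant-power load model, and the function name `bfs_power_flow` are assumptions for illustration.

```python
def bfs_power_flow(v_source, line_z, loads, tol=1e-10, max_iter=100):
    """Classical backward-forward sweep on a chain feeder.

    line_z[k] is the impedance of the line feeding bus k, and loads[k] is the
    complex power drawn at bus k (all in per unit). Returns bus voltages.
    """
    n = len(loads)
    v = [v_source] * n  # flat start
    for _ in range(max_iter):
        # Load currents from constant-power loads: I = conj(S / V).
        inj = [(s / vk).conjugate() for s, vk in zip(loads, v)]
        # Backward sweep: accumulate branch currents from the feeder end.
        branch = [0j] * n
        downstream = 0j
        for k in reversed(range(n)):
            downstream += inj[k]
            branch[k] = downstream
        # Forward sweep: update voltages from the source outward.
        new_v = []
        upstream = v_source
        for k in range(n):
            upstream = upstream - line_z[k] * branch[k]
            new_v.append(upstream)
        delta = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if delta < tol:
            return v
    raise RuntimeError("backward-forward sweep did not converge")


if __name__ == "__main__":
    z = [0.01 + 0.02j, 0.01 + 0.02j]
    s = [0.1 + 0.05j, 0.1 + 0.05j]
    for k, vk in enumerate(bfs_power_flow(1 + 0j, z, s), start=1):
        print(f"bus {k}: |V| = {abs(vk):.5f}")
```

A mismatch between the voltages before and after one sweep is one natural quantity to turn into a cost function, though the specific cost used in the paper is not given in the abstract.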

Distributionally Robust Joint Chance-Constrained Optimization for Electricity Imbalance: Integrating Renewables and Storage

Authors: Amir Noori, Babak Tavassoli, Alireza Fereidunian

Integrating Distributed Energy Resources (DERs) with peer-to-peer (P2P) energy trading offers promising solutions for grid modernization by incentivizing prosumers to participate in mitigating peak demand. However, this integration also introduces operational uncertainties and computational challenges. This paper aims to address these challenges with a novel scalable and tractable distributionally robust joint chance-constrained (DRJCC) optimization framework that effectively facilitates P2P energy trading by enhancing flexibility provision from large-scale DER operations under uncertain supply and demand. Therefore, a practical framework is proposed to solve the core challenges of DRJCC by integrating three key components: (1) a Wasserstein ambiguity set that effectively quantifies uncertainty with sparse data, (2) a CVaR-based approximation of joint chance constraints to balance computational efficiency with risk control, and (3) a privacy-preserving ADMM algorithm that enables distributed implementation through decomposition. To discern patterns in the data that indicate collaboration potential and adjust ambiguity sets for improved efficiency, K-means clustering is applied to historical scenarios. Simulation results show that the proposed framework reduces peak demand by approximately 28% and total community costs by around 31%, underscoring its effectiveness in enhancing grid robustness, operational reliability, and economic optimization in renewable-based energy management.
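
To illustrate the CVaR idea behind component (2), here is a minimal sketch that evaluates the empirical conditional value-at-risk of a set of equally weighted scenario losses using the Rockafellar-Uryasev formula. It is not the paper's formulation: the function name `empirical_cvar` and the sample values are hypothetical, and the Wasserstein ambiguity set is not modeled. The relevance is that requiring the CVaR of a constraint function at level 1 - epsilon to be non-positive is a standard conservative surrogate for a chance constraint with violation probability epsilon.

```python
import math


def empirical_cvar(losses, alpha):
    """CVaR at confidence level alpha for equally weighted scenario losses."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)
    ordered = sorted(losses)
    # Value-at-risk: smallest loss whose empirical CDF reaches alpha.
    k = max(math.ceil(alpha * n) - 1, 0)
    var = ordered[k]
    # Rockafellar-Uryasev: CVaR = VaR + E[(loss - VaR)^+] / (1 - alpha).
    excess = sum(max(x - var, 0.0) for x in losses) / n
    return var + excess / (1.0 - alpha)


# Hypothetical constraint-violation samples; CVaR <= 0 would certify the
# sampled chance constraint conservatively.
samples = [-3.0, -2.5, -2.0, -1.0, -0.5]
is_safe = empirical_cvar(samples, 0.8) <= 0.0
```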

Relevant ArXiv eess Papers - 2025-07-09

Dual-Attention U-Net++ with Class-Specific Ensembles and Bayesian Hyperparameter Optimization for Precise Wound and Scale Marker Segmentation

Authors: Daniel Cieślak, Miriam Reca, Olena Onyshchenko, Jacek Rumiński

Accurate segmentation of wounds and scale markers in clinical images remains a significant challenge, crucial for effective wound management and automated assessment. In this study, we propose a novel dual-attention U-Net++ architecture, integrating channel-wise (SCSE) and spatial attention mechanisms to address severe class imbalance and variability in medical images. Extensive benchmarking across diverse architectures and encoders via 5-fold cross-validation identified EfficientNet-B7 as the optimal encoder. We independently trained two class-specific models with tailored preprocessing, extensive data augmentation, and Bayesian hyperparameter tuning (WandB sweeps). The final model ensemble utilized Test Time Augmentation to further enhance prediction reliability. Our approach was evaluated on a benchmark dataset from the NBC 2025 & PCBBE 2025 competition. Segmentation performance was quantified using a weighted F1-score (75% wounds, 25% scale markers), calculated externally by competition organizers on undisclosed hardware. The proposed approach achieved an F1-score of 0.8640, underscoring its effectiveness for complex medical segmentation tasks.
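
The weighted F1-score described above can be sketched as a small calculation. This assumes the competition metric is a linear combination of per-class F1 values with the stated 75%/25% weights; the exact aggregation used by the organizers is not given in the abstract, and the function names are hypothetical.

```python
def f1_score(tp, fp, fn):
    """Per-class F1 from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(f1_wound, f1_marker, w_wound=0.75, w_marker=0.25):
    """Assumed aggregation: weighted sum of the two class-wise F1 scores."""
    return w_wound * f1_wound + w_marker * f1_marker
```

With these weights, an error on the wound class costs three times as much as the same error on the scale-marker class, which is one plausible motivation for training class-specific models.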

Low voltage user phase reconfiguration as a planning problem

Authors: Sari Kerckhove, Marta Vanin, Reinhilde D'hulst, Dirk Van Hertem

Considerable levels of phase imbalance in low voltage (LV) distribution networks imply that grid assets are suboptimally utilized and can cause additional losses, equipment failure and degradation. With the ongoing energy transition, the installation of additional single-phase distributed energy resources may further increase the phase imbalance if no countermeasures are taken. Phase reconfiguration is a cost-effective solution to reduce imbalance. However, dynamic reconfiguration, through real-time phase swapping of loads using remotely controlled switches, is often impractical because these switches are too costly for widespread installation at LV users. Approaching phase reconfiguration as a planning problem, i.e. static reconfiguration, is an underaddressed but promising alternative. Effective static approaches that allow appropriate imbalance objectives are currently lacking. This paper presents reliable and expressive static phase reconfiguration methods that grid operators can easily integrate into routine maintenance for effective phase balancing. We present and compare three static methods: an exact mixed-integer nonlinear formulation (MINLP), a mixed-integer quadratic approximation (MIQP), and a genetic algorithm (GA), each supporting different imbalance objectives. The MIQP approach, despite using proxy objectives, efficiently mitigates the different types of imbalance considered, and outperforms both MINLP and GA in scalability and consistency.

Optimal Placement of Smart Hybrid Transformers in Distribution Networks

Authors: Samuel Hayward, Martin Doff-Sotta, Michael Merlin, Matthew Williams, Thomas Morstyn

Hybrid transformers are a relatively new technology that combines conventional power transformers with power electronics to provide voltage and reactive power control capabilities in distribution networks. This paper proposes a novel method of determining the optimal location and utilisation of hybrid transformers in 3-phase distribution networks to maximise the net present value of hybrid transformers based on their ability to increase the export of power produced by distributed generators over their operational lifespan. This has been accomplished through sequential linear programming, a key feature of which is the consideration of nonlinear characteristics and constraints relating to hybrid transformer power electronics and control capabilities. Test cases were carried out in a modified version of the Cigre European Low Voltage Distribution Network Benchmark, which has been extended by connecting it with two additional low voltage distribution test networks. All test case results demonstrate that the installation and utilisation of hybrid transformers can improve the income earned from exporting excess active power, justifying their installation cost (with the highest net present value being £6.56 million, resulting from a 45.53 percent increase in estimated annual profits due to coordinated HT compensation).

Robust Power System State Estimation using Physics-Informed Neural Networks

Authors: Solon Falas, Markos Asprou, Charalambos Konstantinou, Maria K. Michael

Modern power systems face significant challenges in state estimation and real-time monitoring, particularly regarding response speed and accuracy under faulty conditions or cyber-attacks. This paper proposes a hybrid approach using physics-informed neural networks (PINNs) to enhance the accuracy and robustness of power system state estimation. By embedding physical laws into the neural network architecture, PINNs improve estimation accuracy for transmission grid applications under both normal and faulty conditions, while also showing potential in addressing security concerns such as data manipulation attacks. Experimental results show that the proposed approach outperforms traditional machine learning models, achieving up to 83% higher accuracy on unseen subsets of the training dataset and 65% better performance on entirely new, unrelated datasets. Experiments also show that during a data manipulation attack against a critical bus in a system, the PINN can be up to 93% more accurate than an equivalent neural network.

Discrete-Time CRLB-based Power Allocation for CF MIMO-ISAC with Joint Localization and Velocity Sensing

Authors: Guoqing Xia, Pei Xiao, Qu Luo, Bing Ji, Yue Zhang, Huiyu Zhou

In this paper, we investigate integrated sensing and communication (ISAC) in a cell-free (CF) multiple-input multiple-output (MIMO) network, where each access point functions either as an ISAC transmitter or as a sensing receiver. We delve into the ISAC sensing metric using the discrete-time signal-based Cramer-Rao lower bounds (CRLBs) for joint location and velocity estimation under arbitrary power allocation ratios and the deterministic radar cross section (RCS) assumption. Then, we consider the power allocation optimization problem for the CF MIMO-ISAC as the maximization of the communication signal-to-interference-plus-noise ratio (SINR), subject to CRLB-based sensing constraints and per-transmitter power limits. To solve the resulting nonlinear and non-convex problem, we propose a penalty function and projection-based modified conjugate gradient algorithm with inexact line search (PP-MCG-ILS), and an alternative method based on a modified steepest descent approach (PP-MSD-ILS). We show that the proposed algorithms are scalable and can be extended to a broad class of optimization problems involving nonlinear inequality constraints and affine equality constraints. In addition, we extend the PP-MCG-ILS algorithm to the pure sensing scenario, where a penalty function-based normalized conjugate gradient algorithm (P-NCG-ILS) is developed for sensing power minimization. Finally, we analyze the convergence behavior and qualitatively compare the computational complexity of the proposed algorithms. Simulation results confirm the accuracy of the derived CRLBs and demonstrate the effectiveness of the proposed power allocation strategies in enhancing both sensing and overall ISAC performance.

Relevant ArXiv eess Papers - 2025-07-08

Neural Substitute Solver for Efficient Edge Inference of Power Electronic Hybrid Dynamics

Authors: Jialin Zheng, Haoyu Wang, Yangbin Zeng, Han Xu, Di Mou, Hong Li, Sergio Vazquez, Leopoldo G. Franquelo

Advancing the dynamics inference of power electronic systems (PES) to the real-time edge-side holds transformative potential for testing, control, and monitoring. However, efficiently inferring the inherent hybrid continuous-discrete dynamics on resource-constrained edge hardware remains a significant challenge. This letter proposes a neural substitute solver (NSS) approach, which is a neural-network-based framework aimed at rapid, accurate inference with significantly reduced computational costs. Specifically, NSS leverages lightweight neural networks to substitute time-consuming matrix operations and high-order numerical integration steps in traditional solvers, which transforms sequential bottlenecks into highly parallel operations suitable for edge hardware. Experimental validation on a multi-stage DC-DC converter demonstrates that NSS achieves 23x speedup and 60% hardware resource reduction compared to traditional solvers, paving the way for deploying edge inference of high-fidelity PES dynamics.

A Hybrid Mean Field Framework for Aggregators Participating in Wholesale Electricity Markets

Authors: Jun He, Andrew L. Liu

The rapid growth of distributed energy resources (DERs), including rooftop solar and energy storage, is transforming the grid edge, where distributed technologies and customer-side systems increasingly interact with the broader power grid. DER aggregators, entities that coordinate and optimize the actions of many small-scale DERs, play a key role in this transformation. This paper presents a hybrid Mean-Field Control (MFC) and Mean-Field Game (MFG) framework for integrating DER aggregators into wholesale electricity markets. Unlike traditional approaches that treat market prices as exogenous, our model captures the feedback between aggregators' strategies and locational marginal prices (LMPs) of electricity. The MFC component optimizes DER operations within each aggregator, while the MFG models strategic interactions among multiple aggregators. To account for various uncertainties, we incorporate reinforcement learning (RL), which allows aggregators to learn optimal bidding strategies in dynamic market conditions. We prove the existence and uniqueness of a mean-field equilibrium and validate the framework through a case study of the Oahu Island power system. Results show that our approach reduces price volatility and improves market efficiency, offering a scalable and decentralized solution for DER integration in wholesale markets.

On Decision-Dependent Uncertainties in Power Systems with High-Share Renewables

Authors: Yunfan Zhang, Yifan Su, Feng Liu

The continuously increasing renewable energy sources (RES) and demand response (DR) are becoming important sources of system flexibility. As a consequence, decision-dependent uncertainties (DDUs), interchangeably referred to as endogenous uncertainties, impose new characteristics on power system dispatch. The DDUs faced by system operators originate from uncertain dispatchable resources such as RES units or DR, while reserve providers encounter DDUs arising from the uncertain reserve deployment. This paper presents a systematic framework for addressing robust dispatch problems with DDUs. The main contributions include i) the robust characterization of DDUs with a dependency decomposition structure; ii) a generic DDU coping mechanism, manifested as the bilateral matching between uncertainty and flexibility; iii) analyses of the influence of DDU incorporation on the convexity/non-convexity of robust dispatch problems; and iv) generic solution algorithms adaptive for DDUs. Under this framework, the inherent distinctions and correlations between DDUs and decision-independent uncertainties (DIUs) are revealed, providing a fundamental theoretical basis for the economic and reliable operation of RES-dominated power systems. Applications in the source and demand sides illustrate the importance of considering DDUs and verify the effectiveness of proposed algorithms for robust dispatch with DDUs.

Optimal Sizing and Control of a Grid-Connected Battery in a Stacked Revenue Model Including an Energy Community

Authors: Tudor Octavian Pocola, Valentin Robu, Jip Rietveld, Sonam Norbu, Benoit Couraud, Merlinda Andoni, David Flynn, H. Vincent Poor

Recent years have seen rapid increases in intermittent renewable generation, requiring novel battery energy storage systems (BESS) solutions. One recent trend is the emergence of large grid-connected batteries that can be controlled to provide multiple storage and flexibility services, using a stacked revenue model. Another emerging development is renewable energy communities (REC), in which prosumers invest in their own renewable generation capacity but also require battery storage for flexibility. In this paper, we study settings in which energy communities rent battery capacity from a battery operator through a battery-as-a-service (BaaS) model. First, we present a methodology for determining the sizing and pricing of battery capacity that can be rented, such that it provides economic benefits to both the community and the battery operator that participates in the energy market. We examine how sizes and prices vary across a number of different scenarios for different types of tariffs (flat, dynamic) and competing energy market uses. Second, we conduct a systematic study of linear optimization models for battery control when deployed to provide flexibility to energy communities. We show that existing approaches for battery control with daily time windows have a number of important limitations in practical deployments, and we propose a number of regularization functions in the optimization to address them. Finally, we investigate the proposed method using real generation, demand, tariffs, and battery data, based on a practical case study from a large battery operator in the Netherlands. For the settings in our case study, we find that a community of 200 houses with a 330 kW wind turbine can save up to 12,874 euros per year by renting just 280 kWh of battery capacity (after subtracting battery rental costs), with the methodology applicable to a wide variety of settings and tariff types.

Multi-Objective Nonlinear Power Split Control For BESS With Real-Time Simulation Feedback

Authors: Vivek Teja Tanjavooru, Prashant Pant, Thomas Hamacher, Holger Hesse

This paper presents a mixed-integer, nonlinear, multi-objective optimization strategy for optimal power allocation among parallel strings in Battery Energy Storage Systems (BESS). High-fidelity control is achieved by co-simulating the optimizer with a BESS electro-thermal simulation that models spatial thermal dynamics of the battery, providing real-time State of Charge (SOC) and temperature feedback. The optimizer prioritizes reliability by enforcing power availability as a hard constraint and penalizing battery thermal derating. Within these bounds, the controller performs a Pareto sweep on the relative weights of inverter and battery losses to balance the trade-off between inverter efficiency and battery efficiency. The inverter loss model is based on an empirical lookup table (LUT) derived from a commercial inverter system, while the battery thermal loss model uses SOC and temperature-dependent internal resistance, with electric current computed from the battery Equivalent Circuit Model (ECM). When the optimization was applied to a two-string BESS, the competing effects of inverter and battery losses on system availability and thermal derating were observed. The balanced operation yielded improvements of 1% in battery efficiency, 1.5% in inverter efficiency, and 2% in derating efficiency, while maintaining higher availability. Additionally, a 5 degrees C reduction in BESS peak temperature also suggests reduced thermal stress without compromising availability.
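
The Pareto sweep over the relative weights of inverter and battery losses can be sketched with a weighted-sum scan over candidate power splits for a two-string system. This is only an illustration: the loss models below are hypothetical stand-ins (the paper uses an empirical lookup table and an equivalent circuit model), and a plain grid search replaces the paper's mixed-integer nonlinear optimizer.

```python
def inverter_loss(p_kw):
    """Hypothetical inverter loss: standby cost when active plus a linear term."""
    return 0.0 if p_kw == 0 else 0.5 + 0.02 * p_kw


def battery_loss(p_kw, r=0.001):
    """Hypothetical resistive battery loss, quadratic in string power."""
    return r * p_kw ** 2


def pareto_sweep(total_kw, weights, steps=20):
    """For each weight w, pick the split minimizing w*inverter + (1-w)*battery.

    Returns a list of (w, string_1_power, inverter_loss, battery_loss).
    """
    results = []
    for w in weights:
        best = None
        for i in range(steps + 1):
            p1 = total_kw * i / steps
            p2 = total_kw - p1  # power availability kept as a hard constraint
            inv = inverter_loss(p1) + inverter_loss(p2)
            bat = battery_loss(p1) + battery_loss(p2)
            cost = w * inv + (1 - w) * bat
            if best is None or cost < best[0]:
                best = (cost, p1, inv, bat)
        results.append((w, best[1], best[2], best[3]))
    return results


front = pareto_sweep(100.0, [0.0, 0.25, 0.5, 0.75, 1.0])
```

In this toy model, weighting only battery losses favors an even split across strings, while weighting only inverter losses favors loading a single string, which mirrors the competing effects the abstract describes.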

AI-Driven Mobility Management for High-Speed Railway Communications: Compressed Measurements and Proactive Handover

Authors: Wen Li, Wei Chen, Shiyue Wang, Yuanyuan Zhang, Michail Matthaiou, Bo Ai

High-speed railway (HSR) communications are pivotal for ensuring rail safety, operations, maintenance, and delivering passenger information services. The high speed of trains creates rapidly time-varying wireless channels, increases the signaling overhead, and reduces the system throughput, making it difficult to meet the growing and stringent needs of HSR applications. In this article, we explore artificial intelligence (AI)-based beam-level and cell-level mobility management suitable for HSR communications. Particularly, we propose a compressed spatial multi-beam measurements scheme via compressive sensing for beam-level mobility management in HSR communications. In comparison to traditional down-sampling spatial beam measurements, this method leads to improved spatial-temporal beam prediction accuracy with the same measurement overhead. Moreover, we propose a novel AI-based proactive handover scheme to predict handover events and reduce radio link failure (RLF) rates in HSR communications. Compared with the traditional event A3-based handover mechanism, the proposed approach significantly reduces RLF rates while saving 50% of the beam measurement overhead.

Evaluating the Impact of Multiple DER Aggregators on Wholesale Energy Markets: A Hybrid Mean Field Approach

Authors: Jun He, Andrew L. Liu

The integration of distributed energy resources (DERs) into wholesale energy markets can greatly enhance grid flexibility, improve market efficiency, and contribute to a more sustainable energy future. As DERs -- such as solar PV panels and energy storage -- proliferate, effective mechanisms are needed to ensure that small prosumers can participate meaningfully in these markets. We study a wholesale market model featuring multiple DER aggregators, each controlling a portfolio of DER resources and bidding into the market on behalf of the DER asset owners. The key to our approach lies in recognizing the repeated nature of market interactions and the ability of participants to learn and adapt over time. Specifically, aggregators repeatedly interact with each other and with other suppliers in the wholesale market, collectively shaping wholesale electricity prices (aka the locational marginal prices (LMPs)). We model this multi-agent interaction using a mean-field game (MFG), which uses market information -- reflecting the average behavior of market participants -- to enable each aggregator to predict long-term LMP trends and make informed decisions. For each aggregator, because they control the DERs within their portfolio under certain contract structures, we employ a mean-field control (MFC) approach (as opposed to an MFG) to learn an optimal policy that maximizes the total rewards of the DERs under their management. We also propose a reinforcement learning (RL)-based method to help each agent learn optimal strategies within the MFG framework, enhancing their ability to adapt to market conditions and uncertainties. Numerical simulations show that LMPs quickly reach a steady state in the hybrid mean-field approach. Furthermore, our results demonstrate that the combination of energy storage and mean-field learning significantly reduces price volatility compared to scenarios without storage.

CT-Mamba: A Hybrid Convolutional State Space Model for Low-Dose CT Denoising

Authors: Linxuan Li, Wenjia Wei, Luyao Yang, Wenwen Zhang, Jiashu Dong, Yahua Liu, Hongshi Huang, Wei Zhao

Low-dose CT (LDCT) significantly reduces the radiation dose received by patients; however, dose reduction introduces additional noise and artifacts. Currently, denoising methods based on convolutional neural networks (CNNs) face limitations in long-range modeling capabilities, while Transformer-based denoising methods, although capable of powerful long-range modeling, suffer from high computational complexity. Furthermore, the denoised images predicted by deep learning-based techniques inevitably exhibit differences in noise distribution compared to normal-dose CT (NDCT) images, which can also impact the final image quality and diagnostic outcomes. This paper proposes CT-Mamba, a hybrid convolutional State Space Model for LDCT image denoising. The model combines the local feature extraction advantages of CNNs with Mamba's strength in capturing long-range dependencies, enabling it to capture both local details and global context. Additionally, we introduce an innovative spatially coherent Z-shaped scanning scheme to ensure spatial continuity between adjacent pixels in the image. We design a Mamba-driven deep noise power spectrum (NPS) loss function to guide model training, ensuring that the noise texture of the denoised LDCT images closely resembles that of NDCT images, thereby enhancing overall image quality and diagnostic value. Experimental results have demonstrated that CT-Mamba performs excellently in reducing noise in LDCT images, enhancing detail preservation, and optimizing noise texture distribution, and exhibits higher statistical similarity with the radiomics features of NDCT images. The proposed CT-Mamba demonstrates outstanding performance in LDCT denoising and holds promise as a representative approach for applying the Mamba framework to LDCT denoising tasks.

Liquid Lens-Based Imaging Receiver for MIMO VLC Systems

Authors: Kapila W. S. Palitharathna, Christodoulos Skouroumounis, Ioannis Krikidis

In this paper, we consider a tunable liquid convex lens-assisted imaging receiver for indoor multiple-input multiple-output (MIMO) visible light communication (VLC) systems. In contrast to existing MIMO VLC receivers that rely on fixed optical lenses, the proposed receiver leverages the additional degrees of freedom offered by liquid lenses via adjusting both focal length and orientation angles of the lens. This capability facilitates the mitigation of spatial correlation between the channel gains, thereby enhancing the overall signal quality and leading to improved bit-error rate (BER) performance. We present an accurate channel model for the liquid lens-assisted VLC system by using three-dimensional geometry and geometric optics. To achieve optimal performance under practical conditions such as random receiver orientation and user mobility, optimization of both focal length and orientation angles of the lens is required. To this end, driven by the fact that channel models are mathematically complex, we present two optimization schemes including a blockwise machine learning (ML) architecture that includes convolution layers to extract spatial features from the received signal, long short-term memory layers to predict the user position and orientation, and fully connected layers to estimate the optimal lens parameters. Numerical results are presented to compare the performance of each scheme with conventional receivers. Results show that a significant BER improvement is achieved when liquid lenses and presented ML-based optimization approaches are used. Specifically, the BER can be improved from $6\times 10^{-2}$ to $1.4\times 10^{-3}$ at an average signal-to-noise ratio of $30$ dB.

Inference and Learning of Nonlinear LFR State-Space Models

Authors: Merijn Floren, Jean-Philippe Noël, Jan Swevers

Estimating the parameters of nonlinear block-oriented state-space models from input-output data typically involves solving a highly non-convex optimization problem, which is prone to poor local minima and slow convergence. This paper presents a computationally efficient initialization method for nonlinear linear fractional representation (NL-LFR) models using periodic data. By first inferring the latent signals and subsequently estimating the model parameters, the approach generates initial estimates for use in a later nonlinear optimization step. The proposed method shows robustness against poor local minima, and achieves a twofold error reduction compared to the state-of-the-art on a challenging benchmark dataset.

Relevant ArXiv eess Papers - 2025-07-07

Enhancing Power Flow Estimation with Topology-Aware Gated Graph Neural Networks

Authors: Shrenik Jadhav, Birva Sevak, Srijita Das, Wencong Su, Van-Hai Bui

Accurate and scalable surrogate models for AC power flow are essential for real-time grid monitoring, contingency analysis, and decision support in increasingly dynamic and inverter-dominated power systems. However, most existing surrogates fall short of practical deployment due to their limited capacity to capture long-range nonlinear dependencies in meshed transmission networks and their weak enforcement of physical laws. These models often require extensive hyperparameter tuning, exhibit poor generalization under topology changes or large load swings, and typically do not quantify uncertainty or scale well beyond a few hundred buses. To address these challenges, this paper proposes a gated graph neural network (GGNN) surrogate for AC power-flow estimation under topological uncertainty. The model is trained across multiple IEEE benchmark networks of varying size and complexity, each incorporating randomized line contingencies and up to 40% load variation. To improve robustness and generalization, we explore both conventional supervised learning and physics-informed self-supervised training strategies. Comparative evaluations show that the proposed GGNN consistently outperforms prior GNN-based surrogates, achieving predictions closely aligned with Newton-Raphson solutions. By embedding operational constraints directly into the architecture and loss function, the model ensures physical consistency and delivers a lightweight, accurate, and scalable tool for real-time grid operations.

Optimality Loss Minimization in Distributed Control with Application to District Heating

Authors: Audrey Blizard, Stephanie Stockar

This paper presents a novel partitioning method designed to minimize control performance degradation resulting from partitioning a system for distributed control while maintaining the computational benefits of these methods. A game-theoretic performance metric, the modified Price of Anarchy, is introduced and is used in a generalizable partitioning metric to quantify optimality losses in a distributed controller. By finding the partition that minimizes the partitioning metric, the best-performing distributed control design is chosen. The presented partitioning metric is control-design agnostic, making it broadly applicable to many control design problems. In this paper, the developed metric is used to minimize the performance losses in the distributed control of a demand-flexible District Heating Network. The final distributed controller is provably feasible and stable. In simulation, this novel partitioning performed similarly to the centralized controller, increasing overall heat losses by only 1.9%, as compared to a similarly-sized baseline partition, which resulted in a 22% increase in losses.

Grid-Connected, Data-Driven Inverter Control, Theory to Hardware

Authors: Sebastian Graf, Keith Moffat, Anurag Mohapatra, Alessandro Chiuso, Florian Dörfler

Grid-connected inverter control is challenging to implement due to the difficulty of obtaining and maintaining an accurate grid model. Direct Data-Driven Predictive Control provides a model-free alternative to traditional model-based control methods. This paper describes how the recently-proposed Transient Predictive Control (TPC) can be used for real-world, plug-and-play inverter control. The following hypotheses were tested: 1) The TPC algorithm can be run online using standard hardware, and 2) TPC, which is derived using Linear Time-Invariant assumptions, is effective for grid-connected inverter control, which is a nonlinear and time-varying system. Experiments conducted on a two-converter benchtop setup and at the CoSES Laboratory on a 25 kVA converter connected to the Munich grid support these hypotheses.

Deep-Reinforcement-Learning-Based Adaptive State-Feedback Control for Inter-Area Oscillation Damping with Continuous Eigenvalue Configurations

Authors: Siyuan Liang, Long Huo, Wenyu Qin, Xin Chen, Peiyuan Sun

Controlling inter-area oscillation (IAO) across wide areas is crucial for the stability of modern power systems. Recent advances in deep learning, combined with the extensive deployment of phasor measurement units (PMUs) and generator sensors, have catalyzed the development of data-driven IAO damping controllers. In this paper, a novel IAO damping control framework is presented by modeling the control problem as a Markov Decision Process (MDP) and solving it through deep reinforcement learning (DRL). The DRL-based controller is trained in the state space with continuous eigenvalue configurations. To optimize control performance and cost-efficiency, only a subset of generators, identified by global participation factors, is selected for control. In addition, a switching control strategy (SCS) is introduced that effectively integrates the DRL-based controller with power system stabilizers (PSSs) to enhance overall performance. The simulation results on the IEEE 39-bus New England power system show that the proposed method outperforms two benchmark methods regarding the transient response. The DRL-based controller trained on the linear state-space environment can be directly tested in the nonlinear differential-algebraic environment. The robustness of the proposed method against communication delays has been thoroughly investigated.

Online Convex Optimization for Coordinated Long-Term and Short-Term Isolated Microgrid Dispatch

Authors: Ning Qi, Yousuf Baker, Bolun Xu

This paper proposes a novel non-anticipatory long-short-term coordinated dispatch framework for isolated microgrids with hybrid short-duration and long-duration energy storage (LDES). We introduce a convex hull approximation model for nonconvex LDES electrochemical dynamics, facilitating computational tractability and accuracy. To address temporal coupling in SoC dynamics and long-term contracts, we generate hindsight-optimal state-of-charge (SoC) trajectories of LDES and net loads for offline training. In the online stage, we employ kernel regression to dynamically update the SoC reference and propose an adaptive online convex optimization (OCO) algorithm with SoC reference tracking and expert tracking to mitigate myopia and enable adaptive step-size optimization. We rigorously prove that both long-term and short-term policies achieve sublinear regret bounds over time, which improve with more regression scenarios, stronger tracking penalties, and finer convex approximations. Simulation results show that the proposed method outperforms state-of-the-art methods, reducing costs by 73.4%, eliminating load loss via reference tracking, and achieving an additional 2.4% cost saving via the OCO algorithm. These benefits scale up with longer LDES durations, and the method demonstrates resilience to poor forecasts and unexpected system faults.
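
The flavor of online convex optimization with reference tracking can be sketched with textbook projected online gradient descent on a scalar decision. This is a generic illustration, not the paper's algorithm: the per-step cost (a linear price term plus a quadratic penalty for deviating from a reference), the 1/sqrt(t) step size, and all names and defaults are assumptions, and the expert-tracking and kernel-regression components are omitted.

```python
import math


def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_descent(prices, refs, lo=0.0, hi=1.0,
                            rho=1.0, eta0=0.5, x0=0.5):
    """Non-anticipatory decisions for f_t(x) = price_t*x + rho/2*(x - ref_t)^2.

    Each decision is committed before its own cost is revealed; the revealed
    gradient only influences the next decision.
    """
    x = x0
    decisions = []
    for t, (price, ref) in enumerate(zip(prices, refs), start=1):
        decisions.append(x)
        grad = price + rho * (x - ref)      # gradient of f_t at the played x
        step = eta0 / math.sqrt(t)          # decaying step size
        x = project(x - step * grad, lo, hi)
    return decisions
```

A larger tracking penalty `rho` pulls the decisions toward the reference trajectory, which is the mechanism the abstract credits with mitigating myopic behavior.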

RIS-Aided Cooperative ISAC Networks for Structural Health Monitoring

Authors: Jie Yang, Chao-Kai Wen, Xiao Li, Shi Jin

Integrated sensing and communication (ISAC) is a key feature of future cellular systems, enabling applications such as intruder detection, monitoring, and tracking using the same infrastructure. However, its potential for structural health monitoring (SHM), which requires the detection of slow and subtle structural changes, remains largely unexplored due to challenges such as multipath interference and the need for ultra-high sensing precision. This study introduces a novel theoretical framework for SHM via ISAC by leveraging reconfigurable intelligent surfaces (RIS) as reference points in collaboration with base stations and users. By dynamically adjusting RIS phases to generate distinct radio signals that suppress background multipath interference, measurement accuracy at these reference points is enhanced. We theoretically analyze RIS-aided collaborative sensing in three-dimensional cellular networks using Fisher information theory, demonstrating how increasing observation time, incorporating additional receivers (even with self-positioning errors), optimizing RIS phases, and refining collaborative node selection can reduce the position error bound to meet SHM's stringent accuracy requirements. Furthermore, we develop a Bayesian inference model to identify structural states and validate damage detection probabilities. Both theoretical and numerical analyses confirm ISAC's capability for millimeter-level deformation detection, highlighting its potential for high-precision SHM applications.

Utilizing 5G NR SSB Blocks for Passive Detection and Localization of Low-Altitude Drones

Authors: Palatip Jopanya, Diana P. M. Osorio

With the exponential growth of the unmanned aerial vehicle (UAV) industry and a broad range of applications expected to appear in the coming years, the employment of traditional radar systems is becoming increasingly cumbersome for UAV supervision. Motivated by this emerging challenge, this paper investigates the feasibility of employing integrated sensing and communication (ISAC) systems implemented over current and future wireless networks to perform this task. We propose a sensing mechanism based on the synchronization signal block (SSB) in the fifth-generation (5G) standard that performs sensing in a passive bistatic setting. By assuming planar arrays at the sensing nodes and according to the 5G standard, we consider that the SSB signal is sent in a grid of orthogonal beams that are multiplexed in time, with some of them pointing toward a surveillance region where low-altitude drones can be flying. The Cramér-Rao Bound (CRB) is derived as the theoretical bound for range and velocity estimation. Our results demonstrate the potential of employing SSB signals for UAV-like target localization at low SNR.

Privacy-Preserving Quantized Federated Learning with Diverse Precision

Authors: Dang Qua Nguyen, Morteza Hashemi, Erik Perrins, Sergiy A. Vorobyov, David J. Love, Taejoon Kim

Federated learning (FL) has emerged as a promising paradigm for distributed machine learning, enabling collaborative training of a global model across multiple local devices without requiring them to share raw data. Despite its advancements, FL is limited by factors such as: (i) privacy risks arising from the unprotected transmission of local model updates to the fusion center (FC) and (ii) decreased learning utility caused by heterogeneity in model quantization resolution across participating devices. Prior work typically addresses only one of these challenges because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. In this paper, our aim is therefore to improve the learning utility of a privacy-preserving FL that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that is designed to simultaneously achieve differential privacy (DP) and minimum quantization error. Notably, the proposed SQ guarantees bounded distortion, unlike other DP approaches. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Numerical simulations validate the benefits of our approach in terms of privacy protection and learning utility compared to the conventional LaplaceSQ-FL algorithm.

Relevant ArXiv eess Papers - 2025-07-04

Enhancing Power Flow Estimation with Topology-Aware Gated Graph Neural Networks

Authors: Shrenik Jadhav, Birva Sevak, Srijita Das, Wencong Su, Van-Hai Bui

Accurate and scalable surrogate models for AC power flow are essential for real-time grid monitoring, contingency analysis, and decision support in increasingly dynamic and inverter-dominated power systems. However, most existing surrogates fall short of practical deployment due to their limited capacity to capture long-range nonlinear dependencies in meshed transmission networks and their weak enforcement of physical laws. These models often require extensive hyperparameter tuning, exhibit poor generalization under topology changes or large load swings, and typically do not quantify uncertainty or scale well beyond a few hundred buses. To address these challenges, this paper proposes a gated graph neural network (GGNN) surrogate for AC power-flow estimation under topological uncertainty. The model is trained across multiple IEEE benchmark networks of varying size and complexity, each incorporating randomized line contingencies and up to 40% load variation. To improve robustness and generalization, we explore both conventional supervised learning and physics-informed self-supervised training strategies. Comparative evaluations show that the proposed GGNN consistently outperforms prior GNN-based surrogates, achieving predictions closely aligned with Newton-Raphson solutions. By embedding operational constraints directly into the architecture and loss function, the model ensures physical consistency and delivers a lightweight, accurate, and scalable tool for real-time grid operations.

Optimality Loss Minimization in Distributed Control with Application to District Heating

Authors: Audrey Blizard, Stephanie Stockar

This paper presents a novel partitioning method designed to minimize control performance degradation resulting from partitioning a system for distributed control while maintaining the computational benefits of these methods. A game-theoretic performance metric, the modified Price of Anarchy, is introduced and is used in a generalizable partitioning metric to quantify optimality losses in a distributed controller. By finding the partition that minimizes the partitioning metric, the best-performing distributed control design is chosen. The presented partitioning metric is control-design agnostic, making it broadly applicable to many control design problems. In this paper, the developed metric is used to minimize the performance losses in the distributed control of a demand-flexible District Heating Network. The final distributed controller is provably feasible and stable. In simulation, this novel partitioning performed similarly to the centralized controller, increasing overall heat losses by only 1.9%, as compared to a similarly-sized baseline partition, which resulted in a 22% increase in losses.

Grid-Connected, Data-Driven Inverter Control, Theory to Hardware

Authors: Sebastian Graf, Keith Moffat, Anurag Mohapatra, Alessandro Chiuso, Florian Dörfler

Grid-connected inverter control is challenging to implement due to the difficulty of obtaining and maintaining an accurate grid model. Direct Data-Driven Predictive Control provides a model-free alternative to traditional model-based control methods. This paper describes how the recently-proposed Transient Predictive Control (TPC) can be used for real-world, plug-and-play inverter control. The following hypotheses were tested: 1) The TPC algorithm can be run online using standard hardware, and 2) TPC, which is derived using Linear Time-Invariant assumptions, is effective for grid-connected inverter control, which is a nonlinear and time-varying system. Experiments conducted on a two-converter benchtop setup and at the CoSES Laboratory on a 25 kVA converter connected to the Munich grid support these hypotheses.

Deep-Reinforcement-Learning-Based Adaptive State-Feedback Control for Inter-Area Oscillation Damping with Continuous Eigenvalue Configurations

Authors: Siyuan Liang, Long Huo, Wenyu Qin, Xin Chen, Peiyuan Sun

Controlling inter-area oscillation (IAO) across wide areas is crucial for the stability of modern power systems. Recent advances in deep learning, combined with the extensive deployment of phasor measurement units (PMUs) and generator sensors, have catalyzed the development of data-driven IAO damping controllers. In this paper, a novel IAO damping control framework is presented by modeling the control problem as a Markov Decision Process (MDP) and solving it through deep reinforcement learning (DRL). The DRL-based controller is trained in the state space with continuous eigenvalue configurations. To optimize control performance and cost-efficiency, only a subset of generators, identified by global participation factors, is selected for control. In addition, a switching control strategy (SCS) is introduced that effectively integrates the DRL-based controller with power system stabilizers (PSSs) to enhance overall performance. The simulation results on the IEEE 39-bus New England power system show that the proposed method outperforms two benchmark methods in transient response. The DRL-based controller trained on the linear state-space environment can be directly tested in the nonlinear differential-algebraic environment. The robustness of the proposed method against communication delays has been thoroughly investigated.

Online Convex Optimization for Coordinated Long-Term and Short-Term Isolated Microgrid Dispatch

Authors: Ning Qi, Yousuf Baker, Bolun Xu

This paper proposes a novel non-anticipatory long-short-term coordinated dispatch framework for isolated microgrids with hybrid short-duration and long-duration energy storage (LDES). We introduce a convex hull approximation model for nonconvex LDES electrochemical dynamics, facilitating computational tractability and accuracy. To address temporal coupling in state-of-charge (SoC) dynamics and long-term contracts, we generate hindsight-optimal SoC trajectories of LDES and netloads for offline training. In the online stage, we employ kernel regression to dynamically update the SoC reference and propose an adaptive online convex optimization (OCO) algorithm with SoC reference tracking and expert tracking to mitigate myopia and enable adaptive step-size optimization. We rigorously prove that both long-term and short-term policies achieve sublinear regret bounds over time, which improve with more regression scenarios, stronger tracking penalties, and finer convex approximations. Simulation results show that the proposed method outperforms state-of-the-art methods, reducing costs by 73.4%, eliminating load loss via reference tracking, and achieving an additional 2.4% cost saving via the OCO algorithm. These benefits scale up with longer LDES durations, and the method demonstrates resilience to poor forecasts and unexpected system faults.

RIS-Aided Cooperative ISAC Networks for Structural Health Monitoring

Authors: Jie Yang, Chao-Kai Wen, Xiao Li, Shi Jin

Integrated sensing and communication (ISAC) is a key feature of future cellular systems, enabling applications such as intruder detection, monitoring, and tracking using the same infrastructure. However, its potential for structural health monitoring (SHM), which requires the detection of slow and subtle structural changes, remains largely unexplored due to challenges such as multipath interference and the need for ultra-high sensing precision. This study introduces a novel theoretical framework for SHM via ISAC by leveraging reconfigurable intelligent surfaces (RIS) as reference points in collaboration with base stations and users. By dynamically adjusting RIS phases to generate distinct radio signals that suppress background multipath interference, measurement accuracy at these reference points is enhanced. We theoretically analyze RIS-aided collaborative sensing in three-dimensional cellular networks using Fisher information theory, demonstrating how increasing observation time, incorporating additional receivers (even with self-positioning errors), optimizing RIS phases, and refining collaborative node selection can reduce the position error bound to meet SHM's stringent accuracy requirements. Furthermore, we develop a Bayesian inference model to identify structural states and validate damage detection probabilities. Both theoretical and numerical analyses confirm ISAC's capability for millimeter-level deformation detection, highlighting its potential for high-precision SHM applications.

Utilizing 5G NR SSB Blocks for Passive Detection and Localization of Low-Altitude Drones

Authors: Palatip Jopanya, Diana P. M. Osorio

With the exponential growth of the unmanned aerial vehicle (UAV) industry and a broad range of applications expected to appear in the coming years, the employment of traditional radar systems is becoming increasingly cumbersome for UAV supervision. Motivated by this emerging challenge, this paper investigates the feasibility of employing integrated sensing and communication (ISAC) systems implemented over current and future wireless networks to perform this task. We propose a sensing mechanism based on the synchronization signal block (SSB) in the fifth-generation (5G) standard that performs sensing in a passive bistatic setting. By assuming planar arrays at the sensing nodes and according to the 5G standard, we consider that the SSB signal is sent in a grid of orthogonal beams that are multiplexed in time, with some of them pointing toward a surveillance region where low-altitude drones can be flying. The Cramér-Rao Bound (CRB) is derived as the theoretical bound for range and velocity estimation. Our results demonstrate the potential of employing SSB signals for UAV-like target localization at low SNR.

Privacy-Preserving Quantized Federated Learning with Diverse Precision

Authors: Dang Qua Nguyen, Morteza Hashemi, Erik Perrins, Sergiy A. Vorobyov, David J. Love, Taejoon Kim

Federated learning (FL) has emerged as a promising paradigm for distributed machine learning, enabling collaborative training of a global model across multiple local devices without requiring them to share raw data. Despite its advancements, FL is limited by factors such as: (i) privacy risks arising from the unprotected transmission of local model updates to the fusion center (FC) and (ii) decreased learning utility caused by heterogeneity in model quantization resolution across participating devices. Prior work typically addresses only one of these challenges because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. In this paper, our aim is therefore to improve the learning utility of a privacy-preserving FL that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that is designed to simultaneously achieve differential privacy (DP) and minimum quantization error. Notably, the proposed SQ guarantees bounded distortion, unlike other DP approaches. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Numerical simulations validate the benefits of our approach in terms of privacy protection and learning utility compared to the conventional LaplaceSQ-FL algorithm.

Relevant ArXiv eess Papers - 2025-07-03

Scalable Offline ASR for Command-Style Dictation in Courtrooms

Authors: Kumarmanas Nethil, Vaibhav Mishra, Kriti Anandan, Kavya Manohar

We propose an open-source framework for Command-style dictation that addresses the gap between resource-intensive Online systems and high-latency Batch processing. Our approach uses Voice Activity Detection (VAD) to segment audio and transcribes these segments in parallel using Whisper models, enabling efficient multiplexing across audios. Unlike proprietary systems like SuperWhisper, this framework is also compatible with most ASR architectures, including widely used CTC-based models. Our multiplexing technique maximizes compute utilization in real-world settings, as demonstrated by its deployment in around 15% of India's courtrooms. Evaluations on live data show consistent latency reduction as user concurrency increases, compared to sequential batch processing. The live demonstration will showcase our open-sourced implementation and allow attendees to interact with it in real-time.

An Adaptive Estimation Approach based on Fisher Information to Overcome the Challenges of LFP Battery SOC Estimation

Authors: Junzhe Shi, Shida Jiang, Shengyu Tao, Jaewong Lee, Manashita Borah, Scott Moura

Robust and real-time State of Charge (SOC) estimation is essential for Lithium Iron Phosphate (LFP) batteries, which are widely used in electric vehicles (EVs) and energy storage systems due to their safety and longevity. However, the flat Open Circuit Voltage (OCV)-SOC curve makes this task particularly challenging. The challenge is compounded by hysteresis effects and by real-world conditions such as current bias, voltage quantization errors, and temperature, all of which must be considered in battery management system use. In this paper, we propose an adaptive estimation approach to overcome the challenges of LFP SOC estimation. Specifically, the method uses an adaptive Fisher information fusion strategy that combines the SOC estimates from two different models: Coulomb counting and equivalent circuit model-based parameter identification. The effectiveness of this strategy is rationalized by the information richness excited by external cycling signals. A 3D OCV-H-SOC map that captures the relationship between OCV, hysteresis, and SOC is proposed as the backbone and can be generalized to other widely adopted parameter-identification methods. Extensive validation under ideal and real-world use scenarios, including SOC-OCV flat zones, current bias, voltage quantization errors, low temperatures, and insufficient current excitations, has been performed using four driving profiles, i.e., the Orange County Transit Bus Cycle, the California Unified Cycle, the US06 Drive Cycle, and the New York City Cycle. The results demonstrate superiority over the state-of-the-art unscented Kalman filter, long short-term memory networks, and transformer in all validation cases.
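
The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one textbook way to combine two SOC estimates by Fisher information. It assumes both estimates are independent, unbiased, and Gaussian, so that the Fisher information of each is the reciprocal of its variance. The function name fuse_estimates and all numbers are hypothetical and are not taken from the paper.

```python
def fuse_estimates(x1, var1, x2, var2):
    """Fuse two scalar estimates by inverse-variance weighting.

    Assumption: each estimate is unbiased with Gaussian error, so its
    Fisher information equals 1 / variance. The estimate carrying more
    information receives the larger weight.
    """
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    info1, info2 = 1.0 / var1, 1.0 / var2
    fused = (info1 * x1 + info2 * x2) / (info1 + info2)
    fused_var = 1.0 / (info1 + info2)  # information adds, variance shrinks
    return fused, fused_var


# Hypothetical numbers: a Coulomb-counting estimate with low variance and
# a model-based estimate with high variance (e.g. inside the flat OCV zone).
soc, soc_var = fuse_estimates(0.50, 1e-4, 0.90, 1.0)
```

With these inputs the fused value stays close to the low-variance estimate, which is the qualitative behavior one would expect when the voltage-based model carries little information.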

Teaching Cars to Drive: Spotlight on Connected and Automated Vehicles

Authors: Filippos N. Tzortzoglou, Andreas A. Malikopoulos

In recent decades, society has witnessed significant advancements in emerging mobility systems. These systems refer to transportation solutions that incorporate digital technologies, automation, connectivity, and sustainability to create safer, more efficient, and user-centered mobility. Examples include connected and automated vehicles (CAVs), shared mobility services (car-pooling), electric vehicles, and mobility-as-a-service platforms. These innovations have the potential to greatly impact areas such as safety, pollution, comfort, travel time, and fairness. In this article, we explore the current landscape of CAVs. We discuss their role in daily life and their future potential, while also addressing the challenges they may introduce. We then examine the practical difficulties in research associated with CAVs, especially simulating and testing CAV-related algorithms in real-world settings. We present existing solutions that aim to overcome these limitations. Finally, we provide an accessible introduction to modeling CAVs using basic kinematic principles and offer an open-source tutorial to help interested students begin exploring the field.
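
As an illustration of what modeling a vehicle with basic kinematic principles can look like, here is a minimal sketch of a one-dimensional double-integrator model (position and speed driven by a commanded acceleration). This is a generic textbook model chosen for illustration, not the content of the authors' tutorial, and the names step and simulate are hypothetical.

```python
def step(position, speed, accel, dt):
    """Advance a 1-D vehicle one time step under constant acceleration."""
    new_position = position + speed * dt + 0.5 * accel * dt ** 2
    new_speed = speed + accel * dt
    return new_position, new_speed


def simulate(position, speed, accels, dt):
    """Return the list of (position, speed) states, including the initial one."""
    trajectory = [(position, speed)]
    for accel in accels:
        position, speed = step(position, speed, accel, dt)
        trajectory.append((position, speed))
    return trajectory


# Hypothetical scenario: start at 10 m/s and accelerate at 2 m/s^2 for 5 s.
trajectory = simulate(0.0, 10.0, [2.0] * 5, 1.0)
```

Because the acceleration is held constant within each step, the update is exact for piecewise-constant inputs, which makes it a convenient starting point before adding steering or actuator dynamics.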

Synchronising DER inverters to weak grid using Kalman filter and LQR current controller

Authors: Phuoc Sang Nguyen, Ghavameddin Nourbakhsh, Gerard Ledwich

Grid-following (GFL) inverters are commonly used for integrating renewable energy sources into power grids. However, the dynamic performance of GFL models can be significantly impacted by the Phase-Locked Loop (PLL) in a weak grid, leading to instability due to inaccuracies in grid source phase angle estimation. The proposed method in this manuscript replaces the PLL with an Advanced Angle Estimation based Kalman Filter including a Linear Quadratic Regulator (LQR) controller of the GFL. This method is robust in incorporating grid impedance terms as part of state space models in the Kalman Filter approach to estimate instantaneous phase angle using α-β Synchronous Reference Frame equations. The stability performance of the proposed approach is validated through eigenvalue analysis in a two-source case. Additionally, an LQR controller is employed to regulate capacitor voltage, inverter current, and the current at the Point of Common Coupling (PCC). The proposed controller surpasses existing approaches in terms of accuracy and distortion reduction under abrupt grid impedance increases. Moreover, drop compensation is integrated into the Kalman Filter to enhance robustness of the inverter against external oscillation disturbances from a synchronous machine connected to the GFL via the PCC. The results in this paper demonstrate substantial improvement in oscillation damping across a range of frequencies compared with published research works.

Auto-optimization of Energy Generation for Wave Energy Converters with Active Learning

Authors: Siyang Tang, Wen-Hua Chen, Cunjia Liu

This paper presents an auto-optimization control framework for wave energy converters (WECs) to maximize energy generation under unknown and changing ocean conditions. The proposed control framework consists of two levels. The high-level controller, operating at a longer time scale, aims to maximize the average energy generation over several wave periods. The generated Power Take-Off (PTO) profile serves as the reference for the low-level physical system to follow. The new auto-optimization process leverages the parameterization of the non-stationary operation condition in WECs, establishing the relationship between the average energy generation and the key design parameters of the PTO force subject to the unknown wave parameters. The high-level controller is designed based on the concept of Dual Control for Exploration and Exploitation (DCEE) to quickly learn the unknown wave parameters by actively probing the ocean condition, while generating the optimal PTO profile. During this process, the uncertainty of the estimated wave condition is quantified and embedded in the optimization cost function to enable active learning. Simulation results under unknown regular and irregular waves demonstrate the effectiveness and robustness of this novel auto-optimization WEC system with active learning, which outperforms model predictive control, extremum seeking, and classic Bang-Bang control approaches.

Cybersecurity Issues in Local Energy Markets

Authors: Al Hussein Dabashi, Sajjad Maleki, Biswarup Mukherjee, Gregory Epiphaniou, Carsten Maple, Charalambos Konstantinou, Subhash Lakshminarayana

Local Energy Markets (LEMs), though pivotal to the energy transition, face growing cybersecurity threats due to their reliance on smart grid communication standards and vulnerable Internet-of-Things (IoT)-enabled devices. This is a critical issue because such vulnerabilities can be exploited to manipulate market operations, compromise participants' privacy, and destabilize power distribution networks. This work maps LEM communication flows to existing standards, highlights potential impacts of key identified vulnerabilities, and simulates cyberattack scenarios on a privacy-preserving LEM model to assess their impacts. Findings reveal how attackers could distort pricing and demand patterns. We finally present recommendations for researchers, industry developers, policymakers, and LEM stakeholders to secure future LEM deployments.

Complex-Phase, Data-Driven Identification of Grid-Forming Inverter Dynamics

Authors: Anna Büttner, Hans Würfel, Sebastian Liemann, Johannes Schiffer, Frank Hellmann

The increasing integration of renewable energy sources (RESs) into power systems requires the deployment of grid-forming inverters to ensure a stable operation. Accurate modeling of these devices is necessary. In this paper, a system identification approach to obtain low-dimensional models of grid-forming inverters is presented. The proposed approach is based on a Hammerstein-Wiener parametrization of the normal-form model. The normal-form is a gray-box model that utilizes complex frequency and phase to capture non-linear inverter dynamics. The model is validated on two well-known control strategies: droop-control and dispatchable virtual oscillators. Simulations and hardware-in-the-loop experiments demonstrate that the normal-form accurately models inverter dynamics across various operating conditions. The approach shows great potential for enhancing the modeling of RES-dominated power systems, especially when component models are unavailable or computationally expensive.

Power-Gas Infrastructure Planning under Weather-induced Supply and Demand Uncertainties

Authors: Rahman Khorramfar, Dharik Mallapragada, Saurabh Amin

Implementing economy-wide decarbonization strategies based on decarbonizing the power grid via variable renewable energy (VRE) expansion and electrification of end-uses requires new approaches for energy infrastructure planning that consider, among other factors, weather-induced uncertainty in demand and VRE supply. An energy planning model that fails to account for these uncertainties can hinder the intended transition efforts to a low-carbon grid and increase the risk of supply shortage, especially during extreme weather conditions. Here, we consider the generation and transmission expansion problem of joint power-gas infrastructure and operations planning under the uncertainty of both demand and renewable supply. We propose two distributionally robust optimization approaches based on moment (MDRO) and Wasserstein distance (WDRO) ambiguity sets to endogenize these uncertainties and account for the change in the underlying distribution of these parameters that is caused by climate change, among other factors. Furthermore, our model considers the risk-aversion of the energy planners in the modeling framework via the conditional value-at-risk (CVaR) metric. An equivalent mixed-integer linear programming (MILP) reformulation of both modeling frameworks is presented, and a computationally efficient approximation scheme to obtain near-optimal solutions is proposed. We demonstrate the resulting DRO planning models and solution strategy via a New England case study under different levels of end-use electrification and decarbonization targets. Our experiments systematically explore different modeling aspects and compare the DRO models with stochastic programming (SP) results.
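
For readers unfamiliar with the CVaR metric mentioned above, the following minimal sketch computes an empirical CVaR from a list of sampled losses using the standard Rockafellar-Uryasev form, CVaR = t + E[max(L - t, 0)] / (1 - alpha), evaluated at t equal to the empirical value-at-risk. It only illustrates the risk measure itself, not the paper's DRO or MILP formulation. The function name cvar and the sample values are hypothetical.

```python
import math


def cvar(losses, alpha):
    """Empirical conditional value-at-risk of equally weighted loss samples.

    alpha is the confidence level in [0, 1); alpha = 0 returns the mean.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses)
    n = len(ordered)
    # Empirical VaR: smallest sample whose empirical CDF reaches alpha.
    k = max(math.ceil(alpha * n) - 1, 0)
    value_at_risk = ordered[k]
    # Average excess over VaR, rescaled by the tail probability.
    excess = sum(max(x - value_at_risk, 0.0) for x in ordered) / n
    return value_at_risk + excess / (1.0 - alpha)


# Hypothetical cost samples: CVaR at 50% averages the worse half.
tail_cost = cvar([1.0, 2.0, 3.0, 4.0], 0.5)
```

A risk-averse planner would penalize this tail average in addition to (or instead of) the expected cost, so that rare but severe shortage scenarios influence the investment decision.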

Small-signal stability of power systems with voltage droop

Authors: Jakob Niehues, Robin Delabays, Anna Büttner, Frank Hellmann

The stability of inverter-dominated power grids remains an active area of research. This paper presents novel sufficient conditions for ensuring small-signal stability in lossless and constant $R/X$ grids with highly heterogeneous mixes of grid-forming inverters that implement an adapted $V$-$q$ droop control. The proposed conditions can be evaluated in the neighborhood of each bus without information on the rest of the grid. Apart from the presence of $V$-$q$ droop, no additional assumptions are made regarding the inverter control strategies, nor is dynamical homogeneity across the system assumed. The analysis is enabled by recasting the node dynamics in terms of complex frequency and power, resulting in transfer functions that directly capture the small-signal frequency and amplitude responses to active and reactive power imbalances. These transfer functions are directly aligned with typical design considerations in grid-forming control. Building on an adapted small-phase theorem and viewing the system as a closed feedback loop between nodes and lines, the derived stability conditions also yield new insights when applied to established inverter control designs. We demonstrate in simulations that our conditions are not overly conservative and can identify individual inverters that are misconfigured and cause instability.

Relevant ArXiv eess Papers - 2025-07-02

Getting Dynamic Line Ratings into Markets

Authors: Zhiyi Zhou, Christoph Graf, Yury Dvorkin

Static transmission line ratings may lead to underutilization of line capacity due to overly conservative (worst-case) assumptions. Grid-enhancing technologies (GETs) such as dynamic line ratings (DLRs), which adjust line capacity based on real-time conditions, are a techno-economically viable alternative to increase the utilization of existing power lines. Nonetheless, their adoption has been slow, partly due to the absence of operational tools that effectively account for simultaneous impacts on dispatch and pricing. In this paper, we represent transmission capacity with DLRs as a stock-like resource with time-variant interdependency, which is modeled via an approximation of the line temperature evolution process, decoupling the impacts of ambient weather conditions and power flow on transmission line temperature and thus capacity. We integrate DLRs into a multi-period DC optimal power flow problem, with chance constraints addressing correlated uncertainty in DLRs and renewable generation. This yields non-convex problems that we transform into a tractable convex form by linearization. We derive locational marginal energy and ancillary services prices consistent with a competitive equilibrium. Numerical experiments on the 11-zone and 1814-node NYISO systems demonstrate its performance, including impacts on dispatch, pricing, and marginal carbon emissions.

Augmented Physics-Based Li-ion Battery Model via Adaptive Ensemble Sparse Learning and Conformal Prediction

Authors: Samuel Filgueira da Silva, Mehmet Fatih Ozkan, Faissal El Idrissi, Marcello Canova

Accurate electrochemical models are essential for the safe and efficient operation of lithium-ion batteries in real-world applications such as electrified vehicles and grid storage. Reduced-order models (ROMs) offer a balance between fidelity and computational efficiency but often struggle to capture complex and nonlinear behaviors, such as the dynamics in the cell voltage response under high C-rate conditions. To address these limitations, this study proposes an Adaptive Ensemble Sparse Identification (AESI) framework that enhances the accuracy of reduced-order Li-ion battery models by compensating for unpredictable dynamics. The approach integrates an Extended Single Particle Model (ESPM) with an evolutionary ensemble sparse learning strategy to construct a robust hybrid model. In addition, the AESI framework incorporates a conformal prediction method to provide theoretically guaranteed uncertainty quantification for voltage error dynamics, thereby improving the reliability of the model's predictions. Evaluation across diverse operating conditions shows that the hybrid model (ESPM + AESI) improves the voltage prediction accuracy, achieving mean squared error reductions of up to 46% on unseen data. Prediction reliability is further supported by conformal prediction, yielding statistically valid prediction intervals with coverage ratios of 96.85% and 97.41% for the ensemble models based on bagging and stability selection, respectively.
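
The abstract does not say which conformal prediction variant is used. As a minimal sketch of the general idea, the following implements plain split conformal prediction with absolute-residual scores: residuals on a held-out calibration set determine a radius that is then added around each new point prediction. It assumes the calibration and test data are exchangeable. The names conformal_radius and prediction_interval and the numbers are hypothetical.

```python
import math


def conformal_radius(cal_residuals, alpha):
    """Interval half-width targeting coverage of at least 1 - alpha.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute calibration
    residual; returns infinity when the calibration set is too small.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    scores = sorted(abs(r) for r in cal_residuals)
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf
    return scores[rank - 1]


def prediction_interval(prediction, radius):
    """Symmetric interval around a point prediction."""
    return prediction - radius, prediction + radius


# Hypothetical voltage residuals (in volts) from a calibration set.
residuals = [0.01 * i for i in range(1, 20)]
radius = conformal_radius(residuals, 0.1)
low, high = prediction_interval(3.30, radius)
```

The coverage guarantee of this construction is marginal (averaged over calibration and test draws), which is why reported empirical coverage ratios can sit somewhat above the nominal level.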

Cyber Attacks Detection, Prevention, and Source Localization in Digital Substation Communication using Hybrid Statistical-Deep Learning

Authors: Nicola Cibin, Bas Mulder, Herman Carstens, Peter Palensky, Alexandru Ştefanov

The digital transformation of power systems is accelerating the adoption of the IEC 61850 standard. However, its communication protocols, including Sampled Values (SV), lack built-in security features such as authentication and encryption, making them vulnerable to malicious packet injection. Such cyber attacks can delay fault clearance or trigger unintended circuit breaker operations. While most existing research focuses on detecting cyber attacks in digital substations, intrusion prevention systems have been disregarded because of the risk of potential communication network disruptions. This paper proposes a novel method using hybrid statistical-deep learning for the detection, prevention, and source localization of IEC 61850 SV injection attacks. The method uses exponentially modified Gaussian distributions to model communication network latency, and long short-term memory and Elman recurrent neural networks to detect anomalous variations in the estimated probability distributions. It effectively discards malicious SV frames with minimal processing overhead and latency, maintains robustness against communication network latency variation and time-synchronization issues, and guarantees a near-zero false positive rate in non-attack scenarios. Comprehensive validation is conducted on three testbeds involving industrial-grade devices, hardware-in-the-loop simulations, virtualized intelligent electronic devices and merging units, and high-fidelity emulated communication networks. Results demonstrate the method's suitability for practical deployment in IEC 61850-compliant digital substations.
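
For readers unfamiliar with the exponentially modified Gaussian (EMG) distribution mentioned above, the following sketch evaluates its density in the standard textbook form: the sum of a Gaussian term (mean mu, standard deviation sigma) and an independent exponential delay with rate lam. The function name and parameter values are assumptions chosen for illustration, and nothing here reproduces the paper's latency estimator or its neural networks.

```python
import math


def emg_pdf(x, mu, sigma, lam):
    """Density of an exponentially modified Gaussian at x.

    mu, sigma: mean and standard deviation of the Gaussian component.
    lam: rate of the exponential component (mean extra delay is 1 / lam).
    """
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * x)
    z = (mu + lam * sigma ** 2 - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(exponent) * math.erfc(z)


# Hypothetical latency model: Gaussian jitter around 0 plus an exponential tail.
density_at_mode_region = emg_pdf(0.5, mu=0.0, sigma=1.0, lam=1.0)
```

The heavy right tail is what makes this family a plausible fit for network latency, where occasional queueing delays stretch an otherwise symmetric jitter distribution.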

Turning AI Data Centers into Grid-Interactive Assets: Results from a Field Demonstration in Phoenix, Arizona

Authors: Philip Colangelo, Ayse K. Coskun, Jack Megrue, Ciaran Roberts, Shayan Sengupta, Varun Sivaram, Ethan Tiao, Aroon Vijaykar, Chris Williams, Daniel C. Wilson, Zack MacFarland, Daniel Dreiling, Nathan Morey, Anuja Ratnayake, Baskar Vairamohan

Artificial intelligence (AI) is fueling exponential electricity demand growth, threatening grid reliability, raising prices for communities paying for new energy infrastructure, and stunting AI innovation as data centers wait for interconnection to constrained grids. This paper presents the first field demonstration, in collaboration with major corporate partners, of a software-only approach--Emerald Conductor--that transforms AI data centers into flexible grid resources that can efficiently and immediately harness existing power systems without massive infrastructure buildout. Conducted at a 256-GPU cluster running representative AI workloads within a commercial, hyperscale cloud data center in Phoenix, Arizona, the trial achieved a 25% reduction in cluster power usage for three hours during peak grid events while maintaining AI quality of service (QoS) guarantees. By orchestrating AI workloads based on real-time grid signals without hardware modifications or energy storage, this platform reimagines data centers as grid-interactive assets that enhance grid reliability, advance affordability, and accelerate AI's development.

Thévenin Equivalent Parameters Identification Based on Statistical Characteristics of System Ambient Data

Authors: Boying Zhou, Chen Shen, Kexuan Tang

This paper proposes a novel method for identifying Thévenin equivalent parameters (TEP) in power systems, based on the statistical characteristics of the system's stochastic response. The method leverages stochastic fluctuation data under steady-state grid conditions and applies sliding window techniques to compute sensitivity parameters between voltage magnitude, current magnitude, and power. This enables high-accuracy and robust TEP identification. In contrast to traditional methods, the proposed approach does not rely on large disturbances or probing signals but instead utilizes the natural fluctuation behavior of the system. Additionally, the method supports distributed implementation using local measurements of voltage magnitude, current magnitude, and power, offering significant practical value for engineering applications. The theoretical analysis demonstrates the method's robustness in the presence of low signal-to-noise ratio (SNR), asynchronous measurements, and data collinearity issues. Simulation results further confirm the effectiveness of the proposed method in diverse practical scenarios, demonstrating its ability to consistently provide accurate and reliable identification of TEP using system ambient data.

Graph Neural Networks in Wind Power Forecasting

Authors: Javier Castellano, Ignacio Villanueva

We study the applicability of GNNs to the problem of wind energy forecasting. We find that certain architectures achieve performance comparable to our best CNN-based benchmark. The study is conducted on three wind power facilities using five years of historical data. Numerical Weather Prediction (NWP) variables were used as predictors, and models were evaluated on a 24 to 36 hour ahead test horizon.

Symmetric Sliding-Mode Control of Grid-Forming Inverters With Precision Region Under AC and DC Sides Varying

Authors: Qianxi Tang, Li Peng, Xuefeng Wang, Xinchen Yao

Voltage regulation under conventional grid-forming controllers is tightly coupled to power sharing and dc-link dynamics. Consequently, its tracking accuracy deteriorates during grid faults, sudden power sharing changes, or dc-bus voltage variations. To address this issue, a symmetric sliding-mode control (SSMC) method is developed and its voltage precision region is derived. This region illustrates how far ac-side power dynamics and dc-link voltage variations can be decoupled from the voltage regulation task, which helps predict when abnormal coupling appears. While conventional sliding-mode controls address voltage-tracking error through complex sliding surface designs, repetitive correction techniques, or special reaching laws, this work identifies that the error at power-line frequency primarily stems from the asymmetry property of inverters, together with the delay effect and computational inaccuracy. Guided by this insight, an asymmetry compensation structure is proposed, which avoids added design complexity and directly mitigates voltage tracking error. Furthermore, the control design is supported by a physical and quantitative explanation, aiding in parameter tuning. Simulation and experimental results demonstrate that the proposed method achieves faster tracking responses while maintaining robust and more accurate tracking under both dc-link voltage and ac-side current variations. Conventional grid-forming and classical sliding-mode controllers, which handle these variations separately, cannot match this combined speed and robustness. Furthermore, the voltage precision region is explicitly verified.

Modern Base Station Architecture: Enabling Passive Beamforming with Beyond Diagonal RISs

Authors: Mahmoud Raeisi, Hui Chen, Henk Wymeersch, Ertugrul Basar

Beamforming plays a crucial role in millimeter wave (mmWave) communication systems to mitigate the severe attenuation inherent to this spectrum. However, the use of large active antenna arrays in conventional architectures often results in high implementation costs and excessive power consumption, limiting their practicality. As an alternative, deploying large arrays at transceivers using passive devices, such as reconfigurable intelligent surfaces (RISs), offers a more cost-effective and energy-efficient solution. In this paper, we investigate a promising base station (BS) architecture that integrates a beyond diagonal RIS (BD-RIS) within the BS to enable passive beamforming. By utilizing Takagi's decomposition and leveraging the effective beamforming vector, the RIS profile can be designed to enable passive beamforming directed toward the target. Through the beamforming analysis, we reveal that BD-RIS provides robust beamforming performance across various system configurations, whereas the traditional diagonal RIS (D-RIS) exhibits instability with increasing RIS size and decreasing BS-RIS separation, two critical factors in optimizing RIS-assisted systems. Comprehensive computer simulation results across various aspects validate the superiority of the proposed BS-integrated BD-RIS over conventional D-RIS architectures, showcasing performance comparable to active analog beamforming antenna arrays.

Hierarchical Decentralized Stochastic Control for Cyber-Physical Systems

Authors: Kesav Kaza, Ramachandran Anantharaman, Rahul Meshram

This paper presents a two-timescale hierarchical decentralized architecture for control of Cyber-Physical Systems. The architecture consists of $N$ independent sub-processes, a global controller, and $N$ local controllers, each formulated as a Markov Decision Process (MDP). The global controller, operating at a slower timescale, optimizes the infinite-horizon discounted cumulative reward under budget constraints. For the local controllers, operating at a faster timescale, we propose two different optimization frameworks, namely COpt and FOpt. In the COpt framework, the local controller also optimizes an infinite-horizon MDP, while in the FOpt framework, the local controller optimizes a finite-horizon MDP. The FOpt framework mimics a federal structure, where the local controllers have more autonomy in their decision making. First, the existence of stationary deterministic optimal policies for both these frameworks is established. Then, various relationships between the two frameworks are studied, including a bound on the difference between the two optimal value functions. Additionally, sufficient conditions are provided under which the two frameworks lead to the same optimal values.

Relevant ArXiv eess Papers - 2025-07-01

Unsupervised Learning-Based Joint Resource Allocation and Beamforming Design for RIS-Assisted MISO-OFDMA Systems

Authors: Yu Ma, Xingyu Zhou, Xiao Li, Le Liang, Shi Jin

Reconfigurable intelligent surfaces (RIS) are key enablers for 6G wireless systems. This paper studies downlink transmission in an RIS-assisted MISO-OFDMA system, addressing resource allocation challenges. A two-stage unsupervised learning-based framework is proposed to jointly design RIS phase shifts, BS beamforming, and resource block (RB) allocation. The framework includes BeamNet, which predicts RIS phase shifts from CSI, and AllocationNet, which allocates RBs using equivalent CSI derived from BeamNet outputs. Active beamforming is implemented via maximum ratio transmission and water-filling. To handle discrete constraints while ensuring differentiability, quantization and the Gumbel-softmax trick are adopted. A customized loss and phased training enhance performance under QoS constraints. Simulations show the method achieves 99.93% of the sum rate of the SCA baseline with only 0.036% of its runtime, and it remains robust across varying channel and user conditions.
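
The water-filling step named in the abstract can be sketched in a few lines. This is the classic textbook power allocation over parallel channels, not the paper's BeamNet or AllocationNet pipeline; the function name, the normalized channel gains, and the power budget are hypothetical.

```python
def water_filling(gains, total_power):
    """Classic water-filling over parallel channels.

    gains: positive channel gain-to-noise ratios, one per channel.
    total_power: positive power budget shared across channels.
    Returns per-channel powers p_i = max(0, mu - 1 / g_i) that sum to the budget.
    """
    inverse_gains = sorted(1.0 / g for g in gains)
    water_level = 0.0
    # Shrink the active set until the weakest active channel gets positive power.
    for k in range(len(inverse_gains), 0, -1):
        water_level = (total_power + sum(inverse_gains[:k])) / k
        if water_level > inverse_gains[k - 1]:
            break
    return [max(0.0, water_level - 1.0 / g) for g in gains]


# Hypothetical normalized gains for three resource blocks and a budget of 2.
powers = water_filling([1.0, 0.5, 0.1], total_power=2.0)
```

In this example the weakest channel is switched off entirely, which is the characteristic behavior of water-filling when the budget is small relative to the spread in channel quality.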

Real-Time Energy Management Strategies for Community Microgrids

Authors: Moslem Uddin, Huadong Mo, Daoyi Dong

This study presents a real-time energy management framework for hybrid community microgrids integrating photovoltaic, wind, battery energy storage systems, diesel generators, and grid interconnection. The proposed approach formulates the dispatch problem as a multi-objective optimization task that aims to minimize operational costs. Two control strategies are proposed and evaluated: a conventional rule-based control (RBC) method and an advanced deep reinforcement learning (DRL) approach utilizing proximal policy optimization (PPO). A realistic case study based on Australian load and generation profiles is used to validate the framework. Simulation results demonstrate that DRL-PPO reduces operational costs by 18% and CO2 emissions by 20%, and improves system reliability by 87.5% compared to RBC. In addition, DRL-PPO increases renewable energy utilization by 13%, effectively reducing dependence on diesel generation and grid imports. These findings demonstrate the potential of DRL-based approaches to enable cost-effective and resilient microgrid operations, particularly in regional and remote communities.

Power Flow Analysis of a 5-Bus Power System Based on Newton-Raphson Method

Authors: Sampson E. Nwachukwu

Load flow analysis is a fundamental technique used by electrical engineers to simulate and evaluate power system behavior under steady-state conditions. It enables efficient operation and control by determining how active and reactive power flows throughout the system. Selecting an appropriate solution method is critical to ensuring reliable and economical operation of power generation, transmission, and distribution networks. While the conventional loop method may be used in small-scale systems, it is limited by its reliance on impedance-based load data and its inability to scale to complex networks. In contrast, iterative techniques such as the Gauss-Seidel (GS) and Newton-Raphson (NR) methods are better suited for analyzing large systems. Of these, the NR method offers significant advantages due to its quadratic convergence and improved numerical stability. This study presents a power flow analysis of a 5-bus system using the Newton-Raphson approach. The system was modeled and simulated in PowerWorld Simulator (PWS), and a custom MATLAB implementation was developed to verify the results under a base case scenario. The comparative analysis demonstrates that the NR method provides accurate and robust solutions for power flow problems, making it well-suited for evaluating system performance under various operating conditions.
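
To make the Newton-Raphson iteration concrete, here is a minimal sketch for a two-bus system: a slack bus fixed at 1.0 p.u. and zero angle, feeding a PQ load bus through a lossless line. This is deliberately smaller than the 5-bus case in the abstract and does not reproduce the paper's MATLAB or PowerWorld models; the function name, line reactance, and load values are assumptions.

```python
import math


def solve_two_bus(p_load, q_load, x_line, tol=1e-10, max_iter=20):
    """Newton-Raphson power flow for one PQ bus behind a lossless line.

    All quantities are in per unit. Returns (theta, v) at the load bus.
    """
    b = 1.0 / x_line          # series susceptance magnitude of the line
    theta, v = 0.0, 1.0       # flat start
    for _ in range(max_iter):
        # Injected power at the load bus for the current estimate.
        p_calc = b * v * math.sin(theta)
        q_calc = b * v * v - b * v * math.cos(theta)
        # Mismatch = specified injection (negative load) minus calculated.
        dp = -p_load - p_calc
        dq = -q_load - q_calc
        if max(abs(dp), abs(dq)) < tol:
            return theta, v
        # Jacobian of (P, Q) with respect to (theta, v).
        j11 = b * v * math.cos(theta)
        j12 = b * math.sin(theta)
        j21 = b * v * math.sin(theta)
        j22 = 2.0 * b * v - b * math.cos(theta)
        det = j11 * j22 - j12 * j21
        # Solve the 2x2 linear system with Cramer's rule.
        theta += (dp * j22 - j12 * dq) / det
        v += (j11 * dq - j21 * dp) / det
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical case: 0.5 p.u. active and 0.2 p.u. reactive load, x = 0.1 p.u.
theta, v = solve_two_bus(p_load=0.5, q_load=0.2, x_line=0.1)
```

From a flat start the first update is already close to the solution, and the mismatch shrinks roughly quadratically afterward, which is the convergence property the abstract credits to the method.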

Power-Gas Infrastructure Planning under Weather-induced Supply and Demand Uncertainties

Authors: Rahman Khorramfar, Dharik Mallapragada, Saurabh Amin

Implementing economy-wide decarbonization strategies based on decarbonizing the power grid via variable renewable energy (VRE) expansion and electrification of end-uses requires new approaches for energy infrastructure planning that consider, among other factors, weather-induced uncertainty in demand and VRE supply. An energy planning model that fails to account for these uncertainties can hinder the intended transition efforts to a low-carbon grid and increase the risk of supply shortage, especially during extreme weather conditions. Here, we consider the generation and transmission expansion problem of joint power-gas infrastructure and operations planning under the uncertainty of both demand and renewable supply. We propose two distributionally robust optimization approaches based on moment (MDRO) and Wasserstein distance (WDRO) ambiguity sets to endogenize these uncertainties and account for the change in the underlying distribution of these parameters that is caused by climate change, among other factors. Furthermore, our model considers the risk-aversion of the energy planners in the modeling framework via the conditional value-at-risk (CVaR) metric. An equivalent mixed-integer linear programming (MILP) reformulation of both modeling frameworks is presented, and a computationally efficient approximation scheme to obtain near-optimal solutions is proposed. We demonstrate the resulting DRO planning models and solution strategy via a New England case study under different levels of end-use electrification and decarbonization targets. Our experiments systematically explore different modeling aspects and compare the DRO models with stochastic programming (SP) results.
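
The CVaR metric mentioned above can be computed on equally likely scenarios with the Rockafellar-Uryasev formulation, which is also the form that typically enters linear programming reformulations. The sketch below is a generic illustration, not the paper's planning model; the function name and the scenario costs are hypothetical.

```python
def empirical_cvar(losses, alpha):
    """Empirical conditional value-at-risk at confidence level alpha.

    Uses the Rockafellar-Uryasev form
        CVaR = min_t  t + E[max(0, L - t)] / (1 - alpha)
    over equally likely scenarios. The objective is piecewise linear and
    convex in t, so a minimizer is attained at one of the sample values.
    """
    n = len(losses)

    def objective(t):
        excess = sum(max(0.0, loss - t) for loss in losses)
        return t + excess / ((1.0 - alpha) * n)

    return min(objective(t) for t in losses)


# Hypothetical scenario costs; CVaR at 80% averages the worst 20% of them.
scenario_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cvar_80 = empirical_cvar(scenario_costs, alpha=0.8)
```

For these ten scenarios the result is the mean of the two largest costs, which shows how a higher alpha makes the planner weight the tail more heavily than the expected value would.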

Reliability Assessment of Power System Based on the Dichotomy Method

Authors: Wenjie Wan, Han Hu, Feiyu Chen, Xiaoyu Liu, Kequan Zhao

With the sustained increase in the scale of power systems, the number of states in the state space grows exponentially, and the reliability assessment of the power system faces enormous challenges. Traditional state-by-state assessment methods, such as state enumeration (SE) and Monte Carlo simulation (MCS) methods, have encountered performance bottlenecks in terms of efficiency and accuracy. In this paper, the Boolean lattice representation theory of the state space is studied, and a dichotomy method is proposed to efficiently partition the state space into disjoint sub-lattices with a relatively small number of optimal power flow (OPF) operations. Based on the lattice partition, the reliability indices of the entire space can be calculated lattice-by-lattice. In addition, along with the partitioning procedure, the calculated loss of load probability (LOLP) monotonically increases and rapidly tends to the analytic value within the designated error bound. Moreover, we design a customized Monte Carlo sampling method in lattices of interest to compute the expected energy not supplied (EENS). The experiments are conducted on the RBTS and RTS-79 systems. The results show that the proposed method achieves the analytic LOLP of the RBTS system after five hundred OPF operations, which is hundreds of times faster than traditional methods, and the designed Monte Carlo sampling method converges after thousands of OPF operations on the test systems.
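
For context on why state-by-state assessment scales poorly, the sketch below computes LOLP by brute-force state enumeration, the baseline the abstract contrasts with. It is not the proposed dichotomy method, and it replaces the OPF check with a simple capacity-versus-load comparison; the function name, unit capacities, and forced outage rates are hypothetical.

```python
from itertools import product


def lolp_by_enumeration(units, load):
    """Loss of load probability by enumerating all 2^n unit states.

    units: list of (capacity_mw, forced_outage_rate) pairs, assumed independent.
    load: demand in MW; a state loses load when available capacity < load.
    """
    lolp = 0.0
    for state in product((True, False), repeat=len(units)):
        probability = 1.0
        available = 0.0
        for (capacity, outage_rate), is_up in zip(units, state):
            probability *= (1.0 - outage_rate) if is_up else outage_rate
            if is_up:
                available += capacity
        if available < load:
            lolp += probability
    return lolp


# Hypothetical system: two 50 MW units, each with a 10% forced outage rate.
lolp = lolp_by_enumeration([(50.0, 0.1), (50.0, 0.1)], load=60.0)
```

The loop visits every one of the 2^n states, so doubling the number of components squares the work, which is the exponential growth that motivates partitioning the state space into sub-lattices instead.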

Energy-Aware Model Predictive Control for Batch Manufacturing System Scheduling Under Different Electricity Pricing Strategies

Authors: Hongliang Li, Herschel C. Pangborn, Ilya Kovalenko

Manufacturing industries are among the highest energy-consuming sectors, facing increasing pressure to reduce energy costs. This paper presents an energy-aware Model Predictive Control (MPC) framework to dynamically schedule manufacturing processes in response to time-varying electricity prices without compromising production goals or violating production constraints. A network-based manufacturing system model is developed to capture complex material flows, batch processing, and capacities of buffers and machines. The scheduling problem is formulated as a Mixed-Integer Quadratic Program (MIQP) that balances energy costs, buffer levels, and production requirements. A case study evaluates the proposed MPC framework under four industrial electricity pricing schemes. Numerical results demonstrate that the approach reduces energy usage expenses while satisfying production goals and adhering to production constraints. The findings highlight the importance of considering the detailed electricity cost structure in manufacturing scheduling decisions and provide practical insights for manufacturers when selecting among different electricity pricing strategies.

A Reinforcement Learning Approach for Optimal Control in Microgrids

Authors: Davide Salaorni, Federico Bianchi, Francesco Trovò, Marcello Restelli

The increasing integration of renewable energy sources (RESs) is transforming traditional power grid networks, which require new approaches for managing decentralized energy production and consumption. Microgrids (MGs) provide a promising solution by enabling localized control over energy generation, storage, and distribution. This paper presents a novel reinforcement learning (RL)-based methodology for optimizing microgrid energy management. Specifically, we propose an RL agent that learns optimal energy trading and storage policies by leveraging historical data on energy production, consumption, and market prices. A digital twin (DT) is used to simulate the energy storage system dynamics, incorporating degradation factors to ensure a realistic emulation of the analysed setting. Our approach is validated through an experimental campaign using real-world data from a power grid located in the Italian territory. The results indicate that the proposed RL-based strategy outperforms rule-based methods and existing RL benchmarks, offering a robust solution for intelligent microgrid management.

VisionScores -- A system-segmented image score dataset for deep learning tasks

Authors: Alejandro Romero Amezcua, Mariano José Juan Rivera Meraz

VisionScores is the first system-segmented image score dataset, aiming to offer structure-rich, high information-density images for machine and deep learning tasks. Restricted to two-handed piano pieces, it was built to consider not only a certain graphic similarity but also composition patterns, as this creative process is highly instrument-dependent. It provides two scenarios in relation to composer and composition type. The first, formed by 14k samples, considers works from different authors but the same composition type, specifically Sonatinas. The second, consisting of 10.8k samples, presents the opposite case: various composition types from the same author, namely Franz Liszt. All of the 24.8k samples are formatted as grayscale JPG images of $128 \times 512$ pixels. VisionScores provides users with not only the formatted samples but also the systems' order and the pieces' metadata. Moreover, unsegmented full-page scores and the pre-formatted images are included for further analysis.

Fragile, Robust, and Antifragile: A Perspective from Parameter Responses in Reinforcement Learning Under Stress

Authors: Zain ul Abdeen, Ming Jin

This paper explores reinforcement learning (RL) policy robustness by systematically analyzing network parameters under internal and external stresses. Inspired by synaptic plasticity in neuroscience, synaptic filtering introduces internal stress by selectively perturbing parameters, while adversarial attacks apply external stress through modified agent observations. This dual approach enables the classification of parameters as fragile, robust, or antifragile, based on their influence on policy performance in clean and adversarial settings. Parameter scores are defined to quantify these characteristics, and the framework is validated on PPO-trained agents in MuJoCo continuous control environments. The results highlight the presence of antifragile parameters that enhance policy performance under stress, demonstrating the potential of targeted filtering techniques to improve RL policy adaptability. These insights provide a foundation for future advancements in the design of robust and antifragile RL systems.

AURA: Agent for Understanding, Reasoning, and Automated Tool Use in Voice-Driven Tasks

Authors: Leander Melroy Maben, Gayathri Ganesh Lakshmy, Srijith Radhakrishnan, Siddhant Arora, Shinji Watanabe

Despite advances in language and speech technologies, no open-source system enables full speech-to-speech, multi-turn dialogue with integrated tool use and agentic reasoning. We introduce AURA (Agent for Understanding, Reasoning, and Automated Tool Use), the first open-source, speech-native assistant capable of completing complex, goal-driven tasks through dynamic tool invocation and multi-turn conversation. AURA combines open-weight ASR, TTS, and LLMs in a cascaded pipeline and supports tools such as calendar booking, contact lookup, web search, and email. Its modular design allows easy integration of new tools using natural language prompts and action classes. On VoiceBench, AURA scores 92.75% on OpenBookQA (outperforming all open-weight systems and nearing GPT-4o) and 4.39 on AlpacaEval, competitive with other open-weight systems. Human evaluation shows 90% task success on complex, multi-turn speech tasks.

Flexible Intelligent Metasurface for Enhancing Multi-Target Wireless Sensing

Authors: Zihao Teng, Jiancheng An, Lu Gan, Naofal Al-Dhahir, Zhu Han

Flexible intelligent metasurface (FIM) has emerged as a transformative technology to enhance wireless sensing by dynamically morphing its three-dimensional (3D) surface shape and electromagnetic response. Unlike conventional rigid arrays, an FIM consists of low-cost radiating elements that can independently adjust their positions and radiation characteristics, thereby allowing for real-time optimization of the sensing environment. This paper investigates the impact of FIM on wireless sensing performance. Specifically, we focus on the maximization of the cumulated power of the probing signals at the target locations under the per-antenna power constraint by jointly optimizing the transmit covariance matrix and the surface shape of the transmitting FIM. We propose a block coordinate descent (BCD) algorithm to find a locally optimal solution by alternately updating the FIM surface shape and the transmit covariance matrix, while keeping the other one fixed at each step. Furthermore, we analyze the computational complexity and convergence properties of the proposed algorithm and demonstrate that FIM enhances wireless sensing by providing a new design degree-of-freedom to coordinate the correlation between steering vectors at different angles. Numerical results demonstrate that FIM significantly improves wireless sensing performance under the considered multi-target scenario.

CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding

Authors: Yuchen Zhou, Jiamin Wu, Zichen Ren, Zhouheng Yao, Weiheng Lu, Kunyu Peng, Qihao Zheng, Chunfeng Song, Wanli Ouyang, Chao Gou

Understanding and decoding brain activity from electroencephalography (EEG) signals is a fundamental challenge in neuroscience and AI, with applications in cognition, emotion recognition, diagnosis, and brain-computer interfaces. While recent EEG foundation models advance generalized decoding via unified architectures and large-scale pretraining, they adopt a scale-agnostic dense modeling paradigm inherited from NLP and vision. This design neglects a core property of neural activity: cross-scale spatiotemporal structure. EEG task patterns span a wide range of temporal and spatial scales, from short bursts to slow rhythms, and from localized cortical responses to distributed interactions. Ignoring this diversity leads to suboptimal representations and weak generalization. We propose CSBrain, a Cross-scale Spatiotemporal Brain foundation model for generalized EEG decoding. CSBrain introduces: (i) Cross-scale Spatiotemporal Tokenization (CST), which aggregates multi-scale features from localized temporal windows and anatomical brain regions into compact scale-aware tokens; and (ii) Structured Sparse Attention (SSA), which captures cross-window and cross-region dependencies, enhancing scale diversity while removing spurious correlations. CST and SSA are alternately stacked to progressively integrate multi-scale dependencies. Experiments on 11 EEG tasks across 16 datasets show that CSBrain consistently outperforms task-specific and foundation model baselines. These results establish cross-scale modeling as a key inductive bias and position CSBrain as a robust backbone for future brain-AI research.

The Florence Price Art Song Dataset and Piano Accompaniment Generator

Authors: Tao-Tao He, Martin E. Malandro, Douglas Shadle

Florence B. Price was a composer in the early 20th century whose music reflects her upbringing in the American South, her African heritage, and her Western classical training. She is noted as the first African-American woman to have a symphony performed by a major orchestra. Her music has recently received renewed attention from both the public and the research community, decades after her death. In addition to other genres, Price was a prolific composer for solo voice and piano. Music historians have documented the existence of 134 art songs and piano/voice arrangements for spirituals and folk songs written by Price. We release a digital catalog of 112 of these works in MuseScore, MusicXML, MIDI, and PDF format. We also use this dataset to fine-tune a symbolic music generation model to generate accompaniments to melodies, and we conduct a blind listening experiment that shows that accompaniments generated by our model are perceived as being reflective of Florence Price's style more frequently than accompaniments generated by a baseline model. We release our model as the Florence Price Piano Accompaniment Generator alongside our dataset.

External Data-Enhanced Meta-Representation for Adaptive Probabilistic Load Forecasting

Authors: Haoran Li, Muhao Guo, Marija Ilic, Yang Weng, Guangchun Ruan

Accurate residential load forecasting is critical for power system reliability with rising renewable integration and demand-side flexibility. However, most statistical and machine learning models treat external factors, such as weather, calendar effects, and pricing, as extra input, ignoring their heterogeneity and thus limiting the extraction of useful external information. We propose a paradigm shift: external data should serve as meta-knowledge to dynamically adapt the forecasting model itself. Based on this idea, we design a meta-representation framework using hypernetworks that modulate selected parameters of a base Deep Learning (DL) model in response to external conditions. This provides both expressivity and adaptability. We further integrate a Mixture-of-Experts (MoE) mechanism to enhance efficiency through selective expert activation, while improving robustness by filtering redundant external inputs. The resulting model, dubbed Meta Mixture of Experts for External data (M2oE2), achieves substantial improvements in accuracy and robustness with limited additional overhead, outperforming existing state-of-the-art methods on diverse load datasets. The dataset and source code are publicly available at this https URL.

Nuisance parameters and elliptically symmetric distributions: a geometric approach to parametric and semiparametric efficiency

Authors: Stefano Fortunati, Jean-Pierre Delmas, Esa Ollila

Elliptically symmetric distributions are a classic example of a semiparametric model where the location vector and the scatter matrix (or a parameterization of them) are the two finite-dimensional parameters of interest, while the density generator represents an infinite-dimensional nuisance term. This basic representation of the elliptic model can be made more accurate, rich, and flexible by considering additional finite-dimensional nuisance parameters. Our aim is therefore to investigate the deep and counter-intuitive links between statistical efficiency in estimating the parameters of interest in the presence of both finite and infinite-dimensional nuisance parameters. Unlike previous works that addressed this problem using Le Cam's asymptotic theory, our approach here is purely geometric: efficiency will be analyzed using tools such as projections and tangent spaces embedded in the relevant Hilbert space. This allows us to obtain original results also for the case where the location vector and the scatter matrix are parameterized by a finite-dimensional vector that can be partitioned in two sub-vectors: one containing the parameters of interest and the other containing the nuisance parameters. As an example, we illustrate how the obtained results can be applied to the well-known "low-rank" parameterization. Furthermore, while the theoretical analysis will be developed for Real Elliptically Symmetric (RES) distributions, we show how to extend our results to the case of Circular and Non-Circular Complex Elliptically Symmetric (C-CES and NC-CES) distributions.

PixelBoost: Leveraging Brownian Motion for Realistic-Image Super-Resolution

Authors: Aradhana Mishra, Bumshik Lee

Diffusion-model-based image super-resolution techniques often face a trade-off between realistic image generation and computational efficiency. This issue is exacerbated when inference time is reduced by decreasing sampling steps, resulting in less realistic and hazy images. To overcome this challenge, we introduce a novel diffusion model named PixelBoost that underscores the significance of embracing the stochastic nature of Brownian motion in advancing image super-resolution, resulting in a high degree of realism, particularly focusing on texture and edge definitions. By integrating controlled stochasticity into the training regimen, our proposed model avoids convergence to local optima, effectively capturing and reproducing the inherent uncertainty of image textures and patterns. Our proposed model demonstrates superior objective results in terms of learned perceptual image patch similarity (LPIPS), lightness order error (LOE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), as well as visual quality. To determine the edge enhancement, we evaluated the gradient magnitude and pixel value, and our proposed model exhibited a better edge reconstruction capability. Additionally, our model demonstrates adaptive learning capabilities by effectively adjusting to Brownian noise patterns and introduces a sigmoidal noise sequencing method that simplifies training, resulting in faster inference speeds.

Parallax QAMA: Novel Downlink Multiple Access for MISO Systems with Simple Receivers

Authors: Jie Huang, Ming Zhao, Shengli Zhou, Ling Qiu, Jinkang Zhu

In this paper, we propose a novel downlink multiple access system with a multi-antenna transmitter and two single-antenna receivers, inspired by the underlying principles of hierarchical quadrature amplitude modulation (H-QAM) based multiple access (QAMA) and space-division multiple access (SDMA). In the proposed scheme, coded bits from two users are split and assigned to one shared symbol and two private symbols carried by different beams. Based on joint symbol mapping of H-QAM constellations and phase-aligned precoding at the transmitter, each receiver observes a different H-QAM constellation with Gray mapping, a unique parallax feature not shared by existing schemes. In addition to avoiding successive interference cancellation (SIC), each user independently demodulates its own bits on separate I and Q branches with calculations based on closed-form expressions. Hence the receiver complexity is on par with that of orthogonal multiple access (OMA), which is much lower than that in other competing alternatives such as non-orthogonal multiple access (NOMA) and rate-splitting multiple access (RSMA). We carry out system optimization and determine the achievable rate region. Numerical results show that the proposed system has a larger rate region relative to other benchmark schemes with receivers not using SIC, and even achieves a comparable rate region to those benchmark schemes with SIC receivers.

XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs

Authors: Yitian Gong, Luozhijie Jin, Ruifan Deng, Dong Zhang, Xin Zhang, Qinyuan Cheng, Zhaoye Fei, Shimin Li, Xipeng Qiu

Speech codecs serve as bridges between speech signals and large language models. An ideal codec for speech language models should not only preserve acoustic information but also capture rich semantic information. However, existing speech codecs struggle to balance high-quality audio reconstruction with ease of modeling by language models. In this study, we analyze the limitations of previous codecs in balancing semantic richness and acoustic fidelity. We propose XY-Tokenizer, a novel codec that mitigates the conflict between semantic and acoustic capabilities through multi-stage, multi-task learning. Experimental results demonstrate that XY-Tokenizer achieves performance in both semantic and acoustic tasks comparable to that of state-of-the-art codecs operating at similar bitrates, even though those existing codecs typically excel in only one aspect. Specifically, XY-Tokenizer achieves strong text alignment, surpassing distillation-based semantic modeling methods such as SpeechTokenizer and Mimi, while maintaining a speaker similarity score of 0.83 between reconstructed and original audio. The reconstruction performance of XY-Tokenizer is comparable to that of BigCodec, the current state-of-the-art among acoustic-only codecs, which achieves a speaker similarity score of 0.84 at a similar bitrate. Code and models are available at this https URL.

Safe and Performant Deployment of Autonomous Systems via Model Predictive Control and Hamilton-Jacobi Reachability Analysis

Authors: Hao Wang, Armand Jordana, Ludovic Righetti, Somil Bansal

While we have made significant algorithmic developments to enable autonomous systems to perform sophisticated tasks, it remains difficult for them to perform tasks effectively and safely. Most existing approaches either fail to provide any safety assurances or substantially compromise task performance for safety. In this work, we develop a framework, based on model predictive control (MPC) and Hamilton-Jacobi (HJ) reachability, to optimize task performance for autonomous systems while respecting the safety constraints. Our framework guarantees recursive feasibility for the MPC controller, and it is scalable to high-dimensional systems. We demonstrate the effectiveness of our framework with two simulation studies using a 4D Dubins Car and a 6-DoF Kuka iiwa manipulator, and the experiments show that our framework significantly improves the safety constraint satisfaction of the systems over the baselines.

Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement

Authors: Siyuan Chai, Xiaodong Guo, Tong Liu

Infrared imaging helps improve the perception capabilities of autonomous driving in complex weather conditions such as fog, rain, and low light. However, infrared images often suffer from low contrast, especially for non-heat-emitting targets like bicycles, which significantly affects the performance of downstream high-level vision tasks. Furthermore, achieving contrast enhancement without amplifying noise and losing important information remains a challenge. To address these challenges, we propose a task-oriented infrared image enhancement method. Our approach consists of two key components: layer decomposition and saliency information extraction. First, we design a layer decomposition method for infrared images, which enhances scene details while preserving dark region features, providing more features for subsequent saliency information extraction. Then, we propose a morphological reconstruction-based saliency extraction method that effectively extracts and enhances target information without amplifying noise. Our method improves the image quality for object detection and semantic segmentation tasks. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods.

You Sound a Little Tense: L2 Tailored Clear TTS Using Durational Vowel Properties

Authors: Paige Tuttösí, H. Henny Yeung, Yue Wang, Jean-Julien Aucouturier, Angelica Lim

We present the first text-to-speech (TTS) system tailored to second language (L2) speakers. We use duration differences between American English tense (longer) and lax (shorter) vowels to create a "clarity mode" for Matcha-TTS. Our perception studies showed that French-L1, English-L2 listeners had fewer (at least 9.15%) transcription errors when using our clarity mode, and found it more encouraging and respectful than overall slowed down speech. Remarkably, listeners were not aware of these effects: despite the decreased word error rate in clarity mode, listeners still believed that slowing all target words was the most intelligible, suggesting that actual intelligibility does not correlate with perceived intelligibility. Additionally, we found that Whisper-ASR did not use the same cues as L2 speakers to differentiate difficult vowels and is not sufficient to assess the intelligibility of TTS systems for these individuals.

A Model Predictive Control Framework to Enhance Safety and Quality in Mobile Additive Manufacturing Systems

Authors: Yifei Li, Joshua A. Robbins, Guha Manogharan, Herschel C. Pangborn, Ilya Kovalenko

In recent years, the demand for customized, on-demand production has grown in the manufacturing sector. Additive Manufacturing (AM) has emerged as a promising technology to enhance customization capabilities, enabling greater flexibility, reduced lead times, and more efficient material usage. However, traditional AM systems remain constrained by static setups and human worker dependencies, resulting in long lead times and limited scalability. Mobile robots can improve the flexibility of production systems by transporting products to designated locations in a dynamic environment. By integrating AM systems with mobile robots, manufacturers can optimize travel time for preparatory tasks and distributed printing operations. Mobile AM robots have been deployed for on-site production of large-scale structures, but often neglect critical print quality metrics like surface roughness. Additionally, these systems do not have the precision necessary for producing small, intricate components. We propose a model predictive control framework for a mobile AM platform that ensures safe navigation on the plant floor while maintaining high print quality in a dynamic environment. Three case studies are used to test the feasibility and reliability of the proposed systems.

From Large-scale Audio Tagging to Real-Time Explainable Emergency Vehicle Sirens Detection

Authors: Stefano Giacomelli, Marco Giordano, Claudia Rinaldi, Fabio Graziosi

Accurate recognition of Emergency Vehicle (EV) sirens is critical for the integration of intelligent transportation systems, smart city monitoring systems, and autonomous driving technologies. Modern automatic solutions are limited by the lack of large-scale, curated datasets and by the computational demands of state-of-the-art sound event detection models. This work introduces E2PANNs (Efficient Emergency Pre-trained Audio Neural Networks), a lightweight Convolutional Neural Network architecture derived from the PANNs framework, specifically optimized for binary EV siren detection. Leveraging our dedicated subset of AudioSet (AudioSet EV), we fine-tune and evaluate E2PANNs across multiple reference datasets and test its viability on embedded hardware. The experimental campaign includes ablation studies, cross-domain benchmarking, and real-time inference deployment on an edge device. Interpretability analyses exploiting Guided Backpropagation and ScoreCAM algorithms provide insights into the model's internal representations and validate its ability to capture distinct spectrotemporal patterns associated with different types of EV sirens. Real-time performance is assessed through frame-wise and event-based detection metrics, as well as a detailed analysis of false positive activations. Results demonstrate that E2PANNs establish a new state of the art in this research domain, with high computational efficiency and suitability for edge-based audio monitoring and safety-critical applications.

Evaluation of Geolocation Capabilities of Multimodal Large Language Models and Analysis of Associated Privacy Risks

Authors: Xian Zhang, Xiang Cheng

Objectives: The rapid advancement of Multimodal Large Language Models (MLLMs) has significantly enhanced their reasoning capabilities, enabling a wide range of intelligent applications. However, these advancements also raise critical concerns regarding privacy and ethics. MLLMs are now capable of inferring the geographic location of images -- such as those shared on social media or captured from street views -- based solely on visual content, thereby posing serious risks of privacy invasion, including doxxing, surveillance, and other security threats. Methods: This study provides a comprehensive analysis of existing geolocation techniques based on MLLMs. It systematically reviews relevant literature and evaluates the performance of state-of-the-art visual reasoning models on geolocation tasks, particularly in identifying the origins of street view imagery. Results: Empirical evaluation reveals that the most advanced visual large models can successfully localize the origin of street-level imagery with up to 49% accuracy within a 1-kilometer radius. This performance underscores the models' powerful capacity to extract and utilize fine-grained geographic cues from visual data. Conclusions: Building on these findings, the study identifies key visual elements that contribute to successful geolocation, such as text, architectural styles, and environmental features. Furthermore, it discusses the potential privacy implications associated with MLLM-enabled geolocation and several technical and policy-based countermeasures to mitigate associated risks. Our code and dataset are available at this https URL.

TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity

Authors: Yuzhuo Chen, Zehua Ma, Han Fang, Weiming Zhang, Nenghai Yu

AI-generated content (AIGC) enables efficient visual creation but raises copyright and authenticity risks. As a common technique for integrity verification and source tracing, digital image watermarking is regarded as a potential solution to the above issues. Among these, watermarking methods capable of preserving the generation quality are receiving increased attention. However, the proliferation and high performance of generative image editing applications have elevated the risks of malicious tampering, creating new demands. 1) The tamper robustness of current lossless visual quality watermarks remains constrained by the modification-sensitive diffusion inversion process, necessitating enhanced robustness. 2) The improved tampering quality and rapid iteration cycles render passive tampering detection methods inadequate, making proactive tampering localization capability a desired feature for watermarks. To address these requirements, this paper proposes a Tamper-Aware Generative image WaterMarking method named TAG-WM. The proposed method comprises four key modules: a dual-mark joint sampling (DMJS) algorithm for embedding copyright and localization watermarks into the latent space while preserving generative quality, the watermark latent reconstruction (WLR) utilizing reversed DMJS, a dense variation region detector (DVRD) leveraging diffusion inversion sensitivity to identify tampered areas via statistical deviation analysis, and the tamper-aware decoding (TAD) guided by localization results. The experimental results indicate that TAG-WM achieves SOTA tampering robustness and tampering localization capability under distortions while maintaining lossless generation quality and a considerable capacity of 256 bits.

Securing the Sky: Integrated Satellite-UAV Physical Layer Security for Low-Altitude Wireless Networks

Authors: Jiahui Li, Geng Sun, Xiaoyu Sun, Fang Mei, Jingjing Wang, Xiangwang Hou, Daxin Tian, Victor C. M. Leung

Low-altitude wireless networks (LAWNs) have garnered significant attention in the forthcoming 6G networks. In LAWNs, satellites with wide coverage and unmanned aerial vehicles (UAVs) with flexible mobility can complement each other to form integrated satellite-UAV networks, providing ubiquitous and high-speed connectivity for low-altitude operations. However, the higher line-of-sight probability in low-altitude airspace increases transmission security concerns. In this work, we present a collaborative beamforming-based physical layer security scheme for LAWNs. We introduce the fundamental aspects of integrated satellite-UAV networks, physical layer security, UAV swarms, and collaborative beamforming for LAWN applications. Following this, we highlight several opportunities for collaborative UAV swarm secure applications enabled by satellite networks, including achieving physical layer security in scenarios involving data dissemination, data relay, eavesdropper collusion, and imperfect eavesdropper information. Next, we detail two case studies: a secure relay system and a two-way aerial secure communication framework specifically designed for LAWN environments. Simulation results demonstrate that these physical layer security schemes are effective and beneficial for secure low-altitude wireless communications. A short practicality analysis shows that the proposed method is applicable to LAWN scenarios. Finally, we discuss current challenges and future research directions for enhancing security in LAWNs.

JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching

Authors: Mingi Kwon, Joonghyuk Shin, Jaeseok Jung, Jaesik Park, Youngjung Uh

The intrinsic link between facial motion and speech is often overlooked in generative modeling, where talking head synthesis and text-to-speech (TTS) are typically addressed as separate tasks. This paper introduces JAM-Flow, a unified framework to simultaneously synthesize and condition on both facial motion and speech. Our approach leverages flow matching and a novel Multi-Modal Diffusion Transformer (MM-DiT) architecture, integrating specialized Motion-DiT and Audio-DiT modules. These are coupled via selective joint attention layers and incorporate key architectural choices, such as temporally aligned positional embeddings and localized joint attention masking, to enable effective cross-modal interaction while preserving modality-specific strengths. Trained with an inpainting-style objective, JAM-Flow supports a wide array of conditioning inputs, including text, reference audio, and reference motion, facilitating tasks such as synchronized talking head generation from text, audio-driven animation, and much more, within a single, coherent model. JAM-Flow significantly advances multi-modal generative modeling by providing a practical solution for holistic audio-visual synthesis. Project page: this https URL

Tensor Train Quantum State Tomography using Compressed Sensing

Authors: Shakir Showkat Sofi, Charlotte Vermeylen, Lieven De Lathauwer

Quantum state tomography (QST) is a fundamental technique for estimating the state of a quantum system from measured data and plays a crucial role in evaluating the performance of quantum devices. However, standard estimation methods become impractical due to the exponential growth of parameters in the state representation. In this work, we address this challenge by parameterizing the state using a low-rank block tensor train decomposition and demonstrate that our approach is both memory- and computationally efficient. This framework applies to a broad class of quantum states that can be well approximated by low-rank decompositions, including pure states, nearly pure states, and ground states of Hamiltonians.

Alleviating CoD in Renewable Energy Profile Clustering Using an Optical Quantum Computer

Authors: Chengjun Liu, Yijun Xu, Wei Gu, Bo Sun, Kai Wen, Shuai Lu, Lamine Mili

The traditional clustering problem of renewable energy profiles is typically formulated as a combinatorial optimization that suffers from the Curse of Dimensionality (CoD) on classical computers. To address this issue, this paper first proposes a kernel-based quantum clustering method. More specifically, the kernel-based similarity between profiles with minimal intra-group distance is encoded into the ground state of the Hamiltonian in the form of an Ising model. Then, this NP-hard problem can be reformulated into a Quadratic Unconstrained Binary Optimization (QUBO), which a Coherent Ising Machine (CIM) can naturally solve with significant improvement over classical computers. The test results from a real optical quantum computer verify the validity of the proposed method. They also demonstrate its ability to address CoD in an NP-hard clustering problem.
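
To make the QUBO reformulation concrete, the following is a minimal sketch for the two-cluster case, not the paper's formulation. It assumes plain pairwise distances rather than the kernel-based similarity, and an exhaustive search stands in for the Coherent Ising Machine. The names build_qubo, qubo_energy, and brute_force_solve are hypothetical. With binary x_i as the cluster label, the intra-cluster cost sum_{i<j} d_ij (1 - x_i - x_j + 2 x_i x_j) is quadratic in x, which is what makes it a QUBO.

```python
from itertools import product


def build_qubo(dist):
    """Build an upper-triangular QUBO matrix for two-cluster partitioning.

    Assumption for illustration: dist is a symmetric matrix of plain
    pairwise distances, not the kernel-based similarity of the abstract.
    """
    n = len(dist)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Linear term sits on the diagonal, since x_i * x_i == x_i for binaries.
        Q[i][i] = -sum(dist[i][j] for j in range(n) if j != i)
        for j in range(i + 1, n):
            Q[i][j] = 2.0 * dist[i][j]
    # Constant term dropped from Q; add it back to recover the true cost.
    offset = sum(dist[i][j] for i in range(n) for j in range(i + 1, n))
    return Q, offset


def qubo_energy(Q, x):
    """Evaluate x^T Q x for an upper-triangular Q and binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_solve(Q):
    """Exhaustive search over all 2^n assignments (stand-in for a CIM)."""
    return min(product((0, 1), repeat=len(Q)), key=lambda x: qubo_energy(Q, x))


# Four toy one-dimensional "profiles": two near 0 and two near 5.
values = [0.0, 0.1, 5.0, 5.2]
dist = [[abs(a - b) for b in values] for a in values]
Q, offset = build_qubo(dist)
labels = brute_force_solve(Q)
intra_cluster_cost = qubo_energy(Q, labels) + offset
```

The exhaustive search grows as 2^n, which is the scaling problem the abstract attributes to classical solvers; it is only usable here because the toy instance has four profiles.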

RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio

Authors: Yusuke Kanamori, Yuki Okamoto, Taisei Takano, Shinnosuke Takamichi, Yuki Saito, Hiroshi Saruwatari

In text-to-audio (TTA) research, the relevance between input text and output audio is an important evaluation aspect. Traditionally, it has been evaluated from both subjective and objective perspectives. However, subjective evaluation is costly in terms of money and time, and objective evaluation is unclear regarding the correlation to subjective evaluation scores. In this study, we construct RELATE, an open-sourced dataset that subjectively evaluates the relevance. Also, we benchmark a model for automatically predicting the subjective evaluation score from synthesized audio. Our model outperforms a conventional CLAPScore model, and that trend extends to many sound categories.

Towards Universal Shared Control in Teleoperation Without Haptic Feedback

Authors: Max Grobbel, Tristan Schneider, Sören Hohmann

Teleoperation with non-haptic VR controllers deprives human operators of critical motion feedback. We address this by embedding a multi-objective optimization problem that converts user input into collision-free UR5e joint trajectories while actively suppressing liquid slosh in a glass. The controller maintains 13 ms average planning latency, confirming real-time performance and motivating the augmentation of this teleoperation approach to further objectives.

Efficient Interleaved Speech Modeling through Knowledge Distillation

Authors: Mohammadmahdi Nouriborji, Morteza Rohanian

Current speech language models exceed the size and latency constraints of many deployment environments. We build compact, expressive speech generation models through layer-aligned distillation, matching hidden states, attention maps, and softened logits to compress large multimodal transformers by 3x with minimal loss in performance. We introduce TinyWave, a family of 2B-parameter models for speech-to-speech and interleaved speech-text generation, trained on 50,000 hours of public audio. TinyWave supports (i) speech-only generation using phonetic or expressive tokens and (ii) mixed speech-text continuations. Evaluation on Libri-Light shows TinyWave within 1.4 normalized perplexity points of its teacher. Accuracy on spoken StoryCloze and SALMon reaches 93-97% of the teacher's performance, outperforming size-matched baselines. These models are optimized for deployment on commodity hardware, enabling applications in real-time conversational agents, assistive technologies, and low-resource environments. We release models, training code, and evaluation scripts to support reproducible research on compact, expressive speech generation.
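
As an illustration of the softened-logits term only, here is a minimal sketch of temperature-scaled distillation. It is a generic formulation, not the TinyWave training code, and it omits the hidden-state and attention-map matching the abstract also mentions. The function names, the default temperature of 2.0, and the T^2 scaling are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The temperature**2 factor is a common convention that keeps gradient
    magnitudes comparable across temperatures; it is an assumption here.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


teacher = [2.0, 0.5, -1.0]
loss_matched = distillation_loss(teacher, teacher)
loss_mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher)
```

A higher temperature flattens both distributions, so the student is also penalized for mismatching the teacher's low-probability tokens rather than only its top choice.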

How Long Can I Transmit? A Mobility Aware mmWave-based UAV Communication Framework

Authors: Shawon Mitra, Subhojit Sarkar, Sasthi C. Ghosh

One primary focus of next-generation wireless communication networks is the millimeter-wave (mmWave) spectrum, typically considered in the 30 GHz to 300 GHz frequency range. Despite their promise of high data rates, mmWaves suffer from severe attenuation while passing through obstacles. Unmanned aerial vehicles (UAVs) have been proposed to offset this limitation on account of their additional degrees of freedom, which can be leveraged to provide line-of-sight (LoS) transmission paths. While some prior works have proposed analytical frameworks to compute the LoS probability for static ground users and a UAV, the same is lacking for mobile users on the ground. In this paper, we consider the popular Manhattan point line process (MPLP) to model an urban environment, within which a ground user moves with a known velocity for a small time interval along the roads. We derive an expression for the expected duration of LoS between a static UAV in the air and a mobile ground user, and validate the same through simulations. To demonstrate the efficacy of the proposed analysis, we propose a simple user association algorithm that greedily assigns the UAVs to users with the highest expected LoS time, and show that it outperforms the existing benchmark schemes that assign the users to the nearest UAVs with LoS without considering the user mobility.

Data-Driven Predictive Planning and Control for Aerial 3D Inspection with Back-face Elimination

Authors: Savvas Papaioannou, Panayiotis Kolios, Christos G. Panayiotou, Marios M. Polycarpou

Automated inspection with Unmanned Aerial Systems (UASs) is a transformative capability set to revolutionize various application domains. However, this task is inherently complex, as it demands the seamless integration of perception, planning, and control which existing approaches often treat separately. Moreover, it requires accurate long-horizon planning to predict action sequences, in contrast to many current techniques, which tend to be myopic. To overcome these limitations, we propose a 3D inspection approach that unifies perception, planning, and control within a single data-driven predictive control framework. Unlike traditional methods that rely on known UAS dynamic models, our approach requires only input-output data, making it easily applicable to off-the-shelf black-box UASs. Our method incorporates back-face elimination, a visibility determination technique from 3D computer graphics, directly into the control loop, thereby enabling the online generation of accurate, long-horizon 3D inspection trajectories.
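
Back-face elimination itself is a standard graphics test, and a minimal sketch of it is given below. It shows only the geometric check, not how the paper embeds it in the predictive control loop. It assumes triangles with counter-clockwise vertex order when seen from outside, and the helper names are hypothetical.

```python
def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def is_front_facing(triangle, camera):
    """True when the triangle's outward normal points toward the camera.

    Assumes counter-clockwise vertex order as seen from outside the object.
    """
    v0, v1, v2 = triangle
    normal = cross(sub(v1, v0), sub(v2, v0))
    return dot(normal, sub(camera, v0)) > 0


def visible_faces(triangles, camera):
    """Keep only the faces that survive back-face elimination."""
    return [tri for tri in triangles if is_front_facing(tri, camera)]


# A triangle in the z = 0 plane whose normal points along +z.
face = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
seen_from_above = is_front_facing(face, (0.0, 0.0, 5.0))
seen_from_below = is_front_facing(face, (0.0, 0.0, -5.0))
```

The test only rejects faces oriented away from the viewpoint; it does not account for occlusion by other faces or for sensor range and field of view.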

Scaling Self-Supervised Representation Learning for Symbolic Piano Performance

Authors: Louis Bradshaw, Honglu Fan, Alexander Spangher, Stella Biderman, Simon Colton

We study the capabilities of generative autoregressive transformer models trained on large amounts of symbolic solo-piano transcriptions. After first pretraining on approximately 60,000 hours of music, we use a comparatively smaller, high-quality subset, to finetune models to produce musical continuations, perform symbolic classification tasks, and produce general-purpose contrastive MIDI embeddings by adapting the SimCLR framework to symbolic music. When evaluating piano continuation coherence, our generative model outperforms leading symbolic generation techniques and remains competitive with proprietary audio generation models. On MIR classification benchmarks, frozen representations from our contrastive model achieve state-of-the-art results in linear probe experiments, while direct finetuning demonstrates the generalizability of pretrained representations, often requiring only a few hundred labeled examples to specialize to downstream tasks.

Emergent musical properties of a transformer under contrastive self-supervised learning

Authors: Yuexuan Kong, Gabriel Meseguer-Brocal, Vincent Lostanlen, Mathieu Lagrange, Romain Hennequin

In music information retrieval (MIR), contrastive self-supervised learning for general-purpose representation models is effective for global tasks such as automatic tagging. However, for local tasks such as chord estimation, it is widely assumed that contrastively trained general-purpose self-supervised models are inadequate and that more sophisticated SSL is necessary; e.g., masked modeling. Our paper challenges this assumption by revealing the potential of contrastive SSL paired with a transformer in local MIR tasks. We consider a lightweight vision transformer with one-dimensional patches in the time-frequency domain (ViT-1D) and train it with simple contrastive SSL through normalized temperature-scaled cross-entropy loss (NT-Xent). Although NT-Xent operates only over the class token, we observe that, potentially thanks to weight sharing, informative musical properties emerge in ViT-1D's sequence tokens. On global tasks, the temporal average of class and sequence tokens offers a performance increase compared to the class token alone, showing useful properties in the sequence tokens. On local tasks, sequence tokens perform unexpectedly well, despite not being specifically trained for them. Furthermore, high-level musical features such as onsets emerge from layer-wise attention maps, and self-similarity matrices show that different layers capture different musical dimensions. Our paper does not focus on improving performance but advances the musical interpretation of transformers and sheds light on some overlooked abilities of contrastive SSL paired with transformers for sequence modeling in MIR.
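
For reference, the NT-Xent loss named above can be sketched as follows. This is a generic pure-Python version, not the authors' implementation. It assumes non-zero embedding vectors, that items 2k and 2k+1 are the two augmented views of the same example, and a temperature of 0.5 chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(embeddings, temperature=0.5):
    """Normalized temperature-scaled cross-entropy over 2N embeddings.

    Assumed layout: indices 2k and 2k + 1 hold two views of one example.
    """
    count = len(embeddings)
    total = 0.0
    for i in range(count):
        positive = i ^ 1  # partner view: 0<->1, 2<->3, ...
        logits = [
            cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(count)
            if j != i
        ]
        positive_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        # Log-sum-exp over all other samples, computed stably.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_denominator - positive_logit
    return total / count


aligned_views = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
misaligned_views = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
loss_aligned = nt_xent(aligned_views)
loss_misaligned = nt_xent(misaligned_views)
```

In the setting of the abstract, each embedding would be the class token of one augmented excerpt, which is why the loss provides no direct supervision to the sequence tokens.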

Dimension and model reduction approaches for linear Bayesian inverse problems with rank-deficient prior covariances

Authors: Josie König, Elizabeth Qian, Melina A. Freitag

Bayesian inverse problems use observed data to update a prior probability distribution for an unknown state or parameter of a scientific system to a posterior distribution conditioned on the data. In many applications, the unknown parameter is high-dimensional, making computation of the posterior expensive due to the need to sample in a high-dimensional space and the need to evaluate an expensive high-dimensional forward model relating the unknown parameter to the data. However, inverse problems often exhibit low-dimensional structure due to the fact that the available data are only informative in a low-dimensional subspace of the parameter space. Dimension reduction approaches exploit this structure by restricting inference to the low-dimensional subspace informed by the data, which can be sampled more efficiently. Further computational cost reductions can be achieved by replacing expensive high-dimensional forward models with cheaper lower-dimensional reduced models. In this work, we propose new dimension and model reduction approaches for linear Bayesian inverse problems with rank-deficient prior covariances, which arise in many practical inference settings. The dimension reduction approach is applicable to general linear Bayesian inverse problems whereas the model reduction approaches are specific to the problem of inferring the initial condition of a linear dynamical system. We provide theoretical approximation guarantees as well as numerical experiments demonstrating the accuracy and efficiency of the proposed approaches.

StreamFlow: Streaming Flow Matching with Block-wise Guided Attention Mask for Speech Token Decoding

Authors: Dake Guo, Jixun Yao, Linhan Ma, Wang He, Lei Xie

Recent advancements in discrete token-based speech generation have highlighted the importance of token-to-waveform generation for audio quality, particularly in real-time interactions. Traditional frameworks integrating semantic tokens with flow matching (FM) struggle with streaming capabilities due to their reliance on a global receptive field. Additionally, directly implementing token-by-token streaming speech generation often results in degraded audio quality. To address these challenges, we propose StreamFlow, a novel neural architecture that facilitates streaming flow matching with diffusion transformers (DiT). To mitigate the long-sequence extrapolation issues arising from lengthy historical dependencies, we design a local block-wise receptive field strategy. Specifically, the sequence is first segmented into blocks, and we introduce block-wise attention masks that enable the current block to receive information from the previous or subsequent block. These attention masks are combined hierarchically across different DiT-blocks to regulate the receptive field of DiTs. Both subjective and objective experimental results demonstrate that our approach achieves performance comparable to non-streaming methods while surpassing other streaming methods in terms of speech quality, all the while effectively managing inference time during long-sequence generation. Furthermore, our method achieves a notable first-packet latency of only 180 ms. Speech samples: this https URL
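
The block-wise masking idea can be sketched as below. This is an assumed, simplified construction, not StreamFlow's actual masks: the function name and the past_blocks and future_blocks parameters are hypothetical, and the hierarchical combination across DiT blocks is not shown.

```python
def blockwise_mask(seq_len, block_size, past_blocks=1, future_blocks=0):
    """Build a seq_len x seq_len boolean attention mask.

    mask[i][j] is True when position i may attend to position j, i.e. when
    j's block lies within past_blocks before or future_blocks after i's block.
    """
    mask = []
    for i in range(seq_len):
        block_i = i // block_size
        row = []
        for j in range(seq_len):
            block_j = j // block_size
            row.append(block_i - past_blocks <= block_j <= block_i + future_blocks)
        mask.append(row)
    return mask


# Six positions in three blocks of two; each block sees itself and one block back.
backward_mask = blockwise_mask(6, 2, past_blocks=1, future_blocks=0)
# Same layout, but each block may also look one block ahead.
lookahead_mask = blockwise_mask(6, 2, past_blocks=1, future_blocks=1)
```

Because each layer only widens the receptive field by a bounded number of blocks, stacking layers with different masks keeps the total context finite, which is what allows generation to start before the full sequence is available.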

WaRA: Wavelet Low Rank Adaptation

Authors: Moein Heidari, Yasamin Medghalchi, Mahdi Khoursha, Reza Rezaeian, Ilker Hacihaliloglu

Parameter-efficient fine-tuning (PEFT) has gained widespread adoption across various applications. Among PEFT techniques, Low-Rank Adaptation (LoRA) and its extensions have emerged as particularly effective, allowing efficient model adaptation while significantly reducing computational overhead. However, existing approaches typically rely on global low-rank factorizations, which overlook local or multi-scale structure, failing to capture complex patterns in the weight updates. To address this, we propose WaRA, a novel PEFT method that leverages wavelet transforms to decompose the weight update matrix into a multi-resolution representation. By performing low-rank factorization in the wavelet domain and reconstructing updates through an inverse transform, WaRA obtains compressed adaptation parameters that harness multi-resolution analysis, enabling it to capture both coarse and fine-grained features while providing greater flexibility and sparser representations than standard LoRA. Through comprehensive experiments and analysis, we demonstrate that WaRA achieves superior performance on diverse vision tasks, including image generation, classification, and semantic segmentation, significantly enhancing generated image quality while reducing computational complexity. Although WaRA was primarily designed for vision tasks, we further showcase its effectiveness in language tasks, highlighting its broader applicability and generalizability. The code is publicly available at this https URL.

Suboptimality analysis of receding horizon quadratic control with unknown linear systems and its applications in learning-based control

Authors: Shengling Shi, Anastasios Tsiamis, Bart De Schutter

This work analyzes how the trade-off between the modeling error, the terminal value function error, and the prediction horizon affects the performance of a nominal receding-horizon linear quadratic (LQ) controller. By developing a novel perturbation result of the Riccati difference equation, a novel performance upper bound is obtained and suggests that for many cases, the prediction horizon can be either one or infinity to improve the control performance, depending on the relative difference between the modeling error and the terminal value function error. The result also shows that when an infinite horizon is desired, a finite prediction horizon that is larger than the controllability index can be sufficient for achieving a near-optimal performance, revealing a close relation between the prediction horizon and controllability. The obtained suboptimality performance upper bound is applied to provide novel sample complexity and regret guarantees for nominal receding-horizon LQ controllers in a learning-based setting. We show that an adaptive prediction horizon that increases as a logarithmic function of time is beneficial for regret minimization.
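
The interplay between modeling error, terminal value function error, and prediction horizon can be explored numerically with a scalar toy problem. The sketch below is only an illustration under assumed values (a scalar system x+ = a x + b u with stage cost q x^2 + r u^2); it does not reproduce the paper's bound.

```python
def riccati_step(P, a, b, q, r):
    """One backward step of the scalar Riccati difference equation."""
    return q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)


def receding_horizon_gain(a_hat, b_hat, q, r, P_terminal, horizon):
    """First-step feedback gain K (u = -K x) of a horizon-N controller
    designed on the nominal model (a_hat, b_hat) with terminal weight P_terminal."""
    P = P_terminal
    for _ in range(horizon - 1):
        P = riccati_step(P, a_hat, b_hat, q, r)
    return a_hat * b_hat * P / (r + b_hat * b_hat * P)


def closed_loop_cost(K, a, b, q, r):
    """Infinite-horizon cost per unit x0^2 of u = -K x on the true system (a, b)."""
    rho = a - b * K
    if abs(rho) >= 1.0:
        return float("inf")   # the closed loop is unstable
    return (q + r * K * K) / (1.0 - rho * rho)


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0      # true system (hypothetical values)
    a_hat, b_hat = 1.1, 0.9              # nominal model with modeling error
    for horizon in (1, 2, 5, 20):
        K = receding_horizon_gain(a_hat, b_hat, q, r, P_terminal=0.0, horizon=horizon)
        print(horizon, closed_loop_cost(K, a, b, q, r))
```

With an exact model and an exact terminal value function, a horizon of one already returns the optimal gain, whereas a poor terminal weight needs a longer horizon to wash out. Varying `a_hat`, `b_hat`, and `P_terminal` shows the trade-off the abstract refers to.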

Gaussian Process-Based Nonlinear Moving Horizon Estimation

Authors: Tobias M. Wolff, Victor G. Lopez, Matthias A. Müller

In this paper, we propose a novel Gaussian process-based moving horizon estimation (MHE) framework for unknown nonlinear systems. On the one hand, we approximate the system dynamics by the posterior means of the learned Gaussian processes (GPs). On the other hand, we exploit the posterior variances of the Gaussian processes to design the weighting matrices in the MHE cost function and account for the uncertainty in the learned system dynamics. The data collection and the tuning of the hyperparameters are done offline. We prove robust stability of the GP-based MHE scheme using a Lyapunov-based proof technique. Furthermore, as an additional contribution, we derive a sufficient condition under which incremental input/output-to-state stability (a nonlinear detectability notion) is preserved when approximating the system dynamics using, e.g., machine learning techniques. Finally, we illustrate the performance of the GP-based MHE scheme in two simulation case studies and show how the chosen weighting matrices can lead to an improved performance compared to standard cost functions.

Consensus seeking in diffusive multidimensional networks with a repeated interaction pattern and time-delays

Authors: Hoang Huy Vu, Quyen Ngoc Nguyen, Tuynh Van Pham, Chuong Van Nguyen, Minh Hoang Trinh

This paper studies a consensus problem in multidimensional networks having the same agent-to-agent interaction pattern under both intra- and cross-layer time delays. Several conditions for the agents to asymptotically reach a consensus are derived, which involve the overall network's structure, the local interacting pattern, and the assumptions specified on the time delays. The validity of these conditions is proved by direct eigenvalue evaluation and supported by numerical simulations.

Zak-OTFS to Integrate Sensing the I/O Relation and Data Communication

Authors: Muhammad Ubadah, Saif Khan Mohammed, Ronny Hadani, Shachar Kons, Ananthanarayanan Chockalingam, Robert Calderbank

The Zak-OTFS input/output (I/O) relation is predictable and non-fading when the delay and Doppler periods are greater than the effective channel delay and Doppler spreads, a condition which we refer to as the crystallization condition. The filter taps can simply be read off from the response to a single Zak-OTFS point (impulse) pulsone waveform, and the I/O relation can be reconstructed for a sampled system that operates under finite duration and bandwidth constraints. Predictability opens up the possibility of a model-free mode of operation. The time-domain realization of a Zak-OTFS point pulsone is a pulse train modulated by a tone, hence the name, pulsone. The Peak-to-Average Power Ratio (PAPR) of a pulsone is about 15 dB, and we describe a general method for constructing a spread pulsone for which the time-domain realization has a PAPR of about 6 dB. We construct the spread pulsone by applying a type of discrete spreading filter to a Zak-OTFS point pulsone. The self-ambiguity function of the point pulsone is supported on the period lattice $\Lambda_p$, and by applying a discrete chirp filter, we obtain a spread pulsone with a self-ambiguity function that is supported on a rotated lattice $\Lambda^*$. We show that if the channel satisfies the crystallization conditions with respect to $\Lambda^*$ then the effective DD domain filter taps can simply be read off from the cross-ambiguity between the channel response to the spread pulsone and the transmitted spread pulsone. If, in addition, the channel satisfies the crystallization conditions with respect to the period lattice $\Lambda_p$, then in an OTFS frame consisting of a spread pilot pulsone and point data pulsones, after cancelling the received signal corresponding to the spread pulsone, we can recover the channel response to any data pulsone.
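
To see why a pulse train modulated by a tone has a high PAPR, consider an idealized discrete pulsone without pulse shaping. The sketch below is an assumption-laden toy: the frame sizes M = 32 and N = 16 are hypothetical and are not taken from the abstract, and a practical waveform would also include pulse-shaping filters.

```python
import cmath
import math


def point_pulsone(M, N, delay_idx=0, doppler_idx=0):
    """Idealized discrete point pulsone: N unit-magnitude pulses spaced M
    samples apart, with a tone at Doppler index doppler_idx across pulses."""
    x = [0j] * (M * N)
    for m in range(N):
        x[delay_idx + m * M] = cmath.exp(2j * math.pi * doppler_idx * m / N)
    return x


def papr_db(x):
    """Peak-to-average power ratio of a complex sequence, in dB."""
    powers = [abs(v) ** 2 for v in x]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


if __name__ == "__main__":
    x = point_pulsone(M=32, N=16, delay_idx=3, doppler_idx=5)
    print(round(papr_db(x), 2), "dB")
```

In this toy model only one sample in every M is non-zero, so the PAPR equals 10 log10(M) regardless of N or the tone. Spreading the energy over more samples, as the spread pulsone does, lowers the ratio.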

Robust and tractable multidimensional exponential analysis

Authors: H. N. Mhaskar, S. Kitimoon, Raghu G. Raj

Motivated by a number of applications in signal processing, we study the following question. Given samples of a multidimensional signal of the form $$ f(\boldsymbol\ell)=\sum_{k=1}^K a_k\exp(-i\langle \boldsymbol\ell, \mathbf{w}_k\rangle), \quad \mathbf{w}_1,\cdots,\mathbf{w}_k\in\mathbb{R}^q, \ \boldsymbol\ell\in \mathbb{Z}^q, \ |\boldsymbol\ell|

Learning Li-ion battery health and degradation modes from data with aging-aware circuit models

Authors: Zihao Zhou, Antti Aitio, David Howey

Non-invasive estimation of Li-ion battery state-of-health from operational data is valuable for battery applications, but remains challenging. Pure model-based methods may suffer from inaccuracy and long-term instability of parameter estimates, whereas pure data-driven methods rely heavily on training data quality and quantity, causing lack of generality when extrapolating to unseen cases. We apply an aging-aware equivalent circuit model for health estimation, combining the flexibility of data-driven techniques within a model-based approach. A simplified electrical model with voltage source and resistor incorporates Gaussian process regression to learn capacity fade over time and also the dependence of resistance on operating conditions and time. The approach was validated against two datasets and shown to give accurate performance with less than 1% relative root mean square error (RMSE) in capacity and less than 2% mean absolute percentage error (MAPE). Critically, we show that the open circuit voltage versus state-of-charge function must be accurately known, and any inaccuracies or changes in this over time strongly influence the inferred resistance. However, this feature (or bug) may also be used to estimate in operando differential voltage curves from operational data.

BraTS-PEDs: Results of the Multi-Consortium International Pediatric Brain Tumor Segmentation Challenge 2023

Authors: Anahita Fathi Kazerooni, Nastaran Khalili, Xinyang Liu, Debanjan Haldar, Zhifan Jiang, Anna Zapaishchykova, Julija Pavaine, Lubdha M. Shah, Blaise V. Jones, Nakul Sheth, Sanjay P. Prabhu, Aaron S. McAllister, Wenxin Tu, Khanak K. Nandolia, Andres F. Rodriguez, Ibraheem Salman Shaikh, Mariana Sanchez Montano, Hollie Anne Lai, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Hannah Anderson, Syed Muhammed Anwar, Alejandro Aristizabal, Sina Bagheri, Ujjwal Baid, Timothy Bergquist, Austin J. Borja, Evan Calabrese, Verena Chung, Gian-Marco Conte, James Eddy, Ivan Ezhov, Ariana M. Familiar, Keyvan Farahani, Deep Gandhi, Anurag Gottipati, Shuvanjan Haldar, Juan Eugenio Iglesias, Anastasia Janas, Elaine Elaine, Alexandros Karargyris, Hasan Kassem, Neda Khalili, Florian Kofler, Dominic LaBella, Koen Van Leemput, Hongwei B. Li, Nazanin Maleki, Zeke Meier, Bjoern Menze, Ahmed W. Moawad, Sarthak Pati, Marie Piraud, Tina Poussaint, Zachary J. Reitman, Jeffrey D. Rudie, Rachit Saluja, Micah Sheller, Russell Takeshi Shinohara, Karthik Viswanathan, Chunhao Wang, Benedikt Wiestler, Walter F. Wiggins, Christos Davatzikos, Phillip B. Storm, Miriam Bornhorst, Roger Packer, Trent Hummel, Peter de Blank, Lindsey Hoffman, Mariam Aboian, Ali Nabavizadeh, Jeffrey B. Ware, Benjamin H. Kann, Brian Rood, Adam Resnick, Spyridon Bakas, Arastoo Vossough, Marius George Linguraru

Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 challenge, the first Brain Tumor Segmentation (BraTS) challenge focused on pediatric brain tumors. This challenge utilized data acquired from multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. BraTS-PEDs 2023 aimed to evaluate volumetric segmentation algorithms for pediatric brain gliomas from magnetic resonance imaging using standardized quantitative performance evaluation metrics employed across the BraTS 2023 challenges. The top-performing AI approaches for pediatric tumor analysis included ensembles of nnU-Net and Swin UNETR, Auto3DSeg, or nnU-Net with a self-supervised framework. The BraTS-PEDs 2023 challenge fostered collaboration between clinicians (neuro-oncologists, neuroradiologists) and AI/imaging scientists, promoting faster data sharing and the development of automated volumetric analysis techniques. These advancements could significantly benefit clinical trials and improve the care of children with brain tumors.

BIRA: A Spherical Bistatic Radar Reflectivity Measurement System

Authors: Carsten Andrich, Tobias F. Nowack, Alexander Ihlow, Sebastian Giehl, Maximilian Engelhardt, Gerd Sommerkorn, Andreas Schwind, Willi Hofmann, Christian Bornkessel, Matthias A. Hein, Reiner S. Thomä

The upcoming 6G mobile communication standard will offer a revolutionary new feature: Integrated sensing and communication (ISAC) reuses mobile communication signals to realize multi-static radar for various applications including localization. Consequently, applied ISAC propagation research must evolve from classical monostatic radar cross section (RCS) measurement of static targets to bistatic radar reflectivity characterization of dynamic objects. Here, we introduce our Bistatic Radar (BIRA) measurement facility for independent spherical positioning of two probes with sub-millimeter accuracy on a diameter of up to 7 m and with almost continuous frequency coverage from 0.7 up to 260 GHz. Currently, BIRA is the only bistatic measurement facility capable of unrestricted ISAC research: In addition to vector network analysis, it employs advanced wideband transceiver technology with an instantaneous bandwidth of up to 4 GHz. These transceivers grant BIRA the unique capability to characterize dynamic targets in both Doppler and range, while also significantly accelerating measurements on static objects. Additionally, the installation is capable of spherical near-field antenna measurements over these wide frequency ranges.

Iterative approach to reconstructing neural disparity fields from light-field data

Authors: Ligen Shi, Chang Liu, Xing Zhao, Jun Qiu

This study proposes a neural disparity field (NDF) that establishes an implicit, continuous representation of scene disparity based on a neural field and an iterative approach to address the inverse problem of NDF reconstruction from light-field data. NDF enables seamless and precise characterization of disparity variations in three-dimensional scenes and can discretize disparity at any arbitrary resolution, overcoming the limitations of traditional disparity maps that are prone to sampling errors and interpolation inaccuracies. The proposed NDF network architecture utilizes hash encoding combined with multilayer perceptrons to capture detailed disparities in texture levels, thereby enhancing its ability to represent the geometric information of complex scenes. By leveraging the spatial-angular consistency inherent in light-field data, a differentiable forward model to generate a central view image from the light-field data is developed. Based on the forward model, an optimization scheme for the inverse problem of NDF reconstruction using differentiable propagation operators is established. Furthermore, an iterative solution method is adopted to reconstruct the NDF in the optimization scheme, which does not require training datasets and applies to light-field data captured by various acquisition methods. Experimental results demonstrate that high-quality NDF can be reconstructed from light-field data using the proposed method. High-resolution disparity can be effectively recovered by NDF, demonstrating its capability for the implicit, continuous representation of scene disparities.

Practical Challenges for Reliable RIS Deployment in Heterogeneous Multi-Operator Multi-Band Networks

Authors: Mehdi Monemi, Mehdi Rasti, Arthur S. de Sena, Mohammad Amir Fallah, Matti Latva-Aho, Marco Di Renzo

Reconfigurable intelligent surfaces (RISs) have been introduced as arrays of nearly passive elements with software-tunable electromagnetic properties to dynamically manipulate the reflection/transmission of radio signals. Research works in this area are focused on two applications, namely user-assist RIS, which aims at tuning the RIS to enhance the quality-of-service (QoS) of target users, and malicious RIS, in which an attacker aims to degrade the QoS at victim receivers by generating intended destructive interference. While both user-assist and malicious RIS applications have been explored extensively, the impact of RIS deployments on imposing unintended interference on various wireless user equipments (UEs) remains underexplored. This paper investigates the challenges of integrating RISs into multi-carrier, multi-user, and multi-operator networks. We discuss how RIS deployments intended to benefit specific users can negatively impact other users served at various carrier frequencies through different network operators. While not an ideal solution, we discuss how ultra-narrowband metasurfaces can be incorporated into the manufacturing of RISs to mitigate some challenges of RIS deployment in wireless networks. We also present a simulation scenario to illuminate some practical challenges associated with the deployment of RISs in shared public environments.

Performance Analysis of Joint Antenna Selection and Precoding Methods in Multi-user Massive MISO

Authors: Xiuxiu Ma, Abla Kammoun, Mohamed-Slim Alouini, Tareq Y. Al-Naffouri

This paper presents a performance analysis of two distinct techniques for antenna selection and precoding in downlink multi-user massive multiple-input single-output systems with limited dynamic range power amplifiers. Both techniques are derived from the original formulation of the regularized-zero forcing precoder, designed as the solution to minimizing a regularized distortion. Based on this, the first technique, called the $\ell_1$-norm precoder, adopts an $\ell_1$-norm regularization term to encourage sparse solutions, thereby enabling antenna selection. The second technique, termed the thresholded $\ell_1$-norm precoder, involves post-processing the precoder solution obtained from the first method by applying an entry-wise thresholding operation. This work conducts a precise performance analysis to compare these two techniques. The analysis leverages the Gaussian min-max theorem which is effective for examining the asymptotic behavior of optimization problems without explicit solutions. While the analysis of the $\ell_1$-norm precoder follows the conventional Gaussian min-max theorem framework, understanding the thresholded $\ell_1$-norm precoder is more complex due to the non-linear behavior introduced by the thresholding operation. To address this complexity, we develop a novel Gaussian min-max theorem tailored to these scenarios. We provide precise asymptotic behavior analysis of the precoders, focusing on metrics such as received signal-to-noise and distortion ratio and bit error rate. Our analysis demonstrates that the thresholded $\ell_1$-norm precoder can offer superior performance when the threshold parameter is carefully selected. Simulations confirm that the asymptotic results are accurate for systems equipped with hundreds of antennas at the base station, serving dozens of user terminals.
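
The two-stage idea (an l1-regularized precoder followed by entry-wise thresholding) can be sketched with a basic proximal-gradient (ISTA) solver. This is a simplified real-valued toy, not the paper's formulation: the objective 0.5 * ||H x - s||^2 + lam * ||x||_1, the hard-thresholding rule, and all numerical values are assumptions, and the amplifier dynamic-range limits mentioned in the abstract are ignored.

```python
def matvec(H, x):
    return [sum(h * v for h, v in zip(row, x)) for row in H]


def rmatvec(H, r):
    """Compute H^T r."""
    return [sum(H[i][j] * r[i] for i in range(len(H))) for j in range(len(H[0]))]


def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    magnitude = max(abs(v) - t, 0.0)
    return magnitude if v > 0 else -magnitude


def l1_precoder(H, s, lam, iters=500):
    """ISTA for min_x 0.5 * ||H x - s||^2 + lam * ||x||_1 (assumed objective)."""
    # 1 / ||H||_F^2 is a safe step size because ||H||_F^2 >= lambda_max(H^T H).
    step = 1.0 / sum(h * h for row in H for h in row)
    x = [0.0] * len(H[0])
    for _ in range(iters):
        residual = [a - b for a, b in zip(matvec(H, x), s)]
        grad = rmatvec(H, residual)
        x = [soft_threshold(xi - step * g, step * lam) for xi, g in zip(x, grad)]
    return x


def threshold_entries(x, tau):
    """Entry-wise hard thresholding: switch off antennas with small weights."""
    return [xi if abs(xi) >= tau else 0.0 for xi in x]


if __name__ == "__main__":
    H = [[1.0, 0.5, -0.3, 0.2],      # 2 users, 4 antennas (hypothetical channel)
         [0.1, -0.8, 0.6, 0.4]]
    s = [1.0, -1.0]                  # symbols to deliver
    x = l1_precoder(H, s, lam=0.1)
    x_thr = threshold_entries(x, tau=0.05)
    print("l1 precoder:  ", [round(v, 3) for v in x])
    print("thresholded:  ", [round(v, 3) for v in x_thr])
    print("active antennas:", sum(1 for v in x_thr if v != 0.0))
```

The thresholding step is a non-linear post-processing of the optimizer's output, which is the feature the abstract identifies as making the second precoder harder to analyze.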

Ring Artifacts Removal Based on Implicit Neural Representation of Sinogram Data

Authors: Ligen Shi, Xu Jiang, YunZe Liu, Chang Liu, Ping Yang, Shifeng Guo, Xing Zhao

Inconsistent responses of X-ray detector elements lead to stripe artifacts in the sinogram data, which manifest as ring artifacts in the reconstructed CT images, severely degrading image quality. This paper proposes a method for correcting stripe artifacts in the sinogram data. The proposed method leverages implicit neural representation (INR) to correct defective pixel response values using implicit continuous functions and simultaneously learns stripe features in the angular direction of the sinogram data. These two components are combined within an optimization constraint framework, achieving unsupervised iterative correction of stripe artifacts in the projection domain. Experimental results demonstrate that the proposed method significantly outperforms current state-of-the-art techniques in removing ring artifacts while maintaining the clarity of CT images.

Segment as You Wish -- Free-Form Language-Based Segmentation for Medical Images

Authors: Longchao Da, Rui Wang, Xiaojian Xu, Parminder Bhatia, Taha Kass-Hout, Hua Wei, Cao Xiao

Medical imaging is crucial for diagnosing a patient's health condition, and accurate segmentation of these images is essential for isolating regions of interest to ensure precise diagnosis and treatment planning. Existing methods primarily rely on bounding boxes or point-based prompts, while few have explored text-related prompts, despite clinicians often describing their observations and instructions in natural language. To address this gap, we first propose a RAG-based free-form text prompt generator that leverages the domain corpus to generate diverse and realistic descriptions. Then, we introduce FLanS, a novel medical image segmentation model that handles various free-form text prompts, including professional anatomy-informed queries, anatomy-agnostic position-driven queries, and anatomy-agnostic size-driven queries. Additionally, our model incorporates a symmetry-aware canonicalization module to ensure consistent, accurate segmentations across varying scan orientations and reduce confusion between the anatomical position of an organ and its appearance in the scan. FLanS is trained on a large-scale dataset of over 100k medical images from 7 public datasets. Comprehensive experiments demonstrate the model's superior language understanding and segmentation precision, along with a deep comprehension of the relationship between them, outperforming SOTA baselines on both in-domain and out-of-domain datasets.

A Global Coordinate-Free Approach to Invariant Contraction on Homogeneous Manifolds

Authors: Akash Harapanahalli, Samuel Coogan

In this work, we provide a global condition for contraction with respect to an invariant Riemannian metric on reductive homogeneous spaces. Using left-invariant frames, vector fields on the manifold are horizontally lifted to the ambient Lie group, where the Levi-Civita connection is globally characterized as a real matrix multiplication. By linearizing in these left-invariant frames, we characterize contraction using matrix measures on real square matrices, avoiding the use of local charts. Applying this global condition, we provide a necessary condition for a prescribed subset of the manifold to possibly admit a contracting system, which accounts for the underlying geometry of the invariant metric. Applied to the sphere, this condition implies that no great circle can be contained in a contraction region. Finally, we apply our results to compute reachable sets for an attitude control problem.
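
Matrix measures (logarithmic norms) are the quantity used above to certify contraction: a negative measure of the linearization indicates that nearby trajectories approach each other. The sketch below computes two standard measures that have closed forms. The abstract does not say which norm the paper uses, so the choice of the 1-norm and infinity-norm measures here is purely for illustration, and the example matrix is hypothetical.

```python
def matrix_measure_inf(A):
    """Matrix measure induced by the infinity norm:
    max over rows i of a_ii + sum_{j != i} |a_ij|."""
    n = len(A)
    return max(A[i][i] + sum(abs(A[i][j]) for j in range(n) if j != i)
               for i in range(n))


def matrix_measure_1(A):
    """Matrix measure induced by the 1-norm:
    max over columns j of a_jj + sum_{i != j} |a_ij|."""
    n = len(A)
    return max(A[j][j] + sum(abs(A[i][j]) for i in range(n) if i != j)
               for j in range(n))


if __name__ == "__main__":
    J = [[-3.0, 1.0],
         [2.0, -4.0]]   # a hypothetical linearization at one point
    print("mu_inf =", matrix_measure_inf(J))
    print("mu_1   =", matrix_measure_1(J))
```

If the measure of the linearization is bounded above by some -c < 0 throughout a region, the system is contracting there with rate c in the corresponding norm. The paper's contribution is to state such a test globally in left-invariant frames rather than in local charts.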

Pixel super-resolved virtual staining of label-free tissue using diffusion models

Authors: Yijie Zhang, Luzhe Huang, Nir Pillar, Yuzhu Li, Hanlong Chen, Aydogan Ozcan

Virtual staining of tissue offers a powerful tool for transforming label-free microscopy images of unstained tissue into equivalents of histochemically stained samples. This study presents a diffusion model-based super-resolution virtual staining approach utilizing a Brownian bridge process to enhance both the spatial resolution and fidelity of label-free virtual tissue staining, addressing the limitations of traditional deep learning-based methods. Our approach integrates novel sampling techniques into a diffusion model-based image inference process to significantly reduce the variance in the generated virtually stained images, resulting in more stable and accurate outputs. Blindly applied to lower-resolution auto-fluorescence images of label-free human lung tissue samples, the diffusion-based super-resolution virtual staining model consistently outperformed conventional approaches in resolution, structural similarity and perceptual accuracy, successfully achieving a super-resolution factor of 4-5x, increasing the output space-bandwidth product by 16-25-fold compared to the input label-free microscopy images. Diffusion-based super-resolved virtual tissue staining not only improves resolution and image quality but also enhances the reliability of virtual staining without traditional chemical staining, offering significant potential for clinical diagnostics.

Identification and Clustering of Unseen Ragas in Indian Art Music

Authors: Parampreet Singh, Adwik Gupta, Aakarsh Mishra, Vipul Arora

Raga classification in Indian Art Music is an open-set problem where unseen classes may appear during testing. However, traditional approaches often treat it as a closed-set problem, ignoring the possibility of encountering unseen classes. In this work, we tackle this problem by first employing uncertainty-based Out-Of-Distribution (OOD) detection, given a set containing known and unknown classes. Next, for the audio samples identified as OOD, we employ a Novel Class Discovery (NCD) approach to cluster them into distinct unseen Raga classes. We achieve this by harnessing information from labelled data and further applying contrastive learning on unlabelled data. With thorough analysis, we demonstrate the influence of different components of the loss function on clustering performance and examine how varying openness affects the NCD task at hand.
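
One common way to realize uncertainty-based OOD detection is to threshold the entropy of a classifier's predictive distribution. The sketch below shows that generic recipe only; the abstract does not specify the uncertainty measure used in the paper, and the threshold value and logits are hypothetical.

```python
import math


def softmax(logits):
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def is_ood(logits, threshold):
    """Flag a sample as out-of-distribution when the classifier is too uncertain."""
    return predictive_entropy(softmax(logits)) > threshold


if __name__ == "__main__":
    confident = [10.0, 0.0, 0.0]     # one known Raga clearly dominates
    uncertain = [0.1, 0.0, -0.1]     # no known Raga fits well
    print(is_ood(confident, threshold=1.0), is_ood(uncertain, threshold=1.0))
```

Samples flagged this way would then be passed to the clustering stage, while the remaining samples keep their predicted known class.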

Quantifying the benefit of load uncertainty reduction for the design of district energy systems under grid constraints using the Value of Information

Authors: Max Langtry, Ruchi Choudhary

Load uncertainty must be accounted for during design to ensure building energy systems can meet energy demands during operation. Reducing building load uncertainty allows for improved designs with less compromise to be identified, reducing the cost of decarbonizing energy usage. However, the building monitoring required to reduce load uncertainty is costly. This study uses Value of Information (VoI) analysis to quantify the economic benefit of practical building monitoring for supporting energy system design decisions, and to determine whether its benefits outweigh its cost. An extension of the VoI framework, termed 'On-Policy' VoI, is proposed, which admits complex decision making tasks where decision policies are required. This is applied to a case study district energy system design problem, where a Linear Program model is used to size solar-battery systems and grid connection capacity under uncertain building loads, modelled using historic electricity metering data. Load uncertainty is found to significantly impact both system operating costs (±30%) and the optimal system design (±20%). However, using building monitoring data to improve the design of the district reduces overall costs by less than 1.5% on average. As this is less than the cost of measurement, using monitoring is not economically worthwhile in this case. This provides the first numerical evidence to support the sufficiency of using standard building load profiles for energy system design. Further, reducing only uncertainty in mean load is found to provide most of the available decision support benefit, meaning using hourly measurement data provides little benefit for energy retrofit design.
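
The basic Value of Information calculation compares the best design chosen under uncertainty with the designs that would be chosen if the load were known. The sketch below computes the classical expected value of perfect information for a toy sizing problem. It is not the paper's 'On-Policy' VoI or its Linear Program: the cost model, load scenarios, and prices are all hypothetical.

```python
def design_cost(capacity, load, capex=1.0, shortfall_penalty=3.0):
    """Toy cost: pay for installed capacity plus a penalty for unmet load."""
    return capex * capacity + shortfall_penalty * max(load - capacity, 0.0)


def expected_cost(capacity, scenarios):
    return sum(prob * design_cost(capacity, load) for load, prob in scenarios)


def value_of_perfect_information(candidates, scenarios):
    # Decide once, before the load is known.
    prior_cost = min(expected_cost(c, scenarios) for c in candidates)
    # Decide separately for each load, as if it had been measured exactly.
    informed_cost = sum(prob * min(design_cost(c, load) for c in candidates)
                        for load, prob in scenarios)
    return prior_cost - informed_cost


if __name__ == "__main__":
    scenarios = [(80.0, 0.25), (100.0, 0.5), (120.0, 0.25)]   # (load, probability)
    candidates = [80.0, 100.0, 120.0]                         # capacity options
    voi = value_of_perfect_information(candidates, scenarios)
    measurement_cost = 20.0                                   # hypothetical
    print("VoI:", voi, "worth measuring:", voi > measurement_cost)
```

Monitoring is only worthwhile when the VoI exceeds the cost of measurement, which is the comparison the study carries out for its district energy case.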

A behavioral approach for LPV data-driven representations

Authors: Chris Verhoek, Ivan Markovsky, Sofie Haesaert, Roland Tóth

In this paper, we present a data-driven representation for linear parameter-varying (LPV) systems, which can be used for direct data-driven analysis and control of such systems. Specifically, we use the behavioral approach to develop a data-driven representation of the finite-horizon behavior of LPV systems for which there exists a kernel representation with shifted-affine scheduling dependence. Moreover, we provide a necessary and sufficient rank-based test on the available data that concludes whether the data fully represents the finite-horizon LPV behavior. Using the proposed data-driven representation, we also solve the data-driven simulation problem for LPV systems. Through multiple examples, we demonstrate that the results in this paper allow us to formulate a novel set of direct data-driven analysis and control methods for LPV systems, which are also applicable for LPV embeddings of nonlinear systems.

Asymptotic Analysis of One-bit Quantized Box-Constrained Precoding in Large-Scale Multi-User Systems

Authors: Xiuxiu Ma, Abla Kammoun, Mohamed-Slim Alouini, Tareq Y. Al-Naffouri

This paper addresses the design of multi-antenna precoding strategies, considering hardware limitations such as low-resolution digital-to-analog converters (DACs), which necessitate the quantization of transmitted signals. The typical approach starts with optimizing a precoder, followed by a quantization step to meet hardware requirements. This study analyzes the performance of a quantization scheme applied to the box-constrained regularized zero-forcing (RZF) precoder in the asymptotic regime, where the number of antennas and users grows proportionally. The box constraint, initially designed to cope with low-dynamic range amplifiers, is used here to control quantization noise rather than for amplifier compatibility. A significant challenge in analyzing the quantized precoder is that the input to the quantization operation does not follow a Gaussian distribution, making traditional methods such as Bussgang's decomposition unsuitable. To overcome this, the paper extends Gordon's inequality and introduces a novel Gaussian Min-Max Theorem to model the distribution of the channel-distorted precoded signal. The analysis derives the tight lower bound for the signal-to-distortion-plus-noise ratio (SDNR) and the bit error rate (BER), showing that optimal tuning of the amplitude constraint improves performance.

Integrated Sensing, Communication, and Computation Over-the-Air in OFDM Systems

Authors: Biao Dong, Bin Cao, Qinyu Zhang

This work is concerned with integrated sensing, communication, and computation (ISCC) in uplink orthogonal frequency division multiplexing (OFDM) systems, wherein multiple devices perform target sensing and over-the-air computation (AirComp) simultaneously. We aim to minimize the computational mean squared error (MSE) by jointly optimizing the transmitting vector and the aggregation vector. To tackle the non-convexity of this problem, we develop a two-phase iterative algorithm. Simulations demonstrate the effectiveness of the proposed algorithm.

Assessing workflow impact and clinical utility of AI-assisted brain aneurysm detection: a multi-reader study

Authors: Tommaso Di Noto, Sofyan Jankowski, Francesco Puccinelli, Guillaume Marie, Sebastien Tourbier, Yasser Aleman-Gomez, Oscar Esteban, Ricardo Corredor-Jerez, Guillaume Saliou, Patric Hagmann, Meritxell Bach Cuadra, Jonas Richiardi

Despite the plethora of AI-based algorithms developed for anomaly detection in radiology, subsequent integration into clinical settings is rarely evaluated. In this work, we assess the applicability and utility of an AI-based model for brain aneurysm detection comparing the performance of two readers with different levels of experience (2 and 13 years). We aim to answer the following questions: 1) Do the readers improve their performance when assisted by the AI algorithm? 2) How much does the AI algorithm impact routine clinical workflow? We reuse and enlarge our open-access, Time-Of-Flight Magnetic Resonance Angiography dataset (N=460). We use 360 subjects for training/validating our algorithm and 100 as an unseen test set for the reading session. Even though our model reaches state-of-the-art results on the test set (sensitivity=74%, false positive rate=1.6), we show that neither the junior nor the senior reader significantly increases their sensitivity (p=0.59, p=1, respectively). In addition, we find that reading time for both readers is significantly higher in the "AI-assisted" setting than in the "Unassisted" (+15 seconds, on average; p=3x10^(-4) junior, p=3x10^(-5) senior). The confidence reported by the readers is unchanged across the two settings, indicating that the AI assistance does not influence the certainty of the diagnosis. Our findings highlight the importance of clinical validation of AI algorithms in a clinical setting involving radiologists. This study should serve as a reminder to the community to always examine the real-world effectiveness and workflow impact of proposed algorithms.

MedSegNet10: A Publicly Accessible Network Repository for Split Federated Medical Image Segmentation

Authors: Chamani Shiranthika, Zahra Hafezi Kafshgari, Hadi Hadizadeh, Parvaneh Saeedi

Machine Learning (ML) and Deep Learning (DL) have shown significant promise in healthcare, particularly in medical image segmentation, which is crucial for accurate disease diagnosis and treatment planning. Despite their potential, challenges such as data privacy concerns, limited annotated data, and inadequate training data persist. Decentralized learning approaches such as federated learning (FL), split learning (SL), and split federated learning (SplitFed/SFL) address these issues effectively. This paper introduces "MedSegNet10," a publicly accessible repository designed for medical image segmentation using split-federated learning. MedSegNet10 provides a collection of pre-trained neural network architectures optimized for various medical image types, including microscopic images of human blastocysts, dermatoscopic images of skin lesions, and endoscopic images of lesions, polyps, and ulcers, with applications extending beyond these examples. By leveraging SplitFed's benefits, MedSegNet10 allows collaborative training on privately stored, horizontally split data, ensuring privacy and integrity. This repository supports researchers, practitioners, trainees, and data scientists, aiming to advance medical image segmentation while maintaining patient data privacy. The repository is available at: this https URL (password upon request to the authors).

Identification of additive multivariable continuous-time systems

Authors: Maarten van der Hulst, Rodrigo González, Koen Classens, Nic Dirkx, Jeroen van de Wijdeven, Tom Oomen

Multivariable parametric models are critical for designing, controlling, and optimizing the performance of engineered systems. The main aim of this paper is to develop a parametric identification strategy that delivers accurate and physically relevant models of multivariable systems using time-domain data. The introduced approach adopts an additive model structure, providing a parsimonious and interpretable representation of many physical systems, and applies a refined instrumental variable-based estimation algorithm. The developed identification method enables the estimation of multivariable parametric additive models in continuous time and is applicable to both open- and closed-loop systems. The performance of the estimator is demonstrated through numerical simulations and experimentally validated on a flexible beam system.

The Cesàro Value Iteration

Authors: Jonas Mair, Lukas Schwenkel, Matthias A. Müller, Frank Allgöwer

In this paper, we consider undiscounted infinite-horizon optimal control for deterministic systems with an uncountable state and input space. We specifically address the case when the classic value iteration does not converge. For such systems, we use the Cesàro mean to define the infinite-horizon optimal control problem and the corresponding infinite-horizon value function. Moreover, for this value function, we introduce the Cesàro value iteration and prove its convergence for the special case of systems with periodic optimal operating behavior. For this instance, we also show that the Cesàro value function recovers the undiscounted infinite-horizon optimal cost, if the latter is well-defined.
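
As a toy illustration of why a Cesàro mean can be useful when a sequence itself does not converge, the sketch below averages an alternating cost sequence. It is not the authors' value iteration; the function name `cesaro_means` and the period-2 cost sequence are assumptions made purely for illustration.

```python
from itertools import accumulate


def cesaro_means(seq):
    """Return the running averages (Cesaro means) of a finite sequence."""
    return [total / (k + 1) for k, total in enumerate(accumulate(seq))]


# Hypothetical stage costs of a system with period-2 operating behavior:
# the sequence 0, 1, 0, 1, ... has no limit, but its running average does.
costs = [0.0, 1.0] * 500
means = cesaro_means(costs)
```

The raw sequence keeps oscillating between 0 and 1, while the running average settles at 0.5, which is the kind of averaged quantity a Cesàro-based formulation works with.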

Controlling Complex Systems

Authors: Marco Coraggio, Davide Salzano, Mario di Bernardo

This chapter provides a comprehensive overview of controlling collective behavior in complex systems comprising large ensembles of interacting dynamical agents. Building upon traditional control theory's foundation in individual systems, we introduce tools designed to address the unique challenges of coordinating networks that exhibit emergent phenomena, including consensus, synchronization, and pattern formation. We analyze how local agent interactions generate macroscopic behaviors and investigate the fundamental role of network topology in determining system dynamics. Inspired by natural systems, we emphasize control strategies that achieve global coordination through localized interventions while considering practical implementation challenges. The chapter concludes by presenting novel frameworks for managing very large agent ensembles and leveraging interacting networks for control purposes.
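
Consensus, one of the emergent behaviors mentioned above, can be sketched with a standard discrete-time averaging rule in which each agent moves toward its neighbors. This is a generic textbook-style sketch, not taken from the chapter; the three-agent path graph, the step size, and the names `consensus_step` and `neighbors` are illustrative assumptions.

```python
def consensus_step(values, neighbors, step=0.3):
    """One update: each agent moves toward the values of its neighbors."""
    return [
        x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


# Hypothetical undirected path graph 0 - 1 - 2 (local interactions only).
neighbors = {0: [1], 1: [0, 2], 2: [1]}
state = [0.0, 3.0, 6.0]
for _ in range(200):
    state = consensus_step(state, neighbors)
```

With an undirected graph and a sufficiently small step size, the agents agree on the average of their initial values using only local exchanges.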

ESC-MVQ: End-to-End Semantic Communication With Multi-Codebook Vector Quantization

Authors: Junyong Shin, Yongjeong Oh, Jinsung Park, Joohyuk Park, Yo-Seb Jeon

This paper proposes a novel end-to-end digital semantic communication framework based on multi-codebook vector quantization (VQ), referred to as ESC-MVQ. Unlike prior approaches that rely on end-to-end training with a specific power or modulation scheme, often under a particular channel condition, ESC-MVQ models a channel transfer function as parallel binary symmetric channels (BSCs) with trainable bit-flip probabilities. Building on this model, ESC-MVQ jointly trains multiple VQ codebooks and their associated bit-flip probabilities with a single encoder-decoder pair. To maximize inference performance when deploying ESC-MVQ in digital communication systems, we devise an optimal communication strategy that jointly optimizes codebook assignment, adaptive modulation, and power allocation. To this end, we develop an iterative algorithm that selects the most suitable VQ codebook for semantic features and flexibly allocates power and modulation schemes across the transmitted symbols. Simulation results demonstrate that ESC-MVQ, using a single encoder-decoder pair, outperforms existing digital semantic communication methods in both performance and memory efficiency, offering a scalable and adaptive solution for realizing digital semantic communication in diverse channel conditions.
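
The channel model described above, parallel binary symmetric channels with one bit-flip probability per bit position, can be sketched as follows. This is only a simulation of the channel abstraction with fixed probabilities; in the paper the probabilities are trainable, and the function name `bsc_transmit` and the example values are assumptions for illustration.

```python
import random


def bsc_transmit(bits, flip_probs, rng):
    """Send each bit through its own binary symmetric channel.

    bits: list of 0/1 values; flip_probs: per-bit flip probabilities.
    """
    if len(bits) != len(flip_probs):
        raise ValueError("bits and flip_probs must have the same length")
    # A bit is flipped when a uniform draw falls below its flip probability.
    return [b ^ int(rng.random() < p) for b, p in zip(bits, flip_probs)]


rng = random.Random(0)
codeword_index_bits = [1, 0, 1, 1]          # hypothetical VQ index bits
received = bsc_transmit(codeword_index_bits, [0.01, 0.01, 0.1, 0.1], rng)
```

Assigning different flip probabilities to different bit positions is what lets such a model stand in for varying modulation and power choices without simulating a specific physical channel.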

A Deep Learning-Based Supervised Transfer Learning Framework for DOA Estimation with Array Imperfections

Authors: Bo Zhou, Kaijie Xu, Yinghui Quan, Mengdao Xing

In practical scenarios, processes such as sensor design, manufacturing, and installation will introduce certain errors. Furthermore, mutual interference occurs when the sensors receive signals. These defects in array systems are referred to as array imperfections, which can significantly degrade the performance of Direction of Arrival (DOA) estimation. In this study, we propose a deep-learning based transfer learning approach, which effectively mitigates the degradation of deep-learning based DOA estimation performance caused by array imperfections. In the proposed approach, we highlight three major contributions. First, we propose a Vision Transformer (ViT) based method for DOA estimation, which achieves excellent performance in scenarios with low signal-to-noise ratios (SNR) and limited snapshots. Second, we introduce a transfer learning framework that extends deep learning models from ideal simulation scenarios to complex real-world scenarios with array imperfections. By leveraging prior knowledge from ideal simulation data, the proposed transfer learning framework significantly improves deep learning-based DOA estimation performance in the presence of array imperfections, without the need for extensive real-world data. Finally, we incorporate visualization and evaluation metrics to assess the performance of DOA estimation algorithms, which allow for a more thorough evaluation of algorithms and further validate the proposed method. Our code can be accessed at this https URL.

Theoretical Grid-Forming Extreme of Inverters

Authors: Qianxi Tang, Li Peng

What are the theoretical and physical limits of a grid-forming inverter? This letter proposes that the extreme grid-forming ability of inverters is limited by their dc-side, ac-side, and circuit topology dynamics, but not by control. While many papers focus on how to improve the stability, power sharing, inertia emulation, and fault response of grid-forming inverters, few, if any, formally define the fundamental theoretical limits or extremes of grid-forming behavior. It may therefore seem that grid-forming ability can be improved endlessly. However, no physical system can support a grid indefinitely without limitations, especially under increasing levels of disturbance or uncertainty. Therefore, this boundary is explicitly shown by a mathematical expression in this letter. The results show that relatively low dc-side voltage and high active power injection could damage the grid-forming ability. Poor consideration of dc-side, ac-side, and circuit topology dynamics in real practice will cause harmful oscillations even under the theoretically best grid-forming control strategy.

MFA-KWS: Effective Keyword Spotting with Multi-head Frame-asynchronous Decoding

Authors: Yu Xi, Haoyu Li, Xiaoyu Gu, Yidi Jiang, Kai Yu

Keyword spotting (KWS) is essential for voice-driven applications, demanding both accuracy and efficiency. Traditional ASR-based KWS methods, such as greedy and beam search, explore the entire search space without explicitly prioritizing keyword detection, often leading to suboptimal performance. In this paper, we propose an effective keyword-specific KWS framework by introducing a streaming-oriented CTC-Transducer-combined frame-asynchronous system with multi-head frame-asynchronous decoding (MFA-KWS). Specifically, MFA-KWS employs keyword-specific phone-synchronous decoding for CTC and replaces conventional RNN-T with Token-and-Duration Transducer to enhance both performance and efficiency. Furthermore, we explore various score fusion strategies, including single-frame-based and consistency-based methods. Extensive experiments demonstrate the superior performance of MFA-KWS, which achieves state-of-the-art results on both fixed keyword and arbitrary keywords datasets, such as Snips, MobvoiHotwords, and LibriKWS-20, while exhibiting strong robustness in noisy environments. Among fusion strategies, the consistency-based CDC-Last method delivers the best performance. Additionally, MFA-KWS achieves a 47% to 63% speed-up over the frame-synchronous baselines across various datasets. Extensive experimental results confirm that MFA-KWS is an effective and efficient KWS framework, making it well-suited for on-device deployment.
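
For context on the greedy baseline mentioned above, the sketch below shows the standard CTC collapsing rule applied to per-frame best labels: merge consecutive repeats, then drop blanks. It is not the MFA-KWS decoder; the function name `ctc_collapse`, the blank index 0, and the example label sequence are assumptions for illustration.

```python
def ctc_collapse(frame_labels, blank=0):
    """Collapse per-frame labels: merge repeated labels, then remove blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical frame-wise argmax output (0 is the blank symbol).
frames = [0, 1, 1, 0, 1, 2, 2, 0]
tokens = ctc_collapse(frames)
```

Note that this rule processes every frame without regard to any keyword, which is the kind of keyword-agnostic search the abstract contrasts with keyword-specific decoding.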

From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data

Authors: Chun-Yi Kuan, Hung-yi Lee

Audio-aware large language models (ALLMs) have recently made great strides in understanding and processing audio inputs. These models are typically adapted from text-based large language models (LLMs) through additional training on audio-related tasks. However, this adaptation process presents two major limitations. First, ALLMs often suffer from catastrophic forgetting, where crucial textual capabilities like instruction-following are lost after training on audio data. In some cases, models may even hallucinate sounds that are not present in the input audio, raising concerns about reliability. Second, achieving cross-modal alignment between audio and language typically relies on large collections of task-specific question-answer pairs for instruction tuning, making it resource-intensive. To address these issues, previous works have leveraged the backbone LLMs to synthesize general-purpose, caption-style alignment data. In this paper, we propose a data generation framework that produces contrastive-like training data, designed to enhance ALLMs' ability to differentiate between present and absent sounds. We further extend our approach to multi-audio scenarios, enabling the model to either explain differences between audio inputs or produce unified captions that describe all inputs, thereby enhancing audio-language alignment. We refer to the entire ALLM training framework as bootstrapping audio-language alignment via synthetic data generation from backbone LLMs (BALSa). Experimental results indicate that our method effectively mitigates audio hallucinations while reliably maintaining strong performance on audio understanding and reasoning benchmarks, as well as instruction-following skills. Moreover, incorporating multi-audio training further enhances the model's comprehension and reasoning capabilities. Overall, BALSa offers an efficient and scalable approach to developing ALLMs.

A Narrative Review on Large AI Models in Lung Cancer Screening, Diagnosis, and Treatment Planning

Authors: Jiachen Zhong, Yiting Wang, Di Zhu, Ziwei Wang

Lung cancer remains one of the most prevalent and fatal diseases worldwide, demanding accurate and timely diagnosis and treatment. Recent advancements in large AI models have significantly enhanced medical image understanding and clinical decision-making. This review systematically surveys the state-of-the-art in applying large AI models to lung cancer screening, diagnosis, prognosis, and treatment. We categorize existing models into modality-specific encoders, encoder-decoder frameworks, and joint encoder architectures, highlighting key examples such as CLIP, BLIP, Flamingo, BioViL-T, and GLoRIA. We further examine their performance in multimodal learning tasks using benchmark datasets like LIDC-IDRI, NLST, and MIMIC-CXR. Applications span pulmonary nodule detection, gene mutation prediction, multi-omics integration, and personalized treatment planning, with emerging evidence of clinical deployment and validation. Finally, we discuss current limitations in generalizability, interpretability, and regulatory compliance, proposing future directions for building scalable, explainable, and clinically integrated AI systems. Our review underscores the transformative potential of large AI models to personalize and optimize lung cancer care.

Adaptive event-triggered robust tracking control of soft robots

Authors: Renjie Ma, Ziyao Qu, Zhijian Hu, Dong Zhao, Marios M. Polycarpou

Soft robots manufactured with flexible materials can be highly compliant and adaptive to their surroundings, which facilitates their application in areas such as dexterous manipulation and environmental exploration. This paper aims at investigating the tracking control problem for soft robots under uncertainty such as unmodeled dynamics and external disturbance. First, we establish a novel switching function and design the compensated tracking error dynamics by virtue of the command filter. Then, based on the backstepping methodology, the virtual controllers and the adaptive logic estimating the supremum of uncertainty impacts are developed for synthesizing an event-triggered control strategy. In addition, the uniformed finite-time stability certification is derived for different scenarios of the switching function. Finally, we perform a case study of a soft robot to illustrate the effectiveness of the proposed control algorithm.

Score-based Generative Diffusion Models to Synthesize Full-dose FDG Brain PET from MRI in Epilepsy Patients

Authors: Jiaqi Wu, Jiahong Ouyang, Farshad Moradi, Mohammad Mehdi Khalighi, Greg Zaharchuk

Fluorodeoxyglucose (FDG) PET to evaluate patients with epilepsy is one of the most common applications for simultaneous PET/MRI, given the need to image both brain structure and metabolism, but is suboptimal due to the radiation dose in this young population. Little work has been done synthesizing diagnostic quality PET images from MRI data or MRI data with ultralow-dose PET using advanced generative AI methods, such as diffusion models, with attention to clinical evaluations tailored for the epilepsy population. Here we compared the performance of diffusion- and non-diffusion-based deep learning models for the MRI-to-PET image translation task for epilepsy imaging using simultaneous PET/MRI in 52 subjects (40 train/2 validate/10 hold-out test). We tested three different models: 2 score-based generative diffusion models (SGM-Karras Diffusion [SGM-KD] and SGM-variance preserving [SGM-VP]) and a Transformer-Unet. We report results on standard image processing metrics as well as clinically relevant metrics, including congruency measures (Congruence Index and Congruency Mean Absolute Error) that assess hemispheric metabolic asymmetry, which is a key part of the clinical analysis of these images. The SGM-KD produced the best qualitative and quantitative results when synthesizing PET purely from T1w and T2 FLAIR images with the least mean absolute error in whole-brain standardized uptake value ratio (SUVR) and highest intraclass correlation coefficient. When 1% low-dose PET images are included in the inputs, all models improve significantly and are interchangeable for quantitative performance and visual quality. In summary, SGMs hold great potential for pure MRI-to-PET translation, while all 3 model types can synthesize full-dose FDG-PET accurately using MRI and ultralow-dose PET.

Symmetric Sliding-Mode Control of Grid-Forming Inverters With Controllable Region Under AC and DC Sides Varying

Authors: Qianxi Tang, Li Peng

Conventional grid-forming (GFM) controls often entangle voltage formation with power flow and dc-source dynamics, which can degrade voltage tracking performance and stability under grid disturbances, load transients, and dc-side perturbations. To address this issue, a symmetric sliding-mode control (SSMC) method is developed and its explicit voltage controllable region is derived. This region illustrates how far ac-side power dynamics and dc-link voltage variations can be decoupled from the voltage regulation task, which helps predict when the entanglement appears. While conventional sliding-mode controls address voltage-tracking error through complex sliding surface designs, repetitive correction techniques or special reaching laws, this work identifies that the error at power-line frequency primarily stems from the asymmetry property of inverters with the delay effect and the computational inaccuracy. Guided by this insight, a symmetric compensation structure is proposed, which avoids added design complexity and directly mitigates low-frequency voltage tracking errors. Furthermore, the control design is supported by a physical and quantitative explanation, aiding in parameter tuning. Simulation and experimental results demonstrate that the proposed method achieves faster tracking responses (on the order of hundreds of microseconds) while maintaining robust and more accurate tracking under both dc-link voltage and ac-side current variations. Conventional grid-forming and classical sliding-mode controllers, which handle these disturbances separately, cannot match this combined speed and robustness. Furthermore, the voltage controllability analysis is explicitly verified.

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following

Authors: Yinghao Ma, Siyou Li, Juntao Yu, Emmanouil Benetos, Akira Maezawa

Recent advances in audio-text large language models (LLMs) have opened new possibilities for music understanding and generation. However, existing benchmarks are limited in scope, often relying on simplified tasks or multi-choice evaluations that fail to reflect the complexity of real-world music analysis. We reinterpret a broad range of traditional MIR annotations as instruction-following formats and introduce CMI-Bench, a comprehensive music instruction following benchmark designed to evaluate audio-text LLMs on a diverse set of music information retrieval (MIR) tasks. These include genre classification, emotion regression, emotion tagging, instrument classification, pitch estimation, key detection, lyrics transcription, melody extraction, vocal technique recognition, instrument performance technique detection, music tagging, music captioning, and (down)beat tracking, reflecting core challenges in MIR research. Unlike previous benchmarks, CMI-Bench adopts standardized evaluation metrics consistent with previous state-of-the-art MIR models, ensuring direct comparability with supervised approaches. We provide an evaluation toolkit supporting all open-source audio-textual LLMs, including LTU, Qwen-audio, SALMONN, MusiLingo, etc. Experiment results reveal significant performance gaps between LLMs and supervised models, along with their cultural, chronological, and gender biases, highlighting the potential and limitations of current models in addressing MIR tasks. CMI-Bench establishes a unified foundation for evaluating music instruction following, driving progress in music-aware LLMs.

M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset

Authors: Shilong Wu

In the field of speaker diarization, the development of technology is constrained by two problems: insufficient data resources and poor generalization ability of deep learning models. To address these two problems, we first propose an automated method for constructing speaker diarization datasets, which generates more accurate pseudo-labels for massive data through the combination of audio and video. Relying on this method, we have released the Multi-modal, Multi-scenario and Multi-language Speaker Diarization (M3SD) dataset. This dataset is derived from real network videos and is highly diverse. Our dataset and code have been open-sourced at this https URL.

MDR-DeePC: Model-Inspired Distributionally Robust Data-Enabled Predictive Control

Authors: Shihao Li, Jiachen Li, Christopher Martin, Soovadeep Bakshi, Dongmei Chen

This paper presents a Model-Inspired Distributionally Robust Data-enabled Predictive Control (MDR-DeePC) framework for systems with partially known and uncertain dynamics. The proposed method integrates model-based equality constraints for known dynamics with a Hankel matrix-based representation of unknown dynamics. A distributionally robust optimization problem is formulated to account for parametric uncertainty and stochastic disturbances. Simulation results on a triple-mass-spring-damper system demonstrate improved disturbance rejection, reduced output oscillations, and lower control cost compared to standard DeePC. The results validate the robustness and effectiveness of MDR-DeePC, with potential for real-time implementation pending further benchmarking.
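
The Hankel matrix-based representation mentioned above stacks shifted windows of recorded data. The sketch below builds such a matrix for a scalar signal; it is a generic construction, not the MDR-DeePC formulation itself, and the function name `hankel_matrix`, the `depth` parameter, and the sample data are assumptions for illustration.

```python
def hankel_matrix(signal, depth):
    """Build a Hankel matrix with `depth` rows from a scalar data sequence.

    Entry (i, j) equals signal[i + j], so each column is a length-`depth`
    window of the recorded trajectory.
    """
    cols = len(signal) - depth + 1
    if depth < 1 or cols < 1:
        raise ValueError("depth must be between 1 and len(signal)")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


# Hypothetical recorded input samples.
recorded = [1, 2, 3, 4, 5]
H = hankel_matrix(recorded, depth=3)
```

In data-enabled predictive control, candidate trajectories are typically expressed as linear combinations of the columns of such matrices, which is how recorded data can stand in for an explicit model of the unknown dynamics.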

Fusing Radiomic Features with Deep Representations for Gestational Age Estimation in Fetal Ultrasound Images

Authors: Fangyijie Wang, Yuan Liang, Sourav Bhattacharjee, Abey Campbell, Kathleen M. Curran, Guénolé Silvestre

Accurate gestational age (GA) estimation, ideally through fetal ultrasound measurement, is a crucial aspect of providing excellent antenatal care. However, deriving GA from manual fetal biometric measurements depends on the operator and is time-consuming. Hence, automatic computer-assisted methods are demanded in clinical practice. In this paper, we present a novel feature fusion framework to estimate GA using fetal ultrasound images without any measurement information. We adopt a deep learning model to extract deep representations from ultrasound images. We extract radiomic features to reveal patterns and characteristics of fetal brain growth. To harness the interpretability of radiomics in medical imaging analysis, we estimate GA by fusing radiomic features and deep representations. Our framework estimates GA with a mean absolute error of 8.0 days across three trimesters, outperforming current machine learning-based methods at these gestational ages. Experimental results demonstrate the robustness of our framework across different populations in diverse geographical regions. Our code is publicly available at this https URL.

Identifiability and Maximum Likelihood Estimation for System Identification of Networks of Dynamical Systems

Authors: Anders Hansson, João Victor Galvão da Mata, Martin S. Andersen

In this paper we investigate identifiability and maximum likelihood estimation for direct system identification of networks of dynamical systems. We provide necessary and sufficient conditions for network identifiability in terms of Gröbner bases. We show that the maximum likelihood approach is both consistent and efficient, which is in contrast to existing prediction error approaches. Moreover, our approach has wider applicability, i.e., it is applicable whenever network identifiability holds. Finally, we show that we can formulate the maximum likelihood problem without the use of a predictor, which is the key to numerically being able to solve it efficiently.

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Authors: Huadai Liu, Jialei Wang, Kaicheng Luo, Wen Wang, Qian Chen, Zhou Zhao, Wei Xue

While end-to-end video-to-audio generation has greatly improved, producing high-fidelity audio that authentically captures the nuances of visual content remains challenging. As with the work of professionals in the creative industries, such generation requires sophisticated reasoning about items such as visual dynamics, acoustic environments, and temporal relationships. We present ThinkSound, a novel framework that leverages Chain-of-Thought (CoT) reasoning to enable stepwise, interactive audio generation and editing for videos. Our approach decomposes the process into three complementary stages: foundational foley generation that creates semantically coherent soundscapes, interactive object-centric refinement through precise user interactions, and targeted editing guided by natural language instructions. At each stage, a multimodal large language model generates contextually aligned CoT reasoning that guides a unified audio foundation model. Furthermore, we introduce AudioCoT, a comprehensive dataset with structured reasoning annotations that establishes connections between visual content, textual descriptions, and sound synthesis. Experiments demonstrate that ThinkSound achieves state-of-the-art performance in video-to-audio generation across both audio metrics and CoT metrics and excels on the out-of-distribution Movie Gen Audio benchmark. The demo page is available at this https URL.

Dehazing Light Microscopy Images with Guided Conditional Flow Matching: finding a sweet spot between fidelity and realism

Authors: Anirban Ray, Ashesh, Florian Jug

Fluorescence microscopy is a major driver of scientific progress in the life sciences. Although high-end confocal microscopes are capable of filtering out-of-focus light, cheaper and more accessible microscopy modalities, such as widefield microscopy, cannot, which consequently leads to hazy image data. Computational dehazing tries to combine the best of both worlds, leading to cheap microscopy but crisp-looking images. The perception-distortion trade-off tells us that we can optimize either for data fidelity, e.g. low MSE or high PSNR, or for data realism, measured by perceptual metrics such as LPIPS or FID. Existing methods either prioritize fidelity at the expense of realism, or produce perceptually convincing results that lack quantitative accuracy. In this work, we propose HazeMatching, a novel iterative method for dehazing light microscopy images, which effectively balances these objectives. Our goal was to find a balanced trade-off between the fidelity of the dehazing results and the realism of individual predictions (samples). We achieve this by adapting the conditional flow matching framework by guiding the generative process with a hazy observation in the conditional velocity field. We evaluate HazeMatching on 5 datasets, covering both synthetic and real data, assessing both distortion and perceptual quality. Our method is compared against 7 baselines, achieving a consistent balance between fidelity and realism on average. Additionally, with calibration analysis, we show that HazeMatching produces well-calibrated predictions. Note that our method does not need an explicit degradation operator to exist, making it easily applicable on real microscopy data. All data used for training and evaluation and our code will be publicly available under a permissive license.
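
The fidelity metrics named above, MSE and PSNR, follow standard definitions and can be sketched in a few lines. This is a generic sketch on flat lists of pixel intensities, not the paper's evaluation code; the function names and the assumed intensity range of [0, 1] (`max_value=1.0`) are illustrative assumptions.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally long intensity lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical two-pixel images with a uniform error of 0.1.
clean = [0.0, 0.0]
dehazed = [0.1, 0.1]
quality_db = psnr(clean, dehazed)
```

Both metrics reward pixel-wise closeness to the reference, so an over-smoothed average prediction can score well on them while looking unrealistic, which is the tension the perception-distortion trade-off describes.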

Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion

Authors: Phuc Duc Nguyen, Kenji Ishikawa, Noboru Harada, Takehiro Moriya

Acousto-optic sensing provides an alternative approach to traditional microphone arrays by shedding light on the interaction of light with an acoustic field. Sound field reconstruction is a fascinating and advanced technique used in acousto-optic sensing. Current challenges in sound-field reconstruction methods pertain to scenarios in which the sound source is located within the reconstruction area, known as the exterior problem. Existing reconstruction algorithms, primarily designed for interior scenarios, often exhibit suboptimal performance when applied to exterior cases. This paper introduces a novel technique for exterior sound-field reconstruction. The proposed method leverages concentric circle sampling and a two-dimensional exterior sound-field reconstruction approach based on circular harmonic expansions. To evaluate the efficacy of this approach, both numerical simulations and practical experiments are conducted. The results highlight the superior accuracy of the proposed method when compared to conventional reconstruction methods, all while utilizing a minimal amount of measured projection data.

KnowSafe: Combined Knowledge and Data Driven Hazard Mitigation in Artificial Pancreas Systems

Authors: Xugui Zhou, Maxfield Kouzel, Chloe Smith, Homa Alemzadeh

Significant progress has been made in anomaly detection and run-time monitoring to improve the safety and security of cyber-physical systems (CPS). However, less attention has been paid to hazard mitigation. This paper proposes a combined knowledge and data driven approach, KnowSafe, for the design of safety engines that can predict and mitigate safety hazards resulting from safety-critical malicious attacks or accidental faults targeting a CPS controller. We integrate domain-specific knowledge of safety constraints and context-specific mitigation actions with machine learning (ML) techniques to estimate system trajectories in the far and near future, infer potential hazards, and generate optimal corrective actions to keep the system safe. Experimental evaluation on two realistic closed-loop testbeds for artificial pancreas systems (APS) and a real-world clinical trial dataset for diabetes treatment demonstrates that KnowSafe outperforms the state-of-the-art by achieving higher accuracy in predicting system state trajectories and potential hazards, a low false positive rate, and no false negatives. It also maintains the safe operation of the simulated APS despite faults or attacks without introducing any new hazards, with a hazard mitigation success rate of 92.8%, which is at least 76% higher than solely rule-based (50.9%) and data-driven (52.7%) methods.
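
The "at least 76% higher" figure above is a relative improvement over the baseline success rates, which can be checked with a short calculation. The helper name `relative_gain` is an assumption for illustration; the three percentages are the ones quoted in the abstract.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Hazard mitigation success rates in percent, as quoted in the abstract.
knowsafe, rule_based, data_driven = 92.8, 50.9, 52.7

gain_vs_rule_based = relative_gain(knowsafe, rule_based)
gain_vs_data_driven = relative_gain(knowsafe, data_driven)
```

The smaller of the two gains, measured against the data-driven baseline, is just over 0.76, which is consistent with reading the claim as a relative rather than absolute (percentage-point) difference.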

Deep Multi-Manifold Transformation Based Multivariate Time Series Fault Detection

Authors: Hong Liu, Xiuxiu Qiu, Yiming Shi, Miao Xu, Zelin Zang, Zhen Lei

Unsupervised fault detection in multivariate time series plays a vital role in ensuring the stable operation of complex systems. Traditional methods often assume that normal data follow a single Gaussian distribution and identify anomalies as deviations from this distribution. However, this simplified assumption fails to capture the diversity and structural complexity of real-world time series, which can lead to misjudgments and reduced detection performance in practical applications. To address this issue, we propose a new method that combines a neighborhood-driven data augmentation strategy with a multi-manifold representation learning framework. By incorporating information from local neighborhoods, the augmentation module can simulate contextual variations of normal data, enhancing the model's adaptability to distributional changes. In addition, we design a structure-aware feature learning approach that encourages natural clustering of similar patterns in the feature space while maintaining sufficient distinction between different operational states. Extensive experiments on several public benchmark datasets demonstrate that our method achieves superior performance in terms of both accuracy and robustness, showing strong potential for generalization and real-world deployment.

MedLeak: Multimodal Medical Data Leakage in Secure Federated Learning with Crafted Models

Authors: Shanghao Shi, Md Shahedul Haque, Abhijeet Parida, Chaoyu Zhang, Marius George Linguraru, Y. Thomas Hou, Syed Muhammad Anwar, Wenjing Lou

Federated learning (FL) allows participants to collaboratively train machine learning models while keeping their data local, making it ideal for collaborations among healthcare institutions on sensitive data. However, in this paper, we propose a novel privacy attack called MedLeak, which allows a malicious FL server to recover high-quality site-specific private medical data from the client model updates. MedLeak works by introducing an adversarially crafted model during the FL training process. Honest clients, unaware of the insidious changes in the published models, continue to send back their updates as per the standard FL protocol. Leveraging a novel analytical method, MedLeak can efficiently recover private client data from the aggregated parameter updates, eliminating costly optimization. In addition, the scheme relies solely on the aggregated updates, thus rendering secure aggregation protocols ineffective, as they depend on the randomization of intermediate results for security while leaving the final aggregated results unaltered. We implement MedLeak on medical image datasets (MedMNIST, COVIDx CXR-4, and Kaggle Brain Tumor MRI), as well as a medical text dataset (MedAbstract). The results demonstrate that our attack achieves high recovery rates and strong quantitative scores on both image and text datasets. We also thoroughly evaluate MedLeak across different attack parameters, providing insights into key factors that influence attack performance and potential defenses. Furthermore, we demonstrate that the recovered data can support downstream tasks such as disease classification with minimal performance loss. Our findings validate the need for enhanced privacy measures in FL systems, particularly for safeguarding sensitive medical data against powerful model inversion attacks.

Generalized Ellipsoids

Authors: Amir Ali Ahmadi, Abraar Chaudhry, Cemil Dibek

We introduce a family of symmetric convex bodies called generalized ellipsoids of degree $d$ (GE-$d$s), with ellipsoids corresponding to the case of $d=0$. Generalized ellipsoids (GEs) retain many geometric, algebraic, and algorithmic properties of ellipsoids. We show that the conditions that the parameters of a GE must satisfy can be checked in strongly polynomial time, and that one can search for GEs of a given degree by solving a semidefinite program whose size grows only linearly with dimension. We give an example of a GE which does not have a second-order cone representation, but show that every GE has a semidefinite representation whose size depends linearly on both its dimension and degree. In terms of expressiveness, we prove that for any integer $m\geq 2$, every symmetric full-dimensional polytope with $2m$ facets and every intersection of $m$ co-centered ellipsoids can be represented exactly as a GE-$d$ with $d \leq 2m-3$. Using this result, we show that every symmetric convex body can be approximated arbitrarily well by a GE-$d$ and we quantify the quality of the approximation as a function of the degree $d$. Finally, we present applications of GEs to several areas, such as time-varying portfolio optimization, stability analysis of switched linear systems, robust-to-dynamics optimization, and robust polynomial regression.

METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation via Transformer VAE

Authors: Dinh-Viet-Toan Le, Yi-Hsuan Yang

Re-orchestration is the process of adapting a music piece for a different set of instruments. By altering the original instrumentation, the orchestrator often modifies the musical texture while preserving a recognizable melodic line and ensures that each part is playable within the technical and expressive capabilities of the chosen instruments. In this work, we propose METEOR, a model for generating Melody-aware Texture-controllable re-Orchestration with a Transformer-based variational auto-encoder (VAE). This model performs symbolic instrumental and textural music style transfers with a focus on melodic fidelity and controllability. We allow bar- and track-level controllability of the accompaniment with various textural attributes while keeping a homophonic texture. With both subjective and objective evaluations, we show that our model outperforms style transfer models on a re-orchestration task in terms of generation quality and controllability. Moreover, it can be adapted for a lead sheet orchestration task as a zero-shot learning model, achieving performance comparable to a model specifically trained for this task.

NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics

Authors: David Robinson, Marius Miron, Masato Hagiwara, Benno Weck, Sara Keen, Milad Alizadeh, Gagan Narula, Matthieu Geist, Olivier Pietquin

Large language models (LLMs) prompted with text and audio have achieved state-of-the-art performance across various auditory tasks, including speech, music, and general audio, showing emergent abilities on unseen tasks. However, their potential has yet to be fully demonstrated in bioacoustics tasks, such as detecting animal vocalizations in large recordings, classifying rare and endangered species, and labeling context and behavior -- tasks that are crucial for conservation, biodiversity monitoring, and animal behavior studies. In this work, we present NatureLM-audio, the first audio-language foundation model specifically designed for bioacoustics. Our training dataset consists of carefully curated text-audio pairs spanning bioacoustics, speech, and music, designed to address the field's limited availability of annotated data. We demonstrate successful transfer of learned representations from music and speech to bioacoustics, and our model shows promising generalization to unseen taxa and tasks. We evaluate NatureLM-audio on a novel benchmark (BEANS-Zero) and it sets a new state of the art on several bioacoustics tasks, including zero-shot classification of unseen species. To advance bioacoustics research, we release our model weights and benchmark data, and open-source the code for generating training and benchmark data and for training the model.

FreeCodec: A disentangled neural speech codec with fewer tokens

Authors: Youqiang Zheng, Weiping Tu, Yueteng Kang, Jie Chen, Yike Zhang, Li Xiao, Yuhong Yang, Long Ma

Neural speech codecs have gained great attention for their outstanding reconstruction with discrete token representations. They are a crucial component in generative tasks such as speech coding and large language models (LLMs). However, most works based on residual vector quantization perform worse with fewer tokens due to low coding efficiency for modeling complex coupled information. In this paper, we propose a neural speech codec named FreeCodec, which employs a more effective encoding framework by decomposing intrinsic properties of speech into different components: 1) a global vector is extracted as the timbre information, 2) a prosody encoder with a long stride level is used to model the prosody information, 3) the content information is from a content encoder. Using different training strategies, FreeCodec achieves state-of-the-art performance in reconstruction and disentanglement scenarios. Results from subjective and objective experiments demonstrate that our framework outperforms existing methods.

FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Authors: Taekyung Ki, Dongchan Min, Gyeongsu Chae

With the rapid advancement of diffusion-based generative models, portrait image animation has achieved remarkable results. However, it still faces challenges in temporally consistent video generation and fast sampling due to its iterative sampling nature. This paper presents FLOAT, an audio-driven talking portrait video generation method based on a flow matching generative model. Instead of a pixel-based latent space, we take advantage of a learned orthogonal motion latent space, enabling efficient generation and editing of temporally consistent motion. To achieve this, we introduce a transformer-based vector field predictor with an effective frame-wise conditioning mechanism. Additionally, our method supports speech-driven emotion enhancement, enabling a natural incorporation of expressive motions. Extensive experiments demonstrate that our method outperforms state-of-the-art audio-driven talking portrait methods in terms of visual quality, motion fidelity, and efficiency.

Interaction Identification of a Heterogeneous NDS with Quadratic-Bilinear Subsystems

Authors: Tong Zhou, Yubing Li

This paper attacks time-domain identification for interaction parameters of a heterogeneous networked dynamic system (NDS), with each of its subsystems being described by a continuous-time descriptor quadratic-bilinear time-invariant (QBTI) model. The obtained results can also be applied to parameter estimation for a lumped QBTI system. No restrictions are put on the sampling rate. Explicit formulas are derived respectively for the transient and steady-state responses of the NDS, provided that the probing signal is generated by a linear time-invariant (LTI) system. Some relations have been derived between the NDS steady-state response and its frequency-domain input-output mappings. These relations reveal that the value of some NDS-associated generalized TFMs can in principle be estimated at almost any point of interest on the imaginary axis from time-domain input-output experimental data, as well as its derivatives and a right tangential interpolation along an arbitrary direction. Based on these relations, an estimation algorithm is suggested respectively for the parameters of the NDS and the values of these generalized TFMs. A numerical example is included to illustrate characteristics of the suggested estimation algorithms.

DisCoPatch: Taming Adversarially-driven Batch Statistics for Improved Out-of-Distribution Detection

Authors: Francisco Caetano, Christiaan Viviers, Luis A. Zavala-Mondragón, Peter H. N. de With, Fons van der Sommen

Out-of-distribution (OOD) detection holds significant importance across many applications. While semantic and domain-shift OOD problems are well-studied, this work focuses on covariate shifts - subtle variations in the data distribution that can degrade machine learning performance. We hypothesize that detecting these subtle shifts can improve our understanding of in-distribution boundaries, ultimately improving OOD detection. In adversarial discriminators trained with Batch Normalization (BN), real and adversarial samples form distinct domains with unique batch statistics - a property we exploit for OOD detection. We introduce DisCoPatch, an unsupervised Adversarial Variational Autoencoder (VAE) framework that harnesses this mechanism. During inference, batches consist of patches from the same image, ensuring a consistent data distribution that allows the model to rely on batch statistics. DisCoPatch uses the VAE's suboptimal outputs (generated and reconstructed) as negative samples to train the discriminator, thereby improving its ability to delineate the boundary between in-distribution samples and covariate shifts. By tightening this boundary, DisCoPatch achieves state-of-the-art results in public OOD detection benchmarks. The proposed model not only excels in detecting covariate shifts, achieving 95.5% AUROC on ImageNet-1K(-C) but also outperforms all prior methods on public Near-OOD (95.0%) benchmarks. With a compact model size of 25MB, it achieves high OOD detection performance at notably lower latency than existing methods, making it an efficient and practical solution for real-world OOD detection applications. The code is publicly available.

Drivetrain simulation using variational autoencoders

Authors: Pallavi Sharma, Jorge-Humberto Urrea-Quintero, Bogdan Bogdan, Adrian-Dumitru Ciotec, Laura Vasilie, Henning Wessels, Matteo Skull

This work proposes variational autoencoders (VAEs) to predict a vehicle's jerk signals from torque demand in the context of limited real-world drivetrain datasets. We implement both unconditional and conditional VAEs, trained on experimental data from two variants of a fully electric SUV with differing torque and drivetrain configurations. The VAEs synthesize jerk signals that capture characteristics from multiple drivetrain scenarios by leveraging the learned latent space. A performance comparison with baseline physics-based and hybrid models confirms the effectiveness of the VAEs, without requiring detailed system parametrization. Unconditional VAEs generate realistic jerk signals without prior system knowledge, while conditional VAEs enable the generation of signals tailored to specific torque inputs. This approach reduces the dependence on costly and time-intensive real-world experiments and extensive manual modeling. The results support the integration of generative models such as VAEs into drivetrain simulation pipelines, both for data augmentation and for efficient exploration of complex operational scenarios, with the potential to streamline validation and accelerate vehicle development.

Semantic-Aware Adaptive Video Streaming Using Latent Diffusion Models for Wireless Networks

Authors: Zijiang Yan, Jianhua Pei, Hongda Wu, Hina Tabassum, Ping Wang

This paper proposes a novel Semantic Communication (SemCom) framework for real-time adaptive-bitrate video streaming by integrating Latent Diffusion Models (LDMs) within the FFmpeg techniques. This solution addresses the challenges of high bandwidth usage, storage inefficiencies, and quality of experience (QoE) degradation associated with traditional Constant Bitrate Streaming (CBS) and Adaptive Bitrate Streaming (ABS). The proposed approach leverages LDMs to compress I-frames into a latent space, offering significant storage and semantic transmission savings without sacrificing high visual quality. While retaining B-frames and P-frames as adjustment metadata to support efficient refinement of video reconstruction at the user side, the proposed framework further incorporates state-of-the-art denoising and Video Frame Interpolation (VFI) techniques. These techniques mitigate semantic ambiguity and restore temporal coherence between frames, even in noisy wireless communication environments. Experimental results demonstrate the proposed method achieves high-quality video streaming with optimized bandwidth usage, outperforming state-of-the-art solutions in terms of QoE and resource efficiency. This work opens new possibilities for scalable real-time video streaming in 5G and future post-5G networks.

Efficient malicious information detection method based on set partitioning for large-scale Internet of Things

Authors: Yuhan Suo, Runqi Chai, Kaiyuan Chen, Senchun Chai, Wannian Liang, Yuanqing Xia

With the large-scale integration of Internet of Things (IoT) into enterprise information management systems, organizations are pursuing digital transformation that hinges on real-time data insights-and yet face escalating security and governance risks. Detecting and responding to threats at scale without impairing system efficiency has therefore become a critical information-management and decision-support challenge for today's executives. This paper develops a distributed, gain-based anomaly-detection framework tailored to IoT-enabled enterprise systems, underpinned by an optimized sensor-subset partitioning strategy. Starting from the perspective of set partitioning strategies, this study analyzes the key factor that contributes to the performance differences between distributed and centralized algorithms. By examining the gain mutual influence of sensor subsets, an optimal set partitioning strategy is designed to minimize inter-subset mutual influence while enhancing intra-subset correlation. To further reduce the computational cost of gain updates, a suboptimal partitioning strategy based on Grassmann distance is proposed, improving the efficiency of selecting suspicious sensors. Theoretical analysis demonstrates that this approach effectively reduces the computational cost of gain updates while maintaining detection performance. Finally, simulation results validate the effectiveness of the proposed method in enhancing attack detection performance.

Controlled Invariance in Fully Actuated Max-plus Linear Systems with Precedence Semimodules

Authors: Davide Zorzenon, Jörg Raisch

Given a max-plus linear system and a semimodule, the problem of computing the maximal controlled invariant subsemimodule is still open to this day. In this paper, we consider this problem for the specific class of fully actuated systems and constraints in the form of precedence semimodules. The assumption of full actuation corresponds to the existence of an input for each component of the system state. A precedence semimodule is the set of solutions of inequalities typically used to represent time-window constraints. We prove that, in this setting, it is possible to (i) compute the maximal controlled invariant subsemimodule and (ii) decide the convergence of a fixed-point algorithm introduced by R.D. Katz in strongly polynomial time.
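
To make the setting concrete, the following is a minimal sketch of a max-plus linear system step, not the authors' algorithm. It assumes the common state equation x(k+1) = (A ⊗ x(k)) ⊕ (B ⊗ u(k)), models full actuation by taking B to be the max-plus identity (an assumption for illustration), and writes a precedence constraint as x ≥ C ⊗ x, i.e. x_i ≥ C[i, j] + x_j (also an assumed form). All function and variable names are hypothetical.

```python
import numpy as np

NEG_INF = -np.inf  # the max-plus "zero" element


def maxplus_matvec(M, v):
    """Max-plus product: (M ⊗ v)_i = max_j (M[i, j] + v[j])."""
    M = np.asarray(M, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.max(M + v[np.newaxis, :], axis=1)


def maxplus_step(A, B, x, u):
    """One step x(k+1) = (A ⊗ x(k)) ⊕ (B ⊗ u(k)); ⊕ is the elementwise max."""
    return np.maximum(maxplus_matvec(A, x), maxplus_matvec(B, u))


def satisfies_precedence(C, x):
    """Check x >= C ⊗ x, i.e. x_i >= C[i, j] + x_j for all i, j (assumed form)."""
    return bool(np.all(np.asarray(x, dtype=float) >= maxplus_matvec(C, x)))


# Illustrative data (hypothetical): two states, one input per state.
A = [[2.0, NEG_INF], [1.0, 3.0]]
B = [[0.0, NEG_INF], [NEG_INF, 0.0]]      # max-plus identity: full actuation
C = [[NEG_INF, NEG_INF], [1.0, NEG_INF]]  # encodes x_2 >= 1 + x_1

x = [0.0, 0.0]
u = [5.0, 0.0]
x_next = maxplus_step(A, B, x, u)         # elementwise max of [2, 3] and [5, 0]
ok_autonomous = satisfies_precedence(C, maxplus_matvec(A, x))
ok_controlled = satisfies_precedence(C, x_next)
```

In this toy instance the uncontrolled successor state satisfies the constraint while the chosen input pushes the state out of it. That is the kind of question controlled invariance asks: whether an input exists that keeps the state inside the semimodule.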

Optimal Beamforming for Multi-Target Multi-User ISAC Exploiting Prior Information: How Many Sensing Beams Are Needed?

Authors: Jiayi Yao, Shuowen Zhang

This paper studies a multi-target multi-user integrated sensing and communication (ISAC) system where a multi-antenna base station (BS) communicates with multiple single-antenna users in the downlink and senses the unknown and random angle information of multiple targets based on their reflected echo signals at the BS receiver as well as their prior probability information. We focus on a general beamforming structure with both communication beams and dedicated sensing beams, whose design is highly non-trivial as more sensing beams provide more flexibility in sensing, but introduce extra interference to communication. To resolve this trade-off, we first characterize the periodic posterior Cramér-Rao bound (PCRB) as a lower bound of the mean-cyclic error (MCE) in multi-target sensing. Then, we optimize the beamforming to minimize the maximum periodic PCRB among all targets to ensure fairness, subject to individual communication rate constraints at multiple users. Despite the non-convexity of this problem, we propose a general construction method for the optimal solution by leveraging semi-definite relaxation (SDR), and derive a general bound on the number of sensing beams needed. Moreover, we unveil specific structures of the optimal solution in various cases, where tighter bounds on the number of sensing beams needed are derived (e.g., no or at most one sensing beam is needed under stringent rate constraints or with homogeneous targets). Next, we study the beamforming optimization to minimize the sum periodic PCRB under user rate constraints. By applying SDR, we propose a general construction method for the optimal solution and its specific structures, which yield lower computational complexities. We derive a general bound and various tighter bounds on the number of sensing beams needed. Numerical results validate our analysis and the effectiveness of our proposed beamforming designs.

Wi-Fi 6 Cross-Technology Interference Detection and Mitigation by OFDMA: an Experimental Study

Authors: Thijs Havinga, Xianjun Jiao, Wei Liu, Baiheng Chen, Adnan Shahid, Ingrid Moerman

Cross-Technology Interference (CTI) poses challenges for the performance and robustness of wireless networks. There are opportunities for better cooperation if the spectral occupation and technology of the interference can be detected. Namely, this information can help the Orthogonal Frequency Division Multiple Access (OFDMA) scheduler in IEEE 802.11ax (Wi-Fi 6) to efficiently allocate resources to multiple users in the frequency domain. This work shows that a single Channel State Information (CSI) snapshot, which is used for packet demodulation in the receiver, is enough to detect and classify the type of CTI on low-cost Wi-Fi 6 hardware. We show the classification accuracy of a small Convolutional Neural Network (CNN) for different Signal-to-Noise Ratio (SNR) and Signal-to-Interference Ratio (SIR) with simulated data, as well as using a wired and over-the-air test with a professional wireless connectivity tester, while running the inference on the low-cost device. Furthermore, we use openwifi, a full-stack Wi-Fi transceiver running on software-defined radio (SDR) available in the w-iLab.t testbed, as Access Point (AP) to implement a CTI-aware multi-user OFDMA scheduler when the clients send CTI detection feedback to the AP. We show experimentally that it can fully mitigate the 35% throughput loss caused by CTI when the AP applies the appropriate scheduling.

Modular Distributed Nonconvex Learning with Error Feedback

Authors: Guido Carnevale, Nicola Bastianello

In this paper, we design a novel distributed learning algorithm using stochastic compressed communications. In detail, we pursue a modular approach, merging ADMM and a gradient-based approach, benefiting from the robustness of the former and the computational efficiency of the latter. Additionally, we integrate a stochastic integral action (error feedback) enabling almost sure rejection of the compression error. We analyze the resulting method in nonconvex scenarios and guarantee almost sure asymptotic convergence to the set of stationary points of the problem. This result is obtained using system-theoretic tools based on stochastic timescale separation. We corroborate our findings with numerical simulations in nonconvex classification.
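
The error-feedback idea can be illustrated with a minimal sketch of the classic scheme on plain gradient descent. This is not the authors' ADMM-based method: it uses a deterministic top-k compressor instead of stochastic compression, and all names are hypothetical. The residual left behind by the compressor is stored and added back before the next compression, so compression error is carried forward rather than lost.

```python
import numpy as np


def top_k(v, k):
    """Keep the k largest-magnitude entries of v (k >= 1) and zero the rest."""
    v = np.asarray(v, dtype=float)
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out


def error_feedback_step(x, e, grad, lr, k):
    """One compressed gradient step with error feedback.

    x: current iterate, e: accumulated compression error,
    grad: gradient at x, lr: step size, k: entries kept by the compressor.
    """
    p = lr * grad + e   # intended update plus the error carried over
    c = top_k(p, k)     # what would actually be transmitted
    e_new = p - c       # residual the compressor dropped
    x_new = x - c       # apply only the compressed update
    return x_new, e_new


# Illustrative run on f(x) = 0.5 * ||x||^2, whose gradient is x.
x = np.array([3.0, -5.0, 1.0])
e = np.zeros(3)
lr = 0.1
x_new, e_new = error_feedback_step(x, e, grad=x, lr=lr, k=1)
```

A useful property of this construction is that the corrected sequence x - e follows exact (uncompressed) gradient steps evaluated at x, which is why the stored error can be seen as an integral action on the compression error.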

A Coupled Friedkin-Johnsen Model of Popularity Dynamics in Social Media

Authors: Gaya Cocca, Paolo Frasca, Chiara Ravazzi

Popularity dynamics in social media depend on a complex interplay of social influence between users and popularity-based recommendations that are provided by the platforms. In this work, we introduce a discrete-time dynamical system to model the evolution of popularity on social media. Our model generalizes the well-known Friedkin-Johnsen model to a set of influencers vying for popularity. We study the asymptotic behavior of this model and illustrate it with numerical examples. Our results highlight the interplay of social influence, past popularity, and content quality in determining the popularity of influencers.
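
For context, the classical Friedkin-Johnsen model that this work generalizes can be sketched as below. The abstract does not specify the coupled model, so this shows only the standard update x(k+1) = Λ W x(k) + (I - Λ) x(0), with a row-stochastic influence matrix W and susceptibilities λ_i in [0, 1). The numerical values and function names are hypothetical.

```python
import numpy as np


def fj_simulate(W, lam, x0, steps):
    """Iterate x(k+1) = diag(lam) W x(k) + (I - diag(lam)) x(0)."""
    x = x0.copy()
    for _ in range(steps):
        x = lam * (W @ x) + (1.0 - lam) * x0
    return x


def fj_limit(W, lam, x0):
    """Closed-form limit (I - diag(lam) W)^{-1} (I - diag(lam)) x(0).

    Valid when every lam_i < 1, so that I - diag(lam) W is invertible.
    """
    n = len(x0)
    return np.linalg.solve(np.eye(n) - lam[:, None] * W, (1.0 - lam) * x0)


# Illustrative two-agent example (hypothetical numbers).
W = np.array([[0.5, 0.5], [0.3, 0.7]])  # row-stochastic influence weights
lam = np.array([0.8, 0.5])              # susceptibility to social influence
x0 = np.array([1.0, 0.0])               # initial opinions (also the "prejudices")

x_sim = fj_simulate(W, lam, x0, steps=200)
x_inf = fj_limit(W, lam, x0)
```

With all susceptibilities strictly below one, the iteration converges to the closed-form limit, which blends each agent's initial value with the influence of its neighbors.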

Super-Resolution Generative Adversarial Networks based Video Enhancement

Authors: Kağan Çetin, Hacer Akça, Ömer Nezih Gerek

This study introduces an enhanced approach to video super-resolution by extending the ordinary Single-Image Super-Resolution (SISR) Super-Resolution Generative Adversarial Network (SRGAN) structure to handle spatio-temporal data. While SRGAN has proven effective for single-image enhancement, its design does not account for the temporal continuity required in video processing. To address this, a modified framework that incorporates 3D Non-Local Blocks is proposed, enabling the model to capture relationships across both spatial and temporal dimensions. An experimental training pipeline is developed, based on patch-wise learning and advanced data degradation techniques, to simulate real-world video conditions and learn from both local and global structures and details. This helps the model generalize better and remain stable across varying video content while preserving the overall structure in addition to pixel-wise correctness. Two model variants, one larger and one more lightweight, are presented to explore the trade-offs between performance and efficiency. The results demonstrate improved temporal coherence, sharper textures, and fewer visual artifacts compared to traditional single-image methods. This work contributes to the development of practical, learning-based solutions for video enhancement tasks, with potential applications in streaming, gaming, and digital restoration.

Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech

Authors: Nam-Gyu Kim, Deok-Hyeon Cho, Seung-Bin Kim, Seong-Whan Lee

Recent advances in expressive text-to-speech (TTS) have introduced diverse methods based on style embedding extracted from reference speech. However, synthesizing high-quality expressive speech remains challenging. We propose Spotlight-TTS, which exclusively emphasizes style via voiced-aware style extraction and style direction adjustment. Voiced-aware style extraction focuses on voiced regions highly related to style while maintaining continuity across different speech regions to improve expressiveness. We adjust the direction of the extracted style for optimal integration into the TTS model, which improves speech quality. Experimental results demonstrate that Spotlight-TTS achieves superior performance compared to baseline models in terms of expressiveness, overall speech quality, and style transfer capability. Our audio samples are publicly available.

Video-Guided Text-to-Music Generation Using Public Domain Movie Collections

Authors: Haven Kim, Zachary Novack, Weihan Xu, Julian McAuley, Hao-Wen Dong

Despite recent advancements in music generation systems, their application in film production remains limited, as they struggle to capture the nuances of real-world filmmaking, where filmmakers consider multiple factors-such as visual content, dialogue, and emotional tone-when selecting or composing music for a scene. This limitation primarily stems from the absence of comprehensive datasets that integrate these elements. To address this gap, we introduce Open Screen Soundtrack Library (OSSL), a dataset consisting of movie clips from public domain films, totaling approximately 36.5 hours, paired with high-quality soundtracks and human-annotated mood information. To demonstrate the effectiveness of our dataset in improving the performance of pre-trained models on film music generation tasks, we introduce a new video adapter that enhances an autoregressive transformer-based text-to-music model by adding video-based conditioning. Our experimental results demonstrate that our proposed approach effectively enhances MusicGen-Medium in terms of both objective measures of distributional and paired fidelity, and subjective compatibility in mood and genre. To facilitate reproducibility and foster future work, we publicly release the dataset, code, and demo.

Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion

Authors: Markus Frohmann, Gabriel Meseguer-Brocal, Markus Schedl, Elena V. Epure

The rapid advancement of AI-based music generation tools is revolutionizing the music industry but also posing challenges to artists, copyright holders, and providers alike. This necessitates reliable methods for detecting such AI-generated content. However, existing detectors, relying on either audio or lyrics, face key practical limitations: audio-based detectors fail to generalize to new or unseen generators and are vulnerable to audio perturbations; lyrics-based methods require cleanly formatted and accurate lyrics, unavailable in practice. To overcome these limitations, we propose a novel, practically grounded approach: a multimodal, modular late-fusion pipeline that combines automatically transcribed sung lyrics and speech features capturing lyrics-related information within the audio. By relying on lyrical aspects directly from audio, our method enhances robustness, mitigates susceptibility to low-level artifacts, and enables practical applicability. Experiments show that our method, DE-detect, outperforms existing lyrics-based detectors while also being more robust to audio perturbations. Thus, it offers an effective, robust solution for detecting AI-generated music in real-world scenarios. Our code is publicly available.

Efficient Channel Estimation for Rotatable Antenna-Enabled Wireless Communication

Authors: Xue Xiong, Beixiong Zheng, Wen Wu, Xiaodan Shao, Liang Dai, Ming-Min Zhao, Jie Tang

Non-fixed flexible antenna architectures, such as fluid antenna system (FAS), movable antenna (MA), and pinching antenna, have garnered significant interest in recent years. Among them, rotatable antenna (RA) is a promising antenna architecture that exploits additional spatial degrees of freedom (DoFs) to enhance the communication performance. To fully obtain the performance gain provided by RAs, accurate channel state information (CSI) is essential for adjusting the orientation/boresight of each antenna. In this letter, we propose an efficient channel estimation scheme for RA communication systems, where the base station (BS) can sequentially and adaptively adjust the orientations of RAs to enrich the environmental observations from diverse angular perspectives, thereby enhancing the channel estimation accuracy. The proposed scheme includes two main procedures that are conducted alternately during each channel training period. Specifically, the first procedure is to estimate the CSI with given RAs' orientations, involving the angle-of-arrivals (AoAs) information and path gains. Then, based on the estimated CSI, the second procedure adjusts the RAs' orientations to maximize the effective channel gain. Simulation results demonstrate that the proposed channel estimation method outperforms other benchmark schemes.

Relevant ArXiv eess Papers - 2025-06-30

Joint RIS-UE Association and Beamforming Design in RIS-Assisted Cell-Free MIMO Network

Authors: Hongqin Ke, Jindan Xu, Wei Xu, Chau Yuen, Zhaohua Lu

Reconfigurable intelligent surface (RIS)-assisted cell-free (CF) multiple-input multiple-output (MIMO) networks can significantly enhance system performance. However, the extensive deployment of RIS elements imposes considerable channel acquisition overhead, with the high density of nodes and antennas in RIS-assisted CF networks amplifying this challenge. To tackle this issue, in this paper, we explore integrating RIS-user equipment (UE) association into downlink RIS-assisted CF transmitter design, which greatly reduces the channel acquisition costs. The key point is that once UEs are associated with specific RISs, there is no need to frequently acquire channels from non-associated RISs. Then, we formulate the problem of joint RIS-UE association and beamforming at APs and RISs to maximize the weighted sum rate (WSR). In particular, we propose a two-stage framework to solve it. In the first stage, we apply a many-to-many matching algorithm to establish the RIS-UE association. In the second stage, we introduce a sequential optimization-based method that decomposes the joint optimization of RIS phase shifts and AP beamforming into two distinct subproblems. To optimize the RIS phase shifts, we employ the majorization-minimization (MM) algorithm to obtain a semi-closed-form solution. For AP beamforming, we develop a joint block diagonalization algorithm, which yields a closed-form solution. Simulation results demonstrate the effectiveness of the proposed algorithm and show that, while RIS-UE association significantly reduces overhead, it incurs a minor performance loss that remains within an acceptable range. Additionally, we investigate the impact of RIS deployment and conclude that RISs exhibit enhanced performance when positioned between APs and UEs.

Complex Phase Analysis of Power Grid Dynamics

Authors: Jakob Niehues, Anna Büttner, Anne Riegler, Frank Hellmann

With an increasing share of renewable energy sources, accurate and efficient modeling of grid-forming inverters is becoming crucial for system stability. Linear methods are a powerful tool for understanding dynamics close to an operating point, but usually depend on the reference trajectory. Thus, small deviations can render linear models invalid over time, posing a significant challenge in practice, and complicating theoretical analysis. As a solution, we show that the complex phase offers a robust formulation independent of reference phases and frequencies, thus preserving invariance properties under linearization. This enables robust system identification during realistic conditions and opens the road to powerful stability analysis of inverter-based grids.

A Matlab-based Toolbox for Automatic EMT Modeling and Small-Signal Stability Analysis of Modern Power Systems

Authors: Josep Arevalo-Soler, Dionysios Moutevelis, Elia Mateu-Barriendos, Onur Alican, Carlos Collados-Rodriguez, Marc Cheah-Mañe, Eduardo Prieto-Araujo, Oriol Gomis-Bellmunt

The intensive integration of power converters is changing the way that power systems operate, leading to the emergence of new types of dynamic phenomena and instabilities. At the same time, converters act as an interface between traditional AC grids and their more recent DC counterparts, giving rise to hybrid AC/DC networks. These conditions increase the need for stability analysis tools that can account for the newly introduced dynamic phenomena and can also be applied to the stability study of hybrid networks. This paper presents a Matlab-based toolbox for small-signal analysis of hybrid AC/DC power systems considering electromagnetic-transient (EMT) models. The toolbox allows automated modeling of the system from the input data and offers options for modal, impedance, and passivity analyses. In the paper, the structure and internal processes of the toolbox are discussed, together with all its features, both main and complementary. Its capabilities for stability analysis are demonstrated via comprehensive case studies of converter-based systems of various sizes and topologies.

Economic Model Predictive Control with a Non-Fixed Reference Trajectory for Optimal Microgrid Dispatch

Authors: Avik Ghosh, Adil Khurram, Jan Kleissl, Sonia Martinez

Economic Model Predictive Control (EMPC), instead of stabilizing a reference trajectory/state in the objective function like a Tracking MPC, optimizes the economic performance over the prediction horizon, making it attractive for economical microgrid (MG) dispatch. However, the demand charge component in the monthly electricity cost is difficult to encapsulate in additive stage costs, and can make solutions violate the principle of optimality if naively introduced in the objective function. Moreover, previous EMPC-based works mostly rely on a priori knowledge of an optimal economic steady state or optimal periodic trajectory for performance guarantees, which are, respectively, not useful or possibly nonexistent for real-time economical MG dispatch, where load/generation forecasts are known only 24-48 h in advance. This paper first proposes an EMPC formulation for a generic deterministic discrete non-linear time-varying system with hard state and input constraints, without any a priori requirements of an optimal economic steady state or optimal periodic trajectory. It is proved that under mild assumptions on terminal cost and region, the asymptotic average economic cost of the proposed method is no worse than the asymptotic average economic cost of any other non-fixed arbitrary reference trajectory which is known only until the current time-step. The EMPC framework is then leveraged for optimal MG dispatch by showing that the problem can be reformulated to satisfy the assumptions required for the asymptotic performance guarantee. Realistic simulations at the Port of San Diego MG demonstrated that, in a majority of the cases, the proposed method can also reduce monthly electricity costs in closed loop with respect to reference trajectories generated by directly optimizing the electricity cost function over the prediction horizon or by tracking an ideal grid import curve.

Day-Ahead Bidding Strategies for Wind Farm Operators under a One-Price Balancing Scheme

Authors: Max Bruninx, Timothy Verstraeten, Jalal Kazempour, Jan Helsen

We study day-ahead bidding strategies for wind farm operators under a one-price balancing scheme, prevalent in European electricity markets. In this setting, the profit-maximising strategy becomes an all-or-nothing strategy, aiming to take advantage of open positions in the balancing market. However, balancing prices are difficult, if not impossible, to forecast in the day-ahead stage and large open positions can affect the balancing price by changing the direction of the system imbalance. This paper addresses day-ahead bidding as a decision-making problem under uncertainty, with the objective of maximising the expected profit while reducing the imbalance risk related to the strategy. To this end, we develop a stochastic optimisation problem with explicit constraints on the positions in the balancing market, providing risk certificates, and derive an analytical solution to this problem. Moreover, we show how the price-impact of the trading strategy on the balancing market can be included in the ex-post evaluation. Using real data from the Belgian electricity market and an offshore wind farm in the North Sea, we demonstrate that the all-or-nothing strategy negatively impacts the balancing price, resulting in long-term losses for the wind farm. Our risk-constrained strategy, however, can still significantly enhance operational profit compared to traditional point-forecast bidding.

Relevant ArXiv eess Papers - 2025-06-27

DPLib: A Standard Benchmark Library for Distributed Power System Analysis and Optimization

Authors: Milad Hasanzadeh, Amin Kargarian

DPLib is an open-source MATLAB-based benchmark library created to support research and development in distributed and decentralized power system analysis and optimization. Distributed and decentralized methods offer scalability, privacy preservation, and resilience to single points of failure, making them increasingly important for modern power systems. However, unlike centralized tools such as MATPOWER, no general-purpose, reproducible data library package currently exists for distributed power system studies. DPLib fills this gap by providing a standard power system library featuring over 20 multi-region benchmark test cases of varying sizes, along with a graph-based partitioning toolkit that decomposes any MATPOWER test system into multiple electrically coherent regions. The partitioning toolkit, an easy-to-use MATLAB code, generates standardized .mat and .m files, along with region visualizations for intuitive understanding. We also provide modular, easy-to-use distributed optimal power flow (OPF) solvers: an alternating direction method of multipliers (ADMM)-based DC-OPF solver implemented in YALMIP, and an ADMM-based AC-OPF solver leveraging IPOPT. These solvers validate the generated test systems for distributed optimization applications. Numerical results validate the generated test cases, establishing DPLib as a foundation for reproducible distributed power system research.

Optimal Parameter Design for Power Electronic Converters Using a Probabilistic Learning-Based Stochastic Surrogate Model

Authors: Akash Mahajan, Shivam Chaturvedi, Srijita Das, Wencong Su, Van-Hai Bui

Selecting optimal design parameters for power electronic converters involves balancing efficiency and thermal constraints to ensure high performance without compromising safety. This paper introduces a probabilistic-learning-based stochastic surrogate modeling framework to address this challenge and significantly reduce the time required during the design phase. The approach begins with a neural network classifier that evaluates the feasibility of parameter configurations, effectively filtering out unsafe and/or impractical inputs. Subsequently, a probabilistic prediction model estimates the converter's efficiency and temperature while quantifying prediction uncertainty, providing both performance insights and reliability metrics. Finally, a heuristic optimization-based model is employed to optimize a multi-objective function that maximizes efficiency while adhering to thermal constraints. The optimization process incorporates penalty terms to discourage solutions that violate practical thresholds, ensuring actionable and realistic recommendations. An advanced heuristic optimization method is used to find the optimal solution and is compared with several well-known search algorithms, including Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Simulated Annealing (SA), Tabu-Search (TS), and Stochastic Hill Climbing (SHC). The results demonstrate significant improvements in predictive accuracy and optimization outcomes, offering a robust solution for advancing power electronics design.

Estimating Technical Loss without Power Flows: A Practical, Data-Driven Approach for Loss Estimation in Distribution Grids

Authors: Mohini Bariya, Genevieve Flaspohler

Electric grids in low- and middle-income countries (LMICs) across the world face an acute challenge. To support global decarbonisation efforts and raise millions from energy poverty, these grids must shoulder substantial load growth while integrating distributed renewable generation. However, decades of rapid and poorly funded infrastructure expansions have led to national grids in many LMICs that are strained and weak, composed of aging, faulty, and undersized infrastructure. A cause and symptom of this weakness is excessive technical loss within the grid infrastructure during energy delivery, particularly at the distribution level; network losses are regularly estimated to be well over 20 percent, compared to a baseline of 5 percent in higher-income nations. Addressing technical loss through targeted interventions is essential for bolstering grids' physical and economic strength. Unfortunately, current approaches for estimating and localizing technical loss require expensive, extensive power flow sensing, which is essentially absent in LMIC distribution systems. We present a novel approach to technical loss estimation without power flows, which leverages more readily available voltage magnitude measurements at sparse locations in the grid. This estimator puts loss estimation and localization within reach for LMIC grids globally, and provides a critical tool for the effective design, implementation, and evaluation of loss-reduction interventions.

Joint Scheduling of DER under Demand Charges: Structure and Approximation

Authors: Ruixiao Yang, Gulai Shen, Ahmed S. Alahmed, Chuchu Fan

We study the joint scheduling of behind-the-meter distributed energy resources (DERs), including flexible loads, renewable generation, and battery energy storage systems, under net energy metering frameworks with demand charges. The problem is formulated as a stochastic dynamic program aimed at maximizing expected operational surplus while accounting for renewable generation uncertainty. We analytically characterize the structure of the optimal control policy and show that it admits a threshold-based form. However, due to the strong temporal coupling of the storage and demand charge constraints, the number of conditional branches in the policy scales combinatorially with the scheduling horizon, as it requires a look-ahead over future states. To overcome the high computational complexity in the general formulation, an efficient approximation algorithm is proposed, which searches for the peak demand under a mildly relaxed problem. We show that the algorithm scales linearly with the scheduling horizon. Extensive simulations using two open-source datasets validate the proposed algorithm and compare its performance against different DER control strategies, including a reinforcement learning-based one. Under varying storage and tariff parameters, the results show that the proposed algorithm outperforms various benchmarks in achieving a relatively small solution gap compared to the theoretical upper bound.

A Review of Safe Reinforcement Learning Methods for Modern Power Systems

Authors: Tong Su, Tong Wu, Junbo Zhao, Anna Scaglione, Le Xie

Given the availability of more comprehensive measurement data in modern power systems, reinforcement learning (RL) has gained significant interest in operation and control. Conventional RL relies on trial-and-error interactions with the environment and reward feedback, which often leads to exploring unsafe operating regions and executing unsafe actions, especially when deployed in real-world power systems. To address these challenges, safe RL has been proposed to optimize operational objectives while ensuring safety constraints are met, keeping actions and states within safe regions throughout both training and deployment. Rather than relying solely on manually designed penalty terms for unsafe actions, as is common in conventional RL, safe RL methods reviewed here primarily leverage advanced and proactive mechanisms. These include techniques such as Lagrangian relaxation, safety layers, and theoretical guarantees like Lyapunov functions to rigorously enforce safety boundaries. This paper provides a comprehensive review of safe RL methods and their applications across various power system operations and control domains, including security control, real-time operation, operational planning, and emerging areas. It summarizes existing safe RL techniques, evaluates their performance, analyzes suitable deployment scenarios, and examines algorithm benchmarks and application environments. The paper also highlights real-world implementation cases and identifies critical challenges such as scalability in large-scale systems and robustness under uncertainty, providing potential solutions and outlining future directions to advance the reliable integration and deployment of safe RL in modern power systems.

Exploring the Effects of Load Altering Attacks on Load Frequency Control through Python and RTDS

Authors: Michał Forystek, Andrew D. Syrmakesis, Alkistis Kontou, Panos Kotsampopoulos, Nikos D. Hatziargyriou, Charalambos Konstantinou

The modern power grid increasingly depends on advanced information and communication technology (ICT) systems to enhance performance and reliability through real-time monitoring, intelligent control, and bidirectional communication. However, ICT integration also exposes the grid to cyber-threats. Load altering attacks (LAAs), which use botnets of high-wattage devices to manipulate load profiles, are a notable threat to grid stability. While previous research has examined LAAs, their specific impact on load frequency control (LFC), critical for maintaining nominal frequency during load fluctuations, still needs to be explored. Even minor frequency deviations can jeopardize grid operations. This study bridges the gap by analyzing LAA effects on LFC through simulations of static and dynamic scenarios using Python and RTDS. The results highlight LAA impacts on frequency stability and present an eigenvalue-based stability assessment for dynamic LAAs (DLAAs), identifying key parameters influencing grid resilience.

Relevant ArXiv eess Papers - 2025-06-26

A Multi-Modal Spatial Risk Framework for EV Charging Infrastructure Using Remote Sensing

Authors: Oktay Karakuş, Padraig Corcoran

Electric vehicle (EV) charging infrastructure is increasingly critical to sustainable transport systems, yet its resilience under environmental and infrastructural stress remains underexplored. In this paper, we introduce RSERI-EV, a spatially explicit and multi-modal risk assessment framework that combines remote sensing data, open infrastructure datasets, and spatial graph analytics to evaluate the vulnerability of EV charging stations. RSERI-EV integrates diverse data layers, including flood risk maps, land surface temperature (LST) extremes, vegetation indices (NDVI), land use/land cover (LULC), proximity to electrical substations, and road accessibility, to generate a composite Resilience Score. We apply this framework to the EV charger dataset for the country of Wales to demonstrate its feasibility. A spatial k-nearest neighbours (kNN) graph is constructed over the charging network to enable neighbourhood-based comparisons and graph-aware diagnostics. Our prototype highlights the value of multi-source data fusion and interpretable spatial reasoning in supporting climate-resilient, infrastructure-aware EV deployment.
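
As a rough illustration of a spatial k-nearest-neighbour graph used for neighbourhood-based comparison, the sketch below links each station to its k closest stations and reports how far each station's score sits from the mean of its neighbours. The coordinates, scores, and function names are hypothetical, distances are plain Euclidean on assumed projected coordinates (a real dataset would likely need geodesic distances), and the paper's actual Resilience Score is not reproduced here.

```python
import math


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest other points."""
    graph = {}
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


def neighbourhood_gap(scores, graph):
    """Each node's score minus the mean score of its graph neighbours."""
    return {
        i: scores[i] - sum(scores[j] for j in nbrs) / len(nbrs)
        for i, nbrs in graph.items()
    }


# Hypothetical station coordinates (km) and composite scores in [0, 1].
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
scores = [0.8, 0.7, 0.9, 0.3, 0.5]
graph = knn_graph(stations, k=2)
gaps = neighbourhood_gap(scores, graph)
```

A strongly negative gap flags a station that scores worse than its spatial neighbours, which is one simple form a graph-aware diagnostic could take.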

Recursive-ARX for Grid-Edge Fault Detection

Authors: Soufiane El Yaagoubi, Keith Moffat, Eduardo Prieto Araujo, Florian Dörfler

Future electrical grids will require new ways to identify faults as inverters are not capable of supplying large fault currents to support existing fault detection methods and because distributed resources may feed faults from the edge of the grid. This paper proposes the use of real-time system identification for online power-system fault detection. Specifically, we implement Recursive ARX (rARX) system identification on a grid-connected inverter. Experiments demonstrate that the proposed rARX method is able to both detect large faults quickly, and distinguish between high-impedance faults and large load increases. These results indicate that rARX grid-edge fault detection is a promising research direction for improving the reliability and safety of modern electric grids.
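
The idea of recursive ARX identification can be sketched with recursive least squares on a first-order model y[t] = a*y[t-1] + b*u[t-1]. This is a minimal illustration under assumed values (model order, forgetting factor, initial covariance, and synthetic noise-free data are all hypothetical), not the authors' inverter implementation. A sudden change in the plant parameter stands in for an event that the estimates would track.

```python
import random


def rls_arx1(u, y, lam=0.98, delta=1000.0):
    """Recursive least squares for y[t] = a*y[t-1] + b*u[t-1].

    lam is the forgetting factor, delta scales the initial covariance.
    Returns the (a_hat, b_hat) estimate after each sample t = 1, 2, ...
    """
    a_hat, b_hat = 0.0, 0.0
    p11, p12, p22 = delta, 0.0, delta  # symmetric 2x2 covariance matrix P
    history = []
    for t in range(1, len(y)):
        phi1, phi2 = y[t - 1], u[t - 1]  # regressor vector
        g1 = p11 * phi1 + p12 * phi2     # g = P @ phi
        g2 = p12 * phi1 + p22 * phi2
        denom = lam + phi1 * g1 + phi2 * g2
        k1, k2 = g1 / denom, g2 / denom  # gain vector
        err = y[t] - (a_hat * phi1 + b_hat * phi2)  # one-step prediction error
        a_hat += k1 * err
        b_hat += k2 * err
        # Covariance update: P = (P - k g^T) / lam
        p11 = (p11 - k1 * g1) / lam
        p12 = (p12 - k1 * g2) / lam
        p22 = (p22 - k2 * g2) / lam
        history.append((a_hat, b_hat))
    return history


# Synthetic data: parameter a jumps from 0.8 to 0.5 halfway through.
rng = random.Random(0)
n = 600
u = [rng.uniform(-1.0, 1.0) for _ in range(n)]
a_true = [0.8] * 300 + [0.5] * 300
y = [0.0]
for t in range(1, n):
    y.append(a_true[t] * y[t - 1] + 0.5 * u[t - 1])

estimates = rls_arx1(u, y)
a_before, b_before = estimates[298]  # last estimate before the jump
a_after, b_after = estimates[-1]     # estimate at the end of the record
```

The forgetting factor lets old samples fade so the estimate follows the new dynamics. A detector could then monitor the prediction error or the parameter trajectory, though the decision logic used in the paper is not specified in this abstract.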

Near-Field SWIPT Using XL-MIMO: Power Allocation and Subarray Activation

Authors: Muhammad Zeeshan Mumtaz, Mohammadali Mohammadi, Hien Quoc Ngo, Michail Matthaiou

This paper investigates the simultaneous wireless information and power transfer (SWIPT) capability of a modular extremely large multiple-input multiple-output (XL-MIMO) system, in the context of power consumption (PC) efficiency. The network users are divided into two functional categories: information decoding (ID) users and energy harvesting (EH) users. Non-stationary near-field channels are considered whilst the users are located in spatially distinct visibility regions (VRs). We formulate a two-tier joint optimization problem to minimize the PC, taking into account the power allocation (PA) for ID and EH users, along with the activation of constituent XL-MIMO subarrays. This complicated mixed-integer problem is transformed into more tractable formulations and efficient algorithms are proposed for solving them. The numerical results demonstrate that the overall PC of the XL-MIMO system for the proposed method is reduced by more than 60% in comparison to the benchmark scheme of equal PA with full subarray activation (SA) and 30% against the case of optimized PA with full SA, while satisfying the quality-of-service (QoS) constraints on both the downlink rate of the ID users and harvested energy at the EH users.

Near-Field Energy Harvesting Using XL-MIMO Over Non-Stationary Channels

Authors: Muhammad Zeeshan Mumtaz, Mohammadali Mohammadi, Hien Quoc Ngo, Michail Matthaiou

This paper explores the maximization of the harvested power efficiency (HPE) in a modular extremely large multiple-input multiple-output (XL-MIMO) system, which supports energy harvesting (EH) for near-field users. These users are located in spatially distinct visibility regions (VRs) with non-stationary channel characteristics. We propose to determine which sub-arrays are switched on or off, as well as the power control coefficients at the sub-arrays, to maximize the HPE. The design can be processed via a multi-tier joint optimization framework based on fractional programming. The numerical results showcase that the HPE performance of the proposed algorithm is nearly optimal, comparable to that of exhaustive search. In fact, it achieves up to a 120% gain over the benchmark scheme which uses the entire XL-MIMO array with equal power allocation (PA) across sub-arrays, while significantly reducing the computational time.

A Data-Driven Approach for Topology Correction in Low Voltage Networks with DERs

Authors: Dong Liu, Sander Timmerman, Yu Xiang, Peter Palensky, Pedro P. Vergara

This paper introduces a data-driven topology identification and correction approach for low-voltage distribution networks (LVDNs) combined with a time-based smart meter data selection strategy, aiming to correct outdated recordings and identify missing recordings. The proposed approach relies solely on voltage magnitude measurements, alleviating privacy concerns and measurement burdens. It enables distribution system operators to identify switch states through supervised learning algorithms, as well as determine user-feeder connections and phase labels of customers by a modified Hierarchical Clustering algorithm. To address the similarity among smart meter (SM) data caused by distributed photovoltaic (PV) systems, a time-based SM data selection strategy is combined with the proposed correlation analysis. The feasibility and robustness of the proposed approach are validated using modified real-world LVDNs and multiple incomplete SM datasets collected from customers in the Netherlands. The results demonstrate that the time-based SM data selection strategy effectively mitigates the impact of PV systems on phase identification, and the corrected topology not only improves network observability but also supports network operators in load balancing and PV consumption.

Analyzing the Impact of Strategic Bidding on the Reserve Capacity via a Bi-Level Model

Authors: Yun Xu, Yunxiao Bai, Yunyong Zhang, Peng Wang, Xuelin Wang, Jiqun Guo, Kaijun Xie, Rusheng Zhao

The growing integration of renewable energy sources necessitates adequate reserve capacity to maintain power balance. However, in market clearing, power companies with flexible resources may submit strategic bids to maximize profits, potentially compromising system reserves. This paper examines the effects of such strategic behavior by modeling the market as a bi-level problem. The upper level represents a strategic company aiming to maximize profit, while the lower level simulates the system operator clearing the market based on submitted offers. To enable duality-based solution methods, we approximate unit commitments with a continuous reserve capacity calculation. Case studies indicate that, in an imperfectly competitive market, more units are incentivized to operate, enhancing system reserves. However, some units go online mainly for profit, ultimately raising electricity costs for consumers. These findings highlight the importance of market design in managing the trade-off between reserve adequacy and economic efficiency in the presence of strategic bidding behavior.

Reinforcement Learning Increases Wind Farm Power Production by Enabling Closed-Loop Collaborative Control

Authors: Andrew Mole, Max Weissenbacher, Georgios Rigas, Sylvain Laizet

Traditional wind farm control operates each turbine independently to maximize individual power output. However, coordinated wake steering across the entire farm can substantially increase the combined wind farm energy production. Although dynamic closed-loop control has proven effective in flow control applications, wind farm optimization has relied primarily on static, low-fidelity simulators that ignore critical turbulent flow dynamics. In this work, we present the first reinforcement learning (RL) controller integrated directly with high-fidelity large-eddy simulation (LES), enabling real-time response to atmospheric turbulence through collaborative, dynamic control strategies. Our RL controller achieves a 4.30% increase in wind farm power output compared to baseline operation, nearly doubling the 2.19% gain from static optimal yaw control obtained through Bayesian optimization. These results establish dynamic flow-responsive control as a transformative approach to wind farm optimization, with direct implications for accelerating renewable energy deployment to net-zero targets.

Decentralized Parametric Stability Certificates for Grid-Forming Converter Control

Authors: Verena Häberle, Xiuqiang He, Linbin Huang, Florian Dörfler, Steven Low

We propose a decentralized framework for guaranteeing the small-signal stability of future power systems with grid-forming converters. Our approach leverages dynamic loop-shifting techniques to compensate for the lack of passivity in the network dynamics and establishes decentralized parametric stability certificates, depending on the local device-level controls and incorporating the effects of the network dynamics. By following practical tuning rules, we are able to ensure plug-and-play operation without centralized coordination. Unlike prior works, our approach accommodates coupled frequency and voltage dynamics, incorporates network dynamics, and does not rely on specific network configurations or operating points, offering a general and scalable solution for the integration of power-electronics-based devices into future power systems. We validate our theoretical stability results through numerical case studies in a high-fidelity simulation model.

Physics-Informed Neural Networks: a Plug and Play Integration into Power System Dynamic Simulations

Authors: Ignasi Ventura Nadal, Jochen Stiasny, Spyros Chatzivasileiadis

Time-domain simulations are crucial for ensuring power system stability and avoiding critical scenarios that could lead to blackouts. The next-generation power systems require a significant increase in the computational cost and complexity of these simulations due to additional degrees of uncertainty, non-linearity and states. Physics-Informed Neural Networks (PINN) have been shown to accelerate single-component simulations by several orders of magnitude. However, their application to current time-domain simulation solvers has been particularly challenging since the system's dynamics depend on multiple components. Using a new training formulation, this paper introduces the first natural step to integrate PINNs into multi-component time-domain simulations. We propose PINNs as an alternative to other classical numerical methods for individual components. Once trained, these neural networks approximate component dynamics more accurately for longer time steps. Formulated as an implicit and consistent method with the transient simulation workflow, PINNs speed up simulation time by significantly increasing the time steps used. For explanation clarity, we demonstrate the training, integration, and simulation framework for several combinations of PINNs and numerical solution methods using the IEEE 9-bus system, although the method applies equally well to any power system size.

Power-Capping Metric Evaluation for Improving Energy Efficiency in HPC Applications

Authors: Maria Patrou, Thomas Wang, Wael Elwasif, Markus Eisenbach, Ross Miller, William Godoy, Oscar Hernandez

With high-performance computing systems now running at exascale, optimizing power-scaling management and resource utilization has become more critical than ever. This paper explores runtime power-capping optimizations that leverage integrated CPU-GPU power management on architectures like the NVIDIA GH200 superchip. We evaluate energy-performance metrics that account for simultaneous CPU and GPU power-capping effects by using two complementary approaches: speedup-energy-delay and a Euclidean distance-based multi-objective optimization method. By targeting a mostly compute-bound exascale science application, the Locally Self-Consistent Multiple Scattering (LSMS), we explore challenging scenarios to identify potential opportunities for energy savings in exascale applications, and we recognize that even modest reductions in energy consumption can have significant overall impacts. Our results highlight how GPU task-specific dynamic power-cap adjustments combined with integrated CPU-GPU power steering can improve the energy utilization of certain GPU tasks, thereby laying the groundwork for future adaptive optimization strategies.
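
To make the two kinds of metric concrete, the sketch below ranks a few power-cap settings by a plain energy-delay product and by Euclidean distance to the ideal point after min-max normalisation of energy and runtime. The cap labels and measurements are hypothetical, and the plain energy-delay product stands in for the speedup-energy-delay metric, whose exact definition is not given in this abstract.

```python
import math


def energy_delay_product(energy_j, time_s):
    """Classic energy-delay product: lower is better."""
    return energy_j * time_s


def distance_to_ideal(runs):
    """Euclidean distance of each run to the ideal point (0, 0).

    runs maps a label to (energy_j, time_s). Each objective is min-max
    normalised to [0, 1], so the ideal point has the lowest energy and
    the lowest runtime seen across all runs.
    """
    energies = [e for e, _ in runs.values()]
    times = [t for _, t in runs.values()]
    e_min, e_span = min(energies), (max(energies) - min(energies)) or 1.0
    t_min, t_span = min(times), (max(times) - min(times)) or 1.0
    return {
        label: math.hypot((e - e_min) / e_span, (t - t_min) / t_span)
        for label, (e, t) in runs.items()
    }


# Hypothetical measurements: label -> (energy in J, runtime in s).
runs = {
    "no_cap": (1000.0, 100.0),
    "cap_high": (900.0, 105.0),
    "cap_mid": (800.0, 120.0),
    "cap_low": (780.0, 160.0),
}
edp = {label: energy_delay_product(e, t) for label, (e, t) in runs.items()}
dist = distance_to_ideal(runs)
best_by_edp = min(edp, key=edp.get)
best_by_distance = min(dist, key=dist.get)
```

On this made-up data the two metrics select different caps, which is why evaluating more than one energy-performance metric can be informative.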