📄 Recent progress in text-conditioned human motion generation has been largely driven by diffusion models trained on large-scale human motion data. Building on this progress, recent methods attempt to transfer such models to character animation and real robot control by applying a Whole-Body Controller (WBC) that converts diffusion-generated motions into executable trajectories. While WBC trajectories are physically compliant, they may exhibit substantial deviations from the original motion. To a...
📄 Machine learning approaches to spatiotemporal physical systems have primarily focused on next-frame prediction, with the goal of learning an accurate emulator for the system's evolution in time. However, these emulators are computationally expensive to train and are subject to performance pitfalls, such as compounding errors during autoregressive rollout. In this work, we take a different perspective and look at scientific tasks further downstream of predicting the next frame, such as estimation...
📄 In 2023, the author presented the first computable minimal Euclidean function for a non-trivial number field. Along with a formula for $φ_{\mathbb{Z}[i]}$, the minimal Euclidean function on the Gaussian integers, the same paper introduced a geometric description of $φ_{\mathbb{Z}[i]}^{-1}([0,n])$. This paper uses that construction to prove formulas for the size of the function's pre-images, or $|φ_{\mathbb{Z}[i]}^{-1}([0,n])|$....
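For readers unfamiliar with minimal Euclidean functions, Motzkin's classical construction defines them via an increasing chain of sets. The sketch below uses one common normalization (in which units receive the value 0); it is background, not the paper's own formulas.

```latex
% Motzkin's construction on a domain R (one common normalization):
% E_0 contains only 0; each stage adds the elements all of whose
% residue classes already meet the previous stage.
\[
  E_0 = \{0\}, \qquad
  E_{n+1} = E_n \cup \bigl\{\, x \in R \setminus \{0\} :
    \text{every residue class mod } x \text{ meets } E_n \,\bigr\}.
\]
% The minimal Euclidean function assigns to each nonzero x the first
% stage (shifted so that units get value 0) at which it appears:
\[
  \varphi_R(x) = \min\{\, n \ge 0 : x \in E_{n+1} \,\},
  \qquad\text{so}\qquad
  \varphi_R^{-1}([0,n]) = E_{n+1} \setminus \{0\}.
\]
```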
📄 Vision-to-code tasks require models to reconstruct structured visual inputs, such as charts, tables, and SVGs, into executable or structured representations with high visual fidelity. While recent Large Vision Language Models (LVLMs) achieve strong results via supervised fine-tuning, reinforcement learning remains challenging due to misaligned reward signals. Existing rewards either rely on textual rules or coarse visual embedding similarity, both of which fail to capture fine-grained visual dis...
📄 Antimony sulfide is an emerging phase change material for optical and electrical memory and computation elements. It has additionally been reported as a ferroelectric, with recent evidence from hysteresis in piezoresponse force microscopy. Here, we complete a rigorous set of piezoresponse force microscopy experiments on a congruently crystallized Sb$_2$S$_3$ glass-ceramic, where piezoelectric coupling should be forbidden in glassy Sb$_2$S$_3$. We replicate previous results and reveal that the behavior is ab...
📄 Status signaling drives human behavior and the allocation of scarce resources such as mating opportunities, yet the generative mechanisms governing how specific goods, signals, or behaviors acquire prestige remain a puzzle. Classical frameworks, such as Costly Signaling Theory, treat preferences as fixed and struggle to explain how semiotic meaning changes based on context or drifts dynamically over time, occasionally reaching tipping points. In this work, we propose a computational theory of st...
📄 Accurate band offsets are essential for predictive continuum modeling of nanostructures such as quantum wells and quantum dots formed in strained Si/Si$_{1-x}$Ge$_x$ and Ge/Si$_{1-x}$Ge$_x$ heterostructures. Experimental offset data for these systems remain sparse away from endpoint compositions, making composition-dependent design difficult. We use atomistic first-principles density functional theory to compute valence- and conduction-band offsets across the full range $0 \le x \le 1$. Random alloying is treated w...
📄 Evolutions in the world, such as water pouring or ice melting, happen regardless of being observed. Video world models generate "worlds" via 2D frame observations. Can these generated "worlds" evolve regardless of observation? To probe this question, we design a benchmark to evaluate whether video world models can decouple state evolution from observation. Our benchmark, STEVO-Bench, applies observation control to evolving processes via instructions of occluder insertion, turning off the light, ...
📄 In this work, we introduce and study the $p$-$α$-closest-center problem ($pα$CCP), which generalizes the $p$-second-center problem, a recently emerged variant of the classical $p$-center problem. In the $pα$CCP, we are given sets of customers and potential facility locations, distances between each customer and potential facility location as well as two integers $p$ and $α$. The goal is to open facilities at $p$ of the potential facility locations, such that the maximum $α$-distance between each...
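The $α$-distance objective can be stated concretely: a customer's $α$-distance is its distance to the $α$-th closest open facility, and the objective is the maximum of these over all customers. The sketch below is an illustrative brute-force evaluator on tiny instances (all names are hypothetical; exhaustive enumeration is of course not the paper's algorithm).

```python
import itertools

def alpha_distance(customer, open_facilities, dist, alpha):
    """Distance from a customer to its alpha-th closest open facility."""
    d = sorted(dist[customer][f] for f in open_facilities)
    return d[alpha - 1]

def pacc_objective(open_facilities, customers, dist, alpha):
    """Max over customers of the alpha-th closest-center distance."""
    return max(alpha_distance(c, open_facilities, dist, alpha)
               for c in customers)

def brute_force_pacc(customers, facilities, dist, p, alpha):
    """Exhaustively pick the best set of p facilities (tiny instances only)."""
    best = None
    for S in itertools.combinations(facilities, p):
        val = pacc_objective(S, customers, dist, alpha)
        if best is None or val < best[0]:
            best = (val, S)
    return best
```

With $α = 1$ this reduces to the classical $p$-center objective, which makes the generalization easy to sanity-check on small data.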
📄 Large language models for code have achieved strong performance across diverse software analytics tasks, yet their real-world adoption remains limited by high computational demands, slow inference speeds, significant energy consumption, and environmental impact. Knowledge distillation (KD) offers a practical solution by transferring knowledge from a large model to a smaller and more efficient model. Despite its effectiveness, recent studies show that models distilled from a single source often e...
📄 Instruction Tuning (IT) has been proven to be an effective approach to unlock the powerful capabilities of large language models (LLMs). Recent studies indicate that excessive IT data can degrade LLM performance, while carefully selecting a small subset of high-quality IT data can significantly enhance their capabilities. Therefore, identifying the most effective subset of the IT dataset for developing either specific or general abilities in LLMs has become a critical challenge. ...
📄 Audio-only walking navigation can leave users disoriented, relying on vague cardinal directions and lacking real-time environmental context, leading to frequent errors. To address this, we present a novel system that integrates a Vision Language Model (VLM) with a spatial audio cue. Our system extracts environmental landmarks to anchor navigation instructions and, crucially, provides a directional spatial audio signal when the user faces the wrong direction, indicating the precise turn direction...
📄 The Semi-Analytical Finite Element (SAFE) method is widely used for computing guided wave dispersion curves in waveguides of arbitrary cross-section. Accurate mode tracking across consecutive wavenumber steps remains challenging, particularly in mode veering regions where eigenvalues become nearly degenerate and eigenvectors vary rapidly. This work establishes a rigorous theoretical framework for mode tracking in single-parameter Hermitian eigenvalue problems arising from SAFE formulations. We d...
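One widely used baseline for the pairing step, shown below for contrast, is the modal assurance criterion (MAC), a standard heuristic from modal analysis rather than the rigorous framework this work develops: eigenvectors at consecutive wavenumber steps are matched greedily by correlation.

```python
import numpy as np

def mac(v1, v2):
    """Modal assurance criterion between two (possibly complex) eigenvectors."""
    num = abs(np.vdot(v1, v2)) ** 2
    return num / (np.vdot(v1, v1).real * np.vdot(v2, v2).real)

def pair_modes(V_prev, V_curr):
    """Greedy pairing: for each previous mode, pick the unused current
    mode with the highest MAC. Columns of V_prev/V_curr are eigenvectors."""
    pairing, used = [], set()
    for i in range(V_prev.shape[1]):
        scores = [(mac(V_prev[:, i], V_curr[:, j]), j)
                  for j in range(V_curr.shape[1]) if j not in used]
        best = max(scores)
        pairing.append(best[1])
        used.add(best[1])
    return pairing
```

Mode veering, where eigenvectors rotate rapidly between nearby steps, is precisely where such greedy MAC tracking misassigns branches, which motivates a rigorous treatment.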
📄 Cops and Robbers is a pursuit-evasion game played on graphs, of which many variants have been developed and studied. We introduce a variant of this game, "Sneaky-Active Cops and Robbers", where all cops and the robber must move on their turn, and where the robber is allowed to move onto a cop's position without being captured. We show that for reflexive graphs, this game is equivalent to classical Cops and Robbers and that the cop number of a graph is invariant under $\times$-homotopy equivalence...
📄 While large language models (LLMs) have transformed AI agents into proficient executors of computational materials science, performing a hundred simulations does not make a researcher. What distinguishes research from routine execution is the progressive accumulation of knowledge -- learning which approaches fail, recognizing patterns across systems, and applying understanding to new problems. However, the prevailing paradigm in AI-driven computational science treats each execution in isolation,...
📄 We present a new constraint on the effective number of relativistic species in the early universe, $N_{\rm eff}$, by combining recent primordial helium abundance measurements from the Large Binocular Telescope $Y_p$ Project with primordial deuterium abundance data, cosmic microwave background (CMB) observations from $\it{Planck}$, the Atacama Cosmology Telescope, and the South Pole Telescope, and baryon acoustic oscillation (BAO) data from the Dark Energy Spectroscopic Instrument, yielding $N_{\...
📄 A flavor-tagged time-dependent analysis of $B^{0}\rightarrow K_{S}^{0}μ^{+}μ^{-}$ decays is performed across the full dimuon mass range excluding the $J/ψ$ and $ψ(2S)$ resonance regions. The analysis uses proton-proton collision data collected by the LHCb experiment in 2011--2018 at center-of-mass energies of 7, 8 and 13 TeV, corresponding to an integrated luminosity of 9 fb$^{-1}$. The CP violation parameters are determined to be $C=-0.13 \pm 0.32 \pm 0.04$ and $S= +0.82\pm 0.29 \pm 0.05$, where ...
📄 Understanding pairing in the strong-coupling regime of doped Mott insulators remains an open problem in the context of cuprate superconductors. We perform ultra-high resolution numerical simulations of spectral functions in the highly underdoped $t$-$J$ model and discover two coupled branches of hole pairs emerging at low energies in the largely unexplored two-particle spectrum. As spin anisotropy is tuned from the Ising limit to the $SU(2)$-symmetric Heisenberg regime, the lowest $d$-wave pair ev...
📄 Line intensity mapping (LIM) is a technique for producing 3D maps of the Universe by scanning the sky with a spectrometer sensitive to a range of wavelengths corresponding to the redshifted spectral lines of atoms or molecules, such as hydrogen or carbon, commonly found in galaxies and the diffuse media around them. While LIM experiments have successfully detected the 21 cm line of neutral hydrogen, other lines that reveal large-scale structure or astrophysical processes remain undetected. Many ...
📄 Measurements of the event-by-event correlation between elliptic flow ($v_2$) and the mean transverse momentum ($[p_{\rm T}]$) using the modified Pearson correlation coefficient $ρ(v_2^2,[p_{\rm T}])$ are reported in pp collisions at $\sqrt{s} = 13$ TeV, and in p-Pb and Pb-Pb collisions at a center-of-mass energy per nucleon pair $\sqrt{s_{\rm NN}} = 5.02$ TeV. This analysis is based on the full LHC Run 2 dataset recorded by ALICE and is performed for the first time in small collision systems wit...
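As a reminder of the underlying statistic, the sketch below computes a plain Pearson correlation between per-event $v_2^2$ and $[p_{\rm T}]$ on toy arrays. This is a simplified illustration only: the actual measurement uses the modified coefficient built from multiparticle cumulants, with corrections for nonflow and statistical fluctuations that this estimator omits.

```python
import numpy as np

def pearson_rho(v2_sq, mean_pt):
    """Plain Pearson correlation between per-event v2^2 and [pT]."""
    x = np.asarray(v2_sq, dtype=float)
    y = np.asarray(mean_pt, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))
```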
📄 We prove the existence of spontaneous symmetry breaking in suitably low-energy eigenstates of certain gapless and frustrated many-body quantum systems, namely symmetric quantum perturbations to classical models which exhibit spontaneous symmetry breaking of a finite group at some positive temperature. Additionally, the classical model need not be local in space, as long as it satisfies a quantum analogue of the Peierls condition. As an example of our technique, we establish robust ferromagnetism...
📄 While the superposition of quantum evolutions is known to produce interference effects, the interference between evolutions with regular and chaotic classical limits remains largely unexplored. Here, we use a Mach-Zehnder interferometer to investigate the superposition of two quantum evolutions, implemented via post-selection, and to compare it with the corresponding classical mixture. The quantum kicked top provides a natural platform for this study, as its classical dynamics ranges from regula...
📄 We investigate the impact of spatial curvature, $Ω_k$, and dynamical dark energy on the cosmological constraints of the neutrino mass sum, $\sum m_ν$. Using a joint analysis of the latest CMB (Planck and ACT DR6), BAO (DESI DR2) and SNe Ia (DESY5 and DES-Dovekie) datasets, we perform an exploration of the neutrino mass parameter space. To mitigate prior-driven biases near the physical boundary, we implement a symmetric extension wrapper that allows for effective negative masses. We find that the...
📄 For most chemists, Kramers' degeneracy refers to the fact that for any radical system, every potential energy surface is at least doubly degenerate (with spin up and spin down, time-reversed solutions) for all nuclear positions $\mathbf{X}$. That being said, as is well-known to the community of spin chemists, one can experimentally detect a splitting of almost every rotational energy level for a doublet system -- highlighting the fact that nuclear motion breaks the spin degeneracy of such BO ele...
📄 Prior approaches for membership privacy preservation usually update or retrain all weights in neural networks, which is costly and can lead to unnecessary utility loss or even more serious misalignment in predictions between training data and non-training data. In this work, we make three observations: i) privacy vulnerability exists in a very small fraction of weights; ii) however, most of those weights also critically impact utility performance; iii) the importance of weights stems from their ...
📄 Brain tumor classification from magnetic resonance imaging (MRI) plays a critical role in computer-assisted diagnosis systems. In recent years, deep learning models have achieved high classification accuracy. However, their sensitivity to adversarial perturbations has become an important reliability concern in medical applications. This study proposes a robust brain tumor classification framework that combines Non-Negative Matrix Factorization (NMF), lightweight c...
📄 Matrix multiplication performance has long been the major bottleneck to scaling deep learning workloads, which has stimulated the design of new accelerators that use increasingly low-precision number formats. However, improvements in matrix multiplication performance have far outstripped improvements in performance on reductions and elementwise computations, which are still being performed in higher precision. In this work, we propose MXNorm, a drop-in replacement for RMSNorm that estimates the ...
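For reference, here is the RMSNorm baseline that MXNorm drops in for, as a minimal NumPy sketch of the standard operation (the truncated abstract does not specify MXNorm itself, so only the baseline is shown). The mean-of-squares reduction runs in float32, which is exactly the kind of higher-precision reduction the abstract points to.

```python
import numpy as np

def rmsnorm(x, gamma, eps=1e-6):
    """Standard RMSNorm: divide by the root-mean-square over the last
    axis, then apply a per-channel scale gamma. The reduction is done
    in float32 even if x arrives in a lower-precision format."""
    x32 = np.asarray(x, dtype=np.float32)
    rms = np.sqrt(np.mean(x32 * x32, axis=-1, keepdims=True) + eps)
    return (x32 / rms) * gamma
```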
📄 The dynamics of Saturn's satellite system offer a rich framework for studying orbital stability and resonance interactions. Traditional methods for analysing such systems, including Fourier analysis and stability metrics, struggle with the scale and complexity of modern datasets. This study introduces a machine learning-based pipeline for clustering approximately 22,300 simulated satellite orbits, addressing these challenges with advanced feature extraction and dimensionality reduction technique...
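A schematic of such a pipeline is sketched below, with generic PCA and Lloyd's k-means standing in for the paper's advanced feature extraction and dimensionality reduction techniques (illustrative only; none of these choices are claimed to be the study's).

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature matrix X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means on the reduced orbit features."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

In practice each orbit would first be summarized into a feature vector (e.g., frequencies and amplitudes from a Fourier analysis) before reduction and clustering.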
📄 Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied in vision and, more recently, in NLP, CBMs remain largely unexplored in multimodal settings. For their explanations to be faithful, CBMs must satisfy two conditions: concepts must be properly detected, and concept representations must encode only their intended semantics, without smuggling extraneous task-relevant or inter-concept information into...
📄 Practitioners monitoring deployed probabilistic models face a fundamental trap: any fixed-sample test applied repeatedly over an unbounded stream will eventually raise a false alarm, even when the model remains perfectly stable. Existing methods typically lack formal error guarantees, conflate alarm time with changepoint location, and monitor indirect signals that do not fully characterize calibration. We present PITMonitor, an anytime-valid calibration-specific monitor that detects distribution...
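PIT here refers to the probability integral transform: the predicted CDF evaluated at the realized outcome, which is uniform on [0, 1] exactly when the model is calibrated. A minimal sketch with Gaussian predictive distributions follows (illustrative background only; PITMonitor's anytime-valid sequential machinery is not shown).

```python
import math

def normal_cdf(y, mu, sigma):
    """CDF of Normal(mu, sigma) evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def pit_values(observations, predictions):
    """PIT of each observation under its predicted Normal(mu, sigma).
    For a well-calibrated model these are draws from Uniform(0, 1)."""
    return [normal_cdf(y, mu, s)
            for y, (mu, s) in zip(observations, predictions)]
```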
📄 This paper develops a quantized Q-learning algorithm for the optimal control of controlled diffusion processes on $\mathbb{R}^d$ under both discounted and ergodic (average) cost criteria. We first establish near-optimality of finite-state MDP approximations to discrete-time discretizations of the diffusion, then introduce a quantized Q-learning scheme and prove its almost-sure convergence to near-optimal policies for the finite MDP. These policies, when interpolated to continuous time, are shown...
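The tabular update at the heart of any quantized Q-learning scheme can be sketched generically over a finite state-action table (the paper's discretization of the diffusion and its convergence analysis are not reproduced here):

```python
def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning update on a quantized (finite) state space:
    move Q(s, a) toward the Bellman target r + gamma * max_b Q(s', b)."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q
```

For a diffusion on $\mathbb{R}^d$, the states $s$ would index cells of a quantizer, with one update applied per sampled transition of the discretized process.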
📄 This paper presents NOIR, a framework that reframes core medical imaging tasks as operator learning between continuous function spaces, challenging the prevailing paradigm of discrete grid-based deep learning. Instead of operating on fixed pixel or voxel grids, NOIR embeds discrete medical signals into shared Implicit Neural Representations and learns a Neural Operator that maps between their latent modulations, enabling resolution-independent function-to-function transformations. We evaluate NO...
📄 Deep learning models, despite their impressive achievements, suffer from high computational costs and memory requirements, limiting their usability in resource-constrained environments. Sparse neural networks significantly alleviate these constraints by dramatically reducing parameter count and computational overhead. However, existing sparse training methods often experience chaotic and noisy gradient signals, severely hindering convergence and generalization performance, particularly at high s...
📄 In modern human-robot collaboration (HRC) applications, multiple perception modules jointly extract visual, auditory, and contextual cues to achieve comprehensive scene understanding, enabling the robot to provide appropriate assistance to human agents intelligently. While executing multiple perception modules on a frame-by-frame basis enhances perception quality in offline settings, it inevitably accumulates latency, leading to a substantial decline in system performance in streaming perception...
📄 We establish a variant of the log-Sobolev and transport-information inequalities for mixture distributions. If a probability measure $π$ can be decomposed into components that individually satisfy such inequalities, then any measure $μ$ close to $π$ in relative Fisher information is close in relative entropy or transport distance to a reweighted version of $π$ with the same mixture components but possibly different weights. This provides a user-friendly interpretation of Fisher information bound...
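For context, the classical log-Sobolev inequality underlying this variant, in its relative-entropy form (standard statement, not the paper's mixture version): entropy of $μ$ relative to $π$ is controlled by the relative Fisher information $I(μ\,\|\,π)$.

```latex
% Log-Sobolev inequality for a measure $\pi$ with constant $C_{\mathrm{LSI}}$:
% relative entropy is bounded by relative Fisher information.
\[
  \mathrm{Ent}_{\pi}(\mu) \;=\; \int \log\frac{d\mu}{d\pi}\, d\mu
  \;\le\; \frac{C_{\mathrm{LSI}}}{2}
  \int \Bigl\lVert \nabla \log \frac{d\mu}{d\pi} \Bigr\rVert^{2} d\mu
  \;=\; \frac{C_{\mathrm{LSI}}}{2}\, I(\mu \,\|\, \pi).
\]
```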
📄 We propose a route to study monitored many-body dynamics in multimode bosonic systems using circuit quantum electrodynamics. In this experimental setting, we construct several bosonic models comprising brickwork circuits built from beam-splitter gates, local parity measurements, and optional on-site Hubbard interactions, and diagnose their monitored dynamics via ancilla purification and a learnability-based probe. Under parity measurements, generic gate sets exhibit behavior that is largely cons...
📄 Panoramic imagery provides holistic 360° visual coverage for perception in quadruped robots. However, existing occupancy prediction methods are mainly designed for wheeled autonomous driving and rely heavily on RGB cues, limiting their robustness in complex environments. To bridge this gap, (1) we present PanoMMOcc, the first real-world panoramic multimodal occupancy dataset for quadruped robots, featuring four sensing modalities across diverse scenes. (2) We propose a panoramic multimodal occup...
📄 We introduce **CRYSTAL** (**C**lear **R**easoning via **Y**ielded **S**teps, **T**raceability and **L**ogic), a diagnostic benchmark with 6,372 instances that evaluates multimodal reasoning through verifiable intermediate steps. We propose two complementary metrics: *Match F1*, which scores step-level precision and recall via semantic similarity matching, and *Ordered Match F1*, which further penalizes disordered reasoning chains. References are constructed through a Delphi-inspired pipeline w...
📄 We introduce SldprtNet, a large-scale dataset comprising over 242,000 industrial parts, designed for semantic-driven CAD modeling, geometric deep learning, and the training and fine-tuning of multimodal models for 3D design. The dataset provides 3D models in both .step and .sldprt formats to support diverse training and testing. To enable parametric modeling and facilitate dataset scalability, we developed supporting tools, an encoder and a decoder, which support 13 types of CAD commands and ena...
📄 The growing interest in embodied agents increases the demand for spatiotemporal video understanding, yet existing benchmarks largely emphasize extractive reasoning, where answers can be explicitly presented within spatiotemporal events. It remains unclear whether multimodal large language models can instead perform abstractive spatiotemporal reasoning, which requires integrating observations over time, combining dispersed cues, and inferring implicit spatial and contextual structure. To address ...
📄 State-of-the-art text-to-image diffusion models can produce impressive visuals but may memorize and reproduce training images, creating copyright and privacy risks. Existing prompt perturbations applied at inference time, such as random token insertion or embedding noise, may lower copying but often harm image-prompt alignment and overall fidelity. To address this, we introduce two complementary methods. First, Region-Aware Prompt Augmentation (RAPTA) uses an object detector to find salient regi...
📄 Continuous emotion recognition in terms of valence and arousal under in-the-wild (ITW) conditions remains a challenging problem due to large variations in appearance, head pose, illumination, occlusions, and subject-specific patterns of affective expression. We present a multimodal method for valence-arousal estimation in the wild. Our method combines three complementary modalities: face, behavior, and audio. The face modality relies on GRADA-based frame-level embeddings and Transformer-based temporal r...
📄 We present Multimodal OCR (MOCR), a document parsing paradigm that jointly parses text and graphics into unified textual representations. Unlike conventional OCR systems that focus on text recognition and leave graphical regions as cropped pixels, our method, termed dots.mocr, treats visual elements such as charts, diagrams, tables, and icons as first-class parsing targets, enabling systems to parse documents while preserving semantic relationships across elements. It offers several advantages: ...
📄 Despite their strong multimodal performance, large vision-language models (LVLMs) are vulnerable during fine-tuning to backdoor attacks, where adversaries insert trigger-embedded samples into the training data to implant behaviors that can be maliciously activated at test time. Existing defenses typically rely on retraining backdoored parameters (e.g., adapters or LoRA modules) with clean data, which is computationally expensive and often degrades model performance. In this work, we provide a new ...
📄 Real-time understanding of continuous video streams is essential for interactive assistants and multimodal agents operating in dynamic environments. However, most existing video reasoning approaches follow a batch paradigm that defers reasoning until the full video context is observed, resulting in high latency and growing computational cost that are incompatible with streaming scenarios. In this paper, we introduce ThinkStream, a framework for streaming video reasoning based on a Watch--Think--...
📄 Perovskite solar cells (PSCs) have experienced a remarkable rise in power conversion efficiency (PCE) over the past 15 years, positioning them as a promising alternative or complement to silicon for large-scale photovoltaic deployment. However, beyond scalable fabrication, operational stability remains a major bottleneck for commercialization. Reliable and rapid methods to assess device health and degradation mechanisms - ideally compatible with field applications - are therefore essential. We p...
📄 Vision-to-code tasks require models to reconstruct structured visual inputs, such as charts, tables, and SVGs, into executable or structured representations with high visual fidelity. While recent Large Vision Language Models (LVLMs) achieve strong results via supervised fine-tuning, reinforcement learning remains challenging due to misaligned reward signals. Existing rewards either rely on textual rules or coarse visual embedding similarity, both of which fail to capture fine-grained visual dis...
📄 Accurate band offsets are essential for predictive continuum modeling of nanostructures such as quantum wells and quantum dots formed in strained Si/Si1-xGex and Ge/Si1-xGex heterostructures. Experimental offset data for these systems remain sparse away from endpoint compositions, making composition-dependent design difficult. We use atomistic first-principles density functional theory to compute valence- and conduction-band offsets across the full range 0 <= x <= 1. Random alloying is treated w...
📄 Measurements of the event-by-event correlation between elliptic flow ($v_2$) and the mean transverse momentum ($[p_{\rm T}]$) using the modified Pearson correlation coefficient $ρ(v_2^2,[p_{\rm T}])$ are reported in pp collisions at $\sqrt{s} = 13$ TeV, and in p-Pb and Pb-Pb collisions at a center-of-mass energy per nucleon pair $\sqrt{s_{\rm NN}} = 5.02$ TeV. This analysis is based on the full LHC Run 2 dataset recorded by ALICE and is performed for the first time in small collision systems wit...
📄 Evolutions in the world, such as water pouring or ice melting, happen regardless of being observed. Video world models generate "worlds" via 2D frame observations. Can these generated "worlds" evolve regardless of observation? To probe this question, we design a benchmark to evaluate whether video world models can decouple state evolution from observation. Our benchmark, STEVO-Bench, applies observation control to evolving processes via instructions of occluder insertion, turning off the light, ...
📄 In this work, we introduce and study the $p$-$α$-closest-center problem ($pα$CCP), which generalizes the $p$-second-center problem, a recently emerged variant of the classical $p$-center problem. In the $pα$CCP, we are given sets of customers and potential facility locations, distances between each customer and potential facility location as well as two integers $p$ and $α$. The goal is to open facilities at $p$ of the potential facility locations, such that the maximum $α$-distance between each...
📄 We investigate the impact of spatial curvature, $Ω_k$, and dynamical dark energy on the cosmological constraints of the neutrino mass sum, $\sum m_ν$. Using a joint analysis of the latest CMB (Planck and ACT DR6), BAO (DESI DR2) and SNe Ia (DESY5 and DES-Dovekie) datasets, we perform an exploration of the neutrino mass parameter space. To mitigate prior-driven biases near the physical boundary, we implement a symmetric extension wrapper that allows for effective negative masses. We find that the...
📄 Instruction Tuning (IT) has been proven to be an effective approach to unlock the powerful capabilities of large language models (LLMs). Recent studies indicate that excessive IT data can degrade LLM performance, while carefully selecting a small subset of high-quality IT data can significantly enhance their capabilities. Therefore, identifying the most effective subset of the IT dataset for developing either specific or general abilities in LLMs has become a critical challenge. ...
📄 While large language models (LLMs) have transformed AI agents into proficient executors of computational materials science, performing a hundred simulations does not make a researcher. What distinguishes research from routine execution is the progressive accumulation of knowledge -- learning which approaches fail, recognizing patterns across systems, and applying understanding to new problems. However, the prevailing paradigm in AI-driven computational science treats each execution in isolation,...
📄 This article presents a comparison of various implementations of the Lattice Discrete Particle Model (LDPM) for the numerical simulation of concrete and other heterogeneous quasibrittle materials. The comparison covers transient implicit and explicit solvers as well as steady-state (static) solvers, with implementations for the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). The various implementations are compared on the basis of a set of benchmark tests describing beh...
📄 Large Language Models (LLMs) can generate persuasive influence strategies that shift cooperative behavior in multi-agent populations, but a critical question remains: does the resulting cooperation reflect genuine prosocial alignment, or does it mask erosion of agent autonomy, epistemic integrity, and distributional fairness? We introduce Constitutional Multi-Agent Governance (CMAG), a two-stage framework that interposes between an LLM policy compiler and a networked agent population, combining ...
📄 Spatio-temporal scene graphs provide a principled representation for modeling evolving object interactions, yet existing methods remain fundamentally frame-centric: they reason only about currently visible objects, discard entities upon occlusion, and operate in 2D. To address this, we first introduce ActionGenome4D, a dataset that upgrades Action Genome videos into 4D scenes via feed-forward 3D reconstruction, world-frame oriented bounding boxes for every object involved in actions, and dense r...
📄 The farthest-first traversal of Gonzalez is a classical $2$-approximation algorithm for solving the $k$-center problem, but its sequential nature makes it difficult to scale to very large datasets. In this work we study the effect of running farthest-first on a $δ$-cover of the dataset rather than on the full set of points. A $δ$-cover provides a compact summary of the data in which every point lies within distance $δ$ of some selected center. We prove that if farthest-first is applied to a $δ$-...
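The two-stage idea can be sketched as follows: build a compact $δ$-cover, then run Gonzalez's farthest-first traversal on the cover alone. The greedy sweep shown is one standard way to construct a $δ$-cover; the paper's construction and function names may differ.

```python
import math

def delta_cover(points, delta):
    """Greedy sweep: keep a point only if it is farther than delta from every
    point kept so far, so every point ends up within delta of the cover.
    (One standard construction; illustrative, not necessarily the paper's.)"""
    cover = []
    for i, p in enumerate(points):
        if all(math.dist(p, points[j]) > delta for j in cover):
            cover.append(i)
    return cover

def farthest_first(points, k):
    """Gonzalez's farthest-first traversal, the classical 2-approximation for
    k-center: repeatedly pick the point farthest from the chosen centers."""
    centers = [0]  # arbitrary first center
    d = [math.dist(p, points[0]) for p in points]
    for _ in range(k - 1):
        i = max(range(len(points)), key=d.__getitem__)
        centers.append(i)
        d = [min(d[j], math.dist(points[j], points[i])) for j in range(len(points))]
    return centers

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0), (10.0, 0.0)]
cover = delta_cover(pts, 0.5)                           # compact summary of pts
centers = farthest_first([pts[i] for i in cover], k=2)  # run only on the cover
```

Running on the cover shrinks the input from $n$ points to the cover size, at the cost of an additive $δ$ term in the approximation guarantee.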
📄 The dynamics of Saturn's satellite system offer a rich framework for studying orbital stability and resonance interactions. Traditional methods for analysing such systems, including Fourier analysis and stability metrics, struggle with the scale and complexity of modern datasets. This study introduces a machine learning-based pipeline for clustering approximately 22,300 simulated satellite orbits, addressing these challenges with advanced feature extraction and dimensionality reduction technique...
📄 Large Language Models (LLMs) increasingly serve as autonomous reasoning agents in decision support, scientific problem-solving, and multi-agent coordination systems. However, deploying LLM agents in consequential applications requires assurance that their reasoning remains stable under semantically equivalent input variations, a property we term semantic invariance. Standard benchmark evaluations, which assess accuracy on fixed, canonical problem formulations, fail to capture this critical reliab...
📄 The ability to provide trustworthy maternal health information using phone-based chatbots can have a significant impact, particularly in low-resource settings where users have low health literacy and limited access to care. However, deploying such systems is technically challenging: user queries are short, underspecified, and code-mixed across languages, answers require regional context-specific grounding, and partial or missing symptom context makes safe routing decisions difficult. We presen...
📄 Recent progress in text-conditioned human motion generation has been largely driven by diffusion models trained on large-scale human motion data. Building on this progress, recent methods attempt to transfer such models for character animation and real robot control by applying a Whole-Body Controller (WBC) that converts diffusion-generated motions into executable trajectories. While WBC trajectories become compliant with physics, they may expose substantial deviations from original motion. To a...
📄 Machine learning approaches to spatiotemporal physical systems have primarily focused on next-frame prediction, with the goal of learning an accurate emulator for the system's evolution in time. However, these emulators are computationally expensive to train and are subject to performance pitfalls, such as compounding errors during autoregressive rollout. In this work, we take a different perspective and look at scientific tasks further downstream of predicting the next frame, such as estimation...
📄 We present a new constraint on the effective number of relativistic species in the early universe, $N_{\rm eff}$, by combining recent primordial helium abundance measurements from the Large Binocular Telescope $Y_p$ Project with primordial deuterium abundance data, cosmic microwave background (CMB) observations from $\it{Planck}$, the Atacama Cosmology Telescope, and the South Pole Telescope, and baryon acoustic oscillation (BAO) data from the Dark Energy Spectroscopic Instrument, yielding $N_{\...
📄 A flavor-tagged time-dependent analysis of $B^{0}\rightarrow K_{S}^{0}μ^{+}μ^{-}$ decays is performed across the full dimuon mass range excluding the $J/ψ$ and $ψ(2S)$ resonance regions. The analysis uses proton-proton collision data collected by the LHCb experiment in 2011--2018 at center-of-mass energies of 7, 8 and 13 TeV, corresponding to an integrated luminosity of 9 fb$^{-1}$. The CP violation parameters are determined to be $C=-0.13 \pm 0.32 \pm 0.04$ and $S= +0.82\pm 0.29 \pm 0.05$, where ...
📄 Understanding pairing in the strong-coupling regime of doped Mott insulators remains an open problem in the context of cuprate superconductors. We perform ultra-high resolution numerical simulations of spectral functions in the highly underdoped $t-J$ model and discover two coupled branches of hole pairs emerging at low energies in the largely unexplored two-particle spectrum. As spin anisotropy is tuned from the Ising limit to the $SU(2)$-symmetric Heisenberg regime, the lowest $d$-wave pair ev...
📄 Status signaling drives human behavior and the allocation of scarce resources such as mating opportunities, yet the generative mechanisms governing how specific goods, signals, or behaviors acquire prestige remain a puzzle. Classical frameworks, such as Costly Signaling Theory, treat preferences as fixed and struggle to explain how semiotic meaning changes based on context or drifts dynamically over time, occasionally reaching tipping points. In this work, we propose a computational theory of st...
📄 Line intensity mapping (LIM) is a technique for producing 3D maps of the Universe by scanning the sky with a spectrometer sensitive to a range of wavelengths corresponding to the redshifted spectral lines of atoms or molecules, such as hydrogen or carbon, commonly found in galaxies and the diffuse media around them. While LIM experiments have successfully detected the 21 cm line of neutral hydrogen, other lines that reveal large-scale structure or astrophysical processes remain undetected. Many ...
📄 Large language models for code have achieved strong performance across diverse software analytics tasks, yet their real-world adoption remains limited by high computational demands, slow inference speeds, significant energy consumption, and environmental impact. Knowledge distillation (KD) offers a practical solution by transferring knowledge from a large model to a smaller and more efficient model. Despite its effectiveness, recent studies show that models distilled from a single source often e...
📄 We prove the existence of spontaneous symmetry breaking in suitably low-energy eigenstates of certain gapless and frustrated many-body quantum systems, namely symmetric quantum perturbations to classical models which exhibit spontaneous symmetry breaking of a finite group at some positive temperature. Additionally, the classical model need not be local in space, as long as it satisfies a quantum analogue of the Peierls condition. As an example of our technique, we establish robust ferromagnetism...
📄 We consider the problem of estimating the missing mass, partition function or evidence and its probability distribution in the case that for each sample point in the discrete sample space its (unnormalized) probability mass is revealed. Estimating the missing mass or partition function (evidence) is a well-studied problem for which, in different contexts, the harmonic mean estimator and the Good-Turing (and related) estimators are available. For sampling on a discrete set with revealed probabili...
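For context, the classical Good-Turing missing-mass estimate mentioned in this abstract is simply the fraction of samples whose symbol was seen exactly once; a minimal sketch (the paper's revealed-mass setting calls for different estimators, so this is only the count-based baseline):

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Classical Good-Turing estimate of the missing mass: N1 / n, where N1
    is the number of symbols observed exactly once (singletons) and n is the
    sample size. Shown for context only; it uses counts, not revealed masses."""
    counts = Counter(samples)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(samples)

good_turing_missing_mass(["a", "a", "b", "c", "c", "d"])  # 2 singletons / 6 samples
```

The intuition: symbols seen exactly once are the best available proxy for how much probability mass remains on symbols never seen at all.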
📄 Objective estimators of multimedia quality are often judged by comparing estimates with subjective "truth data," most often via Pearson correlation coefficient (PCC) or mean-squared error (MSE). But subjective test results contain noise, so striving for a PCC of 1.0 or an MSE of 0.0 is neither realistic nor repeatable. Numerous efforts have been made to acknowledge and appropriately accommodate subjective test noise in objective-subjective comparisons, typically resulting in new analysis framewo...
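The two comparison statistics named here are standard; a minimal self-contained sketch of comparing objective estimates against subjective truth data (function and variable names are illustrative):

```python
import statistics

def pearson(x, y):
    """Pearson correlation coefficient between paired score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def mse(x, y):
    """Mean-squared error between paired score lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

# Hypothetical example: objective quality estimates vs. subjective mean
# opinion scores for four clips.
estimates = [3.1, 4.0, 2.2, 4.8]
mos       = [3.0, 4.2, 2.5, 4.6]
r = pearson(estimates, mos)
e = mse(estimates, mos)
```

Because the subjective scores themselves carry noise, even a perfect estimator would not reach $r = 1.0$ or MSE $= 0.0$ against a finite subjective test, which is the abstract's point.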
📄 The MAMBO mock galaxy catalogue, based on the Millennium Simulation with empirically assigned galaxy properties, provides predictions of FIR fluxes and physical parameters of Euclid-detectable galaxies. Predicted FIR flux distributions confirm that only the brightest Euclid sources will be detectable in existing FIR surveys. We employ stacking to measure the mean dust properties as a function of stellar mass and redshift. We find dust temperatures and infrared luminosities increase with redshift...
📄 Matrix multiplication performance has long been the major bottleneck to scaling deep learning workloads, which has stimulated the design of new accelerators that use increasingly low-precision number formats. However, improvements in matrix multiplication performance have far outstripped improvements in performance on reductions and elementwise computations, which are still being performed in higher precision. In this work, we propose MXNorm, a drop-in replacement for RMSNorm that estimates the ...
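The abstract truncates before defining MXNorm; for reference, a minimal sketch of the RMSNorm baseline it is described as replacing (the `eps` stability term and names are conventional, not taken from the abstract):

```python
import math

def rmsnorm(x, gain, eps=1e-6):
    """Reference RMSNorm: rescale x by the reciprocal root mean square of its
    elements, then apply a learned per-element gain. The RMS reduction here is
    the higher-precision step that the abstract identifies as a bottleneck."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gain, x)]
```

The reduction `sum(v * v for v in x)` is exactly the kind of elementwise-plus-reduction work that, per the abstract, has not sped up as fast as low-precision matrix multiplication on modern accelerators.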
📄 Stage-IV galaxy surveys will provide the opportunity to test cosmological models and the underlying theory of gravity with unparalleled precision. In this context, it is crucial for the Euclid mission to leverage its spectroscopic and photometric probes to systematically investigate and incorporate non-standard cosmological models, including modified gravity, alternative dark energy scenarios, massive neutrinos, and primordial non-Gaussianity. We produce and release publicly simulated galaxy cat...
📄 We investigate a specific emergent dark energy scenario, known as critically emergent dark energy (CEDE), in which dark energy is effectively absent in the early Universe and becomes dynamically relevant only after a critical cosmic epoch through a phase transition. We constrain this model using recent cosmological observations, including cosmic microwave background (CMB) data from \emph{Planck} 2018, baryon acoustic oscillation (BAO) measurements from SDSS and DESI DR2, and two independent Type...
📄 Galaxy surveys demand fast large-scale structure forward models that preserve large-scale phases while providing realistic nonlinear morphology at fixed force resolution. Single-step Lagrangian Perturbation Theory (LPT) solvers are efficient, but they typically yield overly diffuse filaments and knots and underpredict small-scale clustering. We introduce Ridged Lagrangian Perturbation Theory (RLPT), a modular two-step scheme: a standard long-range LPT/ALPT transport is followed by a single pos...
📄 Gravitational-wave (GW) dark sirens provide an independent probe of the cosmic expansion history. Their cosmological constraining power, however, depends critically on precise luminosity-distance measurements and sky localizations for cross-matching with galaxy catalogs. Multiband GW observations can track GW events across different frequency bands and thus improve both. Motivated by this, we forecast the cosmological potential of intermediate-mass black hole binaries (IMBHBs) observed by a thre...
📄 Geochemical anomaly detection plays a critical role in mineral exploration as deviations from regional geochemical baselines may indicate mineralization. Existing studies suffer from two key limitations: (1) single region scenarios which limit model generalizability; (2) proprietary datasets, which makes result reproduction unattainable. In this work, we introduce \textbf{GeoChemAD}, an open-source benchmark dataset compiled from government-led geological surveys, covering multiple regions, samp...
📄 This survey reviews a collection of parallel phenomena between free boundary submanifolds in the Euclidean unit ball and closed submanifolds in the sphere, with particular emphasis on rigidity mechanisms, pinching thresholds, and canonical models. We do not regard the two theories as a unified system in one-to-one correspondence. Rather, we emphasize that in several typical settings -- including low topology, strong pinching, spectral extremality, and symmetry reduction -- the free boundary cond...
📄 The galaxy catalog dark siren method aims to infer cosmological parameters from gravitational waves (GWs) without an electromagnetic counterpart by statistically marginalizing over possible host galaxies. The cross-correlation of GW sources and galaxies is a promising avenue for cosmological inference without requiring observed host galaxies, by leveraging 2-point statistics. We provide a detailed guide to the cross-correlation method, clarifying its relationship to standard dark siren technique...
📄 Medical image segmentation (MIS) is a fundamental component of computer-assisted diagnosis and clinical decision support systems. Over the past decade, numerous architectures specifically tailored to medical imaging have emerged to address domain-specific challenges such as low contrast, small anatomical structures, and limited annotated data. In parallel, rapid progress in computer vision has produced highly capable general-purpose vision models (GP-VMs) originally designed for natural images. ...
📄 Mental health remains a major public health concern, while access to timely psychological support is often limited. AI-based dialogue systems have emerged as promising tools to address these barriers, and recent advances in large language models (LLMs) have significantly transformed this research area. However, a systematic understanding of this technological transition is still limited. This study reviews the technological evolution of AI-driven dialogue systems for mental health, focusing on t...
📄 Extracting cosmological information from Stage IV weak lensing surveys requires non-linear modelling of the matter power spectrum that is accurate across a broad range of scales and redshifts and robust to baryonic feedback. We forecast the application of the two-loop effective field theory of large-scale structure (EFTofLSS) to Roman Space Telescope, carefully considering parameterization, scale cuts, and priors. We develop neural network emulators for the two-loop integrals, allowing rapid eva...
📄 A surgical world model capable of generating realistic surgical action videos with precise control over tool-tissue interactions can address fundamental challenges in surgical AI and simulation -- from data scarcity and rare event synthesis to bridging the sim-to-real gap for surgical automation. However, current video generation methods, the very core of such surgical world models, require expensive annotations or complex structured intermediates as conditioning signals at inference, limiting t...