<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom"
      xmlns:media="http://search.yahoo.com/mrss/"
      xmlns:content="http://purl.org/rss/1.0/modules/content/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      version="2.0">
<channel>
<title>Yuda Bi — Research Notes</title>
<link>https://yuda-bi.com/notes.html</link>
<atom:link href="https://yuda-bi.com/notes.xml" rel="self" type="application/rss+xml"/>
<description>Technical notes on statistical physics, information geometry, spectral methods, and theoretical neuroscience.</description>
<language>en</language>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Tue, 07 Apr 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>Spectral Methods in Brain Network Analysis</title>
  <dc:creator>Yuda Bi</dc:creator>
  <link>https://yuda-bi.com/notes/2026-04-spectral-brain-networks/</link>
  <description><![CDATA[




<section id="graph-laplacian-and-brain-networks" class="level2">
<h2 class="anchored" data-anchor-id="graph-laplacian-and-brain-networks">Graph Laplacian and Brain Networks</h2>
<p>Given a brain connectivity matrix <img src="https://latex.codecogs.com/png.latex?W%20%5Cin%20%5Cmathbb%7BR%7D%5E%7Bn%20%5Ctimes%20n%7D"> where <img src="https://latex.codecogs.com/png.latex?w_%7Bij%7D"> represents the functional connectivity between regions <img src="https://latex.codecogs.com/png.latex?i"> and <img src="https://latex.codecogs.com/png.latex?j">, the <strong>normalized graph Laplacian</strong> is:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cmathcal%7BL%7D%20=%20I%20-%20D%5E%7B-1/2%7D%20W%20D%5E%7B-1/2%7D%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?D%20=%20%5Ctext%7Bdiag%7D(d_1,%20%5Cldots,%20d_n)"> is the degree matrix with <img src="https://latex.codecogs.com/png.latex?d_i%20=%20%5Csum_j%20w_%7Bij%7D">.</p>
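<p>A minimal numerical sketch of this definition (NumPy assumed; the toy connectivity matrix and the helper name <code>normalized_laplacian</code> are illustrative, and the sketch assumes a symmetric nonnegative <img src="https://latex.codecogs.com/png.latex?W"> with no isolated nodes):</p>

```python
import numpy as np

def normalized_laplacian(W):
    """L = I - D^{-1/2} W D^{-1/2} for a symmetric nonnegative weight matrix W."""
    d = W.sum(axis=1)                 # degrees d_i = sum_j w_ij
    s = 1.0 / np.sqrt(d)              # assumes every d_i > 0
    return np.eye(len(W)) - s[:, None] * W * s[None, :]

# Toy "connectivity": two triangles joined by a single bridge edge
W = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
L = normalized_laplacian(W)
```

<p>Because <img src="https://latex.codecogs.com/png.latex?W"> is symmetric, so is <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BL%7D">, and its spectrum lies in <img src="https://latex.codecogs.com/png.latex?%5B0,%202%5D">.</p>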
</section>
<section id="spectral-decomposition" class="level2">
<h2 class="anchored" data-anchor-id="spectral-decomposition">Spectral Decomposition</h2>
<p>The eigendecomposition of <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BL%7D"> yields:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cmathcal%7BL%7D%20=%20U%20%5CLambda%20U%5ET%20=%20%5Csum_%7Bk=0%7D%5E%7Bn-1%7D%20%5Clambda_k%20%5Cmathbf%7Bu%7D_k%20%5Cmathbf%7Bu%7D_k%5ET%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?0%20=%20%5Clambda_0%20%5Cleq%20%5Clambda_1%20%5Cleq%20%5Ccdots%20%5Cleq%20%5Clambda_%7Bn-1%7D%20%5Cleq%202">.</p>
<section id="key-properties" class="level3">
<h3 class="anchored" data-anchor-id="key-properties">Key Properties</h3>
<ol type="1">
<li><strong>Number of zero eigenvalues</strong> = number of connected components</li>
<li>The <strong>Fiedler value</strong> <img src="https://latex.codecogs.com/png.latex?%5Clambda_1"> measures algebraic connectivity</li>
<li>The <strong>spectral gap</strong> <img src="https://latex.codecogs.com/png.latex?%5Clambda_1%20-%20%5Clambda_0"> indicates community separation strength</li>
</ol>
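<p>These properties are easy to check numerically. A sketch (NumPy assumed; the triangle graphs are illustrative) comparing a connected graph with a disconnected one:</p>

```python
import numpy as np

def laplacian_spectrum(W):
    # Eigenvalues of the normalized Laplacian, sorted ascending
    d = W.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - s[:, None] * W * s[None, :]
    return np.linalg.eigvalsh(L)

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)          # one triangle: connected
W2 = np.block([[A, np.zeros((3, 3))],
               [np.zeros((3, 3)), A]])           # two triangles: 2 components

lam_conn = laplacian_spectrum(A)
lam_split = laplacian_spectrum(W2)
```

<p>The connected triangle has exactly one zero eigenvalue and a positive Fiedler value; the disconnected graph has two zero eigenvalues, so its Fiedler value vanishes.</p>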
</section>
</section>
<section id="participation-ratio" class="level2">
<h2 class="anchored" data-anchor-id="participation-ratio">Participation Ratio</h2>
<p>The <strong>effective rank</strong> via participation ratio provides a model-free measure of spectral spread:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Ctext%7BPR%7D(%5Cboldsymbol%7B%5Clambda%7D)%20=%20%5Cfrac%7B%5Cleft(%5Csum_k%20%5Clambda_k%5Cright)%5E2%7D%7B%5Csum_k%20%5Clambda_k%5E2%7D%0A"></p>
<p>This ranges from 1 (single dominant eigenvalue) to <img src="https://latex.codecogs.com/png.latex?n"> (uniform spectrum), capturing the effective dimensionality of the network.</p>
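<p>The participation ratio is a one-liner; the two extremes noted above can be verified directly (NumPy assumed):</p>

```python
import numpy as np

def participation_ratio(lam):
    """PR = (sum_k lam_k)^2 / sum_k lam_k^2: effective number of dominant eigenvalues."""
    lam = np.asarray(lam, dtype=float)
    return lam.sum() ** 2 / np.sum(lam ** 2)

pr_peaked = participation_ratio([1, 0, 0, 0])   # single dominant eigenvalue
pr_flat = participation_ratio([1, 1, 1, 1])     # uniform spectrum
```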
</section>
<section id="connection-to-statistical-physics" class="level2">
<h2 class="anchored" data-anchor-id="connection-to-statistical-physics">Connection to Statistical Physics</h2>
<p>The graph Laplacian connects to the partition function of a Gaussian field on the network:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AZ(%5Cbeta)%20=%20%5Cint%20%5Cexp%5Cleft(-%5Cfrac%7B%5Cbeta%7D%7B2%7D%20%5Cmathbf%7Bx%7D%5ET%20%5Cmathcal%7BL%7D%20%5Cmathbf%7Bx%7D%5Cright)%20d%5Cmathbf%7Bx%7D%20=%20%5Cprod_%7Bk=1%7D%5E%7Bn-1%7D%20%5Csqrt%7B%5Cfrac%7B2%5Cpi%7D%7B%5Cbeta%20%5Clambda_k%7D%7D%0A"></p>
<p>The free energy <img src="https://latex.codecogs.com/png.latex?F%20=%20-%5Cfrac%7B1%7D%7B%5Cbeta%7D%5Cln%20Z"> encodes thermodynamic properties of signal propagation on the network.</p>
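<p>Given a spectrum, the free energy follows directly from the product formula above. A minimal sketch (NumPy assumed; <code>free_energy</code> is an illustrative helper restricted to the nonzero modes, matching the product over <img src="https://latex.codecogs.com/png.latex?k%20=%201,%20%5Cldots,%20n-1">):</p>

```python
import numpy as np

def free_energy(lam, beta):
    """F = -(1/beta) ln Z, with Z = prod_{lambda_k > 0} sqrt(2 pi / (beta lambda_k))."""
    lam = np.asarray(lam, dtype=float)
    lam_pos = lam[lam > 1e-10]                       # drop the zero mode(s)
    log_Z = 0.5 * np.sum(np.log(2 * np.pi / (beta * lam_pos)))
    return -log_Z / beta

# Sanity check: a single nonzero mode with lambda = 2*pi at beta = 1 gives Z = 1, F = 0
F0 = free_energy([0.0, 2 * np.pi], 1.0)
```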


</section>

 ]]></description>
  <category>spectral methods</category>
  <category>brain networks</category>
  <category>graph theory</category>
  <guid>https://yuda-bi.com/notes/2026-04-spectral-brain-networks/</guid>
  <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>SpecRNA-QA: Why Spectral Graph Features See What Local Metrics Miss</title>
  <dc:creator>Yuda Bi</dc:creator>
  <link>https://yuda-bi.com/notes/2026-04-specrnaq/</link>
  <description><![CDATA[




<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://yuda-bi.com/notes/2026-04-specrnaq/featured.png" class="img-fluid figure-img" style="width:90.0%"></p>
<figcaption>RNA 3D structure quality visualization: C4’ deviation from best model (blue = 0 Å, red = 50 Å). Structures with correct local geometry but misplaced domains show large red regions — exactly the failure mode that spectral methods detect.</figcaption>
</figure>
</div>
<section id="the-problem-local-correctness-global-failure" class="level2">
<h2 class="anchored" data-anchor-id="the-problem-local-correctness-global-failure">The Problem: Local Correctness, Global Failure</h2>
<p>Predicting RNA 3D structure is one of the open frontiers of structural biology. Tools like AlphaFold3 and RoseTTAFold2NA now generate thousands of candidate structures — but <strong>how do you know which ones are right?</strong></p>
<p>Existing quality assessment (QA) methods evaluate <strong>local</strong> atomic contacts: bond angles, clash scores, pairwise distance distributions. These work well when errors are local. But they fail catastrophically in a common and important failure mode: <strong>the local structure is correct, but entire domains are misplaced.</strong> A helix can be perfectly folded yet docked into the wrong pocket. Every bond angle checks out; the global topology is wrong.</p>
<p>This is exactly the regime where RNA QA matters most — large, multi-domain structures where the combinatorial space of domain arrangements dwarfs the local conformational space.</p>
</section>
<section id="the-insight-global-topology-lives-in-the-spectrum" class="level2">
<h2 class="anchored" data-anchor-id="the-insight-global-topology-lives-in-the-spectrum">The Insight: Global Topology Lives in the Spectrum</h2>
<p>A 3D molecular structure is, at its core, a <strong>graph</strong>: nucleotides are nodes, spatial contacts are edges. The spectrum of the graph Laplacian — the eigenvalues of <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BL%7D%20=%20I%20-%20D%5E%7B-1/2%7DWD%5E%7B-1/2%7D"> — encodes the global connectivity pattern in a way that is invariant to rotation, translation, and node relabeling.</p>
<p>The key mathematical facts:</p>
<ul>
<li>The <strong>number of zero eigenvalues</strong> equals the number of connected components</li>
<li>The <strong>Fiedler value</strong> <img src="https://latex.codecogs.com/png.latex?%5Clambda_1"> measures how easily the graph can be bisected — a proxy for global compactness</li>
<li>The <strong>spectral gap</strong> <img src="https://latex.codecogs.com/png.latex?%5Clambda_1%20-%20%5Clambda_0"> quantifies community separation</li>
<li><strong>Heat-kernel traces</strong> <img src="https://latex.codecogs.com/png.latex?Z(t)%20=%20%5Csum_k%20e%5E%7B-%5Clambda_k%20t%7D"> capture multi-scale diffusion: small <img src="https://latex.codecogs.com/png.latex?t"> probes local geometry, large <img src="https://latex.codecogs.com/png.latex?t"> probes global topology</li>
<li>The <strong>participation ratio</strong> <img src="https://latex.codecogs.com/png.latex?%5Cmathrm%7BPR%7D(%5Cboldsymbol%7B%5Clambda%7D)%20=%20(%5Csum_k%20%5Clambda_k)%5E2%20/%20%5Csum_k%20%5Clambda_k%5E2"> measures effective spectral dimensionality</li>
</ul>
<p>A misplaced domain changes the large-<img src="https://latex.codecogs.com/png.latex?t"> heat-kernel trace (disrupted long-range diffusion) while leaving the small-<img src="https://latex.codecogs.com/png.latex?t"> trace nearly intact (local contacts are fine). This is precisely the information that local metrics cannot access.</p>
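<p>This separation of scales is easy to see in a toy example (NumPy assumed; the ring graphs are illustrative, chosen because every node has identical local structure). One 12-ring and two 6-rings have nearly identical small-<img src="https://latex.codecogs.com/png.latex?t"> heat traces but clearly different large-<img src="https://latex.codecogs.com/png.latex?t"> traces, since the large-<img src="https://latex.codecogs.com/png.latex?t"> limit counts connected components:</p>

```python
import numpy as np

def heat_trace(W, t):
    """Z(t) = sum_k exp(-lambda_k t) over the normalized-Laplacian spectrum."""
    d = W.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    lam = np.linalg.eigvalsh(np.eye(len(W)) - s[:, None] * W * s[None, :])
    return np.sum(np.exp(-lam * t))

def ring(n):
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
    return W

one_ring = ring(12)                                   # same local (degree-2) structure,
two_rings = np.block([[ring(6), np.zeros((6, 6))],    # different global topology
                      [np.zeros((6, 6)), ring(6)]])

small_t_gap = abs(heat_trace(one_ring, 0.1) - heat_trace(two_rings, 0.1))
large_t_gap = abs(heat_trace(one_ring, 50.0) - heat_trace(two_rings, 50.0))
```

<p>The small-<img src="https://latex.codecogs.com/png.latex?t"> gap is tiny (local contacts agree), while the large-<img src="https://latex.codecogs.com/png.latex?t"> gap approaches 1 (one component versus two).</p>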
</section>
<section id="the-method" class="level2">
<h2 class="anchored" data-anchor-id="the-method">The Method</h2>
<p><strong>SpecRNA-QA</strong> builds on this insight with a practical pipeline:</p>
<ol type="1">
<li><p><strong>Multi-scale contact graphs</strong>: Construct contact networks at multiple distance thresholds (8 Å, 10 Å, 12 Å, and 15 Å), capturing different spatial resolutions of the RNA architecture.</p></li>

<li><p><strong>Spectral feature extraction</strong>: From each graph’s normalized Laplacian, extract ~312 features:</p>
<ul>
<li>Eigenvalue statistics (mean, variance, skewness, kurtosis of <img src="https://latex.codecogs.com/png.latex?%5C%7B%5Clambda_k%5C%7D">)</li>
<li>Heat-kernel traces at multiple diffusion times <img src="https://latex.codecogs.com/png.latex?Z(t)"> for <img src="https://latex.codecogs.com/png.latex?t%20%5Cin%20%5C%7B0.1,%200.5,%201,%202,%205,%2010%5C%7D"></li>
<li>Participation ratios and effective rank measures</li>
<li>Spectral gap and algebraic connectivity</li>
<li>Normalized Laplacian entropy <img src="https://latex.codecogs.com/png.latex?H%20=%20-%5Csum_k%20%5Chat%7B%5Clambda%7D_k%20%5Clog%20%5Chat%7B%5Clambda%7D_k"></li>
</ul></li>
<li><p><strong>Learning-to-rank</strong>: An XGBRanker model trained to rank structures by quality within each target, using spectral features as input.</p></li>
</ol>
<p>The entire pipeline runs on <strong>CPU</strong> — no GPU required. Processing time: <strong>15 ms</strong> for a 100-nucleotide structure, <strong>~4.2 seconds</strong> for 800 nucleotides.</p>
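<p>To make the pipeline concrete, here is a heavily simplified sketch of steps 1–2 (NumPy assumed; this is <em>not</em> the actual SpecRNA-QA implementation — the function name, the small feature subset, and the random coordinates are illustrative only):</p>

```python
import numpy as np

def spectral_features(coords, thresholds=(8.0, 10.0, 12.0, 15.0),
                      times=(0.1, 0.5, 1, 2, 5, 10)):
    """Toy multi-scale spectral features from an (n, 3) array of coordinates."""
    feats = []
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    for r in thresholds:
        W = ((dist < r) & (dist > 0)).astype(float)        # contact graph at radius r
        d = np.maximum(W.sum(1), 1e-12)                    # guard isolated nodes
        s = 1.0 / np.sqrt(d)
        lam = np.linalg.eigvalsh(np.eye(len(W)) - s[:, None] * W * s[None, :])
        feats += [lam.mean(), lam.var()]                   # eigenvalue statistics
        feats += [np.exp(-lam * t).sum() for t in times]   # heat-kernel traces
        feats.append(lam.sum() ** 2 / (lam ** 2).sum())    # participation ratio
        lam_hat = lam / lam.sum()                          # normalized spectrum
        lam_hat = lam_hat[lam_hat > 0]
        feats.append(-(lam_hat * np.log(lam_hat)).sum())   # spectral entropy
    return np.array(feats)

coords = np.random.default_rng(0).normal(size=(50, 3)) * 10.0  # stand-in for C4' atoms
x = spectral_features(coords)
```

<p>The real feature bank is larger (~312 features, including skewness/kurtosis and effective-rank variants), but the structure — one spectrum per threshold, several summaries per spectrum — is the same.</p>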
</section>
<section id="results" class="level2">
<h2 class="anchored" data-anchor-id="results">Results</h2>
<section id="casp16-benchmark" class="level3">
<h3 class="anchored" data-anchor-id="casp16-benchmark">CASP16 Benchmark</h3>
<p>On the CASP16 RNA structure prediction assessment (42 targets, 7,368 models):</p>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Method</th>
<th>Median Spearman <img src="https://latex.codecogs.com/png.latex?%5Crho"></th>
<th><img src="https://latex.codecogs.com/png.latex?p">-value vs.&nbsp;SpecRNA-QA</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>SpecRNA-QA (supervised)</strong></td>
<td><strong>0.689</strong></td>
<td>—</td>
</tr>
<tr class="even">
<td>Geometry baselines</td>
<td>0.465</td>
<td><img src="https://latex.codecogs.com/png.latex?1.2%20%5Ctimes%2010%5E%7B-10%7D"></td>
</tr>
</tbody>
</table>
</section>
<section id="where-it-matters-most-large-rnas" class="level3">
<h3 class="anchored" data-anchor-id="where-it-matters-most-large-rnas">Where It Matters Most: Large RNAs</h3>
<p>The advantage is most pronounced for large RNA structures (&gt;200 nucleotides), where the performance gap reaches <strong>+0.233</strong> in Spearman correlation. This is the regime where domain-level misplacements dominate — and where spectral features shine.</p>
<p>For small RNAs (&lt;100 nucleotides), local metrics are often sufficient because there are few domains to misplace. The spectral advantage grows with structural complexity, exactly as the theory predicts.</p>
</section>
<section id="most-discriminative-features" class="level3">
<h3 class="anchored" data-anchor-id="most-discriminative-features">Most Discriminative Features</h3>
<p>Feature importance analysis reveals that the top-ranked features are <strong>heat-kernel traces at intermediate-to-large diffusion times</strong> — precisely the features that probe multi-scale and global transport geometry on the contact network. Local eigenvalue statistics (which probe small-scale structure) rank lower.</p>
<p>This confirms the theoretical motivation: the spectral approach works because it accesses the global information that local methods cannot reach.</p>
</section>
</section>
<section id="connection-to-the-broader-spectral-program" class="level2">
<h2 class="anchored" data-anchor-id="connection-to-the-broader-spectral-program">Connection to the Broader Spectral Program</h2>
<p>SpecRNA-QA is part of a broader research program applying spectral graph theory to structural biology:</p>
<ul>
<li><strong>SpecRNA-QA</strong> (RNA): Multi-scale Laplacian spectra for RNA 3D quality assessment → under review at <em>Briefings in Bioinformatics</em></li>
<li><strong>Spectral Coherence Index</strong> (Proteins): Participation-ratio effective rank of inter-model distance-variance matrices for protein ensemble QA, achieving AUC-ROC 0.973 on 110 NMR ensembles → under review at <em>IEEE JBHI</em> (<a href="https://arxiv.org/abs/2603.25880">arXiv:2603.25880</a>)</li>
</ul>
<p>Both methods share a design principle: <strong>model-free spectral features that are invariant to coordinate systems and capture global structural properties that local metrics miss.</strong></p>
</section>
<section id="try-it" class="level2">
<h2 class="anchored" data-anchor-id="try-it">Try It</h2>
<p>SpecRNA-QA is open source and easy to use:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">git</span> clone https://github.com/yudabitrends/specrnaq</span>
<span id="cb1-2"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">cd</span> specrnaq</span>
<span id="cb1-3"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">pip</span> install <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-e</span> .</span>
<span id="cb1-4"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">specrnaq</span> predict <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--input</span> structures/ <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--output</span> scores.csv</span></code></pre></div></div>
<p>Python 3.10+, CPU-only, no external dependencies beyond standard scientific Python.</p>
<hr>
</section>
<section id="papers" class="level2">
<h2 class="anchored" data-anchor-id="papers">Papers</h2>
<ol type="1">
<li><p>Ying Zhu, Huaiwen Zhang, Vince D. Calhoun<sup>†</sup>, <strong>Yuda Bi</strong><sup>†</sup>. <em>Spectral Graph Features Capture Global Topology for Reference-free RNA 3D Structure Quality Assessment.</em> Under review at Briefings in Bioinformatics.</p></li>
<li><p><strong>Yuda Bi</strong>, Huaiwen Zhang, Jingnan Sun, Vince D. Calhoun. <em>Spectral Coherence Index: A Model-Free Metric for Protein Structural Ensemble Quality Assessment.</em> Under review at IEEE JBHI. <a href="https://arxiv.org/abs/2603.25880">arXiv:2603.25880</a></p></li>
</ol>


</section>

 ]]></description>
  <category>spectral methods</category>
  <category>RNA structure</category>
  <category>bioinformatics</category>
  <category>graph theory</category>
  <guid>https://yuda-bi.com/notes/2026-04-specrnaq/</guid>
  <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
  <media:content url="https://yuda-bi.com/notes/2026-04-specrnaq/featured.png" medium="image" type="image/png" height="111" width="144"/>
</item>
<item>
  <title>The Geometry of Invisible Forces: A Quartic Detection Theory</title>
  <dc:creator>Yuda Bi</dc:creator>
  <link>https://yuda-bi.com/notes/2026-04-quartic-detection-theory/</link>
  <description><![CDATA[




<section id="the-problem-forces-you-cannot-see" class="level2">
<h2 class="anchored" data-anchor-id="the-problem-forces-you-cannot-see">The Problem: Forces You Cannot See</h2>
<p>Hidden degrees of freedom are everywhere. The ocean drives climate through modes no single weather station resolves. Latent neural populations shape the signals recorded by any one electrode. Slow institutional forces move markets through channels invisible to any single asset’s price history. The central question is deceptively simple: <strong>given noisy observations, can you tell whether a hidden force is present?</strong></p>
<p>Classical statistics offers a reassuring answer — collect enough data and any nonzero signal emerges. Fisher information scales as <img src="https://latex.codecogs.com/png.latex?N"> (the sample size), so the detection boundary shrinks as <img src="https://latex.codecogs.com/png.latex?N%5E%7B-1/2%7D">. For a coupling strength <img src="https://latex.codecogs.com/png.latex?%5Clambda">, you need <img src="https://latex.codecogs.com/png.latex?N%20%5Csim%20%5Clambda%5E%7B-2%7D"> observations. Reasonable. Manageable.</p>
<p><strong>This answer is wrong.</strong> Not slightly wrong — wrong by orders of magnitude.</p>
</section>
<section id="the-quartic-law-why-detection-is-exponentially-harder" class="level2">
<h2 class="anchored" data-anchor-id="the-quartic-law-why-detection-is-exponentially-harder">The Quartic Law: Why Detection Is Exponentially Harder</h2>
<p>Consider the simplest possible hidden-variable problem. An observed time series <img src="https://latex.codecogs.com/png.latex?X_t"> obeys</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AX_%7Bt+1%7D%20=%20a%5C,%20X_t%20+%20%5Clambda%5C,%20F_t%20+%20%5Cvarepsilon_t%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?F_t"> is a hidden persistent driver (<img src="https://latex.codecogs.com/png.latex?F_%7Bt+1%7D%20=%20b%5C,%20F_t%20+%20%5Ceta_t">) and <img src="https://latex.codecogs.com/png.latex?%5Cvarepsilon_t"> is noise. The coupling <img src="https://latex.codecogs.com/png.latex?%5Clambda"> controls how strongly the hidden world leaks into the visible one.</p>
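<p>This two-line system is trivial to simulate (NumPy assumed; the parameter values <code>a=0.6</code>, <code>b=0.9</code>, <code>lam=0.1</code> are illustrative, not taken from the paper):</p>

```python
import numpy as np

def simulate(a=0.6, b=0.9, lam=0.1, T=10_000, seed=0):
    """X_{t+1} = a X_t + lam F_t + eps_t with hidden AR(1) driver F_{t+1} = b F_t + eta_t."""
    rng = np.random.default_rng(seed)
    X = np.zeros(T)
    F = np.zeros(T)
    eps = rng.normal(size=T)   # observation noise epsilon_t
    eta = rng.normal(size=T)   # hidden-driver noise eta_t
    for t in range(T - 1):
        F[t + 1] = b * F[t] + eta[t]
        X[t + 1] = a * X[t] + lam * F[t] + eps[t]
    return X, F

X, F = simulate()
```

<p>At small <img src="https://latex.codecogs.com/png.latex?%5Clambda">, the observed series looks almost exactly like a plain AR(1): its lag-1 autocorrelation sits near <img src="https://latex.codecogs.com/png.latex?a">, which is precisely why a refitted reduced model absorbs most of the perturbation.</p>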
<p>The power spectrum of <img src="https://latex.codecogs.com/png.latex?X_t"> is indeed perturbed at <img src="https://latex.codecogs.com/png.latex?O(%5Clambda%5E2)">. So far, so classical. But here is the catch: <strong>when you refit a reduced model (one without the hidden variable), the best-fit parameters shift to absorb most of that perturbation.</strong> The refitted model “explains away” the hidden signal by reparametrizing itself.</p>
<p>What survives is not the <img src="https://latex.codecogs.com/png.latex?O(%5Clambda%5E2)"> perturbation, but only its <strong>normal component</strong> — the part that cannot be absorbed by any reparametrization of the reduced model. And this residual is <img src="https://latex.codecogs.com/png.latex?O(%5Clambda%5E4)">.</p>
<p>The result is the <strong>quartic detection law</strong>:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AD_%7B%5Cmathrm%7BKL%7D%7D%5E%7B%5Cmin%7D(%5Clambda)%20=%20C%5C,%5Clambda%5E4%20+%20O(%5Clambda%5E6)%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?C"> is a system-specific constant and <img src="https://latex.codecogs.com/png.latex?D_%7B%5Cmathrm%7BKL%7D%7D%5E%7B%5Cmin%7D"> is the minimum Kullback-Leibler divergence between truth and the best-fit reduced model. The detection boundary becomes</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cboxed%7B%5Clambda_c(N)%20%5C;%5Cpropto%5C;%20%5Cleft(%5Cfrac%7B%5Clog%20N%7D%7BN%7D%5Cright)%5E%7B1/4%7D%7D%0A"></p>
<p>For a 10% coupling (<img src="https://latex.codecogs.com/png.latex?%5Clambda/%5Csigma%20=%200.1">), the quartic penalty demands <img src="https://latex.codecogs.com/png.latex?%5Csim%2010%5E3"> times more data than the naive quadratic expectation. This is not a small correction — it is a qualitative change in what is experimentally feasible.</p>
</section>
<section id="the-geometry-tangent-absorption-on-statistical-manifolds" class="level2">
<h2 class="anchored" data-anchor-id="the-geometry-tangent-absorption-on-statistical-manifolds">The Geometry: Tangent Absorption on Statistical Manifolds</h2>
<p>The quartic law is not an accident of the specific model. It is a <strong>geometric theorem on statistical manifolds</strong>.</p>
<p>Let <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BM%7D_0"> be the manifold of reduced (null) models, parametrized by <img src="https://latex.codecogs.com/png.latex?%5Cboldsymbol%7B%5Ctheta%7D">. The true distribution under hidden forcing is</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Ap_%7B%5Cmathrm%7Btrue%7D%7D(x%20%5Cmid%20%5Clambda)%20=%20p_0(x%20%5Cmid%20%5Cboldsymbol%7B%5Ctheta%7D_0)%5C,%5Cbigl(1%20+%20%5Clambda%5E2%5C,%20h(x)%20+%20O(%5Clambda%5E4)%5Cbigr)%0A"></p>
<p>The perturbation <img src="https://latex.codecogs.com/png.latex?h"> has a unique decomposition into tangent and normal components relative to <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BM%7D_0">:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Ah%20=%20%5Cunderbrace%7B%5CPi_T%5C,%20h%7D_%7B%5Ctext%7Babsorbed%20by%20refitting%7D%7D%20+%20%5Cunderbrace%7BR%7D_%7B%5Ctext%7Bdetectable%20residual%7D%7D%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?%5CPi_T"> is the <img src="https://latex.codecogs.com/png.latex?L%5E2(p_0)">-projection onto the tangent space <img src="https://latex.codecogs.com/png.latex?T_%7B%5Cboldsymbol%7B%5Ctheta%7D_0%7D%5Cmathcal%7BM%7D_0">. The tangent part is indistinguishable from a parameter shift; only the normal residual <img src="https://latex.codecogs.com/png.latex?R%20=%20(I%20-%20%5CPi_T)%5C,h"> is genuinely detectable. Since <img src="https://latex.codecogs.com/png.latex?h"> enters at <img src="https://latex.codecogs.com/png.latex?O(%5Clambda%5E2)">, the squared residual — hence the KL divergence — is <img src="https://latex.codecogs.com/png.latex?O(%5Clambda%5E4)">.</p>
<p>This is Efron’s statistical curvature repurposed for detection: the “curvature” of the embedding of truth relative to the null manifold controls how much signal survives refitting.</p>
</section>
<section id="the-pairing-principle-one-probe-is-blind" class="level2">
<h2 class="anchored" data-anchor-id="the-pairing-principle-one-probe-is-blind">The Pairing Principle: One Probe Is Blind</h2>
<p>The geometry yields a stark operational consequence. Consider <img src="https://latex.codecogs.com/png.latex?n"> sensors sharing a common hidden source:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AY_%7Bij%7D%20=%20%5Cmu%20+%20%5Clambda%5C,%20u_i%20+%20%5Cvarepsilon_%7Bij%7D,%20%5Cqquad%20u_i%20%5Csim%20N(0,1)%0A"></p>
<p>The quartic coefficient is</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AC%20=%20%5Cfrac%7Bn(n-1)%7D%7B4%5Csigma%5E4%7D%0A"></p>
<p>The combinatorial factor <img src="https://latex.codecogs.com/png.latex?n(n-1)%20=%202%5Cbinom%7Bn%7D%7B2%7D"> counts <strong>sensor pairs</strong> sharing the hidden effect. When <img src="https://latex.codecogs.com/png.latex?n%20=%201">, <img src="https://latex.codecogs.com/png.latex?C%20=%200"> identically — no amount of data can detect the hidden variable from a single probe. This is not a power problem; it is a <strong>structural impossibility</strong>.</p>
<p>The <strong>pairing principle</strong> states:</p>
<blockquote class="blockquote">
<p>Detecting a hidden degree of freedom requires at least two probes sharing the hidden effect. Detection power grows as the number of probe pairs <img src="https://latex.codecogs.com/png.latex?%5Cbinom%7Bn%7D%7B2%7D">, with diminishing returns: <img src="https://latex.codecogs.com/png.latex?%5Clambda_c%20%5Cpropto%20n%5E%7B-1/2%7D">, not <img src="https://latex.codecogs.com/png.latex?n%5E%7B-1%7D">.</p>
</blockquote>
<p>For Gaussian observations, this is exact. A single electrode in a neural recording, a single climate station, a single asset price — each is <strong>provably blind</strong> to a latent common driver, regardless of record length.</p>
</section>
<section id="the-dark-regime-when-timescales-coalesce" class="level2">
<h2 class="anchored" data-anchor-id="the-dark-regime-when-timescales-coalesce">The Dark Regime: When Timescales Coalesce</h2>
<p>Even with the quartic law in hand, there is a deeper obstruction. In the spectral setting, the quartic coefficient takes the closed form</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AC(a,%20b)%20=%20%5Cfrac%7B%5Csigma_%5Ceta%5E4%5C,%20b%5E2%5C,%20(a%20-%20b)%5E2%7D%7B2%5C,%5Csigma_%5Cvarepsilon%5E4%5C,(1%20-%20b%5E2)%5E3%5C,(1%20-%20ab)%5E2%7D%0A"></p>
<p>This coefficient <strong>vanishes</strong> when <img src="https://latex.codecogs.com/png.latex?a%20=%20b"> — when the intrinsic relaxation timescale of the observed system exactly matches that of the hidden driver. At this <strong>timescale coalescence</strong>, the hidden driver’s spectral signature is perfectly tangent to the null manifold. The hidden forcing becomes <em>spectrally dark</em>: present, dynamically active, yet locally invisible to any single-channel spectral test.</p>
<p>The data cost diverges as the dark boundary is approached:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Clambda_c%20%5C;%5Cpropto%5C;%20%7Ca%20-%20b%7C%5E%7B-1/2%7D%20%5Cto%20%5Cinfty%20%5Cquad%20%5Ctext%7Bas%20%7D%20a%20%5Cto%20b%0A"></p>
<p>This is not merely a detection difficulty — it is a geometric singularity of the inference problem itself.</p>
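<p>The closed form for <img src="https://latex.codecogs.com/png.latex?C(a,%20b)"> can be evaluated directly (NumPy assumed; unit noise variances chosen for illustration), making the dark point at <img src="https://latex.codecogs.com/png.latex?a%20=%20b"> and the approach to it explicit:</p>

```python
def quartic_coefficient(a, b, sig_eta=1.0, sig_eps=1.0):
    """C(a, b) = sig_eta^4 b^2 (a-b)^2 / (2 sig_eps^4 (1-b^2)^3 (1-ab)^2)."""
    return (sig_eta**4 * b**2 * (a - b)**2) / (
        2 * sig_eps**4 * (1 - b**2)**3 * (1 - a * b)**2)

C_far = quartic_coefficient(0.5, 0.9)    # well-separated timescales
C_near = quartic_coefficient(0.89, 0.9)  # approaching coalescence
C_dark = quartic_coefficient(0.9, 0.9)   # exact coalescence: identically zero
```

<p>Since <img src="https://latex.codecogs.com/png.latex?%5Clambda_c%20%5Cpropto%20C%5E%7B-1/4%7D">, the shrinking coefficient near <img src="https://latex.codecogs.com/png.latex?a%20=%20b"> translates directly into the diverging data cost above.</p>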
</section>
<section id="breaking-the-impossibility-the-cross-spectral-escape" class="level2">
<h2 class="anchored" data-anchor-id="breaking-the-impossibility-the-cross-spectral-escape">Breaking the Impossibility: The Cross-Spectral Escape</h2>
<p>The single-channel impossibility is not the end of the story. It is the beginning of a deeper one.</p>
<p>Lucente et al.&nbsp;proved that no time-irreversibility measure can detect departure from equilibrium in a scalar Gaussian time series from a linear system. This seems like a fundamental wall. But it has a geometric loophole: <strong>it applies only to diagonal (single-channel) observations.</strong></p>
<p>When a second channel shares the same hidden driver, the <strong>cross spectrum</strong> — the off-diagonal block of the spectral matrix — provides a detection channel that is orthogonal to the entire diagonal null manifold. The cross-spectral contribution obeys its own quartic law:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AD_%7B%5Cmathrm%7Bcross%7D%7D(%5Clambda)%20=%20C_%7B%5Cmathrm%7Bcross%7D%7D%5C,%5Clambda%5E4%20+%20O(%5Clambda%5E6)%0A"></p>
<p>with a remarkable property: the coefficient <img src="https://latex.codecogs.com/png.latex?C_%7B%5Cmathrm%7Bcross%7D%7D"> is <strong>exactly independent of the observed-channel dynamics</strong>. All dependence on the transfer functions <img src="https://latex.codecogs.com/png.latex?H_1,%20H_2"> of the observed channels cancels identically. The cross-spectral detectability is determined solely by the hidden mode’s spectral density:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AC_%7B%5Cmathrm%7Bcross%7D%7D%20=%20%5Cfrac%7Bu_1%5E2%5C,%20u_2%5E2%7D%7B%5Csigma_%7B%5Cvarepsilon_1%7D%5E2%5C,%20%5Csigma_%7B%5Cvarepsilon_2%7D%5E2%7D%20%5Ccdot%20I_F,%20%5Cqquad%20I_F%20=%20%5Cfrac%7B1%7D%7B4%5Cpi%7D%5Cint_%7B-%5Cpi%7D%5E%7B%5Cpi%7D%20S_F(%5Comega)%5E2%5C,%20d%5Comega%0A"></p>
<p>Crucially, <img src="https://latex.codecogs.com/png.latex?C_%7B%5Cmathrm%7Bcross%7D%7D%20%3E%200"> <strong>at exact timescale coalescence</strong>, where all single-channel measures vanish. The cross spectrum breaks the dark regime.</p>
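<p>For a concrete hidden mode, <img src="https://latex.codecogs.com/png.latex?I_F"> is straightforward to evaluate numerically. A sketch (NumPy assumed; the AR(1) driver <img src="https://latex.codecogs.com/png.latex?F_%7Bt+1%7D%20=%20b%20F_t%20+%20%5Ceta_t"> with <img src="https://latex.codecogs.com/png.latex?b%20=%200.9"> is an illustrative choice), checked against the known AR(1) closed form <img src="https://latex.codecogs.com/png.latex?I_F%20=%20%5Csigma_%5Ceta%5E4%20(1+b%5E2)%20/%20%5Cbigl(2(1-b%5E2)%5E3%5Cbigr)">:</p>

```python
import numpy as np

def I_F(b, sig_eta=1.0, n_grid=200_000):
    """I_F = (1/4pi) * integral of S_F(w)^2 over [-pi, pi] for an AR(1) hidden mode."""
    w = np.linspace(-np.pi, np.pi, n_grid)
    S = sig_eta**2 / (1 - 2 * b * np.cos(w) + b**2)   # AR(1) spectral density
    return (S**2).mean() / 2.0    # mean * 2pi / (4pi), periodic trapezoid rule

val = I_F(0.9)
```

<p>Note that <img src="https://latex.codecogs.com/png.latex?I_F"> grows steeply as the hidden mode becomes more persistent (<img src="https://latex.codecogs.com/png.latex?b%20%5Cto%201">): slower hidden drivers are, all else equal, easier to witness cross-spectrally.</p>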
</section>
<section id="the-thermodynamic-bridge-detectability-certifies-irreversibility" class="level2">
<h2 class="anchored" data-anchor-id="the-thermodynamic-bridge-detectability-certifies-irreversibility">The Thermodynamic Bridge: Detectability Certifies Irreversibility</h2>
<p>The connection deepens at the level of thermodynamics. For a one-way coupled Ornstein-Uhlenbeck system, the full-system entropy production rate (EPR) is exactly</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cdot%7B%5CSigma%7D_%7B%5Cmathrm%7Btotal%7D%7D%20=%20%5Calpha_2%5C,%5Clambda%5E2%0A"></p>
<p>and the EPR-detectability bridge reads</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cdot%7B%5CSigma%7D_%7B%5Cmathrm%7Btotal%7D%7D%5E2%20=%20%5Cfrac%7B%5Calpha_2%5E2%7D%7BC_%7B%5Cmathrm%7Bcross%7D%7D%7D%5C,%20D_%7B%5Cmathrm%7Bcross%7D%7D%20+%20O(%5Clambda%5E6)%0A"></p>
<p>This means: <strong>if the cross-spectral divergence is positive, the full system is certifiably out of equilibrium.</strong> Cross-spectral structure witnesses entropy production even when every single-channel estimator of time-irreversibility returns zero.</p>
<p>A single probe can sit inside a system with arbitrarily large true entropy production and measure <em>exactly</em> zero irreversibility. Two probes, sharing the same hidden bath, break this thermodynamic blindness.</p>
</section>
<section id="the-unified-picture" class="level2">
<h2 class="anchored" data-anchor-id="the-unified-picture">The Unified Picture</h2>
<p>The three results assemble into a coherent geometric theory of hidden-variable detectability:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 29%">
<col style="width: 44%">
</colgroup>
<thead>
<tr class="header">
<th>Layer</th>
<th>Result</th>
<th>Key Object</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Universal law</strong></td>
<td>Quartic onset <img src="https://latex.codecogs.com/png.latex?D%20%5Csim%20%5Clambda%5E4"></td>
<td>Normal residual <img src="https://latex.codecogs.com/png.latex?R%20=%20(I%20-%20%5CPi_T)%5C,h"></td>
</tr>
<tr class="even">
<td><strong>Structural impossibility</strong></td>
<td>Single probe is blind (<img src="https://latex.codecogs.com/png.latex?C%20=%200"> at <img src="https://latex.codecogs.com/png.latex?n=1">)</td>
<td>Pairing principle: need <img src="https://latex.codecogs.com/png.latex?%5Cgeq%202"> probes</td>
</tr>
<tr class="odd">
<td><strong>Spectral darkness</strong></td>
<td><img src="https://latex.codecogs.com/png.latex?C%20%5Cto%200"> at timescale coalescence</td>
<td>Tangent alignment of hidden spectrum</td>
</tr>
<tr class="even">
<td><strong>Cross-spectral escape</strong></td>
<td><img src="https://latex.codecogs.com/png.latex?C_%7B%5Cmathrm%7Bcross%7D%7D%20%3E%200"> at coalescence</td>
<td>Off-diagonal orthogonality</td>
</tr>
<tr class="odd">
<td><strong>Thermodynamic certification</strong></td>
<td><img src="https://latex.codecogs.com/png.latex?D_%7B%5Cmathrm%7Bcross%7D%7D%20%3E%200%20%5CRightarrow%20%5Cdot%7B%5CSigma%7D%20%3E%200"></td>
<td>EPR-detectability bridge</td>
</tr>
</tbody>
</table>
<p>The hierarchy is:</p>
<ol type="1">
<li>A single observation channel is <strong>exactly dark</strong> — hidden forcing is indistinguishable from equilibrium, for all coupling strengths, at all sample sizes.</li>
<li>Two channels sharing the hidden driver restore quartic detectability via the cross spectrum, even at the coalescence singularity where auto-spectral methods fail.</li>
<li>The cross-spectral witness certifies that the full system is out of equilibrium, linking observable spectral structure to the thermodynamic arrow of time.</li>
</ol>
</section>
<section id="implications" class="level2">
<h2 class="anchored" data-anchor-id="implications">Implications</h2>
<p><strong>For neuroscience</strong>: Detecting latent neural inputs requires multi-channel recording. The pairing principle prescribes exactly how many electrodes are needed given estimates of coupling and noise. A single fMRI voxel or EEG electrode is provably blind to a shared latent source.</p>
<p><strong>For climate science</strong>: Detecting unresolved ocean forcing in climate records requires spatially distributed stations. A single station, no matter how long the record, cannot distinguish forced variability from intrinsic noise. The dark regime at timescale coalescence explains why slow ocean modes are so difficult to identify.</p>
<p><strong>For stochastic thermodynamics</strong>: The single-channel impossibility is not a limitation of current methods — it is a geometric fact about the statistical manifold. The cross-spectral escape provides a constructive route to certifying irreversibility from partial observations.</p>
<p><strong>For experimental design</strong>: The quartic law transforms the question “how much data do I need?” into “how many probes do I need, and where?” The answer: at least two, sharing the hidden effect, with detection power growing as the number of probe pairs.</p>
<hr>
</section>
<section id="papers-in-this-series" class="level2">
<h2 class="anchored" data-anchor-id="papers-in-this-series">Papers in This Series</h2>
<ol type="1">
<li><p><strong>Yuda Bi</strong>, Vince D. Calhoun. <em>Why Single Probes Cannot Detect Hidden Forcing: A Quartic Detection Law.</em> Under review at Physical Review Letters.</p></li>
<li><p><strong>Yuda Bi</strong>, Vince D. Calhoun. <em>Cross Spectra Break the Single-Channel Impossibility.</em> Under review at Physical Review Letters.</p></li>
<li><p><strong>Yuda Bi</strong>, Chenyu Zhang, Vince D. Calhoun. <em>Timescale Coalescence Makes Hidden Persistent Forcing Spectrally Dark.</em> Under review at Physical Review E. <a href="https://arxiv.org/abs/2603.20917">arXiv:2603.20917</a></p></li>
<li><p><strong>Yuda Bi</strong>, Vince D. Calhoun. <em>Conditioning on a Volatility Proxy Compresses the Apparent Timescale of Collective Market Correlation.</em> Under review at Physical Review E. <a href="https://arxiv.org/abs/2603.14072">arXiv:2603.14072</a></p></li>
</ol>


</section>

 ]]></description>
  <category>statistical physics</category>
  <category>information geometry</category>
  <category>hidden variables</category>
  <category>spectral methods</category>
  <guid>https://yuda-bi.com/notes/2026-04-quartic-detection-theory/</guid>
  <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Structure Sets the Stage: How Gray Matter Geometry Constrains Functional Brain Organization</title>
  <dc:creator>Yuda Bi</dc:creator>
  <link>https://yuda-bi.com/notes/2026-04-sfcoupling/</link>
  <description><![CDATA[




<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://yuda-bi.com/notes/2026-04-sfcoupling/featured.png" class="img-fluid figure-img" style="width:100.0%"></p>
<figcaption><strong>Spectral fingerprint of structure-function coupling.</strong> (a) Singular value spectrum of the GM→FNC coefficient matrix showing rapid spectral decay — the coupling is low-rank. (b) Principal angle cosines between GM-selected and FNC-dominant subspaces: near-perfect alignment in ~3 dimensions, then rapid falloff. (c) The signature gap: subspace overlap reaches 0.45 while variance explained (<img src="https://latex.codecogs.com/png.latex?R%5E2">) stays at ~0.06.</figcaption>
</figure>
</div>
<section id="the-paradox-weak-prediction-strong-geometry" class="level2">
<h2 class="anchored" data-anchor-id="the-paradox-weak-prediction-strong-geometry">The Paradox: Weak Prediction, Strong Geometry</h2>
<p>A persistent puzzle in systems neuroscience: gray matter morphometry (regional volume, thickness, surface area) predicts functional network connectivity (FNC) poorly — typically explaining only a few percent of variance. This has led many to dismiss the structure-function relationship as negligible for gray matter, focusing attention on white-matter tractography instead.</p>
<p>But <strong>variance explained is the wrong metric</strong>. It measures whether structure predicts the <em>amplitude</em> of functional variation — how strongly each person’s connectivity deviates from the mean. A more fundamental question is whether structure predicts the <em>directions</em> — the coordinate axes along which functional variation occurs at all.</p>
<p>These are mathematically distinct quantities. A linear map <img src="https://latex.codecogs.com/png.latex?%5Chat%7BY%7D%20=%20XB"> from GM features (<img src="https://latex.codecogs.com/png.latex?X%20%5Cin%20%5Cmathbb%7BR%7D%5E%7BN%20%5Ctimes%20p%7D">) to FNC (<img src="https://latex.codecogs.com/png.latex?Y%20%5Cin%20%5Cmathbb%7BR%7D%5E%7BN%20%5Ctimes%20q%7D">) can have low <img src="https://latex.codecogs.com/png.latex?R%5E2"> (weak amplitude prediction) while its coefficient matrix <img src="https://latex.codecogs.com/png.latex?B"> has column space closely aligned with the dominant subspace of <img src="https://latex.codecogs.com/png.latex?Y"> (strong directional alignment).</p>
</section>
<section id="the-method-nuclear-norm-regularization-and-svd" class="level2">
<h2 class="anchored" data-anchor-id="the-method-nuclear-norm-regularization-and-svd">The Method: Nuclear Norm Regularization and SVD</h2>
<p>To disentangle amplitude from geometry, we use a spectral regularization framework:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cmin_B%20%5C;%20%5Cfrac%7B1%7D%7B2N%7D%5C%7CY%20-%20XB%5C%7C_F%5E2%20+%20%5Clambda%20%5C%7CB%5C%7C_*%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?%5C%7CB%5C%7C_*%20=%20%5Csum_i%20%5Csigma_i(B)"> is the <strong>nuclear norm</strong> — the tightest convex relaxation of matrix rank. This encourages the learned mapping to be low-rank, concentrating the structure-function relationship into a small number of principal modes.</p>
<p>The proximal step has a closed-form solution via soft singular value thresholding:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AB%5E%7B(t+1)%7D%20=%20U%5C,%5Cmathrm%7Bdiag%7D%5Cbigl((%5Csigma_i%20-%20%5Clambda/L)_+%5Cbigr)%5C,V%5E%5Ctop%0A"></p>
<p>The SVD of the resulting coefficient matrix <img src="https://latex.codecogs.com/png.latex?B%20=%20U%5CSigma%20V%5E%5Ctop"> gives us:</p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?V">: the <strong>FNC directions</strong> most aligned with GM variation (the “stage geometry”)</li>
<li><img src="https://latex.codecogs.com/png.latex?U">: the <strong>GM features</strong> driving each coupling mode (the “structural scaffolding”)</li>
<li><img src="https://latex.codecogs.com/png.latex?%5CSigma">: the <strong>coupling strength</strong> per mode (spectral concentration)</li>
</ul>
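<p>A minimal numpy sketch of this proximal-gradient scheme (ISTA with singular value soft-thresholding); the function names, step-size choice, and iteration count are ours, not the paper's implementation:</p>

```python
import numpy as np

def svt(B, tau):
    """Soft singular value thresholding: the prox operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def nuclear_norm_regression(X, Y, lam, n_iter=500):
    """Proximal gradient (ISTA) on (1/2N)||Y - XB||_F^2 + lam * ||B||_*."""
    N, p = X.shape
    B = np.zeros((p, Y.shape[1]))
    L = np.linalg.norm(X, 2) ** 2 / N    # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / N     # gradient of the smooth fit term
        B = svt(B - grad / L, lam / L)   # prox step: shrink singular values
    return B
```

<p>The SVD of the returned coefficient matrix then yields the <code>U</code>, <code>Σ</code>, <code>V</code> factors interpreted above.</p>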
</section>
<section id="measuring-directional-alignment-subspace-overlap" class="level2">
<h2 class="anchored" data-anchor-id="measuring-directional-alignment-subspace-overlap">Measuring Directional Alignment: Subspace Overlap</h2>
<p>To compare GM-selected functional directions with the dominant directions of FNC variation itself, we compute <strong>principal angles</strong> between two subspaces:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Ccos%5Ctheta_i%20=%20%5Csigma_i(V_1%5E%5Ctop%20V_2)%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?V_1"> are the top-<img src="https://latex.codecogs.com/png.latex?k"> right singular vectors of <img src="https://latex.codecogs.com/png.latex?B"> (GM-selected directions) and <img src="https://latex.codecogs.com/png.latex?V_2"> are the top-<img src="https://latex.codecogs.com/png.latex?k"> PCA directions of FNC. The <strong>subspace overlap</strong> is:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AO(V_1,%20V_2)%20=%20%5Cfrac%7B1%7D%7Bk%7D%5Csum_%7Bi=1%7D%5E%7Bk%7D%20%5Ccos%5E2%5Ctheta_i%20%5C;%5Cin%20%5B0,%201%5D%0A"></p>
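<p>In code, the overlap is one SVD of the product of the two bases (a sketch with our own naming; for random <em>k</em>-dimensional subspaces of <img src="https://latex.codecogs.com/png.latex?%5Cmathbb%7BR%7D%5Ed"> the expected overlap is roughly <em>k/d</em>, which is how the chance level in the table below can be estimated):</p>

```python
import numpy as np

def subspace_overlap(V1, V2):
    """Mean squared cosine of the principal angles between span(V1), span(V2).

    V1, V2: (d, k) matrices with orthonormal columns.
    Returns a value in [0, 1]: 1 for identical subspaces, 0 for orthogonal
    ones, and about k/d on average for two random k-dimensional subspaces.
    """
    cosines = np.linalg.svd(V1.T @ V2, compute_uv=False)
    return float(np.sum(cosines ** 2) / V1.shape[1])
```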
</section>
<section id="the-core-finding-o-0.45-vs.-r2-0.06" class="level2">
<h2 class="anchored" data-anchor-id="the-core-finding-o-0.45-vs.-r2-0.06">The Core Finding: <img src="https://latex.codecogs.com/png.latex?O%20=%200.45"> vs.&nbsp;<img src="https://latex.codecogs.com/png.latex?R%5E2%20=%200.06"></h2>
<p>Across three datasets (schizophrenia cohort <img src="https://latex.codecogs.com/png.latex?N=1%7B,%7D151">; external validation <img src="https://latex.codecogs.com/png.latex?N=102">; UK Biobank <img src="https://latex.codecogs.com/png.latex?N%20%5Capprox%2037%7B,%7D775">):</p>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 22%">
<col style="width: 51%">
</colgroup>
<thead>
<tr class="header">
<th>Metric</th>
<th>Value</th>
<th>Interpretation</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Variance explained (<img src="https://latex.codecogs.com/png.latex?R%5E2">)</td>
<td>~0.06</td>
<td>GM weakly predicts FNC amplitudes</td>
</tr>
<tr class="even">
<td>Subspace overlap (<img src="https://latex.codecogs.com/png.latex?O">)</td>
<td>0.447</td>
<td>GM strongly constrains FNC directions</td>
</tr>
<tr class="odd">
<td>Top 3 principal angle cosines</td>
<td>0.97, 0.95, 0.86</td>
<td>Near-perfect alignment in 3 dimensions</td>
</tr>
<tr class="even">
<td>4th principal angle</td>
<td>0.28</td>
<td>Sharp dropoff — concentrated coupling</td>
</tr>
<tr class="odd">
<td>Chance overlap (at <img src="https://latex.codecogs.com/png.latex?k=20">)</td>
<td>~0.015</td>
<td>Observed overlap is 30× above chance</td>
</tr>
</tbody>
</table>
<p>The gap is an order of magnitude. Gray matter does not determine <em>how strongly</em> individuals deviate from mean connectivity — but it strongly constrains <em>in which directions</em> that deviation can occur.</p>
<p><strong>The analogy</strong>: anatomy sets the stage geometry; neural dynamics determine how the actors perform on that stage. The theater constrains the repertoire of possible plays without scripting any particular performance.</p>
</section>
<section id="what-the-coupling-modes-look-like" class="level2">
<h2 class="anchored" data-anchor-id="what-the-coupling-modes-look-like">What the Coupling Modes Look Like</h2>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://yuda-bi.com/files/sfcoupling-fig7.png" class="img-fluid figure-img" style="width:90.0%"></p>
<figcaption><strong>Structure-function coupling modes.</strong> Each row shows one SVD mode: left panel shows GM loadings on a dorsal glass brain (node size = loading magnitude, color = network domain), right panel shows the corresponding FNC loading matrix (domain-sorted). Mode 1 (36.5% of coupled variance) captures a global sensorimotor-cognitive axis; Mode 2 (9.2%) isolates default mode and visual interactions; Mode 3 (7.9%) captures cerebellar-cortical coupling.</figcaption>
</figure>
</div>
<p>The SVD reveals interpretable structure-function coupling modes:</p>
<ul>
<li><strong>Mode 1</strong> (36.5% of coupled variance): A distributed pattern spanning cognitive control, sensorimotor, and visual domains. This is the dominant axis along which GM morphometry shapes functional organization.</li>
<li><strong>Mode 2</strong> (9.2%): Concentrated in default mode and visual network interactions — reflecting the structural basis of resting-state network architecture.</li>
<li><strong>Mode 3</strong> (7.9%): Captures cerebellar-cortical coupling patterns, highlighting the structural underpinning of cerebro-cerebellar communication.</li>
</ul>
<p>These modes are highly stable across random seeds (correlation <img src="https://latex.codecogs.com/png.latex?r%20%3E%200.9"> for the top 3) and replicate across datasets.</p>
</section>
<section id="linearity-the-coupling-really-is-linear" class="level2">
<h2 class="anchored" data-anchor-id="linearity-the-coupling-really-is-linear">Linearity: The Coupling Really Is Linear</h2>
<p>A critical finding: <strong>nonlinear models provide no reliable improvement</strong> over the linear nuclear norm solution.</p>
<ul>
<li>MLP (multilayer perceptron) gains +0.004 <img src="https://latex.codecogs.com/png.latex?R%5E2"> on discovery data but <strong>reverses to −0.009 on external validation</strong></li>
<li>A nonlinear residual model performs worse than nuclear norm alone on both datasets</li>
<li>When initialized from the linear solution, the MLP’s mixing parameter converges to <img src="https://latex.codecogs.com/png.latex?%5Calpha%20=%200.60">, staying close to the linear regime</li>
</ul>
<p>This linearity is not an assumption — it is an empirical finding validated across datasets. The structure-function relationship, at the population level, is well-captured by a linear, low-rank map.</p>
</section>
<section id="clinical-relevance-the-coupled-subspace-carries-disease-information" class="level2">
<h2 class="anchored" data-anchor-id="clinical-relevance-the-coupled-subspace-carries-disease-information">Clinical Relevance: The Coupled Subspace Carries Disease Information</h2>
<p>Decomposing each subject’s FNC into a <strong>structure-coupled</strong> component (<img src="https://latex.codecogs.com/png.latex?y_%5Cmathrm%7Bcoup%7D%20=%20V_r%20V_r%5E%5Ctop%20y">) and a <strong>structure-uncoupled</strong> component (<img src="https://latex.codecogs.com/png.latex?y_%5Cmathrm%7Buncoup%7D%20=%20(I%20-%20V_r%20V_r%5E%5Ctop)%20y">):</p>
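<p>The split is two complementary projections; a minimal numpy sketch (variable names are ours):</p>

```python
import numpy as np

def split_fnc(Y, Vr):
    """Split FNC rows into structure-coupled and structure-uncoupled parts.

    Y:  (N, q) FNC vectors, one row per subject.
    Vr: (q, r) orthonormal basis for the structure-coupled subspace
        (top-r right singular vectors of the coefficient matrix B).
    """
    P = Vr @ Vr.T            # orthogonal projector onto the coupled subspace
    Y_coup = Y @ P           # structure-coupled component
    Y_uncoup = Y - Y_coup    # residual (I - P) component
    return Y_coup, Y_uncoup
```

<p>Each component can then be fed to the same classifier, which is how the AUC comparison below is obtained.</p>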
<table class="caption-top table">
<thead>
<tr class="header">
<th>Component</th>
<th>SZ classification AUC</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Coupled FNC (rank 38)</td>
<td><strong>0.795</strong> [0.726, 0.857]</td>
</tr>
<tr class="even">
<td>Full FNC</td>
<td>0.773 [0.712, 0.839]</td>
</tr>
<tr class="odd">
<td>Uncoupled FNC</td>
<td>0.728</td>
</tr>
</tbody>
</table>
<p>The structure-coupled component <em>outperforms full FNC</em> for schizophrenia classification. This means that the functional variation constrained by gray matter morphometry preferentially carries clinically relevant information — anatomical structure does not just constrain function; it constrains the <em>clinically informative</em> part of function.</p>
</section>
<section id="physical-and-mathematical-significance" class="level2">
<h2 class="anchored" data-anchor-id="physical-and-mathematical-significance">Physical and Mathematical Significance</h2>
<section id="why-nuclear-norm" class="level3">
<h3 class="anchored" data-anchor-id="why-nuclear-norm">Why Nuclear Norm?</h3>
<p>The nuclear norm <img src="https://latex.codecogs.com/png.latex?%5C%7CB%5C%7C_*"> is the <img src="https://latex.codecogs.com/png.latex?%5Cell_1"> norm of the singular value vector — it induces sparsity in the spectral domain just as LASSO induces sparsity in the coefficient domain. This is the natural regularizer when the underlying relationship is low-rank: it seeks the simplest (lowest effective rank) linear map consistent with the data.</p>
<p>In the landscape of regularization:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 24%">
<col style="width: 57%">
<col style="width: 18%">
</colgroup>
<thead>
<tr class="header">
<th>Method</th>
<th>What it regularizes</th>
<th>Bias</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Ridge (<img src="https://latex.codecogs.com/png.latex?%5C%7CB%5C%7C_F%5E2">)</td>
<td>Shrinks all singular values equally</td>
<td>Preserves rank, weakens all directions</td>
</tr>
<tr class="even">
<td>PLS</td>
<td>Maximizes covariance in few components</td>
<td>Greedily selects modes, poor generalization</td>
</tr>
<tr class="odd">
<td><strong>Nuclear Norm</strong> (<img src="https://latex.codecogs.com/png.latex?%5C%7CB%5C%7C_*">)</td>
<td>Soft-thresholds singular values</td>
<td>Suppresses weak modes, preserves strong ones</td>
</tr>
</tbody>
</table>
<p>Nuclear norm regularization achieves the highest external generalization (74% retention, vs.&nbsp;45% for PLS) precisely because it soft-thresholds rather than hard-truncates the spectral structure.</p>
</section>
<section id="the-subspace-overlap-as-an-information-geometric-quantity" class="level3">
<h3 class="anchored" data-anchor-id="the-subspace-overlap-as-an-information-geometric-quantity">The Subspace Overlap as an Information-Geometric Quantity</h3>
<p>The subspace overlap <img src="https://latex.codecogs.com/png.latex?O(V_1,%20V_2)"> is the mean squared cosine of the principal angles — equivalently, it is the normalized Frobenius norm of the product of two orthogonal projections:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AO%20=%20%5Cfrac%7B1%7D%7Bk%7D%5C%7CP_%7BV_1%7D%20P_%7BV_2%7D%5C%7C_F%5E2%0A"></p>
<p>This has a natural interpretation in information geometry: it measures the fraction of “geometric information” shared between two low-dimensional representations of the same high-dimensional space. When <img src="https://latex.codecogs.com/png.latex?O%20%5Cgg%20R%5E2">, the two modalities share geometric structure (directional alignment) without sharing amplitude information — the hallmark of a <em>soft constraint</em> rather than a deterministic prediction.</p>
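<p>The projector identity is easy to verify numerically — <img src="https://latex.codecogs.com/png.latex?%5C%7CP_%7BV_1%7DP_%7BV_2%7D%5C%7C_F%5E2%20=%20%5C%7CV_1%5E%5Ctop%20V_2%5C%7C_F%5E2%20=%20%5Csum_i%20%5Ccos%5E2%5Ctheta_i"> — with a short sketch (dimensions arbitrary):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 40, 5
V1, _ = np.linalg.qr(rng.standard_normal((d, k)))   # orthonormal bases
V2, _ = np.linalg.qr(rng.standard_normal((d, k)))

P1, P2 = V1 @ V1.T, V2 @ V2.T                       # orthogonal projectors
overlap_proj = np.linalg.norm(P1 @ P2, "fro") ** 2 / k
cos2 = np.linalg.svd(V1.T @ V2, compute_uv=False) ** 2
overlap_angles = cos2.sum() / k                     # mean squared cosine
# Both routes compute the same subspace overlap O.
```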
</section>
</section>
<section id="looking-ahead-the-multimodal-decomposition-program" class="level2">
<h2 class="anchored" data-anchor-id="looking-ahead-the-multimodal-decomposition-program">Looking Ahead: The Multimodal Decomposition Program</h2>
<p>This paper is the first in a planned three-paper series:</p>
<p><strong>Paper 0</strong> (this work): Establishes that GM constrains function through a shared low-rank subspace, not pointwise prediction. Introduces nuclear norm regularization and the <img src="https://latex.codecogs.com/png.latex?O%20%5Cgg%20R%5E2"> diagnostic.</p>
<p><strong>Paper 1</strong> (in preparation): Asks whether gray matter and white matter constrain the <em>same</em> or <em>different</em> functional subspaces. Uses two-stage subspace decomposition to partition functional variance into four components:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Af_i%20=%20f_i%5E%7B(g%20%5Ccap%20w)%7D%20+%20f_i%5E%7B(g%20%5Csetminus%20w)%7D%20+%20f_i%5E%7B(w%20%5Csetminus%20g)%7D%20+%20f_i%5E%7B(%5Cmathrm%7Bres%7D)%7D%0A"></p>
<p>Key question: is the GM–WM functional overlap redundant or complementary? Data: UK Biobank (<img src="https://latex.codecogs.com/png.latex?N%20%5Capprox%2030%7B,%7D000">; sMRI + dMRI + resting fMRI).</p>
<p><strong>Paper 2</strong> (planned): Unified probabilistic framework with joint latent variables, non-Gaussian priors for identifiability, and posterior uncertainty on the variance decomposition. Connects to ICA/IVA frameworks (Adalı) and Bayesian ARD for automatic dimensionality selection.</p>
<hr>
</section>
<section id="paper" class="level2">
<h2 class="anchored" data-anchor-id="paper">Paper</h2>
<p><strong>Yuda Bi</strong>, Vince D. Calhoun. <em>Gray Matter Morphometry Reveals a Soft Low-Rank Structure-Function Subspace.</em> In preparation for NeuroImage.</p>


</section>

 ]]></description>
  <category>neuroscience</category>
  <category>structure-function coupling</category>
  <category>spectral methods</category>
  <category>linear algebra</category>
  <guid>https://yuda-bi.com/notes/2026-04-sfcoupling/</guid>
  <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
  <media:content url="https://yuda-bi.com/notes/2026-04-sfcoupling/featured.png" medium="image" type="image/png" height="45" width="144"/>
</item>
</channel>
</rss>
