Nonlinear Feature Extraction for Financial Time Series Prediction Using Causal Legendre Polynomials

From FinTech Theory to Algorithmic Trading: Building a Complete Python Pipeline with Legendre Polynomials without Lookahead Bias 🔁🤖

May 26, 2026

This paper develops a causal financial forecasting and trading framework based on Legendre polynomial expansions, arguing that they provide a stable, distribution-agnostic way to extract smooth structure from noisy, nonlinear market data. It builds a pipeline that moves from mathematical foundations and time-series representation to causal trend filtering, crossover-based trading strategies, Sharpe-optimized signal design, and rigorous in/out-of-sample backtesting, ultimately extending the method to multi-horizon forecasting within a unified, real-time deployable system.

Most algo trading content gives you theory.

This gives you the code.

3 Python strategies. Fully backtested. Colab notebook included. Plus a free ebook with 5 more strategies the moment you subscribe.

5,000 quant traders already run these:

Subscribe | AlgoEdge Insights

Daily algorithmic trading signals, market insights, and code-ready strategies to sharpen your edge.algoedgeinsights.beehiiv.com

Legendre basis: 1, reality: 0 (image created by the author based on Canva templates).

😄 Legendre basis: 1, reality: 0 (image designed by the author based on Canva templates).

👋 Hello to everyone fascinated by prediction, causality, and market dynamics! If markets are nonlinear [7], why are so many predictive pipelines still pretending otherwise? What if your model’s “noise” is actually structured nonlinear information waiting for the right basis expansion?

The present study is focused on developing a systematic methodology for forecasting financial time series under generalized stochastic dynamics with nonlinear fluctuations [7], encompassing polynomial trend models, stochastic volatility structures, and causal smoothed moving averages based on polynomial regression endpoints.

The proposed framework exploits the fact that many of these models belong to the class of local or global polynomial diffusions, allowing efficient approximation via orthogonal polynomial expansions [2–6].

Although numerous orthogonal polynomial systems exist (including Hermite, Laguerre, Chebyshev, etc.), the selection of Legendre Polynomials (LP) [2] requires specific justification within the context of quantitative finance [3–6]:

LP are orthogonal on a finite interval with uniform weight. This matters because financial data does not naturally conform to Gaussian or exponential weighting assumptions, and in many cases it is undesirable to impose a specific distributional form on the errors.
LP are orthogonal, which helps reduce multicollinearity, and they are numerically stable in least-squares estimation, while also being better conditioned than raw power features such as x**n with n>0. This is crucial when dealing with noisy financial time series, high-dimensional feature extraction, and rolling-window estimation, where stability and robustness of the representation are especially important.
LP are defined on a finite interval, which makes them naturally suited for sliding-window forecasting, local feature extraction, and causal segmentation of time series.
The Legendre basis introduces minimal inductive bias beyond smoothness and orthogonality, avoiding tail overweighting, decay assumptions, and Gaussian noise dependence associated with Chebyshev, Laguerre, and Hermite bases, respectively. This is often desirable in financial modeling where the underlying distribution is unknown, regime shifts occur, and structural assumptions can be unreliable.
Since polynomial diffusions remain invariant under polynomial transformations, LP enable efficient projection onto finite-dimensional spaces and naturally align with the orthogonal decomposition of state dynamics. Polynomial diffusions are especially useful in volatility forecasting because volatility (or variance) is not directly observed but evolves as a latent stochastic process.

These properties make LP a natural choice for causal, local feature extraction in financial time series forecasting.

This study draws a clear distinction between Legendre polynomials (LP) and the Legendre transform [1], focusing primarily on the former rather than the latter.

LP are special basis functions Pn(x) used for approximating functions f(x). Instead of describing the function point-by-point, we describe it using LP expansion coefficients w_n (n=0, 1, 2, etc.). This is similar in spirit to Fourier series.

The Legendre transform g(p) = sup_x [px — f(x)] is an operation on convex functions f(x). Instead of describing a function by position x, we describe it by slope p = df/dx. The we construct a new function g(p) that depends on the slope p instead of the original variable x.

It should be emphasized that the Legendre transform is unrelated to Legendre polynomials.

The remainder of the paper is organized as follows:

The “Legendre Polynomials from Scratch” section establishes the mathematical foundation of LP and introduces the primary tool used throughout the paper.
“What Is the Legendre Transform?” explains the transform and builds intuition for changing perspectives on relevant data and functions.
“Legendre Transform in Portfolio Optimization” links the transform to portfolio decisions by showing how it captures risk–return trade-offs.
“Legendre Polynomial Expansion of Time Series” is the first direct data application of Legendre basis functions, used to smooth noisy financial data and convert raw prices into structured trend components.
“Causal Legendre Polynomials in Time Series Analysis” introduces a causality constraint by ensuring only past and present data are used, avoiding future leakage and making the model suitable for real-time trading systems.
“Legendre Trend Filtering on Stock Closing Prices” makes the method practical by using LP expansions to extract smooth trends from price series, acting as a filtering technique that removes noise while preserving underlying structure.
“Causal Legendre Crossover Trading Strategy” defines a crossover system by comparing fast and slow Legendre trends and generating buy and sell signals when they cross.
“Sharpe-Optimized Causal Legendre Crossover Trading Signals” improves the strategy by tuning parameters to maximize the Sharpe ratio, shifting focus from raw returns to risk-adjusted performance and adding an optimization layer to signal generation.
“In-Sample Grid Search and Out-of-Sample Backtesting” is the validation framework where parameters are tuned on in-sample data and then tested on unseen data to ensure the results are not overfitted.
“Causal Time-Series Trend Estimation and Multi-Horizon Forecasting Model” uses causal LP for trend estimation, extends them to multi-horizon forecasting, and unifies the backtesting and out-of-sample evaluation frameworks.

In simple terms, the paper progresses from the mathematical foundation of Legendre theory, to time-series representation via LP expansion, followed by causal filtering under realistic constraints, then trend estimation, trading strategy design, Sharpe-ratio-based optimization, rigorous backtesting, and finally an extension to multi-horizon LP forecasting.

Let’s get started! 🚀

· Legendre Polynomials from Scratch
· What Is the Legendre Transform?
· Legendre Transform in Portfolio Optimization
· Legendre Polynomial Expansion of Time Series
· Causal Legendre Polynomials in Time Series Analysis
· Legendre Trend Filtering on Stock Closing Prices
· Causal Legendre Crossover Trading Strategy
· Sharpe-Optimized Causal Legendre Crossover Trading Signals
· In-Sample Grid Search and Out-of-Sample Backtesting
· Causal Time-Series Trend Estimation and Multi-Horizon Forecasting Model
· Conclusions

Legendre Polynomials from Scratch

Let’s build intuition for LP [2] step-by-step in Python by viewing them as a special family of polynomials that are orthogonal on the interval [-1, 1].

The expressions x**n (with n=0, 1, 2,…) are the simplest examples of polynomials in x. More specifically, they are called monomials (aka the standard polynomial basis).

LP are just specially chosen combinations of these:

(n+1)Pn+1(x)=(2n+1) x Pn(x) − n Pn−1(x)

This recursion is the standard way to compute LP.

The first few are given by

Why not try plotting them?

import numpy as np
import matplotlib.pyplot as plt

# x values
x = np.linspace(-1, 1, 500)

# first Legendre polynomials
P0 = np.ones_like(x)
P1 = x
P2 = 0.5 * (3*x**2 - 1)
P3 = 0.5 * (5*x**3 - 3*x)

# plot
plt.plot(x, P0, label=’P0’)
plt.plot(x, P1, label=’P1’)
plt.plot(x, P2, label=’P2’)
plt.plot(x, P3, label=’P3’)

plt.legend()
plt.grid(True)
plt.title(”First Legendre Polynomials”)
plt.xlabel(”x”)
plt.ylabel(”Pn(x)”)
plt.show()

First Legendre Polynomials

Once we understand the basics, NumPy can do the heavy lifting

from numpy.polynomial.legendre import Legendre
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 1, 500)

# P3
P3 = Legendre.basis(3)

plt.plot(x, P3(x))
plt.grid(True)
plt.title(”Legendre Polynomial P3”)
plt.xlabel(”x”)
plt.ylabel(”P3(x)”)
plt.show()

This is analogous to the FFT: Fourier methods decompose signals into oscillating waves, while LP decompose functions into smooth curved polynomial basis functions.

For example, suppose a time series y(t=time) has a smooth nonlinear trend. A common approach is polynomial regression

y(t) = b0+b1*t+b2*t**2+…+O(t**n)

But ordinary powers of t are highly correlated. This creates multicollinearity.

LP solve this because the basis functions are orthogonal. Instead, we have

y(t)=b0 P0(x_t) +b1 P1(x_t)+b2 P2(x_t) +…+En,

where time t is rescaled to x_t = [-1, 1].

If the function y(t) has k smooth derivatives, then the asymptotic error bound is given by

En=O(1/n**k)

Takeaway

LP can be viewed as a collection of mutually independent smooth basis shapes for trends. Rather than constructing a trend from correlated power terms like t**n for n≥0, we use orthogonal smooth components Pn(t), resulting in cleaner estimation and improved numerical stability.

Bottom Line

Orthogonal polynomial approximations (like LP) are powerful for smooth time series trends because they provide a stable, compact, and energy-efficient representation of smooth structure.

The crucial point is that smooth time series can be represented accurately using only a small number of low-order orthogonal modes.

② One strategy in this book returned 2.3× the S&P 500 on a risk-adjusted basis over 5 years.

Fully coded in Python. Yours to run today.

The 2026 Playbook — 30+ backtested strategies,

full code included, ready to deploy.

20% off until Thursday. Use MAY2026 at checkout.

~~$79~~ → $63.20 · Expires May 21.

→ Grab it before Tuesday

AlgoEdge Insights: 30+ Python-Powered Trading Strategies – The Complete 2026 Playbook

What Is the Legendre Transform?

The Legendre transform [1] of a convex function y=f(x) is constructed from its graph in the (x,y) plane, yielding a new function g(p) of the variable p (see below). Consider the straight line y = px. We take the point x = x (p) at which the curve is farthest from the straight line in the vertical direction: for each p the function F(p,x) = px-f(x) has a maximum with respect to x at the point x(p). Now we define g(p) = F(p, x(p)). The point x(p) is defined by the extremal condition dF/dx =0 or df/dx=p. Since f is convex, the point x(p) is unique.

Legendre Transformation (image created by the author based on Canva templates).

The simple Python code below provides a visual explanation of the Legendre transform. The code defines a convex function f(x), draws straight lines with different slopes, finds where each slope “best touches” the curve, and shows how the Legendre transform comes from maximizing the bivariate function F(p,x)=px-f(x).

import numpy as np
import matplotlib.pyplot as plt

# Choose a smooth bowl-shaped curve (convex).
def f(x):
    return 0.5 * x**2 + 0.2 * x**4   # convex example

# The derivative tells us the slope of the curve at each point.
def df(x):
    return x + 0.8 * x**3 
#This derivative is not actually used below.
#It’s included for completeness.

# grid for x
x = np.linspace(-2, 2, 500)

# choose several p values (slopes)
p_values = [-1.5, -0.5, 0.5, 1.5]

# --- Plot f(x) ---
plt.figure(figsize=(7, 5))
plt.plot(x, f(x), label=”f(x)”)

for p in p_values:
    # supporting line y = p x
    plt.plot(x, p * x, linestyle=”--”, alpha=0.6, label=f”y = {p}x”)

    # numerical maximization of F(p,x)
    F = p * x - f(x)
    x_star = x[np.argmax(F)]
    y_star = f(x_star)

    plt.plot(x_star, y_star, ‘o’)

plt.title(”Legendre Transform: Geometry (supporting lines)”)
plt.xlabel(”x”)
plt.ylabel(”y”)
plt.legend()
plt.grid()

plt.show()


# --- Plot F(p,x) = px - f(x) for one p ---
p = 1.0
F = p * x - f(x)

x_star = x[np.argmax(F)]
g_p = np.max(F)

plt.figure(figsize=(7, 5))
plt.plot(x, F, label=f”F(p,x) = p x - f(x), p={p}”)
plt.plot(x_star, g_p, ‘ro’, label=”maximum”) 

plt.title(”Legendre Transform: Optimization View”)
plt.xlabel(”x”)
plt.ylabel(”F(p,x)”)
plt.legend()
plt.grid()

plt.show()

Legendre Transform Example: Geometry (Supporting Lines) vs Optimization View. Geometrically, the maximum point is where the line with slope p “supports” the convex curve.

The first graph shows the convex function f(x), several straight lines with different slopes, and the special point where each slope best matches the curve. This is the geometric view of the Legendre transform.

The second plot illustrates how F(p,x) varies with x and where its maximum (red dot) occurs.

Legendre Transform in Portfolio Optimization

Both the Kelly criterion and mean–variance optimization can be understood through the same idea as the Legendre dual viewpoint: we simply switch between allocations and prices (or risk trade-offs).

The mean–variance optimization (MVO) is a quadratic Legendre transform of the risk cost function f(w) = λ(σw)**2/2, where λ is the risk aversion factor, σ is the standard deviation of returns, x=w signifies portfolio weights, and p=μ means expected returns.

Also, the Kelly criterion maximizes long-run log growth as a dual problem. For small returns, we approximate it as max_w [μw-(σw)**2/2]. This is structurally similar to MVO, except that risk aversion λ is not chosen exogenously but is instead determined endogenously by log utility.

We now extend the previous Legendre transform example to MVO

import numpy as np
import matplotlib.pyplot as plt

# -----------------------------
# Mean–variance risk function
# f(w) = (λ/2) σ² w²
# -----------------------------
sigma = 0.4
lam = 3.0

def f(w):
    return 0.5 * lam * sigma**2 * w**2

def df(w):
    return lam * sigma**2 * w  # not used, but conceptually slope

# -----------------------------
# grid for portfolio weights
# -----------------------------
w = np.linspace(-3, 3, 500)

# expected returns (acts like slope p)
mu_values = [-0.5, 0.0, 0.5, 1.0]

# -----------------------------
# Plot risk function + linear reward
# -----------------------------
plt.figure(figsize=(7, 5))
plt.plot(w, f(w), label=”Risk cost f(w) = (λ/2)σ²w²”)

for mu in mu_values:
    # linear reward term μw
    plt.plot(w, mu * w, linestyle=”--”, alpha=0.6, label=f”μw (μ={mu})”)

    # Legendre-style objective
    F = mu * w - f(w)

    w_star = w[np.argmax(F)]
    f_star = f(w_star)

    plt.plot(w_star, f_star, ‘o’)

plt.title(”Mean–Variance as a Legendre Transform”)
plt.xlabel(”portfolio weight w”)
plt.ylabel(”value”)
plt.legend()
plt.grid()
plt.show()

# -----------------------------
# Dual function F(μ) = max_w (μw - f(w))
# -----------------------------
mu = 0.6
F = mu * w - f(w)

w_star = w[np.argmax(F)]
g_mu = np.max(F)

plt.figure(figsize=(7, 5))
plt.plot(w, F, label=f”F(μ,w)=μw - f(w), μ={mu}”)
plt.plot(w_star, g_mu, ‘ro’, label=”maximum”)

plt.title(”Mean–Variance: Dual Optimization View”)
plt.xlabel(”w”)
plt.ylabel(”F(μ,w)”)
plt.legend()
plt.grid()

plt.show()

Mean-Variance Portfolio Optimization as a Legendre Transform and Dual Optimization View

This code turns MVO into a Legendre transform problem. To resolve the risk–return trade-off, we maximize the cost function μw-λ(σw)**2/2, where w is the portfolio position (how much we invest), μw is the expected return (linear reward), and λ(σw)**2/2 is the risk penalty (quadratic cost). This implies that the risk penalty grows faster with position size and becomes more curved as either the risk-aversion parameter λ or the volatility σ increases.

Most algo trading content gives you theory.

This gives you the code.

3 Python strategies. Fully backtested. Colab notebook included. Plus a free ebook with 5 more strategies the moment you subscribe.

5,000 quant traders already run these:

Subscribe | AlgoEdge Insights

Daily algorithmic trading signals, market insights, and code-ready strategies to sharpen your edge.algoedgeinsights.beehiiv.com

Legendre Polynomial Expansion of Time Series

Suppose we have a sampled time series

y(t) ~ {yt}, t=0,1,…,N-1.

We can expand the series in Legendre basis functions [2]

yt ~ a0 P0(ut)+…+ap Pp(ut), p>0,

where

ut = -1 + 2t/(N-1),

Pk(*) are the Legendre polynomials of order k, and p is simply the truncation order of the Legendre series.

In practice, we cannot use infinitely many terms, so we cut it off at some finite level p<<N. In the time-series analysis described below, small p yields a smooth, coarse approximation capturing only low-frequency trends, while larger p gives a more detailed fit but may lead to overfitting by capturing noise when p is too large.

The coefficients in the discrete projection are expressed as

ak ~ [(2k+1)/(N-1)] * {y0 P0(u0)+…+ym Pm(um)}, m=N-1.

This is a discrete projection approximation, not exact orthogonality. Accuracy improves when N is large and y(t) is smooth. A uniform grid is acceptable.

Relation to Nonlinear Feature Extraction

The Legendre expansion defines a unique mapping y(t) <-> {a0,…,ap}, making the coefficients a low-dimensional representation of y(t). This naturally connects to nonlinear state-space modeling and dimensionality reduction, as it represents a high-dimensional time series using a small set of coefficients that retain most of its structure.

Example: Legendre Polynomial Approximation vs True Nonlinear Trend

The code below shows how Legendre polynomials can break a noisy signal into a smooth low frequency trend, finer nonlinear structure, and random noise.

import numpy as np
import matplotlib.pyplot as plt
from numpy.polynomial.legendre import legfit, legval

np.random.seed(42)

# =========================
# SYNTHETIC DATA
# =========================
N = 500
t = np.arange(N)
u = -1 + 2 * t / (N - 1) #generate 500 time points and rescale them to the interval [-1,1].

#Build the “true” nonlinear trend
true_trend = ( 
    50
    + 10 * np.sin(2 * np.pi * u * 2)
    + 5 * u**2
    + 8 * np.tanh(3 * u)
)
#Add heteroskedastic noise
volatility = 2.5 + 4 * (u + 1) / 2
noise = volatility * np.random.randn(N)
#This mimics financial data where variance changes over time.
price = true_trend + noise

# =========================
# LEGENDRE APPROXIMATION
# =========================
degree = 14  # high-order approximation

coeffs = legfit(u, price, deg=degree)
approx_trend = legval(u, coeffs) #This evaluates the fitted polynomial expansion.

# =========================
# OPTIONAL: SMOOTH TREND ONLY (LOW FREQUENCIES)
# =========================
low_degree = 2
coeffs_low = legfit(u, price, deg=low_degree)
smooth_trend = legval(u, coeffs_low) #captures only the broad macro shape.
#This is essentially a low frequency compression of the signal.

# =========================
# PLOT
# =========================
plt.figure(figsize=(12,6))

plt.plot(price, label=”Noisy Price”, alpha=0.5)
plt.plot(true_trend, label=”True Trend”, linewidth=2)
plt.plot(approx_trend, label=”14-Order Legendre”, linewidth=2)
plt.plot(smooth_trend, label=”2-Order Legendre”, linewidth=2)

plt.title(”Legendre Polynomial Approximation vs True Nonlinear Trend”)
plt.xlabel(”Time Stamp”)
plt.ylabel(”Price”)
plt.legend()
plt.grid()
plt.show()

Legendre Polynomial Approximation vs True Nonlinear Trend

Interpretations

The plot shows how different LP orders capture different levels of structure in the data.

The gray “Noisy Price” series contains both the underlying nonlinear trend and random volatility. The noise becomes larger toward the right side of the sample because the simulated volatility increases over time.

The “True Trend” line is the hidden smooth signal that generated the data. Since this is synthetic data, we know the exact underlying process and can compare the approximations directly against it.

The “14 Order Legendre” approximation tracks the true trend quite closely. This means the higher order Legendre basis is flexible enough to recover most of the nonlinear structure, including oscillations, curvature, and smooth turning points. It captures much of the systematic signal while ignoring a large portion of the random noise. However, because the order is relatively high, the approximation also starts reacting to some local fluctuations produced by the noise, especially in the higher volatility region.

The “2 Order Legendre” approximation behaves very differently. It only captures the broadest low frequency structure of the series: the overall level, directional slope, and large scale curvature. It cannot reproduce the oscillatory sinusoidal behavior or more detailed nonlinear deformations. As a result, it acts like a strong smoothing filter that compresses the signal into only a few dominant components.

Takeaway

LP provide a hierarchical decomposition of smooth structure. Lower order terms describe coarse global behavior, while higher order terms progressively add finer detail. Even though the dataset contains 500 observations, the smooth underlying dynamics can still be represented reasonably well using only a relatively small number of orthogonal polynomial components.

Bottom Line

Even if you sample something like 10,000 points, the smooth pattern is basically just a few things like level, slope, a bit of curvature, and maybe one or two extra wiggles, and LP are just a clean way to separate those pieces.

Causal Legendre Polynomials in Time Series Analysis

A useful way to understand a causal Legendre polynomial series expansion of a discrete-time signal is as a bridge between three ideas that are usually treated separately in signal processing: (1) local signal approximation via sliding-window (moving average) modeling, (2) orthogonal polynomial expansions using Legendre polynomials, and (3) real-time (online) computation without access to future data.

In classical approximation theory, we often expand a time series y(t) over a whole interval using an orthogonal basis:

y(t) ~ a0 P0(t) +…+ aK PK(t), K>0.

This is a global expansion in which all coefficients depend on the entire interval.

But a discrete-time signal y[n] is typically processed online, meaning that at time n, only the past and present samples y[0], y[1],…, y[n] are available, while future values are not. So instead of a global expansion, we shift to a moving local window ending at time n.

The Causal Data Segment

At each time sample n, we define a finite history (causal) window

{y[n], y[n−1],…,y[n−N+1]}.

That is an arithmetic sequence starting at n, ending at n-(N-1), with step −1, and it contains N terms.

To apply Legendre polynomials, we must map this discrete index set into a normalized interval

from m={0,…,N-1} to tm=[-1, 1], e.g., tm = [2m/(N-1)]-1.

Now the past window becomes a “mini-function” defined on [-1, 1].

The Causal Legendre Expansion

At time n, we approximate the recent signal segment as

y[n-m] ~ a0[n] P0(tm) +…+ aK[n] PK(tm).

This is a local polynomial model of the past, where k=0 represents the local mean (DC component), k=1 the slope or trend, k=2 the curvature, and higher orders k=3,…,K>3 capture increasingly detailed structure.

Here, the coefficients are obtained by projecting the past window onto each basis function

ak[n] = w0 y[n-0] Pk(t0) +…+ wm y[n-m] Pk(tm), m=N-1.

This is where causality enters explicitly: only the present and past samples y[n],y[n−1],… are used, no future samples appear, and the system can therefore operate in real time. So at each time step, we are effectively performing a least-squares polynomial fit over the past window, but expressed in an orthogonal basis.

The Practical Value

Local geometric interpretation

Here, a0[n] represents the local average level, a1[n] captures the local trend or derivative-like behavior, and a2[n] describes curvature or acceleration-like behavior.

Effective noise suppression

Truncating at small K acts as a structured low-pass filter while providing better numerical stability than monomial-based expansions.

Feature extraction for dynamics

The coefficient vector a[n] = {a0[n],…, aK[n]} provides a compact state description of the signal’s recent history.

Example: Causal Legendre Polynomial Approximation

The code below demonstrates a causal Legendre polynomial expansion for estimating the underlying trend of a noisy time series using only past information.

The key idea is that, at every time step, the model fits a smooth polynomial representation over a rolling historical window and then extrapolates the trend estimate to the current point without using future data.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from numpy.polynomial.legendre import legfit, legval

np.random.seed(42)

# =====================================================
# SYNTHETIC TIME SERIES
# =====================================================

N = 500
t = np.arange(N)

# normalize time to [-1, 1]
u = -1 + 2 * t / (N - 1)

# smooth latent signal
true_trend = (
    50
    + 8 * np.sin(2 * np.pi * u * 1.5)
    + 5 * u**2
    + 6 * np.tanh(3 * u)
)
‘’‘
The latent signal combines:

a constant level
oscillatory structure (sin)
smooth curvature (u**2)
nonlinear saturation (tanh)

This produces a realistic nonlinear trend.
‘’‘
# heteroskedastic noise
volatility = 2 + 3 * (u + 1) / 2
noise = volatility * np.random.randn(N)

#The volatility increases gradually over time.

#So the right side of the series becomes noisier than the left side.

# observed process
price = true_trend + noise

# =====================================================
# CAUSAL LEGENDRE ESTIMATION
# =====================================================

degree = 4
window = 100

causal_fit = np.full(N, np.nan)

for t_now in range(window, N):

    # ---------------------------------------------
    # ONLY USE PAST INFORMATION
    # ---------------------------------------------

    y_window = price[t_now - window:t_now]

    # local normalized coordinates
    local_x = np.linspace(-1, 1, window)

    # fit Legendre polynomial using ONLY past data
    coeffs = legfit(local_x, y_window, deg=degree)

    # estimate current endpoint
    # x=1 corresponds to “now”
    causal_fit[t_now] = legval(1, coeffs)

# =====================================================
# LOW-FREQUENCY CAUSAL FILTER
# =====================================================

low_degree = 2
smooth_fit = np.full(N, np.nan)

for t_now in range(window, N):

    y_window = price[t_now - window:t_now]
    local_x = np.linspace(-1, 1, window)

    coeffs_low = legfit(local_x, y_window, deg=low_degree)

    smooth_fit[t_now] = legval(1, coeffs_low)

plot_start = window

# =====================================================
# PLOT
# =====================================================

plt.figure(figsize=(14, 6))

plt.plot(
    t[plot_start:],
    price[plot_start:],
    alpha=0.4,
    label=”Observed Price”
)

plt.plot(
    t[plot_start:],
    true_trend[plot_start:],
    linewidth=2,
    label=”True Trend”
)

plt.plot(
    t[plot_start:],
    causal_fit[plot_start:],
    linewidth=2,
    label=f”Causal Legendre ({degree}-Order)”
)

plt.plot(
    t[plot_start:],
    smooth_fit[plot_start:],
    linewidth=2,
    label=f”Causal Low-Frequency ({low_degree}-Order)”
)

plt.title(”Causal Legendre Polynomial Expansion”)
plt.xlabel(”Time Sample”)
plt.ylabel(”Price”)

plt.legend()
plt.grid(True)

plt.show()

Causal Legendre Polynomial Approximation

In the code above, for every current time t_now, the model only uses the previous window observations. Therefore, the estimate at time t depends only on [t−window,…,t−1] with no future information leakage. This mimics a real forecasting environment.

Interpretations

The rolling window of 100 observations means that, at every time step, the estimator fits the Legendre expansion using only the previous 100 data points. This creates a locally adaptive trend estimate that evolves over time while remaining causal and free from future information leakage.

Using a 4th order Legendre expansion gives the model enough flexibility to capture the overall level, slope, broad curvature, and some smooth nonlinear oscillations without overreacting to local noise fluctuations.

In practice, the 4th order fit behaves like a medium frequency trend extractor. It is flexible enough to follow large nonlinear movements in the signal, but still smooth enough to suppress much of the random volatility.

The 2nd order approximation is much smoother because it only captures the average level, long term slope, and broad curvature of the signal.

As a result, the low degree estimate acts as a strong low pass filter. It ignores finer oscillations and reacts much more slowly to local changes in the series.

Using a rolling window of 100 observations makes the estimate smoother and more stable because it averages over a longer history, which reduces noise, dampens short term fluctuations, slows the response to turning points, and improves overall robustness.

The combination of degree=4 and window=100 therefore produces a fairly conservative causal trend estimator that prioritizes stable smooth structure over rapid local adaptation.

Takeaway

Conceptually, the code illustrates how LP can decompose a noisy time series into hierarchical smooth deformation modes while preserving causality. Lower order components capture broad macro structure, while higher order components progressively add finer nonlinear detail.

Bottom Line

A causal Legendre expansion transforms a time series into a stream of evolving polynomial shape descriptors, providing a multi-scale representation of local dynamics through a real-time least-squares projection onto orthogonal basis functions.

⑤ Most quant courses teach you to watch. This one makes you build.

Live. Weekly. With feedback on your actual code.

The AlgoEdge Quant Finance Bootcamp — 12 weeks of stochastic models, Black-Scholes, Heston, volatility surfaces, and exotic options. Built from scratch in Python.

Not pre-recorded. Not self-paced. Live sessions, weekly homework, direct feedback, and a full code library that’s yours to keep.

Cohort size is limited intentionally — so every question gets answered.

AlgoEdge Insights Quant Finance Bootcamp: Stochastic Models, Volatility, & Option Pricing Mastery

Legendre Trend Filtering on Stock Closing Prices

Below is a Python example using historical stock closing prices (e.g., from Yahoo Finance) and causal LP fitting to separate trend from noise, viz.

Fetch historical AAPL closing prices.
Rescale time to domain [−1,1].
Fit a Legendre polynomial of chosen degree to the prices.
Trend = fitted polynomial; Filtered series = original − trend.

import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
from numpy.polynomial.legendre import Legendre

# Download stock data
ticker = “AAPL”
data = yf.download(ticker, start=”2023-01-01”, end=”2024-01-01”)

# Extract closing prices and flatten
prices = data[”Close”].to_numpy().flatten()  # now shape (n,)
n = len(prices)
t = np.arange(n)

# Map time to [-1,1]
x = -1 + 2 * t / (n - 1)

trend = np.zeros_like(prices)
window = 30  # only use past 30 days

for i in range(window-1, len(prices)):
    x_window = np.linspace(-1, 1, window)  # scale past points to [-1,1]
    y_window = prices[i-window+1:i+1]
    coeffs = np.polynomial.legendre.legfit(x_window, y_window, deg=3)
    trend[i] = np.polynomial.legendre.legval(1, coeffs)  # value at most recent point

# Detrended / filtered series
detrended = prices - trend

# Plot
plt.figure(figsize=(10,6))
plt.plot(data.index[30:], prices[30:], label=”Original Prices”)
plt.plot(data.index[30:], trend[30:], label=f”Legendre Trend (deg {deg})”, color=”red”)
plt.legend()
plt.grid()
plt.show()

plt.figure(figsize=(10,3))
plt.plot(data.index[30:], detrended[30:], label=”Detrended”, color=”orange”)
plt.legend()
plt.grid()
plt.show()

Causal 3-Order Legendre Polynomial Trend Filtering on AAPL Stock Closing Prices

Interpretations

The trend component is similar to a moving average or low‑pass filter: it captures the primary, smooth evolution of price.
The detrended signal isolates short‑term fluctuations and noise that might matter in high‑frequency analysis or residual modeling.
Changing the polynomial degree controls smoothness, i.e. higher degrees fit more wiggles (risking overfit).

Takeaway

This LP fitting approach works well when you want to strip out a smooth underlying trend without the delay you usually get from moving averages. It’s especially helpful for feature engineering, where you separate the main trend from the leftover noise. But it’s not really designed for predicting returns, since financial returns are extremely noisy and hard to forecast reliably.

Causal Legendre Crossover Trading Strategy

In this section, we develop a crossover trading strategy based on the causal Legendre trend filter discussed above.

This is similar to moving-average crossovers, but uses two Legendre trends with different sensitivities: a fast trend (short window) and a slow trend (long window).

Specifically, we compute two causal Legendre trends: the fast trend (small window) reacts quickly to price changes, whereas the slow trend (larger window) is smoother and slower to react.

The crossover rules are defined as follows: a buy signal occurs when the fast trend crosses above the slow trend (upward momentum), while a sell signal is generated when the fast trend crosses below the slow trend, indicating downward momentum.

The strategy only uses past prices for both trends to avoid lookahead bias.

Part 1: Generating Trading Signals

import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
from numpy.polynomial.legendre import legfit, legval

# --- Load stock data ---
ticker = “AAPL”
data = yf.download(ticker, start=”2023-01-01”, end=”2024-01-01”)
prices = data[”Close”].to_numpy().flatten()
n = len(prices)

# --- Parameters ---
window_fast = 10   # fast Legendre trend
window_slow = 40   # slow Legendre trend
deg = 3            # polynomial degree

# --- Allocate arrays ---
trend_fast = np.full(n, np.nan)
trend_slow = np.full(n, np.nan)

# --- Causal Legendre trends ---
def causal_legendre(prices, window, deg):
    n = len(prices)
    trend = np.full(n, np.nan)
    for i in range(window-1, n):
        y_win = prices[i-window+1:i+1]
        x_win = np.linspace(-1, 1, window)
        coeffs = legfit(x_win, y_win, deg)
        trend[i] = legval(1, coeffs)  # evaluate at most recent point
    return trend

trend_fast = causal_legendre(prices, window_fast, deg)
trend_slow = causal_legendre(prices, window_slow, deg)

# --- Generate crossover signals ---
# 1 = buy, -1 = sell, 0 = hold
signal = np.zeros(n)
for i in range(1, n):
    if np.isnan(trend_fast[i-1]) or np.isnan(trend_slow[i-1]):
        continue
    if trend_fast[i-1] < trend_slow[i-1] and trend_fast[i] > trend_slow[i]:
        signal[i] = 1  # buy
    elif trend_fast[i-1] > trend_slow[i-1] and trend_fast[i] < trend_slow[i]:
        signal[i] = -1  # sell

# --- Plot results ---
plt.figure(figsize=(12,6))
plt.plot(data.index, prices, label=”Price”, color=”black”,alpha=0.3)
plt.plot(data.index, trend_fast, label=”Fast Legendre Trend”, color=”blue”,alpha=0.5)
plt.plot(data.index, trend_slow, label=”Slow Legendre Trend”, color=”cyan”,alpha=0.5)
plt.scatter(data.index[signal==1], prices[signal==1], marker=”^”, color=”green”, label=”Buy Signal”)
plt.scatter(data.index[signal==-1], prices[signal==-1], marker=”v”, color=”red”, label=”Sell Signal”)
plt.title(f”{ticker} Causal Legendre Trend Crossover Strategy”)
plt.xlabel(”Date”)
plt.ylabel(”Price”)
plt.grid()
plt.legend()
plt.show()

AAPL Causal Legendre Trend Crossover Strategy: Buy and Sell Trading Signals

Part 2: Backtesting

# --- Backtest ---
position = 0         # current position (0 = no position, 1 = long)
cash = 100000        # starting cash
equity = np.zeros(n) # portfolio value
shares = 0

for i in range(n):
    price = prices[i]
    if signal[i] == 1 and position == 0:  # buy
        shares = cash // price
        cash -= shares * price
        position = 1
    elif signal[i] == -1 and position == 1:  # sell
        cash += shares * price
        shares = 0
        position = 0
    equity[i] = cash + shares * price

# --- Plot equity curve ---
plt.figure(figsize=(12,6))
plt.plot(dates, equity, label=”Portfolio Equity”, color=”green”)
plt.plot(dates, prices / prices[0] * 100000, label=”Buy & Hold”, color=”gray”, alpha=0.5)
plt.title(f”{ticker} Causal Legendre Crossover Backtest”)
plt.xlabel(”Date”)
plt.ylabel(”Portfolio Value”)
plt.legend()
plt.grid()
plt.show()

# --- Print final results ---
print(f”Final equity: ${equity[-1]:.2f}”)
print(f”Buy & Hold final value: ${prices[-1]/prices[0]*100000:.2f}”)

Final equity: $129664.77
Buy & Hold final value: $154798.25

AAPL Causal Legendre Crossover Backtest

Clearly, the strategy underperformed buy & hold in this case. The market trend during the period was likely generally upward, so a passive strategy captured more of the upside.

Takeaway

Even if a strategy is systematic and avoids lookahead bias, it is not guaranteed to beat buy & hold, especially in strongly trending markets.

Sharpe-Optimized Causal Legendre Crossover Trading Signals

In this section, we construct a causal polynomial trend-following system using Legendre filters, generate trading signals from fast/slow trend crossovers, and select optimal filter windows by maximizing the strategy’s Sharpe ratio. The strategy operates only on historical closing prices.

import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
from numpy.polynomial.legendre import legfit, legval

# --- Load stock data ---
ticker = “AAPL”
data = yf.download(ticker, start=”2020-01-01”, end=”2026-05-21”)
prices = data[”Close”].to_numpy().flatten()
n = len(prices)

# --- Parameters ---
deg = 3
fast_range = range(5, 30, 2)
slow_range = range(20, 70, 2)

# --- Causal Legendre trend ---
def causal_legendre(prices, window, deg):
    n = len(prices)
    trend = np.full(n, np.nan)
    for i in range(window-1, n):
        y_win = prices[i-window+1:i+1]
        x_win = np.linspace(-1, 1, window)
        coeffs = legfit(x_win, y_win, deg)
        trend[i] = legval(1, coeffs)
    return trend

# --- Backtest function to get Sharpe ---
def backtest_sharpe(prices, fast_window, slow_window, deg):
    trend_fast = causal_legendre(prices, fast_window, deg)
    trend_slow = causal_legendre(prices, slow_window, deg)
    signal = np.zeros(n)
    for i in range(1, n):
        if np.isnan(trend_fast[i-1]) or np.isnan(trend_slow[i-1]):
            continue
        if trend_fast[i-1] < trend_slow[i-1] and trend_fast[i] > trend_slow[i]:
            signal[i] = 1
        elif trend_fast[i-1] > trend_slow[i-1] and trend_fast[i] < trend_slow[i]:
            signal[i] = -1
    # Backtest equity
    position = 0
    cash = 100000
    equity = np.zeros(n)
    shares = 0
    for i in range(n):
        price = prices[i]
        if signal[i] == 1 and position == 0:
            shares = cash // price
            cash -= shares * price
            position = 1
        elif signal[i] == -1 and position == 1:
            cash += shares * price
            shares = 0
            position = 0
        equity[i] = cash + shares * price
    returns = np.diff(equity) / equity[:-1]
    if np.std(returns) == 0:
        return -np.inf
    sharpe = np.mean(returns) / np.std(returns) * np.sqrt(252)
    return sharpe

# --- Grid search ---
sharpe_grid = np.full((len(fast_range), len(slow_range)), np.nan)

for i, fast in enumerate(fast_range):
    for j, slow in enumerate(slow_range):
        if fast >= slow:
            continue
        sharpe_grid[i, j] = backtest_sharpe(prices, fast, slow, deg)

Extracting the best-performing parameter combination from the Sharpe ratio grid search

# --- Find max Sharpe point ---
max_idx = np.nanargmax(sharpe_grid)
max_i, max_j = np.unravel_index(max_idx, sharpe_grid.shape)

best_fast = list(fast_range)[max_i]
best_slow = list(slow_range)[max_j]
best_sharpe = sharpe_grid[max_i, max_j]

# --- Plot Heatmap ---
plt.figure(figsize=(10,6))

im = plt.imshow(
    sharpe_grid,
    origin=’lower’,
    aspect=’auto’,
    extent=[
        slow_range[0],
        slow_range[-1],
        fast_range[0],
        fast_range[-1]
    ],
    cmap=’viridis’
)

plt.colorbar(im, label=’Sharpe Ratio’)

# --- Mark optimal point ---
plt.scatter(
    best_slow,
    best_fast,
    marker=’X’,
    s=250,
    edgecolors=’white’,
    linewidths=2,
    label=f’Max Sharpe = {best_sharpe:.2f}’
)

# --- Annotate point ---
plt.text(
    best_slow + 1,
    best_fast + 1,
    f’({best_fast}, {best_slow})\nSR={best_sharpe:.2f}’,
    color=’white’,
    fontsize=10,
    bbox=dict(facecolor=’black’, alpha=0.7)
)

plt.xlabel(’Slow Window’)
plt.ylabel(’Fast Window’)
plt.title(’Sharpe Ratio Heatmap: Causal Legendre Crossover’)

plt.legend()
plt.show()

# --- Print best parameters ---
print(f”Best Fast Window: {best_fast}”)
print(f”Best Slow Window: {best_slow}”)
print(f”Best Sharpe Ratio: {best_sharpe:.2f}”)

Best Fast Window: 5
Best Slow Window: 22
Best Sharpe Ratio: 1.10

Sharpe Ratio (SR) Heatmap: Causal Legendre Crossover

Backtesting the causal Legendre crossover trading strategy using window length parameters previously selected from the Sharpe-ratio optimization step.

# --- Fixed Parameters ---
window_fast = 5
window_slow = 22
deg = 3
starting_cash = 100000

# --- Causal Legendre Trend Function ---
def causal_legendre(prices, window, deg):
    n = len(prices)
    trend = np.full(n, np.nan)
    for i in range(window-1, n):
        y_win = prices[i-window+1:i+1]
        x_win = np.linspace(-1, 1, window)
        coeffs = legfit(x_win, y_win, deg)
        trend[i] = legval(1, coeffs)  # evaluate at current time
    return trend

# --- Compute trends ---
trend_fast = causal_legendre(prices, window_fast, deg)
trend_slow = causal_legendre(prices, window_slow, deg)

# --- Generate crossover signals ---
signal = np.zeros(n)
for i in range(1, n):
    if np.isnan(trend_fast[i-1]) or np.isnan(trend_slow[i-1]):
        continue
    if trend_fast[i-1] < trend_slow[i-1] and trend_fast[i] > trend_slow[i]:
        signal[i] = 1  # buy
    elif trend_fast[i-1] > trend_slow[i-1] and trend_fast[i] < trend_slow[i]:
        signal[i] = -1  # sell

# --- Backtest causal strategy ---
position = 0
cash = starting_cash
equity = np.zeros(n)
shares = 0

for i in range(n):
    price = prices[i]
    if signal[i] == 1 and position == 0:
        shares = cash // price
        cash -= shares * price
        position = 1
    elif signal[i] == -1 and position == 1:
        cash += shares * price
        shares = 0
        position = 0
    equity[i] = cash + shares * price

# Close final open position
if position == 1:
    equity[-1] += shares * prices[-1]

# --- Buy & Hold Equity ---
buy_hold_equity = starting_cash / prices[0] * prices

# --- Compute Performance Metrics ---
def compute_metrics(equity, prices, signals):
    returns = np.diff(equity)/equity[:-1]
    total_return = (equity[-1]/equity[0] - 1) * 100
    running_max = np.maximum.accumulate(equity)
    max_drawdown = np.max((running_max - equity)/running_max) * 100
    days = (dates[-1] - dates[0]).days
    years = days / 365.25
    cagr = (equity[-1]/equity[0])**(1/years) - 1
    sharpe = np.mean(returns)/np.std(returns) * np.sqrt(252) if np.std(returns) > 0 else 0
    
    # Trades and win rate
    buy_idx = np.where(signals == 1)[0]
    sell_idx = np.where(signals == -1)[0]
    if len(buy_idx) > len(sell_idx):
        sell_idx = np.append(sell_idx, len(equity)-1)
    paired_buys = []
    paired_sells = []
    j = 0
    for b in buy_idx:
        while j < len(sell_idx) and sell_idx[j] < b:
            j += 1
        if j < len(sell_idx):
            paired_buys.append(b)
            paired_sells.append(sell_idx[j])
            j += 1
    paired_buys = np.array(paired_buys, dtype=int)
    paired_sells = np.array(paired_sells, dtype=int)
    trade_profits = prices[paired_sells] - prices[paired_buys]
    total_trades = len(trade_profits)
    win_rate = np.sum(trade_profits > 0)/total_trades*100 if total_trades > 0 else 0
    
    return total_return, cagr, max_drawdown, sharpe, total_trades, win_rate

# --- Strategy Metrics ---
crossover_metrics = compute_metrics(equity, prices, signal)
buyhold_metrics = compute_metrics(buy_hold_equity, prices, np.zeros(n))

print(”=== Causal Legendre Crossover ===”)
print(f”Total Return: {crossover_metrics[0]:.1f}%”)
print(f”CAGR: {crossover_metrics[1]*100:.1f}%”)
print(f”Max Drawdown: {crossover_metrics[2]:.1f}%”)
print(f”Sharpe Ratio: {crossover_metrics[3]:.2f}”)
print(f”Total Trades: {crossover_metrics[4]}”)
print(f”Win Rate: {crossover_metrics[5]:.1f}%”)

print(”\n=== Buy & Hold ===”)
print(f”Total Return: {buyhold_metrics[0]:.1f}%”)
print(f”CAGR: {buyhold_metrics[1]*100:.1f}%”)
print(f”Max Drawdown: {buyhold_metrics[2]:.1f}%”)
print(f”Sharpe Ratio: {buyhold_metrics[3]:.2f}”)
print(f”Total Trades: {buyhold_metrics[4]}”)
print(f”Win Rate: {buyhold_metrics[5]:.1f}%”)

# --- Plot Equity Curves ---
plt.figure(figsize=(12,6))
plt.plot(dates, equity, label=”Legendre Crossover”, color=”green”)
plt.plot(dates, buy_hold_equity, label=”Buy & Hold”, alpha=0.7)
plt.title(f”{ticker} Fixed-Parameter Historical Simulation”)
plt.xlabel(”Date”)
plt.ylabel(”Portfolio Value”)
plt.xlim(pd.Timestamp(”2025-01-01”),
         pd.Timestamp(”2026-05-19”))
plt.legend()
plt.grid()
plt.show()

=== Causal Legendre Crossover ===
Total Return: 702.8%
CAGR: 38.6%
Max Drawdown: 25.2%
Sharpe Ratio: 0.88
Total Trades: 203
Win Rate: 52.2%

=== Buy & Hold ===
Total Return: 315.4%
CAGR: 25.0%
Max Drawdown: 33.4%
Sharpe Ratio: 0.87

AAPL Fixed-Parameter Historical Simulation

Takeaways

These results suggest that the causal Legendre crossover strategy substantially outperformed a passive buy-and-hold investment in AAPL over the tested historical period, while also slightly reducing downside risk.
The relatively high trade frequency suggests that transaction costs may materially affect real-world performance, and further out-of-sample validation is necessary to assess robustness and mitigate overfitting concerns.

In-Sample Grid Search and Out-of-Sample Backtesting

In this section, we separate historical data into an in-sample period (training) and a future unseen period (testing).

# For example, 80% training, 20% testing
split_idx = int(len(prices) * 0.8)

# Training (in-sample)
prices_train = prices[:split_idx]
dates_train = dates[:split_idx]

# Testing (out-of-sample)
prices_test = prices[split_idx:]
dates_test = dates[split_idx:]

The training set represents in-sample data the strategy is “allowed to learn from.” In the training phase, the model is used to tune window lengths, select the polynomial degree, and potentially optimize the Sharpe ratio.

The testing set (out-of-sample data) is used for evaluating performance, simulating real-world trading, and checking generalization. This is “future data” the model has never seen during development.

# Compute causal trends on test set
trend_fast_test = causal_legendre(prices_test, window_fast, deg)
trend_slow_test = causal_legendre(prices_test, window_slow, deg)

# Generate signals on test set
signal_test = np.zeros(len(prices_test))
for i in range(1, len(prices_test)):
    if np.isnan(trend_fast_test[i-1]) or np.isnan(trend_slow_test[i-1]):
        continue
    if trend_fast_test[i-1] < trend_slow_test[i-1] and trend_fast_test[i] > trend_slow_test[i]:
        signal_test[i] = 1
    elif trend_fast_test[i-1] > trend_slow_test[i-1] and trend_fast_test[i] < trend_slow_test[i]:
        signal_test[i] = -1

# Backtest on test set
position = 0
cash = 100000
equity_test = np.zeros(len(prices_test))
shares = 0

for i in range(len(prices_test)):
    price = prices_test[i]
    if signal_test[i] == 1 and position == 0:
        shares = cash // price
        cash -= shares * price
        position = 1
    elif signal_test[i] == -1 and position == 1:
        cash += shares * price
        shares = 0
        position = 0
    equity_test[i] = cash + shares * price

# Close last open position
if position == 1:
    equity_test[-1] += shares * prices_test[-1]

import numpy as np

def compute_metrics(equity, prices, signals, dates):
    # Daily returns
    returns = np.diff(equity) / equity[:-1]
    
    # Total Return (%)
    total_return = (equity[-1] / equity[0] - 1) * 100
    
    # Max Drawdown (%)
    running_max = np.maximum.accumulate(equity)
    max_drawdown = np.max((running_max - equity) / running_max) * 100
    
    # CAGR
    days = (dates[-1] - dates[0]).days
    years = days / 365.25
    cagr = (equity[-1] / equity[0])**(1/years) - 1
    
    # Sharpe Ratio (annualized)
    sharpe = np.mean(returns) / np.std(returns) * np.sqrt(252) if np.std(returns) > 0 else 0
    
    # Total Trades and Win Rate
    buy_idx = np.where(signals == 1)[0]
    sell_idx = np.where(signals == -1)[0]
    
    # Close last open trade
    if len(buy_idx) > len(sell_idx):
        sell_idx = np.append(sell_idx, len(equity)-1)
    
    # Pair buys with next sell
    paired_buys = []
    paired_sells = []
    j = 0
    for b in buy_idx:
        while j < len(sell_idx) and sell_idx[j] < b:
            j += 1
        if j < len(sell_idx):
            paired_buys.append(b)
            paired_sells.append(sell_idx[j])
            j += 1
    
    paired_buys = np.array(paired_buys, dtype=int)
    paired_sells = np.array(paired_sells, dtype=int)
    
    trade_profits = prices[paired_sells] - prices[paired_buys]
    total_trades = len(trade_profits)
    win_rate = np.sum(trade_profits > 0) / total_trades * 100 if total_trades > 0 else 0
    
    return total_return, cagr, max_drawdown, sharpe, total_trades, win_rate

# --- Compute metrics on test data ---
test_metrics = compute_metrics(equity_test, prices_test, signal_test, dates_test)

print(”=== Out-of-Sample Test Metrics ===”)
print(f”Total Return: {test_metrics[0]:.1f}%”)
print(f”CAGR: {test_metrics[1]*100:.1f}%”)
print(f”Max Drawdown: {test_metrics[2]:.1f}%”)
print(f”Sharpe Ratio: {test_metrics[3]:.2f}”)
print(f”Total Trades: {test_metrics[4]}”)
print(f”Win Rate: {test_metrics[5]:.1f}%”)

=== Out-of-Sample Test Metrics ===
Total Return: 255.4%
CAGR: 171.3%
Max Drawdown: 7.9%
Sharpe Ratio: 1.38
Total Trades: 35
Win Rate: 60.0%

Plotting Out-of-Sample Equity Curve vs Buy & Hold

import matplotlib.pyplot as plt
import numpy as np

# --- Buy & Hold Equity on Test Set ---
buy_hold_test = 100000 / prices_test[0] * prices_test

# --- Buy/Sell indices ---
buy_idx = np.where(signal_test == 1)[0]
sell_idx = np.where(signal_test == -1)[0]

# --- Plot ---
plt.figure(figsize=(14,7))

# Strategy equity
plt.plot(
    dates_test,
    equity_test,
    label=”Legendre Crossover (OOS)”,
    linewidth=2
)

# Buy & Hold
plt.plot(
    dates_test,
    buy_hold_test,
    label=”Buy & Hold”,
    linestyle=”--”,
    alpha=0.8
)

# --- Mark BUY trades ---
plt.scatter(
    dates_test[buy_idx],
    equity_test[buy_idx],
    marker=”^”,
    s=120,
    label=”Buy Signal”
)

# --- Mark SELL trades ---
plt.scatter(
    dates_test[sell_idx],
    equity_test[sell_idx],
    marker=”v”,
    s=120,
    label=”Sell Signal”
)

# --- Formatting ---
plt.title(”Out-of-Sample Equity Curve vs Buy & Hold”)
plt.xlabel(”Date”)
plt.ylabel(”Portfolio Value”)
plt.legend()
plt.xlim(pd.Timestamp(”2025-03-19”),
         pd.Timestamp(”2026-05-19”))
plt.grid(True)

plt.show()

Out-of-Sample Equity Curve vs Buy & Hold

Visualizing AAPL Price with Causal Legendre Buy/Sell Signals

# --- Plot price with buy/sell signals ---

plt.figure(figsize=(14,7))

# Price curve
plt.plot(
    dates_test,
    prices_test,
    label=”AAPL Price”,
    linewidth=2
)

# Buy markers
buy_idx = np.where(signal_test == 1)[0]

plt.scatter(
    dates_test[buy_idx],
    prices_test[buy_idx],
    marker=”^”,
    s=140,
    label=”Buy”,
    zorder=5
)

# Sell markers
sell_idx = np.where(signal_test == -1)[0]

plt.scatter(
    dates_test[sell_idx],
    prices_test[sell_idx],
    marker=”v”,
    s=140,
    label=”Sell”,
    zorder=5
)

# Optional: overlay Legendre trends
plt.plot(
    dates_test,
    trend_fast_test,
    linestyle=”--”,
    alpha=0.8,
    label=f”Fast Legendre ({window_fast})”
)

plt.plot(
    dates_test,
    trend_slow_test,
    linestyle=”--”,
    alpha=0.8,
    label=f”Slow Legendre ({window_slow})”
)

# Formatting
plt.title(”AAPL Price with Causal Legendre Buy/Sell Signals”)
plt.xlabel(”Date”)
plt.ylabel(”Price”)
plt.legend()
plt.grid(True)

plt.show()

AAPL Price with Causal Legendre Buy/Sell Signals

These results describe how the strategy performed on unseen (out-of-sample) data, meaning this is the part of the backtest that most closely simulates real trading conditions.

Insights

The strategy more than tripled the initial capital on unseen out-of-sample data, growing the portfolio to roughly 3.55 times its starting value. This suggests the model was able to identify exploitable market structure beyond the training period.
The 171.3% CAGR is exceptionally high for a trading strategy, although such a large annualized return may partly reflect a relatively short out-of-sample period.
The maximum drawdown of 7.9% is relatively low, indicating that the strategy avoided large portfolio declines and maintained strong risk control relative to its returns.
The out-of-sample Sharpe ratio of 1.38 indicates strong risk-adjusted performance, suggesting that the strategy generated not only high returns but also relatively stable returns, and its improvement over the earlier in-sample Sharpe ratio of approximately 0.88 may indicate better generalization to unseen market data.
The strategy executed only 35 trades during the out-of-sample period, reflecting relatively low turnover and implying that it likely acted only on stronger signals, which can help reduce transaction costs, noise sensitivity, and exposure to whipsaw market conditions.
A 60.0% win rate is relatively strong for a trading strategy and, when combined with the high overall returns, implies that profitable trades likely outweighed losing trades by a meaningful margin.

Takeaway

The strategy shows very strong performance with controlled risk and relatively low trading activity, which is generally a positive sign for robustness.

Causal Time-Series Trend Estimation and Multi-Horizon Forecasting Model

Let’s predict where the price will be n=1, 3, 5, and 10 steps ahead using a causal Legendre model, then go long or short depending on whether the forecast is above or below the current price.

The function causal_legendre_forecast creates a causal multi-step-ahead price forecasting model using rolling Legendre polynomial regression.

import numpy as np
from numpy.polynomial.legendre import legfit, legval

def causal_legendre_forecast(prices, window, deg, horizon):
    n = len(prices)

    forecast = np.full(n, np.nan)

    # rolling normalized coordinates
    x_win = np.linspace(-1, 1, window)

    # future extrapolation coordinate
    x_future = 1 + 2 * horizon / (window - 1)

    for i in range(window - 1, n):

        y_win = prices[i-window+1:i+1]

        coeffs = legfit(x_win, y_win, deg)

        # extrapolated forecast
        forecast[i] = legval(x_future, coeffs)

    return forecast

Generating an n-step-ahead causal price forecast using a Legendre polynomial model

# 1 step ahead
forecast_1 = causal_legendre_forecast(
    prices_test,
    window_fast,
    deg,
    horizon=1
)
# 3 steps ahead
forecast_3 = causal_legendre_forecast(
    prices_test,
    window_fast,
    deg,
    horizon=3
)
# 5 steps ahead
forecast_5 = causal_legendre_forecast(
    prices_test,
    window_fast,
    deg,
    horizon=5
)
# 10 steps ahead
forecast_10 = causal_legendre_forecast(
    prices_test,
    window_fast,
    deg,
    horizon=10
)

Computing trading signals and equity curves from forecasts. For the sake of simplicity, we restrict ourselves to a simple forecast-driven long-only trading system based on the n-step-ahead price forecast.

# 5 steps ahead

signal_5 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    if np.isnan(forecast_5[i]):
        continue

    # bullish forecast
    if forecast_5[i] > prices_test[i]:
        signal_5[i] = 1

    # bearish forecast
    elif forecast_5[i] < prices_test[i]:
        signal_5[i] = -1

cash = 100000
position = 0
shares = 0

equity_5 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    price = prices_test[i]

    if signal_5[i] == 1 and position == 0:

        shares = cash // price
        cash -= shares * price
        position = 1

    elif signal_5[i] == -1 and position == 1:

        cash += shares * price
        shares = 0
        position = 0

    equity_5[i] = cash + shares * price

cash = 100000
position = 0
shares = 0

# 1 step ahead

signal_1 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    if np.isnan(forecast_1[i]):
        continue

    # bullish forecast
    if forecast_1[i] > prices_test[i]:
        signal_1[i] = 1

    # bearish forecast
    elif forecast_1[i] < prices_test[i]:
        signal_1[i] = -1

equity_1 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    price = prices_test[i]

    if signal_1[i] == 1 and position == 0:

        shares = cash // price
        cash -= shares * price
        position = 1

    elif signal_1[i] == -1 and position == 1:

        cash += shares * price
        shares = 0
        position = 0

    equity_1[i] = cash + shares * price

#3 steps ahead

signal_3 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    if np.isnan(forecast_3[i]):
        continue

    # bullish forecast
    if forecast_3[i] > prices_test[i]:
        signal_3[i] = 1

    # bearish forecast
    elif forecast_3[i] < prices_test[i]:
        signal_3[i] = -1

equity_3 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    price = prices_test[i]

    if signal_3[i] == 1 and position == 0:

        shares = cash // price
        cash -= shares * price
        position = 1

    elif signal_3[i] == -1 and position == 1:

        cash += shares * price
        shares = 0
        position = 0

    equity_3[i] = cash + shares * price

#10 steps ahead

signal_10 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    if np.isnan(forecast_10[i]):
        continue

    # bullish forecast
    if forecast_10[i] > prices_test[i]:
        signal_10[i] = 1

    # bearish forecast
    elif forecast_10[i] < prices_test[i]:
        signal_10[i] = -1

cash = 100000
position = 0
shares = 0

equity_10 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    price = prices_test[i]

    if signal_10[i] == 1 and position == 0:

        shares = cash // price
        cash -= shares * price
        position = 1

    elif signal_10[i] == -1 and position == 1:

        cash += shares * price
        shares = 0
        position = 0

    equity_10[i] = cash + shares * price

Plotting equity curves for forecasts with n=1, 3, 5, and 10 steps ahead

plt.figure(figsize=(14,7))

plt.plot(
    dates_test,
    equity_1,
    label=”1-Step Forecast”
)

plt.plot(
    dates_test,
    equity_3,
    label=”3-Step Forecast”
)

plt.plot(
    dates_test,
    equity_5,
    label=”5-Step Forecast”
)

plt.plot(
    dates_test,
    equity_10,
    label=”10-Step Forecast”
)

plt.plot(
    dates_test,
    buy_hold_test,
    linestyle=”--”,
    alpha=0.7,
    label=”Buy & Hold”
)

plt.title(”Out-of-Sample Forecast Strategies”)
plt.xlabel(”Date”)
plt.ylabel(”Portfolio Value”)
plt.legend()
plt.grid(True)

plt.show()

Out-of-Sample Forecast Strategies

Examining additional backtesting metrics for the 5-step forecast strategy (best performer)

import numpy as np

# ============================================================
# 5-STEP FORECAST BACKTEST METRICS
# ============================================================

# --- Build 5-step forecast signals ---
signal_5 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    if np.isnan(forecast_5[i]):
        continue

    # bullish forecast
    if forecast_5[i] > prices_test[i]:
        signal_5[i] = 1

    # bearish forecast
    elif forecast_5[i] < prices_test[i]:
        signal_5[i] = -1

# ============================================================
# BACKTEST
# ============================================================

starting_cash = 100000

cash = starting_cash
position = 0
shares = 0

equity_5 = np.zeros(len(prices_test))

for i in range(len(prices_test)):

    price = prices_test[i]

    # BUY
    if signal_5[i] == 1 and position == 0:

        shares = cash // price
        cash -= shares * price
        position = 1

    # SELL
    elif signal_5[i] == -1 and position == 1:

        cash += shares * price
        shares = 0
        position = 0

    equity_5[i] = cash + shares * price

# Close final open position
if position == 1:

    cash += shares * prices_test[-1]
    shares = 0

    equity_5[-1] = cash

# ============================================================
# BUY & HOLD
# ============================================================

buy_hold_test = (
    starting_cash / prices_test[0]
) * prices_test

# ============================================================
# METRICS FUNCTION
# ============================================================

def compute_metrics(
    equity,
    prices,
    signals,
    dates
):

    # -------------------------
    # RETURNS
    # -------------------------

    returns = np.diff(equity) / equity[:-1]

    # -------------------------
    # TOTAL RETURN
    # -------------------------

    total_return = (
        equity[-1] / equity[0] - 1
    ) * 100

    # -------------------------
    # CAGR
    # -------------------------

    days = (
        dates[-1] - dates[0]
    ).days

    years = days / 365.25

    cagr = (
        equity[-1] / equity[0]
    )**(1 / years) - 1

    # -------------------------
    # MAX DRAWDOWN
    # -------------------------

    running_max = np.maximum.accumulate(equity)

    drawdown = (
        running_max - equity
    ) / running_max

    max_drawdown = np.max(drawdown) * 100

    # -------------------------
    # SHARPE RATIO
    # -------------------------

    sharpe = (
        np.mean(returns)
        /
        np.std(returns)
    ) * np.sqrt(252)

    # =====================================================
    # TRADE ANALYSIS
    # =====================================================

    buy_idx = np.where(signals == 1)[0]
    sell_idx = np.where(signals == -1)[0]

    # close open trade
    if len(buy_idx) > len(sell_idx):

        sell_idx = np.append(
            sell_idx,
            len(prices) - 1
        )

    paired_buys = []
    paired_sells = []

    j = 0

    for b in buy_idx:

        while (
            j < len(sell_idx)
            and
            sell_idx[j] < b
        ):
            j += 1

        if j < len(sell_idx):

            paired_buys.append(b)
            paired_sells.append(sell_idx[j])

            j += 1

    paired_buys = np.array(
        paired_buys,
        dtype=int
    )

    paired_sells = np.array(
        paired_sells,
        dtype=int
    )

    trade_profits = (
        prices[paired_sells]
        -
        prices[paired_buys]
    )

    total_trades = len(trade_profits)

    win_rate = (
        np.sum(trade_profits > 0)
        /
        total_trades
        * 100
    ) if total_trades > 0 else 0

    return {
        “Total Return (%)”: total_return,
        “CAGR (%)”: cagr * 100,
        “Max Drawdown (%)”: max_drawdown,
        “Sharpe Ratio”: sharpe,
        “Total Trades”: total_trades,
        “Win Rate (%)”: win_rate
    }

# ============================================================
# COMPUTE METRICS
# ============================================================

metrics_5 = compute_metrics(
    equity_5,
    prices_test,
    signal_5,
    dates_test
)

metrics_bh = compute_metrics(
    buy_hold_test,
    prices_test,
    np.zeros(len(prices_test)),
    dates_test
)

# ============================================================
# PRINT RESULTS
# ============================================================

print(”\n=== 5-Step Forecast Strategy ===\n”)

for k, v in metrics_5.items():

    if isinstance(v, float):
        print(f”{k}: {v:.2f}”)
    else:
        print(f”{k}: {v}”)

print(”\n=== Buy & Hold ===\n”)

for k, v in metrics_bh.items():

    if isinstance(v, float):
        print(f”{k}: {v:.2f}”)
    else:
        print(f”{k}: {v}”)


=== 5-Step Forecast Strategy ===

Total Return (%): 23.34
CAGR (%): 17.96
Max Drawdown (%): 19.78
Sharpe Ratio: 0.99
Total Trades: 145
Win Rate (%): 54.48

=== Buy & Hold ===

Total Return (%): 32.67
CAGR (%): 24.93
Max Drawdown (%): 30.22
Sharpe Ratio: 0.88
Total Trades: 0
Win Rate (%): 0

Insights

The strategy generated a total return of 23.34% and a CAGR of 17.96%, compared with 32.67% and 24.93% for buy-and-hold. Thus, the passive benchmark outperformed in terms of raw profitability.
However, the strategy substantially reduced risk exposure. Its max drawdown was 19.78%, considerably lower than the 30.22% drawdown of buy-and-hold. This indicates that the strategy was more effective at limiting losses during adverse market periods.
The strategy also achieved a higher Sharpe ratio (0.99 vs. 0.88), implying better risk-adjusted returns. In other words, each unit of risk taken by the strategy generated more return than the passive benchmark.
The system executed 145 trades with a win rate of 54.48%, reflecting modest but persistent predictive power. Since the win rate is only moderately above 50%, profitability likely comes from capturing trends and controlling losses rather than from extremely high forecasting accuracy.

Takeaway

The 5-step forecast strategy produced lower absolute returns than the buy-and-hold benchmark, but it achieved superior risk-adjusted performance and significantly reduced downside risk.

GuruFinance Insights

Discussion about this post

Ready for more?

GuruFinance Insights

Nonlinear Feature Extraction for Financial Time Series Prediction Using Causal Legendre Polynomials

From FinTech Theory to Algorithmic Trading: Building a Complete Python Pipeline with Legendre Polynomials without Lookahead Bias 🔁🤖

Contents

Legendre Polynomials from Scratch

② One strategy in this book returned 2.3× the S&P 500 on a risk-adjusted basis over 5 years.

Fully coded in Python. Yours to run today.

AlgoEdge Insights: 30+ Python-Powered Trading Strategies – The Complete 2026 Playbook

What Is the Legendre Transform?

Legendre Transform in Portfolio Optimization

Legendre Polynomial Expansion of Time Series

Causal Legendre Polynomials in Time Series Analysis

AlgoEdge Insights Quant Finance Bootcamp: Stochastic Models, Volatility, & Option Pricing Mastery

Legendre Trend Filtering on Stock Closing Prices

Causal Legendre Crossover Trading Strategy

Sharpe-Optimized Causal Legendre Crossover Trading Signals

In-Sample Grid Search and Out-of-Sample Backtesting

Causal Time-Series Trend Estimation and Multi-Horizon Forecasting Model

Discussion about this post

Ready for more?