12 February, 2026
I first encountered the Signature Method while researching for my thesis. It remains one of the most elegant mathematical frameworks I’ve come across — a perspective shift that fundamentally changes how we think about sequential data.
In standard machine learning, we often treat data as static points in high-dimensional space. A vector is a point. An image is a point. But reality is rarely static. Financial markets tick irregularly, handwriting flows with varying velocity, biological signals oscillate continuously. Reality is a path.
The central question becomes: how can we represent a continuous path — where order matters just as much as magnitude — in a way that is invariant to sampling and mathematically principled? The answer lies in rough path theory and the Signature Transform. In a precise sense, the signature is the Taylor expansion of a path.
The Path Integral Construction
Let \( X : [a,b] \to \mathbb{R}^d \) be a continuous path of bounded variation. The bounded variation assumption ensures that Riemann–Stieltjes integrals are well-defined. The signature of \( X \) over \([a,b]\) is the infinite collection of its iterated integrals.
Level 0:
\[ S(X)_{a,b}^\emptyset = 1 \]
Level 1:
\[ S(X)^i_{a,b} = \int_a^b dX^i_t = X^i_b - X^i_a \]
This captures the total increment of each coordinate.
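Since the level-1 integral telescopes, it depends only on the endpoints. A minimal NumPy sketch (my own illustration; the path values are made up):

```python
import numpy as np

# A piecewise linear path in R^2, given by its sample points (rows).
X = np.array([[0.0, 1.0], [2.0, -1.0], [3.0, 4.0]])

# Level 1: the sum of increments telescopes to X_b - X_a.
S1 = np.sum(np.diff(X, axis=0), axis=0)
assert np.allclose(S1, X[-1] - X[0])
print(S1)  # [3. 3.]
```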
Level 2: Geometry Emerges
\[ S(X)^{i,j}_{a,b} = \int_{a < r < s < b} dX^i_r \, dX^j_s \]
Integration by parts yields the symmetric part:
\[ S(X)^{i,j}_{a,b} + S(X)^{j,i}_{a,b} = (X^i_b - X^i_a)(X^j_b - X^j_a) \]
The antisymmetric component defines the Lévy area:
\[ \mathcal{A}^{i,j} = \frac{1}{2}\big(S(X)^{i,j}_{a,b} - S(X)^{j,i}_{a,b}\big) \]
This is the signed area between the path and its chord: genuinely geometric information that the coordinate increments alone cannot see.
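To make levels 1 and 2 concrete, here is a short NumPy sketch (my own illustration, not code from any signature library). It accumulates the exact level-2 signature of a piecewise linear path segment by segment, using the fact that a linear segment with increment \( v \) contributes \( v \) at level 1 and \( v \otimes v / 2 \) at level 2, then checks the symmetric identity and reads off the Lévy area:

```python
import numpy as np

def sig_level2(X):
    """Exact level-1 and level-2 signature of the piecewise linear path
    through the rows of X (shape: n_points x d). Each segment with
    increment v contributes v at level 1 and v⊗v/2 at level 2; the
    running cross term outer(S1, v) stitches segments together."""
    d = X.shape[1]
    S1, S2 = np.zeros(d), np.zeros((d, d))
    for v in np.diff(X, axis=0):
        S2 += np.outer(S1, v) + np.outer(v, v) / 2.0
        S1 += v
    return S1, S2

X = np.array([[0.0, 0.0], [1.0, 0.5], [1.5, 2.0], [0.5, 2.5]])
S1, S2 = sig_level2(X)

# Symmetric part: S^{i,j} + S^{j,i} = (increment i) * (increment j).
assert np.allclose(S2 + S2.T, np.outer(S1, S1))

# Antisymmetric part: the Levy area.
levy_area = 0.5 * (S2[0, 1] - S2[1, 0])
print(levy_area)  # 2.0
```

The symmetric part is determined by the increments alone; the Lévy area is the first genuinely new information the signature adds.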
Higher Levels
For a multi-index \( I = (i_1,\dots,i_k) \):
\[ S(X)^I_{a,b} = \int_{a < t_1 < \dots < t_k < b} dX^{i_1}_{t_1} \cdots dX^{i_k}_{t_k} \]
The full signature is:
\[ S(X)_{a,b} = \left(1, S^1, \dots, S^d, S^{1,1}, S^{1,2}, \dots \right) \]
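The whole truncated signature can be computed the same way: each linear segment contributes a tensor exponential, and segments are combined by concatenation (Chen's identity, stated later in the post). An illustrative NumPy sketch, with level \( k \) stored as a dense array with \( d^k \) entries:

```python
import numpy as np

def seg_signature(v, depth):
    """Signature of one linear segment: the tensor exponential
    (1, v, v⊗v/2!, ..., v^{⊗depth}/depth!)."""
    levels = [np.ones(())]
    t = levels[0]
    for n in range(1, depth + 1):
        t = np.multiply.outer(t, v) / n
        levels.append(t)
    return levels

def chen_product(A, B):
    """Truncated tensor product of two signatures (Chen's identity)."""
    depth = len(A) - 1
    return [sum(np.multiply.outer(A[m], B[k - m]) for m in range(k + 1))
            for k in range(depth + 1)]

def signature(X, depth):
    """Depth-truncated signature of the piecewise linear path through X's rows."""
    vs = np.diff(X, axis=0)
    S = seg_signature(vs[0], depth)
    for v in vs[1:]:
        S = chen_product(S, seg_signature(v, depth))
    return S

X = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])
S = signature(X, depth=3)
assert np.allclose(S[1], X[-1] - X[0])  # level 1 is the total increment
```

In practice, libraries such as `iisignature` or `esig` do this far more efficiently; the point here is only the structure of the computation.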
The Shuffle Product
The signature components are not independent: products of lower-order terms decompose into sums of higher-order ones via the shuffle product,
\[ S(X)^I \, S(X)^J = \sum_{K \in I \mathbin{\sqcup\mkern-3.2mu\sqcup} J} S(X)^K \]
where the sum runs over all shuffles of \( I \) and \( J \), i.e. interleavings that preserve the order within each multi-index.
Example:
\[ S(X)^1 S(X)^2 = S(X)^{1,2} + S(X)^{2,1} \]
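This identity is easy to verify numerically. A sketch (illustrative NumPy code, not from the post) that builds the level-2 signature of a random piecewise linear path and checks the single-letter case:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2)).cumsum(axis=0)  # random 2D walk

# Level-1 and level-2 signature of the piecewise linear path,
# accumulated segment by segment.
S1, S2 = np.zeros(2), np.zeros((2, 2))
for v in np.diff(X, axis=0):
    S2 += np.outer(S1, v) + np.outer(v, v) / 2.0
    S1 += v

# Shuffle identity for single letters: S^1 * S^2 = S^{1,2} + S^{2,1}.
assert np.isclose(S1[0] * S1[1], S2[0, 1] + S2[1, 0])
```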
Chen’s Identity
For a path \( X \) on \([a,b]\) concatenated with a path \( Y \) on \([b,c]\):
\[ S(X * Y)_{a,c} = S(X)_{a,b} \otimes S(Y)_{b,c} \]
Component-wise:
\[ S(X*Y)^{i_1,\dots,i_k} = \sum_{m=0}^k S(X)^{i_1,\dots,i_m} S(Y)^{i_{m+1},\dots,i_k} \]
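A numerical check of the component-wise form at level 2 (again an illustrative NumPy sketch): split a path at an interior sample point, compute each half's signature, and confirm that the cross term \( S(X)^i \, S(Y)^j \) accounts for the difference.

```python
import numpy as np

def sig12(X):
    """Level-1 and level-2 signature of the piecewise linear path
    through the rows of X, accumulated segment by segment."""
    d = X.shape[1]
    S1, S2 = np.zeros(d), np.zeros((d, d))
    for v in np.diff(X, axis=0):
        S2 += np.outer(S1, v) + np.outer(v, v) / 2.0
        S1 += v
    return S1, S2

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 3)).cumsum(axis=0)  # random 3D walk

A1, A2 = sig12(X[:15])   # first half
B1, B2 = sig12(X[14:])   # second half (shares the split point)
W1, W2 = sig12(X)        # whole path

# Chen's identity at levels 1 and 2: the cross term is A1 ⊗ B1.
assert np.allclose(W1, A1 + B1)
assert np.allclose(W2, A2 + np.outer(A1, B1) + B2)
```

Note the two halves share the sample point `X[14]`, so their concatenation reproduces the original path exactly.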
The Log-Signature
The tensor logarithm compresses the signature, removing the redundancy encoded by the shuffle relations:
\[ \log S(X) = \sum_{n \ge 1} \frac{(-1)^{n-1}}{n} \big(S(X) - 1\big)^{\otimes n} \]
At level 2, the log-signature reduces to the Lévy area, written in the basis of Lie brackets:
\[ [e_i, e_j] = e_i \otimes e_j - e_j \otimes e_i \]
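Truncated at level 2, the series has only two terms, and the computation makes the claim visible: the level-2 part of the log-signature is exactly the antisymmetric (Lévy area) part of the signature. An illustrative NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((25, 2)).cumsum(axis=0)  # random 2D walk

# Level-1 and level-2 signature, accumulated segment by segment.
S1, S2 = np.zeros(2), np.zeros((2, 2))
for v in np.diff(X, axis=0):
    S2 += np.outer(S1, v) + np.outer(v, v) / 2.0
    S1 += v

# log S truncated at level 2: (S - 1) - (S - 1)^{⊗2} / 2.
L1 = S1                               # level 1 is unchanged
L2 = S2 - np.outer(S1, S1) / 2.0      # level 2 of the log-signature

# Only the antisymmetric Levy area survives at level 2.
assert np.allclose(L2, -L2.T)
assert np.isclose(L2[0, 1], 0.5 * (S2[0, 1] - S2[1, 0]))
```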
Lead–Lag Embedding
For discrete data, piecewise linear interpolation makes the signature computable in closed form: a single linear segment with increment \( v \) has signature equal to the tensor exponential
\[ S = \exp(v) = \left(1,\; v,\; \frac{v^{\otimes 2}}{2!},\; \frac{v^{\otimes 3}}{3!},\; \dots \right) \]
and Chen's identity chains these exponentials across segments.
The lead–lag transform embeds a 1D stream \( (X_{t_i}) \) into \( \mathbb{R}^2 \) by pairing a "lead" copy, which moves one step ahead, with a "lag" copy that catches up afterwards. The antisymmetric level-2 term of the resulting path recovers the realized quadratic variation:
\[ S^{1,2} - S^{2,1} = \sum_i (X_{t_i} - X_{t_{i-1}})^2 \]
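A sketch of the lead–lag construction and the quadratic-variation identity (illustrative NumPy code; the stream values are made up):

```python
import numpy as np

def lead_lag(x):
    """Lead-lag embedding of a 1D stream into R^2: at each tick the lead
    coordinate jumps to the new value first, then the lag catches up."""
    pts = [(x[0], x[0])]
    for prev, new in zip(x[:-1], x[1:]):
        pts.append((new, prev))  # lead moves
        pts.append((new, new))   # lag catches up
    return np.array(pts)

x = np.array([0.0, 1.0, 0.5, 2.0])
Z = lead_lag(x)

# Level-2 signature of the piecewise linear lead-lag path.
S1, S2 = np.zeros(2), np.zeros((2, 2))
for v in np.diff(Z, axis=0):
    S2 += np.outer(S1, v) + np.outer(v, v) / 2.0
    S1 += v

qv = np.sum(np.diff(x) ** 2)  # realized quadratic variation
assert np.isclose(S2[0, 1] - S2[1, 0], qv)
```

This is one reason lead–lag features are popular in financial applications: realized quadratic variation shows up directly as a level-2 signature feature.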
References
- I. Chevyrev and A. Kormilitzin (2016). A Primer on the Signature Method in Machine Learning. arXiv:1603.03788
- T. Lyons, M. Caruana, T. Lévy (2007). Differential Equations Driven by Rough Paths. Springer.
- T. Lyons (1998). Differential equations driven by rough signals. Revista Matemática Iberoamericana.