Computer algebra systems like Mathematica or SymPy feel like magic.
You type in a derivative request and get back a perfect symbolic answer.
But how does a computer understand variables rather than strings? How
does it know the rules of calculus? I decided to build a symbolic
mathematics engine from scratch in Haskell to find out. I chose Haskell
because its functional nature mirrors mathematics perfectly. In
imperative languages, you might implement an expression as an object
with methods; in functional programming, an expression is just data, and
differentiation is just a transformation of that data.
Expression Trees
The fundamental insight is that any mathematical expression can be
represented as a tree. The expression 2x + 1 isn’t a flat string; it’s a
Sum operation where the left child is a Product (of 2 and x) and the
right child is 1. In Haskell, we define
this using an Algebraic Data Type that covers everything from simple
constants to trig functions. Once you have this tree structure, calculus
becomes a game of pattern matching. Differentiation isn’t a calculation;
it’s a recursive function that transforms one tree structure into
another based on the standard rules. For example, the product rule
becomes a direct translation where a Product node is transformed into a
Sum of two Product nodes representing the sub-derivatives.
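The shape of that translation can be sketched with a minimal ADT and a pattern-matching derivative. The constructor names (Num, Var, Sum, Product) and the diff function below are illustrative stand-ins of mine, not necessarily the library’s exact definitions:

```haskell
-- A minimal expression tree: constants, variables, and two operations.
data Expr
  = Num Double          -- numeric constant
  | Var String          -- symbolic variable
  | Sum Expr Expr       -- addition node
  | Product Expr Expr   -- multiplication node
  deriving (Show, Eq)

-- Differentiation is pure pattern matching: each rule of calculus
-- maps one tree shape to another.
diff :: String -> Expr -> Expr
diff _ (Num _) = Num 0
diff x (Var y)
  | x == y    = Num 1
  | otherwise = Num 0
diff x (Sum u v) = Sum (diff x u) (diff x v)
-- Product rule: (u * v)' = u' * v + u * v'
diff x (Product u v) =
  Sum (Product (diff x u) v) (Product u (diff x v))
```

Note how the product rule really is a direct translation: the Product node becomes a Sum of two Product nodes, each holding one sub-derivative.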
Implementing replacement and substitution required a careful
distinction between the structure of an expression and its evaluation.
The substitute function performs a purely structural traversal,
replacing Var nodes matching a target string with a
replacement sub-tree. This allows for powerful compositional operations;
for instance, computing f(g(x)) becomes a substitution of g(x) into
every variable node of f(x).
This is distinct from mathematical evaluation where variables are
replaced by numeric constants; in this system, variables can be replaced
by entire complex expression trees, enabling symbolic composition and
chain rule applications.
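A minimal sketch of such a substitute function, assuming an illustrative Expr ADT with constructors Num, Var, Sum, and Product (the real library’s names may differ):

```haskell
data Expr = Num Double | Var String | Sum Expr Expr | Product Expr Expr
  deriving (Show, Eq)

-- Purely structural traversal: every Var matching the target name is
-- replaced by an arbitrary sub-tree, not just a numeric constant.
substitute :: String -> Expr -> Expr -> Expr
substitute x r (Var y) | x == y = r
substitute x r (Sum u v)     = Sum (substitute x r u) (substitute x r v)
substitute x r (Product u v) = Product (substitute x r u) (substitute x r v)
substitute _ _ e             = e   -- constants and other variables pass through
```

Because the replacement is itself an Expr, composing f(g(x)) is just `substitute "x" g f`: the whole g tree is grafted in wherever x appeared.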
Simplification
However, raw symbolic differentiation is messy. If you differentiate
x^2 using the power rule, you strictly get 2 * x^1 * 1. A computer
doesn’t automatically know that x^1 = x or that multiplying by 1
changes nothing. Without simplification, the derivative of x^2 + x
might come out as 2 * x^1 * 1 + 1 * 1 + 0.
To fix this, I implemented a simplifier that repeatedly applies
algebraic identities until the expression stops changing (a
“fixed-point” algorithm). It applies rules like identity addition,
identity multiplication, and constant folding to turn that messy tree
back into the clean result we expect.
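One bottom-up pass of such rules might look like the following sketch (illustrative constructor names again; the real rule set is larger):

```haskell
data Expr = Num Double | Var String | Sum Expr Expr | Product Expr Expr
  deriving (Show, Eq)

-- A single simplification pass: identity addition and multiplication,
-- zero annihilation, and constant folding, applied bottom-up.
simplify :: Expr -> Expr
simplify (Sum a b) = case (simplify a, simplify b) of
  (Num 0, b')    -> b'              -- 0 + x = x
  (a', Num 0)    -> a'              -- x + 0 = x
  (Num m, Num n) -> Num (m + n)     -- constant folding
  (a', b')       -> Sum a' b'
simplify (Product a b) = case (simplify a, simplify b) of
  (Num 0, _)     -> Num 0           -- 0 * x = 0
  (_, Num 0)     -> Num 0
  (Num 1, b')    -> b'              -- 1 * x = x
  (a', Num 1)    -> a'
  (Num m, Num n) -> Num (m * n)
  (a', b')       -> Product a' b'
simplify e = e                      -- constants and variables are already simple
```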
The simplification algorithm operates on a fixed-point iteration
principle. A single pass of algebraic rules isn’t sufficient:
simplifying 1 * x to x might reveal further opportunities, such as
x + 0 = x, that weren’t visible in the first pass. My simplifyFully
function recursively applies the simplification rules up to a safety
limit (100 iterations) or until the expression stops changing
(converges). This ensures that deeply nested expressions like
((x + 0) * 1) + 0 collapse completely down to x without requiring
complex look-ahead logic in the individual rules.
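A minimal version of that fixed-point driver, paired here with a deliberately shallow one-pass simplifier so the repeated iteration is visible (both are sketches of mine, not the library’s exact code):

```haskell
data Expr = Num Double | Var String | Sum Expr Expr | Product Expr Expr
  deriving (Show, Eq)

-- A shallow one-pass simplifier: each call rewrites at most one layer.
simplify :: Expr -> Expr
simplify (Sum (Num 0) b)     = b
simplify (Sum a (Num 0))     = a
simplify (Product (Num 1) b) = b
simplify (Product a (Num 1)) = a
simplify (Sum a b)           = Sum (simplify a) (simplify b)
simplify (Product a b)       = Product (simplify a) (simplify b)
simplify e                   = e

-- Fixed-point iteration: re-apply simplify until the tree stops
-- changing, with a 100-iteration safety cap.
simplifyFully :: Expr -> Expr
simplifyFully = go (100 :: Int)
  where
    go 0 e = e
    go n e =
      let e' = simplify e
      in if e' == e then e else go (n - 1) e'
```

On ((x + 0) * 1) + 0, each pass peels away one identity; the driver keeps going until the tree converges to the bare variable.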
This engine is robust enough to handle advanced applications like
Taylor Series expansion. The Taylor Series construction showcases the
power of Haskell’s lazy evaluation and functional folding. The series is
generated by mapping a term-generating function over an integer sequence
and then folding the results with Sum. Each term combines
three distinct symbolic operations: computing the n-th derivative,
evaluating that derivative at the center point a via substitution, and
multiplying by the power term (x - a)^n / n!.
Because these operations return valid Expr trees, they can
be immediately simplified by the optimization pass, keeping the
polynomial expansion compact even for high-degree approximations.
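A runnable sketch of that construction, bundling minimal stand-ins for Expr, diff, and substitute since the library’s real definitions aren’t shown here; taylorTerm and the map/fold driver follow the recipe above, and the lack of a Power constructor (the power term is an expanded Product chain) is a simplification of mine:

```haskell
data Expr = Num Double | Var String | Sum Expr Expr | Product Expr Expr
  deriving (Show, Eq)

diff :: String -> Expr -> Expr
diff _ (Num _)       = Num 0
diff x (Var y)       = if x == y then Num 1 else Num 0
diff x (Sum u v)     = Sum (diff x u) (diff x v)
diff x (Product u v) = Sum (Product (diff x u) v) (Product u (diff x v))

substitute :: String -> Expr -> Expr -> Expr
substitute x r (Var y) | x == y = r
substitute x r (Sum u v)     = Sum (substitute x r u) (substitute x r v)
substitute x r (Product u v) = Product (substitute x r u) (substitute x r v)
substitute _ _ e             = e

-- n-th term: f^(n)(a) / n! * (x - a)^n, built from the three symbolic
-- operations described in the text.
taylorTerm :: Expr -> String -> Double -> Int -> Expr
taylorTerm f x a n =
  let fn    = iterate (diff x) f !! n           -- n-th derivative
      coef  = substitute x (Num a) fn           -- evaluate at the center a
      fact  = product [1 .. fromIntegral n]     -- n!
      power = foldr Product (Num 1)             -- (x - a)^n as a Product chain
                (replicate n (Sum (Var x) (Num (-a))))
  in Product (Product (Num (1 / fact)) coef) power

-- Map the term generator over [0..degree], then fold the results with Sum.
taylor :: Expr -> String -> Double -> Int -> Expr
taylor f x a degree = foldr1 Sum (map (taylorTerm f x a) [0 .. degree])
```

Every intermediate value here is an ordinary Expr, so each term can be handed straight to the simplifier before being folded into the series.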
Building this library highlighted why functional programming is so
effective for compiler-like tasks; the lack of mutable state means you
never worry about “breaking” an expression while transforming it.