Mathlib.Lean.Meta.RefinedDiscrTree

We define discrimination trees for the purpose of unifying local expressions with library results.

This structure is based on Lean.Meta.DiscrTree. I document here what features are not in the original:

The keys Key.lam, Key.forall and Key.bvar have been introduced in order to allow for matching under lambda and forall binders. Key.lam has arity 1 and indexes the body. Key.forall has arity 2 and indexes the domain and the body. The reason for not indexing the domain of a lambda expression is that it is usually already determined, for example in ∃ a : α, p, which is @Exists α fun a : α => p, we don't want to index the domain α twice. In a forall expression it is necessary to index the domain, because in an implication p → q we need to index both p and q. Key.bvar works the same as Key.fvar, but stores the De Bruijn index to identify it.

For example, this allows for more specific matching with the left hand side of ∑ i ∈ range n, i = n * (n - 1) / 2, which is indexed by [⟨Finset.sum, 5⟩, ⟨Nat, 0⟩, ⟨Nat, 0⟩, *0, ⟨Finset.Range, 1⟩, *1, λ, ⟨#0, 0⟩].
The key Key.star takes a Nat identifier as an argument. For example, the library pattern ?a + ?a is encoded as [⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *2]. *0 corresponds to the type of a, *1 to the HAdd instance, and *2 to a. This means that it will only match an expression x + y if x is definitionally equal to y. The matching algorithm requires that the same stars from the discrimination tree match with the same patterns in the lookup expression, and similarly requires that the same metavariables form the lookup expression match with the same pattern in the discrimination tree.
The key Key.opaque has been introduced in order to index existential variables in lemmas like Nat.exists_prime_and_dvd {n : ℕ} (hn : n ≠ 1) : ∃ p, Prime p ∧ p ∣ n, where the part Prime p gets the pattern [⟨Nat.Prime, 1⟩, ◾]. (◾ represents Key.opaque) When matching, Key.opaque can only be matched by Key.star.

Using the WhnfCoreConfig argument, it is possible to disable β-reduction and ζ-reduction. As a result, we may get a lambda expression applied to an argument or a let-expression. Since there is no support for indexing these, they will be indexed by Key.opaque.
We keep track of the matching score of a unification. This score represents the number of keys that had to be the same for the unification to succeed. For example, matching (1 + 2) + 3 with add_comm gives a score of 2, since the pattern of commutativity is [⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *3], so matching ⟨HAdd.hAdd, 6⟩ gives 1 point, and matching *0 after its first appearance gives another point, but the third argument is an outParam, so this gets ignored. Similarly, matching it with add_assoc gives a score of 5.
Patterns that have the potential to be η-reduced are put into the RefinedDiscrTree under all possible reduced key sequences. This is for terms of the form fun x => f (?m x₁ .. xₙ), where ?m is a metavariable, and one of x₁, .., xₙ in x. For example, the pattern Continuous fun y => Real.exp (f y)]) is indexed by both [⟨Continuous, 5⟩, *0, ⟨Real, 0⟩, *1, *2, λ, ⟨Real.exp⟩, *3] and [⟨Continuous, 5⟩, *0, ⟨Real, 0⟩, *1, *2, ⟨Real.exp⟩] so that it also comes up if you search with Continuous Real.exp. Similarly, Continuous fun x => f x + g x is indexed by both [⟨Continuous, 1⟩, λ, ⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *3] and [⟨Continuous, 1⟩, ⟨HAdd.hAdd, 5⟩, *0, *0, *0, *1, *2].
For sub-expressions not at the root of the original expression we have some additional reductions:
- Any combination of ofNat, Nat.zero, Nat.succ and number literals is stored as just a number literal.
- The expression fun a : α => a is stored as @id α.
  - This makes lemmata such as continuous_id' redundant, which is the same as continuous_id, with id replaced by fun x => x.
- Any expressions involving +, *, -, / or ⁻¹ is normalized to not have a lambda in front and to always have the default amount of arguments. e.g. (f + g) a is stored as f a + g a and fun x => f x + g x is stored as f + g.
  - This makes lemmata such as MeasureTheory.integral_integral_add' redundant, which is the same as MeasureTheory.integral_integral_add, with f a + g a replaced by (f + g) a
  - it also means that a lemma like Continuous.mul can be stated as talking about f * g instead of fun x => f x + g x.

I have also made some changes in the implementation:

Instead of directly converting from Expr to Array Key during insertion, and directly looking up from an Expr during lookup, I defined the intermediate structure DTExpr, which is a form of Expr that only contains information relevant for the discrimination tree. Each Expr is transformed into a DTExpr before insertion or lookup. For insertion there could be multiple DTExpr representations due to potential η-reductions as mentioned above.

TODO:

More thought could be put into the matching algorithm for non-trivial unifications. For example, when looking up the expression ?a + ?a (for rewriting), there will only be results like n + n = 2 * n or a + b = b + a, but not like n + 1 = n.succ, even though this would still unify.
The reason why implicit arguments are not ignored by the discrimination tree is that they provide important type information. Because of this it seems more natural to index the types of expressions instead of indexing the implicit type arguments. Then each key would additionally index the type of that expression. So instead of indexing ?a + ?b as [⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *3], it would be indexed by something like [(*0, ⟨HAdd.hAdd, 6⟩), _, _, _, _, (*0, *1), (*0, *2)]. The advantage of this would be that there will be less duplicate indexing of types, because many functions index the types of their arguments and their return type with implicit arguments, meaning that types unnecessarily get indexed multiple times. This modification can be explored, but it could very well not be an improvement.

Inserting intro a RefinedDiscrTree #

source

partial def Lean.Meta.RefinedDiscrTree.insertInTrie {α : Type} [BEq α] (keys : Array Key) (v : α) (i : ℕ) :

Trie α → Trie α

Insert the value v at index keys : Array Key in a Trie.

source

def Lean.Meta.RefinedDiscrTree.insertInRefinedDiscrTree {α : Type} [BEq α] (d : RefinedDiscrTree α) (keys : Array Key) (v : α) :

RefinedDiscrTree α

Insert the value v at index keys : Array Key in a RefinedDiscrTree.

Warning: to account for η-reduction, an entry may need to be added at multiple indexes, so it is recommended to use RefinedDiscrTree.insert for insertion.

Equations

One or more equations did not get rendered due to their size.

Instances For

source

def Lean.Meta.RefinedDiscrTree.insertDTExpr {α : Type} [BEq α] (d : RefinedDiscrTree α) (e : DTExpr) (v : α) :

RefinedDiscrTree α

Insert the value v at index e : DTExpr in a RefinedDiscrTree.

Warning: to account for η-reduction, an entry may need to be added at multiple indexes, so it is recommended to use RefinedDiscrTree.insert for insertion.

Equations

d.insertDTExpr e v = d.insertInRefinedDiscrTree e.flatten v

Instances For

source

def Lean.Meta.RefinedDiscrTree.insert {α : Type} [BEq α] (d : RefinedDiscrTree α) (e : Expr) (v : α) (onlySpecific : Bool := true) (fvarInContext : FVarId → Bool := fun (x : FVarId) => false) :

MetaM (RefinedDiscrTree α)

Insert the value v at index e : Expr in a RefinedDiscrTree. The argument fvarInContext allows you to specify which free variables in e will still be in the context when the RefinedDiscrTree is being used for lookup. It should return true only if the RefinedDiscrTree is built and used locally.

if onlySpecific := true, then we filter out the patterns * and Eq * * *.

Equations

One or more equations did not get rendered due to their size.

Instances For

source

def Lean.Meta.RefinedDiscrTree.insertEqn {α : Type} [BEq α] (d : RefinedDiscrTree α) (lhs rhs : Expr) (vLhs vRhs : α) (onlySpecific : Bool := true) (fvarInContext : FVarId → Bool := fun (x : FVarId) => false) :

MetaM (RefinedDiscrTree α)

Insert the value vLhs at index lhs, and if rhs is indexed differently, then also insert the value vRhs at index rhs.

Equations

One or more equations did not get rendered due to their size.

Instances For

source

partial def Lean.Meta.RefinedDiscrTree.Trie.mapArraysM {α β : Type} {m : Type → Type} [Monad m] (t : Trie α) (f : Array α → m (Array β)) :

m (Trie β)

Apply a monadic function to the array of values at each node in a RefinedDiscrTree.

source

def Lean.Meta.RefinedDiscrTree.mapArraysM {α β : Type} {m : Type → Type} [Monad m] (d : RefinedDiscrTree α) (f : Array α → m (Array β)) :

m (RefinedDiscrTree β)

Apply a monadic function to the array of values at each node in a RefinedDiscrTree.

Equations

d.mapArraysM f = do let __do_lift ← d.root.mapM fun (x : Lean.Meta.RefinedDiscrTree.Trie α) => x.mapArraysM f pure { root := __do_lift }

Instances For

source

def Lean.Meta.RefinedDiscrTree.mapArrays {α β : Type} (d : RefinedDiscrTree α) (f : Array α → Array β) :

RefinedDiscrTree β

Apply a function to the array of values at each node in a RefinedDiscrTree.

Equations

d.mapArrays f = d.mapArraysM f

Instances For