We define discrimination trees for the purpose of unifying local expressions with library results.
This structure is based on Lean.Meta.DiscrTree.
I document here what features are not in the original:
The keys
Key.lam,Key.forallandKey.bvarhave been introduced in order to allow for matching under lambda and forall binders.Key.lamhas arity 1 and indexes the body.Key.forallhas arity 2 and indexes the domain and the body. The reason for not indexing the domain of a lambda expression is that it is usually already determined, for example in∃ a : α, p, which is@Exists α fun a : α => p, we don't want to index the domainαtwice. In a forall expression it is necessary to index the domain, because in an implicationp → qwe need to index bothpandq.Key.bvarworks the same asKey.fvar, but stores the De Bruijn index to identify it.For example, this allows for more specific matching with the left hand side of
∑ i ∈ range n, i = n * (n - 1) / 2, which is indexed by[⟨Finset.sum, 5⟩, ⟨Nat, 0⟩, ⟨Nat, 0⟩, *0, ⟨Finset.Range, 1⟩, *1, λ, ⟨#0, 0⟩].The key
Key.startakes aNatidentifier as an argument. For example, the library pattern?a + ?ais encoded as[⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *2].*0corresponds to the type ofa,*1to theHAddinstance, and*2toa. This means that it will only match an expressionx + yifxis definitionally equal toy. The matching algorithm requires that the same stars from the discrimination tree match with the same patterns in the lookup expression, and similarly requires that the same metavariables form the lookup expression match with the same pattern in the discrimination tree.The key
Key.opaquehas been introduced in order to index existential variables in lemmas likeNat.exists_prime_and_dvd {n : ℕ} (hn : n ≠ 1) : ∃ p, Prime p ∧ p ∣ n, where the partPrime pgets the pattern[⟨Nat.Prime, 1⟩, ◾]. (◾ representsKey.opaque) When matching,Key.opaquecan only be matched byKey.star.Using the
WhnfCoreConfigargument, it is possible to disable β-reduction and ζ-reduction. As a result, we may get a lambda expression applied to an argument or a let-expression. Since there is no support for indexing these, they will be indexed byKey.opaque.We keep track of the matching score of a unification. This score represents the number of keys that had to be the same for the unification to succeed. For example, matching
(1 + 2) + 3withadd_commgives a score of 2, since the pattern of commutativity is [⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *3], so matching⟨HAdd.hAdd, 6⟩gives 1 point, and matching*0after its first appearance gives another point, but the third argument is an outParam, so this gets ignored. Similarly, matching it withadd_assocgives a score of 5.Patterns that have the potential to be η-reduced are put into the
RefinedDiscrTreeunder all possible reduced key sequences. This is for terms of the formfun x => f (?m x₁ .. xₙ), where?mis a metavariable, and one ofx₁, .., xₙinx. For example, the patternContinuous fun y => Real.exp (f y)])is indexed by both[⟨Continuous, 5⟩, *0, ⟨Real, 0⟩, *1, *2, λ, ⟨Real.exp⟩, *3]and[⟨Continuous, 5⟩, *0, ⟨Real, 0⟩, *1, *2, ⟨Real.exp⟩]so that it also comes up if you search withContinuous Real.exp. Similarly,Continuous fun x => f x + g xis indexed by both[⟨Continuous, 1⟩, λ, ⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *3]and[⟨Continuous, 1⟩, ⟨HAdd.hAdd, 5⟩, *0, *0, *0, *1, *2].For sub-expressions not at the root of the original expression we have some additional reductions:
- Any combination of
ofNat,Nat.zero,Nat.succand number literals is stored as just a number literal. - The expression
fun a : α => ais stored as@id α.- This makes lemmata such as
continuous_id'redundant, which is the same ascontinuous_id, withidreplaced byfun x => x.
- This makes lemmata such as
- Any expressions involving
+,*,-,/or⁻¹is normalized to not have a lambda in front and to always have the default amount of arguments. e.g.(f + g) ais stored asf a + g aandfun x => f x + g xis stored asf + g.- This makes lemmata such as
MeasureTheory.integral_integral_add'redundant, which is the same asMeasureTheory.integral_integral_add, withf a + g areplaced by(f + g) a - it also means that a lemma like
Continuous.mulcan be stated as talking aboutf * ginstead offun x => f x + g x.
- This makes lemmata such as
- Any combination of
I have also made some changes in the implementation:
- Instead of directly converting from
ExprtoArray Keyduring insertion, and directly looking up from anExprduring lookup, I defined the intermediate structureDTExpr, which is a form ofExprthat only contains information relevant for the discrimination tree. EachExpris transformed into aDTExprbefore insertion or lookup. For insertion there could be multipleDTExprrepresentations due to potential η-reductions as mentioned above.
TODO:
More thought could be put into the matching algorithm for non-trivial unifications. For example, when looking up the expression
?a + ?a(for rewriting), there will only be results liken + n = 2 * nora + b = b + a, but not liken + 1 = n.succ, even though this would still unify.The reason why implicit arguments are not ignored by the discrimination tree is that they provide important type information. Because of this it seems more natural to index the types of expressions instead of indexing the implicit type arguments. Then each key would additionally index the type of that expression. So instead of indexing
?a + ?bas[⟨HAdd.hAdd, 6⟩, *0, *0, *0, *1, *2, *3], it would be indexed by something like[(*0, ⟨HAdd.hAdd, 6⟩), _, _, _, _, (*0, *1), (*0, *2)]. The advantage of this would be that there will be less duplicate indexing of types, because many functions index the types of their arguments and their return type with implicit arguments, meaning that types unnecessarily get indexed multiple times. This modification can be explored, but it could very well not be an improvement.
Inserting intro a RefinedDiscrTree #
Insert the value v at index keys : Array Key in a RefinedDiscrTree.
Warning: to account for η-reduction, an entry may need to be added at multiple indexes,
so it is recommended to use RefinedDiscrTree.insert for insertion.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Insert the value v at index e : DTExpr in a RefinedDiscrTree.
Warning: to account for η-reduction, an entry may need to be added at multiple indexes,
so it is recommended to use RefinedDiscrTree.insert for insertion.
Equations
- d.insertDTExpr e v = d.insertInRefinedDiscrTree e.flatten v
Instances For
Insert the value v at index e : Expr in a RefinedDiscrTree.
The argument fvarInContext allows you to specify which free variables in e will still be
in the context when the RefinedDiscrTree is being used for lookup.
It should return true only if the RefinedDiscrTree is built and used locally.
if onlySpecific := true, then we filter out the patterns * and Eq * * *.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Insert the value vLhs at index lhs, and if rhs is indexed differently, then also
insert the value vRhs at index rhs.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Apply a monadic function to the array of values at each node in a RefinedDiscrTree.
Equations
- d.mapArraysM f = do let __do_lift ← d.root.mapM fun (x : Lean.Meta.RefinedDiscrTree.Trie α) => x.mapArraysM f pure { root := __do_lift }
Instances For
Apply a function to the array of values at each node in a RefinedDiscrTree.
Equations
- d.mapArrays f = d.mapArraysM f