[fix](mv) Align extra-join elimination safety check by foxtail463 · Pull Request #62527 · apache/doris

foxtail463 · 2026-04-15T12:41:04Z

Problem Summary:
Extra-join elimination rejected some safe LEFT JOIN rewrites too early. When the right side was guaranteed unique, a LEFT JOIN could still be safely eliminated even if its ON clause contained right-side otherJoinConjuncts, because those predicates only decide whether the nullable side matches and do not filter preserved-side rows.

Example:

-- MV
SELECT t1.id, t1.name
FROM t1
LEFT JOIN t2
  ON t1.id = t2.id AND t2.id = 1

-- Query
SELECT id, name FROM t1

If t2.id is unique and no output or residual filter depends on t2, eliminating t2 is safe.

This PR relaxes the MV comparison path for that LEFT JOIN case, while still rejecting eliminations when unmatched residual filters depend on the eliminated side. For example, filters above the join such as WHERE t2.id IS NULL, or INNER JOIN filters such as WHERE lineitem.l_shipdate = '2024-01-01', cannot be dropped because they affect which preserved-side rows survive.

To make that safety check precise, FilterEdge now records the actual input nodes of its predicates during HyperGraph construction, including projected aliases and slot-free predicates.

Thearas · 2026-04-15T12:41:10Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

foxtail463 · 2026-04-15T12:41:24Z

/review

github-actions

Found 1 blocking issue.

fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/HyperGraphComparator.java: the new filter-edge safety check still keys off FilterEdge.getReferenceNodes(), but for filter edges that bitmap represents the whole child subtree. A left-only predicate above a LEFT JOIN (for example WHERE t1.id > 1) therefore overlaps the eliminated right node and is treated as unsafe even though it does not depend on nullable-side rows. That means the PR still rejects safe LEFT JOIN eliminations it intends to allow.

Critical checkpoint conclusions:

Goal of current task: Partially met. The otherJoinConjuncts case is handled, but safe LOJ rewrites with preserved-side top filters remain blocked.
Modification size/focus: Small and focused.
Concurrency: Not involved.
Lifecycle/static initialization: Not involved.
Configuration: None.
Compatibility / parallel paths: No protocol or storage compatibility impact; the inner-join PK/FK path is still guarded separately.
Special conditional checks: Blocking issue above; the new condition uses subtree membership instead of the filter predicate's actual input slots.
Test coverage: Improved with FE unit tests and a regression test, but missing a preserved-side top-filter case such as WHERE t1.id > 1, which would expose this false negative.
Test result files: The added .out file is consistent for the covered positive case only.
Observability: Not applicable.
Transaction / persistence / data writes / FE-BE variable passing: Not applicable.
Performance: No significant concern in this patch.
Other issues: None beyond the blocking correctness gap above.

foxtail463 · 2026-04-15T14:38:17Z

/review

github-actions

Found 2 issues that should be addressed before merge.

Critical Checkpoints

Goal of current task: partially accomplished. The patch tries to allow safe LEFT JOIN extra predicates during MV matching, but the new filter-node inference introduces an unsafe false-positive case, and the exact happy-path from the PR description is still not covered by a direct regression test.
Minimality/focus: mostly focused to MV comparison, but the JoinUtils.canEliminateByLeft helper is shared, so the semantic change should be proven by tests on the intended path.
Concurrency: not involved in this change.
Lifecycle/static init: no special lifecycle concerns here.
Configuration: no config changes.
Incompatible changes: none in FE/BE protocol, symbols, or storage format.
Parallel code paths: yes. The changed elimination semantics flow through shared join-elimination logic, so parallel callers need explicit coverage if the broader behavior is intentional.
Special conditional checks: the new leftChildEdges/inputSlots heuristic is documented, but it does not account for all predicate forms on the eliminated side.
Test coverage: insufficient for the positive case described in the PR body.
Test result files: not applicable.
Observability: not needed for this optimizer rewrite.
Transaction/persistence/data writes: not involved.
FE/BE variable passing: not involved.
Performance: no material concern found in the touched code.
Other issues: none beyond the two findings below.

I also tried to run FE unit tests for the touched areas, but this runner is missing thirdparty/installed/bin/protoc, so run-fe-ut.sh cannot build generated code here.

github-actions · 2026-04-15T15:38:40Z

                && LongBitmap.getCardinality(reservedShouldEliminatedViewNodes) == 0;
    }

+    static long getFilterInputNodes(HyperGraph hyperGraph, FilterEdge filterEdge) {


getFilterInputNodes() now treats slot-free predicates as if they belonged to no node at all. That makes the new safety check unsound for inner-join elimination.

For example, a view side like:

SELECT t1.id FROM t1 INNER JOIN (SELECT * FROM t2 WHERE 1 = 0) t2f ON t1.id = t2f.id

still has a filter that changes which t1 rows survive, but filterEdge.getInputSlots() is empty here, so this helper returns 0. After that:

tryEliminateNodesAndEdge() no longer marks the filter as unsafe for the removed side, and

isLogicCompatible() also skips the residual FilterEdge because it uses the same helper in the overlap check.

That can make SELECT t1.id FROM t1 look compatible with a view that is actually empty. This helper needs to treat slot-free predicates as belonging to the whole filter subtree, not to "no nodes".

github-actions · 2026-04-15T15:38:40Z

        dropConstraint("alter table T2 drop constraint uk");
    }

+    @Test


The added tests only cover rejection paths plus the helper method. They never assert the positive regression described in the PR body: a unique right side with otherJoinConjuncts inside the LEFT JOIN ... ON ... clause should now match SELECT ... FROM t1.

Because this patch is specifically about relaxing otherJoinConjuncts, we still need a direct end-to-end regression here, e.g. the LEFT JOIN t2 ON t1.id = t2.id AND t2.id = 1 shape from the PR description.

foxtail463 · 2026-04-15T16:24:45Z

/review

github-actions

Found 1 blocking issue.

Critical checkpoints:

Goal of the task: Partially met. The PR does cover the MV case it describes and adds MV-focused unit/regression coverage, but the shared helper change also alters the standalone rewrite and is not safe there.
Scope/focus: Not fully focused. The intended MV-only relaxation leaked into another rewrite path through .
Concurrency: No concurrency concerns in the reviewed FE optimizer path.
Lifecycle/static init: No special lifecycle or static-initialization concerns here.
Config changes: None.
Compatibility/storage format/function symbol changes: None.
Functionally parallel code paths: Problematic. is also used by , which has no compensation path for .
Conditional checks: The new filter-edge checks in are directionally correct for the MV path I traced.
Test coverage: Added coverage is good for MV matching, but there is no coverage for the standalone unique-join elimination path now affected by the helper change.
Test result files: The new regression output is consistent with the added case.
Observability: No additional observability needed.
Transaction/persistence/data-write/FE-BE variable passing: Not applicable.
Performance: No obvious issue in the reviewed path.
Other issues: The blocking issue below must be addressed before merge.

Please keep the relaxed handling local to MV comparison, or add equivalent semantic protection/tests for the standalone join-elimination rewrite before merging.

github-actions · 2026-04-15T17:01:37Z

        if (join.getJoinType().isLeftOuterJoin() || join.getJoinType().isAsofLeftOuterJoin()) {
            Pair<Set<Slot>, Set<Slot>> njHashKeys = join.extractNullRejectHashKeys();
-            if (!join.getOtherJoinConjuncts().isEmpty() || njHashKeys == null) {
+            if (njHashKeys == null) {


Blocking: is shared with the normal rewrite, not just (). After removing the guard here, a plain query like becomes eligible for . That rule simply rewrites to and never preserves/evaluates the extra predicate, so the query returns rows instead of raising the error. The relaxation should stay in the MV path, or this rewrite needs its own semantic checks/tests for observable .

github-actions

Found 1 blocking issue.

Critical checkpoints:

Goal of the task: Partially met. The PR covers the intended MV case and adds MV focused unit and regression coverage, but the helper change also alters a standalone join elimination rewrite that this PR does not protect.
Scope and focus: Not fully focused. The intended MV only relaxation leaked into another rewrite path through JoinUtils.canEliminateByLeft().
Concurrency: No concurrency concerns in the reviewed FE optimizer path.
Lifecycle and static initialization: No special lifecycle or static initialization concerns here.
Config changes: None.
Compatibility or storage format changes: None.
Functionally parallel code paths: Problematic. canEliminateByLeft() is also used by EliminateJoinByUnique, which has no compensation path for extra ON predicates.
Conditional checks: The new filter edge checks in HyperGraphComparator are reasonable for the MV path I traced.
Test coverage: Added coverage is good for MV matching, but there is no coverage for the standalone unique join elimination path now affected by the helper change.
Test result files: The new regression output is consistent with the added case.
Observability: No additional observability needed.
Transaction, persistence, data write, FE-BE variable passing: Not applicable.
Performance: No obvious issue in the reviewed path.
Other issues: The blocking issue below must be addressed before merge.

Please keep the relaxed otherJoinConjuncts handling local to MV comparison, or add equivalent semantic protection and tests for the standalone join elimination rewrite before merging.

github-actions · 2026-04-15T17:02:37Z

        if (join.getJoinType().isLeftOuterJoin() || join.getJoinType().isAsofLeftOuterJoin()) {
            Pair<Set<Slot>, Set<Slot>> njHashKeys = join.extractNullRejectHashKeys();
-            if (!join.getOtherJoinConjuncts().isEmpty() || njHashKeys == null) {
+            if (njHashKeys == null) {


Blocking: JoinUtils.canEliminateByLeft() is shared with the normal EliminateJoinByUnique rewrite, not only with HyperGraphComparator. After removing the otherJoinConjuncts guard here, a plain query like select t1.id1 from t1 left join t2 on t1.id1 = t2.id2 and assert_true(false, ...) becomes eligible for ELIMINATE_JOIN_BY_UK. That rule rewrites project(join) to project(left) and never preserves or evaluates the extra ON predicate, so the query can return rows instead of raising the assert_true error. The relaxation should stay in the MV path, or the standalone rewrite needs its own semantic checks and tests for observable extra ON predicates.

foxtail463 · 2026-04-15T17:40:35Z

/review

foxtail463 · 2026-04-15T17:49:04Z

run buildall

github-actions

I found 1 blocking issue.\n\n1. loses dependencies when a filter sits above a inside the eliminated subtree. Because MV hypergraph nodes only expose base-node outputs, alias slots such as are not matched back to the underlying node. That means safe LOJ cases like are still rejected, and nested filters above join results can also evade the new unsafe-filter guard. Please resolve filter inputs through projection lineage / , and add a unit test for a projected-alias case.\n\nCritical checkpoints:\n- Goal of current task: Partially met. The simple LOJ case is fixed and tested, but projected-alias filter edges still break the intended relaxation.\n- Modification size/focus: Small and focused.\n- Concurrency: Not involved.\n- Lifecycle/static init: Not involved.\n- Config changes: None.\n- Compatibility / parallel paths: No FE/BE protocol or persisted-format impact. The standalone path intentionally remains unchanged and has a guarding test.\n- Special conditional checks: The new heuristic is under-specified for filters above projections.\n- Test coverage: Positive/negative FE unit tests plus one regression test were added, but coverage misses nested project/alias filter edges.\n- Test result files: The added regression output matches the new groovy case.\n- Observability: Not applicable.\n- Transaction/persistence/data writes/FE-BE variable passing: Not involved.\n- Performance: Acceptable for this change; no blocking issue found there.

github-actions · 2026-04-15T18:00:42Z

+        if (inputSlots.isEmpty()) {
+            return filterEdge.getSubTreeNodes();
+        }
+        long inputNodes = LongBitmap.newBitmap();


strips nodes and stores aliases in , so only contains the base-node slots. For a filter above a projected subtree, e.g. , returns here even though the filter still depends on the eliminated right node. That means the new relaxation still rejects this safe LOJ case, and the same hole can also miss unsafe nested filters above join results. Please resolve filter inputs through projection lineage / alias mapping before using them in the elimination check, and add a projected-alias test.

github-actions

I found 1 blocking issue.

HyperGraphComparator.getFilterInputNodes() does not resolve filter input slots through projection lineage. MV hypergraph nodes expose base node outputs, while aliases from stripped LogicalProject nodes live in complexProject. For a filter above a projected eliminated subtree, such as LEFT JOIN (SELECT * FROM (SELECT id AS rid FROM t2) x WHERE rid = 1) b ON t1.id = b.rid, the helper returns no referenced node even though the filter depends on the eliminated right side. As a result the comparator still rejects a safe LEFT JOIN elimination case that this PR intends to allow. Please derive node references through alias lineage and add a projected alias test.

Critical checkpoints:

Goal of current task: Partially met. The simple LEFT JOIN otherJoinConjunct case is fixed and tested, but the same relaxation still fails when the filter or join keys pass through a stripped projection alias.
Modification size and focus: Small and focused.
Concurrency: Not involved.
Lifecycle or static initialization: Not involved.
Configuration: No new config.
Compatibility and parallel code paths: No FE or BE protocol or persistence impact. The separate EliminateJoinByUnique path remains intentionally unchanged and has a guarding test.
Special conditional checks: The new filter input node heuristic is incomplete for projected aliases.
Test coverage: Positive and negative FE unit tests plus one regression test were added, but coverage misses the projected alias case above.
Test result files: The new regression output is consistent with the added suite.
Observability: Not applicable.
Transaction, persistence, data writes, FE-BE variable passing: Not involved.
Performance: No blocking issue found.

foxtail463 · 2026-04-15T19:53:07Z

/review

foxtail463 · 2026-04-15T19:53:14Z

run buildall

foxtail463 · 2026-04-15T20:29:27Z

run compile
run feut

github-actions · 2026-04-15T20:49:17Z

OpenCode automated review failed and did not complete.

Error: Review step was failure (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/24475129729

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

foxtail463 · 2026-04-15T20:59:50Z

/review

github-actions

No blocking issues found in this PR. The change is focused and the safety argument in the MV comparator is consistent with the surrounding Nereids logic.

Critical checkpoints:

Goal: Pass. The PR aims to allow MV extra-join elimination for safe LEFT JOIN cases with right-side otherJoinConjuncts, while still rejecting unsafe residual filters. The code now records filter input nodes, relaxes the LEFT JOIN uniqueness check only in the MV comparator, and adds positive and negative FE coverage plus a regression case.
Minimality/Focus: Pass. The implementation stays local to the MV comparison / hypergraph path and does not broaden the standalone rewrite rule.
Concurrency / locking / lifecycle: Not applicable. This is FE optimizer logic with no new concurrent state, locks, or lifecycle complexity.
Config / compatibility / persistence / FE-BE protocol: Not applicable. No new config, no persisted format changes, and no protocol changes.
Parallel code paths: Acceptable. The standalone EliminateJoinByUnique rule remains unchanged by design; this PR only adjusts MV logical-compatibility checking.
Special conditions / comments: Pass. The new FilterEdge input-node tracking and the unsafe-filter distinction are commented and match the intended safety checks.
Test coverage: Mostly pass. FE tests cover the new positive LOJ case and several negative cases (WHERE t2.id IS NULL, slot-free filter on eliminated side, multi-node residual filter), and the regression test covers the end-to-end positive MV rewrite. Residual gap: the new FE tests still disable INFER_PREDICATES, so that shape is covered by reasoning rather than an explicit test here.
Performance: Pass. The added work is small and confined to MV comparison, not a hot execution path.

Residual risk / testing gap:

I did not run the FE unit tests or regression suite locally in this review environment.
There is no dedicated coverage here for the same scenarios with INFER_PREDICATES enabled.

foxtail463 · 2026-04-15T21:37:14Z

run buildall

github-actions bot requested changes Apr 15, 2026

View reviewed changes

Comment thread ...e-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/HyperGraphComparator.java

foxtail463 force-pushed the mv_rewrite_eliminate_filter_edge branch from e138f6e to 9763bb0 Compare April 15, 2026 14:38

github-actions bot requested changes Apr 15, 2026

View reviewed changes

foxtail463 force-pushed the mv_rewrite_eliminate_filter_edge branch from 9763bb0 to 17b7ffe Compare April 15, 2026 16:24

github-actions bot requested changes Apr 15, 2026

View reviewed changes

[fix](mv) Align extra-join elimination safety check

d3daa83

foxtail463 force-pushed the mv_rewrite_eliminate_filter_edge branch from 8e222c7 to d3daa83 Compare April 15, 2026 19:52

github-actions bot reviewed Apr 15, 2026

View reviewed changes

Conversation

foxtail463 commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Thearas commented Apr 15, 2026

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

github-actions bot commented Apr 15, 2026

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

foxtail463 commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

foxtail463 commented Apr 15, 2026 •

edited

Loading