Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplifies joins and fixes bugs #1438

Merged
merged 3 commits into from
Jul 12, 2024

Conversation

johnedquinn
Copy link
Member

@johnedquinn johnedquinn commented Apr 19, 2024

Relevant Issues

  • None

Description

  • Increases conformance by 4 tests (.07%). See report.
  • Simplifies joins by using Kotlin's helpful iterator inline function. This substantially simplifies our code at the cost of some performance. The reason I'm making this move is that there are bugs in the current implementation, and for the sake of our goal (conformance, not performance), I'm moving towards a simpler implementation. Once we pass all join tests, we can re-evaluate our implementation for performance reasons.
  • This PR now passes 8 additional tests, however, there are now 4 failing that were not previously failing. This is due to our inability to correctly coerce nulls to other types. Please disregard this issue as this is being handled in Initializes Datum and adds functionality to evaluator #1451.

Other Information

  • Updated Unreleased Section in CHANGELOG: NO
  • Any backward-incompatible changes? Maybe?
    • There is one change that is worth noting. <struct evaluating to null>.a returns null. Is this right? IMO, it makes the most sense, and it makes the modelling of JOINs very very easy.
  • Any new external dependencies? NO
  • Do your changes comply with the Contributing Guidelines and Code Style Guidelines? YES

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-commenter
Copy link

codecov-commenter commented Apr 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (v1@5b86afc). Learn more about missing BASE report.

Additional details and impacted files
@@          Coverage Diff          @@
##             v1    #1438   +/-   ##
=====================================
  Coverage      ?   13.70%           
  Complexity    ?       31           
=====================================
  Files         ?       29           
  Lines         ?     1729           
  Branches      ?      190           
=====================================
  Hits          ?      237           
  Misses        ?     1470           
  Partials      ?       22           
Flag Coverage Δ
CLI 13.70% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@johnedquinn johnedquinn force-pushed the v1-conformance-join branch 2 times, most recently from 2b3236b to d114152 Compare April 23, 2024 22:47
@RCHowell
Copy link
Contributor

Why draft?

@johnedquinn
Copy link
Member Author

Why draft?

Currently incorporating the change for check() that is causing some of the conformance tests for joins to fail.

* yield(null, rhsRecord)
* ```
*
* Note: The LHS and RHS must be sorted. TODO: We need to add sorting to the LHS and RHS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on this please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated KDoc.

Comment on lines +303 to +304
// TODO: Specify that this is a LATERAL JOIN. Create a separate JOIN. Also, determine the allowable types of JOIN.
// For context: Oracle SQL doesn't allow ... FULL OUTER JOIN LATERAL ... or ... RIGHT OUTER JOIN LATERAL ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PartiQL Spec section 5.7

SQL 2003 used the LATERAL keyword to correlate FROM clause items. In the interest of compatibility with SQL, PartiQL also allows the use of the keyword LATERAL, though it does not do anything more than the comma itself would do.
That is “l , LATERAL r” is equivalent to “l , r”.

This is: I don't think we need to support LATERAL JOIN. The spec is quite clear about there is no difference between left join and left lateral join, inner join and inner lateral join.

The spec is also quite clear about the fact that Full Join can not be lateral.

I think the only decision we have to make here is if we want right join to be lateral.

I vote no : )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline a while back. I'd like to have different plan nodes for both types -- which corresponds to a different implementation in the evaluator. While yes, syntactically, one doesn't need to use the LATERAL keyword, it has semantic implementations. Upon analyzing the plan and seeing no references to bindings from the LHS, we can potentially use the normal join operator (and not the lateral join).

If we ever want to support that in the future, what would we call this plan node in the future? non_lateral_join? Since join is already taken and is used to semantically denote a lateral join?

Especially given that we are a library where DB implementers choose which features to support. Does their execution allow for joins? DynamoDB doesn't. What about lateral joins? Does every implementation support lateral joins in their execution? I'm not sure -- but I do know that lateral joins are expensive due to their correlated nature. I believe it limits the implementations that one can use. If I'm not mistaken, you are forced to use a variation of a nested loop join.

Copy link
Contributor

@yliuuuu yliuuuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good to me.

Had some small comments around test cases and kdoc.

Comment on lines +264 to +277
// RIGHT OUTER JOIN -- No Matches
SuccessTestCase(
input = """
SELECT VALUE [lhs, rhs]
FROM << 0, 1, 2 >> lhs
RIGHT OUTER JOIN << 3, 4, 5 >> rhs
ON lhs = rhs
""".trimIndent(),
expected = bagValue(
listValue(int32Value(null), int32Value(3)),
listValue(int32Value(null), int32Value(4)),
listValue(int32Value(null), int32Value(5)),
)
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a couple tests for full outer join as well?

Also, would be nice if we can test the lateral properties of Left Join, and set up baseline for non-lateral behavior for Right and Full Join. Not sure if those should go to the evaluation test or the planning test (if not already exist)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using the conformance tests for full outer join -- but I just updated and pushed additional tests for planning.

Copy link
Contributor

@yliuuuu yliuuuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional test cases.

@johnedquinn johnedquinn merged commit 35271b1 into partiql:v1 Jul 12, 2024
7 checks passed
@johnedquinn johnedquinn deleted the v1-conformance-join branch July 12, 2024 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants