Updates partiql-eval APIs to Java and split compile phase #1618

RCHowell · 2024-10-11T22:07:31Z

Description

As part of the user-defined operators work, I've factored out the compiler from evaluation and updated the APIs to Java.

Other Information

Updated Unreleased Section in CHANGELOG: NO
Any backward-incompatible changes? NO
Any new external dependencies? NO
Do your changes comply with the Contributing Guidelines
and Code Style Guidelines? YES

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

github-actions · 2024-10-11T22:15:26Z

CROSS-ENGINE-REPORT ❌

	BASE (LEGACY-V0.14.8)	TARGET (EVAL-ABD8D66)	+/-
% Passing	89.67%	94.22%	4.55% ✅
Passing	5287	5555	268 ✅
Failing	609	54	-555 ✅
Ignored	0	287	287 🔶
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: v0.14.8
Base Engine: LEGACY
Target Commit: abd8d66
Target Engine: EVAL

Result Details

❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
Passing in both: 2642
Failing in both: 18
Ignored in both: 0
PASSING in BASE but now FAILING in TARGET: 1
PASSING in BASE but now IGNORED in TARGET: 111
FAILING in BASE but now PASSING in TARGET: 179
IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see

data type mismatch in logical expression, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-ABD8D66). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

	BASE (EVAL-9CFBC51)	TARGET (EVAL-ABD8D66)	+/-
% Passing	94.22%	94.22%	0.00% ✅
Passing	5555	5555	0 ✅
Failing	54	54	0 ✅
Ignored	287	287	0 ✅
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: 9cfbc51
Base Engine: EVAL
Target Commit: abd8d66
Target Engine: EVAL

Result Details

Passing in both: 5555
Failing in both: 54
Ignored in both: 287
PASSING in BASE but now FAILING in TARGET: 0
PASSING in BASE but now IGNORED in TARGET: 0
FAILING in BASE but now PASSING in TARGET: 0
IGNORED in BASE but now PASSING in TARGET: 0

github-actions · 2024-10-11T22:36:12Z

CROSS-ENGINE-REPORT ❌

	BASE (LEGACY-V0.14.8)	TARGET (EVAL-4D21E82)	+/-
% Passing	89.67%	94.22%	4.55% ✅
Passing	5287	5555	268 ✅
Failing	609	54	-555 ✅
Ignored	0	287	287 🔶
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: v0.14.8
Base Engine: LEGACY
Target Commit: 4d21e82
Target Engine: EVAL

Result Details

❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
Passing in both: 2642
Failing in both: 18
Ignored in both: 0
PASSING in BASE but now FAILING in TARGET: 1
PASSING in BASE but now IGNORED in TARGET: 111
FAILING in BASE but now PASSING in TARGET: 179
IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see

data type mismatch in logical expression, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-4D21E82). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

	BASE (EVAL-9CFBC51)	TARGET (EVAL-4D21E82)	+/-
% Passing	94.22%	94.22%	0.00% ✅
Passing	5555	5555	0 ✅
Failing	54	54	0 ✅
Ignored	287	287	0 ✅
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: 9cfbc51
Base Engine: EVAL
Target Commit: 4d21e82
Target Engine: EVAL

Result Details

Passing in both: 5555
Failing in both: 54
Ignored in both: 287
PASSING in BASE but now FAILING in TARGET: 0
PASSING in BASE but now IGNORED in TARGET: 0
FAILING in BASE but now PASSING in TARGET: 0
IGNORED in BASE but now PASSING in TARGET: 0

RCHowell · 2024-10-14T22:02:18Z

Follow-up includes the "operator" package for custom physical operators. https://github.com/partiql/partiql-lang-kotlin/tree/v1-udop-strat/partiql-eval/src/main/java/org/partiql/eval/operator

github-actions · 2024-10-15T22:05:38Z

CROSS-ENGINE-REPORT ❌

	BASE (LEGACY-V0.14.8)	TARGET (EVAL-5BF1CE5)	+/-
% Passing	89.67%	94.22%	4.55% ✅
Passing	5287	5555	268 ✅
Failing	609	54	-555 ✅
Ignored	0	287	287 🔶
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: v0.14.8
Base Engine: LEGACY
Target Commit: 5bf1ce5
Target Engine: EVAL

Result Details

❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
Passing in both: 2642
Failing in both: 18
Ignored in both: 0
PASSING in BASE but now FAILING in TARGET: 1
PASSING in BASE but now IGNORED in TARGET: 111
FAILING in BASE but now PASSING in TARGET: 179
IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see

data type mismatch in logical expression, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-5BF1CE5). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

	BASE (EVAL-9CFBC51)	TARGET (EVAL-5BF1CE5)	+/-
% Passing	94.22%	94.22%	0.00% ✅
Passing	5555	5555	0 ✅
Failing	54	54	0 ✅
Ignored	287	287	0 ✅
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: 9cfbc51
Base Engine: EVAL
Target Commit: 5bf1ce5
Target Engine: EVAL

Result Details

Passing in both: 5555
Failing in both: 54
Ignored in both: 287
PASSING in BASE but now FAILING in TARGET: 0
PASSING in BASE but now IGNORED in TARGET: 0
FAILING in BASE but now PASSING in TARGET: 0
IGNORED in BASE but now PASSING in TARGET: 0

johnedquinn

Still going through the PR, but leaving preliminary comments.

johnedquinn · 2024-10-16T17:06:50Z

partiql-eval/src/main/java/org/partiql/eval/Statement.java

+public interface Statement {
+
+    /**
+     * Executes the statement with no parameters.
+     *
+     * @return Datum execution result.
+     */
+    @NotNull
+    public Datum execute();
+}


Personally, I like some of the modeling exposed by JDBC. They have a sort of fat interface, but they don't expose a type -- which would actually prove quite useful for customers.

Suggested change

public interface Statement {

/**

* Executes the statement with no parameters.

*

* @return Datum execution result.

*/

@NotNull

public Datum execute();

}

public interface PreparedStatement {

/**

* The type of the executable statement.

*/

int type;

/**

* If a prepared statement is of this type, [executeQuery] may be invoked.

*/

public static final int QUERY = 1;

/**

* Executes the query.

*

* @return Datum execution result.

*/

@NotNull

public Datum executeQuery();

/**

* Executes an UPDATE.

*

* @return UpdateDetails the details of the update

*/

@NotNull

public UpdateDetails executeUpdate();

}

Also, I don't think we need these comments regarding parameters. We've discussed this, but you can't plan without the parameters already injected, the parameters are (likely) a JDBC library feature (before sending the request to the server) and are outside of the compiler/engine.

but you can't plan without the parameters already injected

You can, you just need the number of parameters and optionally the types (aka the formal arguments of a function).

(likely) a JDBC library feature

Not a JDBC thing because these parameters aren't stringified. Stringified params makes sense for JDBC because it supports multiple backends, but I think my other answers show how this adds formal arguments to a compiled statement.

ACK on the factoring of prepared statement being fat-interface like.

Was looking into the JDBC prepared statement. We can't have an uninitialized field like in your code sample, but they may be something more interesting here. The use of executeQuery() vs executeUpdate() have to do with differing function return types; the former returns a ResultSet and the latter an int. There's also executeLargeUpdate() which returns a long. It also reads like the "executeUpdate" was added first, customers realized the limitation/bug, and the executeLargeUpdate() had to be added because a function cannot be defined with two different return types. This seems to indicate that the PreparedStatement methods don't necessarily correspond to the underlying SQL statement type, but are more about the return type.

We have two options here:

Datum can represent both a ResultSet and an int, long, etc; so no need to have different execution return types.

We introduce generics so we have Statement<T> with T execute().

I'm working on the mental model of "A PartiQL compiler returns a statement which can be invoked with zero or more parameters". It produces functions from the input environment and input string, then you can invoke that function.

Interestingly the JDBC statement has no notion of a statement type. I'm not against fat interfaces, but we have to keep in mind that they were a solution before generics existed and are advised against in OO design.

johnedquinn · 2024-10-16T17:10:37Z

partiql-eval/src/main/java/org/partiql/eval/PartiQLEvaluator.java

+/**
+ * TODO JAVADOC
+ */
+public interface PartiQLEvaluator {


I'm not convinced we need this -- since the StandardEvaluator implementation is a single line.

This is a "core" package. I'm of the opinion that some other package could provide easier-to-use APIs. If this is the intention, maybe in partiql-lang, we expose an API that takes in a String query and returns the result.

The standard may be a single line, but consider introducing a plan cache. It's not necessarily about an easier API, but about splitting the responsibility of the preparation phase from invoking.

The compiler returns a function, the evaluator invokes that function. I could put these both into PartiQLEvaluator, but then we'd have to extend the "evaluator" with compilation strategies for choosing physical operators.

So it's really not about a "easier API" but a responsibility split, where the default custom need not interact with the compilation phase.

It's not necessarily about an easier API, but about splitting the responsibility of the preparation phase from invoking... The compiler returns a function, the evaluator invokes that function.

In this implementation, the PartiQLEvaluator doesn't invoke the function returned by the PartiQLCompiler. The responsibilities here overlap. The PartiQLEvaluator here compiles the plan and executes. The PartiQLCompiler here compiles the plan and does not execute.

The compiler returns a compiled "function" aka a prepared statement to be invoked via "execute". The PartiQLEvaluator does call the "function/statement" returned by the compiler.

The responsibility of the evaluator indeed overlaps the compiler by design. The split exists so that customers can supply their own compilation phase, but don't need to.

The idea here was to make an entry-point for custom compilation, but independent of the evaluator. Do you have some ideas on how to best split these responsibilities?

We could also make evaluator drive the datum for customers who don't want the lazy behavior?

johnedquinn · 2024-10-16T17:13:16Z

partiql-eval/src/main/java/org/partiql/eval/Mode.java

+    /**
+     * Strict execution mode.
+     */
+    public static final int STRICT = 0;


Nice! We've been leaving a spot for UNKNOWN = 0 which may help with protecting against forward compatibility. You don't need to add it here, but you could make STRICT = 1 and PERMISSIVE = 2.

johnedquinn · 2024-10-16T17:32:16Z

partiql-eval/src/main/kotlin/org/partiql/eval/internal/Parameters.kt

+/**
+ * Statement execution parameters.
+ */
+internal class Parameters(private val values: Array<Datum>) {


Mentioned in other comment, but I don't believe this is necessary.

Ack, I can remove this, but just letting you know it's not coming from JDBC.

You can always plan a "parameterized" node as dynamic which would allow repeated execution of prepared statements with different parameters. This should seriously cut down on a lot of unnecessary work, but we could even do better.

You also don't need any info about the parameter other than the type, so you can always plan with parameters which are only defined by their types. Then you can cache prepared plans and invoke with the differing parameters. Recall that the ability to do so was a direct ask from customers who wanted to plan once and repeatedly execute while changing some parameter values.

You can think of the prepared statement as producing a function that can be invoked. As-is, you are unable to provide any inputs to that function. In other words, our query compiles to fun query(): Datum but parameters allows us to compile the query to fun query(p1: T1, p2: T2, ...): Datum.

The direct ask was resolved using our plugin model -- and the executable maintained a reference to a table (which was mutable by design). This might just be a preemptive optimization, since we don't allow for parameters in the plan itself (yet?).

I see, use a mutable table rather than an execution parameter. That works for the tools available, and doesn't need any other changes and achieves the same thing but I would push back on compiling over a mutable value. I understand it solved the problem with what is available, but that is like using mutable pointers inside a function call and replacing the data at the pointer before invoking again, rather than just passing the new state.

I can remove this and may tinker with helpers in partiql-lang but the grammar and AST have parameter expressions. Also this could be added later. Parameters are just missing from v1 planner/plans afaik

Updates partiql-eval APIs to Java and split compile phase

ecd8b5a

RCHowell requested a review from johnedquinn October 11, 2024 22:07

Fix nested aggregate scope

bee3776

RCHowell added 2 commits October 14, 2024 12:21

Adds operator interface to partiql-plan for udops

8b59df7

Creates some public Java APIs for udops

1114fb8

RCHowell added 2 commits October 15, 2024 13:34

Backup

b1ea09b

Unify rex/rel to create operator trees

8e9a540

johnedquinn reviewed Oct 16, 2024

View reviewed changes

RCHowell added 2 commits October 17, 2024 16:02

Updates based on PR feedback

7835732

Merge v1-udop-strat

8cf5e65

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updates partiql-eval APIs to Java and split compile phase #1618

Updates partiql-eval APIs to Java and split compile phase #1618

RCHowell commented Oct 11, 2024

github-actions bot commented Oct 11, 2024 •

edited

Loading

github-actions bot commented Oct 11, 2024

RCHowell commented Oct 14, 2024

github-actions bot commented Oct 15, 2024

johnedquinn left a comment

johnedquinn Oct 16, 2024

RCHowell Oct 16, 2024

RCHowell Oct 17, 2024

johnedquinn Oct 16, 2024

RCHowell Oct 16, 2024

johnedquinn Oct 16, 2024

RCHowell Oct 17, 2024

johnedquinn Oct 16, 2024

johnedquinn Oct 16, 2024

RCHowell Oct 16, 2024

johnedquinn Oct 16, 2024

RCHowell Oct 17, 2024 •

edited

Loading

-public interface Statement {
-    /**
-     * Executes the statement with no parameters.
-     *
-     * @return Datum execution result.
-     */
-    @NotNull
-    public Datum execute();
-}
+public interface PreparedStatement {
+    /**
+     * The type of the executable statement.
+     */
+    int type;
+    /**
+     * If a prepared statement is of this type, [executeQuery] may be invoked.
+     */
+    public static final int QUERY = 1;
+    /**
+     * Executes the query.
+     *
+     * @return Datum execution result.
+     */
+    @NotNull
+    public Datum executeQuery();
+    /**
+     * Executes an UPDATE.
+     *
+     * @return UpdateDetails the details of the update
+     */
+    @NotNull
+    public UpdateDetails executeUpdate();
+}

Updates partiql-eval APIs to Java and split compile phase #1618

Are you sure you want to change the base?

Updates partiql-eval APIs to Java and split compile phase #1618

Conversation

RCHowell commented Oct 11, 2024

Description

Other Information

License Information

github-actions bot commented Oct 11, 2024 • edited Loading

CROSS-ENGINE-REPORT ❌

Testing Details

Result Details

Now FAILING Tests ❌

Now IGNORED Tests ❌

Now Passing Tests

CROSS-COMMIT-REPORT ✅

Testing Details

Result Details

github-actions bot commented Oct 11, 2024

CROSS-ENGINE-REPORT ❌

Testing Details

Result Details

Now FAILING Tests ❌

Now IGNORED Tests ❌

Now Passing Tests

CROSS-COMMIT-REPORT ✅

Testing Details

Result Details

RCHowell commented Oct 14, 2024

github-actions bot commented Oct 15, 2024

CROSS-ENGINE-REPORT ❌

Testing Details

Result Details

Now FAILING Tests ❌

Now IGNORED Tests ❌

Now Passing Tests

CROSS-COMMIT-REPORT ✅

Testing Details

Result Details

johnedquinn left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RCHowell Oct 17, 2024 • edited Loading

Choose a reason for hiding this comment

github-actions bot commented Oct 11, 2024 •

edited

Loading

RCHowell Oct 17, 2024 •

edited

Loading