Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates partiql-eval APIs to Java and split compile phase #1618

Open
wants to merge 8 commits into
base: v1
Choose a base branch
from

Conversation

RCHowell
Copy link
Contributor

Description

As part of the user-defined operators work, I've factored out the compiler from evaluation and updated the APIs to Java.

Other Information

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link

github-actions bot commented Oct 11, 2024

CROSS-ENGINE-REPORT ❌

BASE (LEGACY-V0.14.8) TARGET (EVAL-ABD8D66) +/-
% Passing 89.67% 94.22% 4.55% ✅
Passing 5287 5555 268 ✅
Failing 609 54 -555 ✅
Ignored 0 287 287 🔶
Total Tests 5896 5896 0 ✅

Testing Details

  • Base Commit: v0.14.8
  • Base Engine: LEGACY
  • Target Commit: abd8d66
  • Target Engine: EVAL

Result Details

  • ❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
  • Passing in both: 2642
  • Failing in both: 18
  • Ignored in both: 0
  • PASSING in BASE but now FAILING in TARGET: 1
  • PASSING in BASE but now IGNORED in TARGET: 111
  • FAILING in BASE but now PASSING in TARGET: 179
  • IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see
  1. data type mismatch in logical expression, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-ABD8D66). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

BASE (EVAL-9CFBC51) TARGET (EVAL-ABD8D66) +/-
% Passing 94.22% 94.22% 0.00% ✅
Passing 5555 5555 0 ✅
Failing 54 54 0 ✅
Ignored 287 287 0 ✅
Total Tests 5896 5896 0 ✅

Testing Details

  • Base Commit: 9cfbc51
  • Base Engine: EVAL
  • Target Commit: abd8d66
  • Target Engine: EVAL

Result Details

  • Passing in both: 5555
  • Failing in both: 54
  • Ignored in both: 287
  • PASSING in BASE but now FAILING in TARGET: 0
  • PASSING in BASE but now IGNORED in TARGET: 0
  • FAILING in BASE but now PASSING in TARGET: 0
  • IGNORED in BASE but now PASSING in TARGET: 0

Copy link

CROSS-ENGINE-REPORT ❌

BASE (LEGACY-V0.14.8) TARGET (EVAL-4D21E82) +/-
% Passing 89.67% 94.22% 4.55% ✅
Passing 5287 5555 268 ✅
Failing 609 54 -555 ✅
Ignored 0 287 287 🔶
Total Tests 5896 5896 0 ✅

Testing Details

  • Base Commit: v0.14.8
  • Base Engine: LEGACY
  • Target Commit: 4d21e82
  • Target Engine: EVAL

Result Details

  • ❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
  • Passing in both: 2642
  • Failing in both: 18
  • Ignored in both: 0
  • PASSING in BASE but now FAILING in TARGET: 1
  • PASSING in BASE but now IGNORED in TARGET: 111
  • FAILING in BASE but now PASSING in TARGET: 179
  • IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see
  1. data type mismatch in logical expression, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-4D21E82). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

BASE (EVAL-9CFBC51) TARGET (EVAL-4D21E82) +/-
% Passing 94.22% 94.22% 0.00% ✅
Passing 5555 5555 0 ✅
Failing 54 54 0 ✅
Ignored 287 287 0 ✅
Total Tests 5896 5896 0 ✅

Testing Details

  • Base Commit: 9cfbc51
  • Base Engine: EVAL
  • Target Commit: 4d21e82
  • Target Engine: EVAL

Result Details

  • Passing in both: 5555
  • Failing in both: 54
  • Ignored in both: 287
  • PASSING in BASE but now FAILING in TARGET: 0
  • PASSING in BASE but now IGNORED in TARGET: 0
  • FAILING in BASE but now PASSING in TARGET: 0
  • IGNORED in BASE but now PASSING in TARGET: 0

@RCHowell
Copy link
Contributor Author

Follow-up includes the "operator" package for custom physical operators. https://github.com/partiql/partiql-lang-kotlin/tree/v1-udop-strat/partiql-eval/src/main/java/org/partiql/eval/operator

Copy link

CROSS-ENGINE-REPORT ❌

BASE (LEGACY-V0.14.8) TARGET (EVAL-5BF1CE5) +/-
% Passing 89.67% 94.22% 4.55% ✅
Passing 5287 5555 268 ✅
Failing 609 54 -555 ✅
Ignored 0 287 287 🔶
Total Tests 5896 5896 0 ✅

Testing Details

  • Base Commit: v0.14.8
  • Base Engine: LEGACY
  • Target Commit: 5bf1ce5
  • Target Engine: EVAL

Result Details

  • ❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
  • Passing in both: 2642
  • Failing in both: 18
  • Ignored in both: 0
  • PASSING in BASE but now FAILING in TARGET: 1
  • PASSING in BASE but now IGNORED in TARGET: 111
  • FAILING in BASE but now PASSING in TARGET: 179
  • IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

Click here to see
  1. data type mismatch in logical expression, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-5BF1CE5). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

BASE (EVAL-9CFBC51) TARGET (EVAL-5BF1CE5) +/-
% Passing 94.22% 94.22% 0.00% ✅
Passing 5555 5555 0 ✅
Failing 54 54 0 ✅
Ignored 287 287 0 ✅
Total Tests 5896 5896 0 ✅

Testing Details

  • Base Commit: 9cfbc51
  • Base Engine: EVAL
  • Target Commit: 5bf1ce5
  • Target Engine: EVAL

Result Details

  • Passing in both: 5555
  • Failing in both: 54
  • Ignored in both: 287
  • PASSING in BASE but now FAILING in TARGET: 0
  • PASSING in BASE but now IGNORED in TARGET: 0
  • FAILING in BASE but now PASSING in TARGET: 0
  • IGNORED in BASE but now PASSING in TARGET: 0

Copy link
Member

@johnedquinn johnedquinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still going through the PR, but leaving preliminary comments.

Comment on lines 11 to 20
public interface Statement {

/**
* Executes the statement with no parameters.
*
* @return Datum execution result.
*/
@NotNull
public Datum execute();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I like some of the modeling exposed by JDBC. They have a sort of fat interface, but they don't expose a type -- which would actually prove quite useful for customers.

Suggested change
public interface Statement {
/**
* Executes the statement with no parameters.
*
* @return Datum execution result.
*/
@NotNull
public Datum execute();
}
public interface PreparedStatement {
/**
* The type of the executable statement.
*/
int type;
/**
* If a prepared statement is of this type, [executeQuery] may be invoked.
*/
public static final int QUERY = 1;
/**
* Executes the query.
*
* @return Datum execution result.
*/
@NotNull
public Datum executeQuery();
/**
* Executes an UPDATE.
*
* @return UpdateDetails the details of the update
*/
@NotNull
public UpdateDetails executeUpdate();
}

Also, I don't think we need these comments regarding parameters. We've discussed this, but you can't plan without the parameters already injected, the parameters are (likely) a JDBC library feature (before sending the request to the server) and are outside of the compiler/engine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but you can't plan without the parameters already injected

You can, you just need the number of parameters and optionally the types (aka the formal arguments of a function).

(likely) a JDBC library feature

Not a JDBC thing because these parameters aren't stringified. Stringified params makes sense for JDBC because it supports multiple backends, but I think my other answers show how this adds formal arguments to a compiled statement.

ACK on the factoring of prepared statement being fat-interface like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was looking into the JDBC prepared statement. We can't have an uninitialized field like in your code sample, but they may be something more interesting here. The use of executeQuery() vs executeUpdate() have to do with differing function return types; the former returns a ResultSet and the latter an int. There's also executeLargeUpdate() which returns a long. It also reads like the "executeUpdate" was added first, customers realized the limitation/bug, and the executeLargeUpdate() had to be added because a function cannot be defined with two different return types. This seems to indicate that the PreparedStatement methods don't necessarily correspond to the underlying SQL statement type, but are more about the return type.

We have two options here:

  1. Datum can represent both a ResultSet and an int, long, etc; so no need to have different execution return types.
  2. We introduce generics so we have Statement<T> with T execute().

I'm working on the mental model of "A PartiQL compiler returns a statement which can be invoked with zero or more parameters". It produces functions from the input environment and input string, then you can invoke that function.

Interestingly the JDBC statement has no notion of a statement type. I'm not against fat interfaces, but we have to keep in mind that they were a solution before generics existed and are advised against in OO design.

/**
* TODO JAVADOC
*/
public interface PartiQLEvaluator {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced we need this -- since the StandardEvaluator implementation is a single line.

This is a "core" package. I'm of the opinion that some other package could provide easier-to-use APIs. If this is the intention, maybe in partiql-lang, we expose an API that takes in a String query and returns the result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard may be a single line, but consider introducing a plan cache. It's not necessarily about an easier API, but about splitting the responsibility of the preparation phase from invoking.

The compiler returns a function, the evaluator invokes that function. I could put these both into PartiQLEvaluator, but then we'd have to extend the "evaluator" with compilation strategies for choosing physical operators.

So it's really not about a "easier API" but a responsibility split, where the default custom need not interact with the compilation phase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily about an easier API, but about splitting the responsibility of the preparation phase from invoking... The compiler returns a function, the evaluator invokes that function.

In this implementation, the PartiQLEvaluator doesn't invoke the function returned by the PartiQLCompiler. The responsibilities here overlap. The PartiQLEvaluator here compiles the plan and executes. The PartiQLCompiler here compiles the plan and does not execute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler returns a compiled "function" aka a prepared statement to be invoked via "execute". The PartiQLEvaluator does call the "function/statement" returned by the compiler.

The responsibility of the evaluator indeed overlaps the compiler by design. The split exists so that customers can supply their own compilation phase, but don't need to.

The idea here was to make an entry-point for custom compilation, but independent of the evaluator. Do you have some ideas on how to best split these responsibilities?

We could also make evaluator drive the datum for customers who don't want the lazy behavior?

/**
* Strict execution mode.
*/
public static final int STRICT = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! We've been leaving a spot for UNKNOWN = 0 which may help with protecting against forward compatibility. You don't need to add it here, but you could make STRICT = 1 and PERMISSIVE = 2.

/**
* Statement execution parameters.
*/
internal class Parameters(private val values: Array<Datum>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned in other comment, but I don't believe this is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, I can remove this, but just letting you know it's not coming from JDBC.

You can always plan a "parameterized" node as dynamic which would allow repeated execution of prepared statements with different parameters. This should seriously cut down on a lot of unnecessary work, but we could even do better.

You also don't need any info about the parameter other than the type, so you can always plan with parameters which are only defined by their types. Then you can cache prepared plans and invoke with the differing parameters. Recall that the ability to do so was a direct ask from customers who wanted to plan once and repeatedly execute while changing some parameter values.

You can think of the prepared statement as producing a function that can be invoked. As-is, you are unable to provide any inputs to that function. In other words, our query compiles to fun query(): Datum but parameters allows us to compile the query to fun query(p1: T1, p2: T2, ...): Datum.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The direct ask was resolved using our plugin model -- and the executable maintained a reference to a table (which was mutable by design). This might just be a preemptive optimization, since we don't allow for parameters in the plan itself (yet?).

Copy link
Contributor Author

@RCHowell RCHowell Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, use a mutable table rather than an execution parameter. That works for the tools available, and doesn't need any other changes and achieves the same thing but I would push back on compiling over a mutable value. I understand it solved the problem with what is available, but that is like using mutable pointers inside a function call and replacing the data at the pointer before invoking again, rather than just passing the new state.

I can remove this and may tinker with helpers in partiql-lang but the grammar and AST have parameter expressions. Also this could be added later. Parameters are just missing from v1 planner/plans afaik

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants