-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates partiql-eval APIs to Java and split compile phase #1618
base: v1
Are you sure you want to change the base?
Conversation
CROSS-ENGINE-REPORT ❌
Testing Details
Result Details
Now FAILING Tests ❌The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now IGNORED Tests ❌The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. Now Passing Tests179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-ABD8D66). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-REPORT ✅
Testing DetailsResult Details
|
CROSS-ENGINE-REPORT ❌
Testing Details
Result Details
Now FAILING Tests ❌The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now IGNORED Tests ❌The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. Now Passing Tests179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-4D21E82). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-REPORT ✅
Testing DetailsResult Details
|
Follow-up includes the "operator" package for custom physical operators. https://github.com/partiql/partiql-lang-kotlin/tree/v1-udop-strat/partiql-eval/src/main/java/org/partiql/eval/operator |
CROSS-ENGINE-REPORT ❌
Testing Details
Result Details
Now FAILING Tests ❌The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now IGNORED Tests ❌The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. Now Passing Tests179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-5BF1CE5). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-REPORT ✅
Testing DetailsResult Details
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still going through the PR, but leaving preliminary comments.
public interface Statement { | ||
|
||
/** | ||
* Executes the statement with no parameters. | ||
* | ||
* @return Datum execution result. | ||
*/ | ||
@NotNull | ||
public Datum execute(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I like some of the modeling exposed by JDBC. They have a sort of fat interface, but they don't expose a type
-- which would actually prove quite useful for customers.
public interface Statement { | |
/** | |
* Executes the statement with no parameters. | |
* | |
* @return Datum execution result. | |
*/ | |
@NotNull | |
public Datum execute(); | |
} | |
public interface PreparedStatement { | |
/** | |
* The type of the executable statement. | |
*/ | |
int type; | |
/** | |
* If a prepared statement is of this type, [executeQuery] may be invoked. | |
*/ | |
public static final int QUERY = 1; | |
/** | |
* Executes the query. | |
* | |
* @return Datum execution result. | |
*/ | |
@NotNull | |
public Datum executeQuery(); | |
/** | |
* Executes an UPDATE. | |
* | |
* @return UpdateDetails the details of the update | |
*/ | |
@NotNull | |
public UpdateDetails executeUpdate(); | |
} |
Also, I don't think we need these comments regarding parameters. We've discussed this, but you can't plan without the parameters already injected, the parameters are (likely) a JDBC library feature (before sending the request to the server) and are outside of the compiler/engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but you can't plan without the parameters already injected
You can, you just need the number of parameters and optionally the types (aka the formal arguments of a function).
(likely) a JDBC library feature
Not a JDBC thing because these parameters aren't stringified. Stringified params makes sense for JDBC because it supports multiple backends, but I think my other answers show how this adds formal arguments to a compiled statement.
ACK on the factoring of prepared statement being fat-interface like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was looking into the JDBC prepared statement. We can't have an uninitialized field like in your code sample, but they may be something more interesting here. The use of executeQuery() vs executeUpdate() have to do with differing function return types; the former returns a ResultSet and the latter an int. There's also executeLargeUpdate() which returns a long. It also reads like the "executeUpdate" was added first, customers realized the limitation/bug, and the executeLargeUpdate() had to be added because a function cannot be defined with two different return types. This seems to indicate that the PreparedStatement methods don't necessarily correspond to the underlying SQL statement type, but are more about the return type.
We have two options here:
- Datum can represent both a ResultSet and an int, long, etc; so no need to have different execution return types.
- We introduce generics so we have
Statement<T>
withT execute()
.
I'm working on the mental model of "A PartiQL compiler returns a statement which can be invoked with zero or more parameters". It produces functions from the input environment and input string, then you can invoke that function.
Interestingly the JDBC statement has no notion of a statement type. I'm not against fat interfaces, but we have to keep in mind that they were a solution before generics existed and are advised against in OO design.
/** | ||
* TODO JAVADOC | ||
*/ | ||
public interface PartiQLEvaluator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we need this -- since the StandardEvaluator
implementation is a single line.
This is a "core" package. I'm of the opinion that some other package could provide easier-to-use APIs. If this is the intention, maybe in partiql-lang
, we expose an API that takes in a String query
and returns the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard may be a single line, but consider introducing a plan cache. It's not necessarily about an easier API, but about splitting the responsibility of the preparation phase from invoking.
The compiler returns a function, the evaluator invokes that function. I could put these both into PartiQLEvaluator, but then we'd have to extend the "evaluator" with compilation strategies for choosing physical operators.
So it's really not about a "easier API" but a responsibility split, where the default custom need not interact with the compilation phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessarily about an easier API, but about splitting the responsibility of the preparation phase from invoking... The compiler returns a function, the evaluator invokes that function.
In this implementation, the PartiQLEvaluator doesn't invoke the function returned by the PartiQLCompiler. The responsibilities here overlap. The PartiQLEvaluator here compiles the plan and executes. The PartiQLCompiler here compiles the plan and does not execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler returns a compiled "function" aka a prepared statement to be invoked via "execute". The PartiQLEvaluator does call the "function/statement" returned by the compiler.
The responsibility of the evaluator indeed overlaps the compiler by design. The split exists so that customers can supply their own compilation phase, but don't need to.
The idea here was to make an entry-point for custom compilation, but independent of the evaluator. Do you have some ideas on how to best split these responsibilities?
We could also make evaluator drive the datum for customers who don't want the lazy behavior?
/** | ||
* Strict execution mode. | ||
*/ | ||
public static final int STRICT = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! We've been leaving a spot for UNKNOWN = 0
which may help with protecting against forward compatibility. You don't need to add it here, but you could make STRICT = 1
and PERMISSIVE = 2
.
/** | ||
* Statement execution parameters. | ||
*/ | ||
internal class Parameters(private val values: Array<Datum>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned in other comment, but I don't believe this is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, I can remove this, but just letting you know it's not coming from JDBC.
You can always plan a "parameterized" node as dynamic which would allow repeated execution of prepared statements with different parameters. This should seriously cut down on a lot of unnecessary work, but we could even do better.
You also don't need any info about the parameter other than the type, so you can always plan with parameters which are only defined by their types. Then you can cache prepared plans and invoke with the differing parameters. Recall that the ability to do so was a direct ask from customers who wanted to plan once and repeatedly execute while changing some parameter values.
You can think of the prepared statement as producing a function that can be invoked. As-is, you are unable to provide any inputs to that function. In other words, our query compiles to fun query(): Datum
but parameters allows us to compile the query to fun query(p1: T1, p2: T2, ...): Datum
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The direct ask was resolved using our plugin model -- and the executable maintained a reference to a table (which was mutable by design). This might just be a preemptive optimization, since we don't allow for parameters in the plan itself (yet?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, use a mutable table rather than an execution parameter. That works for the tools available, and doesn't need any other changes and achieves the same thing but I would push back on compiling over a mutable value. I understand it solved the problem with what is available, but that is like using mutable pointers inside a function call and replacing the data at the pointer before invoking again, rather than just passing the new state.
I can remove this and may tinker with helpers in partiql-lang but the grammar and AST have parameter expressions. Also this could be added later. Parameters are just missing from v1 planner/plans afaik
Description
As part of the user-defined operators work, I've factored out the compiler from evaluation and updated the APIs to Java.
Other Information
and Code Style Guidelines? YES
License Information
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.