Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive destructuring #2027

Open
jneem opened this issue Aug 28, 2024 · 1 comment
Open

Recursive destructuring #2027

jneem opened this issue Aug 28, 2024 · 1 comment

Comments

@jneem
Copy link
Member

jneem commented Aug 28, 2024

We've had some discussion of recursive destructuring in #2010 and #1989. Opening this issue for more discussion.

Nickel currently allows recursive bindings in let blocks (opt-in, using the rec keyword), but only if all bindings are simple. As soon as destructuring is used, recursive bindings are forbidden. Possible use-cases include recursive references for default arguments:

let rec { x, y ? x } = some_record in ...

or referring to other bindings in let blocks:

let rec
  x = f y,
  { a, b } = g x
in ...

The main difficulty with this feature is that we'd need to figure out the semantics of recursive destructuring in general, and it's possible to come up with pretty complicated examples.

Haskell

I spent a little while exploring how Haskell does recursive let bindings. Their let blocks are recursive and lazy by default, and it seems like they desugar the destructuring assignments into one gigantic letrec block. For example, running ghc -ddump-ds-preopt on an input like

  Bar (a, b) = x
  z@(c, [d, [e, Bar (m, _)]]) = g b c y
  y = i z m

gives (after a bit of clean-up, and removing all the error cases in the case blocks)

letrec {
  generatedPairVar = case x of { Bar x -> x; __DEFAULT -> fail }
  a = case generatedPairVar of { (x, y) -> x }
  b = case generatedPairVar of { (x, y) -> y }
} in
letrec {
  generatedTupVar = case g b c y of { (c, generatedListVar) ->
    case generatedListVar of {
      d : ds -> case ds of { e : es -> case es of { e' : rest -> case e' of { Bar (m, _) -> case rest of { [] -> (z, c, d, e, m) }}}}
    }
  }
  z = case generatedTupVar of { (z, c, d, e, m) -> z }
  c = case generatedTupVar of { (z, c, d, e, m) -> c }
  d = case generatedTupVar of { (z, c, d, e, m) -> d }
  e = case generatedTupVar of { (z, c, d, e, m) -> e }
  m = case generatedTupVar of { (z, c, d, e, m) -> m }
  y = i z m
} in
...

I'm not sure how ghc decides to use two letrec blocks instead of one; changing the input around to make it more recursive forces it use only one. I also noticed that ghc likes to produce one tuple binding for each pattern binding in the let block. So rather than recursively desugar z@(c, d) = ... like

letrec {
  z = ...
  c = case z of { (c, d) -> c }
  d = case z of { (c, d -> d }
}

it first generates a pattern than destructures all the bound variables and returns them as the tuple (z, c, d), and only then does it extract them. Since tuples are lazy, I think these two approaches are equivalent.

I'm going to have a look at scala next...

@aspiwack
Copy link
Member

aspiwack commented Sep 9, 2024

If you have questions about how GHC compile “binding groups” as they're called there, ask me again at the end of the week. I've spent a lot of time in this code this year (right now I'm drowning in emails, so not the best time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants