Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some utilities in std.array: remove_duplicates and filter_map #1958

Open
Danl2620 opened this issue Jun 13, 2024 · 5 comments
Open

Some utilities in std.array: remove_duplicates and filter_map #1958

Danl2620 opened this issue Jun 13, 2024 · 5 comments
Labels

Comments

@Danl2620
Copy link

Is your feature request related to a problem? Please describe.

I need to take values that fit in to nice high-level schema and re-organize them for ad-hoc "back ends" like Ansible inventories. Some more array processing functions would help in this transformation.

Describe the solution you'd like

  • A function std.array.remove_duplicates. This could be written to take an equality function and likely have the type forall a (a -> a -> Bool) -> Array a-> Array a. It could also use built-in equality testing for a simpler definition like forall a Array a -> Array a.
  • A function std.array.filter_map. I'm not sure if the type of function is express-able -- i.e. it would need a type like forall a b. (a -> b) -> Array a -> Array b but b could be a Bool (with the only acceptable value being false).

Describe alternatives you've considered

I have workarounds that bloat the code a bit.

Additional context

Hopefully these are fairly understood functions -- they definitely exist in other functional languages. The type system of Nickel is a bit different than what I'm used to so perhaps the types would need adjustment.

@jneem
Copy link
Member

jneem commented Jun 13, 2024

I think the signature for std.array.filter_map should be something like forall a b. (a -> [| 'Some b, 'None |]) -> Array a -> Array b. (Or Just / Nothing)

@thufschmitt
Copy link
Contributor

Quick passing thoughts on these (mostly from a performance pov):

  • std.array.remove_duplicates is a dangerous thing because it's quadratic in complexity. Maybe a version that sorts would be better (I know that Nixpkgs' performance suffered quite a bit from an abusive usage of an equivalent function until it got replaced by an $n*log(n)$ sorting equivalent).
  • std.array.filter_map is kinda expressible already through std.array.flat_map. Not exactly the same semantics, but at least it means that it can probably be defined in client code at very low cost:
    let
      filter_map : forall a b. (a -> [| 'Some b, 'None |]) -> Array a -> Array b = fun f =>
        std.array.flat_map (fun x => f x |> match { 'Some y => [y], 'None => [] })
    in
    filter_map (fun x => if x % 2 == 0 then 'Some (x / 2) else 'None) [ 2, 2, 3, 5 ,3, 4, 9 ,0 ]
    

@yannham
Copy link
Member

yannham commented Jun 26, 2024

std.array.remove_duplicates is a dangerous thing because it's quadratic in complexity. Maybe a version that sorts would be better (I know that Nixpkgs' performance suffered quite a bit from an abusive usage of an equivalent function until it got replaced by an sorting equivalent).

By the way, does Nix do anything fancy for sorting? I have a vague idea of sorting having a really bad performance in early Nickel. You'd like to have it as a fast primitive (using, say, Rust's sorting algorithm), but preserving laziness plus the fact that you take a user-defined order makes it quite hard to do so.

std.array.filter_map is kinda expressible already through std.array.flat_map. Not exactly the same semantics, but at least it means that it can probably be defined in client code at very low cost:

While I agree, IMHO it's still a common and useful addition that could be reused across Nickel projects. In the end many std.array functions are just a few lines of glue code on top of core combinators like maps and folds.

@thufschmitt
Copy link
Contributor

does Nix do anything fancy for sorting? I have a vague idea of sorting having a really bad performance in early Nickel. You'd like to have it as a fast primitive (using, say, Rust's sorting algorithm), but preserving laziness plus the fact that you take a user-defined order makes it quite hard to do so.

It mostly falls-back to the STL sort algorithm. The main fancy thing is an optimisation to bypass the language layer if the ordering is the built-in one: https://github.com/tweag/nix/blob/84aa8e9f196e50549aa5cdef76f61f442ce40b82/src/libexpr/primops.cc#L3305-L3310

(note also that the built-in comparison operator of Nix is more permissive than the Nickel one as it accepts anything except records and functions)

@yannham
Copy link
Member

yannham commented Jun 26, 2024

It mostly falls-back to the STL sort algorithm. The main fancy thing is an optimisation to bypass the language layer if the ordering is the built-in one: https://github.com/tweag/nix/blob/84aa8e9f196e50549aa5cdef76f61f442ce40b82/src/libexpr/primops.cc#L3305-L3310

I see. Using the standard sort algorithm is easier in Nix because the interpreter just use mutually recursive functions to force values. In Nickel, as we have a stack machine (in practice a loop and a heap-allocated stack), its more annoying. But I suppose we could make it work by nesting call to vm.eval or something.

@yannham yannham added area:stdlib good first issue Good for newcomers labels Jul 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants