Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New API to convert directly between lazy and short bytestrings #619

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vdukhovni
Copy link
Contributor

No description provided.

Data/ByteString/Short/Internal.hs Outdated Show resolved Hide resolved
Data/ByteString/Short/Internal.hs Outdated Show resolved Hide resolved
-> ByteString -- ^ initial strict chunk
-> LBS.ByteString -- ^ remaining chunks
-> IO (MutableByteArray RealWorld) -- ^ possibly resized result
lazyToShortIO mba used size c@(BS fptr len) cs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better not to pass size, but rather extract it every time with getSizeofMutableByteArray#? Admittedly this would require IO before any pattern-matching begins...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a purely internal helper function, we could even move it into a where clause, to really make sure there are no unexpected calls. So I don't think such caution is warranted, I was much more concerned with handling integer arithmetic overflow, which I believe is handled correctly.

@Bodigrim Bodigrim requested a review from clyring October 11, 2023 20:30
Copy link
Member

@clyring clyring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what's going on with CI.

I don't mind this style of concatenation seen in lazyToShort, but will point out that every other concatenation-like function in the package calculates the total length first and allocates just one buffer.

-> LBS.ByteString -- ^ remaining chunks
-> IO (MutableByteArray RealWorld) -- ^ possibly resized result
lazyToShortIO mba used size c@(BS fptr len) cs
-- Safe against overflow since used <= size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- Safe against overflow since used <= size.
-- Safe against overflow since 0 <= used <= size.

...or we could just say let newUsed = checkedAdd "Short.lazyToShort" used len. (This does have the marginal downside of checking for overflow in the no-new-buffer case.)

Another option is to allow used+len to signed-overflow, but check intToWord newUsed <= intToWord size since it cannot unsigned-overflow.

-> lazyToShortIO mba (used + len) size c' cs'
-- Care to detect possible 'Int' arithmetic overflow.
| used < used + len
, let newsize = max (used + len) (size + size `quot` 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • If (size + size `quot` 2) overflows and becomes negative, this will behave degenerately and cause O(n²) allocation for inputs that narrowly fit into the Int# size limit.
  • Signed (`div` 2) will compile to an arithmetic right shift; signed (`quot` 2) needs an extra shift and add/subtract to fix up negative odd inputs. (Or (`shiftR` 1) makes the performance question much easier.)

Data/ByteString/Short/Internal.hs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants