Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement http client timeouts using hashed wheel timer #499

Closed
wants to merge 4 commits into from

Conversation

kachayev
Copy link
Collaborator

The motivation behind it was described here.

http/request now uses the default shared timer or provided as a parameter. On higher loads it makes sense to use different timeouts instead of a single instance even tho' hashed wheel timer is designed to handle thousands of simultaneous tasks with almost no visible performance degradation. It would be probably nicer to have timer being attached to a specific connections pool, but the current implementation does not let us attach arbitrary artifacts to flow/instrumented-pool... any other approaches would either break compatibility or would look weird.

@alexander-yakushev Please, take a look when you have time. Thanks!

@alexander-yakushev
Copy link
Contributor

Looks good!

Do I understand correctly that it doesn't solve the issue that timeout callbacks could be slow/blocking, and will end up being executed by a single poor Timer thread?

Anyway, this is a strict improvement over what we have right now.

@kachayev
Copy link
Collaborator Author

@alexander-yakushev

it doesn't solve the issue that timeout callbacks could be slow

Why so? The timeout just realized the given deferred with an exception. The chain of callbacks should follow the same path it would follow otherwise. I don't see anything that might be blocking here... Am I missing something?

@kachayev
Copy link
Collaborator Author

@alexander-yakushev Would you be able to compare performance on some real-world use cases? If it works, I would update implementation with more fine-grained control over timer settings. Oh, and docs.

@@ -88,6 +88,9 @@
(def default-response-executor
(flow/utilization-executor 0.9 256 {:onto? false}))

(def default-timer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be defonce?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ztellman It definitely should! I'm waiting for the feedback from @alexander-yakushev if this even worse being changed. If so, I will complete the implementation and polishing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(defn create-timer []
(HashedWheelTimer. (enumerating-thread-factory "aleph-timer-" false)))

(defn timeout! [^HashedWheelTimer timer d delay]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like delay to be delay-ms, just so it's clear what unit we're using without looking inside the implementation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@alexander-yakushev
Copy link
Contributor

@kachayev I mean this scenario:

user=> (def my-d (d/deferred))
#'user/my-d
user=> (def other-process 
         (-> my-d 
             (d/catch identity) 
             (d/chain (fn [_] 
                        (println "Gonna block in" (Thread/currentThread)) 
                        (Thread/sleep 5000)))))
#'user/other-process
user=> (d/timeout! my-d 300)
<< … >>
Gonna block in #object[java.lang.Thread 0x6ac1e042 Thread[manifold-scheduler-pool-1,5,main]]

@kachayev
Copy link
Collaborator Author

@alexander-yakushev Should it be netty/timeout!?

@alexander-yakushev
Copy link
Contributor

alexander-yakushev commented Mar 26, 2019

@kachayev That was an example of the current behavior, and your implementation is not different in this regard.

(netty/timeout! (create-timer) my-d 500)
<< … >>
Gonna block in #object[java.lang.Thread 0x77142860 Thread[aleph-timer--1,5,main]]

I brought this up because you mentioned this yourself here #479 (comment)

@kachayev
Copy link
Collaborator Author

kachayev commented Mar 26, 2019 via email

@kachayev
Copy link
Collaborator Author

@ztellman Merged with the latest master. I also set tick duration to 10ms, to be more precise by default. I think it makes sense to manually tune this for higher loads than to keep it less precise for everyone.

@kachayev
Copy link
Collaborator Author

Even thought I still think there's a benefit of having timers optimized for highly loaded I/O environments, this specific implementation requires substantial portion of testing (ideally, in multiple production systems). Moreover, better approach would be to dispatch between timer types when necessary instead of using single timer for all needs. Happy to assist anyone who would be willing to extend the functionality. As of now, closing the PR.

@kachayev kachayev closed this Aug 12, 2020
@alexander-yakushev
Copy link
Contributor

Sorry it slipped through the cracks for me, I totally forgot this PR exists. I never tested this particular PR because we need all Manifold timeouts to be using HWT. But we have been using HWT for this for the last 1.5 years in multiple high-load projects and haven't any issues so far.

Thanks for the effort!

@kachayev
Copy link
Collaborator Author

@alexander-yakushev Not a problem at all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants