Ticket #84 (new defect)

Opened 2 months ago

Last modified 1 month ago

epoll causes thread priority issues in multi thread Ruby

Reported by: rogerdpack Assigned to:
Priority: major Milestone: EventMachine "Framework"
Keywords: Cc: raggi, rogerdpack

Description

At least with 1.8.6, if you run it with epoll, the epoll thread seems to hog almost all the cpu time.

Change History

09/25/08 17:45:56 changed by tmm1

  • cc set to raggi.
  • milestone set to EventMachine "Framework".

Could you elaborate? Are you talking about the following behavior:

$ time ruby -rubygems -e "require 'eventmachine'; EM.epoll; EM.run{ EM.add_timer(0){ Thread.new{ 100.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m5.382s
user	0m0.056s
sys	0m0.004s

$ time ruby -rubygems -e "require 'eventmachine'; EM.run{ EM.add_timer(0){ Thread.new{ 100.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m0.250s
user	0m0.048s
sys	0m0.016s

If so, you can try the following patch:

diff --git a/ext/em.cpp b/ext/em.cpp
index 4094755..0ebc2ca 100644
--- a/ext/em.cpp
+++ b/ext/em.cpp
@@ -425,8 +425,19 @@ bool EventMachine_t::_RunEpollOnce()
 {
        #ifdef HAVE_EPOLL
        assert (epfd != -1);
+
+       // default epoll timeout of -1 (wait indefinitely for an event)
+       // if there are ruby threads or pending timers, use a timeout of 0 (return immediately)
+       int timeout = -1;
+       #ifdef BUILD_FOR_RUBY
+       if (!rb_thread_alone())
+               timeout = 0;
+       #endif
+       if (Timers.size() > 0)
+               timeout = 0;
+
        struct epoll_event ev [MaxEpollDescriptors];
-       int s = epoll_wait (epfd, ev, MaxEpollDescriptors, 50);
+       int s = epoll_wait (epfd, ev, MaxEpollDescriptors, timeout);
        if (s > 0) {
                for (int i=0; i < s; i++) {
                        EventableDescriptor *ed = (EventableDescriptor*) ev[i].data.ptr;
@@ -511,8 +522,12 @@ bool EventMachine_t::_RunEpollOnce()
                }
        }
 
-       timeval tv = {0,0};
-       EmSelect (0, NULL, NULL, NULL, &tv);
+       #ifdef BUILD_FOR_RUBY
+       if (!rb_thread_alone()) {
+               timeval tv = {0,0};
+               rb_thread_select (0, NULL, NULL, NULL, &tv);
+       }
+       #endif
 
        return true;
        #else

10/10/08 17:46:24 changed by tmm1

  • cc changed from raggi to raggi, rogerdpack.

Before Patch:

$ time ruby -rubygems -e "require 'eventmachine'; EM.epoll; EM.run{ EM.add_timer(0){ Thread.new{ 100.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m5.333s
user	0m0.048s
sys	0m0.016s

$ time ruby -rubygems -e "require 'eventmachine'; EM.run{ EM.add_timer(0){ Thread.new{ 100.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m0.249s
user	0m0.044s
sys	0m0.020s

After Patch:

$ time ruby -rubygems -e "require 'eventmachine'; EM.epoll; EM.run{ EM.add_timer(0){ Thread.new{ 100.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m0.084s
user	0m0.080s
sys	0m0.004s

$ time ruby -rubygems -e "require 'eventmachine'; EM.run{ EM.add_timer(0){ Thread.new{ 100.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m0.242s
user	0m0.048s
sys	0m0.012s

10/10/08 18:26:13 changed by tmm1

And again with 100k iterations:

Before Patch:

$ time ruby -rubygems -e "require 'eventmachine'; EM.epoll; EM.run{ EM.add_timer(0){ Thread.new{ 100_000.times{|i| p(i)}; EM.stop } } }" > /dev/null

(took too long: epoll_wait is called with a 50 millisecond timeout, multiplied by 100k iterations = 83 minutes)

$ time ruby -rubygems -e "require 'eventmachine'; EM.run{ EM.add_timer(0){ Thread.new{ 100_000.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m0.522s
user	0m0.236s
sys	0m0.108s

After Patch:

$ time ruby -rubygems -e "require 'eventmachine'; EM.epoll; EM.run{ EM.add_timer(0){ Thread.new{ 100_000.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m21.694s
user	0m20.521s
sys	0m0.244s

$ time ruby -rubygems -e "require 'eventmachine'; EM.run{ EM.add_timer(0){ Thread.new{ 100_000.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m0.521s
user	0m0.264s
sys	0m0.080s

10/10/08 19:10:52 changed by tmm1

Updated patch, waits 10 milliseconds when only timers exist to give them a chance to expire, and uses rb_thread_schedule:

diff --git a/ext/em.cpp b/ext/em.cpp
index f7339ee..380e4e9 100644
--- a/ext/em.cpp
+++ b/ext/em.cpp
@@ -425,8 +425,20 @@ bool EventMachine_t::_RunEpollOnce()
 {
        #ifdef HAVE_EPOLL
        assert (epfd != -1);
+
+       // default epoll timeout of -1 (wait indefinitely for an event)
+       // if there are pending timers, use a timeout of 10 (give timers 10 milliseconds to expire)
+       // if there are ruby threads, use a timeout of 0 (return immediately)
+       int timeout = -1;
+       if (Timers.size() > 0)
+               timeout = 10;
+       #ifdef BUILD_FOR_RUBY
+       if (!rb_thread_alone())
+               timeout = 0;
+       #endif
+
        struct epoll_event ev [MaxEpollDescriptors];
-       int s = epoll_wait (epfd, ev, MaxEpollDescriptors, 50);
+       int s = epoll_wait (epfd, ev, MaxEpollDescriptors, timeout);
        if (s > 0) {
                for (int i=0; i < s; i++) {
                        EventableDescriptor *ed = (EventableDescriptor*) ev[i].data.ptr;
@@ -511,8 +523,11 @@ bool EventMachine_t::_RunEpollOnce()
                }
        }
 
-       timeval tv = {0,0};
-       EmSelect (0, NULL, NULL, NULL, &tv);
+       #ifdef BUILD_FOR_RUBY
+       if (!rb_thread_alone()) {
+               rb_thread_schedule();
+       }
+       #endif
 
        return true;
        #else

rb_thread_schedule makes it a little bit faster:

$ time ruby -rubygems -e "require 'eventmachine'; EM.epoll; EM.run{ EM.add_timer(0){ Thread.new{ 100_000.times{|i| p(i)}; EM.stop } } }" > /dev/null

real	0m19.932s
user	0m19.733s
sys	0m0.172s