This document provides a brief overview of performance within the Event Handler 7
example. This should all still apply to the Barracuda PR1 release (although I haven't
actually gone back and re-run the tests).

Event Handler 7
In Event Handler 7, most of the performance testing notes from EventHandler6 (see
below) still apply. I will take a few moments to talk about what changed.
The only real difference in EventHandler7 is that all HttpRequestEvents now implement
Polymorphic, so when an event is dispatched, we end up creating a parent event (through
Class.newInstance()) for every event in the parent chain. In other words, if you have an
event that is 10 levels removed from HttpRequestEvent, there will be 10 additional events
generated every time that event is dispatched. I was concerned that this might impair
throughput, so I created a test (under org.barracudamvc.examples.ex3) that would
allow me to test event dispatching at different levels of an event hierarchy, in order to
see how much the depth impacted throughput.
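Just to make the mechanics concrete, the parent-chain creation amounts to something like
the following. This is purely an illustrative sketch (the helper name and loop are mine,
not the actual Barracuda dispatcher code):

    import java.util.ArrayList;
    import java.util.List;

    public class ParentChainSketch {

        // rootClass stands in for HttpRequestEvent; for an event class
        // 10 levels removed, this loop creates 10 additional events.
        public static List<Object> createParentChain(Class<?> eventClass,
                                                     Class<?> rootClass) throws Exception {
            List<Object> parents = new ArrayList<Object>();
            Class<?> cl = eventClass.getSuperclass();
            while (cl != null && rootClass.isAssignableFrom(cl)) {
                parents.add(cl.newInstance());  // one extra instantiation per level
                if (cl == rootClass) break;
                cl = cl.getSuperclass();
            }
            return parents;
        }
    }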
In a nutshell, I was unable to determine any real difference, but this may have been
due to my test methodology -- I was testing across a local network against a low-powered
Linux box that was also acting as a web server to the outside world. Now, I don't believe
that this site was experiencing any significant traffic, but I did observe significant
variance in my results.
In general, when testing with 20 concurrent HTTP requests for the same event, I was
able to handle between 150-250 requests per second, regardless of whether I fired an event
that was 1 level removed from HttpRequestEvent or 10 levels removed. I did notice a
slight slowdown (perhaps more so than in EventHandler6 testing) when I was NOT using
event pooling; typically throughput dropped to 100-150 requests/second without event
pooling. Note, however, that the amount of deviation in these figures is significant (so
take them as ballpark numbers).
We really need more testing here to get a better feel for just what levels of
throughput the framework itself can actually sustain. Even if we could only support
50 requests per second, that figure would still amount to 180,000 requests per hour or
about 4.3 million requests per day (which is still pretty dang good). We should also note
that the framework is still as fundamentally scalable as it was in the previous iteration:
hitting it with 500 simultaneous event dispatching requests worked flawlessly every time.
If someone would like to do some work in the area of stress testing, I'd be more than
happy to work with you to set things up.
Event Handler 6
At this point, I have done some preliminary stress testing and verified that the basic
framework appears to hold up under significant load.
We did have a synchronization problem in the DefaultEventPool that was causing deadlock
when we ran out of available events. Basically, because the internal lock() method was
synchronizing on the lock object, nothing else could release events until the code timed
out. The solution was to modify lock() to throw a NoAvailableEventsException as soon as
it sees there aren't any events available. The outer checkoutEvent() method can then sleep
for a specified interval and then retry. After a specified number of retries, it will
simply propagate the exception on up, and the ApplicationGateway will just manually create
the event using reflection. This eliminates the deadlock because the lock object is
released in between locking attempts, which allows other threads to check events back in
during this time.
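Here's roughly what the fixed checkout logic looks like. The class, method, and exception
names come from the description above; the retry count, interval, and internal list are
assumptions I've made for illustration, not the actual DefaultEventPool source:

    import java.util.ArrayList;
    import java.util.List;

    public class EventPoolSketch {

        public static class NoAvailableEventsException extends Exception {}

        private final List<Object> available = new ArrayList<Object>();
        private static final int MAX_RETRIES = 5;        // assumed value
        private static final long RETRY_INTERVAL = 50;   // millisecs, assumed

        // Fail fast rather than block: hold the lock only long enough to
        // check the pool, then release it so other threads can check
        // events back in.
        private synchronized Object lock() throws NoAvailableEventsException {
            if (available.isEmpty()) throw new NoAvailableEventsException();
            return available.remove(available.size() - 1);
        }

        // Sleep-and-retry loop; note the lock is NOT held while sleeping.
        public Object checkoutEvent()
                throws NoAvailableEventsException, InterruptedException {
            for (int i = 0; i < MAX_RETRIES; i++) {
                try {
                    return lock();
                } catch (NoAvailableEventsException e) {
                    Thread.sleep(RETRY_INTERVAL);
                }
            }
            throw new NoAvailableEventsException();  // propagate on up
        }

        public synchronized void checkinEvent(Object event) {
            available.add(event);
        }
    }

    // Caller-side fallback, as the ApplicationGateway does in the
    // description above (again, the names here are assumptions):
    class GatewaySketch {
        public static Object getEvent(EventPoolSketch pool, Class<?> eventClass)
                throws Exception {
            try {
                return pool.checkoutEvent();
            } catch (EventPoolSketch.NoAvailableEventsException e) {
                return eventClass.newInstance();  // manual creation via reflection
            }
        }
    }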
Interestingly, the event pooling works quite effectively in terms of reducing
overhead. When testing with 20 threads (each starting 20 millisecs after the previous
one), the sample case was able to handle all 20 requests with only 3 actual event
instances. This demonstrates that the event pooling mechanism can be used effectively to
reduce the amount of resources needed to service large numbers of requests.
At the same time, the actual performance improvements that came from the Event Pooling
were not as significant as I thought. In a very non-scientific experiment, I created a
simple event handler and then (from another box) created 500 threads to issue concurrent
HTTP requests for that event. With event pooling turned on, I was able to handle all 500
requests in an average of about 1000 millisecs. With event pooling turned off, it took
about 1400 millisecs. In both of these cases, I was creating a new instance of the event
handler for each request.
If we really wanted to get a true feel for how much the event pooling actually helps,
we could create a standalone test to compare instantiation through the EventPool vs.
instantiation through reflection (something like the sketch below). The actual difference
may in fact be moot, however.
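For what it's worth, such a test might look something like this. It reuses the
hypothetical EventPoolSketch above plus a made-up SampleEvent class, and it's deliberately
crude (single-threaded, no JIT warmup control), so treat any numbers as ballpark:

    public class PoolVsReflectionTest {

        public static class SampleEvent {}  // stand-in for a real event class

        public static void main(String[] args) throws Exception {
            final int ITERATIONS = 1000000;

            // Instantiation through reflection: one newInstance() per pass.
            long start = System.currentTimeMillis();
            for (int i = 0; i < ITERATIONS; i++) {
                Object e = SampleEvent.class.newInstance();
            }
            long reflectionTime = System.currentTimeMillis() - start;

            // Instantiation through the pool: check one instance out and
            // back in on each pass.
            EventPoolSketch pool = new EventPoolSketch();
            pool.checkinEvent(new SampleEvent());
            start = System.currentTimeMillis();
            for (int i = 0; i < ITERATIONS; i++) {
                Object e = pool.checkoutEvent();
                pool.checkinEvent(e);
            }
            long poolTime = System.currentTimeMillis() - start;

            System.out.println("reflection: " + reflectionTime + " millisecs");
            System.out.println("pool:       " + poolTime + " millisecs");
        }
    }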
One of the unknown factors here is really the amount of time spent setting up and
tearing down the HTTP connection. Ultimately, this will be one of the larger bottlenecks,
as will the time it takes to query or update a database and of course the amount of
available network bandwidth. In short, even if there is a significant performance gap
between pooling and reflection, the actual difference may be inconsequential in a real
world app if there are significant costs associated with bandwidth, HTTP, and database
communication.
At this point, we really shouldn't read too much into the performance testing I've
done, other than to assert a few very obvious points:
- The event model framework appears fundamentally threadsafe - In short,
it appears to stand up under significant load. I tested up to 500 simultaneous requests,
and the framework was able to serve them all. I suspect that a majority of sites will not
need anywhere near this level of throughput. Regardless of what figure we decide
we need to hit, the fact that we can support a large number of concurrent requests
indicates that the fundamental architecture is sound.
- Initial throughput seems acceptable - From my perspective, the initial throughput
results seem fairly acceptable. If we were only able to serve 50-100 requests per second
I'd say we were off to a rough start. Of course, we really need to test these in the
context of a presentation framework in order to arrive at a true level. One of the things
we might consider as an early objective is to port the Pet Store app to various
presentation frameworks in order to accurately measure throughput with various web-app
approaches.
(Do we have a line in the sand saying "Any web-app framework must be
able to serve at least X requests per second?" For instance, even
the ability to handle a paltry 250 requests per second would translate into 15,000
requests per minute, or 900,000 requests per hour. That's a pretty hefty figure.)
- There is a slight performance benefit to EventPooling - I think we
should hold off judgement on this until we have a real world app to benchmark. In short,
while it works and that's great, I'm really beginning to suspect that the cost of
instantiating one event through reflection as opposed to caching it in an event pool is
probably inconsequential in the larger scheme of things. Now, we certainly don't want to
be creating ALL events through reflection, but early indications seem to suggest that
there may not be a whole lot of benefit to the pooling mechanism. We need to reserve
judgement on this.
- The framework is very tuneable - Given the inability to know for certain where the
performance bottlenecks will lie, one of the good things about the current implementation
of the event model is that it doesn't force you to implement things one way or another.
For instance, you can use event pooling or you can elect not to. You can implement
listener factories to provide a new instance of a listener per request, or you can reuse
a common synchronized instance (see the sketch below). In short, the architecture itself
does not make any decisions one way or the other...it defers those decisions to
implementation, and where possible makes them configurable. This buys us a lot in terms
of future tuneability.
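To illustrate that last point about listener factories, here are the two strategies side
by side. The factory shape here is a guess for illustration purposes, not Barracuda's
actual listener factory interface:

    // Hypothetical factory interface; the real Barracuda API may differ.
    interface ListenerFactorySketch {
        Object getInstance();
    }

    class MyListener { /* stand-in for a real event listener */ }

    // Strategy 1: a new listener per request - no shared state to worry
    // about, at the cost of an extra instantiation each time.
    class FreshInstanceFactory implements ListenerFactorySketch {
        public Object getInstance() {
            return new MyListener();
        }
    }

    // Strategy 2: one shared instance - cheaper per request, but the
    // listener itself must be threadsafe (e.g. synchronized handlers).
    class SharedInstanceFactory implements ListenerFactorySketch {
        private final MyListener shared = new MyListener();
        public Object getInstance() {
            return shared;
        }
    }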