
Teach by Example. Tech in your face.

Scalable REST services in WCF

Filed Under (REST, WCF) by haidersabri on 05-06-2008


Introduction

We’ve started researching how best to scale our WCF service, especially when a middleware failure causes the kind of latency typical of IO activity. The problem we initially found is that when an IIS-hosted WCF service receives a request, the IIS worker thread handles the request and hands the work over to an IO thread. Meanwhile, the worker thread blocks until the IO thread completes. Under high, unexpected IO latency, all the worker threads can quickly end up blocked on IO threads, causing the server to reject new requests. One way to alleviate this problem is to reduce the number of worker threads used and increase the number of IO threads that are spawned. This would allow the server to continue servicing new requests while the middleware recovers.

So I’ve begun investigating how to improve performance in this area. The goal is not necessarily to increase throughput, but rather to protect a server from crashing in the event of middleware failure.

A few weeks back, .NET 3.5 SP1 was released as a beta. It is claimed that SP1 improves performance specifically for WCF RESTful services. The next step in our research would be to compare the results below with the SP1 Beta bits.

Performance Testing Parameters

The tests below do not fully reproduce real-world scenarios; IO latency is simulated. Since WCF 3.5 does not support the Asynchronous Programming Model (APM) for REST services, all service calls are handled synchronously. However, the test compares the normal out-of-the-box model with a modified HttpModule solution that uses the APM found internally inside the ServiceModel namespace. This code is based on code provided by the WCF team. It makes use of reflection, which may incur a performance hit.

These tests also take into account new features provided by the integrated pipeline of IIS 7. To be specific, I varied a registry value named MaxConcurrentRequestsPerCpu across test runs. This DWORD is located in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ASP.NET\2.0.50727.0 and does not exist by default. If the DWORD is not present, IIS uses the default value of 12. MaxConcurrentRequestsPerCpu specifies how many requests per CPU IIS will handle concurrently. The values we chose for testing are 12, 24, 48 and 0 (0 means there is no maximum). More information on MaxConcurrentRequestsPerCpu can be found in blog posts by two members of the WCF product team, Thomas Marquardt and Wenlong Dong.
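To make the arithmetic concrete, here is a minimal Python sketch (an illustration only, not IIS code) of how the per-CPU setting turns into a machine-wide ceiling. The function name `max_concurrent_requests` is hypothetical; the default of 12 and the meaning of 0 are taken from the description above.

```python
from typing import Optional

# Value assumed when the DWORD is absent, as described in the text.
DEFAULT_PER_CPU = 12


def max_concurrent_requests(cpu_count: int, per_cpu: Optional[int] = None) -> Optional[int]:
    """Return the machine-wide ceiling on concurrent requests.

    Returns None when the setting is 0, meaning "no maximum".
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    if per_cpu is None:
        per_cpu = DEFAULT_PER_CPU
    if per_cpu < 0:
        raise ValueError("per_cpu must not be negative")
    if per_cpu == 0:
        return None  # 0 disables the limit
    return cpu_count * per_cpu


if __name__ == "__main__":
    # The four settings used in the tests, on a hypothetical 8-CPU box.
    for setting in (12, 24, 48, 0):
        print(setting, max_concurrent_requests(8, setting))
```

On an 8-CPU machine the default therefore allows 96 requests in flight, and a setting of 48 allows 384.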

The test tool used to produce simulated load is the Web Application Stress Tool. I set the stress settings to have 50 threads and 10 sockets per thread.

The actual work done by the service call is a simple addition of two numbers. The IO is simulated with a Thread.Sleep.
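The shape of that workload can be sketched in Python (an analogy to the .NET service, not the code that was tested). All names here are hypothetical. The blocking version parks a pool thread for the whole simulated IO wait, while the asynchronous version yields during the wait so a single thread can carry many requests.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def add_blocking(a: int, b: int, latency_s: float) -> int:
    """Synchronous call: the calling thread sleeps through the simulated IO."""
    time.sleep(latency_s)
    return a + b


async def add_async(a: int, b: int, latency_s: float) -> int:
    """Asynchronous call: the thread is released while the simulated IO waits."""
    await asyncio.sleep(latency_s)
    return a + b


def run_blocking(requests, latency_s, workers):
    """Serve requests on a fixed pool; return (results, distinct threads used)."""
    threads = set()

    def handle(pair):
        threads.add(threading.get_ident())
        return add_blocking(pair[0], pair[1], latency_s)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle, requests))
    return results, len(threads)


def run_async(requests, latency_s):
    """Serve requests on one event loop; return (results, distinct threads used)."""
    threads = set()

    async def handle(pair):
        threads.add(threading.get_ident())
        return await add_async(pair[0], pair[1], latency_s)

    async def main():
        return await asyncio.gather(*(handle(p) for p in requests))

    return list(asyncio.run(main())), len(threads)


if __name__ == "__main__":
    reqs = [(i, i) for i in range(20)]
    print(run_blocking(reqs, 0.01, workers=4)[1], "thread(s) in blocking mode")
    print(run_async(reqs, 0.01)[1], "thread(s) in async mode")
```

The blocking pool can never have more requests in flight than it has threads, which is the situation the worker threads are in when the middleware stalls.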

Virtual Machine

I tested on a virtual machine running Windows Server 2008 with 1 CPU and 1 GB of RAM, and the difference in worker thread usage between the out-of-the-box service and the service with our asynchronous module was clear. The out-of-the-box service used more worker threads and CPU, whereas the async module service used fewer worker threads and, at higher latencies, less CPU.

[Figures: worker thread usage for the out-of-the-box service and the async module service]

Here you can see a clear difference in worker thread usage between the two models. This, however, is a virtual machine, so its data cannot be conclusive. The above graphs show behavior with a MaxConcurrentRequestsPerCpu setting of 12. Higher settings showed a similar disparity.

Real Machine

For more accurate testing, we used a more robust machine with the following specifications:

· Intel Xeon L5320 1.86 GHz, dual quad-core (8 CPUs)

· 2 GB RAM

· 32-bit Windows Server 2008

· .NET 3.5

Having such a robust machine greatly increased performance. Interestingly, our initial tests of the asynchronous HttpModule on the VM produced very different worker thread behavior than the tests on the real machine. On the real machine, both Async and Non-Async modes showed consistently low worker thread usage. At higher latencies, the AsyncModule hurt performance more than the out-of-the-box model. Below is a more detailed breakdown of the data.

I chose to group the graphs by mode, rather than by MaxConcurrentRequestsPerCpu value, because I felt there is more performance to be gained by tuning this registry setting than by using an AsyncModule the way we used it.

The metrics recorded were averages over a 5-minute load testing period:

· Execution Time

· CPU%

· IO Threads

· Worker Threads

· ConcurrentRequests (will not be over 8 * MaxConcurrentRequestsPerCpu)

· Requests/Sec

· Requests Queued

[Figures: graphs of the recorded metrics for each mode]

Analysis

There’s an obvious general trend that as latency increases, execution time naturally increases and throughput decreases. We also see from the data that as latency increases, more requests are handled concurrently through a higher volume of IO threads.

We also notice that as latency increases, there is a slight tendency toward better throughput with a higher MaxConcurrentRequestsPerCpu. I say slight because as soon as MaxConcurrentRequestsPerCpu goes over some threshold, the opposite starts to occur: as expected, Concurrent Requests increase, but we also see an enormous increase in execution time and a drastic decrease in throughput.

Another interesting finding is that worker thread usage in all test scenarios was incredibly low. I would attribute this to the hardware. I suspect that we would see higher worker thread values for the out-of-the-box scenario if we had a more robust load testing system. I also suspect, based upon the results on our VM system, that the usage of the async module would reduce the worker threads as well. On this box of ours, it wasn’t a factor at all. In fact, in some cases the added reflection and APM introduced by the AsyncModule actually reduced performance.
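A rough way to reason about the link between latency, concurrency and throughput is Little's law: with a fixed number of requests in flight, throughput cannot exceed that number divided by the time each request takes. The Python sketch below is a back-of-the-envelope model, not something measured in these tests. The function names are hypothetical, and the model deliberately ignores CPU cost and the overhead of managing many concurrent requests, which is what makes very high settings counterproductive in practice.

```python
def throughput_ceiling(concurrency_limit: int, latency_s: float) -> float:
    """Upper bound on requests/sec when each request occupies a slot for latency_s."""
    if concurrency_limit < 1:
        raise ValueError("concurrency_limit must be at least 1")
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return concurrency_limit / latency_s


def queue_growth_per_second(arrival_rate: float, concurrency_limit: int, latency_s: float) -> float:
    """How fast the request queue grows when arrivals outpace the ceiling."""
    return max(0.0, arrival_rate - throughput_ceiling(concurrency_limit, latency_s))


if __name__ == "__main__":
    # Assumed example: 8 CPUs at the default of 12 per CPU gives 96 slots.
    for latency in (0.1, 0.5, 1.0):
        print(latency, throughput_ceiling(96, latency))
```

In this simplified model, raising the limit raises the ceiling and drains the queue faster, but it says nothing about execution time once the server is busy juggling those extra requests.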

Also note the case of no maximum concurrent requests. Indeed it does eliminate any queuing, but the resulting overhead of managing all those concurrent requests destroys all performance on the server.

From the above data, it looks like the best performance was gained with an out-of-the-box service with MaxConcurrentRequestsPerCpu set to 24. At 1000 ms latency, it had lower execution times than all other scenarios, and its throughput was nearly the highest of all the 1000 ms scenarios. It still experienced high request queuing. Going up to a MaxConcurrentRequestsPerCpu of 48 would offer 1/3 the queuing, but also triple the execution time. At this point we would need to decide which side of the tradeoff matters more. If we expect constant latency, we might be able to live with controllable, higher request queuing to keep the execution times lower. However, if we expect latency issues to come in uncontrollable spikes, we might opt for slower overall execution times to safeguard our servers from running their queues up too high.

Summary

These tests investigated performance tweaks for a WCF RESTful service, either out-of-the-box or with an asynchronous module. With less-than-adequate hardware, the async module reduces the number of worker threads being blocked and thus increases performance. With more-than-adequate hardware, introducing the asynchronous module adds unnecessary overhead that actually reduces performance in some cases.

We still need to investigate the performance boosts introduced in the recently released WCF 3.5 SP1 Beta bits. It is claimed that RESTful WCF services see a 5- to 10-fold increase in performance. Since the bits are still in Beta and were only just released, results based upon them were not included in this paper.

To dive deeper into performance, I recommend we conduct similar tests using one or a combination of the following frameworks:

· Generic ASHX Handler

· WCF 3.5 SP1

· ASP.NET MVC

· Using a middleware that follows the APM

· Microsoft Parallel Extensions

· Microsoft Robotics

My MIX Theatrix

Filed Under (Presentations, REST, WCF) by haidersabri on 11-03-2008

I had the pleasure of attending my first MIX conference last week. I never thought that at my first time at the conference, I’d actually be a presenter. I remember the first time I went to a different Microsoft conference, PDC, where I attended the presentations of renowned technologists and wondered if I’d ever be up on stage. Well, I had the honor of being invited to present due to my role on the MySpace Developer Platform REST API.
I was first approached by the REST API Manager/Visionary, Paul Walker, in mid-January about the prospect of co-presenting at MIX with him. Without hesitation, I accepted, but I had no clue what he wanted to present or when we would find the time to prepare for the presentation. After all, we were in the middle of the release of MDP, and I was about to have a baby. Nevertheless, the opportunity was too great to pass up. It was mid-February and Paul wasn’t even sure he wanted to present. In the end, I was able to talk him into it, but a day later my wife went into labor and I left the earth’s orbit for at least 10 days.
Well, to make a long story short, I found out that I was going to be the only presenter, and the MySpace team wouldn’t be able to make it. Nervous as I was, I began to doubt whether to go through with it or not; I hadn’t seen the slides until the day I flew out to Vegas, all the WCF guys were expecting a large MySpace presence, and I was just getting back into orbit. Anyhow, I eventually decided to go through with it, and Aaron Sloman, CEO of speakTECH (and my current employer), came to the rescue to help with the opening of the presentation. After no sleep, frantic demo development, and pathetic practice sessions, the 3 pm deadline approached. We were pulling out slides, adding new ones, and just hoping that our prod environment for the demo would not send back a cryptic error just by visiting the default page. Up on stage, Aaron and I looked at the MySpace API tool we were supposed to demo, and it was failing to return the request. So we had to improvise.
Aaron had a great opener. At the very first MIX in 2005, Aaron presented Vista gadgets using MySpace photos. In the audience someone shouted, "open up your API." A few years later, Aaron was back at MIX announcing the MySpace API, albeit only 6 months after development started on it. :)

I then took over with the technical stuff. I spoke about the principles of REST, and how an API using RESTful services made sense to us. Then I jumped into how WCF actually implements RESTful services. I genuinely believe our API is the first of its kind. That is, it’s the first fully RESTful service built upon WCF. We actually started development with WCF 3.5 Beta 2, which came out in August. The bits we needed for our API were not RTM until December, when we had already reached critical milestones in the API development. Add to that the total absence of documentation, and we were still able to impress the internal WCF team with how far we were able to take WCF. Some of the cool features created were custom channels that could be turned off and on very easily. These channels would effectively intercept messages high in the call stack to perform specific functions on incoming requests. Our demo showed off two custom channels: OAuth and HighLowREST.
The OAuth channel provides a layer of authentication checking per the newly adopted OAuth standard that is currently being touted on the web.
The HighLowREST channel is a bit more complicated. Those familiar with REST know that HTTP methods are the vehicle of all requests, and that PUT and DELETE are two valid HTTP methods. Many clients out there, however, do not support PUT and DELETE calls, simply because they haven’t been in common use for the past decade or so. To bridge those "Low-REST" clients into our "High-REST" service, which supports PUTs and DELETEs, we had to add a custom channel to make the service agnostic to that complexity. More details in the video though :).
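As a rough illustration of the idea (not the actual channel code, which is WCF and is covered in the video), here is a minimal Python sketch of one common way to bridge such clients: the client sends a POST and names the method it really wants in a header. The `X-HTTP-Method-Override` header and the function name `effective_method` are assumptions for this sketch; the post does not say which mechanism the HighLowREST channel used.

```python
from typing import Mapping

# Assumed convention for tunneling a method through POST; hypothetical here.
OVERRIDE_HEADER = "X-HTTP-Method-Override"
ALLOWED_OVERRIDES = {"PUT", "DELETE"}


def effective_method(method: str, headers: Mapping[str, str]) -> str:
    """Return the HTTP method the service should act on.

    Only POST requests may be rewritten, and only to PUT or DELETE,
    so a stray header cannot turn a GET into a destructive call.
    """
    method = method.upper()
    if method != "POST":
        return method
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    override = normalized.get(OVERRIDE_HEADER.lower(), "").strip().upper()
    return override if override in ALLOWED_OVERRIDES else method


if __name__ == "__main__":
    print(effective_method("POST", {"X-HTTP-Method-Override": "DELETE"}))
    print(effective_method("GET", {"X-HTTP-Method-Override": "DELETE"}))
```

Doing the rewrite in one place, before the request reaches the service operations, is what lets the service itself stay unaware of which kind of client is calling.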

We also added cool HTTP-like exception handling. You can now throw a ResourceNotFoundException or a BadRequestException anywhere in the code, and it will be transformed into the appropriate HTTP status code on the reply (e.g. 404, 403).
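The pattern can be sketched in a few lines of Python (an analogy, not the WCF implementation from the demo). The exception class names mirror the ones mentioned above; the base class, the `to_response` wrapper, the sample `get_move` handler, and the choice of 400 for a bad request (its standard HTTP meaning) are assumptions made for this sketch.

```python
class HttpServiceError(Exception):
    """Hypothetical base class: carries the status code to put on the reply."""
    status_code = 500
    reason = "Internal Server Error"


class ResourceNotFoundException(HttpServiceError):
    status_code = 404
    reason = "Not Found"


class BadRequestException(HttpServiceError):
    status_code = 400  # assumed mapping, per the standard meaning of 400
    reason = "Bad Request"


def to_response(handler, *args):
    """Run a handler and translate known exceptions into (status, body)."""
    try:
        return 200, handler(*args)
    except HttpServiceError as exc:
        # Use the exception message if one was given, else the default reason.
        return exc.status_code, str(exc) or exc.reason


# Sample data and handler, purely for illustration.
MOVES = {1: "e2e4", 2: "e7e5"}


def get_move(move_id):
    if not isinstance(move_id, int):
        raise BadRequestException("move id must be an integer")
    if move_id not in MOVES:
        raise ResourceNotFoundException()
    return MOVES[move_id]


if __name__ == "__main__":
    print(to_response(get_move, 1))
    print(to_response(get_move, 99))
    print(to_response(get_move, "abc"))
```

The handler code only raises; the translation to a status code happens once, at the edge.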
I am definitely leaving out much more of what we did in our demo, but the good news is that we made it available for people to download. The theme adopted by some of the MySpace guys on the presentation was a RESTful Chess API where you can PUT chess moves, GET game moves, etc. Get the source code here.

The video for the session is also available online at visitmix.com, or you can just get it directly through here

All in all, the presentation went well thanks to all the help and effort that went into it. The WCF evangelist Vittorio Bertocci was more than helpful and patient with us. He also blogged about it on his site.