Saturday, November 27, 2010

Sip Servlets Application Routing Guidelines and Best Practices

Sip Servlets 1.1 enjoys great adoption in the telco industry, displacing proprietary and legacy applications. One of the big promises of Sip Servlets 1.1 is the application (WAR module) composition and isolation provided by the Application Routing feature, which is supposed to allow applications to grow over time independently of each other and to be orchestrated into different services.

On the surface...

Application Routing is probably one of the most misunderstood parts of the specification. Misunderstood not in terms of function, but in terms of use and consequences. In JSR-289, application routing is described as a separation-of-concerns concept similar to the Chain of Responsibility design pattern, where each application is responsible for its own small part of the service and can be added or removed on demand. In fact it would often look like a pipeline on your architecture diagrams:
  • Call Blocking will check if the Callee has blocked the Caller and if so it will reject the call. Otherwise, if the Caller is not blocked, it will just proxy the call to the next app.
  • Call Forwarding will forward the call to the Callee or her voicemail depending on availability status.
  • The Voicemail application (if reached) will act on behalf of the Callee and it will answer the call to record a message.
Looking great - clear responsibilities for each application, zero coupling, applications are completely unaware of each other, ..blah, blah, blah. Every architect's dream!

The Real World Picture

What is not shown in these diagrams however is how it works inside the container with all external entities. JSR-289 mandates that applications communicate with each other through SIP, which means all messages go up and down the stack to reach the next application. Internally, Mobicents and most Sip Servlets containers optimize the message passing by skipping a few stages of the SIP protocol stack, but still you have to do it. And that's not the problem at all. There are a number of things from the real world that are missing in the picture:
Each application creates its own application session. The B2BUA application creates two Sip Sessions while the Proxy and the UAS application create one session each. Additionally, if the container operates with dialogs it will create 2, 3 or 4 dialog representations. Sip Servlets applications have no awareness of the SIP dialogs and hold no direct references to them, but the dialogs are still in memory. The Sip Servlets specification says that dialogs roughly correspond to SipSessions, which is true from the application's point of view, but in SIP terms the dialog spans from UAC to UAS through any proxies in the way. Thus, in the best case, it is possible for two or more sessions to share the same dialog instance if they are on the same machine.

So what's the big deal? The applications care only about their own state and not the other applications' state, right? Not exactly. To maintain the call internally, the containers keep references to a number of objects that represent the transactions or the application routing chain in some way. SipApplicationSessions have timers associated with them and so do the SIP client and server transactions. B2BUA and proxy implicitly have to keep the associated inbound and outbound requests/responses. There are also a number of properties that users can specify and that are maintained in memory. That's just for empty sessions. Once the developers start adding session attributes independently for each app, it is almost certain that they will have duplicate data.

In real-world applications most of the data would be somewhere in persistent storage such as a database. The data would be queried and loaded into the sessions by the applications. Because the applications can't share state, they will have to query the database independently and will very likely transfer and use duplicate data such as user profiles and preferences. Note that I am not assuming that it has to be that way. You may be able to organize the data and the queries so that no redundant data is transferred, but this is very hard to do correctly over time, especially when you need a separate Web UI to access the data. It is just better to make one network request than many.

On top of that if you took advantage of the application router properly it may also query the database, especially since it is responsible for assigning subscriber identity and function selection.

To summarize what we got so far for a single call in this 3-app service:
  • 4 times SipSession with attributes and properties
  • 3 times SipApplicationSession with attributes and properties
  • 2 times SIPDialog
  • Whatever transactions are in progress multiplied by at least 3 (client and server)
  • 3 times the timers
  • 3 times JDBC over the network, once for each app, each in separate DB transaction
  • 3 calls to getNextApplication() on the application router, each potentially querying the database again
  • ..and I am probably forgetting something
Fault tolerance

When you account for fault tolerance, you need to have at least one replica of the above state. Replication occurs continuously and has very serious additional consequences for memory, network traffic and CPU utilization. Depending on the replication policy, Application Routing also amplifies any gaps in time during which a failure is unrecoverable.
In the end, with or without fault tolerance, having this service implemented in 3 applications probably costs around 3 times more in terms of consumed hardware and development/testing, and has a much slower response time compared to a monolithic application that does the same with simple if .. then .. else statements. Application chaining doesn't look like such a good idea any more.
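
As a minimal sketch of the monolithic alternative, the whole blocking/forwarding/voicemail decision can be written as one if .. then .. else chain over a profile that is loaded once per call. Everything here is hypothetical and for illustration only: MonolithicCallService, CalleeProfile and Action are made-up names, not part of the Sip Servlets API, and a real servlet would translate the resulting action into the corresponding SIP response or proxy operation.

```java
import java.util.Set;

// Hypothetical sketch: none of these types belong to the Sip Servlets API.
final class MonolithicCallService {

    enum Action { REJECT, FORWARD_TO_CALLEE, SEND_TO_VOICEMAIL }

    // Callee data loaded once per call (one query instead of one per chained app).
    static final class CalleeProfile {
        final Set<String> blockedCallers;
        final boolean available;

        CalleeProfile(Set<String> blockedCallers, boolean available) {
            this.blockedCallers = blockedCallers;
            this.available = available;
        }
    }

    // The three "applications" collapsed into a single decision.
    static Action decide(String caller, CalleeProfile callee) {
        if (callee.blockedCallers.contains(caller)) {
            return Action.REJECT;             // what Call Blocking would do
        } else if (callee.available) {
            return Action.FORWARD_TO_CALLEE;  // what Call Forwarding would do
        } else {
            return Action.SEND_TO_VOICEMAIL;  // what Voicemail would do
        }
    }

    public static void main(String[] args) {
        CalleeProfile callee = new CalleeProfile(Set.of("sip:blocked@example.com"), false);
        System.out.println(decide("sip:alice@example.com", callee));
    }
}
```

The point of the sketch is only that one application session, one profile lookup and one decision replace three of each.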

Converged HTTP applications spaghetti

Unlike SIP, HTTP servlet applications do not allow composition. Usually each SIP application would have some sort of Web UI in the same WAR module allowing users and administrators to configure the system or user profiles. Do you really want your service to have 3 different entry points for configuration? Block users from one app, configure forwarding in another and set the voicemail greeting in a third app. One option would be to use Java Portal portlets (JSR-168 or JSR-286 with WSRP) to combine the UI. That may or may not work. I am not going to cover the limitations of Java Portals, but the fact is that they come with limitations, greater complexity, a performance penalty and so on. If you turn applications on and off from the application router, your application router will have to update the portal configuration to reconfigure the UI.

In most cases you will have a separate Web UI application that is aware of the schema used by all 3 applications, which limits the potential of the SIP applications to grow independently from each other. A modification in one app may force a change in your Web UI app.

No matter which way you go, there will be a chain of dependencies that your applications or the application router should be aware of, which breaks the promise of isolation to some extent.

Application Composition is not a design pattern

You shouldn't design or plan your service to be in different applications. Sip Servlets application composition is not a chain of responsibility implementation. Applications are deployment units, not architectural building blocks. You don't design your Web Services specifically to be orchestrated with BPEL and certainly not with a particular BPEL implementation. Just like application routing, BPEL's power is in the integration.

Application composition is probably the best fit for unrelated services from different vendors that consume different databases and in most cases run without being chained together with other applications. Application routing is OK as long as the actual application chaining is rare and applications are self-sufficient as individual services that have a single Web access point.

In that sense, application routing can also be used as a cheap rolling upgrade technique - just switch the requests to a new application and the new sessions will arrive in your new app while the old sessions will continue to go to the old app until they finish. AFAIK this will work on any container.
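
The rolling upgrade idea can be sketched as a small selector: new sessions are assigned to whatever application is currently the target, while sessions that already exist stay with the application that first received them. This is an illustration under assumptions, not a JSR-289 application router implementation. RollingUpgradeSelector and its methods are hypothetical names, and the explicit session map only models what a container does implicitly by delivering subsequent requests to the session that already exists.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: models the routing decision only, not a real application router.
final class RollingUpgradeSelector {

    // Session id -> application that first handled the session.
    private final Map<String, String> appBySession = new ConcurrentHashMap<>();

    // Application that receives all NEW sessions.
    private volatile String targetApp;

    RollingUpgradeSelector(String initialApp) {
        this.targetApp = initialApp;
    }

    // Flip the switch: only sessions created after this call use the new app.
    void switchTo(String newApp) {
        this.targetApp = newApp;
    }

    // Existing sessions keep their app; unknown sessions are pinned to the current target.
    String appFor(String sessionId) {
        return appBySession.computeIfAbsent(sessionId, id -> targetApp);
    }

    // Forget finished sessions so the old app eventually drains completely.
    void sessionEnded(String sessionId) {
        appBySession.remove(sessionId);
    }

    // True once no session is pinned to the given app, i.e. it is safe to undeploy.
    boolean isDrained(String app) {
        return !appBySession.containsValue(app);
    }

    public static void main(String[] args) {
        RollingUpgradeSelector selector = new RollingUpgradeSelector("voicemail-v1");
        System.out.println(selector.appFor("call-1")); // pinned to the old app
        selector.switchTo("voicemail-v2");
        System.out.println(selector.appFor("call-1")); // still the old app
        System.out.println(selector.appFor("call-2")); // new session, new app
    }
}
```

Once isDrained reports true for the old application name, the old version can be removed without affecting ongoing calls.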

In conclusion, if you are an architect or developer you should forget that application routing exists. It is the sysadmins' job to reconfigure applications if they see fit or have no other choice.

Friday, October 15, 2010

Another crazy year with Mobicents...


More than 20 releases in the past year, 6000+ revisions, almost 1000 tracker issues, airline miles to go many times around the world, and here we are in Belek, Antalya just 500km from my home town.




Thanks to Eduardo for arranging this exceptional venue for us.

This year for the first time we had the whole team together from all over the world - USA, Europe, Asia, Australia, and also for the first time customers and users were invited to attend the technical sessions.

We had a very tight agenda, but even so a couple of extra sessions were added due to demand. Eventually, we were able to focus the agenda on the most interesting topics and had a lot of discussions on each project. It was a great, very productive meeting with a lot of impact on the roadmaps and the direction of the projects.

I gave a talk dedicated to general clustering, which is a key topic for all projects and easily generates even more discussions. My slides are here, but I should probably annotate them a bit more. If you have any questions, ask away:

This is the version for fullscreen viewing. The rest of the slides will be available in our Antalya directory in google docs over here.

Great meeting, great party!

After a week in paradise I also got to "enjoy" an extra stay in this hotel when I missed my flight in Istanbul. Excellent choice if you need to catch up with reality :)

Friday, April 16, 2010

The dreaded so-called "geographical failover" and SIP

This interesting requirement just keeps popping up in the middleware world and is a topic of many threads in developer discussions as we are designing our HA architecture to be adequate and performant. The problem stems from a common deployment architecture where you have two or more data-centers in distant geographical locations (think WAN distance, where LAN is not possible). Each data-center site has the same service deployed, but is responsible for serving only the requests from its geographical area. All data centers are independently fault-tolerant, but if one site goes down for any reason, this would cause a service outage in the area it is supposed to cover.

Geographical failover usually refers to being able to temporarily redirect the requests to other, more distant data-center sites when a total failure of the local data-center occurs. Even if it is slower or more expensive, it is better than a completely unavailable service.

This problem is common for HTTP and SIP, and the strategies to solve it are very similar. The challenge is in two aspects:
  • Load balancing. Obviously, you need to make sure your local load balancers survive the data center failure. There are many ways this can be done, but commonly you can just host the load balancers somewhere close to the users and keep them isolated from the data center. You should also make sure there is enough backup power and network infrastructure to keep them alive for a few hours or days.
    In Mobicents and JBCP, we introduced pluggable algorithms for the SIP/HTTP converged load balancer, and it is not hard at all to come up with a working solution. For instance, the load balancer algorithm may have two lists of server nodes - local nodes and distant nodes. If and only if all local application server nodes are dead, then the load balancer can start routing to the distant nodes. Plain and simple.
    If you are using a distributed load balancer (shown in the figure on the right), one per site, it will behave the same way, and you may have several additional options depending on the IP load balancer capabilities.

    As long as the SIP load balancers are aware of the topology, the IP load balancer doesn't have to be aware of the other data-centers. It might be possible to instruct the IP load balancer to route requests to the backup data-center site, but the mechanism doesn't depend on it.
  • State Replication (a.k.a What happens with the ongoing calls after the failure?). In the case of complete data-center failure, the whole LAN cluster will go down. Let's first take a look at the case where Mobicents nodes only replicate state within the data-center network.
    The ongoing calls are likely to have some state associated with them and this state will be lost without replication. Unlike HTTP, in SIP the containers generally do not deliver requests to the applications if their dialog state doesn't exist in the protocol stack. SIP has strict rules about the state machines in the dialogs and transactions, thus the overall consistency depends on that state. All SIP requests sent after the failure would not succeed and the calls are technically lost. But, let's look at what exactly happens with lost calls in the following cases:
    • Simple media calls, audio or video calls - once the SDP exchange is complete the users can hear each other and have a normal call without any further SIP messages. So the failure in the SIP server will not affect them at all. As long as the Media Server is alive or the media is flowing directly between the User Agents, the call will be fine. When the users hang up, the BYE is lost, but it won't matter any more, because the call is over. Unless you need to capture the BYE and do some logic, you will be fine after the failure.
    • SIP-intensive applications - presence, chat, some IVR applications and so on. In these cases the application will fail due to the lost state.
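
Going back to the load balancing rule from the first bullet (route to distant nodes if and only if all local nodes are dead), a minimal sketch of such an algorithm could look like this. It is an illustration under assumptions and does not use the actual pluggable algorithm interface of the Mobicents load balancer: LocalFirstBalancer, the node names and the isAlive health check are all hypothetical, and round-robin is just one possible way to pick among the live nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch: not the Mobicents load balancer API.
final class LocalFirstBalancer {

    private final List<String> localNodes;
    private final List<String> distantNodes;
    private final Predicate<String> isAlive;   // assumed health check, e.g. fed by heartbeats
    private final AtomicInteger counter = new AtomicInteger();

    LocalFirstBalancer(List<String> localNodes, List<String> distantNodes,
                       Predicate<String> isAlive) {
        this.localNodes = localNodes;
        this.distantNodes = distantNodes;
        this.isAlive = isAlive;
    }

    // Picks a live local node; falls back to distant nodes only when no local node is alive.
    Optional<String> pick() {
        List<String> candidates = alive(localNodes);
        if (candidates.isEmpty()) {
            candidates = alive(distantNodes);
        }
        if (candidates.isEmpty()) {
            return Optional.empty();           // total outage: nothing to route to
        }
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return Optional.of(candidates.get(index));
    }

    private List<String> alive(List<String> nodes) {
        List<String> result = new ArrayList<>();
        for (String node : nodes) {
            if (isAlive.test(node)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pretend every local node is dead: only names starting with "remote" are alive.
        LocalFirstBalancer balancer = new LocalFirstBalancer(
                List.of("local-1", "local-2"), List.of("remote-1"),
                node -> node.startsWith("remote"));
        System.out.println(balancer.pick().orElse("no node available"));
    }
}
```

The distant list is consulted only after the local list yields no live node, so a single surviving local node keeps all traffic inside the data-center.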
Being aware of the limitations, let's see what the options are for state replication between data-centers.

First, you have to realize that if you had enough network capacity to replicate the call state to distant nodes, you could just organize all nodes in small JBoss clusters where the nodes are from different geographical areas. JBoss already has a lot of settings in JBoss Cache/Infinispan and JGroups to fine-tune the connections, even if there is no dedicated feature for geography.

Thus, let's assume you don't have enough bandwidth for that. At this point it is very application-specific, because you must decide what part of the state you can lose. There are several options:
  • Partial replication per call between the geographical areas - lose some bits from each call
  • Rare replication - do the full replication, but only in steady-state calls when changes are unlikely to occur.
  • Priority calls - pick high priority calls and only replicate them.
  • No replication - invest in reducing the risk of complete data center failure. How hard is that? If the data-center is down, but your users are still online, this implies that there is massive working infrastructure somewhere. Moreover, power and network backup resources are available cheaply and they scale well. In fact, I can think of many reasons why the geographical replication methods mentioned above are not a good idea:
    • There may be a noticeable memory impact. In most implementations the local and the geographically replicated call state resides in different memory structures (so double the memory).
    • The performance impact may be huge - first because of the additional replication protocol logic in each node and second because your data centers must run at lower utilization to be ready to take the extra load in case of data-center failure.
    • Incidentally, the long distance network bandwidth is the most expensive. So you will have to pay for that as well.
    • Investing in hardware fault-tolerant resources for your data-center makes the reliability independent of the software. In other words - even cluster-unaware applications will benefit and will be more reliable.
    • Even if you recover some SIP state and your SIP applications don't fail, you still need to think how to keep the Media Servers alive or to switch them over to the other data-center.
    • Partial replication in general will still produce a lot of lost calls. It works only on a best-effort basis.
    • Applications are hard to code and test in a partially replicated environment. It is an explosion of cases to be considered and tested when developing.
In summary, geographically redundant load balancing is justified and easy to achieve at a reasonable price. However, state replication between data-centers is too expensive. It is much more efficient to focus on reducing the probability of data-center failure, which is low anyway. Unless you have some service-specific condition that really fits the model, there is no point in state replication between data-centers.