Embracing the Web Part 2: Componentization and Enterprise Services

by Thuan L. Thai
07/18/2001

In Part 1 of this series, I briefly revisited distributed computing and discussed its goals, problems, and solutions--most importantly RPC (remote procedure calls). In this article, I'll examine componentization, which allows easy and rapid software integration, and enterprise services, which allow developers to build software that scales to massive numbers of users.

Componentization

RPC commences a journey that leads into many other technologies, one of which is componentization. Because RPC supports only remote procedure calls, software vendors create their own homegrown wrappers that layer on top of RPC to support remote object-oriented programming. This allows the software vendors to develop rich C++ class libraries and still attain the benefits of RPC for distributed computing.

Additionally, some people realized that separating interface from implementation can help build software that supports plug-n-play. In the hardware world, you can plug a few boards together to assemble a complete computer. This works because each board exposes a specific interface that another board knows how to use. Hardware plug-n-play has been around for years, but the commercial software industry did not see software plug-n-play, or componentization, until RPC had proven successful. Like plug-n-play hardware components, plug-n-play software components fit together like Lego blocks. Commercially introduced in the early 1990s, software plug-n-play has only been around for the last six to ten years.


There are many different component software infrastructures, all of which are object-oriented. These include the Component Object Model (COM), System Object Model (SOM), Common Object Request Broker Architecture (CORBA), and Java Beans. Each of these component technologies has similar principles, supports similar features, and exhibits similar applicable realities.

Principles

While there are many principles in a component technology, I'll review just the important ones that, if missing, would render componentization useless. These principles include interface contract, binary standard, wire protocol, and bridges.

  • Interface contract. The principle of separating interface from implementation is a key success that has been assimilated into the paradigm of component software. In component software, a component consumer (client) sees the component provider (or simply component) as a black box. The consumer knows the interface intimately but ignores how the component is implemented. So long as the interface behaves correctly as specified and documented, the consumer is totally content. Because of this intimacy, the interface is a contract between the component and its consumer. The implementation can change in future releases of a component, but the interface must remain the same. If a change to the interface is required, a new interface is typically added, so that new clients can use the new interface, and older clients won't crash. The idea of an interface contract permits much easier component implementation and reusability.

  • Binary standard. Since a component technology's goal is to allow software plug-n-play, it must support component integration not only at the source code level but also at the binary level. Put differently, it must allow a consumer to utilize another component, even if the consumer has no source code for the target component. In fact, the consumer doesn't even need to have the component's public header files or static libraries. The reason that this is possible is that the interface contracts and the component's type information, or metadata, are often shipped with the component. Type information contains the information about the component, the classes hosted by the component, the interfaces implemented by the classes, and the methods that are supported by the interfaces.

    In order to gain binary interoperability, the system has to be aware of type information and it must either know or somehow support the layout and execution of objects in memory. Most C++ programmers know that many C++ compilers support polymorphism using the notion of virtual tables (vtbl) and virtual pointers (vptr). These vptrs and vtbls have a specified, consistent layout in memory, allowing for binary interoperability. In other words, software executables that conform to this binary layout will interoperate with one another at the binary level. Put another way, a compiler for any computer language that can generate this binary representation can produce software that works with other software that conforms to this standard. The vptr and vtbl facilities can be thought of as a binary standard. Because COM needed a binary standard in the late 1980s, Microsoft took advantage of its own implementation of C++, so that it could acquire the vtbl and vptr facilities for free (see footnote 1).

  • Wire protocol. Component interoperation within the confines of a single machine was a noble achievement, but a component technology couldn't be successful unless it supported component distribution, allowing components across the network to interoperate with one another. To make this possible, a component technology must specify the wire protocol that components use to communicate across the wire. Instead of inventing a brand-new RPC mechanism, Microsoft built on the already successful RPC standard. Because RPC lacked support for object orientation, Microsoft added an additional type, the object reference (OBJREF), to RPC, resulting in an object-aware flavor of Microsoft RPC (MSRPC), commonly referred to as Object RPC (ORPC).


    A wire protocol serves as an infrastructure that supports distributed component interoperation. It can be extremely complex, but once specified and implemented, no one will need to write this type of infrastructure again. Microsoft did this part so that developers wouldn't have to invent their own wire protocols.

  • Bridges. Because there were a number of different component technologies, a few companies started to build and sell software components that acted as intermediaries, stitching two different component technologies together. This type of software is appropriately dubbed a bridge, because it serves as a bridge for two different component architectures, allowing interoperation between the two. For example, using a COM-CORBA bridge, your COM object can use your CORBA objects, and vice versa.
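To make the interface-contract principle concrete, here is a minimal Python sketch. It is an illustration only, not how COM is implemented: the names ISpellCheck, ISpellCheck2, SpellChecker, and query_interface are hypothetical, with query_interface loosely modeled on the idea behind COM's QueryInterface. It shows a published interface left untouched while a second interface is added for newer clients, so older clients keep working.

```python
from abc import ABC, abstractmethod


class ISpellCheck(ABC):
    """Original contract: once published, it never changes."""

    @abstractmethod
    def check(self, word):
        ...


class ISpellCheck2(ABC):
    """Contract added in a later release; ISpellCheck is left untouched."""

    @abstractmethod
    def suggest(self, word):
        ...


class SpellChecker(ISpellCheck, ISpellCheck2):
    """The black box. Clients depend on the interfaces, not on this class."""

    _WORDS = {"component", "interface", "contract"}

    def check(self, word):
        return word.lower() in self._WORDS

    def suggest(self, word):
        # Naive suggestion: known words that share the first letter.
        return sorted(w for w in self._WORDS if w[0] == word[0].lower())

    def query_interface(self, interface):
        """Return this object if it honors the requested contract, else None."""
        return self if isinstance(self, interface) else None


def old_client(component):
    # Written against the first release; knows nothing about ISpellCheck2.
    speller = component.query_interface(ISpellCheck)
    return speller.check("Interface")


def new_client(component):
    # Asks for the newer contract and falls back if it is not offered.
    speller2 = component.query_interface(ISpellCheck2)
    if speller2 is None:
        return []
    return speller2.suggest("cat")


component = SpellChecker()
print(old_client(component))   # True
print(new_client(component))   # ['component', 'contract']
```

The implementation of SpellChecker can be rewritten freely between releases; as long as both interfaces keep behaving as documented, neither client needs to change.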

Features

Besides the important principles mentioned in the previous section, a component technology must support a number of popular features, including location transparency, dynamic activation, dynamic invocation, security, and garbage collection.

  • Location transparency. This feature refers to the ability to invoke methods that are implemented on a remote machine without much knowledge about where the remote machine is located or what network protocol is being used. On the client side, you write the code to simply invoke a method. On the server side, you implement the method that the client will call. Everything else should be handled by the wire protocol and its supporting library; in COM, these are the Distributed COM wire protocol and the COM library.

    Figure 1. COM proxy and stub

    As shown in Figure 1, COM supports transparency using a proxy object on the client side, a stub object on the server side, and interface marshaling code on both sides for the supported interfaces. When calling a remote method, the client simply invokes a method that is intercepted by the proxy object. The proxy object knows how to communicate with the stub object through the COM infrastructure. When the stub object receives the call from the proxy object, it acts in place of the remote client and calls the target method. After the method invocation completes, the stub object passes the return values back to the proxy object, which then passes the same values back to the client.

  • Dynamic activation. This feature supports the lazy launching of a component whenever a client requests it. With this support, you don't need to start your component process and have it wait for client connections, as in traditional RPC. Dynamic activation allows you to save resource consumption on your system.

    To take advantage of dynamic activation in COM, you simply let COM know that you are a component with a specific identifier. You do this by registering the class identifiers (CLSID) of the COM classes that your component exposes into the system registry (see footnote 2). Once your classes have been registered, the COM library will find and launch your component whenever a client tells COM that it needs your object.


  • Dynamic invocation. Dynamic invocation is a feature that allows components to dynamically interoperate with each other at runtime without prior knowledge of one another. If you develop a client application that knows how to take advantage of dynamic invocation, that client will be able to call all current and future components that implement dynamic invocation features. An example of such a client is Visual Basic (VB) 6.0. Developed several years ago, Visual Basic 6.0 can call all COM objects that support dynamic invocation, even the COM objects that you develop today.

    In COM, a client, like Visual Basic 6.0, is called an automation controller, and the component that it is using is called an automation server. Automation controllers know how to use the IDispatch interface, and automation servers are COM components that implement the IDispatch interface. Again, the powerful aspect of dynamic invocation is that it allows a client to be developed today, but the same client can invoke methods implemented by an automation server written in the future, without recompiling the client at all.

  • Security. Because component distribution is key in any component technology, security must be an integrated feature. Otherwise, security violations will be very hard to prevent. As mentioned earlier, two key security features include authentication and authorization. Recall that authentication verifies whether the caller is who it says it is, and authorization verifies whether the caller, who has already been authenticated, has access to a particular resource. When you see the word "authentication," think of this as a passport, and when you see the word "authorization," think of this as a visa.

    Another security concept is impersonation, a feature that lets the server assume the client's identity. This feature allows the server to temporarily be the client, such that the server can access all resources that the client can access. Impersonation is important because server components normally run under a specific server-side system account, which may have specific security constraints and thus may be inappropriate for the caller. And because different clients access the server, you don't want a powerful server identity to support all the clients of varying security permissions. Instead, the most appropriate thing to do is to impersonate the clients and service their request using their security permissions. Of course, the client must allow the server to impersonate the client before this is possible.

  • Garbage collection. In a distributed, component-based system, many components interoperate with one another. In typical component-based systems, a simple rule is that a client keeps a reference to an object until it no longer needs the object. However, there will be times when clients fail to meet this rule or the network fails to operate, causing invalid and extant references. A component infrastructure or middleware must be robust enough to mitigate these obvious problems. It must provide a garbage collection facility to clean up any invalid and extant references, or the machine will eventually run out of resources.

    Distributed COM takes advantage of pinging to implement the distributed garbage collector. Using ping messages, if a client pings a remote server and gets no answer, it assumes that it has lost the connection to the server. In this case it removes the server references that it holds and cleans up the allocated resources associated with those references. On the server side, if it hasn't heard a ping message from the client within a set time-out period, it will assume that the client has died and clean up the resources allocated for the client on the server side. Because ping messages can clog up the network, DCOM invented a pinging mechanism called delta pings to detect extant references. In theory, the component middleware on the client side pings the server machine at different intervals. It uses a special algorithm to collect the pings together in a group, and sends just one ping message to the server instead of many ping messages, saving network bandwidth.
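Returning to the proxy and stub arrangement of Figure 1, the following Python sketch shows the shape of the idea. It is a simplification under stated assumptions, not DCOM itself: the Calculator, Stub, and Proxy classes are hypothetical, JSON stands in for the real marshaling format, and a direct function call stands in for the network.

```python
import json


class Calculator:
    """Server-side component implementation (hypothetical)."""

    def add(self, a, b):
        return a + b


class Stub:
    """Server side: unmarshals a request and calls the real object."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        # Marshal the return value for the trip back to the proxy.
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Client side: looks like the component, but forwards every call."""

    def __init__(self, transport):
        self._transport = transport   # any callable taking and returning bytes

    def __getattr__(self, name):
        def remote_call(*args):
            request = json.dumps({"method": name, "args": args}).encode("utf-8")
            reply = self._transport(request)
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call


# Here the "wire" is a direct call; a socket could be substituted
# without touching the client code below.
stub = Stub(Calculator())
calculator = Proxy(stub.handle)
print(calculator.add(2, 3))   # 5
```

The client line calculator.add(2, 3) reads exactly as a local call would, which is the point of location transparency: only the transport handed to the proxy knows where the component actually lives.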

Practicality

Componentization brought many benefits. Not too long after its introduction, everyone was creating and reusing components, not only within the confines of a single machine but throughout the network, with the help of the DCOM wire protocol.

But developers didn't stop there, especially because they wanted to share their valuable components across the Internet. The problem they faced was that DCOM embeds the IP address within the Network Data Representation (NDR) buffer, preventing DCOM from working across the Internet through firewalls and Network Address Translation (NAT) software. Microsoft quickly came up with a solution for this, called COM over the Internet. However, a more open and standard mechanism, called the Simple Object Access Protocol (SOAP), was introduced and immediately accepted widely. SOAP is entirely based on XML.
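To illustrate what "entirely based on XML" means in practice, here is a small Python sketch that builds and reads back a SOAP 1.1 request envelope using only the standard library. The envelope namespace is the one defined by SOAP 1.1; the application namespace urn:example:calculator, the Add method, and its parameters are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
APP_NS = "urn:example:calculator"                        # hypothetical application namespace


def build_request(method, **params):
    """Return a SOAP 1.1 request envelope as a string."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the method name and parameters on the receiving side."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]
    method = call.tag.split("}", 1)[1]    # strip the "{namespace}" prefix
    return method, {child.tag: child.text for child in call}


message = build_request("Add", a=2, b=3)
print(message)
print(parse_request(message))
```

The whole call is plain text that can travel in an ordinary HTTP request, and nothing in it depends on the caller's network address, which is why this style of message avoids the firewall and NAT problems described above.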

An additional problem with COM is that it relies too heavily on the system registry, making COM programming, component setup, and configuration a nightmare for both developers and administrators. COM itself is not hard, but climbing the COM learning curve is painful, specifically because novices spend most of their time deciphering the meaning of each COM-related registry hive and entry. It would be more appropriate if all of the registry-related information were built into the component itself, so that there would be no need for the system registry. See .NET Framework Essentials for details on how .NET solves this problem.

Enterprise services

As mentioned earlier, componentization permits the plug-n-play of software components and revolutionizes the way people develop software. It's so popular that people build large-scale, enterprise-wide systems based on this concept. Building these kinds of systems poses a number of constraints. Because large-scale systems tend to support a massive number of users, architects must design these systems to scale over time and to support the sharing of resources. That means that they require their developers to develop services like thread pooling, resource pooling, security checking, and others.

These enterprise services are being crafted left and right for each enterprise system that people build. Similar to the marshaling problem found in distributed programming (see Part 1: Distributed Computing), the development of these services is not rocket science. Once you have built a component to support resource pooling for your database, it won't be hard to create another one for your next enterprise application. However, these services are complex enough to introduce defects and bugs that will give you nightmares. In addition, developing these services is tedious, error prone, and outright redundant. Moreover, these services conform to no standards, making them useless to other enterprise systems.

To solve this, component infrastructures, such as Microsoft Transaction Server (MTS), COM+ Services, CORBA Services, and Enterprise Java Beans (EJB), provide the support for these enterprise services. With such support, you don't have to write this type of code ever again when you develop an enterprise application. It is important to understand the principles behind enterprise services and the features that these enterprise services provide.


Aspect-Oriented Programming

Enterprise services are built on the notion of aspect-oriented programming (AOP), a paradigm that is often overlooked but essential in systems that support configuration or runtime changes. Under this paradigm, you develop your software once, but it behaves differently depending upon how it is configured. A simple example of configurable behavior is a program that reads its settings from a Windows initialization (INI) file. With an INI file, however, your program still has to interpret the settings and implement the corresponding feature itself; AOP provides the feature, such as connection pooling, at the system level, so you don't have to write a single line of code.

In AOP, you tell the runtime or the system the features you need either at programming time or at configuration time. At programming time, you simply add special keywords or attributes in your methods or classes, but you don't write any custom code to support these features. At runtime, the system will intercept all calls to your methods or classes, inspect these attributes, and inject the necessary features into your component for you.
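The attribute-and-interception mechanism just described can be sketched in a few lines of Python. This is an illustration of the idea only, not how MTS or COM+ work internally: the requires decorator, the Account class, and the Runtime wrapper are hypothetical names, and the "services" here merely record when they would begin and end.

```python
def requires(*services):
    """Attribute: records which services a method needs, adds no behavior."""
    def mark(func):
        func.required_services = services
        return func
    return mark


class Account:
    """Business code only; no transaction or logging code appears here."""

    def __init__(self):
        self.balance = 0

    @requires("transaction", "logging")
    def deposit(self, amount):
        self.balance += amount
        return self.balance

    def peek(self):
        return self.balance


class Runtime:
    """Stands in for the system: intercepts calls and injects services."""

    def __init__(self, target, trace):
        self._target = target
        self._trace = trace

    def __getattr__(self, name):
        method = getattr(self._target, name)
        # Inspect the attributes declared on the method, if any.
        services = getattr(method, "required_services", ())

        def intercepted(*args, **kwargs):
            for service in services:
                self._trace.append(f"begin {service}")
            try:
                return method(*args, **kwargs)
            finally:
                for service in reversed(services):
                    self._trace.append(f"end {service}")

        return intercepted


trace = []
account = Runtime(Account(), trace)
account.deposit(50)
account.peek()
print(trace)
# ['begin transaction', 'begin logging', 'end logging', 'end transaction']
```

Only deposit carries attributes, so only the call to deposit is surrounded by services; peek passes through the interceptor untouched.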

Instead of specifying these attributes at programming time, you can also delay these declarations until configuration time. Should you do this, you will have the opportunity at configuration and setup time to declare the services that you need. With this ability, the same system can be configured to run in different ways, supporting different features. In other words, two different companies can use the same system by simply declaring different configurations. Although there is only one system, the AOP paradigm allows us to instantiate different instances of this system at configuration time. Looking at this from another conceptual angle, you can instantiate a class in C++, but you can instantiate an entire system under the AOP paradigm.
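Declaring services at configuration time can be sketched the same way. In the following Python example, which is a toy under stated assumptions, the ReportService class, the cache and audit services, and the dictionary-based configuration format are all hypothetical. One code base is instantiated twice with different declarations, giving two systems that behave differently.

```python
from types import SimpleNamespace


class ReportService:
    """Written once; contains no caching or auditing code of its own."""

    def __init__(self):
        self.executions = 0

    def run(self, name):
        self.executions += 1
        return f"report:{name}"


def with_cache(call):
    """Service: remember earlier results so repeated requests are cheap."""
    results = {}

    def wrapped(name):
        if name not in results:
            results[name] = call(name)
        return results[name]

    return wrapped


def with_audit(call, audit_log):
    """Service: record every request before passing it on."""

    def wrapped(name):
        audit_log.append(name)
        return call(name)

    return wrapped


def instantiate_system(config):
    """Assemble one instance of the system from declarative settings."""
    service = ReportService()
    audit_log = []
    call = service.run
    if config.get("cache"):
        call = with_cache(call)
    if config.get("audit"):
        call = with_audit(call, audit_log)
    return SimpleNamespace(run=call, service=service, audit_log=audit_log)


# Two companies, one code base, two configurations.
company_a = instantiate_system({"audit": True})
company_b = instantiate_system({"cache": True})

for system in (company_a, company_b):
    system.run("q3")
    system.run("q3")

print(company_a.service.executions, company_a.audit_log)   # 2 ['q3', 'q3']
print(company_b.service.executions, company_b.audit_log)   # 1 []
```

Company A gets an audit trail and no caching; company B gets caching and no audit trail. ReportService itself is identical in both.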

MTS and COM+ Services use the notion of aspect-oriented programming to provide enterprise services for COM components. You set (at programming time) or declare (at configuration time using the MTS or COM+ Explorer) certain attributes to indicate the kinds of services you want to use, and MTS or COM+ will intercept all calls at runtime and execute the services you've specified.

Typical Enterprise Services

Enterprise services exist to support sharing, scalability, robustness, quick response time, and other features. To support these features, a component technology provides a number of common enterprise services, including thread pooling, object pooling, connection pooling, security management, transaction management, and disconnected processing.

  • Thread pooling. With the ability to create threads in Symmetric Multiprocessing (SMP) systems, developers can spawn threads to spread out workload. A large-scale system typically requires development efforts from not one but many developers. The development experience for one developer is different from that of another. For this reason, it would be wise to let all programmers write code as if they were writing a single-threaded application. The thread pooling enterprise service will worry about spreading the workload automatically. The system spawns a number of threads and puts them in a pool. Each time a thread is required, the system picks a thread from the pool. Not only does this save system resources, but it also simplifies the development effort.

  • Object pooling. Certain objects, such as those that take 20 seconds to initialize a lookup table, are expensive to create. In a large, scalable, and robust system, objects are usually created, used, and destroyed quickly to maintain the consistency and integrity of the backend data. However, always creating and destroying an object that needs 20 seconds to initialize degrades system responsiveness and wastes resources. It would be better if you could create this type of object once but still use it in the same way as all other objects. This is where object pooling comes in. Objects that take a long time to create can be created once and stored in a pool. Each time the object is requested, the system picks the object from the pool and allows it to service the request.

  • Connection pooling. Since database connections are extremely precious, expensive to create, and resource-intensive to keep in memory, you must share database connections in a large-scale system. For example, if you only have ten database licenses and you want to support a hundred users, you've got to share the ten connections. If you have one client opening and closing a connection to a database, you may not notice problems with connection initialization and resource utilization. But imagine what would happen if a thousand users were doing this at once, something that could happen in a large-scale system. Like thread and object pooling, connection pooling holds a pool of connections. If a requested connection has been placed in the pool, the system will pick the connection from the pool and let it service the request.

  • Security management. Writing code to manage security is a complex task. Administrators can only configure security for a particular process, but not the objects and methods that the process provides. To provide an easier way to manage security, a component technology offers the notion of role-based security. A role is an alias that stands for a set of users or groups. Role-based security is an abstraction of system-provided security that allows programmers to write very simple code to manage security, and allows system administrators to easily configure security for components, objects, interfaces, and methods.

  • Transaction management. In large systems, a single object doesn't work alone. It interacts with many other objects to carry out a particular transaction, which can make updates to one or more databases. Transactions support database integrity because they must comply with the ACID properties: atomicity, which means that all work performed by a transaction is either committed or rolled back completely; consistency, which means that data must be consistent no matter the outcome of the transaction; isolation, which means that a transaction is processed in isolation; and durability, which means that data is durable once committed, even if system failures occur.

  • Disconnected processing. It is often said that the largest systems in the world are based on message queues, which allow you to develop asynchronous and loosely coupled systems. Applications can enqueue a message that can later be dequeued by a different application anytime it wishes. In other words, the second application doesn't even have to exist at the time the request is enqueued. In addition, once the first application has enqueued the request, it can immediately perform other processing.

    Again, because the system provides these services, the programmer doesn't have to write custom code to support these features. Instead, the programmer signals to the system the types of needed services by adding attributes at programming time or declaring features at configuration time. The system will intercept all calls at runtime and inject the needed feature. It is also worth noting that Microsoft supports all these services in release 1.x of the COM+ Services.
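As an illustration of the pooling services described above, here is a minimal Python sketch of a connection pool. It is not how COM+ implements pooling; the ConnectionPool class and the open_connection factory are hypothetical, and an in-memory SQLite database stands in for a licensed database server. It mirrors the example of ten connections shared by a hundred users.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hands out a fixed set of connections and takes them back after use."""

    def __init__(self, size, factory):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())          # pay the creation cost once

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # waits if all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)                # return to the pool, do not close


created = []


def open_connection():
    """Hypothetical factory; records each connection that is really opened."""
    conn = sqlite3.connect(":memory:")
    created.append(conn)
    return conn


pool = ConnectionPool(size=10, factory=open_connection)

# One hundred requests are served by the same ten connections.
for user in range(100):
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()

print(len(created))   # 10
```

The same borrow-and-return pattern applies to thread pooling and object pooling; only the pooled resource changes. With a system-provided service, even this small amount of code is something the programmer would declare rather than write.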


Coming Up Next

In this article, I've briefly reviewed componentization and enterprise services, two of the most important concepts that enable software integration and large-scale systems. Since some of the largest scalable software systems are Web-based, I will address Web-based concepts and maturity factors next week, in Part 3 of this series.

Footnotes

  1. If you are interested in the details of the COM binary standard, check out Chapter 3, "Objects," of Learning DCOM, published by O'Reilly.

  2. Dealing with the registry can be a nightmare for COM programmers. To mitigate this pain and suffering, .NET has abolished the use of the registry for component discovery and plug-n-play.


Thuan L. Thai is the author of Learning DCOM and coauthor of .NET Framework Essentials. He has been giving technical presentations on the .NET platform to clients since the announcement of the initiative in August 2000.

