Embracing the Web Part 1: Distributed Computing
by Thuan L. Thai, 07/11/2001
Distributed computing, which simplifies the development of robust client and server applications.
Componentization, which simplifies the integration of software components developed by many different parties.
Enterprise services, which allow the development of scalable enterprise applications without writing code to manage transaction, security, or pooling.
Web paradigm shifts, which are the changes in Web technologies to simplify the development of complex Web applications.
Maturity factors, which are the lessons that the software industry has learned from developing large-scale enterprise and Web applications.
Each of these items possesses a number of important principles. Some of these principles include ease of development, component sharing, resource sharing, and standardization, all of which are required in order to embrace the Web.
This article is the first of a three-part series that revisits these well-known concepts. In this installment, I'll review distributed computing, examining its goals, problems, and solutions. Familiarize yourself with each of the concepts, their benefits and their goals, because they all map directly to the goals of Microsoft .NET.
Distributed Computing
Instead of running on a single computer, large-scale systems execute on a number of different machines. One simple reason is to distribute processing. Other reasons for distributing functionality among different physical machines include the ability to separate user-interface functionality from business logic, increase system robustness, and add security features.
By separating user-interface functionality from business logic, you can more easily divide the development work between the two. This makes development and management of the code much easier. Once you have developed your business logic, you can reuse it later. For example, two different user-interface applications can make use of the same business logic without changes to the business-logic code.
Robustness is another important reason for separating the user interface from the back-end business logic. In a client/server application, the server may service 100 clients. If one of these clients causes a general protection fault, it shouldn't affect the execution of the server or the other 99 clients.
In addition, when you split up user interface, business, and data logic, you may add security features to your system. For instance, your server can provide security functionality that authenticates[1] the client each time it connects to the server. You can even add security logic to authorize[2] every subsequent request that the authenticated client sends.
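To make the two checks concrete, here is a minimal sketch of a server that authenticates once at connect time and authorizes each later request. The user table, the permission table, and the function names are all hypothetical, and the unsalted hash is for illustration only.

```python
import hashlib
import hmac

# Hypothetical in-memory stores. A real server would consult a directory
# or database and would use a salted, deliberately slow password hash.
USERS = {"alice": hashlib.sha256(b"wonderland").hexdigest()}
PERMISSIONS = {"alice": {"read_orders"}}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is the caller who they say they are?"""
    expected = USERS.get(user)
    if expected is None:
        return False
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, supplied)


def authorize(user: str, action: str) -> bool:
    """Authorization: may this known caller perform the requested action?"""
    return action in PERMISSIONS.get(user, set())


def connect(user: str, password: str) -> str:
    """Checked once per connection. Returns a stand-in for a session handle."""
    if not authenticate(user, password):
        raise PermissionError("authentication failed")
    return user


def request(session_user: str, action: str) -> str:
    """Checked on every request sent over an authenticated connection."""
    if not authorize(session_user, action):
        raise PermissionError("not authorized for " + action)
    return action + ": ok"
```

A client would call connect() once and then request() as often as needed; a failure at either step raises PermissionError.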
These reasons make client and server, or distributed, computing very attractive, but traditional client and server computing has drawbacks. Let's discuss these problems, their solutions, and the concepts behind the solutions.
Problems
Each time you develop a client/server application, you have to deal with several issues that repeat themselves in every client/server development effort, including homegrown marshaling code, homegrown security features, and lots of donkeywork.
Homegrown Marshaling Code
When you develop a client/server application, you need to design the application-level communication protocol, so that the client and the server can communicate and send messages to each other. Designing a protocol specification requires the designer to stipulate how many bytes are reserved for the verb, how many bytes are reserved for the size of the supporting data, and so on. Developing such a protocol requires some basic code to parse off the network buffer and extract the necessary information specified in the protocol-specification document.
The code for the application-level communication protocol knows how to assemble and disassemble an application-level protocol network buffer. For this reason, this piece of code resides on both the client and the server. It is called the communications layer, or the marshaler. Assembling and disassembling an application-level protocol network buffer is known as marshaling and unmarshaling, respectively.
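As a rough illustration of what such homegrown marshaling code looks like, the sketch below assumes a made-up wire format: 4 bytes for the verb, 4 bytes for the size of the supporting data, then the data itself. The format and the function names are assumptions for this example, not part of any real protocol.

```python
import struct
from typing import Tuple

# Hypothetical header layout, in network byte order:
#   4 bytes  ASCII verb
#   4 bytes  unsigned payload length
HEADER = struct.Struct("!4sI")


def marshal(verb: str, payload: bytes) -> bytes:
    """Assemble a network buffer from a verb and its supporting data."""
    code = verb.encode("ascii")
    if len(code) != 4:
        raise ValueError("verb must be exactly 4 ASCII characters")
    return HEADER.pack(code, len(payload)) + payload


def unmarshal(buffer: bytes) -> Tuple[str, bytes]:
    """Disassemble a network buffer back into the verb and its data."""
    if len(buffer) < HEADER.size:
        raise ValueError("buffer is shorter than the header")
    code, size = HEADER.unpack_from(buffer)
    payload = buffer[HEADER.size:HEADER.size + size]
    if len(payload) != size:
        raise ValueError("payload is truncated")
    return code.decode("ascii"), payload
```

Both the client and the server need this same pair of functions. Even in a format this small, the length checks are easy to get wrong, which is where the bugs described in this section tend to creep in.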
While designing and developing an application-level protocol may sound trivial, these rudimentary tasks are extremely error prone and time consuming. Many developers go crazy killing bugs that appear in the marshaling code they develop. However, once you have designed and developed an application-level protocol, you can repeat the same process when developing other client/server systems.
Homegrown Security Features
If you want to support authentication or authorization features, you have to design and develop them yourself. Often, software firms develop their own homegrown security modules that they can reuse on different client/server systems. However, the problem is that the security modules don't conform to standards, rendering them useless for interoperating with other systems.
Too Much Donkeywork
If you want to support different network-level protocols (such as TCP/IP, named pipes, NetBIOS, and so on), you must write a marshaler for each type of network protocol that you want your system to support. The code for each of these marshalers is practically the same, with only minor differences in using the associated network protocol. Because of all the code you have to write to develop a marshaler, supporting an additional network protocol is just too much donkeywork.
All this donkeywork should be provided by the operating system.
Solutions
To remove the donkeywork found in traditional client/server application development, the remote procedure call facility of the Distributed Computing Environment (DCE) introduces a simpler paradigm for distributed-application development. Using remote procedure calling, or RPC, developers no longer have to write homegrown marshaling code. This is because the RPC library provides a system-specific marshaler, and the RPC tools can generate the application-specific marshaling code, called an interface marshaler in the Windows world.
With RPC, instead of sending a network buffer of some specified format, developers simply write the code to make a function call, as if the target remote function resided in the same process. Using the generated interface marshaler, the RPC library ensures that the target function will be executed remotely. Like traditional client/server development, marshaling exists, but unlike traditional client/server development, you don't have to write marshaling code. Aside from the support of remote procedure calls, RPC also supports authentication and authorization features. RPC also lets you configure the network protocol you would like your client/server application to use.
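The following toy sketch shows the shape of the idea, not DCE RPC itself. A hand-written Stub class stands in for the generated interface marshaler, JSON stands in for the wire format, and an in-process function stands in for the network. All of these names are assumptions made for illustration.

```python
import json


# --- Server side: an ordinary function, unaware of any wire format ---
def add(a, b):
    return a + b


EXPORTED = {"add": add}  # hypothetical table of remotely callable functions


def server_dispatch(request: bytes) -> bytes:
    """Unmarshal a request, call the target function, marshal the reply."""
    call = json.loads(request.decode("utf-8"))
    result = EXPORTED[call["method"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")


# --- Client side: a stub that plays the role of generated marshaling code ---
class Stub:
    def __init__(self, transport):
        # transport is any callable that takes request bytes and returns
        # reply bytes; a real system would send them over the network.
        self._transport = transport

    def __getattr__(self, method):
        def remote_call(*args):
            request = json.dumps({"method": method, "args": list(args)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call


calculator = Stub(transport=server_dispatch)
print(calculator.add(2, 3))  # prints 5
```

The caller writes calculator.add(2, 3) as if add were local. All of the assembling and disassembling of buffers happens inside the stub and the dispatcher, which is the code an RPC toolchain would supply for you.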
Concepts
RPC utilizes an important principle, the separation of interface from implementation. In RPC, you specify or define an RPC interface. Once you've done this, you can implement the interface in your product. Other people can rely on this interface specification to use your product. They care very much about your interface, but they do not care about your internal implementation. Even if you change your implementation in future releases of your product, they will still be able to use your product because they rely on your interface specification.
An added advantage to interface specification is that other software vendors can implement your specified interface. Should they do this, the people who know your interface can also use the products that these vendors produce. This is important because it shines the light on software cooperation, interoperation, and interconnection.
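Here is a small sketch of the principle, using a hypothetical SpellChecker interface and two interchangeable implementations. Python's abstract base class plays the role that an IDL interface definition would play in RPC; the names are invented for this example.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class SpellChecker(ABC):
    """The published interface. Callers depend on this and nothing else."""

    @abstractmethod
    def is_correct(self, word: str) -> bool:
        """Return True if the word is spelled correctly."""


class ListSpellChecker(SpellChecker):
    """One vendor's implementation: a linear scan over a list."""

    def __init__(self, words: Iterable[str]) -> None:
        self._words = [w.lower() for w in words]

    def is_correct(self, word: str) -> bool:
        return word.lower() in self._words


class SetSpellChecker(SpellChecker):
    """Another vendor's (or a later release's) implementation: a hash set."""

    def __init__(self, words: Iterable[str]) -> None:
        self._words = {w.lower() for w in words}

    def is_correct(self, word: str) -> bool:
        return word.lower() in self._words


def count_errors(checker: SpellChecker, text: str) -> int:
    """Client code, written against the interface only."""
    return sum(1 for word in text.split() if not checker.is_correct(word))
```

count_errors works unchanged with either class, because it relies only on what the interface promises and never on how a particular product keeps that promise.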
Instead of writing your own hideous application-level protocol specification, you can use the language that RPC gives you for this purpose, the Interface Definition Language (IDL). IDL simplifies the application-level protocol specification because it allows a programmer to think in terms of function prototypes, which make programmers feel at home. Another advantage of IDL is that it looks syntactically just like C-language code.
Beyond simplifying a programmer's life, RPC also introduces the concept of attributes. This is an extremely important concept that carries itself into the realm of .NET. A computer theorist would call this feature an implementation of aspect-oriented programming. Instead of writing 30 lines of code, aspect-oriented programming lets you specify what you want to the system using a hint (formally called an attribute), which is nothing more than a special keyword. At compile time, link time, or run time, the system uses this hint to inject special logic to support the functionality you've requested via the attribute. In short, this feature saves you from writing tedious, repetitive, and error-prone code, thus saving time, money, and effort.
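Python has no IDL-style attributes, so the sketch below uses a decorator as a loose analogy: one declarative line asks for retry behavior, and the wrapper logic is injected for you. The retry decorator and the flaky_fetch function are hypothetical names invented for this example.

```python
import functools


def retry(attempts: int):
    """Declarative hint: rerun the function on ConnectionError, up to `attempts` tries."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except ConnectionError:
                    if attempt == attempts:
                        raise  # out of tries, let the caller see the failure
        return wrapper
    return decorate


calls = {"count": 0}  # records how many times the function body actually ran


@retry(attempts=3)  # the "attribute": one line instead of a hand-written loop
def flaky_fetch() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "payload"
```

The body of flaky_fetch contains no retry logic at all. The hint on the line above it is what causes the extra behavior to be wrapped around the call.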
It is also important to note that RPC promotes the principle of location transparency. Client-side programmers simply make a procedure call. RPC will ensure that the procedure will be executed on the server side, wherever the server machine lives. Server-side programmers simply write the functions according to the C-language prototypes generated by the IDL compiler--a tool that generates interface-marshaling code and outputs a C-language header file (as discussed earlier). You can see why it is so easy to develop n-tier applications using RPC--no more parsing network buffers--all you need to do is make simple function calls.
Coming Up Next
This article briefly reviews distributed computing's goals, problems, and solutions. Concepts such as the separation of interface from implementation, attribute-based programming, and location transparency are vital to software development. These concepts make componentization and enterprise services a reality. In Part 2 of this series, we'll examine componentization and enterprise services.
Footnotes
1. Authentication refers to validating the user identity: "Are you who you say you are?"
2. Authorization refers to validating the user's access to a particular resource: "I know you, but do you have access to the requested resource?"
Thuan L. Thai is the author of Learning DCOM and coauthor of .NET Framework Essentials. He has been giving technical presentations on the .NET platform to clients since the announcement of the initiative in August 2000.
O'Reilly & Associates recently released (June 2001) .NET Framework Essentials.