DSL, UML and Reality
IT folks are always looking for ways to simplify and accelerate the delivery process. Niche ideas often linger for years until technology finally advances far enough to make them feasible for the masses. Sometimes it is not even a technology advancement that is needed, but simply a change of mindset. Whichever holds true here, Domain-Specific Languages (DSLs) are finally starting to enter mainstream consciousness as a compelling mechanism for delivering software.
For those new to the concept, a DSL is simply a way to build solutions that fall into a specialized domain area. When I say build, I mean just that: the code for the solution is created automatically from the DSL description. The basic idea of the DSL sprang up from the complexity of delivering solutions in a general-purpose language such as Java, or even of describing a solution in a general-purpose medium like UML.
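To make "the code is created automatically from the DSL description" concrete, here is a minimal sketch in Python. The `entity` syntax, the `parse` and `generate` helpers and the sample entities are all made up for illustration; real DSL tooling is far richer, but the shape is the same: a description goes in, an implementation comes out.

```python
# Hypothetical mini-DSL: each line declares an entity and its typed fields.
DESCRIPTION = """
entity Customer: name str, email str
entity Order: number int, total float
"""

def parse(description):
    """Turn the DSL text into {entity_name: [(field, type_name), ...]}."""
    model = {}
    for line in description.strip().splitlines():
        head, _, body = line.partition(":")
        keyword, name = head.split()
        if keyword != "entity":
            raise ValueError(f"unknown keyword: {keyword}")
        model[name] = [tuple(part.split()) for part in body.split(",")]
    return model

def generate(model):
    """Emit Python source code for every entity in the model."""
    lines = []
    for name, fields in model.items():
        params = ", ".join(f"{field}: {type_name}" for field, type_name in fields)
        lines.append(f"class {name}:")
        lines.append(f"    def __init__(self, {params}):")
        for field, _ in fields:
            lines.append(f"        self.{field} = {field}")
        lines.append("")
    return "\n".join(lines)

source = generate(parse(DESCRIPTION))
namespace = {}
exec(source, namespace)  # the generated code is the implementation
customer = namespace["Customer"]("Ada", "ada@example.com")
print(customer.name)
```

Nobody hand-writes the `Customer` or `Order` classes here. Change the description and the implementation changes with it, which is exactly the property UML diagrams lack.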
Shattering the UML Mythology
The biggest difference between UML and a DSL is that a DSL focuses on the description of the solution translating directly into the implementation of the solution. UML makes no such attempt. Personally, I hate UML, mainly because I hate complexity and repetitive work. UML is heavyweight and requires a good bit of training. If all the work of describing something in UML could automatically turn into running code, then maybe the investment would be worth it. As it stands, UML only serves to describe a system, not to produce the actual implementation of a system. Sure, you can use UML to describe how you will implement the system, but then you still have to run off and code it. Some UML tools make an attempt at generating code, but because of the massive breadth of UML, no tool can generate everything that is described (hence the rise of DSLs, which I will elaborate on later).
It always makes me chuckle when someone claims that because a development tool uses UML, whatever it describes can automatically move to another UML tool and be built into running code. When I read or hear such drivel, I immediately dump that person into the “knucklehead” category. Did you know that since Microsoft is now using UML in its Oslo initiative, whatever you build there can be moved into Rational and run as an application on Java? Yeah, right. You might get a good skeleton from which to code the full solution, but you are sure not going to get a running application. That would be a panacea, but I have yet to see anything even remotely close to this working anywhere. The closest I have seen is BPEL crossing tools. But in the BPEL case the actual web services themselves maintain their existing implementation, and only the shell stringing them together changes.
The vast majority of shops using UML simply use it as a mechanism for documenting the system that is to be built or has been built. Even this use of UML often seems like a fool’s errand, since many of the artifacts described in UML (use cases and sequence diagrams) are not kept in sync with the code that is written to implement the description. Developers constantly have to go back and update the UML sequence diagrams or use cases once they inevitably find that the description provided in UML cannot be implemented as described. This all seems like a colossal waste of time. Obviously, I am not the only one who thinks this, as just about every developer I have ever talked to seriously dislikes using UML and keeping it in sync with code. As it stands, UML is often just a glorified mechanism for describing a system. While I can appreciate the need for a consistent way of describing a system, the complexity and redundancy required by most UML tools is not worth the cost. About the most that UML tools can keep in sync with code is the data model and some very basic sequencing. Oh, joy. Yet again, this is not worth the cost. For any UML aficionados reading this, I would ask: on how many projects have you not had to go back and make significant adjustments to the UML documentation after implementation, and how much of the actual implementation was generated from the UML documentation?
Jack of All Trades, Master of None
The most common DSLs out there are actually subsets of UML that focus on a specific domain and create actual code implementations. Ed Merks, who leads the Eclipse Modeling Framework (EMF), had a great line during one of the Modeling Birds of a Feather talks at EclipseCon 2008: “it is not my fault that you have no reference implementation.” This quote was directed at the UML contingent representing OMG, who wanted EMF to model some functionality that is not even possible to deliver in Java. I agree with Ed’s assessment that this lack of a reference implementation is responsible for the current position of UML as documentation. EMF, for instance, is well coupled with UML but fixates on having a Java implementation. This approach of taking a subset of UML and using it to specify a DSL that has a specific implementation (or reference implementation) is great. It means that you cannot leverage the entirety of UML, but who cares? Per Ed’s comment, if I know that I am implementing a system in Java, then I really do not care that I cannot use the multiple inheritance aspects of UML, since Java does not support multiple superclasses anyhow. That is the problem with UML: it tries to be all things to all people, which is why it ends up being a documentation engine more than anything else.
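Here is a rough sketch of what "a subset with a reference implementation" buys you, written in Python for brevity. It is not EMF's API, and the `MODEL` dictionary and `to_java` function are invented for illustration. The generator targets Java, so anything Java cannot express, such as a class with two superclasses, is simply declared outside the subset instead of being left as an unimplementable diagram.

```python
# Hypothetical model: class name -> list of superclasses, as a UML-like
# description might allow. Nothing here is EMF's actual API.
MODEL = {
    "Vehicle": [],
    "Car": ["Vehicle"],
    "Boat": ["Vehicle"],
    "AmphibiousCar": ["Car", "Boat"],  # legal in UML, not in Java
}

def to_java(model):
    """Generate Java class skeletons, rejecting what Java cannot express."""
    generated, rejected = {}, {}
    for name, supers in model.items():
        if len(supers) > 1:
            rejected[name] = "multiple superclasses: " + ", ".join(supers)
            continue
        extends = f" extends {supers[0]}" if supers else ""
        generated[name] = f"public class {name}{extends} {{\n}}\n"
    return generated, rejected

generated, rejected = to_java(MODEL)
for java_source in generated.values():
    print(java_source)
for name, reason in rejected.items():
    print(f"// {name} is outside the Java subset ({reason})")
```

Everything that survives the check has a concrete Java form, and that is the whole point: the narrower language can promise an implementation where the broader one can only promise a picture.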
Technology or Business
DSLs are not perfect, but they are a heck of a lot better than UML. As the concept of the DSL has begun to enter the mainstream, exactly what constitutes a “domain” is only now being hammered out. Domain is a pretty overloaded term in the IT world, and in searching the web for definitions of DSL, you are not going to find a ton of specificity. The bulk of DSLs seem to fall into one of two categories: technology domains and business domains. Most DSL descriptions focus on business domains. For instance, there are lots of examples online of using Ruby to create a DSL for some specific industry such as healthcare. In such descriptions of a DSL, there is a data model specific to the healthcare domain along with some functional operations. In healthcare specifically, there is HL7, which defines some domain-specific data models, and Microsoft has gone as far as to specify a WSDL.
The term DSL is also often used to refer to a technology domain rather than a business domain. For instance, Structured Query Language (SQL) is a technology-focused DSL rather than a business DSL. In fact, SQL is often used as an enabler for many business-specific DSLs. This layering is a pretty common trend. The software we make here at Skyway is a DSL for web applications and web services, yet we have customers who have implemented business-specific DSLs on top of Skyway.
These different uses of the term DSL need some specificity in order to be clear to anyone who is not already pretty familiar with the DSL concept. For the time being, it serves us here at Skyway to simply refer to a business-focused DSL or a technology-focused DSL.
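The layering described above, a business-focused DSL sitting on top of a technology-focused one, can be sketched in a few lines of Python using the standard `sqlite3` module. The `patient` table, the sample rows and the phrase grammar understood by `find_patients` are all hypothetical; they only illustrate how a healthcare-flavored vocabulary might be translated into SQL underneath.

```python
import sqlite3

# Technology-focused DSL: plain SQL sets up and queries the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (name TEXT, ward TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [("Ann", "cardiology", 71), ("Bob", "oncology", 45), ("Cy", "cardiology", 38)],
)

def find_patients(phrase):
    """Business-focused layer (hypothetical): translate phrases such as
    'patients in cardiology over 60' into a parameterized SQL query."""
    words = phrase.split()
    if len(words) not in (3, 5) or words[:2] != ["patients", "in"]:
        raise ValueError(f"cannot understand: {phrase}")
    sql, params = "SELECT name FROM patient WHERE ward = ?", [words[2]]
    if len(words) == 5:
        if words[3] != "over":
            raise ValueError(f"cannot understand: {phrase}")
        sql += " AND age > ?"
        params.append(int(words[4]))
    return [row[0] for row in conn.execute(sql + " ORDER BY name", params)]

print(find_patients("patients in cardiology"))
print(find_patients("patients in cardiology over 60"))
```

The person writing "patients in cardiology over 60" never sees SQL, and the SQL engine never needs to know anything about wards or patients beyond table and column names. Each layer stays in its own domain.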
Wrapping it Up
DSLs are often subsets of UML or of general-purpose languages that focus on a specific business or technical domain and on the real implementation of solutions in that domain. Where DSLs can simplify and accelerate solution delivery, UML is mostly going to serve as a mechanism for documenting solutions. UML and DSLs are not mutually exclusive: you could very well document the system in UML using use cases and sequence diagrams, and then proceed to build the solution in a DSL such as Skyway or a business DSL created in Ruby or some other technology.







