martes, 31 de julio de 2007

Fumbling the Future

This is not a rant about how Xerox created the future but then did not have the courage to pick up the benefits, and so fumbled its own future.

This rant is about what is likely to happen in the following 5 years:

Guessing the Future

1. Virtualization of computers on top of the JVM. Today the JVM runs at up to 80% of native speed, but it will reach 120% of the speed or more through optimizations (how, I don't know, but look at past abstraction layers: once they were slower, then they became faster; Curses and Java, for example). There will be no reason to run an operating system directly on hardware.

2. Multicore: 2007: 4 CPUs; 2011: 16 CPUs; 2020: 256 CPUs. How do we use all that raw power? One solution is virtualization, for consolidating servers on one machine. Another solution is multithreading, for making programs run faster; the problem is that typical multithreading is hard, especially if you want to improve the speed tenfold. The typical speed improvement is not even 2x when coding multithreaded code by hand, and the optimizations are error prone. The EJB model allows writing multithreaded code as if it were single threaded, and although EJB was probably not the best implementation, it is a very smart idea and I bet someone will come up with a way to implement it correctly, so that magically all Java programs become multithreaded.

3. Hard drives replaced by pendrives in 2012: Computers will get smaller and consume less energy. Some people argue that this can't happen because the flash technology pendrives are based on does not allow writing the same memory address more than a few million times. That is a very interesting argument, but it is a technological one. I can't know how it is going to be fixed, but it will be fixed somehow, simply because there is a market pull in that direction, so there is money to be invested and smart ideas will get funded. This means computers will get less expensive, more people will buy computers, the software market will become bigger and therefore software makers will become richer. Also, since pendrives are not mechanical, the structure of data on disk will no longer need to follow the B-tree layout: B-trees won because access to contiguous portions of a hard disk meant faster access, but flash pendrives behave like RAM, so the data could be stored in HashMaps, TreeMaps or LinkedHashMaps and it wouldn't matter; access will be at least 10 times faster. Paging memory will also become faster, so 64-bit computers will have practically unlimited memory (compared with today's miserable 2 GB). You laugh now, but we will talk in 2012, when we have computers with 32 GB of virtual RAM.

4. 32" inches LCDs on every desktop on 2012: Less consumption and less space means lower prices and less eye strain, since the LCD monitors do not flicker.

5. WiMax means always connected and always on. IP radio, IP cells and IPTV will be used on the road. TV programs will be stored on pendrives, and everyone will have their own TV shows, which means less quality (can we even go any lower? yes, unfortunately, but at least you will have many options and you will have to dig for information, which is really good for Google). When will WiMax take off? Maybe 10 years from now, but by then it will be too late to develop the technology. It must be created and perfected 10 years in advance, as happened with other technologies.

6. Since computers will be so cheap and powerful, fast operating systems will be considered legacy, and secure operating systems (microkernels written in Java, for example) will be the operating systems du jour, as long as they can run Java (and its legacy). Windows and Linux will run on top of Java, and since Java operating systems like BEA's Liquid VM run on bare hardware: goodbye, operating systems.

7. ERPs like SAP have taken the market by storm because they offer an integrated solution for the whole enterprise, but they are not easily configurable. All the 6-month projects to adapt SAP turned into 3-year projects that either eventually delivered or were simply killed. SAP is migrating to Java, and there are many open source ERPs, so you will be able to found your own company on cheap hardware and free software; and once everyone can do it, it is not a good business anymore. Companies will have to use cheap hardware and free software, but also invest in differentiation if they want to survive. The copycats will thrive.

8. All software will run on the web, using Ajax technologies, mimicking the way Windows or Mac OS X work. A new standard for web usability will emerge. Which one? I wish I knew, but I guess people will prefer the tried and true (the Desktop???)...

9. Operating systems running on top of Java, and Java running in a browser. This of course means operating systems running on top of a browser. There is a web browser called Lobo that is written in Java. What we still lack is an OS written fully in Java; with one, Java will be able to run independently of all the legacy code while still supporting it. Then the hardware manufacturers will be free to improve the CPU design, removing everything that is not needed to run Java. Would an operating system written in Java be significant? For starters, the code must be short and simple; it must be a microkernel and use virtualization to run several operating systems on top (even itself). That way, if a driver fails, only that one driver fails and it doesn't take the whole machine down.

10. The app server market is fragmented. If there were a mechanism for executing all the code from the different app servers in just one app server, it would be a real hit. For the moment there are machines like Azul's, with more than 300 cores, that can execute different OSes and different app servers on the same machine, integrating several platforms and making everything execute faster, because an application can communicate with the other servers over the internal data bus. It would be a lot faster still if the applications were developed for the Azul hardware instead (really meaning all in Java, since Java can run on top of a bare Azul system).

11. Ethernet at 40 Gbit/s and 100 Gbit/s will be standard in 2010.

12. The market for IT (converting manual processes into automated and semi-automated ones) is far deeper than what is actually delivered today. It is only through constant failure that the market manages to deliver such poor performance. The main problem is that when users and analysts design new systems, there is a lack of developers to implement the new tasks, and they deliver in weeks, months and years instead of minutes, hours and days. There is a strong market for a need that has not been satisfied.

13. Devices connected through TCP/IP. This means lower time to market and lower design expense, since all devices and drivers will simply use TCP/IP. Besides, there are improvements in bus utilization, because information can be fragmented, so you don't have to finish one operation to start the next. This means better throughput and more reliable bus communication. It would also mean that you could connect devices to another computer without having to go through the host's CPU.

lunes, 30 de julio de 2007

Database Algebra

Explanation

x = variable

X = defined name

x -> y = given x find y

<x> = set of x

{} = empty set

{x} = a set with one element called x

{x,y,z} = a set with x, y and z

[x] = list of x

(x..y) = range from x to y

(x, y) = vector of x and y

y : <x> = y belongs to set of x

<: = is a strict subset of

<=: = is a subset of

x || y = x or y

s1 U: s2 = union of set s1 and set s2

s1 I: s2 = intersection of set s1 and set s2

x # x : <y> = x so that x belongs to set of y

x => y = if x is true then y is true

x = (y,z) => x.y = if x is composed of (y,z), you can address x.y

PrimitiveType = ( int, float, date, string, string[n], blob )

Once the notation is defined (or should we call them axioms?), we can define some properties that we can prove to be true.

Algebra

(x..y) -> z == x -> z

(x, y, z) == ((x, y), z)

(x) == x

x -> y == ( <x>, f(x) = y )

Not very impressive, huh?

Well, the devil is in the details, and therefore the paradise is also in the details, if you know how to find and exorcise those devils ;-)


Database

Type = PrimitiveType

Fielddef = (name, Type)

Field = (name, value, Fielddef) # (value : fielddef.type)

Tabledef = ( <Fielddef>, pk ) # pk <=: <Fielddef>

Row = <Field>

Rowtx = ( tx, Row ) # tx : int

Rowdata = ( (lowertx..highertx) -> Rowtx ) # ( lowertx : int && highertx : int )

Tabledata = ( Tabledef, pk -> Rowdata ) # ( pk <=: Tabledef && row.<Fielddef> == tabledef.<Fielddef> )

Table = (Tabledef, Tabledata) # ( Tabledata.pk <=: Tabledef && Tabledata.Tabledef == Tabledef )

Database = ( <Table>, lowertx, highertx ) # ( lowertx : int && highertx : int )

What is the point of having an algebra to define a database? I know for sure this algebra is not entirely correct (I was transcribing it from Notepad and found several bugs), but let us assume we can write a correct database algebra. And then let us assume that we have an interpreter for this algebra.

We would have a database.

So what? I hear you say. Databases have existed for 50 years, so why are you trying to reinvent the wheel?

First, current database technology is dated. It does not implement the relational algebra or the relational model properly. And the most important thing: it is full of hacks. Unmaintainable. Unable to grow.

I want this to be a vehicle for thought.


And now let us have some fun...


Table operations

table.insert(Row) # table.tabledef.<Fielddef> == row.<Fielddef>

table.update(updaterow, matchrow) # updaterow : Row && matchrow : Row

table.project( <name> ) # <name> <=: table.tabledef.<name> // sql select stmt

table.select( matchrow ) # matchrow : Row // sql where stmt
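
To make these operations concrete, here is a minimal Java sketch of a table interpreting the algebra above. This is only my illustration, not part of the algebra: rows are maps from field name to value, and all the names (SimpleTable, fielddefs and so on) are hypothetical.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SimpleTable
{
    private final List<String> fielddefs;  // Tabledef = ( <Fielddef>, pk )
    private final List<String> pk;         // pk <=: <Fielddef>
    private final List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();

    SimpleTable( List<String> fielddefs, List<String> pk )
    {
        this.fielddefs = fielddefs;
        this.pk = pk;
    }

    // table.insert(Row): the row must carry all the defined fields.
    void insert( Map<String, Object> row )
    {
        if ( !row.keySet().containsAll( fielddefs ) )
            throw new IllegalArgumentException( "row does not match tabledef" );
        rows.add( new LinkedHashMap<String, Object>( row ) );
    }

    // table.select(matchrow): keep rows whose fields equal matchrow's fields.
    List<Map<String, Object>> select( Map<String, Object> matchrow )
    {
        List<Map<String, Object>> ret = new ArrayList<Map<String, Object>>();
        for ( Map<String, Object> row : rows )
            if ( row.entrySet().containsAll( matchrow.entrySet() ) )
                ret.add( row );
        return ret;
    }

    // table.project(<name>): keep only the named columns of every row.
    List<Map<String, Object>> project( List<String> names )
    {
        List<Map<String, Object>> ret = new ArrayList<Map<String, Object>>();
        for ( Map<String, Object> row : rows )
        {
            Map<String, Object> projected = new LinkedHashMap<String, Object>();
            for ( String name : names )
                projected.put( name, row.get( name ) );
            ret.add( projected );
        }
        return ret;
    }
}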

viernes, 27 de julio de 2007

Map Reduce (part 1)

Map and reduce have made a comeback after Google published a paper on how they do distributed computing.

Can we do it in Java?

For starters, there are no standard map and reduce implementations in Java as there are in ML. The map here is not the Map interface, but the Lisp mapcar function.

Here is my attempt to implement map and reduce in Java:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// A function object: applies an operation to a single element.
interface Applicable
{
    Object apply( Object ob );
}

// A binary operation used to fold a list into a single value.
interface Reductible
{
    Object oper( Object ob1, Object ob2 );
}

// map/reduce over lists and maps.
interface Mapping
{
    List map( List list, Applicable app );
    Object reduce( List list, Object val, Reductible redux );
    Map map( Map map, Applicable app );
    Map mapKeys( Map map, Applicable app );
    Map mapValues( Map map, Applicable app );
}

class SimpleMapping implements Mapping
{
    public List map( List list, Applicable app )
    {
        List ret = new ArrayList();
        for ( Iterator it = list.iterator(); it.hasNext(); )
        {
            Object value = it.next();
            Object newVal = app.apply( value );
            ret.add( newVal );
        }
        return ret;
    }

    public Object reduce( List list, Object val, Reductible redux )
    {
        Object ret = val;
        for ( Iterator it = list.iterator(); it.hasNext(); )
        {
            Object value = it.next();
            ret = redux.oper( ret, value );
        }
        return ret;
    }

    public Map map( Map map, Applicable app )
    {
        Map ret = new HashMap();
        for ( Iterator it = map.keySet().iterator(); it.hasNext(); )
        {
            Object key = it.next();
            Object value = map.get( key );
            Object newKey = app.apply( key );
            Object newVal = app.apply( value );
            ret.put( newKey, newVal );
        }
        return ret;
    }

    public Map mapKeys( Map map, Applicable app )
    {
        Map ret = new HashMap();
        for ( Iterator it = map.keySet().iterator(); it.hasNext(); )
        {
            Object key = it.next();
            Object value = map.get( key );
            Object newKey = app.apply( key );
            ret.put( newKey, value );
        }
        return ret;
    }

    public Map mapValues( Map map, Applicable app )
    {
        Map ret = new HashMap();
        for ( Iterator it = map.keySet().iterator(); it.hasNext(); )
        {
            Object key = it.next();
            Object value = map.get( key );
            Object newVal = app.apply( value );
            ret.put( key, newVal );
        }
        return ret;
    }
}

// Example

public class Example
{
    public void testMapping()
    {
        Mapping mapping = new SimpleMapping();
        List list = new ArrayList();
        list.add( new Integer( 1 ) );
        list.add( new Integer( 2 ) );
        list.add( new Integer( 3 ) );
        printList( list );
        // Increment every element of the list.
        list = mapping.map( list, new Applicable() {
            public Object apply( Object ob )
            {
                Integer val = (Integer) ob;
                val = new Integer( val.intValue() + 1 );
                return val;
            }
        } );
        printList( list );
    }

    public void testReduce()
    {
        Mapping mapping = new SimpleMapping();
        List list = new ArrayList();
        list.add( new Integer( 1 ) );
        list.add( new Integer( 2 ) );
        list.add( new Integer( 3 ) );
        printList( list );
        list = mapping.map( list, new Applicable() {
            public Object apply( Object ob )
            {
                Integer val = (Integer) ob;
                val = new Integer( val.intValue() + 1 );
                return val;
            }
        } );
        // Fold the list into its sum, starting from zero.
        Integer zero = new Integer( 0 );
        Integer result = (Integer) mapping.reduce( list, zero, new Reductible() {
            public Object oper( Object ob1, Object ob2 )
            {
                Integer val1 = (Integer) ob1;
                Integer val2 = (Integer) ob2;
                Integer val3 = new Integer( val1.intValue() + val2.intValue() );
                return val3;
            }
        } );
        System.out.println( "sum = " + result );
        printList( list );
    }

    public void printList( List list )
    {
        Mapping mapping = new SimpleMapping();
        System.out.print( "[ " );
        mapping.map( list, new Applicable() {
            public Object apply( Object ob )
            {
                System.out.print( ob + " " );
                return ob;
            }
        } );
        System.out.println( "]" );
    }

    public static void main( String [] args )
    {
        Example example = new Example();
        example.testMapping();
        example.testReduce();
    }
}


As you can see, this is pretty... verbose. I doubt anyone would like to use this implementation because it is so cumbersome. First-class blocks in Java would help (that is what Java 7's closures are supposed to add); probably by now you think the word "closure" is a misnomer and "first class block" is the correct wording.

Once that philosophical and etymological feature has been defined (and probably corrected), the most important thing about map/reduce is that you want to code something using this map/reduce technique and have it automatically scale to a distributed algorithm, with thousands of computers each doing a part of the calculation.

Let's start the design.

First, how can you code against the Mapping interface and have it call either this SimpleMapping or a DistributedMapping (yet to be written)?

I've written an EjbProxy, which basically implements a "jump" from one JVM to the next using the EJB standard. EJB calls are rather expensive, so you only want to do one of those expensive jumps when the data is too large to fit on one computer, or when the processing is so expensive (like finding all possible divisors of n, that is, all p and q such that p * q = n) that the search space can be divided among several computers.

What implementation would the EjbProxy call for the DistributedMapping?

It would call itself, but using a different range of values to search. Let us call that range the SearchContext. So a DistributedMapping would have several machines, each with its own search context.

In other words we would have a HashMap on each computer saying:

SearchContext -> machine

And also each machine needs to know which SearchContext is its own:

machine -> SearchContext

So it really would be handy if we could have a bidirectional mapping:

SearchContext <-> machine
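
Here is a minimal sketch of such a bidirectional mapping, assuming plain in-memory HashMaps (the key and value types are placeholders; a real DistributedMapping would also need the EjbProxy plumbing described above):

import java.util.HashMap;
import java.util.Map;

class BiMap<K, V>
{
    private final Map<K, V> forward = new HashMap<K, V>();
    private final Map<V, K> backward = new HashMap<V, K>();

    public void put( K key, V value )
    {
        // Remove stale pairs so the two maps stay mutually consistent.
        V oldValue = forward.remove( key );
        if ( oldValue != null ) backward.remove( oldValue );
        K oldKey = backward.remove( value );
        if ( oldKey != null ) forward.remove( oldKey );
        forward.put( key, value );
        backward.put( value, key );
    }

    public V get( K key )      { return forward.get( key ); }   // SearchContext -> machine
    public K getKey( V value ) { return backward.get( value ); } // machine -> SearchContext
}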

martes, 24 de julio de 2007

Closures, Anonymous functions and Blocks (Part1)

A block is a chunk of code whose execution has been delayed.

For example when you write:

void m( int x, int y )
{
    doThis( x );
    doThat( y );
}

Then you don't expect doThis() and doThat() to be executed immediately, but only after m() is called. This probably doesn't make sense to you if you have only ever compiled Java, because in compiled languages there is one extra step: compiling with javac and then invoking the program (with java). But imagine you are writing code in an IDE that executes your code as you write it, so that if you write main(...) it executes main, but if you write void main(...) { ... } you are simply defining main, and at the end of the definition the IDE says: "main defined for class XXX".

If you understand this, then you know exactly what a block is in Smalltalk. In Smalltalk blocks are a way to write delayed evaluation, for example if you execute:

a doThis: x with: y

You are sending the message "#doThis:with:" (ignore the quotes, they were added by me to complicate things, and ignore the # for the moment) and the parameters are x and y, in that order.

The equivalent in Java would be:

a.doThisWith( x, y );

The thing is that Smalltalk may execute that immediately (if instructed to). For the curious who still do not get Smalltalk-80 IDEs: you select the text to evaluate with the mouse, then right-click so that the context menu appears, and select either printIt or doIt, depending on what you want to do. That is a programming interface from 1980, and still, in 2007, some people don't get it.

So we can imagine a Java IDE that lets us evaluate code. I mean, some debuggers allow us to do this already, so this is not so far-fetched. Although, let me explain to the Smalltalk naysayers and to the aficionados-to-be that in Smalltalk the program is always running, meaning that if you want to take the program down, you have to take the IDE down too.

So in Smalltalk I can write code that the IDE doesn't execute immediately. That magic is called a block, and it is used like this:

[a doThis: x with: y]

When I evaluate it I get nothing and when I print it I get "a Block".

Smalltalk uses these blocks as tools to build the control structures (like if, while, etc.) that Smalltalk does not have built in.

aBoolean ifTrue: [ b doThat: y + x ] ifFalse: [ a doThis: x with: y ].

The Java equivalent would be:

if ( aBoolean )
{
    b.doThat( y + x );
}
else
{
    a.doThisWith( x, y );
}

So the [] in Smalltalk are the equivalent of {} in Java.

Ok, guys and gals, you can go away now, there is nothing to be seen here.

Still here? Ok, there are some differences. Blocks in Smalltalk can take parameters:

aList do: [ :elem | elem doSomething ]

which would be the equivalent of:

for ( Iterator it = aList.iterator(); it.hasNext(); )
{
    Elem elem = (Elem) it.next();
    elem.doSomething();
}

The thing is that a block in Smalltalk is a First Class Object, an object of the first kind granted all the privileges of full objects, which means it can be assigned to variables, passed to functions and returned from functions. Which is to say, it can be assigned to variables and the language is not weird, so:

block1 := [ :elem | elem doSomething ].
aList do: block1.

Coolisimo! But there is more.

Did I mention that Smalltalk has no built-in flow control? Then how does Smalltalk do while loops?

[condition] whileTrue: aBlock.

condition can be any boolean variable or boolean expression, while aBlock can be any block. Please notice that the receiver of #whileTrue: is itself a block. Why?

Because the condition has to be tested after every completion of the cycle in the loop. If it were not written as a block, it would be evaluated once, and then the loop would either never execute or never exit.

This means the control structures in Smalltalk are not built into the language, and you can create new control structures, like switch/case and the like. This of course is not the case in Java, because in Java blocks are not first-class citizens, and the control structures are therefore built into the compiler and the JVM.
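
To see how far plain Java can go, here is a sketch of a user-defined control structure built from interface-based "blocks"; the names Condition, Body and whileTrue are all made up for this example:

interface Condition { boolean value(); }
interface Body      { void value(); }

class Control
{
    static void whileTrue( Condition condition, Body body )
    {
        // The condition is wrapped in a "block" so it can be re-evaluated
        // on every pass, exactly as the Smalltalk version requires.
        while ( condition.value() )
            body.value();
    }
}

// Usage (the array trick lets the anonymous classes mutate a local):
// final int [] i = { 0 };
// Control.whileTrue(
//     new Condition() { public boolean value() { return i[0] < 3; } },
//     new Body()      { public void value()    { System.out.println( i[0]++ ); } } );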

How is aList>>do: implemented?

"let us suppose that the list has a first node and then each node has a next node"
do: aBlock
    | currentNode |
    currentNode := firstNode.
    [ currentNode isNil not ] whileTrue: [
        aBlock value: currentNode.
        currentNode := currentNode next ]

Comments in Smalltalk are surrounded by double quotes, and variable declarations can occur only at the top of a method, between pipes: | aVar | declares aVar as an untyped variable, because in Smalltalk all variables are untyped.

Enough of Smalltalk already, can we do that in Java?

Block block1 = Block #void#( Elem elem ) {
    elem.doSomething();
};
aList.forEach( block1 );

Please notice that I wrote nothing strange except for the assignment. If I can get the assignment to work, then passing a block as a parameter and returning a block are logical conclusions, so let us concentrate on the assignment.

Java has something similar to this called anonymous classes, so that for example you can write delayed execution using Runnables:

Runnable delexec = new Runnable() {
    public void run() {
        System.out.println( "hello" );
    }
};
delexec.run();

So Java has a mechanism to simulate and emulate blocks, but the syntax needs a lot of sugar.

Also, Runnables are OK when you don't need parameters, but if you do need parameters, you have to declare your own interfaces, and you end up with endless interface pollution. Even if you just pass Object, you would still need:

public interface Executable
{
    void run();
    void runOn( Object ob1 );
    void runOn( Object ob1, Object ob2 );
    void runOn( Object ob1, Object ob2, Object ob3 );
    void runOnAll( Object [] obArr );
}

The problem with this solution is that you would need to implement all the methods, even if you just needed one.

So imagine Runnable when needing no parameters (run()), then Runnable1 when needing 1 parameter (runOn(Object ob1)), Runnable2 when needing 2 parameters (runOn(Object ob1, Object ob2)), etc.
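
With Java 5 generics, the interface-per-arity workaround at least loses the casts; here is a sketch, with made-up names:

// One small interface per arity; generics remove the Object casts,
// but the arity explosion remains.
interface Block0<R>       { R value(); }
interface Block1<A, R>    { R value( A a ); }
interface Block2<A, B, R> { R value( A a, B b ); }

class BlockExample
{
    public static void main( String [] args )
    {
        // A one-parameter "block" that increments an Integer.
        Block1<Integer, Integer> inc = new Block1<Integer, Integer>() {
            public Integer value( Integer x ) { return x + 1; }
        };
        System.out.println( inc.value( 41 ) ); // prints 42
    }
}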

A closure is a block of code that captures its free variables from the surrounding scope. A free variable is a variable that is not bound inside the block itself (it is not a parameter and is not assigned there), so it must take its value from the environment where the block was created; a block whose inputs all arrive as parameters needs no such capture.

miércoles, 18 de julio de 2007

Load Linked Store Conditional

There are 2 best known ways to create lock free thread safe data structures:

1. LL/SC (Load Linked/Store Conditional)
2. CAS (Compare And Swap)

A lock-free thread-safe data structure is a data structure that can be used in multithreaded environments (hence thread-safe) and that uses no locks (or uses very small ones, like CAS and LL/SC, hence lock-free).

LL/SC is a technique that uses 2 instructions: LL and SC. LL loads a value and marks the memory location as reserved, so the CPU remembers who made the reservation; when SC is later used to store a new value to that same location, the store fails if the location was written in the meantime (typically by another thread), hence "conditional".

It works the following way. First, a copy of the data structure is made, and the old pointer value is marked "reserved for update" using LL. Once the whole lock-free algorithm finishes, it is time to set the pointer to its new location using SC. As you can imagine, this means both LL and SC must be atomic (they usually are, implemented as single CPU instructions).

In the case of CAS, there is only one CPU instruction, conveniently called CAS, which takes the expected old value and a new value for a memory location and swaps in the new value atomically, but only if the location still holds the expected one.

CAS can be implemented using LL/SC and LL/SC can be implemented using CAS.

The whole idea of lock-free synchronization is that critical sections can be made so small that the whole algorithm may execute in parallel. The only disadvantage is that each thread needs to copy the part of the data structure being modified, and the algorithm must be prepared for failure (in case some other thread modified the same data first). Since swapping a pointer can be done atomically, magically we have several threads doing their magic at the same time.
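
Java exposes CAS directly through java.util.concurrent.atomic (since Java 5). Here is a minimal sketch of the canonical CAS retry loop; at this level LL/SC looks the same, except that the conditional store may also fail spuriously:

import java.util.concurrent.atomic.AtomicInteger;

class LockFreeCounter
{
    private final AtomicInteger value = new AtomicInteger( 0 );

    public int increment()
    {
        for ( ;; )
        {
            int current = value.get();      // "load"
            int next = current + 1;         // compute on a private copy
            if ( value.compareAndSet( current, next ) ) // "store conditional"
                return next;
            // Another thread changed the value in the meantime: retry.
        }
    }
}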

viernes, 13 de julio de 2007

Development Tools

Here is a list of open source software development tools that I can't live without:

1. Javac, the Java compiler. Firstly because Java is multiplatform and WORA (Write Once, Run Anywhere); secondly because Java's dynamic proxies are a dream come true for writing less code that performs more; and thirdly because Java is open source, so it can be ported to any platform. Could you imagine developing all that software in Java only to find out 10 years later that your software no longer runs? Java is the best way to protect your investment.

2. Svn and Ant. Svn lets me store projects under version control, and Ant lets me compile those projects out of the box, without manual intervention. This means the time invested in individual projects is never lost.

3. TortoiseSVN: Lets me do code reviews before check-in. TortoiseSVN is integrated with Windows Explorer, so you just right-click your project, select "commit" and it shows you the list of files to commit (and code review).

4. JUnit: Test all classes. Period. It doubles the amount of code you need to write, but programmers like to write code, right? Besides, the code becomes untangled, or else you can't test it. You save at least 60% of the project time because you debug a lot less. Whenever a Java project is in trouble, I know for sure they are not using JUnit.

5. Eclipse: Those refactorings are breathtaking. Besides that, it is a fine tool.

6. Insecticida: My own pet project, a task tracker that is iteration aware. With it I know exactly how the project is doing. I can't live without it.

7. LuntBuild: You need to dedicate a machine to this monster, but it really pays off. It compiles and runs the tests in your project after every check-in. If you use Svn and Ant, you would be irrational not to use LuntBuild, although it can be done: just ask your developers to run Ant before any check-in (which they normally do, don't they?) and, after any check-in, to grab a brand new checkout of the project in a new directory and run Ant. They will beg for LuntBuild ;-)

8. Wiki: For achieving conceptual integrity in the project, all concepts must be written in the Wiki. Nice and easy! The only problem is that I haven't found a Wiki that I really like, so I will just have to build one.

9. Selenium: For doing automated functional tests. Defining tests is no more complex than running the application while using the Selenium IDE plugin for Firefox. Any user can do it. Then you save the file generated by Selenium (it is an HTML file) and run it using Selenium Core.

10. PowerDesigner: Not open source, not free. Very expensive, but worth it: this tool lets you model the database and generate the SQL script.

11. IBM Heap Analyzer: Analyzes the JVM heap when it core-dumps. Core dumps are not fun, and neither is this tool.

12. Checkstyle: Checks the style of the code automagically. Fine tool.

13. HP JTune: Garbage collector visualizer. First you generate stats from the Java runtime, and then you analyze that data using this tool.

jueves, 12 de julio de 2007

Class diagrams considered harmful

I still have issues with the UML, specifically with the class diagrams.

Class diagrams show implementation details

Class diagrams can be delivered at the very end of a development effort, just when everything seems to work, so that we can show the implementation details to the extremely curious; but drawing class diagrams while use cases are being written is a waste of time, sets in stone what really can't be decided so early, and distracts attention from what is important to what is irrelevant.

First, the important issue in object modeling is how objects behave in RAM; that is, object diagrams are a lot more important than class diagrams. And no, one can't be derived from the other. Object diagrams show how objects will interact in RAM, while class diagrams just show implementation details, and if the class hierarchy is set up correctly, they can't show how objects behave in RAM. Otherwise, if your class diagrams and object diagrams are almost equivalent, you have a very *fixed* way of interrelating objects in RAM, obviously one that is completely dictated by the class hierarchy. If the objects need to relate in another way, changes must be made to the class hierarchy, meaning the class hierarchy will never get stable. So you either err on the side where you can't change the class hierarchy, or you err on the side where your class hierarchy never seems to get stable.

ORMs and class diagrams

There is another problem with confusing the object diagram with the class diagram. Let us suppose you need to store your object model in a database. Would you model your database after your class diagram or after your object diagram? If you think you want to store your classes, think again. You want to store objects, therefore the object diagram is the one to store.

There are more pervasive defects in the way most modellers approach design in UML. For example, if I have a Person class, having a single Person table may not be the best option. Tables are just object containers, persistent ones, but containers anyway. Could you imagine if you wanted to store a new list of persons (say, people who owe you something) and there was only one list of Person allowed? Of course developers would complain, since it makes no sense, and I sincerely hope you see the problem here.

Then why don't OO developers complain if there is only one Person persistent container?

I suppose the problem has more to do with specialization than anything else.

Specialization in the software development field means that you either know object orientation (encapsulation + inheritance = polymorphism, plus something about object identity) or you know databases (relational algebra, SQL, ACID properties). It is very hard for developers to know both worlds intimately, because the language is arcane, a term used in one of the fields is not exactly the same in the other, and all concepts come with baggage.

Even inside each of those fields there is some disagreement. For example, SQL people like nulls, but the people who invented the relational model (Codd and Date) rejected the idea. Fortunately OO was influenced by Lisp, so the null concept is pretty straightforward. There are other controversies about whether SQL is a properly defined computer language, since it breaks a lot of rules other programming languages have settled since the start of the discipline, like variables that can hold expressions: SQL is a very strange computer language, because you can't assign a variable the result of a query (you can iteratively manipulate data using cursors, but that defeats the purpose of SQL, which is to manage data in sets and never individually).

Also in the OO field (so that the database field does not take the criticism personally) there is some controversy about object identity (is it really necessary for an OO language to be defined as such?), about how inheritance breaks encapsulation, about whether the important part of object orientation is polymorphism or not, and about how polymorphism could be achieved without inheritance.

Back to the class diagram fiasco

It doesn't matter whether class A extends class B or vice versa; all those decisions are just implementation details. The important issue is the protocol (the set of methods and their behavior) that each class can handle, which is normally expressed as the "responsibility" of the class. All the rest can be obtained as a logical consequence of the responsibility of the class.

Inheritance in particular is just an implementation detail used to avoid code repetition, but other useful mechanisms could serve: automatic code generation, AOP, dynamic proxies, traits, mixins, etc.

Setting the class hierarchy in stone is a sure way to bang your head against the wall.

miércoles, 11 de julio de 2007

Multicores and lock-free programming

Personal computers are getting multi-core, meaning that a single chip can have 2, 4 or even 128 CPUs.

This is important for software writers, since multithreading with up to 1024 threads will not be uncommon, and all the synchronization problems are a big roadblock on the way to thread heaven.

Synchronization problems are of 2 kinds:

1. Race conditions: Not getting what you want.
2. Contention: Getting what you want, but too late.

Usually race conditions are alleviated using locks: mutexes, semaphores, read-write locks, condition variables, monitors, etc. The problem with all of these is that they introduce contention in one way or another, because implicitly they make one thread wait while another executes the 'critical section'.

OK, so this seems to be a problem without a solution; go home now, get a beer and enjoy The Simpsons.

Still here? Hmmm, ok, there is lock free synchronization, which is really a form of obscuring the problem.

See: http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-639.pdf

At the heart of lock-free algorithms lies the best of lies: they do not remove the locks, they just make them smaller and smaller, until you cannot see the locks.

Let me explain with an example of the database kind. In the old days of databases (1997), it was common to write software that used temporary tables, so you hired 30 guys, each wrote his piece of the action using temporary tables, and finally, when you integrated everything, you realized that the whole system froze whenever a user triggered work over a temporary table.

Why? Because all databases store their metadata in tables (this is a requirement for being called a "relational database"), and every time you did an insert in a table, the table would lock until you either committed or rolled back. This "feature" exists because databases are transactional. (There was a joke in those days: "It is not a bug, it is a feature".)

The Microsoft SQL Server guys realized people were using databases like this, and instead of issuing a warning to the software makers, they released a new version that did not lock the entire temporary meta-table when somebody did an insert (the meta-table is the table of tables), so that two processes could create 2 temporary tables and continue to work in parallel.

Soon this improvement was applied to all tables, not just the temporary meta-table. The result was more parallelism, but please notice that a big temporary lock was just replaced with a smaller temporary lock; that, conceptually, was the real advance.

Why can't we do this for all software, that is, convert all those big locks into really small ones?

What if I needed to perform a really big change in a data structure, but instead of locking the whole data structure, I locked just the elements to be modified? Any modification could still block (because others were modifying it) or even deadlock (because we grabbed our locks in different orders), but then we could make one of those threads succeed and restart the other (undoing its changes).

Undoing changes is cheap in the database world; undoing them in the systems programmer's world is hard, because neither the language nor the runtime helps. But someone came up with the idea that we could just copy everything to a temporary area in RAM, and with a single switch of pointers (called CAS), we get instant gratification.

The copied data structure is simply let go if the CAS operation finds that the original was modified before CAS had a chance to execute; therefore the only lock is the CAS itself, and in case of failure it is easy to restart the whole operation.

In fact, this means that a very big piece of software can remain unaware of multithreading because it uses thread-safe, lock-free data structures, and those data structures work by making copies of everything and switching pointers at the end, thereby reducing the lock time from several milliseconds to a few nanoseconds.
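
As a sketch of that "copy, then switch a pointer" idea, here is the classic lock-free stack (a Treiber stack) written with Java's AtomicReference; the names are mine and error handling is minimal:

import java.util.concurrent.atomic.AtomicReference;

class LockFreeStack<T>
{
    private static class Node<T>
    {
        final T item;
        final Node<T> next;
        Node( T item, Node<T> next ) { this.item = item; this.next = next; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<Node<T>>();

    public void push( T item )
    {
        for ( ;; )
        {
            Node<T> oldHead = head.get();
            Node<T> newHead = new Node<T>( item, oldHead ); // private copy
            if ( head.compareAndSet( oldHead, newHead ) )   // pointer switch
                return;
            // Someone pushed or popped first: discard the copy and retry.
        }
    }

    public T pop()
    {
        for ( ;; )
        {
            Node<T> oldHead = head.get();
            if ( oldHead == null )
                return null;
            if ( head.compareAndSet( oldHead, oldHead.next ) )
                return oldHead.item;
        }
    }
}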

This of course makes CPU manufacturers happy and software developers happy, because they can deliver software that is written as if it ran in one thread, but performs better than carefully hand-written multithreaded software.

Setting Prices

Setting prices is something we all do, either as consumers or as producers. We set a price when we shop and we call it "budget". We are willing to spend less, but we are not willing to spend more, depending on quality features of the product.

Even people working at a corporation set prices for their work when they first join, and they improve their methods of work in order to receive salary increases. If they don't receive salary increases, they stop trying to improve their work and simply warm their chairs, wandering randomly through email, webpages, etc. No observer can tell for sure whether they are doing their job or not, since they all seem to be goofing off.

When negotiating salary increases, IT employees tend to mention other employees' salaries and salaries for similar jobs at other companies. This is called a market price. If you don't give them what they are asking for, they generally announce that the job relationship is going to end on a given date, since IT employees are not really good at bargaining.

Most companies know this and either offer bonuses and salary increases (which are generally paid only if the employee threatens to leave) or simply replace the unhappy worker with a new one.

When you need to set a price for something you have created, you need to look at the market for prices and it is always best to use a market price, so that consumers have to decide based on merit.

Most consumers go with the establishment, that is to say, they prefer established companies which have been in the market for a very long time. And if you have been with a salesman before, you know they will tell you how bad said company's product is because it has so many *problems*... and stare in silence awaiting your reply... You think it is not possible for them to know that and they are making it up, but do you really want to engage in a technical discussion with a salesman? He is a salesman, so whatever he says is based on somebody else's opinion, so the empty statement will be justified by other empty statements, and so forth ad infinitum. There is one exception, though: when the salesman mentions a specific company that had a problem with said product, you could call them and find out whether it was true or not. But if you think about it, would people really let you know they had a problem? Even if they are your very good friends, they have no incentive to tell you the truth, and even if they do, most of the information is based on gossip anyway, so you can't trust it.

If you try to outcompete other companies by charging more, you will be out of business in no time. You certainly can't compete if your price is higher than the market price, unless your product is in a different category. That is why some companies market their products as "World class": it is again an empty statement, but it does make a point with the buyer: buy this, and since "all the world is buying it, your bet is safe". This is just a different way of saying "No one has ever been fired for buying IBM", a very long marketing tradition at IBM. Then, when the product doesn't work, you go to the salesman and ask for a reimbursement, and the salesman replies: "I didn't tell you it would work; all I said was that it was world class". "World lesson" would be a better term.

So prices are at most market prices unless you invent your own product category, so that you have no competition. But what if I want to charge less than the market price?

The reasoning goes like this: if I charge 50% of the market price, I will sell twice as much and therefore have the same profit as my competitors, but after a while 80% of the market will be mine, and since in the computer software business most companies always want more, I will be asked for more products and therefore land a lot more sales of new goods; and since I will be in my own category (no competition), I will be able to set prices, integrate those products and dominate the world. Let us call that strategy "World Conquer".

The World Conquer strategy doesn't work unless the source is not released (a la Microsoft) and you don't depend on somebody else's special hardware or special software (a la Microsoft). Microsoft started out developing compilers, and to this day they use their own compilers. Because of that, they could develop their own operating systems.

Have a look: Linux started the same way, from GCC. I think this is not a coincidence.

You need vertical integration in order to make money, and the same is true if you want to create a free OS. If Linux had been written in Microsoft C++, Microsoft could have decided the fate of Linux.

Do you think investors would trust your company if it were based on Windows and Microsoft C++? Probably some investors would, but I bet those investors would be neither tech savvy nor business savvy. Probably you wouldn't get that much money, and you would be left wondering whether you had the right connections.

But people connect on ideas. You need to have the right ideas and let the people with the right ideas connect to you. Eventually everything becomes simple, because people imitate each other; smart people, when imitating other smart people, behave in an even smarter way. They communicate better, because other people are able to understand what they are saying, and the same ideas are said again and again in different ways.

People with the right ideas attract other people with the right ideas.

You don't need a whole bunch of ideas. I mean 10 important ideas are all you need. The rest can all be figured out because they are the details.

It is not as important to connect to smart people as it is to disconnect from not-so-smart people. Ideas tend to be memes, so if you constantly speak with people who can only watch TV and soap operas, you will end up thinking like them, or worse yet, not thinking but repeating assertions that make no sense but are in vogue.

OK, why then doesn't setting lower prices work? Suppose you go into the supermarket and all olive oil is $10 per liter. Then you see a bottle that is only $1 per liter. Would you buy it? Unless you know the product in advance, you think: I'm supposed to eat that stuff, not oil my bike with it, so you prefer the good old and known instead of the new and unknown.

In the corporate world, there is stuff to eat (the important stuff) and stuff to oil your bike with (unimportant stuff, unless you use your bike to win the decathlon), and you probably want to make a profit selling your products to corporations, in which case the product is important to them. If the product weren't important, because they were just going to oil their bikes, they would change the price at will, since they have so many offers of oil, and they really wouldn't care whether they used your oil or recycled motor oil, as long as they could concentrate on the important stuff.

See my point? In order to have market power, you need to understand the real needs of the market, that is the needs that are not satisfied... yet. And once you know them, stick to the market price.

Now let us suppose that you found a market that is not served well: your product is superior, and the companies selling the competing products are selling basically crap. If you work as a Java developer, you already know all those developers who can't even reverse a linked list, so you know exactly what I'm talking about, since in a given company sometimes all the people left can't really code, but they can look as if they could.

What could a company in this situation do? It could set a price higher than the rest of the companies, so that, let us say, 20% of the market is served by your company with the only good product and 80% of the market is served by the underdogs. Eventually *your* 20% of the market owns 100% of their own markets while the other 80% is gone. Can it get any better?

This doesn't happen in practice because:

  1. If you are the only programmer who knows how to reverse a linked list, your manager will ask you to do it once, and all the other programmers will copy that code, making your abilities irrelevant.
  2. If your company decides to charge more, all other companies will improve their products until they match all your important features. Over and over, during a very long period of time, you will improve your product and they will catch up or die, but the ones able to catch up will sell more than you, because they have a lower price.

So the only way to make loads of money is to charge the market price (or slightly less), have a better product, have strong salesmen, AND have a product no one can imitate.

martes, 10 de julio de 2007

On Software Quality...

One way to radically improve the quality of the software is to create libraries.

Libraries reduce code duplication and let the same piece of code be reused extensively, increasing the amount of testing that is done over that same piece of code.

Having the same code spread all over the system is a maintenance hazard, since if a bug is found in that code, it is for sure spread all over the system. Fixing a line of code, statistically, introduces a new software defect, so if you touch 20 lines of code to fix a defect, statistically you are introducing 20 new defects.

Some projects detect a defect that requires touching 2,000 lines of code to be properly fixed, and decide not to fix the bug, because the cost of finding and fixing all those 2,000 potential new bugs is a burden that very few projects could take.

Libraries can usually be tested apart from the rest of the system, because they are reusable. This is a very important characteristic: since libraries can be reused in other projects, they are an investment; but it also means defects and dependencies can be contained within the library. Although a defect can affect the systems built on top of the library, the library can easily be fixed and tested independently, and once fixed, it can be incorporated again into the systems built on top of it.

Case Analysis at Microsoft

Microsoft is a well known firm in the PC software market and has some very well known good practices, like the informal code review before check-in, written code conventions, the daily build, the smoke test and the dog-fooding process. Nevertheless, ex-Microsoft employees criticize Microsoft for having abstractions that are too low-level. Well, having abstractions that are too low-level is having no abstraction at all.

There are coding metrics used at Microsoft that measure performance by the number of lines of code each developer writes. Many Microsofties complain that this practice encourages sloppy coding and measures neither the actual effort nor the actual benefit of the job performed. It creates a perverse incentive for every developer to copy and paste rather than create reusable functions, since otherwise, at the performance review, not only they but their whole team will appear to be underperforming.

There has been some debate inside Microsoft as to whether measuring performance by counting lines of code is a good practice, and several key executives within Microsoft have publicly declared that using function points would be more appropriate; but since there is a known correlation between the two, anyone can easily use one or the other interchangeably, with the added benefit that counting source lines is cheaper.

Companies that insist on using function points have the problem that there is no automatic way of telling how many function points a program has, some function points are harder than others (the same can be said of lines), and usually companies forced to use function points simply reduce them to how many reads and writes a program does, be it to a file, to a database, or through getters and setters. When management measures like that, any competent developer can make his program use more reads and more writes (i.e., make it inefficient) if that means getting a better evaluation.

Let us suppose we have two developers named A and B. Developer A is very good at reducing duplicated code and using very fast data structures, minimizing the amount of data moved in or out of classes, and in general making programs that perform faster than the rest. Developer B is just the opposite: he writes very long programs with very similar code copied and pasted all over the place. Then he adds lots and lots of optimizations which improve all the unimportant parts of the code and make very little difference to the speed, but make the program move a lot of data.

Developer A will get a very bad review while developer B will get a very good one, probably a salary increase and a promotion. Eventually developer A will either leave for greener pastures or end up working under B's supervision, which means that all developers in the company will do as B does, whatever they might prefer.

The point I'm trying to make is that both counting lines of code and counting function points are aberrations. We will see how this claim stands the reality test using a gedankenexperiment (a thought experiment).

Let us suppose that developer A needs to create a new reusable function Fa of 10 lines and use it in 3 different places, but since he has already been told not to create small functions, he decides to copy and paste instead: rather than the 10 lines of code called from 3 places (13 lines total), he ends up with 30 lines of code.

Now comes developer B, who needs to grab function Fa and call it inside his new function Fb, which contains 5 lines of code (including the call to Fa) and will be called 20 times. But developer B has the same motivation as developer A not to create that function, especially since he has also been told not to create small functions, and a five-line function doesn't seem big at all.

In the scenario where both developers are encouraged to create new functions, developer B would already be done, having written his 5-line function and called it 20 times (25 lines total). But in the scenario where both developers are encouraged not to create new functions, developer B has to look at the 30 lines of code written by developer A, understand them, see which of them can be copied and pasted most easily, and then copy that code 20 times, plus 4 additional lines each time. That's exactly 14 lines copied 20 times, or 280 lines total. If the statistics are real, that means that developer B will take almost 10 times longer without functions than with them, and will introduce 10 times more defects.

Also, since the program is now less modular and less structured, it will take 10 times longer to test it thoroughly. The number of defects present will be higher, and the number of defects detected will be higher too, while the percentage of defects detected will be lower, because there will be less time to find the defects, less time to test the system, and more time spent removing defects that were reproduced by copy and paste.

Let us suppose that the average defect density is 1 defect per 10 lines of code. In the case where both developers used functions, A wrote 13 lines (1.3 defects) and B wrote 25 lines (2.5 defects), so the total would be 3.8 defects (call it 4, since 3.8 is just a statistical average).

In the case where neither developer used functions, A wrote 30 lines of code (3 defects) and B wrote 280 lines (28 defects), so the total would be 31 defects, roughly 8 times higher.

Statistically, each developer can fix 3 defects per day while introducing a new one each day, so the net rate is 2 defects fixed per day. The time it takes one developer to fix all 4 defects is 2 to 3 days, depending on whether he introduces a new defect on the last day or not.

In the case of 31 defects, a single developer would take 15.5 days to fix them, or 15 to 16 days depending on whether he introduces one defect on the last day or not.

Let us consider the total cost of development, assuming each developer writes 100 lines of code per day and that the testers take the same amount of time testing the code as was necessary to write it:

Strategy      | Write the code | Test to find defects | Fix the detected defects
No functions  | 3 days         | 1 day                | 16 days
Use functions | 3 days         | 1 day                | 3 days


This example shows what happens with a very small task and very few developers. Now let us suppose that we have many developers and very large tasks. What would happen?

First, let us suppose that only developer A does his task and developer B is not required to do his. Do we still see a huge impact?

Strategy      | Write the code | Test to find defects | Fix the detected defects
No functions  | 0.3 days       | 0.3 days             | 2 days
Use functions | 0.13 days      | 0.13 days            | 1 day


We now see that the numbers are not so different for smaller tasks and smaller teams, but when teams grow larger, the number of copy-pastes increases exponentially, since every line that was copied and pasted is now a potential candidate for a new copy and paste.

I think the real reason Microsoft prefers copy and paste over creating new functions is the added cost of creating a new name. I mean, what can you expect from a company that names its windowing operating system "Windows", its document processor "Word", its web browser "Internet Explorer" and its planning program "Project"? They have problems inventing meaningful names, since Bob, PowerPoint and Excel are not really names one could consider remotely related to the functions they perform. (Well, maybe making a power point does have some sense, but that is an exception.)

lunes, 9 de julio de 2007

Predicting how long projects will take...




Scrum proposes that for every iteration (Sprint, in Scrum terminology) you plan a work burndown chart (WBC) [notice the apparently coincidental similarity with WBS: Work Breakdown Structure], so that time is on the X axis and Y represents the amount of work left, in hours.


(This picture was copied from http://www.controlchaos.com/about/burndown.php)




As you have (I suppose) already imagined, the line is a diagonal forming a triangle with the X and Y axes (when planning it is a diagonal, or at least almost a diagonal; what actually happens in the project, if you could measure completion at any point, would look more like a staircase going up and down).





Well, so far it seems deceptively simple.

Scrum masters also draw another line including the new items that appear during the iteration. Those were unplanned items and therefore they have to be written somewhere else.



(Taken without permission from: http://danube.com/docs/scrumworks/pro/latest/reports.html)

I was wondering why all those items were drawn below the X axis, since apparently they just pile up waiting for someone to fix them, and I was feeling uncomfortable with it. When the time in the iteration runs out, how much work is left?

Some Scrum books suggest that you project those 2 lines and they will meet at some point, so you will know for sure when the iteration will be finished. But they also admit that several days may go by with no progress, because the originally planned line has become flat... so now the lines never touch again.

What if people begin solving bugs first? I think this is one of the best strategies for finishing early, yet the WBC encourages working according to the plan and leaving all those pesky bugs for the next iteration...

Alistair Cockburn has a very good example of a burn down chart and its issues.

So I was wondering if we can know for sure how long it would take, without resorting to the "project the lines" misconception. I really think it is a misconception because all those new issues should be re-estimated and put in their own iteration and burndown chart. At least that is what Alistair was doing in his example, but I still think it is too expensive, because it is an "after the fact" exploratory system. When faced with an a priori estimating method and an a posteriori one, I would certainly prefer the a priori method, because it delivers value before the other one can. If "after the fact" is important, I can always do another estimation later.

How could we estimate a priori the amount of work left? We can estimate a posteriori each and every item and actually fix them in the next iteration, but that iteration will also have some bugs, and so on.

Let us suppose that we empirically determine that after every iteration we always have half the amount of work left (and for the sake of simplicity, let us suppose it is always half of the previous iteration).

So if we had 256 hours left in the first iteration, after 256 hours of plentiful work we still have 128 hours left just to finish what was left from the first iteration. Agilists will complain that the amount of time in an iteration is fixed, while the amount of work is variable during the iteration (we can remove items, not add them; others like to add them), but the problem now is that we finished all the items and new items appeared, because we didn't realize all the border cases in advance, for example. We will call this time to fix bugs the re-iteration, for lack of a better term.

We reschedule and we work those 128 hours, only to find out that there are bugs, and when we estimate them it amounts to 64 hours left. So we continue to work, and this goes on and on. This is the 21st-century version of Achilles and the tortoise.

Can we know in advance how long the first iteration and all its re-iterations will take?

sum( n = 0 to infinity, x^n ) = 1 / (1 - x), for |x| < 1

See: http://en.wikipedia.org/wiki/Power_series

So if we always have half the work left, sum( n = 0 to infinity, (1/2)^n ) = 1 / ( 1 - 1/2 ) = 2.

That is to say, the first iteration plus all its re-iterations will always take double what we estimated. Does that sound familiar?

The same can be calculated if we estimate that we always have a third of the work left: sum( n = 0 to infinity, (1/3)^n ) = 1 / ( 1 - 1/3 ) = 3/2.
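As a sanity check, here is a minimal Java sketch (the class, the method name and the numbers are mine) comparing the closed form 1 / (1 - r) against summing the re-iterations hour by hour:

public class ReIterations
{
    // Closed form of the geometric series: estimate * (1 + r + r^2 + ...)
    static double projectedTotal( double estimateHours, double leftoverRatio )
    {
        return estimateHours / ( 1.0 - leftoverRatio );
    }

    public static void main( String[] args )
    {
        double estimate = 256.0; // hours planned for the first iteration
        double r = 0.5; // half of the work is always left over

        // Summing the re-iterations by hand: 256 + 128 + 64 + ...
        double total = 0.0;
        double left = estimate;
        while ( left >= 1.0 )
        {
            total += left;
            left *= r;
        }

        System.out.println( "closed form: " + projectedTotal( estimate, r ) ); // 512.0
        System.out.println( "summed to the last hour: " + total ); // 511.0
    }
}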

I think this is a very important result, because it eliminates the need to make expensive burndown charts, although of course you still need to decompose the project using a work breakdown structure, identify risks, develop prototypes, use iterations, etc. The advantage is that, given a few iterations, you can gather some statistics and predict accurately how long the project will take.

But remember you need to finish those iterations first in order to gather meaningful statistics; otherwise all that estimation is just wishful thinking.


It is very interesting that Agile Methods are merging with the Balanced Scorecard: http://www.agilejournal.com/articles/articles/the-agilev-scorecard/

Friday, July 6, 2007

Rants about patterns

Alex is ranting about the Singleton pattern and the Template Method pattern (TMP).

I agree, he is right. The TMP is rather poor, but it is the heart and intent of all object orientation (Encapsulation + Inheritance = Polymorphism). To be fair, Polymorphism comes in 2 flavors in Java:

  1. Class polymorphism, i.e. the template method pattern (see the sketch below).
  2. Interface polymorphism, i.e. use interfaces instead of abstract classes.
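For reference, here is a minimal sketch of flavor 1 (the class and method names are made up): the base class fixes the skeleton of an algorithm and defers one step to subclasses.

abstract class Report
{
    // The template method: the algorithm's skeleton is fixed here...
    public final void print()
    {
        System.out.println( "=== header ===" );
        printBody(); // ...and this single step is polymorphic
    }

    protected abstract void printBody();
}

class SalesReport extends Report
{
    protected void printBody() { System.out.println( "sales figures" ); }
}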

I have to agree with Alex on this one, but I don't like his proposed solution: simply use an interface. What about all the code repetition?

It may be a good mechanism, but it is certainly a lot of code for any programmer to write; the TMP is very simple in comparison.

I have another proposed solution: use Traits. Simply stated, a trait is just a class that has methods but no instance variables; some of the methods are abstract and must be redefined somewhere, and the rest of the methods depend on those.

Here is a paper that explains Traits when compared to Interfaces, Mixins and Multiple Inheritance.

Multiple Inheritance is something to avoid at all costs. Interfaces are well defined, but they do not share any behavior, as we all know. Mixins are the next best thing, but a Mixin just mixes 2 classes creating a new one. It reminds me of templates in C++, also something you want to avoid, because the code looks simple, but what it does under the hood is disgusting.

There are some intentions to create alternative versions of Java that support Traits. I wonder if we really need that. Isn't Java Turing complete? Why should I have to extend Java to implement something so simple as a Trait?

Let us explain a wee little bit. First of all, if you use the TMP, you can share some code in the base class and override the template methods in different classes. The only problem, at least in Java, is that you can't extend several classes at once, and therefore there is some code duplication.

For example, let us suppose you have 4 classes: A, B, C and D, and they are defined like this:

class A
{
    public void a() { /* ... */ }
}

class B extends A
{
    public void b() { /* ... */ }
}

class C extends A
{
    public void c() { /* ... */ }
}

class D extends B
{
    public void c() { /* ... */ } // same code as in C: duplication
    public void d() { /* ... */ }
}

This is the typical diamond problem (as it would arise with multiple inheritance) and its solution in a single-inheritance language. Yes, method c() is repeated both in C and D, but Java doesn't have multiple inheritance, so this is the only solution.

Nevertheless, you hate repeated code, so you read about Mixins and Traits. Mixins keep the source clean, which is a good thing, but the compiled code is a mess, so you study Traits.

Are you still with me?

We need a more realistic example to show how a Trait would work.

class Person
{
    String id;
    String name;
}

class Professor extends Person
{
    List courseList; // List of Course
    void administerTest( Test test, Course course ) { /* ... */ }
}

class Student extends Person
{
    List courseList; // List of Course
    void giveTest( Test test, Course course ) { /* ... */ }
}

class Course
{
    Professor professor;
    List studentList; // List of Student
    List testList; // List of Test
}

class AssistantProfessor extends Student, Professor // illegal: Java has no multiple inheritance
{}

Since AssistantProfessor can't really extend Student and Professor, we need to either extend Student or Professor and copy and paste the missing methods, ie:

class AssistantProfessor extends Student
{
    List courseList; // List of Courses to teach
    void administerTest( Test test, Course course ) { /* copied from Professor */ }
}

or:

class AssistantProfessor extends Professor
{
    List courseList; // List of Courses to study
    void giveTest( Test test, Course course ) { /* copied from Student */ }
}

As you can see, the code is heavily duplicated one way or the other.

The same solution using Traits:

abstract class PersonTrait {}

class Person extends PersonTrait
{
    String id;
    String name;
}

abstract class ProfessorTrait extends PersonTrait
{
    abstract List getCourseList(); // List of Course
    void administerTest( Test test, Course course ) { /* ... */ }
}

class Professor extends ProfessorTrait
{
    List courseList; // List of Course
    List getCourseList() { return courseList; }
}

abstract class StudentTrait extends PersonTrait
{
    abstract List getCourseList(); // List of Course
    void giveTest( Test test, Course course ) { /* ... */ }
}

class Student extends StudentTrait
{
    List courseList; // List of Course
    List getCourseList() { return courseList; }
}

You have probably noticed that each class exists twice: once as a traits class with no instance variables, and once as a normal class descending from the traits class. The class hierarchy contains only traits classes (which are abstract, since they hold abstract methods), and the leaf classes descend directly from those traits classes.

Also leaf classes are sometimes identical, like the Professor and the Student classes.

What about the AssistantProfessor?

abstract class AssistantProfessorTrait extends StudentTrait, ProfessorTrait // illegal: only one superclass allowed
{}

class AssistantProfessor extends AssistantProfessorTrait
{
    List courseList; // List of Course
    List getCourseList() { return courseList; }
}

The main problem with this is that AssistantProfessorTrait can't extend 2 classes. Even if it could, there would be no way to define AssistantProfessor's getCourseList() so that it satisfies both StudentTrait and ProfessorTrait.
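For what it is worth, later versions of Java (Java 8, released in 2014) added default methods on interfaces, which behave a lot like stateless traits. Here is a minimal sketch of the same example, assuming we rename the two accessors (my choice, to dodge the getCourseList() clash described above):

import java.util.ArrayList;
import java.util.List;

class Course {}
class Test {}

// A trait as a Java 8+ interface: behavior, no state.
interface ProfessorTrait
{
    List<Course> coursesTaught(); // abstract; the state lives in the implementing class

    default void administerTest( Test test, Course course )
    {
        // shared behavior, written once
    }
}

interface StudentTrait
{
    List<Course> coursesStudied(); // abstract, as above

    default void giveTest( Test test, Course course )
    {
        // shared behavior, written once
    }
}

// The "diamond" is now legal: one class may implement both traits.
class AssistantProfessor implements ProfessorTrait, StudentTrait
{
    private final List<Course> taught = new ArrayList<>();
    private final List<Course> studied = new ArrayList<>();

    public List<Course> coursesTaught() { return taught; }
    public List<Course> coursesStudied() { return studied; }
}

Note that default methods still cannot hold instance state, which is exactly the trait restriction described above, and that renaming the two accessors is what resolves the getCourseList() conflict.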

Wednesday, July 4, 2007

The Manager Role

There are 2 kinds of manager:

1. The technical manager.
2. The non-technical manager.

The technical manager is a techie just like you and me, with the same objectives in life: make life easier by automating stuff. He is a technical manager because after so many years he knows the tricks of the trade, and therefore he may direct a small group of developers, teach them, share stories on how to do things, and preserve old technical knowledge that, without him, for better or for worse, would become extinct.

The non-technical manager is a different beast. He knows techno-speak, but he doesn't really know the technical details, nor is he interested in them; he just wants to know the consequences of technical decisions, so that he may talk to other technical managers and ask for help or offer help (by offering you), or otherwise talk to technical managers and let them know what he needs. His mission is to know what the market wants and deliver it, if customers are willing to pay the price.

They think differently because their objectives are opposed. While the technical manager wants to deliver more value in exchange for less (those are the forces that drive change in the market), the non-technical manager is after the financial gain of the company. While the technical manager tries to create new frameworks and make them work using fewer resources, the non-technical manager is thinking about ways to force potential clients into promoting their products, for example by forcing them to display banners and the like.

Do we really need non-technical managers? Apparently we do. But each year I see less need for them, since their knowledge seems rather sparse and simple, while the technical managers seem to be getting more sophisticated every year. Eventually non-technical managers will be reporting to technical managers, as they do at Google.

Why do companies hire both technical and non-technical managers? Technical managers are the ones who can deliver value, and there is a need to dose that value in order for clients to pay. But as the technical difficulty increases, since each year projects are more complex, no technical leader is able to understand everything, so the role of the non-technical manager becomes less and less necessary: the technical manager is also ignorant when talking to potential clients and has to manage the relationship superficially and with care.

Typically non-technical managers recommend that technical people, when interviewed by the customer, do not respond to direct questions with yes-or-no, easy-or-hard kinds of answers, but with "delayed-execution" answers, like "I will look into it", no matter what. If you think about it a little, let us suppose some potential customer is asking for a feature, but you just don't know if it can be implemented; you certainly need to make sure it can be delivered on time and on budget, so you prefer to think through the alternatives and actually try them before saying when and how. That's fine for you, but it puts the customer in a difficult position, because he has already expressed his real requirements and you give him nothing to negotiate with. You simply take that information away for a few days or weeks and give back the results. He may like the answer or not, but the precious time you take to decide how much it will cost is terrible for him, because you could come back with a price he can't pay.

Now if you look at it from the perspective of the non-technical manager, that is exactly what he wants the customer to think: that he has a solution in his hands, but that the product is probably too expensive for him, so he may need to cut the budget elsewhere... That's what the non-technical manager wants him to think. Delay, delay, delay, so that you can ask for ridiculously large sums of money. Justifications? Sure, why not: licenses, people hired for the project, hardware, delayed meetings, incorrect specifications, etc.

So in a way the non-technical managers are necessary, for positioning the product in the market, etc.; the only problem is who reports to whom.

UML is Brain Damaged

I hate UML. I really do.

I've been thinking this for 15 years, and now I must let everyone know: I think UML is brain damaged. Sorry, UMLers, but from its inception, even before UML existed (in 1995, when it was UML 0.8), I thought it was brain damaged.

For starters, UML is specified... in UML. This means that UML means... whatever UML means, because it is specified in itself. People who know algebra or geometry will be laughing.

UML should be called UMD (Unified Modeling Drawing). Really, since it is not a language, neither in the computer language sense nor in the natural language sense.

People who were working with me, who didn't know object orientation, liked UML, the language (or drawing notation) that would allow them to draw diagrams which we C++ coders could then implement. Needless to say, their diagrams and ideas had to be redone, usually by just tossing them away, looking at the original problem they were trying to solve, and presenting a really straightforward solution to it. No wonder people who couldn't code liked UML so much. It is the same kind of people who claim that failure and success are just opinions.

I have never seen a successful project delivered using UML, and I guess all those people who can't program are hiding behind huge piles of diagrams stacked on the floor. You can laugh, but UML is an energy drain: there is no way to prove that a design expressed in UML is right or wrong, therefore it is waste, because it is open to interpretation. Most successful projects ignore UML completely and avoid it like the plague. That is no coincidence, but a deliberate decision.

Java was not designed using UML, nor was its class library. Still, there is no compelling UML design for the Java class library, and it has been around for at least 10 years. How come? Is it that nobody can do it, or that it wouldn't be useful?

I have another explanation: Wouldn't it be possible that real engineers know that UML is a scam?

Have you ever seen diagrams of design patterns (Java best practices)? There are at least two dozen design patterns with a name and example code, but all the UML diagrams made for them look the same. Would you use UML to document the design patterns you used? I bet you wouldn't, because it would be considered waste; but then how can you explain your design if you can't draw it? And if you do draw them, they all look the same anyway, so they serve no purpose.

UML class diagrams do not show the polymorphism in your programs, and polymorphism is the key to object orientation. No, using a separate diagram for each polymorphic message is not practical. Why is that so? The 3 amigos... no, not those 3, but these 3, who invented UML, were after something else, not after solving real problems. Or maybe they were really incompetent, or a mixture of the two.

UML sequence diagrams do not show polymorphism either, and they break encapsulation. Encapsulation and polymorphism are at the center of object orientation. Why is UML marketed as an object-oriented tool if it works against object orientation?

UML use cases are simply scenario-oriented documents, specifying "steps" for the user to follow when using the system, while we know that modern user interfaces (since 1984) are event-oriented, and therefore you can't force the user through any fixed sequence of steps, since the user selects the steps he wants to take.

Furthermore, Rational was the company behind UML. Where is Rational now? Why did its people lose so much momentum?

Finally, Rational Rose was the key product marketed by Rational, but the product was obviously bloated (too big), underperforming (too slow), unusable (hard to use) and buggy (a diagram that extended beyond one page was usually the one you could not save). Maybe that was because Rational developed Rose using UML.

The fact that most UML architects can't code is a sign that they don't know what they are talking about, yet they have very strong opinions backed by the companies that create UML design tools, a whole ecosystem. Architects should be able to code and give recommendations on how to code, pointing to design patterns when necessary. Doing code reviews and writing coding conventions, for example, should take at least 20% of the time spent every day by any serious Java architect.

The UML tools camp disguised itself as the MDA tools camp and obviously joined forces with the dying CASE tools camp of the 80's. For now it seems that their fad has not gained momentum. UML has even been marketed as a BPR tool (Business Process Reengineering), what a nerve!

You know what, it is good if they gain momentum, since it drives all bad programmers into them. Cool ;-)

Disclaimer: UML is still evolving, and maybe in 100 years it will be suitable for capturing requirements and modeling systems. In the meantime you can read a lot about the things you shouldn't do, in case you are forced to use UML.