Thursday, November 17, 2011

Problems caused by not knowing the laws of probability

"Assume that 1% of 40-year-old women who have routine mammograms have breast cancer, that 80% of women with breast cancer get positive mammograms (that is, something is detected that turns out to be a real cancer), and that 9.6% of women without breast cancer also get positive mammograms (these are false positives; they do not actually have cancer). If a 40-year-old woman gets a positive mammogram in a routine screening, what is the probability that she actually has breast cancer?"

This is a high-school statistics problem that can be solved by applying Bayes' formula. Several experiments confirm that only around 15% of doctors (we are talking about German doctors, Australian doctors...) can compute the correct answer to this kind of problem (and those who get it wrong typically get it wrong by a lot). This is no joke, nor is it a hoax (as far as I have been able to check; it is possible that those who published the experiments committed scientific fraud, but obviously I cannot reach that level of scrutiny :-). You can find the answer to this problem, references to the experiments I have mentioned, and more information about this kind of error in scientific experimentation at http://norvig.com/experiment-design.html (in English).
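
For the impatient, here is a minimal sketch of the Bayes calculation using the numbers from the problem statement. The function name and parameter names are my own, chosen only for illustration; this is simply the textbook formula P(cancer | positive) = P(positive | cancer) P(cancer) / P(positive), not anything taken from the linked article.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Probability of having the condition given a positive test (Bayes' formula)."""
    # Share of all women who have cancer AND test positive.
    true_positives = prior * sensitivity
    # Share of all women who do NOT have cancer but still test positive.
    false_positives = (1 - prior) * false_positive_rate
    # Among all positives, the fraction that are real cancers.
    return true_positives / (true_positives + false_positives)


# Numbers from the problem: 1% prevalence, 80% detection, 9.6% false positives.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{p:.1%}")  # prints 7.8%
```

The result is well below the 80% detection rate because the false positives among the 99% of women without cancer far outnumber the true positives among the 1% who have it.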

Obviously we need more education in probability and statistics in particular, and in mathematics and science in general. Statistics is a broad and complex field, but the problem above is high-school material!

Tuesday, August 02, 2011

What do SOAP, Flex and GWT have in common?

I suppose you can find many things if you think about it for a while. However, today I want to focus on what I think is their main problem: they all try to attract developers by supposedly making their lives easier, but in the end they just leave developers without control over what they build, and this is bad for their users.

SOAP and its related standards seem to be more focused on making it easy for a developer to deploy a web service without knowing the details of how that web service works than on helping developers create well-designed distributed applications. And then along come REST and JSON, and SOAP becomes a zombie.

Flex intends to provide developers with a more controlled environment than the web browsers, better development tools and documentation for building web clients and, of course, lots of GUI bells and whistles. However, it does so by providing yet another virtual machine, one never designed to be a virtual machine in the first place, with its own set of headaches (versions, bugs on different platforms, performance and security issues...). And then along comes Apple and decides that Flash is not for its products, because Apple, for all its defects, is always thinking about its users' experience. Flex developers are doomed, because it seems that even Adobe has accepted that HTML 5 is the way to go; if you ask me, they should start learning another platform ASAP.

And then there is GWT, and the supposedly good idea of taking a Java developer and allowing her to develop web applications in Java by hiding the browser and the JavaScript behind Java objects. And this works, as well as something so complex can work: the browser is a different world, and JavaScript is more flexible than Java, not the other way round. The result: a bunch of Java developers who still need to learn JavaScript, and to learn about the browser as a platform, if they want to create web applications that look and behave like web applications without the constraints imposed by GWT.

Users come first. If something is hard for developers and good for users, then that is the way to go. If something is easy for developers at the expense of the users, that is a FAIL (sometimes this is not easy to foresee, other times it is). If something is easy for developers and good for the users, then that is an EPIC WIN (not many of these, I am afraid). It seems to me that there are not many shortcuts in software development.

Monday, June 27, 2011

Spatial Data Infrastructures should avoid single points of failure, including proprietary software

In a meeting yesterday, someone proposed that SDI calls for tenders that require open source software should justify, in economic terms, the advantage of using it instead of proprietary software. There are many reasons to prefer open source software to proprietary software, but this time I want to focus on just one: proprietary software can create a single point of failure in your system.

A single point of failure is a part of a system whose failure brings down the whole system. If a vendor that owns the source code of some proprietary software goes out of business, its users cannot do much about it. Of course they can still use this software for some time, but it has become instant legacy software. No maintenance, no updates, no improvements. Sooner or later it will be replaced, and that has a cost: a hidden cost that nobody took into consideration when licensing that software. A hidden cost that is not only money: while that software is being replaced, the system will not be working properly for an unknown amount of time. It is not just replacing an old version with a new version, or making a few improvements or changes. It is taking a piece of your system and replacing it with a completely different piece because you do not have another option. Murphy's law says that this will cause more trouble than you need: your system will be down for an unknown amount of time.

Of course this does not need to happen: software vendors, especially the biggest ones, do not go out of business often, you can design strategies to minimize the effect on users when making changes to your system, and so on. But thinking like this goes against the safety principles that should guide the development of mission critical systems, where a failure is a big problem for their users. If we take Spatial Data Infrastructures seriously, we must consider them almost as mission critical systems, and single points of failure should be avoided. As it is normally possible to replace proprietary software with a free, open source alternative (sometimes even with a better alternative, if you ask me), this is one thing that should be taken into consideration in any SDI.