Solid-state disks (SSDs) seem to be the next big thing.
They consume less energy, they are faster, they are slimmer, and they shrink in size and expand in capacity faster than mechanical disks. I already predicted on this blog that by 2010 we would not see hard disks anymore, but the trend seems to be becoming reality even faster.
Computers will get less expensive and have a lot more capacity and speed.
But databases will also have to be redone. Databases are optimized for slow disks, where seeking accounts for roughly 80% of the access time and the remaining 20% is the lag spent waiting for the platter to spin to the appropriate angle (which explains why some people are so obstinate about getting 10,000 rpm disks). They are optimizing the 20%, and they are not getting even a 10% improvement.
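To put numbers on that, here is a back-of-the-envelope sketch (the ~8 ms seek figure is a typical value I am assuming, not a measurement): average rotational latency is half a revolution, so a faster spindle buys surprisingly little.

```java
// Back-of-the-envelope rotational latency (illustrative numbers only).
// On average the head waits half a revolution for the sector to arrive.
public class RotationalLatency {
    static double avgRotationalLatencyMs(int rpm) {
        double msPerRevolution = 60000.0 / rpm; // 60,000 ms in a minute
        return msPerRevolution / 2.0;           // average wait: half a turn
    }

    public static void main(String[] args) {
        System.out.printf("7200 rpm:  %.2f ms%n", avgRotationalLatencyMs(7200));  // ~4.17 ms
        System.out.printf("10000 rpm: %.2f ms%n", avgRotationalLatencyMs(10000)); // ~3.00 ms
        // Next to a typical ~8 ms average seek, the faster spindle
        // saves only about 1 ms per access.
    }
}
```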
But, but, but, can we do better than that?
I mean, can we find a better data structure for SSDs? For example, since random access makes it irrelevant where you read or write, maybe we could just use AVL trees instead? Or use Java to handle that storage so that it becomes garbage collected (and compacted!!!) automatically ;-D
Now disks will have neither seek time nor rotation time, so they will be much closer to main memory speeds. This means the dominant latency will move to the network.
But not only that: SQL databases use B-Trees in order to avoid expensive seeks. What now? Will we still be using B-Trees, or will we switch to hash tables or linked lists?
In the case of main memory, insertion sort is almost as fast as quicksort (at least on small arrays), while bubble sort is incredibly slow. The reason is that in modern computers, when you read one memory location you are not just reading it: you are pulling all of its neighbors into the L1 or L2 cache along with it, one cache line at a time.
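Here is a minimal Java sketch of that effect (my own illustration; the stride of 16 ints assumes a 64-byte cache line). The sequential pass uses every element of each cache line it loads, while the strided pass fetches a whole line for a single int:

```java
// Minimal cache-line demo: sequential vs. strided traversal of one array.
// Both loops touch every element exactly once; only the order differs.
public class CacheDemo {
    static final int N = 1 << 24; // 16M ints (64 MB), far bigger than any cache

    public static void main(String[] args) {
        int[] a = new int[N];
        long sum = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) sum += a[i];  // sequential: whole cache lines used
        long t1 = System.nanoTime();

        int stride = 16;                           // 16 ints = 64 bytes = one cache line
        for (int s = 0; s < stride; s++)
            for (int i = s; i < N; i += stride)
                sum += a[i];                       // strided: one int per line fetched
        long t2 = System.nanoTime();

        System.out.printf("sequential: %d ms, strided: %d ms (sum=%d)%n",
                (t1 - t0) / 1000000, (t2 - t1) / 1000000, sum);
    }
}
```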
This means the same cache idea can be applied to SSDs, with main memory acting as the cache in front of them. Therefore we can still reliably use B-Trees: their nodes map naturally onto page-sized blocks, and since the block structure of the storage makes every write a whole-page rewrite, we still want to minimize mutation operations.
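As a hedged sketch of what that could look like (hypothetical sizes; nothing here comes from a real engine), a B-Tree node is simply sized to the device page, and the large fan-out keeps both the number of page reads and the number of page rewrites small:

```java
// Sketch of a B-Tree node sized to one device page (hypothetical numbers).
// The same layout works whether the "page" is a disk block, an SSD page,
// or a cache line; only PAGE_BYTES changes. Large fan-out keeps the tree
// shallow, and fewer node splits mean fewer page rewrites on the SSD.
public class BTreeNode {
    static final int PAGE_BYTES = 4096;                // assumed SSD page size
    static final int ENTRY_BYTES = 8 + 8;              // long key + long child page id
    static final int ORDER = PAGE_BYTES / ENTRY_BYTES; // ~256 entries per node

    long[] keys = new long[ORDER];
    long[] children = new long[ORDER + 1]; // page ids of child nodes
    int count = 0;                         // entries currently in use

    // With ~256-way fan-out, a billion keys fit in about 4 levels,
    // i.e. roughly four page reads per lookup.
}
```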
Wednesday, September 5, 2007