I’ve written about this before, so I’ve included some links. I think it helps to differentiate between Scaler’s architecture and that of the other products which (claim to) detect chords in audio. A key issue is whether the improvements you are thinking of are “in the context of Scaler” or whether they imply a pretty radical change to how Scaler currently works.
In fact, I replied to a similar post in December last
and again recently
My workflow with audio is
{1} run it through deCoda
{2} drop it into Scaler
{3} I then set up the deCoda progression (mostly triads) in section C, then examine the chords Scaler detected (mostly very much better) and audition them by ear to swap out the deCoda chords where relevant.
This way, deCoda gives me timing, doesn’t miss out ‘repeated’ chords, and does not confuse me with single notes. Scaler then gives me a much richer detection of the chords than the simple triads from deCoda.
The basic issue is that Scaler does not detect timing, but looks for chord changes. deCoda is trying to fit chords to a detected timeline. Each does (IMHO) a reasonable job, and I have found the combination works very well for my genre of music.
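To make the idea of combining the two results a bit more concrete, here is a minimal Python sketch. It is purely illustrative: it is not an export format or API of either deCoda or Scaler, the lists are typed in by hand, and every name in it (`suggest_swaps`, `timed_triads`, `rich_chords`) is made up for the example. It assumes a timed list of triads (the deCoda side) and an untimed list of richer chords (the Scaler side), and for each triad it lists the richer chords that contain all of the triad's notes. Those are only candidates; the final choice is still made by ear.

```python
# Hypothetical illustration only: neither product exposes data in this form.
NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4,
    "F": 5, "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9,
    "A#": 10, "Bb": 10, "B": 11,
}


def pitch_classes(notes):
    """Convert note names to a set of pitch classes (0-11)."""
    return frozenset(NOTE_TO_PC[n] for n in notes)


def suggest_swaps(timed_triads, rich_chords):
    """For each timed triad, list richer chords that contain all its notes.

    timed_triads: list of (bar, chord_name, notes), the timeline side.
    rich_chords:  list of (chord_name, notes), the untimed, richer side.
    Returns a list of (bar, triad_name, candidate_names).
    """
    suggestions = []
    for bar, triad_name, triad_notes in timed_triads:
        triad_pcs = pitch_classes(triad_notes)
        candidates = []
        for rich_name, rich_notes in rich_chords:
            rich_pcs = pitch_classes(rich_notes)
            # A candidate must contain the whole triad plus at least one extra note.
            if triad_pcs < rich_pcs:
                candidates.append(rich_name)
        suggestions.append((bar, triad_name, candidates))
    return suggestions


# Hand-typed example data; the repeated C chord in bar 3 keeps its own slot.
timed_triads = [
    (1, "C", ["C", "E", "G"]),
    (2, "Am", ["A", "C", "E"]),
    (3, "C", ["C", "E", "G"]),
    (4, "G", ["G", "B", "D"]),
]
rich_chords = [
    ("Cmaj7", ["C", "E", "G", "B"]),
    ("Am7", ["A", "C", "E", "G"]),
    ("G7", ["G", "B", "D", "F"]),
]

for bar, triad, candidates in suggest_swaps(timed_triads, rich_chords):
    print(bar, triad, candidates)
```

Note that a triad can match more than one richer chord (C, E and G sit inside both Cmaj7 and Am7), which is exactly why the last step remains auditioning rather than automatic replacement.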
I think folk here would be interested to learn of the sort of improvements you would like to see, as many use audio detection.
PS: I have a copy of Band in a Box (not my music at all) and only fire it up occasionally to use the chord detection. There are genres in which I have found it quite good … jazzy, fusion-type stuff.