Servos in the traditional electro-mechanical sense can have definite mass issues. However, a servo is really just any self-regulating feedback system, whether the feedback is in an amplifier or in a speaker. Even in mechanical systems, there are ways to speed up the system. For example, we could shine a laser at the speaker cone, let the cone excursions modulate the laser, and then use the demodulated laser output as a feedback signal to linearize the mechanical motion of the speaker (whew!). That would certainly be a servo system. Some servo loudspeakers could use mechanical transducers or strain gauges. The solution used in modern disc recorders is to simply use a separate feedback coil that's wound on the same form as the driving coil--like a speaker with dual voice coils. Even there, however, we have a time domain problem in that, at some frequency, the two coils won't be moving together on the form, or the form may actually have standing waves set up within it.
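To put a rough number on the "linearize the mechanical motion" idea, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the cone is modeled as a static tanh() saturation, the sensor (laser, strain gauge, or feedback coil) is assumed perfect, and the loop gain of 20 is arbitrary. It deliberately ignores mass and the time domain problems mentioned above, so it only shows the steady-state effect of wrapping feedback around a nonlinear element.

```python
import math

def cone_position(drive):
    """Hypothetical open-loop cone: displacement saturates as drive grows."""
    return math.tanh(drive)

def servo_position(target, loop_gain, iterations=60):
    """Steady-state position with an ideal position sensor in the loop.

    Solves y = cone_position(loop_gain * (target - y)) by bisection.
    The residual y - tanh(...) rises monotonically with y, so bisection
    on [-1, 1] always converges for |target| < 1.
    """
    lo, hi = -1.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        residual = mid - cone_position(loop_gain * (target - mid))
        if residual < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gain_compression(response, small=0.05, large=0.5):
    """Fractional drop in gain from a small input to a large input."""
    small_gain = response(small) / small
    large_gain = response(large) / large
    return 1.0 - large_gain / small_gain

open_loop = gain_compression(cone_position)
closed_loop = gain_compression(lambda x: servo_position(x, loop_gain=20.0))

print(f"open-loop gain compression:   {open_loop:.4f}")
print(f"closed-loop gain compression: {closed_loop:.4f}")
```

In this toy model the closed-loop figure comes out much smaller than the open-loop one, at the cost of a slightly lower overall gain. A real servo speaker would also have to deal with phase shift around the loop, which is exactly where the two-coil and standing-wave problems come in.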
I agree with Rob that it is amazing how good (and how bad) conventional loudspeakers can sound, especially when you realize that resonances from the listening room, or even the speaker itself, can end up in the amplifier's feedback loop. That's one reason why different speaker cables can sound different, although most reviewers aren't aware of that aspect. Keeping the back EMF generated by the speaker out of the amplifier's feedback loop is probably one reason why low-feedback triode amplifiers can sound so pleasant.
On the other hand, a mechanical resonance can be used to advantage to provide a desired sound--one that is not easy to obtain through electrical manipulation (equalization) of the signal. Guitar speakers would be a good example, where different models are purposefully engineered to have specific tonal qualities. Most of us have heard speakers that just "sound good," even though their response curves may not be ideal. It all depends on where the resonances fall. The funny thing is, if you corrected a speaker so that it reproduced sound in an absolutely "flat" manner, it probably wouldn't sound very exciting!
Terry