OK,
So the challenge is to create a standalone module that extends BASIC's basic array features and adds extra high-level functionality.
Then why not do the most obvious thing (for the dialects that don't have such features built into the core): create a C++ DLL with a flat ANSI C interface that encapsulates some or all of the simple data vector classes and exports their standard high-level methods? Any dialect that supports calls to external dynamic link libraries/shared objects at all could then use it.
I bet this would be about as fast as such a library can get if compiled with the maximum optimization settings of VC++ or G++ (such as /O2 or -O3).
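
To make the idea concrete, here is a rough sketch of what such a flat interface might look like. Everything in it is my own assumption for illustration: the DVec handle, the dvec_* function names, and the choice of std::vector<double> as the wrapped class are hypothetical, not an existing library. It also uses the compiler's default calling convention; some BASIC dialects on Windows expect stdcall instead, so that would need adjusting.

```cpp
// vecdll.cpp: hypothetical flat C interface over std::vector<double>.
// All names here (DVec, dvec_*) are illustrative assumptions.
#include <algorithm>
#include <new>
#include <numeric>
#include <vector>

#if defined(_WIN32)
#  define VEC_API extern "C" __declspec(dllexport)
#else
#  define VEC_API extern "C" __attribute__((visibility("default")))
#endif

// The caller only ever sees a pointer to this, i.e. an opaque handle.
struct DVec {
    std::vector<double> data;
};

// Returns a new empty vector, or a null pointer if allocation fails.
VEC_API DVec* dvec_create(void) noexcept {
    return new (std::nothrow) DVec();
}

// Frees a vector created by dvec_create; a null handle is ignored.
VEC_API void dvec_destroy(DVec* v) noexcept {
    delete v;
}

// Appends x. Returns 1 on success, 0 on failure.
// C++ exceptions must not escape across the C boundary, so they are caught.
VEC_API int dvec_push(DVec* v, double x) noexcept {
    if (!v) return 0;
    try {
        v->data.push_back(x);
        return 1;
    } catch (...) {
        return 0;
    }
}

// Number of stored elements (0 for a null handle).
VEC_API int dvec_count(const DVec* v) noexcept {
    return v ? static_cast<int>(v->data.size()) : 0;
}

// Bounds-checked read: writes element i to *out and returns 1,
// or returns 0 if the handle, the index, or out is invalid.
VEC_API int dvec_get(const DVec* v, int i, double* out) noexcept {
    if (!v || !out || i < 0 || i >= static_cast<int>(v->data.size())) return 0;
    *out = v->data[static_cast<std::size_t>(i)];
    return 1;
}

// Sorts the elements in ascending order.
VEC_API void dvec_sort(DVec* v) noexcept {
    if (v) std::sort(v->data.begin(), v->data.end());
}

// Sum of all elements (0.0 for a null or empty vector).
VEC_API double dvec_sum(const DVec* v) noexcept {
    return v ? std::accumulate(v->data.begin(), v->data.end(), 0.0) : 0.0;
}
```

On the BASIC side the handle would just be stored as a pointer-sized integer and passed back into every call, so the dialect never needs to know anything about C++ classes. Returning status codes instead of throwing keeps the interface usable from any language that can call a plain C function.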