As you experienced Nasal/C-code hackers should recall, namespaces are just hashes, with keys representing symbols - right? Well yes, mostly, but each of those symbols is unique in a way among all of the other strings out there: they're interned. (That means matching strings are substituted so they share the same pointer, i.e. one string represents all instances of "io", stored in a pool/hash of all used symbols.) Though they appear at runtime as the keys in namespaces, the interned strings get created during code generation (codegen), where the symbols (TOK_SYMBOL) are converted to Nasal strings, interned to get the canonical string (using the globals->symbols hash), and stored in the naCode's constants block. From there, the symbol-strings are used to set and get various lvalues (both local/global symbols and objects' members) in an optimized way (that's the whole point of the exercise). Looking at hash.c, there are some specialized functions that make use of the potential optimizations:
- naiHash_sym (looks up an interned symbol)
- naiHash_newsym (adds a symbol in the first empty slot for the hashcode)
The first, naiHash_sym, is pretty neat and the prime example of the optimization: it runs through a hashcode's potential slots, checking only pointer equality (note that interned symbols' hashcodes are computed during interning, so that's another step that doesn't have to be done at runtime). naiHash_newsym is another nice optimization but a little problematic, due to its assumption that the key doesn't exist already. It's basically used for adding an argument as a local key, but it doesn't check whether the key exists already: it just sees an occupied slot and keeps going. Consider the following example, which calls a function with an argument into a hash that already has the argument's key in it:
var f_creates_arg = func(arg...) {
    foreach (var k; keys(caller(0)[0]))
        print(k);
    debug.dump(arg);
}
call(f_creates_arg, nil, nil, {arg:nil}); # call into namespace: arg=nil; with arguments of: arg=[]
I need to independently confirm this, but the above should print the following:
arg
arg
nil
This shows that the key ends up in the hash twice (which violates a normal precondition of hashes): once as the key that already existed in the namespace and once as the argument. The one set first is the one that gets picked up on lookup (here the nil from {arg:nil} rather than the arg..., which is []). And this behavior persists even through resizing: hashset() (which is used to reinitialize a hash after reallocation) only keeps appending keys in empty slots, so the number of keys doesn't change (even if several of them are the same key).
For this reason, I would suggest amending naiHash_newsym to check keys' pointer equality before continuing; that way symbols aren't "trodden" over like this. (Please note that if "arg" exists in the hash as non-interned, then it will be trodden over anyways.) I would argue that finding an existing key (a few simple pointer comparisons!) would be more efficient generally, because the hash would never need to be resized if an existing one is found, whereas the old version would append regardless. (I think I once counted well over a hundred "arg" symbols in the __js0 namespace from the continual firing of bindings, which obviously isn't good.)
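To make the suggestion concrete, here is a minimal sketch in C of a toy open-addressing symbol table. It is not the code from hash.c: the names SymTab, slot_of, newsym_blind, newsym_checked and sym_get are all made up for illustration, the table is a fixed size, and the hashcode is faked from the pointer. It also assumes that when the key is already present, the new value should simply replace the old one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 8  /* toy fixed-size table; the real hash resizes */

typedef struct {
    const char *key[SLOTS];  /* interned key pointers, NULL = empty slot */
    int val[SLOTS];
    int size;                /* number of occupied slots */
} SymTab;

/* Toy hashcode derived from the pointer (hypothetical; real symbols
 * carry a hashcode computed when they are interned). */
static unsigned slot_of(const char *sym) {
    return (unsigned)(((uintptr_t)sym >> 3) % SLOTS);
}

/* Blind insert: skips occupied slots without looking at what is in
 * them, so an already-present key gets a second slot. */
static void newsym_blind(SymTab *t, const char *sym, int v) {
    unsigned i = slot_of(sym);
    assert(t->size < SLOTS);  /* guarantees the probe loop terminates */
    while (t->key[i] != NULL)
        i = (i + 1) % SLOTS;
    t->key[i] = sym;
    t->val[i] = v;
    t->size++;
}

/* Checked insert: one extra pointer comparison per probed slot.
 * If the key is already there, overwrite its value and stop. */
static void newsym_checked(SymTab *t, const char *sym, int v) {
    unsigned i = slot_of(sym);
    for (int n = 0; n < SLOTS; n++, i = (i + 1) % SLOTS) {
        if (t->key[i] == sym) {       /* existing key: replace value */
            t->val[i] = v;
            return;
        }
        if (t->key[i] == NULL) {      /* empty slot: genuinely new key */
            t->key[i] = sym;
            t->val[i] = v;
            t->size++;
            return;
        }
    }
    assert(!"table full");
}

/* Lookup by pointer equality only; the first match along the probe
 * sequence wins, which is why the older duplicate shadows the newer. */
static const int *sym_get(const SymTab *t, const char *sym) {
    unsigned i = slot_of(sym);
    for (int n = 0; n < SLOTS && t->key[i] != NULL; n++, i = (i + 1) % SLOTS)
        if (t->key[i] == sym)
            return &t->val[i];
    return NULL;
}
```

Inserting the same interned pointer twice with newsym_blind leaves two entries and a lookup that still returns the first value; with newsym_checked the table keeps one entry holding the latest value.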
(Also, concerning efficiency in hash.c, should hash.c::equals check that hashcodes are equal first, as an optimization? Or is a for loop cmp okay?)
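For what it's worth, this is roughly what I have in mind for a hashcode-first comparison, as a standalone C sketch. The Str struct, str_hash and str_equal are hypothetical stand-ins, not the actual types or functions in hash.c, and the sketch assumes a hashcode of 0 means "not computed yet". The hashcodes are only compared when both are already cached, since computing one costs about as much as the byte comparison it is meant to avoid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *data;
    size_t len;
    unsigned hashcode;  /* 0 = not computed yet (assumption of this sketch) */
} Str;

/* Compute and cache a 32-bit FNV-1a hash of the string's bytes. */
static unsigned str_hash(Str *s) {
    if (s->hashcode == 0) {
        unsigned h = 2166136261u;
        for (size_t i = 0; i < s->len; i++) {
            h ^= (unsigned char)s->data[i];
            h *= 16777619u;
        }
        s->hashcode = h ? h : 1;  /* keep 0 reserved as the "unset" marker */
    }
    return s->hashcode;
}

/* Equality with cheap rejections first: identity, length, cached
 * hashcodes, and only then the byte-by-byte comparison. */
static int str_equal(const Str *a, const Str *b) {
    assert(a != NULL && b != NULL);
    if (a == b) return 1;
    if (a->len != b->len) return 0;
    if (a->hashcode && b->hashcode && a->hashcode != b->hashcode) return 0;
    return memcmp(a->data, b->data, a->len) == 0;
}
```

Equal hashcodes still fall through to memcmp, so a collision can never produce a false match; the check only ever saves work on strings that differ.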
Another concerning issue is the fact that many symbols aren't interned - particularly those in the global namespace. Module names are always simple strings, both in io.load_nasal and in the hashset(key, naInit_module()) code in FGNasalSys, and are not interned in any way. That means that a simple module request (e.g. "io") has to walk through all of the closures (usually 2–3 in addition to locals) twice. The first pass only checks for interned symbols, and starts out at the local, then the module, and finally the global namespace. Then, not finding it, it has to go back with normal hash checking, all the way through the same namespaces but now with a clunky key-comparison algorithm. There are relatively easy ways to fix this, though:
For Nasal code I have a simple hack to intern a symbol and, using that, also check if a given key is interned. This would need to be used everywhere that symbols are created using `ns[sym]` syntax, becoming `ns[internsym(sym)]`. For C++/cppbind I would suggest a symbol class or something that will handle interning a C string (or even a C symbol using macros) into a Nasal symbol. Not only should this be used when setting up any kind of symbol (module or extension function), but it can also simplify ghost member checking, to something like this:
const cppbind::Symbol m_aMember("aMember"); // name of member as a special class
naRef aMember_value; // value to return when member is requested
static const char* findmember(naRef ghost, naRef member, naRef* out) // can't remember specifics...
{
    if (member == m_aMember) { // compares ids of underlying naRefs, because both are interned
        *out = aMember_value;
        return ""; // no error
    } else return NULL; // not found
}
This eliminates the need for strcmp() and naStr_data(): since the member is guaranteed to be interned, one only needs to check pointers instead (which seems simpler to me).
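The same idea can be shown without any Nasal or cppbind types at all. The following is only a sketch under my own assumptions: intern, pool and find_member are hypothetical names, and the pool is a plain linear-scan array rather than a hash like globals->symbols. The point is that the strcmp() happens once, at interning time, and every later check is a single pointer comparison.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_MAX 64  /* toy limit; a real pool would grow */

static const char *pool[POOL_MAX];
static int pool_len;

/* Return the canonical copy of s, storing a copy the first time a
 * given spelling is seen. Linear scan, for illustration only. */
static const char *intern(const char *s) {
    for (int i = 0; i < pool_len; i++)
        if (strcmp(pool[i], s) == 0)
            return pool[i];
    assert(pool_len < POOL_MAX);
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    pool[pool_len++] = copy;
    return copy;
}

/* Member dispatch by pointer identity. Both arguments must come from
 * intern(), otherwise equal spellings will not compare equal. */
static int find_member(const char *member, const char *known,
                       int value, int *out) {
    if (member != known)
        return 0;   /* not found */
    *out = value;
    return 1;       /* found */
}
```

Interning "aMember" once at setup and comparing against that pointer in the member callback is the whole trick; a string that was never interned will miss even when it has the same spelling, which is the same caveat as with non-interned keys in namespaces.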
I think that these are useful considerations and worth the effort, even if they fall heavily in the realm of pedantry, my specialty.