Intern'ing

Nasal is the scripting language of FlightGear.

Postby Philosopher » Tue Nov 12, 2013 2:04 pm

I just have to write about this topic...

As you experienced Nasal/C-code hackers should recall, namespaces are just hashes, with keys representing symbols - right? Well yes, mostly, but each of those symbols is unique in a way that sets it apart from all of the other strings out there: they're interned. (That means matching strings are substituted so they have the same pointer, i.e. one string represents all instances of "io", stored in a pool/hash of all used symbols.) Though they appear at runtime as the keys in namespaces, the interned strings get created during code generation (codegen), where the symbols (TOK_SYMBOL) get converted to Nasal strings, are interned to get the canonical string (using the globals->symbols hash), and are stored in the naCode's constants block. From there, the symbol-strings are used to set and get various lvalues (both local/global symbols and objects' members) in an optimized way (that's the whole point of the exercise). Looking at hash.c, there are some specialized functions that make use of the potential optimizations:
  • naiHash_sym (looks up an interned symbol)
  • naiHash_newsym (adds a symbol in the first empty slot for the hashcode)
(There's another that looks similar, naiHash_tryset, but it does not deal with interned symbols: it uses the findkey() method, which in turn uses the equals() method, checking for more general key equality, instead of pointer equality.)
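For readers who have not met interning before, here is a minimal sketch of the idea in C++. It is not the Nasal implementation: `SymbolPool` and `intern()` are hypothetical names, and the real pool is the globals->symbols hash mentioned above. The only point is that two equal strings come back as the same pointer, so later comparisons can be done on the pointer alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical intern pool: one canonical std::string per distinct symbol.
class SymbolPool {
public:
    // Returns a pointer to the canonical copy of `name`,
    // creating that copy the first time the name is seen.
    const std::string* intern(const std::string& name) {
        // std::unordered_set never relocates its elements, so the returned
        // address stays valid for the pool's lifetime, even after rehashing.
        return &*pool_.insert(name).first;
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};
```

With a pool like this, `pool.intern("io") == pool.intern(std::string("i") + "o")` holds, and that pointer test is all a symbol lookup needs afterwards.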

The first, naiHash_sym, is pretty neat and the prime example of the optimization: it runs through a hashcode's potential slots, checking only pointer equality (note that interned symbols' hashcodes are computed during interning, so that's another step that doesn't have to be done at runtime). naiHash_newsym is another nice optimization but a little problematic, due to its assumption that the key doesn't exist already. It's basically used for adding another argument as a local key, but it doesn't care whether the key exists already: it just sees an occupied slot and keeps going. Consider the following example, which illustrates calling with an argument into a hash that already has the argument's key in it:

Code:
var f_creates_arg = func(arg...) {
    foreach (var k; keys(caller(0)[0]))
        print(k);
    debug.dump(arg);
}
call(f_creates_arg, nil, nil, {arg:nil}); # call into namespace: arg=nil; with arguments of: arg=[]

I need to independently confirm this, but the above should print the following:
Code:
arg
arg
nil

This shows that the key is being set twice (which violates a normal precondition of hashes): once for the key that already exists in the namespace and once for the argument. The one set first is the one being picked up (i.e. the {arg:nil} rather than the arg..., which is []). And this behavior persists even through resizing: hashset() (which is used to reinitialize a hash after reallocation) only keeps appending keys in empty slots, so the number of keys doesn't change (even if the same key is present several times).

For this reason, I would suggest amending naiHash_newsym to check keys' pointer equality before continuing; that way symbols aren't "trodden" over like this. (Please note that if "arg" exists in the hash as non-interned, then it will be trodden over anyways.) I would argue that finding an existing key (a few simple pointer comparisons!) would be more efficient generally, because the hash would never need to be resized if an existing one is found, whereas the old version would append regardless. (I think I once counted well over a hundred "arg" symbols in the __js0 namespace from the continual firing of bindings, which obviously isn't good.)
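The difference between the current behaviour and the proposed one can be sketched with a toy open-addressing table in C++. Everything here is an assumption made for illustration: `SymTable`, `newsym_blind` and `newsym_checked` are made-up names, the table never grows, and the slot layout is not the one used in hash.c. Keys are compared by pointer only, as they would be for interned symbols.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy table keyed by interned symbols (pointer comparison only).
// Callers must always leave at least one slot free, since it never resizes.
struct SymTable {
    struct Slot { const char* key = nullptr; int value = 0; };
    std::vector<Slot> slots;
    std::size_t used = 0;

    explicit SymTable(std::size_t capacity) : slots(capacity) {}

    // First slot to probe; derived from the pointer, which is fine for a toy.
    std::size_t start(const char* key) const {
        return (reinterpret_cast<std::uintptr_t>(key) >> 3) % slots.size();
    }

    // Blind insert: skips every occupied slot, even one holding the same key.
    void newsym_blind(const char* key, int value) {
        assert(used < slots.size());
        std::size_t i = start(key);
        while (slots[i].key != nullptr) i = (i + 1) % slots.size();
        slots[i].key = key;
        slots[i].value = value;
        ++used;
    }

    // Checked insert: one extra pointer comparison per probed slot,
    // and an existing key is overwritten instead of duplicated.
    void newsym_checked(const char* key, int value) {
        assert(used < slots.size());
        std::size_t i = start(key);
        while (slots[i].key != nullptr && slots[i].key != key)
            i = (i + 1) % slots.size();
        if (slots[i].key == nullptr) {
            slots[i].key = key;
            ++used;
        }
        slots[i].value = value;
    }

    // Pointer-only lookup; returns the first matching slot on the probe path.
    const Slot* find(const char* key) const {
        std::size_t i = start(key);
        for (std::size_t n = 0; n < slots.size(); ++n) {
            if (slots[i].key == nullptr) return nullptr;
            if (slots[i].key == key) return &slots[i];
            i = (i + 1) % slots.size();
        }
        return nullptr;
    }
};
```

Inserting the same key twice with `newsym_blind` leaves two entries, and `find` returns the older one; with `newsym_checked` the entry count stays at one and the newer value wins.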

(Also, concerning efficiency in hash.c, should hash.c::equals check that hashcodes are equal first, as an optimization? Or is a for loop cmp okay?)
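To make that question concrete, this is roughly what "compare hashcodes first" could look like, sketched in C++ rather than in the C of hash.c. The `Key` type with a cached hash and the `keys_equal` function are hypothetical, and FNV-1a is used only as a stand-in hash function; it is not claimed to be what Nasal uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical key type that caches its hashcode when it is created.
struct Key {
    std::string data;
    std::uint32_t hash;

    explicit Key(std::string s) : data(std::move(s)), hash(fnv1a(data)) {}

    // 32-bit FNV-1a, chosen here only because it is short.
    static std::uint32_t fnv1a(const std::string& s) {
        std::uint32_t h = 2166136261u;
        for (unsigned char c : s) {
            h ^= c;
            h *= 16777619u;
        }
        return h;
    }
};

// General key equality with the cheap rejections done first.
inline bool keys_equal(const Key& a, const Key& b) {
    if (&a == &b) return true;                           // same object
    if (a.hash != b.hash) return false;                  // different hashcodes
    if (a.data.size() != b.data.size()) return false;    // different lengths
    return std::memcmp(a.data.data(), b.data.data(), a.data.size()) == 0;
}
```

Whether the extra comparison pays off depends on how often unequal keys end up in the same probe sequence, which would have to be measured.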

Another concerning issue is the fact that many symbols aren't interned - particularly those in the global namespace. Module names are always simple strings, both in io.load_nasal and in the hashset(key, naInit_module()) code in FGNasalSys—not interned in any way. That means that a simple module request (e.g. "io") has to recurse through all of the closures (usually 2–3 in addition to locals) twice. The first pass only checks for interned symbols, and starts out at the local, then the module, and finally the global namespace. Then, not finding it, it has to go back with normal hash checking, all the way through the same namespaces but now with a klunky key-comparison algorithm. There are relatively easy ways to fix this, though:
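The cost of the two passes can be illustrated with a small C++ model. This is a sketch under loud assumptions: a namespace is modelled as a plain list of (key pointer, value) pairs, `lookup` and `LookupResult` are made-up names, and the real code walks Nasal closures rather than a vector. It only shows why a non-interned module name forces the slow second pass.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a namespace: (key pointer, value) pairs.
using Namespace = std::vector<std::pair<const std::string*, int>>;

struct LookupResult {
    bool found;
    int value;
    int content_comparisons;  // how many full string comparisons were needed
};

// Looks up `sym` through a chain of namespaces (locals first, globals last).
inline LookupResult lookup(const std::vector<const Namespace*>& chain,
                           const std::string* sym) {
    // Pass 1: pointer equality only, which can only match interned keys.
    for (const Namespace* ns : chain)
        for (const auto& kv : *ns)
            if (kv.first == sym) return {true, kv.second, 0};

    // Pass 2: walk the same chain again, now comparing string contents.
    int slow = 0;
    for (const Namespace* ns : chain)
        for (const auto& kv : *ns) {
            ++slow;
            if (*kv.first == *sym) return {true, kv.second, slow};
        }
    return {false, 0, slow};
}
```

If the global namespace stores "io" under the same pointer the compiled code uses, the lookup ends in pass 1 with zero content comparisons; if it stores a separate copy of the string, every key in every namespace on the chain is visited twice before the match is found.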

For Nasal code I have a simple hack to intern a symbol and, using that, also check if a given key is interned. This would need to be used everywhere that symbols are created using `ns[sym]` syntax, becoming `ns[internsym(sym)]`. For C++/cppbind I would suggest a symbol class or something that will handle interning a C string (or even a C symbol using macros) into a Nasal symbol. Not only should this be used when setting up any kind of symbol (module or extension function), but it can also simplify ghost member checking, to something like this:

Code:
const cppbind::Symbol m_aMember("aMember"); // name of member as a special class
naRef aMember_value; // value to return when member is requested

static const char* findmember(naRef ghost, naRef member, naRef* out) // can't remember specifics...
{
    if (member == m_aMember) { // compares ids of underlying naRefs, because both are interned
        *out = aMember_value;
        return ""; // no error
    } else return NULL; // not found
}

This eliminates the need for strcmp() and naStr_data(): since the member is guaranteed to be interned, one only needs to check pointers instead (which seems simpler to me).

I think that these are useful considerations and worth the effort, even if they fall heavily in the realm of pedantry—my specialty ;).
Thanks,
Philosopher
(inactive but lurking occasionally...)
Philosopher
 
Posts: 1590
Joined: Sun Aug 12, 2012 6:29 pm
Location: Stuck in my head...
Callsign: AFTI
Version: Git
OS: Mac OS X 10.7.5

Re: Intern'ing

Postby Hooray » Fri Nov 15, 2013 3:19 am

You are probably not getting very much feedback because there's not a single person around here who has dealt with the Nasal engine at that level, including any active/inactive core developers, and I bet that, after so many years (Nasal having been added over a decade ago!), even Andy Ross himself would need a while (and a skim over the code) to see what you're really saying there - normally, I would suggest posting to the devel list, but in this case it's kinda pointless because it's obvious that you know more about Nasal internals than all others here combined.... :D

It seems to make sense though, the proposed changes at the C/C++ level certainly do sound sensible - I'm just not sure if we can afford requiring/enforcing internsym() in all Nasal code, if that's what you were saying here (not entirely sure):

This would need to be used everywhere that symbols are created using `ns[sym]` syntax, becoming `ns[internsym(sym)]`.


If so, I would want to see this done at the codegen level, i.e. the internsym() call added there...

PS: Thanks for taking the time to write this, I added it to the newsletter, and even made up a new column for it :lol:
Please don't send support requests by PM, instead post your questions on the forum so that all users can contribute and benefit
Thanks & all the best,
Hooray
Help write next month's newsletter !
pui2canvas | MapStructure | Canvas Development | Programming resources
Hooray
 
Posts: 11100
Joined: Tue Mar 25, 2008 8:40 am

Re: Intern'ing

Postby Philosopher » Fri Nov 15, 2013 3:32 am

I know, I just edited it! It would be interesting to see if anyone actually reads it...

It's actually more of a manual job than that; there are edge cases where things might look like symbols but are best left as regular strings (the prime example is properties, e.g. as the hash argument to props.Node.new(); interning them there would be a waste of time and inefficient).

One thing I forgot to mention is that `{"arg":nil}` is not interned, while `{arg:nil}` is. That is, using string literals versus symbols makes a small difference, and this same difference is achieved using `hash["arg"]` versus `hash.arg` or `hash[internsym("arg")]`.

For posterity, here's internsym (quick hack, you see ;)):
Code:
var internsym = func(sym) {
    cassert('''issym(sym)'''); # this is abstracted away, basically makes sure it matches [_a-zA-Z][_a-zA-Z0-9]* - this is also a security check
    return compile("""keys({"~sym~":})[0]""")(); # compile() returns a function, so call it to get the only key in the hash, which happens to be interned
};

Re: Intern'ing

Postby Johan G » Fri Nov 15, 2013 5:48 pm

Sorry this got a bit longer than I thought. And more than a bit off topic. :oops:

Philosopher wrote in Fri Nov 15, 2013 3:32 am:I know, I just edited it! It would be interesting to see if anyone actually reads it...

Read it, yes, understood it, nope. You would have to dumb it down a lot before I would get it I'm afraid. :wink:

Let's just say that my experience in C and C++ programming is limited to pretty much "Hello World!". Like Lisp, Scheme, Lua and probably some I do not remember. I have some very limited experience of Amiga-Basic (late 1980's), Borland's Turbo Pascal (early 1990's), Microsoft's QBasic and Visual Basic (mid 1990's), JavaScript (mid 1990's) and FORTH (mid 2000's). But I think the only two programming languages I have more than very limited experience with are HP's RPL (mid 1990's to near present) and Python (about 2000 to near present); unsurprisingly, they are the only languages I have done actual useful stuff in (like the circular slide rule I am always carrying, and even using once or twice a week).

Some day I want to taste SmallTalk, Fortran and Ada. Some quick looks at Algol, Cobol and Simula would be interesting as well, though comparisons between seemingly simple tasks in Fortran and Cobol give me quite some respect for the Cobol programmers still out there (who seem to be more and more sought after).

I find that, even with my limited experience, there is something to gain from looking at more than one language. They all have different paradigms and philosophies, sometimes very different. In many ways it opens the mind, not only in ways related to programming. But still, my experience is very limited, as my main goal is only to know more than the average person on as many topics as possible. I have later on noticed that it makes me both more forgiving and, in a few cases, more scared than I would have expected to be otherwise.

By the way I recommend a quick look at FORTH some day. It is so utterly different from anything else I have seen. I recommend "Thinking in Forth", by Leo Brodie for that.



By the way (two), to get back on topic, the small excursions I have had into FlightGear's codebase left me confused. It looked a lot less like what I would expect C++ code to look like, as I have never seen the double colon notation <something>::<something else> and some other things.
Low-level flying — It's all fun and games till someone looses an engine. (Paraphrased from a YouTube video)
Improving the Dassault Mirage F1 (Wiki, Forum, GitLab. Work in slow progress)
Johan G
Moderator
 
Posts: 5278
Joined: Fri Aug 06, 2010 5:33 pm
Location: Sweden
Callsign: SE-JG
IRC name: Johan_G
Version: 3.0.0
OS: Windows 7, 32 bit

Re: Intern'ing

Postby Hooray » Fri Nov 15, 2013 6:26 pm

Don't worry, I think there are only two active core developers (TheTom & Zakalawe) who really understood what P. wrote here - most others won't care either way, well maybe AndersG and ThorstenB would, because they previously looked at Nasal internals, too - but keep in mind that Philosopher fixed some Nasal code generator issues with just 3-5 lines of code that Andy himself considered "obscure" in a comment in the issue tracker.

So it's pretty safe to say that Philosopher is increasingly the Nasal domain expert here, I already consider him my "Nasal internals go-to guy", and the de-facto Nasal maintainer :lol:
I don't think there's any feasible way to downstrip the write-up, because it requires familiarity with C, data structures and scripting languages, well and Nasal internals.

I consider myself familiar with Nasal (not to the extent that P. is though!), but I still need to look at the code to understand everything that he said here.

Some day I want to taste SmallTalk, Fortran and Ada. Some quick looks at Algol, Cobol and Simula would be interesting as well, though comparisons between seemingly simple tasks in Fortran and Cobol gives me quite some respect for the Cobol programmers still out there (which seems to be more and more sought after).

there's a saying that good developers should learn 2-3 new languages per year to gain a better understanding of other programming paradigms.
However, I don't think that it makes too much sense to look at many other languages if you have had only superficial programming exposure so far - it would get confusing quickly.

On the other hand, with 10+ years of Python under your belt, you should have a strong foundation to explore different paradigms, even using Python (functional programming for example).

I find that, even with my limited experience, there is something to gain from looking at more than one language.

That's definitely true, it will hugely improve your understanding of programming in general, even if you only skim over things: for example, look at compiled (C) vs. interpreted languages (Python, Java or even Nasal), low-level (assembly), object oriented (Smalltalk), functional/metaprogramming (Lisp/Scheme), statically typed (Ada, Haskell), concurrent languages like Google's Go or distributed (Erlang) or GPGPU languages like OpenCL. There are also some really cool niche languages, like for example Falcon.

Regarding forth, that's a stack-machine language, and basically how Nasal works internally - so Philosopher already understands how it works, because he knows Nasal :D
However, Forth is typically re-targeted, i.e. primitive stack operations (PUSH, POP, DUP, DUP2, JMP, JIF etc) are implemented using native machine instructions, to make things faster - which is also the approach taken by the Google JavaScript engine: it compiles JavaScript down into machine instructions, which is why Google Chrome is so blazingly fast.
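For anyone who has not seen a stack machine, here is a tiny interpreter sketched in C++. It is neither Forth nor the Nasal VM: the opcode set, the `Instr` layout and the meaning given to JIF here (pop a value and jump if it is zero) are assumptions made for this example. It only shows how a handful of stack primitives are enough to evaluate expressions and branch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode set; arg is only used by PUSH, JMP and JIF.
enum class Op { PUSH, POP, DUP, ADD, MUL, JMP, JIF, HALT };
struct Instr { Op op; int arg; };

// Runs the program and returns the value left on top of the stack.
inline int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    std::size_t pc = 0;  // index of the next instruction
    while (pc < code.size()) {
        const Instr& in = code[pc++];
        switch (in.op) {
        case Op::PUSH:
            stack.push_back(in.arg);
            break;
        case Op::POP:
            assert(!stack.empty());
            stack.pop_back();
            break;
        case Op::DUP:
            assert(!stack.empty());
            stack.push_back(stack.back());
            break;
        case Op::ADD:
        case Op::MUL: {
            assert(stack.size() >= 2);
            int b = stack.back(); stack.pop_back();
            int a = stack.back(); stack.pop_back();
            stack.push_back(in.op == Op::ADD ? a + b : a * b);
            break;
        }
        case Op::JMP:
            pc = static_cast<std::size_t>(in.arg);
            break;
        case Op::JIF: {  // assumed semantics: pop, jump if the value is zero
            assert(!stack.empty());
            int cond = stack.back(); stack.pop_back();
            if (cond == 0) pc = static_cast<std::size_t>(in.arg);
            break;
        }
        case Op::HALT:
            pc = code.size();
            break;
        }
    }
    assert(!stack.empty());
    return stack.back();
}
```

The program PUSH 3, DUP, MUL, PUSH 1, ADD, HALT computes 3*3+1 this way. A native-code backend would replace the switch with machine instructions emitted per opcode.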

So Forth is awesome for people doing low-level hardware programming, especially on hardware for which there are no development environments (compilers), because you can build a whole development environment in a "bottom-up" programming fashion.

Being interested in different languages and programming paradigms is a good thing because it encourages out-of-the-box-thinking.

Re: Intern'ing

Postby Johan G » Fri Nov 15, 2013 7:37 pm

Hooray wrote in Fri Nov 15, 2013 6:26 pm:...with 10+ years of Python under your belt, you should have a strong foundation to explore different paradigms, even using Python (functional programming for example).

Unfortunately, that is more like 5 years of skimming through resources now and then out of curiosity, 2 years with a spare time project (the above mentioned device), shelved once in a while when bumping into too big obstructions, and finally three years of trying, hoping not to forget. :oops:

Did not even touch oop. Actually I learned more about oop here on the forum than when I was working on that code. :wink:

Let's just say that shelving and picking up one single project enough times has taught me to keep my code rich in comments and documentation strings (a Python thing), and to keep a diary-like work log, to get to the productive zone faster, or at all. :lol:

Hooray wrote in Fri Nov 15, 2013 6:26 pm:Regarding forth, that's a stack-machine language, and basically how Nasal works internally - so Philosopher already understands how it works, because he knows Nasal :D

It is my understanding that many virtual machines running scripted languages are stack machines so I am not the least surprised.

To get nearly back on topic, and while I will make no promises, where can I learn more about that double colon notation? Is it related to some kind of framework? Right now I am a bit confused whenever I look into the codebase.

Re: Intern'ing

Postby Hooray » Fri Nov 15, 2013 7:45 pm

where can I learn more about that double colon notation? Is it related to some kind of framework? Right now I am a bit confused whenever I look into the codebase.


Are you referring to the C++ code ?

That's just a namespace qualifier, in namespaces/classes - e.g.:

Code:
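#include <iostream>

class Foo {
public:
    void bar();
};

// definition of bar(), a function/method inside/of the Foo namespace/class
void Foo::bar() {
    std::cout << "Foo::bar()\n";
}

// or

namespace X {
    struct Stuff {
        // struct is now to be qualified via X::Stuff, because it's inside X
    };
}

int main() {
    Foo f;
    f.bar();            // call through an object

    Foo *f2 = new Foo();
    f2->bar();          // call through a pointer
    delete f2;

    X::Stuff s;
    (void)s;            // silence the unused-variable warning
}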



Nasal is using the dot notation (namespace.member) for all instances, i.e. both in declarations and calls

Re: Intern'ing

Postby Johan G » Fri Nov 15, 2013 8:02 pm

I think I got it, thanks. :D

Re: Intern'ing

Postby Philosopher » Fri Nov 15, 2013 8:57 pm

Hooray wrote in Fri Nov 15, 2013 6:26 pm:
I find that, even with my limited experience, there is something to gain from looking at more than one language.

that's definitely true, it will hugely improve your understanding of programming in general, even if you should only skim over things: for example, look at compiled (C) vs. interpreted languages (Python,Java or even Nasal), low-level (assembly), object oriented (smalltalk), functional/metaprogramming (LISP/scheme), statically typed (Ada, Haskell), concurrent languages like Google's Go or distributed (Erlang) or GPGPU languages like OpenCL. There are also some really cool niche languages, like for example Falcon.


Yeah, there are some really whacky and strange languages out there, some that I would like to explore. That's not something that I have done much of, but I'm sure they all contribute something to the whole discussion of what a programming language can do, what it feels like, and what "niche" or region it occupies. They also can influence what new languages come about; this reminds me of the "middle-east government problem" we briefly discussed in MUN: there's some governments in place in various parts of the world, and nations in the Middle East have gone through several forms of governments (mostly due to the bad influence of the West :oops:). US would like to advocate for democracy, but perhaps the people of those countries will come up with their own, completely new government that works for them ("government that works for them" being the equivalent of a programming language working well for the developers who use it). So I would say that types of governments and types of programming languages are closely related.

Regarding my experience, I had to beg my brother to teach me a language (it wasn't fair that he was learning one and I wasn't), and that was Python (he wasn't a great teacher, just for the record). I've tried various projects in it, but I mostly shelved it for lack of things to do; I can still program in it pretty well, though. Prior to that I had been exposed to a little bit of programming, like conditionals, loops, and Boolean logic. Python really got me started with Nasal, when for some reason I started hacking around with FG (can't remember how I even knew how to get started...). Nasal has really led me to continuously do things with programming; I'm never bored since there's so much to accomplish with Nasal. Nasal also got me into C and a bit of C++; I basically picked those up by myself too. Also, this summer I got into writing bash scripts, mainly since I was setting up a build environment and there's no way I could remember the commands for more than a day ;). I now have scripts that contain all of the important commands. I recently used Python to write a regex generator (takes a list of words, compiles an optimized regex based on the concept of "switches" or alternative matches using the | operator). This week I returned to that and built an even better algorithm that optimizes about as much as I can do. (That was a really hard problem to solve; it involved finding subsets of prefixes/postfixes that must have the necessary connections, i.e. all of the permutations of the pre-/postfixes in that subset, for it to be optimizable.)

Speaking on their differences, I would say that Nasal excels at meta programming (think caller(), closure(), bind(), and compile() - few languages have such things) and simplicity (everything is more or less transparent and easy, no complicated syntax like declarations or "::" ;)). Python is best for scripting just anything at all; it has a really expansive library, especially of types (dict(), set(), list(), tuple(), and frozenset() are just a few of the builtin types, with plenty of methods like "|".join(["a","b"])), and it has the advantage of running on my iPod (due to the jailbroken self-SSH connection...). Bash is quite unique in lending itself to über-simple semantics for executing commands (you can use that part of a CLI without even recognizing it as a pretty functional scripting language) while having really weird workings for other things (for loops, if statements, etc.). However, it excels at interacting with the filesystem and processing simple information.

However, Forth is typically re-targeted, i.e. primitive stack operations (PUSH, POP, DUP, DUP2, JMP, JIF etc) are implemented using native machine instructions, to make things faster - which is also the approach taken by the Google JavaScript engine: it compiles JavaScript down into machine instructions, which why google chrome is so blazingly fast.

Haha, my brother keeps asking if it's possible to compile Nasal... Not that he likes the language in particular, or has ever tried it :P.

Re: Intern'ing

Postby Hooray » Fri Nov 15, 2013 9:25 pm

Haha, my brother keeps asking if it's possible to compile Nasal...

The Nasal mailing list has been offline for several years and I can't seem to find an archive of it, but yes, back then, there's been talk about compiling Nasal IR via LLVM, Andy really liked the idea for some reason that eludes me ... I guess mfranz could elaborate on the details.

However, I find Nasal often painful to use for stuff that isn't trivial, especially when it comes to refactoring and shuffling code around, all the duck-typing makes it far too easy to introduce issues that go unnoticed for several iterations, until it's too late/too difficult to determine when a certain error got introduced.

As an aside, I also didn't particularly like Nasal when I first got in touch with it in FlightGear, and in fact I ended up implementing AngelScript next to it to convince myself just how bad Nasal really was. But I guess it's one of these "programmer vs. software engineer" debates - a programmer will always stick with what he knows, while a true software engineer will be open-minded and also explore new/niche stuff that isn't otherwise common/used. Obviously, Nasal knowledge isn't as "useful" as Perl, Python or even Lua knowledge - I think that's part of the reason why we are seeing so few C++ developers actually making use of it, it's a niche language and not really useful outside FG.

To be honest, fast forward a decade later, I have come to realize that I was certainly biased, and my thinking was colored by wanting to use popular/established tools/languages, so I didn't really understand very much about Nasal internals and technically Nasal in FG could even be considered superior in comparison to Python or Lua - i.e. due to an inherent thread-safe design right from the start. Unfortunately it's one of these things where Nasal is more future-proof than FG itself, so that these strengths cannot currently be easily explored, i.e. multi-threaded scripting in FlightGear still is a niche, too.

Making Nasal compile to native code would not be such an impossible undertaking, even though I highly doubt that it would be particularly useful :lol:

For scripting purposes in FlightGear, it has obviously served us well - arguably, a more popular/established language probably would have worked, too.
On the other hand, if it had been Python, Perl or Lua, that would have probably caused completely new issues, due to the sheer amount of bindings/libs available there - i.e. FG scripting would have probably gone overboard quickly, and people would have used all sorts of bindings and native code modules/libraries, and most things would no longer work out-of-the-box or be compatible in the first place.

So there are some advantages to having an "obscure niche language" like Nasal. Technically, it has all the right features that you would expect in a modern language, so I cannot find any reasons not to use it.

Now, scripting a flight simulator is obviously a challenge in and of itself, and over time I have really come to the conclusion that while having a dynamically-typed GC-managed language significantly lowers the barrier to entry for non-coders, it's causing lots of issues sooner or later due to the sheer amount of code written by people who have a BASIC-approach to programming, which is unfortunate because it's reflecting badly upon Nasal and also FlightGear, due to performance issues (framerate, latency).

Thus, I believe that using a purely functional language with support for static and strong typing would have served the simulator better. This may seem weird, but we've sort of reached a critical mass here now, because core development is stalling in comparison to what's going on in the base package (see ohloh.net), so we're seeing more and more stuff being contributed in scripting space, where it's far too easy to bring the simulator down to its knees currently - without the Nasal engine itself being prepared to handle such things, not just due to GC issues, but also due to the sequential nature of 99% of our code.

We only need to look at the init mess, where declarative stuff is mixed with callbacks started via listeners and timers, or scripts running next to critical code that should ideally be running in a separate context. Overall, it's a challenge similar to the browser/JavaScript issue, where rogue JS code could halt a browser.

Not being able to inspect the Nasal VM at runtime is another issue, i.e. some way to identify routines that are causing (contributing to) performance issues would be great obviously.
Originally, Lua also didn't provide a solution to most of these problems, but the Lua community is just so huge that there's tons of new stuff available each month.

Most of these concerns should no longer be showstoppers once the HLA effort materializes a little more, but based on lurking (and playing around with it a little), that seems like another 2-3 years away currently.

Then again, we do have a few people who are familiar with Nasal internals already, and adding new features (not just APIs) is no longer an obscure challenge. So it would just be a matter of coming up with a list of desirable features for "FG Scripting for 2014+" :lol:

