the real cost of the Nasal Garbage Collector

Nasal is the scripting language of FlightGear.

Re: the real cost of the Nasal Garbage Collector

Postby legoboyvdlp » Tue Nov 24, 2015 2:40 pm

Hooray, if Nasal were to be moved to a separate thread, would multithreading be required?
legoboyvdlp
 
Posts: 7981
Joined: Sat Jul 26, 2014 2:28 am
Location: Northern Ireland
Callsign: G-LEGO
Version: next
OS: Windows 10 HP

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Tue Nov 24, 2015 2:49 pm

Unlike FlightGear's main loop, Nasal itself is designed to be thread-safe (even without using a Global Interpreter Lock like Python's GIL!) - what isn't thread-safe, is the FlightGear integration layer that "links" the Nasal engine and the FlightGear main loop, which mainly means stuff like 1) property tree APIs (think setprop/getprop), 2) fgcommands, 3) other extension functions, 4) classes and objects exposed to Nasal (e.g. via cppbind, but also the legacy NasalPositioned module).

Basically, the idea with HLA is to allow subsystems to have their own private property tree, and "mount" it in the global tree, so that reads/writes to a node like /ai would be dispatched to the thread/process running the AI federate, which implicitly serializes the request, while the whole thing is still running asynchronously.

The nice thing here is that this would also allow other subsystems to have their own property trees, think /models or even /canvas.

And in the case of Canvas you could even have one property tree instance per canvas texture, and cull/update that asynchronously using osg::Cull/UpdateVisitor, as long as there is only a single thread ever mutating its state, regardless if you are using Nasal or not.

Nasal threads running in the main loop can already provide ~500 Hz - and Nasal/Canvas scripts running in "fgcanvas" mode can provide similar performance:
[Image: screenshot, not preserved]

(this is a screen shot from over a year ago, without any other optimizations applied)

But it would make sense to move away from 1) the way we are dumping Nasal symbols into the globals hash unconditionally, and 2) start looking at subsystems with removability/multi-threading in mind (think HLA/RTI)
Please don't send support requests by PM, instead post your questions on the forum so that all users can contribute and benefit
Thanks & all the best,
Hooray
Help write next month's newsletter !
pui2canvas | MapStructure | Canvas Development | Programming resources
Hooray
 
Posts: 12707
Joined: Tue Mar 25, 2008 9:40 am
Pronouns: THOU

Re: the real cost of the Nasal Garbage Collector

Postby Philosopher » Tue Nov 24, 2015 3:18 pm

FWIW, I may have an idea that may reduce the number of times the GC is called. But I would most likely need to find a Linux box to compile FG (or upgrade my Mac or something... it hasn't been able to compile FG in forever).

The idea is to essentially take all the local closures out of the GC until they are requested by caller() or creation of a function (i.e. to be accessed by closure() later on) or some other method (think my extensions I made). That way the cost of running a very simple function is essentially 0 for the GC, not 1 hash object. Obviously I do not have data for how this would help, but I think it would at least reduce the number of temporary objects created, and thus the number of times the GC is called.
Philosopher
 
Posts: 1593
Joined: Sun Aug 12, 2012 7:29 pm

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Tue Nov 24, 2015 3:25 pm

I am not too optimistic regarding GC patches, I mean there are a bunch of related patches circulating around, including even stuff that core developers like ThorstenB and AndersG came up with, and still nobody committed those, despite them being useful to experiment with the GC.

Otherwise, I am sure that we have at least 3-5 people around here who would know how to implement Andy's suggestion and turn the GC into a generational one.

So I guess I agree that the lowest hanging fruit is waiting for HLA to materialize and then move Nasal and its APIs to a dedicated thread/process.

That said, regarding building SG/FG: I suggest you get in touch with Hamza: he once provided me with SSH access to one of his boxes so that I could help review/edit a few SG/FG related patches, when I didn't easily have access to a working build environment (like on an Android phone ... :mrgreen: )

(I guess, using Torsten's Phi work, you could even see a screen shot of fg)

But what is the problem when building FG on Mac ? I am surprised, given that our most active core developer is Mac based, so there should not be too many insurmountable problems on that front ?

EDIT: I sent a heads-up to Hamza. But for such experiments, it would be great to retrieve the "Nasal standalone" branch that is using simgear ... :oops:

Re: the real cost of the Nasal Garbage Collector

Postby AndersG » Tue Nov 24, 2015 4:07 pm

Hooray wrote in Tue Nov 24, 2015 3:25 pm: I am not too optimistic regarding GC patches, I mean there are a bunch of related patches circulating around, including even stuff that core developers like ThorstenB and AndersG came up with, and still nobody committed those, despite them being useful to experiment with the GC.


There were some serious issues with my patch - namely that the destruction of some C++ side FG objects kept alive by Nasal references must not be invoked on a different thread. I didn't see the crashes, but they appeared for people using Canvas heavy aircraft.
James did some work to fix these problems, but I'm not sure how completely he succeeded. I still run the patch with his updates, however, and it works for the aircraft I use (though I don't think I use any canvases).

Running the Nasal GC in a different thread could certainly be revisited, and if the GC were updated to an incremental one it could be done without impacting the main loop much.

As an aside: when testing/measuring performance it is rather important to measure within the expected domain. E.g. different parts of FlightGear scale their load (computation per unit of time) differently with frame rate, e.g.:
1. JSBSim uses about the same amount of CPU (for a particular FDM) no matter the frame rate (down to hitting the max simulation time per frame).

2. Per frame work (including Nasal timer 0 loops) will increase with the frame rate. Going from 20 fps to 2000 fps will thus increase this load by two orders of magnitude.
Callsign: SE-AG
Aircraft (uhm...): Submarine Scout, Zeppelin NT, ZF Navy free balloon, Nordstern, Hindenburg, Short Empire flying-boat, ZNP-K, North Sea class, MTB T21 class, U.S.S. Monitor, MFI-9B, Type UB I submarine, Gokstad ship, Renault FT.
AndersG
 
Posts: 2524
Joined: Wed Nov 29, 2006 10:20 am
Location: Göteborg, Sweden
Callsign: SE-AG
OS: Debian GNU Linux

Re: the real cost of the Nasal Garbage Collector

Postby Thorsten » Tue Nov 24, 2015 5:06 pm

Just to spell that out:

There's nothing wrong at all with testing things and sharing observations as they are. It's a good thing, and it's the start of all insight. And observations are what they are.

However, if you give an interpretation of an observation, you're in essence advocating a theory. Which means you can be wrong, and which means you should have some context in how to create such a theory. Personally, I feel that to the degree I am not completely sure, it'd be a good idea to add qualifiers like 'I think' or 'I suspect' or 'perhaps could be the case'.

If you go beyond that and give a recommendation for a change, you should be sure of your theory, and you should be sure you understand what your proposed alternative does.

Finally, I don't think it's necessarily a good idea to call other people's work 'sloppy' or similar things. But if you feel you must criticize that way, you better be rock-solid right. Because otherwise you're just being rude. If you go and criticize pieces of code without having a real insight into how they work and what they do (and it's hard to gain such insight without actually working with the code and trying a few changes), you're likely to alienate people and at the same time present yourself as someone who can't get his facts straight.

Criticism is important, especially if you have a genuinely better alternative to offer - but it quickly becomes destructive if you just fire off blanket statements without a real insight.

My final two cents to this thread.
Thorsten
 
Posts: 12490
Joined: Mon Nov 02, 2009 9:33 am

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Tue Nov 24, 2015 7:20 pm

AndersG wrote in Tue Nov 24, 2015 4:07 pm: There were some serious issues with my patch - namely that the destruction of some C++ side FG objects kept alive by Nasal references must not be invoked on a different thread. I didn't see the crashes, but they appeared for people using Canvas heavy aircraft.



thanks for the update, and thanks specifically for participating in this discussion

There are not many other people around who are able to make working modifications to the GC scheme, so it is good having your feedback.

The threaded GC exploding with Canvas objects could make sense, because Canvas is fairly heavy when it comes to C++ objects wrapped as Nasal ghosts.

But it should also show up when using dialogs or even just tooltips, because Tom is using the cppbind framework for wrapping all sorts of C++ objects (usually boost stuff) as Nasal ghosts.

Andy's original idea was to allocate naRefs into generations

Looking at the code, we can do that by having a handful of different pools, and by promoting objects from one generation to another whenever they "survive" a GC pass.

We could also add extension functions to directly "mark" certain APIs/namespaces as never being subject to GC (think core APIs), i.e. having a "core" generation that would contain lots of stuff that never needs to be freed.
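As a rough illustration of the generational idea described above, here is a toy Python model. It is only a sketch under stated assumptions and has nothing to do with the actual layout of gc.c: the names `Obj`, `minor_collect` and the three generation labels are hypothetical. For simplicity the mark phase still walks the whole graph; a real generational collector would need something like a write barrier to avoid that.

```python
YOUNG, OLD, CORE = 0, 1, 2   # hypothetical generation labels


class Obj:
    """Toy heap object with outgoing references and a generation tag."""

    def __init__(self, name, generation=YOUNG):
        self.name = name
        self.refs = []
        self.generation = generation


def reachable(roots):
    """Mark phase: return the set of objects reachable from the roots."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(obj.refs)
    return seen


def minor_collect(heap, roots):
    """Free unreachable YOUNG objects only; promote YOUNG survivors to OLD."""
    live = reachable(roots)
    survivors = []
    for obj in heap:
        if obj.generation != YOUNG:
            survivors.append(obj)      # OLD and CORE are never freed here
        elif obj in live:
            obj.generation = OLD       # survived a pass: promote
            survivors.append(obj)
        # young and unreachable objects are simply dropped
    return survivors
```

In this model, objects tagged `CORE` play the role of the "core" generation mentioned above: they are kept without ever being candidates for freeing.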


The alternative would be modifying gc.c to come up with an interface, so that we can try out different open source GC schemes/libs (last I checked, there were at least a handful that could be integrated, not just incremental, but also concurrent/threaded).

Something like this could be exposed as a startup/runtime option, so that people can play with different GC modes/implementations.

The question is if we can find someone willing to review/commit something like this as an option (build-time/startup), so that such experiments actually make it into sg.


IMO, settimer(0) as an API should either be completely deprecated or wrapped using maketimer() that listens to /sim/signals, and we should show a warning, too.

Just like setlistener() should be wrapped by a "makelistener" equivalent, because those two are the most common causes for resource leaks, as well as callbacks that are ACCIDENTALLY registered to be invoked several times per frame.

Re: the real cost of the Nasal Garbage Collector

Postby hamzaalloush » Tue Nov 24, 2015 10:12 pm

Ok Thorsten, I'm sorry if i sounded rude to you.

I've now assigned a machine to Philosopher that he can use if he wants to compile on Linux.

so now comes the question, are there better tests to perform without needing any coding skills?

until now, i think we established that more modules in Nasal/ folder means more garbage collection footprint, but i would like something to be explained to me, quoting Hooray:

Hooray wrote in Tue Nov 24, 2015 1:24 pm: Some of your results are a little odd, i.e. are not yet interpreted properly - but the underlying data/findings are interesting.


Which thing specifically do you find odd, barring my interpretation, which could be wrong? Is it the fact that I get a reduction in FPS with more modules? I have also tested on less powerful systems, where I don't seem to get a much higher FPS from a minimal set of Nasal modules. That could also explain why Richard's full cockpit view of the F-15 with and without Nasal (Canvas displays off) is nearly the same (1.5 fps difference). What is the reason for this discrepancy between our readings?

Also interested in this point by Anders,

AndersG wrote in Tue Nov 24, 2015 4:07 pm: 2. Per frame work (including Nasal timer 0 loops) will increase with the frame rate. Going from 20 fps to 2000 fps will thus increase this load by two orders of magnitude.


Where in FlightGear can I evaluate this load increase? Is it the number of iterations of the Nasal subsystem, for instance? Actually, do Nasal timer loops have an actual footprint?

Thanks to everybody who posted in this thread, as I think it is important to understand how our scripting language works, and to improve it if necessary. To be honest, I believe that our current implementation is not optimal: when I tested Vito's airplane (the first version with less scripting in it), I got better FPS with it on the powerful machine than with an airplane using Canvas, barring the number of vertices.
hamzaalloush
 
Posts: 631
Joined: Sat Oct 26, 2013 10:31 am
OS: Windows 10

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Tue Nov 24, 2015 10:30 pm

no, like I said, that is to be expected - the garbage collector is basically a search function which has to search all loaded modules for references/objects that are reachable.

The current practice in $FG_ROOT/Nasal is to load /everything/ into the global Nasal namespace, regardless if it is actually used or not - the same thing is currently done with C++ code exposing bindings to Nasal (see the linked to/quoted thread shown above).

Basically, imagine it like a toolbox with plenty of "useful" tools in it (Nasal modules) - as long as everything is in it, it's obviously easily available, too - but there will also be an increased cost, i.e. weight to carry the toolbox.

Thus, it does not matter if you are using Canvas or not: the corresponding modules will be made available to Nasal, which does increase the "search space" (volume) that the GC function has to traverse/process.
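To make the "search space" point concrete, here is a small Python model of a mark pass over a global namespace. It is a sketch, not FlightGear or Nasal code: `mark`, `make_module` and the module sizes are hypothetical. It only shows that the amount of work in the mark pass follows the number of reachable objects, whether or not a script ever uses them.

```python
def mark(root):
    """Walk everything reachable from root; return how many objects were visited."""
    seen, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if isinstance(obj, dict):
            stack.extend(obj.values())   # follow references held by a namespace
    return len(seen)


def make_module(n_symbols):
    """A hypothetical module namespace holding n_symbols distinct objects."""
    return {f"sym{i}": object() for i in range(n_symbols)}


# A small global namespace with a single loaded module.
script_globals = {"props": make_module(100)}
print("objects visited:", mark(script_globals))

# Loading another (unused) module grows the work of every later mark pass.
script_globals["canvas"] = make_module(500)
print("objects visited:", mark(script_globals))
```

The second count is larger even though nothing in this model ever calls into the extra module, which is the toolbox "weight" from the analogy.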

The solution for that, I mentioned in the other thread - including C++ patches, it's trivial actually: instead of adding stuff in a hard-coded fashion using postinit() hacks, the corresponding bind() and unbind() methods of the underlying SGSubsystem are used - which were introduced for tied properties, but can be used for this purpose, because when the system is added/removed, those methods will be invoked by the subsystem manager.

What AndersG said is that a timer executed each frame will obviously invoke the callback much more frequently if the frame rate goes higher (imagine running a Nasal callback 20 frames per second or 100: the same callback may only need to be updated 10 times per SECOND, yet its run-time is proportional to frame-rate, because it is invoked per-frame).

Regarding your interpretation/conclusion, I would like to re-iterate, that it is not a good idea to interpret your results, unless you find a way to carefully phrase your findings as a "question" - you are still missing quite a bit of surrounding info, and people like AndersG, Richard, Thorsten and myself can help you fill in some of the gaps here.

PS: A "Nasal timer 0 loop" is a loop that is executed once per frame, so is likely to be extremely heavy, and may also cause the Nasal GC scheme to be invoked once per frame (in theory), so the "0" is just referring to the delay of the loop until the next execution - 0.1 is 100ms, 0.5 is 500ms and 1.0 once per second. Typical frame rates are 25-60 fps, so you see that it makes a huge difference how often a callback is invoked, and people should not use "0" normally
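The effect of the delay value can be sketched with a small frame-loop simulation in Python. This is a model only, under the assumption that timers are checked once per frame; `invocations_per_second` is a hypothetical helper, not a FlightGear API.

```python
from fractions import Fraction


def invocations_per_second(delay_ms, fps):
    """Count callback invocations during one simulated second.

    Assumption: timers are only checked once per frame, so a delay of 0
    means "run again on the next frame" and no timer can fire more often
    than the frame rate. Fractions keep the timing arithmetic exact.
    """
    delay = Fraction(delay_ms, 1000)
    frame_dt = Fraction(1, fps)
    count = 0
    next_due = Fraction(0)
    for frame in range(fps):          # the frames within one second
        now = frame * frame_dt
        if now >= next_due:
            count += 1                # the callback runs on this frame
            next_due = now + delay    # and re-arms itself
    return count


for fps in (25, 60, 2000):
    print(fps, "fps:",
          invocations_per_second(0, fps), "calls/s with delay 0,",
          invocations_per_second(100, fps), "calls/s with delay 0.1")
```

In this model a delay of 0 gives exactly one call per frame, so the call count tracks the frame rate, while a delay of 0.1 stays at roughly ten calls per second however high the frame rate goes.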

Re: the real cost of the Nasal Garbage Collector

Postby AndersG » Tue Nov 24, 2015 10:47 pm

hamzaalloush wrote in Tue Nov 24, 2015 10:12 pm:
AndersG wrote in Tue Nov 24, 2015 4:07 pm: 2. Per frame work (including Nasal timer 0 loops) will increase with the frame rate. Going from 20 fps to 2000 fps will thus increase this load by two orders of magnitude.


Where in FlightGear can I evaluate this load increase? Is it the number of iterations of the Nasal subsystem, for instance? Actually, do Nasal timer loops have an actual footprint?


Looking at the CPU utilization of the FG process might give some information (for per-frame CPU work), but that was not the point.
The point is simply that if each frame cost X amount of computation, then the cost of 2000*X amount of computation is two orders of magnitude (100x) larger than 20*X. That means that per-frame work that would incur a completely negligible cost at 20 fps could look like a major resource consumer if you profiled it in a situation where you have 2000 fps.

Another example:
If you profile FG at frame rates (well?) below 1 fps you would find that JSBSim is a very expensive or even the most expensive part of FG due to item 1. (This was tried and led to some confusion a number of years back).
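The arithmetic behind both examples can be written out in a few lines of Python. All numbers here are hypothetical (10 us of per-frame script work, an FDM stepping at 120 Hz with 50 us per step); they are chosen only to show how the two kinds of load scale, not measured from FlightGear.

```python
def work_per_second_us(fps, per_frame_us, fixed_rate_hz, fixed_step_us):
    """Return (per-frame work, fixed-rate work) in microseconds per second."""
    per_frame_total = fps * per_frame_us          # grows linearly with frame rate
    fixed_total = fixed_rate_hz * fixed_step_us   # independent of frame rate
    return per_frame_total, fixed_total


def fixed_steps_per_frame(fps, fixed_rate_hz):
    """How many fixed-rate steps must be run inside a single frame."""
    return fixed_rate_hz / fps


# Hypothetical costs: 10 us per-frame script work, 50 us per FDM step at 120 Hz.
for fps in (20, 2000):
    per_frame, fixed = work_per_second_us(fps, 10, 120, 50)
    share = per_frame / (per_frame + fixed)
    print(f"{fps:5d} fps: per-frame {per_frame} us/s, "
          f"fixed-rate {fixed} us/s, per-frame share {share:.0%}")

# At very low frame rates the fixed-rate work piles up inside each frame.
print("FDM steps per frame at 0.5 fps:", fixed_steps_per_frame(0.5, 120))
```

With these made-up numbers the per-frame work is a small fraction of the total at 20 fps and the dominant part at 2000 fps, while at 0.5 fps each frame has to run hundreds of fixed-rate steps, which is why a profile taken outside the expected frame-rate range can point at the wrong component.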

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Tue Nov 24, 2015 11:00 pm

Basically, people should not be using callbacks that are invoked once per frame (and in fact, some code we've seen was registered to be invoked several times per frame!).

For profiling purposes, you can use the built in google-perftools support (see the wiki).
Or use "oprofile" on Linux.

But as has been said, you would want to cap frame rate to a reasonable value, depending on the components you are examining, or understand that work executed per frame, is accumulating if you increase frame rate beyond what is normally seen in FG (60-120 fps should be a safe assumption for max).

Regardless of that, what you were doing may still be useful when interpreting other parts of the simulator, i.e. those that should not be proportional to frame rate directly.

Re: the real cost of the Nasal Garbage Collector

Postby hamzaalloush » Tue Nov 24, 2015 11:46 pm

Thank you for the very detailed reply, and remaining specific to my questions.

Hooray wrote in Tue Nov 24, 2015 10:30 pm: Basically, imagine it like a toolbox with plenty of "useful" tools in it (Nasal modules) - as long as everything is in it, it's obviously easily available, too - but there will also be an increased cost, i.e. weight to carry the toolbox.

Thus, it does not matter if you are using Canvas or not: the corresponding modules will be made available to Nasal, which does increase the "search space" (volume) that the GC function has to traverse/process.


I think this is the most accessible explanation of the GC to the uninitiated on Nasal like myself, that i've seen.

Hooray wrote in Tue Nov 24, 2015 10:30 pm: The solution for that, I mentioned in the other thread - including C++ patches, it's trivial actually: instead of adding stuff in a hard-coded fashion using postinit() hacks, the corresponding bind() and unbind() methods of the underlying SGSubsystem are used - which were introduced for tied properties, but can be used for this purpose, because when the system is added/removed, those methods will be invoked by the subsystem manager.


Which would be this thread: initNasalCanvas() - making Canvas optional. Also, I've read the following about postinit() in the SimGear doxygen:

void SGSubsystem::postinit() [virtual]

Initialize parts that depend on other subsystems having been initialized.

This method should set up all parts that depend on other subsystems. One example is the scripting/Nasal subsystem, which is initialized last. So, if a subsystem wants to execute Nasal code in subsystem-specific configuration files, it has to do that in its postinit() method.


So you would like to replace the given SGSubsystem mechanism of initializing Canvas as a Nasal dependency with bind()/unbind(), which would add/remove the Canvas dynamically? Have I understood this right? And how would this fit with the new direction of going from tied properties to Property Objects, as written on the wiki? I hope I'm not embarrassing myself here. :)

And thank you for the very straightforward answer on best practices for using timers.

edit: I just saw Anders' response and would like to thank him as well. I will look into learning measuring methods for per-frame CPU work, so that further benchmarks aren't skewed by said timer 0 implementations, or at least take them into account.

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Wed Nov 25, 2015 12:10 am

You don't want to look at postinit() - it's a huge messy hack that is currently used to unconditionally add stuff to the "toolbox" to add everything that people may possibly need, without regard to the GC - and it even breaks reset/re-init by not removing references/dependencies to subsystems that are removed (i.e. you can trivially segfault FG using all APIs added via postinit)

The bind()/unbind() approach is quoted in my very first response at: viewtopic.php?f=30&t=28084#p265728
The whole thread is to be found here: viewtopic.php?p=214436#p214436

The whole "toolbox" analogy is just intended to demonstrate why/how availability of Nasal modules (and C++ bindings exposed to Nasal) works, and how it is adding to the search space (weight of the tool box, vs. time it takes you to find a certain tool).

Using bind/unbind, FlightGear can look at what you want to do, by checking what subsystems are there, and only add stuff that you need - i.e. your toolbox may only have a screwdriver/hammer if that is all you need, which makes your toolbox more lightweight obviously, i.e. decreases the search space.
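The bind/unbind idea can be modelled in a few lines of Python. This is a hedged sketch only: the `Subsystem` and `SubsystemManager` classes below are hypothetical stand-ins and do not reproduce SimGear's SGSubsystem interface. They just show a namespace being added to the script globals when a subsystem is added, and dropped again when it is removed.

```python
class Subsystem:
    """Hypothetical stand-in for a subsystem with bind()/unbind() hooks."""

    def __init__(self, name, symbols):
        self.name = name
        self.symbols = symbols

    def bind(self, script_globals):
        # Expose this subsystem's scripting API as a namespace.
        script_globals[self.name] = dict(self.symbols)

    def unbind(self, script_globals):
        # Remove the namespace so it no longer counts as reachable.
        script_globals.pop(self.name, None)


class SubsystemManager:
    """Hypothetical manager that calls the hooks on add/remove."""

    def __init__(self):
        self.script_globals = {}
        self.subsystems = {}

    def add(self, subsystem):
        self.subsystems[subsystem.name] = subsystem
        subsystem.bind(self.script_globals)

    def remove(self, name):
        self.subsystems.pop(name).unbind(self.script_globals)


mgr = SubsystemManager()
mgr.add(Subsystem("canvas", {"new": object(), "get": object()}))
print(sorted(mgr.script_globals))   # the canvas namespace is now visible
mgr.remove("canvas")
print(sorted(mgr.script_globals))   # and gone again after removal
```

In this model a removed subsystem leaves nothing behind in the globals, so there are no stale references to call into after a reset and nothing extra for a collector to traverse.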

To understand the algorithm/heuristics of the existing GC implementation, you would have to imagine it like an "address book" with addresses (or telephone numbers) in it. There is only a single "root" address book known to Nasal, but that contains addresses where it can find other address/phone books.

And every Nasal function/instruction can internally refer to existing phone numbers/addresses, but also create new ones - which need to be added to an address book.

The "root" (top-level) address book contains addresses to a bunch of other address books (say 20), which in turn contain addresses to even more phone/address books (say another 100 each).

Whenever a telephone/address book refers to an existing "book" it's just a "reference" basically - but when it refers to an actual "thing" (like a number/string, object) it's the final result of looking up that address.

Normally, Nasal would keep creating tens of thousands of these "address books" internally, but at some point it may be running out of space - so what it is doing is to stop EVERYTHING, and then search the root address book and mark all addresses that are reachable (referenced) elsewhere, and remove all entries that are unused.

That way, it is freeing up space for new address/phone numbers.
Internally, this is using the notion of so called "memory pools" which are phone books for different kinds of data (think scalars, hashes, vectors) - so that those show up in different "pools" from which to create new phone books.

Which also means that the GC does not necessarily have to "print" (create) new phone/address books when it is lacking space, but may find an unused one and use that.
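The stop, mark, sweep and reuse cycle described above can be sketched as a toy Python pool. This is a simplified model, not a port of Nasal's gc.c: `Cell`, `Pool`, the capacity and the growth policy are all hypothetical.

```python
class Cell:
    """One entry in a pool: a payload plus references to other cells."""

    def __init__(self, payload):
        self.payload = payload
        self.refs = []
        self.marked = False


class Pool:
    """Toy memory pool that collects garbage only when it runs out of space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = []          # every cell currently allocated from this pool
        self.collections = 0     # how many times the collector had to run

    def alloc(self, payload, roots):
        if len(self.cells) >= self.capacity:
            self.collect(roots)              # out of space: stop and collect
        if len(self.cells) >= self.capacity:
            self.capacity *= 2               # everything is live: grow the pool
        cell = Cell(payload)
        self.cells.append(cell)
        return cell

    def collect(self, roots):
        self.collections += 1
        visited, stack = [], list(roots)
        while stack:                         # mark: follow references from the roots
            cell = stack.pop()
            if not cell.marked:
                cell.marked = True
                visited.append(cell)
                stack.extend(cell.refs)
        self.cells = [c for c in self.cells if c.marked]   # sweep: drop the rest
        for cell in visited:
            cell.marked = False              # reset marks for the next pass


pool = Pool(capacity=4)
root = pool.alloc("root", [])
for i in range(10):
    pool.alloc(i, [root])                    # temporaries nobody refers to
print("collections:", pool.collections, "live cells:", len(pool.cells))
```

In this model the unreferenced temporaries are reclaimed whenever the pool fills up, so the pool never has to grow; it only grows when a collection finds that everything is still reachable.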

Re: the real cost of the Nasal Garbage Collector

Postby Thorsten » Wed Nov 25, 2015 8:17 am

Basically, people should not be using callbacks that are invoked once per frame (and in fact, some code we've seen was registered to be invoked several times per frame!).


Rewind a few years back, and you could find several people (Vivian Meazza certainly) argue on the devel list that if you must use Nasal loops they should always be 'per frame' no matter the real requirements to avoid uneven frame durations - which you in theory always get when you use different loop timers.

Re: the real cost of the Nasal Garbage Collector

Postby Hooray » Wed Nov 25, 2015 4:40 pm

I understand where that line of reasoning came from, but I don't agree with the conclusion/recommendation - but probably it depends hugely on your use-case, too.

In general, "nasal & loops" are/were frowned upon, because many people didn't quite understand the relationship between Nasal execution and the main loop footprint. I think there are at least a dozen threads on "nasal for loops are bad" to be found in the archives for that reason, recommending the use of settimer(0) loops instead.
