Nasal, multithreading and performance

Nasal is the scripting language of FlightGear.

Nasal, multithreading and performance

Postby CaptB » Sun Aug 30, 2020 10:24 am

Split off from the topic A320-family development.


Curtis wrote in Sun Aug 30, 2020 9:01 am: Thanks for the information legoboyvdlp :).

That's why I wonder whether there is much room for improving the FPS of the A320-family in FlightGear;
20 fps seems very low for a modern PC (Intel i5 4670K, 4 cores @ 3.40GHz, AMD Vega Frontier Edition), according to Merspieler's message.

Perhaps by splitting the tasks across the other CPU cores (multithreading, parallel programming) and reducing the complexity of the Nasal loops you could increase the FPS. I don't know whether multithreading is allowed in Nasal, or whether all tasks can only run sequentially on a single CPU core (a FlightGear limitation that would prevent optimizing the Nasal code any further).


I'm not sure where the 20 FPS comes from. I am on a 2500K @ 4.2 GHz with an RX580, and I get from ~30 to well above 50 depending on scenery and active view with the bus (3D branch). In any case, most sims (X-Plane is experimenting with Vulkan) are CPU-bound, and single-core performance matters. It's been discussed multiple times on the forum why multithreading might not change the situation much.
Last edited by Johan G on Sun Aug 30, 2020 10:32 pm, edited 1 time in total.
Reason: Split off from the topic "A320-family development".
Ongoing projects(3D modelling): A320, MD-11, A350, B767
FG767: https://fg767.wordpress.com/

Re: A320-family development

Postby merspieler » Sun Aug 30, 2020 11:07 am

I wouldn't call my i5 modern anymore. It's 7 years old and due for a replacement (just waiting for the new AMD CPUs).

My 20 fps (actually more like 10-15) comes from AW + osm2city + compositor + max settings ... at KBOS. Obviously, if I lower the settings or fly over other areas, I get more.

The issue I have, even at higher fps, is the frame times: they are rather high, which indicates a CPU bottleneck.

Curtis: keep in mind, when comparing to PMDG, that Boeings are less complex than Airbuses.
Nia (you&, she/her)

Please use gender neutral terms when referring to a group of people!

Be the change you wish to see in the world, be an ally to all!

Join the official matrix space

Re: A320-family development

Postby legoboyvdlp » Sun Aug 30, 2020 1:54 pm

Yeah - I've pretty much figured that the issue is the CPU.

Curtis wrote: Perhaps by splitting the tasks across the other CPU cores (multithreading, parallel programming) and reducing the complexity of the Nasal loops you could increase the FPS. I don't know whether multithreading is allowed in Nasal, or whether all tasks can only run sequentially on a single CPU core (a FlightGear limitation that would prevent optimizing the Nasal code any further).


I think you are right about Nasal loops. I have been working on reducing the intensity of the loops and also rewriting older code to be more efficient. I may consider an iteration counter and/or event-driven programming (i.e. listeners) rather than a monolithic loop. I also believe a large contributor was the ECAM. I also discovered that a few non-critical JSBSim systems were running at 120 Hz, which made some weaker CPUs struggle.
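To make the two ideas above concrete (an iteration counter, and listeners instead of one monolithic per-frame loop), here is a minimal sketch. It is written in Python rather than Nasal so that it can be run outside FlightGear, and `FrameScheduler` and `WatchedValue` are made-up names, not FlightGear or Nasal APIs; in Nasal the listener half would roughly correspond to `setlistener()`.

```python
# Hypothetical stand-ins for illustration; not FlightGear or Nasal APIs.
class FrameScheduler:
    """Iteration counter: runs each task only every `every` frames."""

    def __init__(self):
        self.frame = 0
        self.tasks = []  # (every, offset, callback)

    def add(self, every, callback, offset=0):
        self.tasks.append((every, offset, callback))

    def tick(self):
        for every, offset, callback in self.tasks:
            if (self.frame - offset) % every == 0:
                callback()
        self.frame += 1


class WatchedValue:
    """Event-driven: listeners fire only when the value actually changes."""

    def __init__(self, value):
        self.value = value
        self.listeners = []

    def listen(self, callback):
        self.listeners.append(callback)

    def set(self, value):
        if value != self.value:
            self.value = value
            for callback in self.listeners:
                callback(value)


calls = {"fast": 0, "slow": 0, "gear": 0}

def bump(name):
    calls[name] += 1

sched = FrameScheduler()
sched.add(1, lambda: bump("fast"))             # every frame
sched.add(10, lambda: bump("slow"), offset=3)  # every 10th frame, staggered

gear = WatchedValue("down")
gear.listen(lambda value: bump("gear"))

for frame in range(100):
    sched.tick()
    # The value is written every frame but only changes once, at frame 50.
    gear.set("up" if frame >= 50 else "down")
```

Over 100 frames the per-frame task runs 100 times, the staggered task 10 times, and the listener once. The `offset` is there so that several slow tasks do not all land on the same frame.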


Threads aren't really viable in Nasal, as the property tree isn't thread-safe. It can be done for certain things, but you need to be very very very careful with it.
So, I think if you update, you might well see 25-30 frames per second...?

As for multithreading in general, the situation at present is that only parts of the renderer are multithreaded - in effect, very little is threaded, never mind multicore.

Nasal, multithreading and performance

Postby Hooray » Sun Aug 30, 2020 2:51 pm

Curtis wrote: Perhaps by splitting the tasks across the other CPU cores (multithreading, parallel programming) and reducing the complexity of the Nasal loops you could increase the FPS. I don't know whether multithreading is allowed in Nasal, or whether all tasks can only run sequentially on a single CPU core (a FlightGear limitation that would prevent optimizing the Nasal code any further).


Multi-threading in Nasal is supported and possible - however, you can basically only use it properly if you have a strong background in programming and/or in FlightGear internals. Otherwise, it's unlikely that people can use it successfully, because the rest of FlightGear is basically single-threaded, and that also applies to all extension functions and FlightGear-specific Nasal APIs.

Basically, it's not feasible to introduce multi-threading at a later time, unless you have a strong background in software engineering. However, designing your systems with threading in mind upfront can work out reasonably well - but obviously you need to work out the data flow and dependencies between different code routines.

The only person to really optimize all of the aircraft, including Nasal code, is Richard and he happens to be a FlightGear core developer, and also has a background in computer science - and certainly must have been using FlightGear for 10+ years :D

However, Richard has written a number of helpers which he is in the process of sharing, i.e. adding to fgdata - so that others, less familiar with fg internals, can also use these helpers.

Speaking in general, you almost certainly don't want to use multi-threaded Nasal code - unless you have a corresponding background, or at least know Nasal/the property tree and the SGSubsystem architecture inside out.

Sooner or later, it seems rather likely that Canvas based avionics may optionally get their own/private Nasal instance (and property tree) per Canvas - so that the corresponding scripts can run outside the FlightGear main loop. Under the hood, a Canvas is already primarily a property tree - one that watches certain properties/locations, and that maps reads/writes to the corresponding OSG APIs.

Proceeding "as is" simply isn't viable for FlightGear as a whole, because the way Nasal and the Canvas system work, there is more and more rendering code tied to non-deterministic code (among others due to Nasal's garbage collector). However, Canvas based avionics have a well-defined set of inputs (think properties, and calling certain FG extension functions, e.g. to query the navdb) and well-defined outputs (usually just a single FBO/RTT texture).

Thus, conceptually the code used to update such a texture does not need to run inside the FlightGear main loop. Especially when keeping in mind that a typical cockpit may have 6+ of these FBO textures, all of which are currently updated by Nasal script running at frame rate inside the main loop.

The shared requirement of these update routines is that they require a state vector of properties and API calls - some are fixed (i.e. always the same properties), whereas others are dynamic and may change depending on the mode/context in question (imagine showing different modes/elements of a PFD/ND).

However, our experience when designing the MapStructure/ND frameworks has been that regardless of the number of instruments, it isn't feasible to always poll/getprop properties during each update cycle - instead, it makes sense to use memoization (caching) - Richard came up with a dedicated framework for that, and also for splitting work across multiple frames.

This is an approach that the MapStructure framework also used (instead of threading).
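A rough sketch of the two techniques mentioned above, memoization and splitting work across multiple frames. This is not Richard's framework or the MapStructure code, just a Python illustration (Python so that it runs standalone); `Memo` and `in_slices` are hypothetical names.

```python
class Memo:
    """Recompute a derived value only when its inputs change."""

    def __init__(self, compute):
        self.compute = compute
        self.last_inputs = None
        self.last_result = None
        self.recomputations = 0

    def __call__(self, *inputs):
        if inputs != self.last_inputs:
            self.last_result = self.compute(*inputs)
            self.last_inputs = inputs
            self.recomputations += 1
        return self.last_result


def in_slices(items, per_frame):
    """Yield successive slices of `items`, intended as one slice per frame."""
    for start in range(0, len(items), per_frame):
        yield items[start:start + per_frame]


# Memoization: the label is rebuilt only when the rounded heading changes.
heading_label = Memo(lambda deg: f"HDG {deg:03d}")
labels = [heading_label(round(h) % 360) for h in (89.6, 90.2, 90.4, 91.7)]

# Work splitting: 25 waypoints are processed 10 per frame over 3 frames.
waypoints = list(range(25))
frames = list(in_slices(waypoints, 10))
```

Rounding the input before it reaches the cache is the important design choice here: a raw floating-point heading changes every frame, so caching on it would never hit.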

Thus, if you were to render 10+ instances of an ND, it would be kinda pointless to do so at frame rate while always polling /position/*, /orientation, and /fdm/* - likewise, using listeners would not be a good idea for state that changes per frame.

In other words, what's needed is a partitioning mechanism to subscribe to relevant state, and then only do the fetching once (per update cycle), where all subscribing avionics would merely get a copy of the state, rather than each doing their Nasal/C++/Nasal context switches for each extension function call.
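As an illustration of "fetch once per update cycle, hand every subscriber a copy", here is a minimal Python sketch (Python so it can be run standalone). `StateBroker` and `read_property` are hypothetical names, and the dictionary only stands in for the property tree; this is not the scheme from Richard's fgdata commits.

```python
class StateBroker:
    """Reads each subscribed property once per update, then fans out copies."""

    def __init__(self, read_property):
        self.read_property = read_property  # the only call into the "tree"
        self.subscribers = []               # (paths, callback)

    def subscribe(self, paths, callback):
        self.subscribers.append((tuple(paths), callback))

    def update(self):
        # Union of all requested paths, so shared paths are read only once.
        wanted = {p for paths, _ in self.subscribers for p in paths}
        snapshot = {p: self.read_property(p) for p in sorted(wanted)}
        for paths, callback in self.subscribers:
            callback({p: snapshot[p] for p in paths})  # private copy


# A plain dict standing in for the property tree (example values only).
tree = {
    "/position/latitude-deg": 42.36,
    "/position/longitude-deg": -71.01,
    "/orientation/heading-deg": 270.0,
}
reads = []

def read_property(path):
    reads.append(path)  # count how often the "tree" is actually touched
    return tree[path]

broker = StateBroker(read_property)
received = []
for _ in range(10):  # ten ND instances wanting the same state
    broker.subscribe(tree.keys(), received.append)
broker.update()
```

With ten subscribers and three shared paths, one update cycle performs three reads instead of thirty, and each subscriber can modify its copy without affecting the others.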

Richard has worked out a generic scheme to accomplish exactly that (see his fgdata commits to /Nasal).

In the mid-term, it will make sense to have the discussion if, and how, to specify lists of relevant properties for each canvas - which would include output and input properties. At that point, the Canvas system itself could traverse that list at the CanvasMgr level, and provide each canvas texture with a copy of fresh state, without unnecessarily causing property tree "traffic".

At that point, the setup would resemble the "instant replay/flight recorder" subsystem - because that, too, is using a configuration scheme to encode relevant I/O properties (per aircraft).

The thing is, once you have this sort of info PER INSTRUMENT (per canvas), you can trivially use a dedicated SGPropertyNode per Canvas texture, and then also hook up a dedicated FGNasalSys instance to the Canvas texture in question.

With this sort of setup, you'll then end up with an off-screen RTT/FBO context that can be asynchronously updated in the background, without having to run inside the main loop - you would even have to tell OSG to only run it at ~30 hz, because it could easily run at 100+ hz otherwise, i.e. updating a texture unnecessarily.
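The rate cap itself is simple to picture. The following is only a generic Python sketch of the idea, not OSG or FlightGear API; `RateLimiter` is a hypothetical name, and `Fraction` is used purely to keep the simulated clock exact.

```python
from fractions import Fraction


class RateLimiter:
    """Lets an update through only if at least 1/max_hz seconds have passed."""

    def __init__(self, max_hz):
        self.min_interval = Fraction(1, max_hz)
        self.next_due = Fraction(0)

    def due(self, now):
        if now < self.next_due:
            return False
        self.next_due = now + self.min_interval
        return True


limiter = RateLimiter(max_hz=30)
loop_hz = 120  # assumed rate of the free-running background loop
redraws = 0
for tick in range(loop_hz):  # one simulated second
    now = Fraction(tick, loop_hz)
    if limiter.due(now):
        redraws += 1  # here the texture would be re-rendered
```

In this simulated second the loop spins 120 times but the texture is redrawn only 30 times.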

The kind of coding to populate/update and render such a canvas texture would be a little different compared to what people are currently doing, but it would be well worth it - because you could literally have 10+ canvas textures computed/updated and generated outside the main loop, with the only required synchronization point being the list of subscriptions (properties, and FlightGear APIs) - and obviously the final stage where some locking will be required to fetch the generated Canvas FBO from the OSG worker thread.

And this, too, could be facilitated by some work that Richard has shared with the community, namely "Emesary" - which he is already using to hook up a threaded garbage collector to Nasal/FlightGear.

Admittedly, all of this may seem a little roundabout and maybe even complex - but short of using a simple "worker thread" setup, threading Nasal code is unlikely to work due to the sheer number of architectural restrictions in FlightGear itself.

Fixing up the Canvas system to support an optional mode where the property tree is a private one, not shared with the main fgfs tree, is comparatively straightforward however - right now, Nasal itself is integrated in the form of a singleton, so that would need to change - but that's under way already, due to unit testing work that bugman has been working on recently.

More recently, Jules has been working on CompositeViewer support, which is another promising option - because compositeviewer means that a completely independent scene graph can be easily processed and shown, without cluttering up the main loop.
Please don't send support requests by PM, instead post your questions on the forum so that all users can contribute and benefit
Thanks & all the best,
Hooray
Help write next month's newsletter !
pui2canvas | MapStructure | Canvas Development | Programming resources

Re: A320-family development

Postby Octal450 » Sun Aug 30, 2020 4:58 pm

Just pointing out that this should be done on the FG side, not the aircraft side... I suggest a separate thread to chat about it rather than here.

merspieler wrote: Curtis: keep in mind, when comparing to PMDG, that Boeings are less complex than Airbuses.

That is not true at all.

Kind Regards,
josh
Last edited by Octal450 on Sun Aug 30, 2020 11:36 pm, edited 1 time in total.
Skillset: JSBsim Flight Dynamics, Systems, Canvas, Autoflight/Control, Instrumentation, Animations
Aircraft: A320-family, MD-11, MD-80, Contribs in a few others

Octal450's GitHub|Launcher Catalog
|Airbus Dev Discord|Octal450 Hangar Dev Discord

Re: A320-family development

Postby Johan G » Sun Aug 30, 2020 10:33 pm

Octal450 wrote in Sun Aug 30, 2020 4:58 pm: I suggest a separate thread to chat about it rather than here.

Fixed that for you (all). :wink:
Low-level flying — It's all fun and games till someone looses an engine. (Paraphrased from a YouTube video)
Improving the Dassault Mirage F1 (Wiki, Forum, GitLab. Work in slow progress)
Some YouTube videos

Re: Nasal, multithreading and performance

Postby Hooray » Thu Sep 03, 2020 9:34 am

legoboyvdlp wrote: Threads aren't really viable in Nasal, as the property tree isn't thread-safe. It can be done for certain things, but you need to be very very very careful with it.

Pointing out that the property tree isn't thread-safe, and that therefore threading in Nasal isn't viable, is not exactly helpful.

Fixing up the property tree to be thread-safe (or using an alternate threadsafe implementation, e.g. the one from boost) would not be too difficult, but it would also not solve anything - when it comes to multi-threading, you want coarse multi-threading, and not finely-grained threading at the property level.

Whenever people point out how they cannot use threads due to FlightGear's property tree implementation, they don't see the full picture - it's just one part of the equation; in general, it's the whole FlightGear/Nasal bridge that cannot be considered thread-safe - however, that doesn't mean that fixing the property tree makes threading Nasal a good idea.

On average, a modern computer will have 8+ cores - so there's more to be gained from identifying "tasks" that have few external data dependencies (read: property I/O).
At that point, you can also use Nasal's threading support to thread out functions that may depend on properties: you just need to fetch/update the corresponding properties once per frame, copy them to a native Nasal data structure (which is thread-safe to do), and, if necessary, use Nasal's synchronization mechanisms (lock/mutex, semaphore).
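The data flow described above (copy inputs once per frame, compute on a private copy in a worker, synchronize with a lock and semaphores) can be sketched as follows. This is Python, not Nasal, so that it runs standalone; the property names and the computation are made up, and because of Python's global interpreter lock it shows the structure rather than an actual speed-up.

```python
import math
import threading

lock = threading.Lock()
wake = threading.Semaphore(0)  # main loop -> worker: new inputs are available
done = threading.Semaphore(0)  # worker -> main loop: outputs were written
inputs, outputs = {}, {}       # shared, only touched while holding `lock`


def worker():
    while True:
        wake.acquire()
        with lock:
            snapshot = dict(inputs)  # private copy, safe to use without lock
        if snapshot.get("stop"):
            return
        # The "expensive" part runs on the copy, not on shared state.
        speed = math.sqrt(snapshot["north-fps"] ** 2 + snapshot["east-fps"] ** 2)
        with lock:
            outputs["ground-speed-fps"] = speed
        done.release()


thread = threading.Thread(target=worker)
thread.start()

results = []
for north, east in [(3.0, 4.0), (6.0, 8.0)]:  # two simulated frames
    with lock:
        inputs.update({"north-fps": north, "east-fps": east})
    wake.release()
    # Blocking here keeps the example deterministic; a real main loop would
    # carry on and pick the result up on a later frame instead.
    done.acquire()
    with lock:
        results.append(outputs["ground-speed-fps"])

with lock:
    inputs["stop"] = True
wake.release()
thread.join()
```

The worker never reads the shared dictionaries outside the lock, which is the same discipline as copying properties into a native data structure before handing them to a thread.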

It would be relatively straightforward to come up with a dedicated Nasal execution framework where scripts don't get access to any global state/APIs, but only a "sandbox'ed" Nasal context which has only access to the native Nasal APIs (minus all the FlightGear specifics) - at that point, such a context can also run outside the main loop, at well over 100 hz (and even 500hz isn't unrealistic).

However, to do anything useful, you obviously need a way to tell FlightGear what kind of data I/O your script needs to do, i.e. which property it wants to be copied to the thread per frame, and which properties it wants to copy back to the main loop after a frame.

Basically, we could add a dedicated API (or even just XML file) so that such scripts can register property subscriptions along the lines of:

Code:
register_property_subscription("/orientation/*", "readonly");
register_property_subscription("/position/*", "readonly");
register_property_subscription("/fdm/*", "readonly");
register_property_subscription("/autopilot/*", "readonly");


at that point, the underlying FGNasalSys instance could populate a script-specific private SGPropertyNode instance to fetch/replicate the relevant state from the main loop into the worker thread.
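To show what such subscriptions could drive, here is a small Python mock-up (Python so it runs standalone). `register_property_subscription` is the hypothetical API from the snippet above, not an existing FlightGear function; the `"readwrite"` mode, `replicate` and `write_back` are additional assumptions made only for this sketch, and plain dictionaries stand in for the main and private property trees.

```python
from fnmatch import fnmatchcase

subscriptions = []  # (pattern, mode)


def register_property_subscription(pattern, mode):
    # "readwrite" is an assumed second mode; the snippet only shows "readonly".
    if mode not in ("readonly", "readwrite"):
        raise ValueError(f"unknown mode: {mode}")
    subscriptions.append((pattern, mode))


def replicate(main_tree):
    """Copy every subscribed property from the main tree into a private one."""
    return {
        path: value
        for path, value in main_tree.items()
        if any(fnmatchcase(path, pattern) for pattern, _ in subscriptions)
    }


def write_back(main_tree, private_tree):
    """Copy only properties covered by a read-write subscription back."""
    for path, value in private_tree.items():
        if any(mode == "readwrite" and fnmatchcase(path, pattern)
               for pattern, mode in subscriptions):
            main_tree[path] = value


main_tree = {
    "/orientation/heading-deg": 270.0,
    "/position/altitude-ft": 3500.0,
    "/sim/time/elapsed-sec": 12.5,
    "/instrumentation/nd/range-nm": 40,
}
register_property_subscription("/orientation/*", "readonly")
register_property_subscription("/position/*", "readonly")
register_property_subscription("/instrumentation/nd/*", "readwrite")

private_tree = replicate(main_tree)          # start of the worker's frame
private_tree["/instrumentation/nd/range-nm"] = 80
private_tree["/position/altitude-ft"] = 0.0  # read-only: must not propagate
write_back(main_tree, private_tree)          # end of the worker's frame
```

Note that `fnmatchcase` lets `*` match across `/`, so `/position/*` also covers nested children; a real implementation would have to decide whether that is wanted.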

You could then run all sorts of background computations without being restricted by the frame rate/frame spacing of the main loop.

Whether or not that makes any sense, depends obviously on the specific use-case.

But all the tooling for this kind of setup is already in place in SG/FG respectively, specifically SGThread/Emesary and SGSocketUDP.

And as a matter of fact, at least for the add-on system it would actually make sense to provide such an execution mode, because many add-ons don't need to run directly inside the main loop. That's the kind of thing where Emesary could really help move Nasal code out of the main loop without going haywire, because we don't have that many add-ons, and it would still be strictly opt-in. Imagine things like an AI traffic generator: that's the sort of thing that could easily be executed outside the main loop, and that would ideally be implemented as a separate executable, or at least as an add-on.

Either way, FlightGear's lack of a thread-safe property tree implementation certainly isn't the showstopper that some people want to make believe - even today, you can identify a handful of subsystems that would be better off by running outside the main loop, with their own dedicated property tree instances, with only well-defined synchronization/check pointing steps to replicate state as needed.

And this is in fact something that some of the original core developers worked out a long time ago, back when David Megginson and Dave Culp were still involved, they discussed over and over again that a global property tree might not make much sense for all of FlightGear, because that would quickly limit the evolution of FlightGear as a whole, but also the computing constraints put on certain subsystems that would inevitably be frame rate-limited.

In fact, David M. also suggested using a "proper" FDM instance for AI traffic, and simply introducing /fdm[0] ... /fdm[n] sub-trees per aircraft, so that the flight dynamics would not have to use pseudo-FDMs but could instead use the standard FDM (possibly with a configurable update rate), and then running one thread for each flight dynamics engine that would merely multiplex over all /fdm[0-n] instances.
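The multiplexing idea can be pictured with a toy example. This Python sketch is not JSBSim, YASim or any FlightGear code; `PointMassFdm` and `FdmMultiplexer` are hypothetical names, and `advance()` is called directly here, whereas in the design described above it would be driven by one dedicated thread per flight dynamics engine.

```python
class PointMassFdm:
    """Toy stand-in for an FDM instance: constant speed along one axis."""

    def __init__(self, speed_mps):
        self.speed_mps = speed_mps
        self.position_m = 0.0
        self.steps = 0

    def step(self, dt):
        self.position_m += self.speed_mps * dt
        self.steps += 1


class FdmMultiplexer:
    """Steps several FDM instances, each at its own configurable rate."""

    def __init__(self):
        self.slots = []  # [fdm, step_dt, accumulated_time]

    def add(self, fdm, rate_hz):
        self.slots.append([fdm, 1.0 / rate_hz, 0.0])

    def advance(self, dt):
        for slot in self.slots:
            fdm, step_dt = slot[0], slot[1]
            slot[2] += dt
            while slot[2] >= step_dt:  # catch up in fixed-size steps
                fdm.step(step_dt)
                slot[2] -= step_dt


airliner = PointMassFdm(speed_mps=100.0)  # think /fdm[0]
glider = PointMassFdm(speed_mps=50.0)     # think /fdm[1]

mux = FdmMultiplexer()
mux.add(airliner, rate_hz=8)  # nearby traffic: finer steps
mux.add(glider, rate_hz=2)    # distant traffic: coarser steps

for _ in range(16):           # 16 x 0.125 s = 2 simulated seconds
    mux.advance(0.125)
```

After two simulated seconds the first instance has taken 16 steps and the second only 4, which is the "configurable update rate" part of the suggestion.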

Back then, the reasoning was that pseudo-FDMs using simplifications like a performance database, would sooner or later hit a breakeven point, where implementing/maintaining the pseudo FDM engine could quickly become more complex than running an actual FDM instance, especially once people begin wanting to use an autopilot, or even a route-manager - they sooner or later realize that they don't mind running a real FDM in the background. David M. also repeatedly pointed out that many of these low-level optimizations are likely to become moot over time, because of the progress that was being made in CPU development at the time :wink:

Anyway, all of this was discussed a long time ago back in the early 2000s - and even back then, the lack of a thread-safe property tree certainly wasn't the real problem :D

Re: Nasal, multithreading and performance

Postby merspieler » Thu Sep 03, 2020 9:58 am

Hooray wrote in Thu Sep 03, 2020 9:34 am: or using an alternate threadsafe implementation, e.g. the one from boost


boost ain't an option... we want to get rid of boost as a dependency.

As well, I think if there's multithreaded Nasal, it should be off by default and only turned on for code that specifically asks for it, because it takes some dev skill to avoid race conditions and such. So not every dev might be up to that yet.

Re: Nasal, multithreading and performance

Postby Hooray » Thu Sep 03, 2020 10:08 am

Right, getting rid of boost is being pushed primarily by a single core developer - however, for many years, James has also been encouraging people to actually use boost in their merge requests/patches (including some that I had to redo based on James asking for boost to be used). This is now in the process of being changed, mainly because people agreed to use a more recent version of the C++ standard.

Realistically, boost is widely used in some key components of FlightGear, including some stuff written by very senior and enormously experienced core contributors like Tim and Mathias - both of whom are basically inactive these days, which makes their work unmaintained, too.

That also applies to the canvas (cppbind specifically), where some of our most sophisticated boost uses are to be found: http://wiki.flightgear.org/Deboosting_FlightGear

All of this is to say, given the state of affairs, it seems more likely for the Qt5 GUI (or even HLA/RTI, FGPythonSys) to arrive anytime soon :lol:

You're right about Nasal-based threading ideally being off by default and strictly opt-in - and indeed, it is my understanding that this is already current practice (see e.g. Thorsten's work).

Re: Nasal, multithreading and performance

Postby Johan G » Sat Sep 05, 2020 5:11 am

One off-topic new post was moved back to the original topic.

Re: Nasal, multithreading and performance

Postby tdammers » Tue Sep 08, 2020 11:23 am

One thing I'd like to say here though:

Even with coarse-grained multithreading, making the property tree thread-safe will be important. With Nasal typically being one of the big bottlenecks, especially on computers designed for multi-threaded rather than single-core performance, splitting up the Nasal code into "modules" that can run concurrently, and exploit parallel execution when possible, is still going to be necessary in order to tap into the full potential of a modern 6+ core CPU. This means that all of the following must be thread-safe:

- The property tree
- The Nasal interpreter
- Any other FG/Nasal bridges

For example, in a typical airliner, you have several EFIS screens, each with its own, largely independent Nasal code to drive a canvas surface for it; and then you have various subsystems, many of which are also mostly independent. They could all run in parallel, coordinating their activities via a thread-safe property tree.

Re: Nasal, multithreading and performance

Postby Hooray » Tue Sep 08, 2020 3:51 pm

Nasal is designed to be thread-safe.
SGPropertyNode could be used in a thread-safe fashion, especially if we restrict usage to a single SGPropertyNode (tree) per thread.

The Nasal/FG bridge is hardly thread-safe at all - however, most native Nasal APIs are thread-safe.

tdammers wrote: For example, in a typical airliner, you have several EFIS screens, each with its own, largely independent Nasal code to drive a canvas surface for it; and then you have various subsystems, many of which are also mostly independent. They could all run in parallel, coordinating their activities via a thread-safe property tree.


What I described above would actually be simpler: having a single SGSubsystemGroup inherit from SGThread to run subsets of the required subsystems inside the group, e.g.:

- FGNasalSys (minus the standard FG/Nasal bridge (extension functions and bindings))
- SGPropertyNode
- property rules
- CanvasMgr

At that point you would have a private property tree running outside the main loop that would only be accessible to the Nasal interpreter instance (FGNasalSys), the property rules system and, obviously, the canvas.

Such a setup could provide frame rates well beyond 60 Hz, so one would need to use OSG to reduce the required update/rendering rate.

"Coordination" would ideally not work via the property tree at all, for a plethora of reasons. Instead, the background thread would keep its own list of relevant property paths and copy the state from the main loop into the worker thread, to be used at an update rate of, say, 1-5 Hz. That way, the worker thread only needs a simple copy of the relevant properties (/position, /fdm, /orientation, /autopilot), and the only output would be an updated FBO (canvas texture), i.e. all inputs would be read-only.

This is the kind of setup that even OSG itself can synchronize, without having to use low-level threading primitives like mutexes or semaphores.


https://groups.google.com/forum/?nomobi ... p8HAf6a2sJ
Robert Osfield (OSG project lead) wrote: To get the render to image result to the second viewer all you need to do is assign the same osg::Image to the first viewer's Camera for it to copy to, and then attach the same osg::Image to a texture in the scene of the second viewer. The OSG should automatically do the glReadPixels to the image data, dirty the Image, and then automatically the texture will update in the second viewer.

