I don't want to check any discussion - I want the evidence that prompted your conclusion that multi-threading is the relevant factor. I'm quite certain that if I install any 13-year-old software - multi-threaded or not - it'll be quite fast on modern hardware - much faster than modern software. Because it uses 13-year-old technology to do its thing and had to cope with 13-year-old hardware.
So there's still nothing about multi-threading in what you argue, quite the opposite.
When I move the AI FDM computation from the main thread to another thread running in parallel with it, the main loop becomes shorter and can run at a higher frequency and, of course, faster.
Yes - have you ever written code for multiple CPUs? Because I have.
What you describe is the theory. In practice, though, you need extra infrastructure to synchronize the two threads properly (if you don't, you get crashes) - which means you have to make them wait at sync points and go through that sync infrastructure rather than just using variables whenever you need them.
So it may well happen that the two bits of code run much slower in parallel, because they have to wait for each other to reach the next sync point whenever they need to exchange a variable.
Basically, code has to be designed up-front to run in parallel, and the problem has to be suitable in the first place - otherwise you don't gain anything. If you were to rewrite 20 years of FG with a strict parallel design in mind, then yeah - you'd get somewhat faster code (not much, because rendering is still the main chunk). But there's no workforce to do that; we're lacking a few dozen man-years here. If you try it with the existing code, which has grown rather than been designed - good luck.