A Statistical Method for Self-Tuning Trading Systems


I am writing this article in the hope that it will become useful in the not too distant future. I have been thinking about it for some time. Autotuning is an idea that has been around for a while, but the extreme complexity of retail trading systems has made it largely inapplicable. HFT trading systems, however, are generally simple, and very similar rules are applied to hundreds of equities in a parametrized fashion. Thus, if the rules are simple and can be written as “when this, do that, resulting in this” instead of the maze-like logic of retail systems, then a set of optimal statistics on the input data (spanning an array of quotes, order book data and so on) is the premise for simple operations such as quoting, getting filled, and taking profit or a stop loss just a few ticks away. As seen in the previous Progress Apama article, these algorithms are highly unstable, so there is a constant race for newer algorithms.

Now you may ask why we would attempt autotuning of trading systems if the institutions do not (at least at the HFT level). The thing is, some of their algos are so fast that they might not even have enough time to record the data in a database, so automatic tuning, which requires extra analysis, might be a complete waste of time: the improvement it brings to the algos would be too small compared to the extra latency it adds. What follows in these articles is therefore experimental in nature and should not be taken as a solution, but rather as an attempt. I cannot tell for sure whether there is any point in trying to find, with regular statistics, recurring parameters of a stochastic process.

So, let’s presume that an array of parameters (such as individual volatilities, moving averages, executed volume, etc.) is to be fed to a database or to an internal storage array. We have to take into account that a trading signal may exist permanently, at every tick. Supposing the volume is constant, the same signal can be wired into two actions: either buy or sell. Each signal will therefore have two results, one for buying and one for selling. We could, however, create separate rules for selling.

For instance, a signal made up of a volatility and the distance between the mid quote and the moving average of mid quotes comes with: 0.03 stddev & 2 ticks. To twist it a bit, we introduce quoting, and we note 0 for no fill. Wired into a buy with a 3 tick take profit and a 2 tick stop loss, at a given time, it makes 2 ticks profit; wired into a sell, it is not filled. So (0.03 ; 2 ; 2) will be added to the buys database and (0.03 ; 2 ; 0) to the sells database.
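The recording step just described can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Record` type, the two in-memory lists and the `record_signal` helper are hypothetical names, and a real system would write to a database or to the storage array mentioned earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    stddev: float         # volatility at signal time
    distance_ticks: int   # mid quote minus its moving average, in ticks
    result_ticks: int     # outcome in ticks; 0 is the article's marker for "no fill"


# Hypothetical in-memory stand-ins for the buys and sells databases.
buys = []
sells = []


def record_signal(stddev, distance_ticks, buy_result, sell_result):
    """Store one signal twice: once with the buy outcome, once with the sell outcome."""
    buys.append(Record(stddev, distance_ticks, buy_result))
    sells.append(Record(stddev, distance_ticks, sell_result))


# The example from the text: the buy made 2 ticks, the sell was not filled.
record_signal(0.03, 2, buy_result=2, sell_result=0)
```

Note that using 0 as the "no fill" marker follows the text; a filled trade that closes flat would need a different encoding.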

And now things become interesting. In the end, after data has been gathered for a mass of signals large enough to make up the “big number” that statistics needs, we have to look through the database. First, a simple pass will group the database by action and result. In this case, we have six final states:

1. BUY: quote, get filled, take profit (BQFT)
2. BUY: quote, get filled, stop loss (BQFS)
3. BUY: quote, don’t get filled, cancel order (BQC)
4. SELL: quote, get filled, take profit (SQFT)
5. SELL: quote, get filled, stop loss (SQFS)
6. SELL: quote, don’t get filled, cancel order (SQC)

So we have six states. BUY with quote, get filled, take profit is just one of the six and will cover a large share of the data, as will each of the other states. The purpose of the statistical analysis is to have a statistic prepared for each combination of parameter values (each case). For instance, our case should look, in the end, like:

0.03, 2 : 45% BQFT, 35% BQFS, 20% BQC ; 30% SQFT, 30% SQFS, 40% SQC
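A line like the one above is simply a table of relative frequencies per case. The following minimal sketch shows one way to compute it, assuming the database can be read as a stream of (case, final state) pairs; the `outcome_shares` function and the sample counts are made up for illustration.

```python
from collections import Counter, defaultdict

BUY_STATES = ("BQFT", "BQFS", "BQC")
SELL_STATES = ("SQFT", "SQFS", "SQC")


def outcome_shares(observations, states):
    """observations: iterable of (case, state) pairs.

    Returns {case: {state: share}}, where the shares of one case sum to 1
    over the given states. Cases with no observation in `states` are skipped.
    """
    counts = defaultdict(Counter)
    for case, state in observations:
        counts[case][state] += 1

    shares = {}
    for case, counter in counts.items():
        total = sum(counter[s] for s in states)
        if total == 0:
            continue
        shares[case] = {s: counter[s] / total for s in states}
    return shares


# Made-up sample: 20 buy-side observations of the case (0.03 stddev, 2 ticks).
case = (0.03, 2)
sample = [(case, "BQFT")] * 9 + [(case, "BQFS")] * 7 + [(case, "BQC")] * 4
buy_shares = outcome_shares(sample, BUY_STATES)
# buy_shares[case] holds roughly 45% BQFT, 35% BQFS and 20% BQC
```

Running the same function with `SELL_STATES` over the sell-side observations would give the second half of the line.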

After the initial mass of signals and their results is collected, statistics have to be computed for each combination of the parameters. Because the parameters are real numbers with many digits, you may end up with as many cases as there were input signals, which makes statistics impossible. So some data aggregation has to be applied. For instance, for each parameter, the quartiles can be calculated and then the number of bins derived with the Freedman-Diaconis formula (bin width 2·IQR/n^(1/3), with the number of bins following from the data range); this is only an example, and other methods can be applied. This will yield a different number of bins for each parameter.

Then a new array will be created, in which each case carries a label denoting the bin it falls into. For instance, the 0.03 stddev is labeled “Bin 1” on the stddev scale and the 2 tick distance is labeled “Bin 2” on the tick distance scale. The total number of cases is less than or equal to the product of the bin counts of all criteria. If you have 3 criteria with 10 bins each, that can be as many as 10 × 10 × 10 = 1000 cases. More likely, a core set of cases will have many situations recorded, while others will barely have one. However, the flatter (more platykurtic) the distribution of situations across cases, the stronger the need for a “bigger number” of situations. After applying the statistics, our database statistics will look like:

Bin 1, Bin 2 : 45% BQFT, 35% BQFS, 20% BQC ; 30% SQFT, 30% SQFS, 40% SQC
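To make the aggregation step concrete, here is a small sketch of the Freedman-Diaconis rule applied to one parameter, followed by the labeling of a value with its bin. It assumes equal-width bins between the smallest and largest recorded value and the “inclusive” quartile method of Python's standard library; the function names are hypothetical, and the labels it produces on real data need not match the “Bin 1, Bin 2” example.

```python
import math
import statistics


def fd_bin_count(values):
    """Number of equal-width bins suggested by the Freedman-Diaconis rule.

    Bin width is 2 * IQR / n**(1/3); the count is the data range divided
    by that width, rounded up. Degenerate inputs fall back to one bin.
    """
    n = len(values)
    if n < 2:
        return 1
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    spread = max(values) - min(values)
    if iqr == 0 or spread == 0:
        return 1
    width = 2 * iqr / n ** (1 / 3)
    return max(1, math.ceil(spread / width))


def bin_label(value, lo, hi, bins):
    """1-based index of the equal-width bin of [lo, hi] that contains value."""
    if hi == lo:
        return 1
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1) + 1   # clamp so that hi lands in the last bin


# Made-up parameter history: the integers 1..100.
history = list(range(1, 101))
bins = fd_bin_count(history)                 # 5 bins for this sample
label = bin_label(50, min(history), max(history), bins)   # falls into bin 3
```

Each parameter gets its own bin count this way, and a case is then the tuple of bin labels of its parameters.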

What matters is that relevant statistics are extracted. If this is the current case, what would the trading decision be? Go for a buy, with a take profit in 45% of the situations in which this case appears? Only it also has to withstand 35% stop losses… The sell side is even worse, with stop loss situations roughly as frequent as take profit ones. So this case has to be decided as a buy case, although with reservations. Unless the profits are relevant enough, there is no decision to be taken. In any event, the algo must have a decision module that picks only relevant cases for trading. But how fast can the machine be? After the initial data is gathered, each new signal (which can even be a new tick, in an HFT-like algo) will trigger a massive recalculation of bin sizes over the entire database to update the statistics, while the trading decision still has to be delivered and executed in real time.
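One possible shape for such a decision module is an expected-value filter. The sketch below is an assumption, not a prescription: it values a take profit at 3 ticks and a stop loss at 2 ticks (the sizes used in the earlier example), ignores fees and slippage, and uses hypothetical thresholds `min_edge` and `min_obs` to decide whether a case is relevant enough to trade.

```python
def expected_ticks(p_take_profit, p_stop_loss, tp_ticks=3, sl_ticks=2):
    """Expected result per signal, in ticks; unfilled quotes contribute 0."""
    return p_take_profit * tp_ticks - p_stop_loss * sl_ticks


def decide(buy, sell, n_obs, min_edge=0.5, min_obs=30):
    """buy / sell: (p_take_profit, p_stop_loss) for the current case.

    Returns "BUY", "SELL" or None when the case is not worth trading.
    min_edge (ticks) and min_obs (sample size) are illustrative thresholds.
    """
    if n_obs < min_obs:
        return None                      # too few situations recorded for this case
    edges = {
        "BUY": expected_ticks(*buy),
        "SELL": expected_ticks(*sell),
    }
    side = max(edges, key=edges.get)
    return side if edges[side] >= min_edge else None


# The case from the text: buy side 45% / 35%, sell side 30% / 30%.
buy_edge = expected_ticks(0.45, 0.35)    # about 0.65 ticks per signal
sell_edge = expected_ticks(0.30, 0.30)   # about 0.30 ticks per signal
choice = decide((0.45, 0.35), (0.30, 0.30), n_obs=200)
```

With these assumed payoffs the case comes out as a buy, but a stricter `min_edge` (say 1 tick) would reject it, which matches the “buy, with reservations” reading.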

Remember my article about the CEP engine. There the CEP is envisioned as a recognition and trigger module, but what if the CEP is first used as a recording device? What if the CEP is wired to the statistics engine? How much latency would this add to the calculations? Can we still talk about real time?

2 Responses to “ A Statistical Method for Self-Tuning Trading Systems ”

  1. Investeo on May 25, 2010 at 3:54 pm

    Bogdan, keep up the good work. What do you think of getting the processing power from GPUs? What I have had in mind for a few months is writing a wrapper from NVIDIA’s CUDA to MQL5 and moving part of the calculations to GPU cores. They can perform massive calculations in parallel. What convinced me is that Bloomberg is using GPUs for options calculations. I even bought a CUDA-enabled laptop to do the job, but I am still struggling to find the free time to improve my skills and complete the task. Please see http://www.nvidia.co.uk/object/cuda_finance_uk.html and you will see what GPUs are capable of.


  2. Bogdan Caramalac, MQLmagazine sr.editor on May 25, 2010 at 6:34 pm

    Investeo, an application has to be written specifically to use the GPU. It’s not only Bloomberg using it; nearly all the hedge funds are writing their own platforms to use GPUs. After all, if you have a good video card in your computer, why let the GPU sit idle? However, as you can see here, MetaQuotes decided not to use CUDA technology for the Strategy Tester. Of course, they were interested in the cloud computing feature, with the introduction of remote agents, but some extra power coming from the GPU should not have been wasted. Also, I don’t know whether the main terminal is able to use this technology. I don’t see why, if GPUs are “slow in processing of general purpose algorithms”, hedge funds would be interested in using this technology. All I know is that when these guys are interested in something, you had better go get it, because they always know better.