Design pattern for in-line PosSizer
Author: LenMoz
Creation Date: 9/27/2018 4:42 PM

LenMoz

#1
QUOTE:
there's an implementation logistics problem doing it the other way around

I have a prototype that runs bar by bar across all symbols. You might be interested in the design. The key is to code the strategy as a class having three methods: Initialize, TradeOneBar, and Finalize. The design makes use of global storage. It runs as a multi-symbol backtest, letting WL present the symbols sequentially in the normal fashion.
1. On the first symbol, it sets up date variables to track the symbol having the most bars and its count, because not all symbols may have the same date range. These are stored to global.
2. For each symbol, it instantiates the strategy class, myclass, and runs myclass.Initialize. "Initialize" contains all the DataSeries development and other initialization typically done ahead of the bar loop. It stores the myclass object in global storage as _symbol. If the symbol has the most timestamps so far, it updates the globals from step 1.
3. Finally, at the last symbol, the work begins, handled by a custom PosSizer class. This class will hold signals from all symbols.
a. Gets the list of timestamps for the symbol having the most (per global)
b. For each timestamp (bar) and each symbol, retrieves the corresponding myclass from global and, if its Bars contain the timestamp, runs its TradeOneBar method. The last active position is passed in, so the WLP "IsLastPositionActive" test is replaced by a check that lastPosition is not null. myclass is then re-saved to global to keep its variables up to date.
c. At each bar, after the last symbol, PosSizer logic adjusts or removes signals per PosSizer rules.
d. After the last timestamp, each myclass is retrieved from global and its "Finalize" method is run
4. The null "Override in script" WLP PosSizer sets things up for the Visualizers, etc.

That's the pattern. There's overhead in casting myclass repeatedly, but the pattern can work for some percentage of strategies.
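
The steps above can be sketched in a language-neutral way. The following Python is only an illustration of the control flow, not WealthScript: GLOBALS stands in for global storage, the 2-bar average rule and the max_entries_per_bar cap are invented placeholders for the real strategy and PosSizer rules, and all function names are hypothetical.

```python
from dataclasses import dataclass

GLOBALS = {}  # hypothetical stand-in for the platform's global storage


@dataclass
class Signal:
    symbol: str
    timestamp: int
    action: str  # "buy" or "sell"


class MyClass:
    """Per-symbol strategy split into Initialize / TradeOneBar / Finalize."""

    def __init__(self, symbol, bars):
        self.symbol = symbol
        self.bars = bars  # {timestamp: close}
        self.avg = {}
        self.finalized = False

    def initialize(self):
        # Everything normally done ahead of the bar loop: here a 2-bar average.
        stamps = sorted(self.bars)
        for prev, cur in zip(stamps, stamps[1:]):
            self.avg[cur] = (self.bars[prev] + self.bars[cur]) / 2

    def trade_one_bar(self, ts, last_position):
        # "last_position is None" stands in for "no active position".
        if ts not in self.avg:
            return None
        close = self.bars[ts]
        if last_position is None and close > self.avg[ts]:
            return Signal(self.symbol, ts, "buy")
        if last_position is not None and close < self.avg[ts]:
            return Signal(self.symbol, ts, "sell")
        return None

    def finalize(self):
        self.finalized = True


def present_symbol(symbol, bars):
    """Steps 1 and 2: run once per symbol as the backtest presents it."""
    strategy = MyClass(symbol, bars)
    strategy.initialize()
    GLOBALS["_" + symbol] = strategy
    if len(bars) > GLOBALS.get("max_bar_count", 0):
        GLOBALS["max_bar_count"] = len(bars)
        GLOBALS["max_bar_symbol"] = symbol


def run_all_bars(symbols, max_entries_per_bar=1):
    """Step 3: run once, at the last symbol."""
    longest = GLOBALS["_" + GLOBALS["max_bar_symbol"]]
    positions = {}  # symbol -> entry timestamp of the open position
    accepted = []
    for ts in sorted(longest.bars):  # 3a: timestamps of the longest symbol
        entries = []
        for symbol in symbols:  # 3b: every symbol that has this bar
            strategy = GLOBALS["_" + symbol]
            if ts not in strategy.bars:
                continue
            signal = strategy.trade_one_bar(ts, positions.get(symbol))
            GLOBALS["_" + symbol] = strategy  # re-save (a no-op in Python)
            if signal is None:
                continue
            if signal.action == "sell":
                del positions[symbol]
                accepted.append(signal)
            else:
                entries.append(signal)
        for signal in entries[:max_entries_per_bar]:  # 3c: placeholder sizing rule
            positions[signal.symbol] = ts
            accepted.append(signal)
    for symbol in symbols:  # 3d
        GLOBALS["_" + symbol].finalize()
    return accepted


# Made-up data; BBB has fewer bars than the others.
data = {
    "AAA": {1: 10, 2: 12, 3: 11, 4: 9},
    "BBB": {2: 5, 3: 6, 4: 7},
    "CCC": {1: 1, 2: 3, 3: 4, 4: 5},
}
for symbol, bars in data.items():
    present_symbol(symbol, bars)
signals = run_all_bars(list(data))
```

Because every symbol's signal for a bar is collected before the sizing rule runs, a strategy only ever sees positions that survived sizing on earlier bars.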

superticker

#2
So at what point are WL indicators being evaluated: Initialize, TradeOneBar, or Finalize?

And are all the results of all these WL indicators being saved in global cache as a myclass _symbol instance?

QUOTE:
There's overhead in casting myclass repeatedly,...
You're saying since the cache misses are done in "blocks" (and not elementwise), where a block holds all relevant symbol data contiguously in global memory, then cache misses are minimized. So you're solving the Principle of Locality problem by grouping all relevant DataSeries objects into these contiguous myclass instance blocks--very clever.

This design pattern works because the L3 processor cache is partially associative. It updates blockwise on a cache miss, so it brings in multiple DataSeries results--if they are stored contiguously--on a single cache miss. (The L1 cache is more fully associative, so this trick won't work for the fastest processor cache level; there will be a more serious speed penalty at that level.)

QUOTE:
...[this design] pattern can work for some percentage of strategies.
It can work for strategies that can keep the myclass instances small by using only a few WL indicators. You need to define the DataSeries fields within myclass so they are stored as contiguously as possible to maximize your Principle of Locality there. Have you thought about converting to single precision inside myclass so you double your Principle of Locality?
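
The arithmetic behind the single-precision suggestion can be checked with a small sketch. This is plain Python using the standard array module, not WL code, and the 64-byte cache line is an assumption about typical hardware; whether any of this helps inside WL depends on how DataSeries stores its values, which is not shown here.

```python
from array import array

CACHE_LINE_BYTES = 64  # assumption: a common cache line size, not measured

doubles = array("d", [0.0] * 1000)  # C double, 8 bytes per value
singles = array("f", [0.0] * 1000)  # C float, 4 bytes per value


def values_per_line(arr):
    """How many contiguous values fit in one assumed cache line."""
    return CACHE_LINE_BYTES // arr.itemsize


doubles_per_line = values_per_line(doubles)
singles_per_line = values_per_line(singles)

# The cost of the conversion: 0.1 no longer round-trips exactly.
rounded = array("f", [0.1])[0]
```

Halving the element size doubles the number of values per line, at the price of roughly seven significant decimal digits instead of about sixteen.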

It's a clever design. So how big is your Data Range? How many indicator results are being stored in myclass, and what's the speed penalty of doing it this way ... 5 to 7 times?

---
I have to wonder if the speed penalty is worth this across-symbol bar-by-bar evaluation method? Is knowing what other symbols are doing really going to significantly influence trading on a given bar for a given symbol? Will it make more money? I'm thinking it will only help some of the time, but I could be wrong.

LenMoz

#3
QUOTE:
So at what point are WL indicators being evaluated: Initialize, TradeOneBar, or Finalize?
Initialize, which is all the code ahead of the "for (bar=…" loop. "TradeOneBar" is the code inside that "for" loop. "Finalize" may be overkill; it is intended for cases where the strategy captures its own run info, and is built in now in case I ever find a use for it.
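
As a rough single-symbol illustration of that split, here is a Python sketch (not WealthScript) that shows a conventional bar-loop strategy next to the same logic divided into the three methods. The moving-average rule, the class name SplitStrategy, and the run_split driver are all hypothetical stand-ins.

```python
def moving_average(closes, period):
    """Simple moving average; None until enough bars are available."""
    out = [None] * len(closes)
    for i in range(period - 1, len(closes)):
        out[i] = sum(closes[i - period + 1 : i + 1]) / period
    return out


def boilerplate(closes, period=3):
    """Conventional layout: setup, then one bar loop."""
    sma = moving_average(closes, period)  # ahead of the loop
    trades, in_position = [], False
    for bar in range(period - 1, len(closes)):  # inside the loop
        if not in_position and closes[bar] > sma[bar]:
            trades.append((bar, "buy"))
            in_position = True
        elif in_position and closes[bar] < sma[bar]:
            trades.append((bar, "sell"))
            in_position = False
    return trades


class SplitStrategy:
    """Same logic, split so an outside driver owns the bar loop."""

    def __init__(self, closes, period=3):
        self.closes = closes
        self.period = period
        self.sma = None
        self.bars_traded = 0

    def initialize(self):
        # The code that sat ahead of the loop.
        self.sma = moving_average(self.closes, self.period)

    def trade_one_bar(self, bar, last_position):
        # The code that sat inside the loop; returns an action or None.
        self.bars_traded += 1
        if last_position is None and self.closes[bar] > self.sma[bar]:
            return "buy"
        if last_position is not None and self.closes[bar] < self.sma[bar]:
            return "sell"
        return None

    def finalize(self):
        # Optional run info captured by the strategy itself.
        return {"bars_traded": self.bars_traded}


def run_split(closes, period=3):
    """Hypothetical driver: the caller, not the strategy, steps the bars."""
    strategy = SplitStrategy(closes, period)
    strategy.initialize()
    trades, last_position = [], None
    for bar in range(period - 1, len(closes)):
        action = strategy.trade_one_bar(bar, last_position)
        if action == "buy":
            last_position = bar
        elif action == "sell":
            last_position = None
        if action:
            trades.append((bar, action))
    return trades, strategy.finalize()


closes = [10, 11, 12, 11, 9, 10, 13]  # made-up prices
split_trades, run_info = run_split(closes)
```

With a single symbol the two layouts produce the same trades; the split only matters once an outside driver interleaves bars from several symbols.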

QUOTE:
all relevant DataSeries objects in these contiguous myclass instance blocks
Those DataSeries plus any other strategy variables, and the code.

I just did a 12-year run of the Dow 30 in about 7 seconds. I typically run 5 years or more across 100 symbols.
QUOTE:
converting to single precision
Most of the work is using DataSeries that I can't control. I would expect no gain.

superticker

#4
QUOTE:
I just did a 12-year run of the Dow 30 in about 7 seconds.
That's good. I tried the same thing with my production strategy (evaluating 15+ indicators) and it took 20 seconds. Interesting. I'm employing some robust statistical methods, one requiring a triple sort, and that takes time. I typically simulate/optimize over 800 daily bars.

LenMoz

#5
QUOTE:
I have to wonder if the speed penalty is worth this across-symbol bar-by-bar evaluation method?

From day one I've been bothered by the fact that the boilerplate strategy design is out of sync with post-PosSizer holdings much of the time. It's worth something.

BTW, my Dow test only develops two DataSeries, not exactly typical.

superticker

#6
QUOTE:
From day one I've been bothered by the fact that the boilerplate strategy design is out of sync with post-PosSizer holdings much of the time.
Quite honestly, I've only experimented with PosSizers. I think my coding efforts are better spent modeling external variables to better predict stock behavior. But certain strategies do perform better in certain market climates. For example, a buy-high strategy performs best when the market is bullish, and becomes problematic when the market is bearish. So using a PosSizer to throttle a particular strategy based on market climate/behavior makes good sense.

Perhaps I have this totally wrong, but I thought a "standard" WL PosSizer was designed to throttle the entire strategy "collectively" with its associated dataset, and not on a symbol-by-symbol basis. Do I have this wrong? But I appreciate your design is intended to work on a symbol-by-symbol basis, so your design goal is somewhat different. I think this finer granularity could be an advantage, but I'm not sure.

QUOTE:
my Dow test only develops two DataSeries, not exactly typical.
And if you keep your myclass object instances small (like two DataSeries objects), this approach should work if these instances fit into an L3/L2 cache block. But if these instances grow larger than a cache block (thereby creating multiple cache block misses), then you'll have a slowdown. At any rate, your approach is very clever at minimizing L3 cache block misses when stepping through slower-moving myclass objects.