Spearman DataSeries is slow in optimization
Author: Carova
Creation Date: 6/15/2017 6:39 PM

Carova

#1
Hi Eugene!

I seem to be having a caching issue with the TASC Spearman indicator during an Optimization. My strategy has the following line:
CODE:
Please log in to see this code.

The Value for SCPeriod can only be five different values (20, 40, 60, 80, & 100). The WL that I am using has several hundred symbols.

The optimization takes about 7 minutes per single evaluation with the calculation of DataSeries "sc" (since it is very computationally intensive) but only 18 seconds when I comment out this calculation.

Since the DataSeries "sc" can only have 5 possible values for the period, this DataSeries should be cached rather quickly, and computation times should speed up dramatically as the optimization run proceeds. But that is not happening.

I looked at the code for Spearman and it shows (to my VERY untrained eye) that it should be caching the DataSeries but it is obviously not doing so.

1) Is there any issue with the code for the calculation of the Spearman?
2) Is there any way that I can verify that the caching is actually taking place?

Thanks!
Vince



Eugene

#2
Hi Vince,

The nested loops are what make the code computationally intensive; Spearman is free from caching issues. As you can see, the code below instantiates 5 series and extracts them from Bars.Cache instead of plotting them directly:

CODE:
Please log in to see this code.

Eugene

#3
P.S. Comment out this line to see for yourself:

CODE:
Please log in to see this code.

Carova

#4
Thanks Eugene for the quick reply. I see that what you coded shows that it is indeed being cached in that application.

Could it be that the caching is not occurring in the optimization? As you pointed out, the calculation of the Spearman series is extremely intensive. Therefore I would expect the first calculation with a given period to take time, but any later request for the series to be served from the cache nearly instantaneously. I am not seeing that. Is there any way for me to test whether caching is indeed occurring in an optimization?

As a side note, I have always expected each iteration of an optimization to progress faster than the previous ones, since more of the potential DataSeries would already have been cached, obviating the need for their calculation. I have not observed that.

Vince

Carova

#5
Hi Eugene!

Here is a script that demonstrates the issue.
CODE:
Please log in to see this code.


I am optimizing it using PSO with the Dow 30 as the WL, from 1/1/1995 to 6/1/2010.

When I time the calculations the results are attached.

The times are dominated by the calculation of the Spearman. If the cache were supplying the results once all the possible periods had been calculated, the time for each instance would drop back to 3 seconds, but it doesn't.

Vince

Eugene

#6
Vince, I don't find any "issues" here. Each optimization run is a fresh new run. Simply put, Bars.Cache exists for that single run and is then rebuilt. You're probably confusing it with some sort of global memory. The good news is that Wealth-Lab does have global memory, so you could use it to cache the DataSeries between runs if its calculation time is a concern. Check out SetGlobal and GetGlobal in the QuickRef. There are examples in forum posts, too.
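
To illustrate the difference between a per-run cache and global memory, here is a minimal, language-neutral sketch in Python. It is not WealthScript and does not use the Wealth-Lab API: `GLOBAL_STORE` is a hypothetical stand-in for what SetGlobal/GetGlobal would provide, and `expensive_series` is a placeholder (a rolling mean), not the Spearman calculation. Each call to a `run_*` function models one optimization run.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for global memory that survives between runs.
GLOBAL_STORE: Dict[str, List[float]] = {}


def expensive_series(prices: List[float], period: int, counter: Dict[str, int]) -> List[float]:
    """Placeholder for a costly indicator (a rolling mean, NOT Spearman)."""
    counter["calls"] += 1  # count how often the series is really computed
    out = []
    for i in range(len(prices)):
        window = prices[max(0, i - period + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def run_with_per_run_cache(prices: List[float], period: int, counter: Dict[str, int]) -> List[float]:
    """One optimization run whose cache is created fresh and discarded afterwards."""
    run_cache: Dict[str, List[float]] = {}  # rebuilt on every run
    key = f"period={period}"
    if key not in run_cache:
        run_cache[key] = expensive_series(prices, period, counter)
    return run_cache[key]


def run_with_global_store(prices: List[float], period: int, counter: Dict[str, int]) -> List[float]:
    """One optimization run that looks in the shared store before computing."""
    key = f"period={period}"
    if key not in GLOBAL_STORE:
        GLOBAL_STORE[key] = expensive_series(prices, period, counter)
    return GLOBAL_STORE[key]


def optimize(run: Callable, prices: List[float], periods: List[int]) -> int:
    """Execute one run per parameter value; return the number of real computations."""
    counter = {"calls": 0}
    for period in periods:
        run(prices, period, counter)
    return counter["calls"]
```

With six runs over the periods 20, 40, 20, 40, 60, 20, the per-run version computes the series six times, while the version backed by the shared store computes it only three times, once per distinct period.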

Carova

#7
Hi Eugene!

Do I understand correctly that WLP clears the cache memory after each run? If that is the case, then why does my memory usage balloon during an optimization run?

Vince

Eugene

#8
Garbage collection is controlled by the .NET Framework. It's a more intricate process than simply clearing memory after a run.

Carova

#9
Hi Eugene!

I have read the info on Get/Set Global in the Help and the few items here on the forum. From what I see, there does not appear to be a way to cache all of the computationally intensive DataSeries (for every period value and every symbol of a WL in a portfolio simulation) so that they can be retrieved for subsequent use in an optimization run.

Am I missing something?

Furthermore, if it is indeed possible, if there are two scripts running there seems to be a need to prevent using the same Global series names for what could potentially be different computations. Is that correct?

Vince

Eugene

#10
Hi Vince,

When caching a series via SetGlobal, define a unique string that is based on a combination of Bars, .BarScale and .BarInterval, and Spearman's period. That's the point of using global memory across different tools in WLP.

Carova

#11
Hi Eugene!

When you suggest "Bars" am I to assume that the Symbol name will be included?

Can you give me a few lines of code which I can use as the starting point for the Set and Get of these series? (Yes, I know, I need to learn how to program in C# much better! ;) )

Vince

Eugene

#12
QUOTE:
When you suggest "Bars" am I to assume that the Symbol name will be included?

Right, this is what I mean.

On second thought, just the Spearman period won't be enough because it doesn't reflect the symbol name. So your string has to include key parameters like the following:

1) sc.Description [where sc is the Spearman DataSeries; just the period won't be enough], and
2) three properties of the Bars object: .Symbol, .Scale and .BarInterval.

How to dress that string is up to you, just be sure that it's correctly parsed by Strategies that retrieve using GetGlobal.
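
As a rough illustration of such a string, here is a small Python sketch (not WealthScript). The function names, the `|` separator, and the sample values such as "Spearman(Close,20)" are assumptions made for the example; the actual text of sc.Description and the formatting of the scale and interval may differ.

```python
from typing import Tuple

SEPARATOR = "|"  # assumed delimiter; any character absent from the parts would do


def make_cache_key(description: str, symbol: str, scale: str, bar_interval: int) -> str:
    """Join the series description and the three Bars properties into one key."""
    parts = [description, symbol, scale, str(bar_interval)]
    for part in parts:
        if SEPARATOR in part:
            # A separator inside a part would make the key impossible to parse back.
            raise ValueError(f"part contains separator: {part!r}")
    return SEPARATOR.join(parts)


def parse_cache_key(key: str) -> Tuple[str, str, str, int]:
    """Split a key produced by make_cache_key back into its four parts."""
    description, symbol, scale, bar_interval = key.split(SEPARATOR)
    return description, symbol, scale, int(bar_interval)
```

For example, `make_cache_key("Spearman(Close,20)", "AAPL", "Daily", 0)` yields `"Spearman(Close,20)|AAPL|Daily|0"`. Because the symbol, scale and interval are all part of the key, two scripts that compute the same indicator on different symbols or time frames cannot overwrite each other's entries.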

QUOTE:
Can you give me a few lines of code which I can use as the starting point for the Set and Get of these series?

For example, check out Cone's post #55 in this thread: How to generate orders in real time during the trading day?

Carova

#13
Hi Eugene!

Here is my (poor) attempt to do it. It does not Compile. All sorts of errors.
CODE:
Please log in to see this code.


What am I doing wrong?

Vince

Eugene

#14
Vince, here's a quick example:
CODE:
Please log in to see this code.

Carova

#15
Thanks Eugene! That answers my question.

BTW, is there a way to access the "Data Range" for the Portfolio simulation so that I can include that in the identifier?

Vince

Eugene

#16
Fortunately for you, there is: GetDataRange in C.Components. But I'm not sure why is it required?

Carova

#17
Thanks Eugene!

QUOTE:
But I'm not sure why is it required?


I need to separate the Training Range (with the cached series for the Spearman) from the Testing Range (where the cached series would be incorrect).

Vince
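
A minimal Python sketch (not WealthScript) of how the data range could be folded into the identifier so that training and testing series never share a cache entry. The function name, the key layout and the testing dates are hypothetical; only the training range (1/1/1995 to 6/1/2010) comes from the optimization described earlier in the thread, and how GetDataRange reports its values is not shown here.

```python
from datetime import date


def make_range_key(base_key: str, start: date, end: date) -> str:
    """Append the simulation's data range to an existing cache key."""
    return f"{base_key}|{start.isoformat()}..{end.isoformat()}"


# Hypothetical base key built from the series description and Bars properties.
base = "Spearman(Close,20)|AAPL|Daily|0"

# Training range taken from the thread; testing range is made up for the example.
training_key = make_range_key(base, date(1995, 1, 1), date(2010, 6, 1))
testing_key = make_range_key(base, date(2010, 6, 2), date(2017, 6, 1))

# A store filled during training does not answer a lookup made during testing.
store = {training_key: [0.0, 0.5, 1.0]}
testing_hit = testing_key in store
```

Here `testing_hit` is `False`, so a run over the testing range would recompute the series instead of picking up values cached for the training range.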