"x-y" Linear Regression?
Author: Carova
Creation Date: 5/17/2019 10:24 AM

Carova

#1
Hi Eugene!

The current linear regression indicator displays the results against time. How do I go about performing an "x-y" regression, i.e. a regression of a series against a series? Thanks!

Vince

Eugene

#2
Hi Vince,

If you come up with its C# code there will be! :-)

Carova

#3
Hi Eugene!

I know it is built into .NET, but I am at a loss as to how to access it to make a WL indicator from it. :( (profound C# ignorance)

Vince


Eugene

#4
Are you sure you're not confusing .NET itself with an external library like ALGLIB or Math.NET? Speaking of built-in things, to my knowledge the Microsoft Chart control (MSChart) supports various regressions (linear, log, polynomial, etc.) of a time series. MS123 Visualizers uses it, for instance. If you have a specific example of what you're trying to accomplish, don't hesitate to point me to it.

superticker

#5
I use Math.NET all the time, and they do have some nice regression fitting routines. Check out their examples: https://numerics.mathdotnet.com/Regression.html

Now nothing in Math.NET does plotting, only curve fitting. If you need to plot (Do you?), and you're writing a Performance Visualizer (Are you?), then I would check out: https://code.msdn.microsoft.com/mschart

If you need a plot and you're not writing a Performance Visualizer, then I would use either Excel or R. Both interface with any .NET application like Wealth-Lab. Excel might be easier to get started with, but if you already have R installed, then I would use that instead. And R does regression fitting too, which is a good option if you don't want to do the fitting as part of your strategy code. (Personal note: The regression fitting I do is part of my strategy execution, so I use Math.NET exclusively for that.)

I can post my WL library routine for fitting y = beta00*1/x + beta0 + beta1*x with Math.NET if that helps, which covers raising x to powers of -1, 0, and 1 as you can see from the equation. I skip the x² term because it's too sensitive to outlier behavior. You "might" include a (ln x) or exp(x) term in there, but I wouldn't put too many degrees of freedom in your regression model because that would overfit it for a fuzzy system problem.
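A minimal sketch of that three-term model, written in stdlib-only Python for brevity rather than with Math.NET. The function name fit_inverse_linear is hypothetical, and solving the normal equations by Gauss-Jordan elimination is an assumption of this sketch, not necessarily how the library routine mentioned above works. It assumes no x value is zero.

```python
def fit_inverse_linear(xs, ys):
    """Least-squares fit of y = b_inv/x + b0 + b1*x; returns (b_inv, b0, b1).

    Hypothetical helper for illustration; x values must be non-zero.
    """
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least 3 paired points")
    # Each row holds the three basis functions (1/x, 1, x) evaluated at one x.
    rows = [(1.0 / x, 1.0, x) for x in xs]
    # Normal equations (A^T A) beta = A^T y as an augmented 3x4 matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: x values do not vary enough")
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(3):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [a - factor * b for a, b in zip(m[k], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.0 / x + 3.0 + 0.5 * x for x in xs]  # synthetic, noise-free data
    print(fit_inverse_linear(xs, ys))
```

Adding a (ln x) or exp(x) term would mean one more basis function per row and a 4x4 system, which is where the overfitting caution above starts to matter.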

---
I just had a wild and crazy idea. If it's possible to define a 2D double array:
CODE:
Please log in to see this code.
and store it in the WL global cache, then one may be able to write a general purpose Performance Visualizer to plot some arbitrary 2D thing easily. The Performance Visualizer should probably purge the xyScatterPlot array after plotting so it's not stuck in the cache. There might be some conflicts if two separate strategy instances try writing into the same global array simultaneously. Hmm, that's a downside of employing the WL global cache. Well, you could define your own protected strategy 2D array in the constructor of the strategy that wants to make the xyScatterPlot. But then that strategy instance will have to exist when the Performance Visualizer tries to plot it. Is that a problem?

Carova

#6
Thanks Eugene! Yes, I did mean Math.NET. My error!


Hi superticker!

Yes, I had seen that but my limited C# skills prevented me from even beginning to understand an approach to using it. I was attempting to use it in a script, not a Visualizer, which made the task potentially easier, but still beyond my abilities.

I did locate C# code for linear regression (https://gist.github.com/NikolayIT/d86118a3a0cb3f5ed63d674a350d75f2) and attempted to use it to construct a WL Indicator

CODE:
Please log in to see this code.


but I am at a loss for a number of items which are creating errors.

Help Eugene!

Vince



Eugene

#7
The original code doesn't convince me that you're not trying to reinvent the wheel i.e. standard LinReg indicator:

https://gist.github.com/NikolayIT/d86118a3a0cb3f5ed63d674a350d75f2
CODE:
Please log in to see this code.


At any rate, Wealth-Lab works with DataSeries which by definition is the basic data structure that represents a historical series (i.e. a List<double> with an associated List<DateTime>).

Vince, please count me out for this topic.

superticker

#8
Well, the code you have will probably work okay. Did you need an LR decomposition (your above code)? Are you fitting many x-arrays against a single y-array? If so, then an LR decomposition would be faster. I kind of thought you just wanted to make a single plot, not an array of plots for many different x-arrays. Math.NET supports LR decomposition setups too.

If all you want to do is fit a "single" line, then the Simple Linear Regression example at https://numerics.mathdotnet.com/Regression.html will do that; that example is included below. When you run this code, the Execute() statement will loop for every stock in your dataset. So just run it with a single stock, not the entire dataset.
CODE:
Please log in to see this code.

You'll need to copy the MathNet.Numerics.dll library into Wealth-Lab's install directory before you can run this. Hmm; there's a MathNet.Numerics.xml file in there too. I'm not sure if you need that one, but you can include it just to be safe.
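Since the posted code is not visible here, the following is a minimal sketch of what a simple linear regression of one array against another computes, in stdlib-only Python. The name fit_line is hypothetical; the (intercept, slope) return order is chosen to mirror Math.NET's Fit.Line, and the sample numbers are synthetic.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least 2 paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared x deviations and sum of x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    intercept, slope = fit_line([10.0, 20.0, 30.0], [15.0, 20.0, 25.0])
    print(intercept, slope)
```

Note that x here is any series, not bar number, which is the difference from the built-in LinReg indicator.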

There is a Community.Components function to convert a DataSeries to an array if you're interested, but I haven't used it: https://www.wealth-lab.com/Forum/Posts/Convert-DataSeries-to-C-Array-38617

None of this will plot, of course. That's another problem. For off-line work, I just plot with Excel. The WL discussions talk about multiple methods for getting WL data into Excel. Excel can fit regression models too. Go to the Data >> Data Analysis menu and select "regression". You may have to install the Excel Data Analysis pack if you haven't already done so.

Carova

#9
Hi Eugene! The WL LinReg indicator does create an x-y indicator where "x"=time. I want the "x" to be some other data series.

Thanks superticker!

How do I convert the code you provided so that it creates an "indicator" of length "l", i.e. it does the fit over a period of "l" bars?

Vince

superticker

#10
QUOTE:
How do I convert the code you provided so that it creates an "indicator" of length "l", i.e. it does the fit over a period of "l" bars?
Is the x-array going to be time (in bars)? If so, why don't you just use the WL LinearRegSlope.Value function http://www2.wealth-lab.com/WL5Wiki/LinearRegSlope.ashx to get a slope for a given time-period window? You don't need a general xy-plotting routine for this.

I'm not sure where you are going with this. Would you be wanting to cross correlate one time series (say an index) with another (say a stock)? See cross correlation https://en.wikipedia.org/wiki/Cross-correlation Some indexes would correlate better with different stocks. If I remember right, I "think" WL does have a correlation visualizer for doing that already, so you don't need a cross correlation for that.

Now if you're looking for leading indicators (where lag time is involved), then a cross correlation is needed. For example, we know that oil prices are correlated with gold prices. But which leads (oil or gold) and which follows? Now you need a cross correlation.

So please describe the goals of your xy-plot? What's on the x-axis, and what's on the y-axis? And what's the overall purpose?

Carova

#11
Hi superticker!

I am attempting to get the slope of the regression of two series (an x and a y). For the specific case where x=time we have LinearRegSlope, but for the general case where x != time there is no equivalent. That is what I am trying to create with this indicator. Is it clearer now?

Vince

superticker

#12
Well then, for x != time, my posted solution (Post# 8) should work for you. You just need to convert your x and y arrays to double[] before calling Fit.Line(). And as mentioned in Post# 8, there's a Community.Components To.Array call if you need to convert from DataSeries to double[]. Alternatively, you could just use a for loop (which the compiler can probably optimize better):
CODE:
Please log in to see this code.

I can't get more specific than that unless you can offer an example where x != time, because the implementation and regression equation are likely to vary for different cases. Also, if x != time, then you probably want to create a Performance Visualizer, not an indicator (which is based on a time-dependent DataSeries).
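For the case asked about earlier in the thread (the x-y slope recomputed over the trailing "l" bars), a rolling version might look like the sketch below. It is stdlib-only Python, not WealthScript: the name rolling_xy_slope is hypothetical, plain lists stand in for the double[] arrays, and leaving the warm-up bars as NaN is an assumption of this sketch. The result has one value per bar, so it is itself a function of time.

```python
import math


def rolling_xy_slope(xs, ys, period):
    """Slope of y regressed on x over the trailing `period` bars, one value per bar."""
    if len(xs) != len(ys):
        raise ValueError("x and y must be the same length")
    if period < 2:
        raise ValueError("period must be at least 2")
    out = [math.nan] * len(xs)  # bars before the first full window stay NaN
    for bar in range(period - 1, len(xs)):
        wx = xs[bar - period + 1: bar + 1]
        wy = ys[bar - period + 1: bar + 1]
        mean_x = sum(wx) / period
        mean_y = sum(wy) / period
        sxx = sum((x - mean_x) ** 2 for x in wx)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(wx, wy))
        # A flat x window has no defined slope, so report NaN for that bar.
        out[bar] = sxy / sxx if sxx > 0.0 else math.nan
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 7.0, 11.0]
    y = [2.0 * v + 1.0 for v in x]  # synthetic series with slope 2
    print(rolling_xy_slope(x, y, 3))
```

Recomputing each window from scratch costs O(n * period); running sums would be faster but harder to read.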

One implementation comment. Within a WL strategy, I try to use only WL compatible data types. So I'm reluctant to place double[] data types within a WL strategy. What you can do is create a personal code library with Visual Studio and place your double[] types in there instead. Then when you call your personal routines from your strategy, the only arrays present in your strategy will be of type DataSeries. But I would get your regression code debugged in the WL editor first, then move it into a personal *.DLL library.

My strategies are about 450 lines, but my personal libraries are 6 times that size in total. I never call external packages (like Math.NET) from within a strategy because they have WL incompatible data types.

Carova

#13
Hi superticker!

I am looking to use the xyLR indicator to track two closely related items (Unleaded Gasoline and Crude Oil Futures) for short to intermediate term hedge trades. I examined price ratios, but they are not too good for this purpose, so I am interested in exploring the slopes of a variety of time periods to see if that works better. I believe that this approach might work well for pairs-trading a number of highly correlated trading vehicles where price ratios are not suitable.

Vince

Eugene

#14
@Vince,

What's wrong with using Correlation of the ROC of each one of your closely related items, for example?

FYI:
QUOTE:
You'll need to copy the MathNet.Numerics.dll library into Wealth-Lab's install directory before you can run this. Hmm; there's a MathNet.Numerics.xml file in there too. I'm not sure if you need that one, but you can include it just to be safe.

The XML file is not required (I don't think it'd help Wealth-Lab Editor's Autocomplete much anyway) but make sure to uncheck "Downloaded from the internet" in the file's properties before copying or it will not work (Wealth-Lab will act as if it's not there)!

Here's how: How to unblock files downloaded from Internet in Windows 10

Carova

#15
Hi Eugene!

Looked at that approach, but since the two instruments are so highly correlated (>0.98) that was not useful.

QUOTE:
The XML file is not required (I don't think it'd help Wealth-Lab Editor's Autocomplete much anyway) but make sure to uncheck "Downloaded from the internet" in the file's properties before copying or it will not work (Wealth-Lab will act as if it's not there)!

Here's how: How to unblock files downloaded from Internet in Windows 10


Thanks! I am still trying to figure out how to get this into indicator form. :(

Vince

superticker

#16
QUOTE:
I am looking ... to track two closely related items (Unleaded Gasoline and Crude Oil Futures) for short to intermediate term hedge trades.
Okay. So why not decorrelate one with the other? Or am I missing something?
CODE:
Please log in to see this code.

So if the decorrelatedLine has a positive slope, then unleaded gas is doing better than crude oil, and vice versa. Pretty simple.
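The decorrelation code itself is not visible here, so the following is only one possible reading of it, sketched in stdlib-only Python: take the 1-bar ROC of each instrument and accumulate the difference. The running sum, the names roc and decorrelated_line, and the zero-filled warm-up bar are all assumptions of this sketch, and the prices are synthetic.

```python
def roc(prices, period=1):
    """Percent rate of change over `period` bars; warm-up bars are 0.0."""
    return [0.0 if i < period else 100.0 * (prices[i] / prices[i - period] - 1.0)
            for i in range(len(prices))]


def decorrelated_line(target, reference):
    """Running sum of (target ROC - reference ROC), one value per bar.

    A rising line means the target is outperforming the reference.
    """
    if len(target) != len(reference):
        raise ValueError("series must be the same length")
    line, total = [], 0.0
    for t, r in zip(roc(target), roc(reference)):
        total += t - r
        line.append(total)
    return line


if __name__ == "__main__":
    gasoline = [100.0, 110.0, 121.0]  # synthetic prices
    crude = [50.0, 50.0, 50.0]
    print(decorrelated_line(gasoline, crude))
```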

There are an unlimited number of ways to decorrelate something. I don't mean to suggest ROC is the only approach to decorrelation. For example, if you redesign VWAP so it can operate over Daily bars (the current WL version can't do that), you can decorrelate with that instead.

An echo cancellation filter is an example of decorrelation with a "lag time". Engineers use this type of filter in analog landline phone calls so it's possible to carry your two-way call with two wires rather than three; otherwise, one direction would confound the other with an echo.

---
QUOTE:
I am still trying to figure out how to get this into indicator form. :(
Just remember an indicator is for a time varying transform. If the transform isn't a function of time (not x == time), then don't make it an indicator. A DataSeries (which indicators create) is always a function of time.

Carova

#17
Hi superticker!

I have tried a number of different formulations attempting to tease out an effective way to separate out the "indicator" that might work. This has included your approach of decorrelation. In all of the cases the trading noise leads to many whipsaws. This is why I am exploring alternatives.

QUOTE:
I am still trying to figure out how to get this into indicator form. :(

QUOTE:
Just remember an indicator is for a time varying transform. If the transform isn't a function of time (not x == time), then don't make it an indicator. A DataSeries (which indicators create) is always a function of time.


What my indicator would be is the slope as a f(time).

Vince

superticker

#18
QUOTE:
In all of the cases the trading noise leads to many whipsaws.
You can take the decorrelatedLine output and run it through some kind of EMA. I would pick one that's adaptive. WL has several. Exactly what are the ticker symbols you are trying to decorrelate? I would like to look at them myself. You may be looking for an inverse correlation between these pairs that's not really there in the first place.
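As a rough illustration of the smoothing step, here is a plain (non-adaptive) EMA in stdlib-only Python. The function name ema, the 2/(period+1) smoothing factor, and seeding with the first value are assumptions of this sketch; Wealth-Lab's own EMA variants may seed and weight differently, and an adaptive EMA would vary the factor per bar.

```python
def ema(values, period):
    """Exponential moving average with factor 2/(period+1), seeded with the first value."""
    if period < 1:
        raise ValueError("period must be at least 1")
    alpha = 2.0 / (period + 1)
    out = []
    for v in values:
        # First bar: no history, so the average starts at the value itself.
        out.append(v if not out else out[-1] + alpha * (v - out[-1]))
    return out


if __name__ == "__main__":
    noisy_line = [0.0, 3.0, 1.0, 4.0, 2.0]  # synthetic input
    print(ema(noisy_line, 3))
```

A longer period removes more whipsaws but adds lag, which is the trade-off discussed later in the thread.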

QUOTE:
What my indicator would be is the slope as a f(time).
So you should be able to write an indicator that produces a DataSeries for what you want to do. Rather than using ROC(Close,1,"...") in the decorrelation calculation, you could substitute LinearRegSlope(Close,5,"...") instead. I don't especially like that solution, because you should do the decorrelation with as much acuity as possible (ROC(,1,) employs one-bar resolution) and then post-condition the decorrelatedLine with a smoothing function (some EMA) as necessary. But you can try decorrelating with LinearRegSlope(Close,5,"...") or LinearRegSlope(new ROC(Close,1,"..."),5,"...") instead. Keep it as simple as possible!

At any rate, I think employing LinearRegSlope(Close,,"...") in your indicator somehow is the way to go for finding a slope. You only need to use the Math.NET Fit.Line() call if x != time, which is not the case with your immediate problem.

Carova

#19
QUOTE:
You can take the decorrelatedLine output and run it through some kind of EMA. I would pick one that's adaptive. WL has several. Exactly what are the ticker symbols you are trying to decorrelate? I would like to look at them myself.


What is your data provider? The reason I ask is that different providers use slightly different "nomenclature" for the Futures symbols than what the NYMEX uses. Here are charts for the current contract of Unleaded Gasoline (https://www.barchart.com/futures/quotes/RB*0/technical-chart) and WTI Crude Oil (https://www.barchart.com/futures/quotes/CLN19/technical-chart). Perhaps your data provider uses these symbols in some fashion to construct a back-adjusted continuous contract.

QUOTE:
What my indicator would be is the slope as a f(time).


I probably should have said "x-y slope as a function of time".

Vince

superticker

#20
QUOTE:
I probably should have said "x-y slope as a function of time".
So think about inserting LinearRegSlope() in your code so it can return that x-y slope behavior over time--which is what it's designed to do. Now you're left with forming the behavior (or equation) of the x-y instrument pairs over time so you can pass that relationship (equation) into LinearRegSlope().

Fidelity is my only data provider. If Fidelity doesn't list it, I don't buy it.

Carova

#21
QUOTE:
So think about inserting LinearRegSlope() in your code so it can return that x-y slope behavior over time--which is what it's designed to do. Now you're left with forming the behavior (or equation) of the x-y instrument pairs over time so you can pass that relationship (equation) into LinearRegSlope().


That was one of my early attempts at addressing the issue. Way too much lag, which resulted in very poor performance.

Vince

PS. That is because you need to normalize the slope by dividing by a smoothed price series.

superticker

#22
QUOTE:
That was one of my early attempts at addressing the issue. Way too much lag, which resulted in very poor performance.
If there is a time lag between the two instruments, then you need the cross correlation function (see Post# 10) to determine exactly what that lag is so you can time shift one series relative to the other first before doing anything else. WL and Math.NET don't have a cross correlation function, but you can write the nested FOR loop from the cross correlation definition. You'll see two nested sigmas in its equation for the two nested FORs. The outside FOR loop drives the starting indexes for the innermost FOR loop.
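The nested-loop idea can be sketched as follows in stdlib-only Python. The names cross_correlation and best_lag are hypothetical; normalizing each lag to a Pearson correlation and skipping lags with fewer than three overlapping bars are assumptions of this sketch, and the series are synthetic. In this convention a positive lag means the second series follows the first by that many bars.

```python
import math


def cross_correlation(a, b, max_lag):
    """Return {lag: r}, where r is the Pearson correlation of a[i] with b[i + lag]."""
    if len(a) != len(b):
        raise ValueError("series must be the same length")
    n = len(a)
    result = {}
    for lag in range(-max_lag, max_lag + 1):  # outer loop: the time shift
        xs, ys = [], []
        for i in range(n):                    # inner loop: overlapping bars
            j = i + lag
            if 0 <= j < n:
                xs.append(a[i])
                ys.append(b[j])
        m = len(xs)
        if m < 3:
            continue  # too little overlap to say anything
        mean_x = sum(xs) / m
        mean_y = sum(ys) / m
        sxx = sum((x - mean_x) ** 2 for x in xs)
        syy = sum((y - mean_y) ** 2 for y in ys)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        if sxx > 0.0 and syy > 0.0:
            result[lag] = sxy / math.sqrt(sxx * syy)
    return result


def best_lag(a, b, max_lag):
    """Lag with the highest correlation; positive means b trails a."""
    cc = cross_correlation(a, b, max_lag)
    return max(cc, key=cc.get)


if __name__ == "__main__":
    leader = [0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 1.0, 0.0, 4.0, 0.0, 2.0, 1.0]
    follower = [0.0, 0.0] + leader[:-2]  # same series delayed by two bars
    print(best_lag(leader, follower, 3))
```

Once the lag is known, one series can be shifted by that many bars before any regression or decorrelation is attempted.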

You can simply line the two time series up manually for now and add the cross correlation later. Let's not make this problem too complicated out of the starting gate until we're sure all this will work.

Fidelity can't resolve ticker CLN19 or RBN19, so I can't run them on WL. Are there Fidelity symbol equivalents for these two instruments?

I'm just casually looking at the 6-month plots of these two instruments now (Daily bars), and I don't see that they are inversely correlated for arbitrage-pair trading. If anything, they look highly correlated to me. Am I on the wrong time scale?

Or perhaps I'm not understanding the goals here. The idea is to sell one to buy the other--right--because one goes down when the other goes up on a periodic basis? Is this period expected to cycle over days or minutes?

Eugene

#23
@superticker
QUOTE:
Fidelity can't resolve ticker CLN19 or RBN19, so I can't run them on WL. Are there Fidelity symbol equivalents for these two instruments?

These are Futures contracts (energies).

Carova

#24
superticker,

QUOTE:
If there is a time lag between the two instruments,


There is no time lag between the instruments.

QUOTE:
I'm just casually looking at the 6-month plots of these two instruments now (Daily bars), and I don't see that they are inversely correlated for arbitrage-pair trading. If anything, they look highly correlated to me. Am I on the wrong time scale?

Or perhaps I'm not understanding the goals here. The idea is to sell one to buy the other--right--because one goes down when the other goes up on a periodic basis? Is this period expected to cycle over days or minutes?


It is a fully-hedged pairs trade - buy the stronger, sell the weaker. Think of it as a mean-reversion trading strategy, or a swing trade for equities. The period can be as short as a few days or as long as a couple or three weeks. But you need to get in at the right time or the profit is lost.

Vince

superticker

#25
I don't know how to trade these two instruments. And I never studied futures. I can't help you. Perhaps someone who understands futures knows how to do this. I'm out of my area here.