Stock Market

This Is Not Investment Advice

Got it? I hate disclaimers, but unless you've paid me for said service (actually, unless you're currently paying me for it!), I'm not your broker/advisor/investment consultant, I'm just some random guy on the Internet that you have no reason to trust, and I have to make that clear lest someone lose a crapload of money based on something that I've written. I'm not telling you to buy/sell/hold/gamble/mortgage your house or anything like it - if you want stock picking tips, talk to your broker, the homeless guy outside your building, or your brother-in-law, maybe they will be willing to tell you what the next Google is going to be. I'm not, so don't consider anything on this site to be a suggestion to put your own real money into anything unless you decide completely independently of this site that it's worthwhile to do so. In any case, here I'm going to be focusing more on what doesn't work than on what does, so hopefully I can steer you away from some common pitfalls. The one piece of advice I can give you is that there's no magic bullet here, and you're probably better off not playing this game at all ("this game" referring to playing the stock market's short to medium term gyrations for profit) - most people that actively trade lose money, and that includes some extremely smart and well educated people. The larger banks tend to steer mostly if not completely away from this stuff, and that should give you pause. Most of the methods touted as "sure things" and "95% correct" are pure unadulterated garbage, doubly so if you can understand them without a degree in mathematics or finance! If you do trade, NEVER do so with money that you can't afford to lose, since there's a good chance that you will lose it all! The market is largely the socially acceptable rich man's version of the casino, so bear that in mind - even if the outcome is not merely determined by a roll of the dice, that doesn't mean your odds are any better.

The Market

In all that follows, I'll be using the term "the market" or "the stock market" to refer quite generally to any of the somewhat standardized trading venues where people buy and sell either stocks, futures, forwards, commodities, options, or other stranger and more interesting things (though bonds are sufficiently different in their mechanics that I'll generally not be talking much about them). I'll be the first to admit that there are significant differences in the way all these investment vehicles function, so it is perhaps a bit inappropriate to lump them all together. However, here I am more interested in talking about the underlying mechanisms of human trade than about the dirty idiosyncratic process of doing it in the real world.

The basic idea of the market is simple: a lot of people have stuff, and a lot of people have money. Since the value of this stuff as measured by money changes over time (or is, in fact, uncertain in the first place), a lot of people try to game these changes to pick up a few bucks in the process. Buy low, sell high. Sometimes the stuff has some intrinsic value - cotton, for instance, is a physically useful thing. Sometimes less so - gold is precious mainly for its scarcity. And sometimes the value is entirely based on the future expectation of value - for instance, one might purchase Google stock in the hope that either the stock value will increase or the company may start to pay dividends to the shareholders. The precise measure of value is not important. What is important is that people are willing to haggle over prices - if the measure of value is too crystal-clear, there's no point in trading since everyone can just plug the numbers into a formula and know whether they are getting a good deal or not. This is why you don't see people trading sets of one dollar bills for five dollar bills as a career - trust me, if there was any money in it, people would be doing it.

Getting down to the details now...if you're used to seeing stock tickers, you may be a bit confused by something I said above. I mentioned that the value of something is never fixed, but all the time we see things like "The Dow is up 10 points today," and they quote a very specific value, right down to the penny for the Dow. Isn't that fixed? Well, in a sense. The model works like this: at a given moment in time, we can ask a few questions about a stock (or a bond, commodity, future, etc. - again, from now on I'll speak imprecisely so that I can get through more than a sentence without a listing of every investment device out there). The big ones for the moment are the following:

- What was the last price at which the stock actually traded?
- What is the lowest price at which anyone is currently willing to sell?
- What is the highest price anyone is currently willing to pay?

Generally speaking, all three of these will be extremely close to each other in a liquid market (one that has enough action to make things flow smoothly). For a stock that's part of the Dow Jones Industrial Average, these three prices will usually be within a penny or two of each other. This is why it makes sense to quote an exact price for the stock - if you just look at the last price paid, you're going to be really close to the market's estimation of value at the moment.

However, this doesn't explain at all why markets move. For this we need to ask some more refined questions:

- How many shares are actually available at that lowest selling price, and how many more at the prices just above it?
- How many shares do buyers want at the highest buying price, and how many at the prices just below it?

These questions begin to touch on market dynamics. Suppose 100 shares of XYZ Corp. are available to purchase at $10.00 a share, 200 at $10.01, and 300 at $10.02. Now suppose I come to the market and decide that I want to buy 350 shares of XYZ. Though the current price is listed at $10.00, I'm not going to be able to get 350 shares at that price; I'll have to take 100 at $10, 200 at $10.01, and 50 at $10.02. In the process, I've just moved the lowest possible purchase price by two cents, and I've also moved the last purchase price to $10.02 from wherever it was before.
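The arithmetic above is easy to sketch as a toy order-book walk. This is a minimal illustration, not how any real exchange's matching engine works; the prices and sizes are the hypothetical XYZ Corp. numbers from the text:

```python
# Walk a toy order book: fill a market buy against the cheapest asks
# first, then report the fills, total cost, and the new best ask.
# Prices/sizes are the illustrative XYZ Corp. numbers from the text.

def fill_market_buy(book, qty):
    """book: list of (price, shares) sorted from lowest ask upward."""
    fills, remaining, new_book = [], qty, []
    for price, shares in book:
        if remaining <= 0:
            new_book.append((price, shares))
            continue
        take = min(shares, remaining)
        fills.append((price, take))
        remaining -= take
        if shares > take:
            new_book.append((price, shares - take))
    cost = sum(p * n for p, n in fills)
    return fills, cost, new_book

asks = [(10.00, 100), (10.01, 200), (10.02, 300)]
fills, cost, remaining_book = fill_market_buy(asks, 350)
print(fills)             # [(10.0, 100), (10.01, 200), (10.02, 50)]
print(remaining_book[0])  # (10.02, 250) -- the new best ask
```

Note that after the order, the lowest available ask has moved from $10.00 to $10.02, exactly as described above.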

Now, this won't necessarily affect the highest possible price that someone is willing to pay for the stock. But to see how this comes about, let's assume that there was an offer on the table to purchase 400 shares at $9.99 before I entered the market. Now, one element of the market that I haven't mentioned is that there is a whole lot of noise coming in, as in random market buy and sell orders. These are orders from people who aren't particularly interested in a penny or two, so they aren't trying to hold out for the best possible price in penny terms, they just want to place an order and have it filled instantly, so they'll buy or sell at whatever the best available price is. Enter Joe Scalper - he has the brilliant idea that he can make some quick cash off of these n00bs as long as he's on both sides of the trades. For instance, in the scenario above, after I've moved the lowest available share price (referred to as the offer or ask price; the highest price someone's willing to purchase for is referred to as the bid) up to $10.02, he may make a slightly lower offer for 100 shares at $10.01, simultaneously making a bid to purchase 100 shares at $10. This means that once a few random purchases come in, he'll be able to purchase his 100 shares at $10 and sell them at $10.01, making $1 for his efforts. This is the underlying mechanism that makes the bid-ask spread tend to shrink.

So where is the motion? The motion often arises from the fact that we don't necessarily know that Joe Scalper will jump in at $10.00 and $10.01. It may be that whoever was willing to pay $9.99 decides after my purchase that the stock is actually worth $10.01 to him as long as he gets in, so he may jump in there. Or, the market may reject my move and someone may offer immediately to sell some more shares at the original price of $10.00. It all depends on the mood of the participants, and to some extent on luck.

An eBay analogy is pretty good here - suppose you're on eBay looking for a very specific antique birdhouse. The offer is basically the lowest "Buy it now!" price on this antique birdhouse; the bid is not usually explicitly defined on eBay, but is essentially the highest "Buy it now!" price that you could set and be assured that someone will immediately purchase your birdhouse.

Which brings me to an important point. If you've spent any time on eBay, you'll realize that the variability in price is usually much, much larger than one cent. A large part of this is due to the fact that there is a much smaller volume of antique birdhouse transactions than there are, for instance, US dollar vs. Euro trades through a typical Forex broker (though another piece of the variability is the fact that antique birdhouses are not all alike in quality or condition, whereas a Euro is a Euro is a Euro). This means that the antique birdhouse market is highly illiquid. That's not necessarily a bad thing, though, and to be honest, unless you really know what you're doing in the stock market, you're probably going to make more money (or at least lose less) flipping antique birdhouses on eBay than you would trading USD/Euro contracts!

So to sum up: the fine-grained market motions are set by an interaction between two groups of people: 1) the careful, penny-pinching Joe Scalpers who precisely set their bid and ask prices and decide exactly what they are willing to pay for the stock, and 2) the impatient investors who just decide they want to buy or sell at whatever price is available, thus "picking off" the best-priced shares available/desired from group 1).

Momentum And Mean Reversion

Once we understand the basic underlying mechanism, we can ask what happens when we step back and take a bigger picture view. After all, we're not particularly interested in becoming Joe Scalpers - this can be profitable, but it is an extremely cutthroat business that is very difficult to become good at since there is so much competition. More often than not, the moment you spot a hole that you want to jump into, one of the more experienced scalpers with a faster trading connection than you have will jump in a fraction of a second before you have a chance to. Plus, you're glued to a computer screen all day, and that's not much fun!

Once we zoom out a bit, it does become more reasonable to look at a stock's movements on a graph. Price data for many stocks is readily available (Yahoo! Finance is great for daily price data, and Dukascopy has a lot of finer-resolution data, especially currency data, both for free download; I wouldn't necessarily recommend either of these for serious analysis, since they occasionally have flaws, but for playing around, you can't beat the price!), so we can chart the time series and start looking for patterns.

Here's the problem: humans tend to see patterns literally everywhere. That is, we invent patterns to explain data even if the data doesn't follow a pattern. Once we have a pattern in mind, we easily find all sorts of confirming evidence and discard any contradictory evidence. There are two particular patterns that tend to pop up quite often when we look at stock charts: one is momentum, the other is mean reversion.

Momentum is the tendency for motion in a particular direction to continue. Mean reversion is exactly the opposite: the tendency for a quantity to reverse course and head back toward where it came from. If it strikes you as odd that these two completely contradictory concepts are often cited as being at work in the market, it should. You really can't have it both ways, it seems.

Worse, we can provide plausible arguments for the presence of each behavior. Momentum is explained fairly reasonably by the fact that when people see that a stock is moving up, they get excited and think that it will continue in that direction, so they buy more of the stock hoping to cash in. This drives the price further in the same direction, and actually speeds up the climb. And mean reversion? Well, if people want to buy low and sell high, then it stands to reason that the higher the price goes, the fewer people will want to buy, because it's getting too expensive. So we've just plausibly argued for two completely contradictory long-term behaviors in the market...what's the truth?

The truth is subtle. A perfect balance between momentum and mean reversion would essentially give us a completely random market (since there's always some more or less random noise coming in), and this is very close to what we see. Sometimes momentum dominates, sometimes mean reversion, and usually they just about balance out. Given this, one must wonder whether there is really anything of use in these concepts at all; this is a very good question, and to my mind, not one that has been answered very satisfactorily over the entire history of the stock market.

If you read any book on trading, it will tell you that "the trend is your friend." It will offer advice such as comparing two moving averages and looking for "short term versus long term MA crossings" to predict good entry and exit points. And this works quite well if you are in a market that "trends" - the problem, of course, is that markets go through trending and non-trending periods, and you don't know which one you're in until it's over! And any trend-following strategy gets you killed if the market is "moving sideways," as they like to put it.
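To make the jargon concrete, here's a minimal sketch of the crossing-MA rule (the 4- and 17-day windows are just the illustrative values used elsewhere in this piece, not a recommendation):

```python
# Sketch of the "short MA vs. long MA" rule: be long (+1) when the
# short moving average is above the long one, short (-1) otherwise.
# The 4/17 windows are purely illustrative.

def moving_average(prices, window):
    """Simple moving averages; entry i covers the window ending at i."""
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

def ma_signals(prices, short=4, long=17):
    short_ma = moving_average(prices, short)
    long_ma = moving_average(prices, long)
    # Align the two series: the long MA starts (long - short) bars later.
    offset = long - short
    return [1 if s > l else -1
            for s, l in zip(short_ma[offset:], long_ma)]
```

A "crossing" is simply any bar where the signal flips sign; the books say to enter long on a -1 to +1 flip and exit (or go short) on the reverse.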

My perspective on this? Well, it's a tricky issue. I've found that these crossing moving average strategies tend to be garbage. By "garbage" I mean that they don't do any better on average than flipping a coin to determine your entries and exits. This doesn't mean that they don't occasionally make a lot of money; it just means that they often lose a lot of money, too. I would not rely on one of these as a primary trading strategy, despite what any "For Dummies" book might tell you. When it comes right down to it, I'm not entirely sure whether it's even "useful" to think of such a thing as a trend. Certainly over the extreme long term there are trends (primarily driven by macroeconomic factors), but those play out over years, not days. And in a completely random data set it is possible to find segments that look like they are trending, when in reality they are completely random.

The Parameter Optimization Pitfall

Suppose you've come up with a strategy and you want to figure out how to use it to make money. Most strategies have free parameters in them - for instance, in a moving average crossing strategy, you need to specify the number of days in the long term moving average and the number of days in the short term one. Some strategies will have fewer free parameters, some will have more; rare is the strategy that has no "knobs to turn."

Clearly, if we're being serious about this stuff, we want to optimize the knob positions to find the best possible set of parameters, the ones that will maximize our profits. Say we want to trade for a year - we want our parameters to give us the highest possible profit over that year. The problem, of course, is that we don't know in advance which parameters will be best. All we have to work with is historical data, so we have the bright idea that we should just figure out which parameters worked the best over our set of historical data.

There are a few problems with this approach. First of all, it introduces another free parameter into our method, namely the number of days to backtest during optimization. This isn't necessarily deadly, though, and in fact it usually decreases the effective number of free parameters. Imagine, for instance, our two parameter moving average strategy - if we optimize the parameters over the past N days, then we've effectively reduced our theory from a two parameter theory to a one parameter one, so we've gained something. In fact, we've redesigned our strategy, in a sense, since the strategy as a whole now includes the optimization. The second problem is best phrased as a question, though, and this one may be deadly: does this actually work to make money in the market?

You see, the problem is that we could apply this strategy (look for crossings of the X day moving average and the Y day moving average, and optimize the MA strategy over the past N days to find X and Y) to a purely random set of data, and find values for X and Y that gave pretty nice profits over the past N days. [There are numerous algorithms for optimizing a payoff function over several variables; with this simple strategy, you could simply test every combination of X and Y up to maybe 200 days, and see which one did the best; with more than 2 parameters this is not convenient, so a more subtle approach must be taken] But for random data, there is absolutely no way to predict what will happen over the next N days, so the resulting X and Y parameters are completely and utterly useless, despite our laborious efforts at optimization. So what ultimately matters is not the actual payoff achieved by this optimization over the past N days, it is whether the payoff over the last N days will predict anything about the payoff in the next M days.
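A brute-force version of the bracketed grid search is easy to sketch. Everything here is illustrative: we run it on a purely random walk, and it will still cheerfully report a "best" X and Y with positive in-sample profit:

```python
import random

# Brute-force (X, Y) search for the MA-crossing strategy on a pure
# random walk. The point: the search finds parameters that "made
# money" in-sample even though the data is unpredictable by design.

random.seed(0)
prices = [100.0]
for _ in range(300):
    prices.append(prices[-1] + random.gauss(0, 1))

def ma(prices, w, i):
    """Simple moving average of the w prices ending at index i."""
    return sum(prices[i - w + 1:i + 1]) / w

def profit(prices, short, long):
    """Always-in strategy: long when short MA > long MA, else short.
    The position is decided on bar i and applied to bar i+1's move,
    so there is no lookahead."""
    pnl, pos = 0.0, 0
    for i in range(long, len(prices)):
        pnl += pos * (prices[i] - prices[i - 1])
        pos = 1 if ma(prices, short, i) > ma(prices, long, i) else -1
    return pnl

best = max(((profit(prices, x, y), x, y)
            for x in range(2, 12) for y in range(x + 1, 40)),
           key=lambda t: t[0])
print(best)  # the in-sample "winner" -- meaningless out of sample
```

The winning (X, Y) pair here predicts exactly nothing about the next 300 bars, which is the whole point of the section.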

A lot of people think that the solution to this is just to pick some values and stick with them. They literally stop listening altogether the moment the word "optimize" is mentioned because they understand that meaningless optimization can be worse than no optimization at all; they essentially attempt to redefine their strategy as a parameter free one by arbitrarily pinning the parameters - in other words, instead of looking at the general class of "crossing MA" strategies, they prefer to use the parameter-free "17-day vs. 4 day MA" strategy. I disagree strongly with this sentiment. There are good optimizations and bad optimizations, and your strategy is only worth anything at all if there exists a good optimization for it. For a completely pinned strategy, there exist no optimizations, let alone good ones. A good optimization exists if and only if you can backtest a strategy to set parameters over the past N days and have parameters that do well over the next M days, where N and M are known and fixed (or at least slowly moving). Otherwise you have garbage, plain and simple. Some classes of strategies may possess good optimizations (which still may be difficult to find - they essentially require optimizations over M and N, which must in themselves be good optimizations, so you see the kind of recursive nightmare we can get into with this stuff), and many, many more will not. One definition of a random time series (good for our purposes) is simply a time series that admits no good optimizations.

The real problem now becomes deciding whether or not true price data admits good optimization. There are a few schools of thought on this. First, I should mention that everyone is in agreement that on the shortest time scales, good optimizations abound. Just about any nontrivial strategy can be properly optimized to perform well on 10 second or tick-by-tick data; of course, it is a different matter altogether to achieve a strategy that performs well on such data once you add in trading fees, slippage, bid-ask spreads, and human time cost. If you're exceedingly careful, you may be able to figure out how to eke out a decent hourly wage for your time spent at the computer staring at the ticker and executing dozens of trades a day. In fact, optimizing a strategy along these lines can give you a pretty good estimate of what the market values a very short term trader's time at - if anyone cares to take this up, I would be exceedingly curious to know the result. One problem is that it is not enough to merely look at past price data, for several reasons, not the least of which is that if you're making money off of such small moves, you cannot simply double the amount you make by doubling your investment because you are at the level where you may, in fact, move the market yourself. Also, historical price data is notoriously bad at these levels, and it's very difficult to say based on historical data whether you will get filled, at what price you will be filled, and how close to your stop level you will get stopped out (a stop is basically an "if things get ugly" clause in your purchase, that says you want to get out immediately if you lose more than a certain amount on a trade). To anyone considering such a short term strategy, I'll warn you, you're this close to playing Joe Scalper, and you really better know what you're doing, because it's not trivial to set something up that works. 
Every penny counts, so you need to have solid data, a blazing fast trading system, and probably even a separate computer that you use for on-the-fly analysis. It's one thing to train neural networks with genetic algorithms and whatever other buzzword laden "AI" technique you want to use when you're trading on the day/week/month scale; it's another thing altogether to set up such a system that is fast enough to respond in a timely manner to a direction switch over the past 45 seconds!

There is less agreement beyond that, though. There are several forms of the so-called "Efficient Market Hypothesis," all of which basically send the message that you're wasting your time if you think you can make money in the market, at least if you're hoping for more gains than the overall macroeconomic factors achieve anyway. Some people think that market timing (or technical analysis, trying to use past price behavior to predict the future - basically what we're discussing here; there is another school of thought called fundamental analysis that examines balance sheets and the like, which is unfortunately much more difficult to evaluate the merits of for lack of good free historical data) is useless, but that looking at measures of company health is useful - examining cash flow, products, leadership, and whatnot. Others feel that all important details about a company are already priced into the stock, so there's really nothing other than insider information that could help you predict future movements. I personally don't think that the efficient market exists as the academics might prefer; however, I also don't think the market is quite as inefficient as a lot of traders would like to believe. In my opinion there are extremely subtle patterns that are somewhat difficult to discern and even more tricky to make money off of. The time series are not random, but they are not easily predictable, either. Kind of like people, go figure...

In any case, here's the message of this section: if and when you optimize a strategy, be extremely careful about your optimization methodology. Test it against random data to see what happens there, and if you're getting the same kinds of results on random data as you are on real data, you've more than likely developed a useless strategy. Most strategies are, in fact, useless, so make sure that any strategy you go with for real trading has actually predicted the future in the past - after all, any fool can predict the past, and any foolish strategy can be set up to do the same. It takes a lot more to perform well in the future.

One more bit of advice: if you ever come across a strategy that gives profits that are almost linear in the number of trades performed, be skeptical. Usually this is a sign that your testing is marking as "profits" something that you really can't profit off of, whether it's an error in the algorithm, an assumption of perfect execution at exactly the desired price, allowing your strategy (by accident, I hope) to "see the future," etc. Rarely do strategies that perfect come along, so if it looks like you've got one, examine it very closely to figure out where you screwed up.
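Here's a toy demonstration of the "seeing the future" bug on purely random data. A position that accidentally uses the same bar's return it is supposed to predict racks up an absurdly smooth, near-linear equity curve, while the honest one-bar-delayed version hovers around zero:

```python
import random

# Lookahead-bias demo on pure noise. "peeking" takes a position equal
# to the sign of the SAME bar's return (perfect foresight by accident);
# "honest" uses the previous bar's sign, the best a real trader can do.

random.seed(42)
rets = [random.gauss(0, 1) for _ in range(1000)]

peeking = sum(abs(r) for r in rets)  # sign(r) * r == |r| every bar
honest = sum((1 if rets[i - 1] > 0 else -1) * rets[i]
             for i in range(1, len(rets)))

print(peeking, honest)  # the peeking "profit" dwarfs the honest one
```

If your backtest's equity curve looks like the peeking one, go hunting for the bug before you go hunting for a broker.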

Randomization Testing To Verify Proper Optimization

Having by now (hopefully) convinced you that mere backtesting is not enough, I suppose we should discuss what the right thing to do is. We want a procedure for determining what, if anything, constitutes a good optimization of the parameters.

The ultimate goal is exactly what we said in the last section, to reduce the number of free parameters to zero by providing an effective means of setting the parameters. To understand how one might do this, we'll look at an example of the types of thought processes and tests we'd go through.

Suppose we really want to make that moving average strategy work. We may have decided by crude, direct optimization that over the period of our historical data the best performer was the "17/4 always-in" strategy, where you are long whenever the 4-day MA is above the 17-day MA and short otherwise (this may or may not be the case - in real life, it depends strongly on what particular data set you use, and over what time period your data exists, which is part of the reason this type of strategy is so terrible). We've also generated a dozen random data sets with the same distribution as the real data, and we've found optimal parameters for each of these random sets. We are led to be suspicious of the 17/4 strategy because when we compare its profits to those of optimized MA strategies on the random data, they don't seem that different. On some of the random data sets the best strategies did better than 17/4 did on real data, and on some the best did worse. Even more deadly for 17/4, when we look at the individual trades made, nothing about the 17/4 strategy stands out. It's not any more insulated from loss or any more predisposed towards gain than many of the strategies on random data. Thus we are led to believe that the naive optimization is a spurious one.

So we look deeper. We decide that the real problem is that, while there may be something to the general strategy of looking at a short term MA vs. a long term one, the proper parameters probably aren't fixed; they vary as time goes by. Our hope - and this is not a given, but must be supported by the data - is that for some number N, the optimal parameters from the last N days will be the optimal parameters, or at least close to them, for the next N.

Now instead of optimizing over the MA windows, we optimize over N, taking our new strategy to be "Follow the suggestion of whichever crossing MA strategy performed the best over the past N days." Note that our original naive optimization, viewed in light of the new phrasing of the strategy, amounts to taking N equal to the entire length of our dataset and not optimizing N at all - it's basically naive pinning to whatever amount of data we have; this is required at some level, of course. After all, suppose we've optimized over N and decided that it's again a useless optimization. Calling the last strategy N-strategy, do we now develop M-strategy, defined as "Follow the suggestion of whichever N-strategy performed the best over the past M days"? After all, N-strategy is just M-strategy with M equal to the entire length of our dataset, completely unoptimized and naively pinned. Do we just keep recursing until MAX_STACK_DEPTH? We could define yet another strategy, zeta-Z-strategy, defined as "Follow the suggestion of whatever strategy resulting from nesting the above series of optimizations zeta times performs the best over the past Z days." This one has two free parameters; maybe we can optimize over those, too! You see the point: this can go on ad infinitum, and you really do need to just cut it off somewhere.

My opinion on where to cut it off? I'm not exactly sure. I know that if you go past three or four levels of recursion here, you're definitely engaged more in numerology than analysis. What I don't know is if there is anything to be gained by going past one level of optimization at all. After all, most MA crossing strategies only make trades once in a while, so looking at the best performer over N days versus N+1 days is not going to tell you much. In many cases the results will be identical since all of the trades are exactly the same. Even among moving average strategies, during a long upswing in the market, all MA strategies will have you in a long position, so there will be absolutely no difference in any of their performance. Perhaps for more complicated strategies the differences will be more important and it would make sense to optimize several levels deep, I'm not positive. Keep in mind that a lot of this is essentially moot because most strategies (MA strategies among them) do not have very good optimizations.

Anyways, though, back to the point - how would we tell if our second level of optimization was a good one? Simple. We generate a new dataset by shuffling the old returns (returns are just the change in price from data point to data point - actually we should shuffle the log returns or the ratios of prices and use the new ones to generate the new dataset, which will then have more or less the same distribution; shuffling returns has the unfortunate side effect of sometimes making the price go below zero, which is a simplistic explanation of why it's the Wrong Thing To Do), run our new optimizations on each of many randomized datasets, and compare the results to the optimized true strategy. Roughly speaking, we should see some sort of a bell curve distribution (this is not precise, and it really depends on what type of strategy and price data you're using) if we look at the profits of the uber-strategy on each dataset, and it should probably be centered around zero profit for the randomized data. You'll be able to get a sense of how effective your strategy is on real data by picking it out and seeing where its profit value lies on this bell-curve. If it's lying way above the rest of the points, then there's a chance you've got something real. What you've then essentially shown is that there is something about the real data that distinguishes it from random data. If, conversely (and more likely) your profit lies close to the center of the bell curve, you've probably got nothing, because your strategy performed just about as well on completely random and unpredictable data.
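Here's a sketch of the shuffling test described above. The surrogate generator is the real content; the strategy you feed it is a stand-in for whatever uber-strategy you're actually testing:

```python
import math
import random

# Randomization test: shuffle the log returns to build surrogate price
# series with the same return distribution (and same overall change),
# then collect the strategy's profit on each surrogate. The real data's
# profit is interesting only if it sits far out in this null
# distribution's right tail.

def log_returns(prices):
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def prices_from(start, logrets):
    out = [start]
    for r in logrets:
        out.append(out[-1] * math.exp(r))
    return out

def shuffled_surrogate(prices, rng):
    """Same start, end, and return distribution; the ordering -- the
    only thing a technical strategy can exploit -- is destroyed."""
    rets = log_returns(prices)
    rng.shuffle(rets)
    return prices_from(prices[0], rets)

def null_distribution(prices, strategy, n_surrogates=200, seed=1):
    """strategy: any function mapping a price list to a profit number."""
    rng = random.Random(seed)
    return sorted(strategy(shuffled_surrogate(prices, rng))
                  for _ in range(n_surrogates))
```

Note that because shuffling preserves the endpoints, a pure buy-and-hold "strategy" scores identically on every surrogate - exactly as it should, since buy-and-hold exploits no ordering information at all.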

This is, to some extent, a fundamental shift in thinking when it comes to technical analysis, and it's something that a lot of people never quite come to terms with. Your mission is not to find a strategy that performs better than other strategies on a given set of data. Well, okay, scratch that - that is the ultimate goal, but it's not the right way to think in order to get there, because we don't know a priori what dataset we want to perform best on. What you need to think instead is that your mission is to find a strategy that performs better on the data you're given than on an ensemble of roughly statistically equivalent sets of random data. Statistical equivalence is hard to come by, generally, because there are a lot of ways to quantify it; our shuffling of log returns doesn't quite achieve it, and there certainly are statistical tests that would give different answers for the shuffled data and the real data, but it does a couple of important things that are necessary and more or less sufficient for our purposes. First, it preserves the overall change in price - if the S&P goes up by 200 points over the period of interest, the shuffled S&P will also go up by 200 points. Second, it preserves the distribution of log returns exactly, since it's just rearranging them (in fact, that the overall change is the same follows from the identical distribution). This means that the only information we've lost in the shuffle is the precise ordering of the log returns, which is exactly what we're hoping to profit off of with a technical strategy. This is why the technique is so useful: it gives us a set of control datasets that deliberately lack the pattern we intend to exploit.
We're really performing a simplified form of statistical sampling so that we can evaluate the effectiveness of our strategy without figuring out how the distribution of the returns will propagate through the strategy; trust me, this propagation is something you most definitely want to avoid calculating by hand for anything but the most trivial of strategies. It becomes even nastier when you consider the fact that you don't even know the distribution of returns.

I'll warn you, though, if you start to deal with things like options, you'll likely not be able to use this shuffling approach. The value of an option depends quite critically - almost exclusively, even - on the distribution, so it doesn't help you there to merely rearrange the order, though it is worth noting that rearranging the order of one-minute returns will potentially affect the distribution of ten-minute (or anything longer than one-minute) returns, so there are some tricks that can be played there that could potentially be useful. But options are a nasty beast that I don't intend to get into at the moment...