Tuesday, March 18, 2008

Random Walk Part 2: A Random Walk to San Antonio

If you're reading this without reading yesterday's post, you're a complete idiot. Geez, what a moron! Take the time to read it, or perhaps even better, don't read either. Go outside and get some fresh air.

Burton Malkiel, author of A Random Walk Down Wall Street, loves to use examples of probability to show how the stock market can be random and yet have the appearance of a pattern to analysts looking for a theory to beat the market. He tells the story of a statistics professor who began each year with the same example. He would have everyone in the class make up a bunch of results for a coin flip and, while he was out of the room, one person would actually flip a coin a bunch of times and record the results. Then they would mix up those results with the made-up ones and pile them on his desk. Every year he was able to pick out the real result from the made-up ones. Why? Because the real result had the longest streak in it. If you make up coin flip results, you'll probably write something like: HTHTTTHHT. In other words, you'll put down a result that looks random. But the real result will be more like: HHHHTTHHHHTTTTTT.
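If you'd like to see the professor's trick for yourself, here is a minimal Python sketch. It measures the longest streak in the two example sequences above, then estimates how long the longest streak tends to be in real fair-coin flips. The function names, the trial count, and the choice of 16 flips are my own assumptions for illustration; nothing here comes from Malkiel's book.

```python
import random

def longest_streak(flips):
    """Return the length of the longest run of identical results in a sequence."""
    best = 0
    current = 0
    previous = None
    for flip in flips:
        # Extend the current run if the result repeats, otherwise start a new run.
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best

def average_longest_streak(num_flips, trials=10_000, seed=0):
    """Estimate the average longest streak in num_flips fair coin flips (simulation)."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    total = 0
    for _ in range(trials):
        flips = [rng.choice("HT") for _ in range(num_flips)]
        total += longest_streak(flips)
    return total / trials

if __name__ == "__main__":
    print(longest_streak("HTHTTTHHT"))         # the made-up-looking sequence: 3
    print(longest_streak("HHHHTTHHHHTTTTTT"))  # the real-looking sequence: 6
    print(average_longest_streak(16))          # simulated average for 16 real flips
```

The simulated average depends on how many flips you ask for, so treat it as a rough guide rather than a rule for spotting fakes.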

Malkiel draws a similar lesson from sports. It's one that is mildly frustrating for sports fans to accept. Let's say Gerry McNamara is shooting 41% from 3-point range for some European team. One day he starts the game with 5 straight made threes. He's on a streak! What are the odds that he makes his sixth shot? If I were in the building, as he released that sixth shot, I would be certain that it was going in. He's hot! He's in the zone! In fact, the odds are 41%. Malkiel cites statistical studies of basketball shooting tendencies that show there is no correlation between the result of one shot and the result of the next shot. In retrospect, Gerry had a great game. But you can't use streaks to predict the future. You can't predict a made shot. You can't predict when a baseball player will have a slump, or break out of one. You can't predict March Madness.

But Paul, I've predicted those things correctly. Yes, but you've also predicted them incorrectly. There's no system for picking NCAA brackets. If there were, we'd see at least one perfect bracket each year. But we never do. (Here's a quick, cool story about the odds of getting a perfect bracket. Basically, it's impossible.)
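For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes a 63-game bracket (64 teams, ignoring the play-in game) and that every pick is right independently with the same probability. Both are simplifying assumptions of mine, the 70% accuracy is a made-up figure, and none of this is taken from the linked story.

```python
GAMES_IN_BRACKET = 63  # 32 + 16 + 8 + 4 + 2 + 1 games in a 64-team bracket

def perfect_bracket_odds(pick_accuracy):
    """Chance of a perfect bracket if each pick is independently right
    with probability pick_accuracy (a simplifying assumption)."""
    return pick_accuracy ** GAMES_IN_BRACKET

def one_in(probability):
    """Express a probability as 'one in N'."""
    return 1 / probability

if __name__ == "__main__":
    # Pure coin flipping: one in 2**63.
    print(f"coin flips: 1 in {one_in(perfect_bracket_odds(0.5)):,.0f}")
    # A hypothetical picker who gets 70% of individual games right.
    print(f"70% picker: 1 in {one_in(perfect_bracket_odds(0.7)):,.0f}")
```

Even a picker far better than a coin is still looking at astronomically long odds, which is the point.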

Yesterday I was talking to a guy at Borders who said he had finished in the top 3 in his pool 3 straight years. His strategy: spend only a moment thinking about the games and go with your gut. I wasn't impressed. Come back when you've finished in the money 10 straight years and maybe I'll copy your brackets next year. The tournament is certifiably insane. This decade we've had a final four with two 8-seeds (2000), and another with an 11-seed (2006). That crazy 2000 season saw only two double-digit seeds get out of the first round. A year later, nine double-digit seeds got out of the first round! But we know this.

Now that I've argued that no system for picking brackets will work, I'm going to propose a system for picking your bracket. Like fantasy baseball, there is so much about the tournament that is predictable. That's where you need to try to make your money. (I say "you" because I think I've only finished in the money once in my life, and that was when I was about 15. I was in the 99th percentile on ESPN.com going into the final four during my junior year of high school. My track record is not good in pools, though. Fair warning.)

OK, so you know the basic seed v. seed stats for the first round. Number 1 seeds never lose, 4-seeds lose about once every five games, etc. That's a great place to start. If you have never read any of Peter Tiernan's statistical analysis on ESPN.com, you need to. I'm not going to go into all the details, but one of the things Tiernan has tried to do is identify common characteristics of teams that pull off upsets. Generally, those include a high regular season scoring margin (especially for small- and mid-level conference teams), frontcourt scoring, and actual tournament experience. I tend to think that the sample of only 21 years is a little too small to draw strong conclusions from those similarities, but it is worth considering when looking at your bracket. Far more important is the average success of the different seeds.

There are a lot of different pools out there. A certain bookstore I'm aware of will probably have only about 10 to 15 entries. ESPN.com will have hundreds of thousands. You should bet on chaos far more liberally when facing a large number of opponents than when you're in a small competition. If something crazy happens, and you happen to have it on your ESPN.com bracket, you'll have a slightly wider margin of error than if the seeds generally hold.

In your office pool, you have to consider the scoring system. The aforementioned bookstore's pool is using a 1-2-4-8-10-12 scoring system that, it bears mentioning, I did not devise. That means the same number of points is available for picking the final four as for picking the 32, the 16, or the eight (32 points each). Then the final two and the champion picks are worth less in total (20 and 12). Therefore, whoever picks the most final four teams is going to have a very good chance of winning the pool. Since the number of entries is small, it would be wise to submit a conservative final four, rather than go for an all-or-nothing final four.
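If you want to check the arithmetic for your own pool, here is a small Python sketch. The point values are the 1-2-4-8-10-12 system described above; the games-per-round list assumes a standard 64-team bracket, and the function and label names are just mine.

```python
POINTS_PER_PICK = [1, 2, 4, 8, 10, 12]  # the bookstore pool's scoring system
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]  # assumed: standard 64-team bracket
ROUND_LABELS = [
    "picking the 32",
    "picking the 16",
    "picking the eight",
    "picking the final four",
    "picking the final two",
    "picking the champion",
]

def points_available(points_per_pick, games_per_round):
    """Total points on offer in each round: points per correct pick times picks."""
    return [points * games for points, games in zip(points_per_pick, games_per_round)]

if __name__ == "__main__":
    totals = points_available(POINTS_PER_PICK, GAMES_PER_ROUND)
    for label, total in zip(ROUND_LABELS, totals):
        print(f"{label}: {total} points")
    print(f"whole bracket: {sum(totals)} points")
```

Swap in your own pool's point values to see which round carries the most weight.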

Here are the numbers to consider, as quoted by Tiernan.
The 92 Final Four teams of the modern era are made up of:
• 38 No. 1 seeds (41.3 percent of semifinalists)
• 21 No. 2 seeds (22.8 percent)
• 12 No. 3 seeds (13.0 percent)
• nine No. 4 seeds (9.8 percent)
• four No. 5 seeds (4.3 percent)
• three No. 6 seeds (3.3 percent)
• three No. 8 seeds (3.3 percent)
• two No. 11 seeds (2.2 percent)

Let's start with the lower seeds. Since almost one fourth of the final four teams in the modern era were four-seeds or lower, I should have one four-seed or lower make the final four in my bracket, right? Wrong! Even if you narrow it down to just 4, 5, 6, 8, and 11 seeds, there are 20 teams to pick that one team from. It's just not worth taking that gamble in a small pool.
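Here is that argument as a quick Python sketch using Tiernan's counts quoted above. The number of tournaments (23) is my own inference from 92 teams at four per year, and the "rate per team" is simply a historical average across every team at those seed lines, not a forecast for any particular team.

```python
# Final Four appearances by seed, as quoted from Tiernan above.
FINAL_FOUR_TEAMS_BY_SEED = {1: 38, 2: 21, 3: 12, 4: 9, 5: 4, 6: 3, 8: 3, 11: 2}

TEAMS_PER_SEED_LINE = 4  # one team per seed in each of the four regions
TOTAL_TEAMS = sum(FINAL_FOUR_TEAMS_BY_SEED.values())
TOURNAMENTS = TOTAL_TEAMS // 4  # inferred: four Final Four teams per year

def share_of_final_fours(seeds):
    """Fraction of all Final Four teams that came from the given seed lines."""
    return sum(FINAL_FOUR_TEAMS_BY_SEED[s] for s in seeds) / TOTAL_TEAMS

def rate_per_team(seeds):
    """Average historical chance that one specific team at these seed lines
    reached the Final Four."""
    appearances = sum(FINAL_FOUR_TEAMS_BY_SEED[s] for s in seeds)
    candidates = TOURNAMENTS * TEAMS_PER_SEED_LINE * len(seeds)
    return appearances / candidates

if __name__ == "__main__":
    long_shots = [4, 5, 6, 8, 11]
    print(f"long shots as a group: {share_of_final_fours(long_shots):.1%}")
    print(f"any one long shot:     {rate_per_team(long_shots):.1%}")
    print(f"any one 1-seed:        {rate_per_team([1]):.1%}")
```

As a group the long shots look tempting, but spread over 20 candidates each one is a much worse bet than any single 1-seed.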

The average final four has between one and two 1-seeds. So the conservative method would be to take two 1-seeds in your final four. Well, I'm advocating an ultra-conservative method for small and medium-sized office pools. If you take three 1-seeds, you've got a great chance of getting points if only one 1-seed makes the final four and a good chance to get two teams right if two 1-seeds make the final four. If you only pick two, and they're the wrong 1-seeds, you're out of the tournament.
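To put rough numbers on "great chance" and "good chance," here is a sketch that treats all four 1-seeds as interchangeable, so that every combination of advancing 1-seeds is equally likely. That is a big simplifying assumption of mine (real 1-seeds are not equally strong), and the function name is made up for this example.

```python
from math import comb

NUM_ONE_SEEDS = 4

def prob_correct(picked, advancing, correct):
    """Chance that exactly `correct` of the `picked` 1-seeds in your final four
    are among the `advancing` 1-seeds that actually get there, assuming every
    set of advancing 1-seeds is equally likely."""
    if correct < 0 or correct > advancing:
        return 0.0
    ways_right = comb(picked, correct)
    ways_wrong = comb(NUM_ONE_SEEDS - picked, advancing - correct)
    return ways_right * ways_wrong / comb(NUM_ONE_SEEDS, advancing)

if __name__ == "__main__":
    for picked in (2, 3):
        print(f"picking {picked} 1-seeds:")
        for advancing in (1, 2):
            for correct in range(advancing + 1):
                chance = prob_correct(picked, advancing, correct)
                print(f"  {advancing} advance, {correct} right: {chance:.0%}")
```

Under that assumption, picking three 1-seeds means you can never be shut out when two of them advance, while picking two leaves a one-in-six chance of holding exactly the wrong pair.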

That ultra-conservative strategy leaves one more slot in the final four that can be filled with a 2-seed or a 3-seed. (Okay, maybe a 4-seed, if it's a very strong team.) In fact, I'd advocate for a 1-1-1-2 final four in an office pool, the very same final four that Cousin Eric mocked Clark Kellogg for predicting. It gives you the safest chance for a solid amount of points. It's up to your knowledge and instinct to decide which teams they will be.

The added bonus from the conservative strategy is that even if they don't make the final four, 1-seeds win a lot of games. If one of your 1-seeds only makes the elite eight, you've still collected all of its points up to that point.

Obviously, there are many points to be won in the early rounds as well. Pick a few 7, 6, 5, or 4 seeds to get upset here or there. If you're right, you'll separate yourself from the pack. If you're wrong, just accept that the tournament is largely a random walk, and set yourself up to have the most teams alive in the later rounds. Take risks with picks only if your pool's scoring system will reward you for doing so, and never stray far from the statistical trends for seed performance.

Finally, even if you enter 15 different pools with 15 different brackets like me, have at least one "Sheet of Integrity," so at the end of the random walk to San Antonio, you have a chance to boast that you saw just about every step of the way coming.

------

P.S. After I typed this whole thing out, I realized this strategy takes a lot of the fun out of filling out brackets, and some of the satisfaction out of winning. But I'm going to give it a try this year, and I'll keep you posted on the results.

3 Comments:

Blogger Chris said...

I'm picking Cornell over Stanford. It's my Pick Of Integrity.

Incidentally, my bracket of integrity was created on ESPN.com, where as part of each team's bio, they include record against the Top 25. I used that as my bible.

Final Four: Wisconsin, UNC, Texas, UCLA. Final: UCLA over UNC.

3/19/2008 2:15 AM  
Blogger justinistired said...

Okay -- just for posterity:

Pick of integrity: Butler over Tennessee

Cinderella: WVU to the elite 8, over 'Zona, Duke & Xavier. They will be very tired.

Final Four: UNC, G'town, Texas, UCLA

Finals: Texas over UNC.

3/19/2008 5:46 PM  
Blogger Chris said...

let it be so. (as justin said, because stanford is already creaming cornell).

comeback!

3/20/2008 4:44 PM  
