Handicapping with Neural Nets

**Leverage** · 11-25-09, 05:35 AM

NN in my experience are not very helpful. Try looking at Genetic Algorithms to evolve lines. Very interesting stuff.

On a side note there isn't a NN as good as mine lol

**Wrecktangle** · 11-25-09, 10:00 AM

I've not been successful with neural, but it's a complex field. I think someone who had their sh*t in one sock could make it work after a few decades of trial and error.

**twister** · 11-25-09, 11:26 AM

Originally posted by Wrecktangle

I've not been successful with neural, but it's a complex field. I think someone who had their sh*t in one sock could make it work after a few decades of trial and error.

Wasn't there some woman who used a neural net to lay horses on BetFair and, with some random money management system, managed to make £100,000 in a year? Wish I could remember the details though.

**LLXC** · 11-25-09, 08:28 PM

josie88, were you using random training and/or test sets? With most "learning" algorithms, I have noticed it takes good training data, and a good amount of it. Therefore, it usually won't perform well until a few weeks of data munching.

**CaptainPrice** · 11-25-09, 08:41 PM

im interested in this

**josie88** · 11-26-09, 05:19 PM

LLXC,

The data was culled from sportsdatabase.com and imported into Excel. The program is Alyuda Forecaster XL and I went back to 1995, instructing the NN to use 75% or so of the games as the training set and a random (not sequential) 25% to test against. I was concerned about the test set size and so created 6 NNs using identical training criteria (hidden nodes, stop training when generalization loss equals x amount, etc.), the only difference between them being how the NN itself chose which games in the total data set would be used to learn and which would be used to test.

The test results varied from 48% to 62%, 52% being the average. I believe this speaks to the point about dataset size. My thought is even with this rather large sample size an even bigger one would be helpful.
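One way to gauge how much of that spread could be plain sampling noise is the binomial standard error of a hit rate. This is a rough sketch only: the test-set sizes below are hypothetical (the actual number of held-out games is not stated), and it assumes each game is an independent win/lose trial.

```python
import math

def accuracy_std_error(hit_rate: float, n_test: int) -> float:
    """Standard error of an observed hit rate under a simple binomial model."""
    return math.sqrt(hit_rate * (1.0 - hit_rate) / n_test)

def approx_95_interval(hit_rate: float, n_test: int) -> tuple:
    """Normal-approximation 95% interval around the observed hit rate."""
    se = accuracy_std_error(hit_rate, n_test)
    return (hit_rate - 1.96 * se, hit_rate + 1.96 * se)

if __name__ == "__main__":
    # Hypothetical test-set sizes, chosen only to show how the band narrows.
    for n_test in (50, 100, 500, 2000):
        low, high = approx_95_interval(0.52, n_test)
        print(f"n_test={n_test:5d}: {low:.3f} to {high:.3f}")
```

The band only tightens with the square root of the test-set size, so quadrupling the held-out games halves the noise.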

I've also discovered not surprisingly that you need a separate net for each team. Not a big deal since all the heavy lifting is done already, just substitute the data and off you go.

Much work still to do. Learning as I go along.

**Wrecktangle** · 11-27-09, 07:10 PM

Originally posted by LLXC

josie88, were you using random training and/or test sets? With most "learning" algorithms, I have noticed it takes good training data, and a good amount of it. Therefore, it usually won't perform well until a few weeks of data munching.

This might have been my problem, and it's a problem in sports in general... leagues change, which makes them hard to train to.

**Indecent** · 11-27-09, 10:15 PM

Originally posted by josie88

I've also discovered not surprisingly that you need a separate net for each team. Not a big deal since all the heavy lifting is done already, just substitute the data and off you go.

Much work still to do. Learning as I go along.

How are you formatting your data for input? You shouldn't be training one network for each team, or at the very least you shouldn't have to.

Here's a really quick example. Imagine a NN whose data set only consists of points scored and points allowed. The network itself only has two inputs,
1) home average points - away average points
2) home average points allowed - away average points allowed.

Then you need to prepare the data first: calculate the to-date averages for both teams, then subtract the away team's value from the home team's to get each network input.

In case I explained it poorly here's a practical example.
Over 5 games, Team A (home) averages 100 points for and 95 points against. Team B averages 120 points for and 130 points against.

The network inputs would then be
1) 100 - 120 = -20.
2) 95 - 130 = -35

Then the next game when Team A plays Team C you calculate averages for both including their previous game for network inputs, in this case points for and points against. You do this for each game after you wait a few weeks to let the teams accumulate meaningful stats.
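The preparation step can be sketched roughly like this in Python. The function name, the tuple layout, and the `min_games` warm-up parameter are all made up for illustration. The point is that each game's inputs use only results from before that game, and the running totals are updated afterwards.

```python
from collections import defaultdict

def mean(values):
    return sum(values) / len(values)

def build_inputs(games, min_games=1):
    """games: chronological list of (home, away, home_pts, away_pts).

    Returns one row per game as (home, away, input1, input2), where
    input1 = home avg points scored  - away avg points scored
    input2 = home avg points allowed - away avg points allowed
    computed from games played BEFORE the current one. Games are skipped
    until both teams have at least min_games of history.
    """
    scored = defaultdict(list)
    allowed = defaultdict(list)
    rows = []
    for home, away, home_pts, away_pts in games:
        if len(scored[home]) >= min_games and len(scored[away]) >= min_games:
            rows.append((
                home,
                away,
                mean(scored[home]) - mean(scored[away]),
                mean(allowed[home]) - mean(allowed[away]),
            ))
        # Only now fold this game's result into each team's history.
        scored[home].append(home_pts)
        allowed[home].append(away_pts)
        scored[away].append(away_pts)
        allowed[away].append(home_pts)
    return rows

if __name__ == "__main__":
    # Hypothetical schedule reproducing the numbers in the example above.
    schedule = [
        ("A", "X", 100, 95),   # A now averages 100 for, 95 against
        ("B", "Y", 120, 130),  # B now averages 120 for, 130 against
        ("A", "B", 110, 105),  # inputs for this game use the averages above
    ]
    print(build_inputs(schedule))  # [('A', 'B', -20.0, -35.0)]
```

Because every row is a home-minus-away difference, the same network can be trained on all teams at once.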

Does that make any sense?

**josie88** · 11-28-09, 01:11 AM

Yes that makes sense. Forecaster XL preprocesses data on the fly, which is why I'm using it. Big time saver. For example, it's able to make sense of text such as OVER and linear data as well in the same net. It's able to handle categorical (OVER, UNDER) as categories and scores as numbers. Whenever possible, I use Excel's replace function to assign numbers, though. OVER could be assigned a 1, UNDER a 0 and so on. Previous game win ATS might be a 1, loss ATS a 0. The sheet mostly looks like numerical digit cells when I'm done. The forecaster will allow you to specify which columns are to be treated as linear and which as categorical. At the moment the NN output is categorical, i.e. WIN ATS or LOSE ATS.
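For anyone doing the same recoding outside Excel, here is a minimal Python sketch. The mapping names and the PUSH category are assumptions for illustration and are not anything the forecaster itself exposes. Two-valued columns map cleanly to 0/1. For a column with three or more values, one indicator per category avoids implying an order that plain 0, 1, 2 would.

```python
OVER_UNDER_CODES = {"OVER": 1, "UNDER": 0}
ATS_CODES = {"WIN": 1, "LOSS": 0}

def encode_binary(value, mapping):
    """Map a two-valued text cell to 0/1, failing loudly on anything unexpected."""
    key = value.strip().upper()
    if key not in mapping:
        raise ValueError(f"unexpected category: {value!r}")
    return mapping[key]

def one_hot(value, categories):
    """One indicator column per category, for columns with 3+ values."""
    if value not in categories:
        raise ValueError(f"unexpected category: {value!r}")
    return [1 if value == category else 0 for category in categories]

if __name__ == "__main__":
    print(encode_binary("OVER", OVER_UNDER_CODES))
    print(encode_binary("loss", ATS_CODES))
    # PUSH is a hypothetical third outcome, included only to show one-hot coding.
    print(one_hot("PUSH", ["WIN", "LOSS", "PUSH"]))
```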

After training it outputs additional training data on another sheet and tells you how it treated each column. Sometimes it does stupid things you don't want, other times it will disregard a column of info you'd like to keep. You just keep playing with it until it behaves rationally. I'm certain though that any time it does something illogical it's because I overtrained it and it can no longer generalize about the future. So you throttle back some of the settings and give it another whirl. Lather, rinse, repeat.

For the benefit of folks who are reading, NN's tend to need a lot of data. The more I fed it the happier it was and the more consistent my nets became. At times, the NN would deliberately delete a column, insisting that the data was irrelevant (the NN couldn't find any relationship to the desired output). This program also will tell you the weights of each category that made up the net. This I found useful because after training I could go back and see which columns were making little or no contribution and then delete them and retrain to see if it made any difference.

NN's are neat things to watch but they think like a human brain. They make associations that are random at first and then compare against fresh data. If the association is positive, that connection is strengthened, if not, it's punished. The NN repeats this many thousands of times in a short time and continually checks itself against the training data vs the test data. There are so many factorial combinations of weights and frankly I haven't a clue as to how the NN zeroes in.

When to stop training? Who knows? There are a small handful of options that you can tinker with in the program. I just threw a dart and kept track of the settings as I went along.
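One common way to make "stop training when generalization loss equals x amount" concrete is to track how far the current held-out error has drifted above the best held-out error seen so far, as a percentage. Whether the program defines it exactly this way is an assumption on my part. The sketch below works on a plain list of per-epoch errors (assumed positive) and is not tied to any particular tool.

```python
def generalization_loss(val_errors):
    """Percent by which each epoch's held-out error exceeds the best so far."""
    losses = []
    best = float("inf")
    for err in val_errors:
        best = min(best, err)
        losses.append(100.0 * (err / best - 1.0))
    return losses

def stop_epoch(val_errors, threshold_pct):
    """First epoch whose generalization loss exceeds the threshold, else None."""
    for epoch, loss in enumerate(generalization_loss(val_errors)):
        if loss > threshold_pct:
            return epoch
    return None

if __name__ == "__main__":
    # Hypothetical held-out errors: improving at first, then overtraining.
    errors = [1.0, 0.8, 0.5, 0.52, 0.6]
    print(stop_epoch(errors, threshold_pct=5.0))
```

A low threshold stops early and may underfit, while a high one lets the net memorize the training games, so the setting is still a judgment call.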

My topology is still totally lacking. I'm happy with the data set though. For now anyway.

**Indecent** · 11-28-09, 12:42 PM

Do you know what kind of activation function you are using? Data normalization techniques? I've never used XL before, curious if you can adjust these.

**josie88** · 11-28-09, 01:45 PM

The program is free for 30 days, the demo is limited only by a 500 row dataset. http://www.alyuda.com/downloads/down...ecaster_xl.exe

Any thoughts would be welcome.

**GETMONEYKID** · 12-01-09, 03:46 PM

ask rickj

**Wrecktangle** · 12-02-09, 08:39 PM

The freebie 30 day trial is a full implementation, no?

**josie88** · 12-03-09, 01:31 PM

yep, 100% working less the 500 record limit

update: I'm very comfortable with the program now. It took several days to get to a high level of competency.

I'm fairly convinced at this point that the key to this is the topology.

For the folks reading the thread, topology is a defined structure of a NN. To be successful, you need to know what you can predict. A topology, for example, might be using historical stats to predict a game result. You'd have 'x' inputs and an output "Win" or "Lose". Very simple and likely an unsuccessful topology. Another topology might be not to ask the NN to predict an outcome, but something else that can be used to predict a game outcome, i.e. a score, power rating, or line.

Another topology might be to instruct the NN to provide more than one output, i.e. offensive and defensive ratings. Even a NN to help predict another NN is possible. As I am discovering there are hundreds if not thousands of topologies that might work.

Basically my model here is to use common sense to design one topology, create an appropriate dataset, preprocess, and train. Only when I'm sure it's a dead end will I move on to another topology.

Without the benefit of prior information on what might work (there's nothing I can find yet), this is still a trial and error approach and could be a *very* long process to discover something that can work. I can see how a NN consultant could make some good coin, if there's such a thing.

Even with that daunting news it's clear that with the proper topology and dataset this approach could hit 54% and maybe a few points more.
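For context on why 54% matters, here is the break-even arithmetic as a small sketch. It assumes standard -110 pricing on ATS bets, which is my assumption and may not match the lines actually being bet, and it ignores pushes.

```python
def risk_and_win(american_odds):
    """Amount risked and amount won for one bet at the given American odds."""
    if american_odds < 0:
        return float(-american_odds), 100.0
    return 100.0, float(american_odds)

def break_even_rate(american_odds):
    """Hit rate at which expected profit is exactly zero."""
    risk, win = risk_and_win(american_odds)
    return risk / (risk + win)

def roi_per_unit_risked(hit_rate, american_odds):
    """Expected profit per unit risked at a given hit rate."""
    risk, win = risk_and_win(american_odds)
    return hit_rate * (win / risk) - (1.0 - hit_rate)

if __name__ == "__main__":
    print(f"break-even at -110: {break_even_rate(-110):.4f}")
    print(f"ROI at 54%: {roi_per_unit_risked(0.54, -110):.4f}")
```

At that price the break-even point is 110/210, a little over 52%, so a net averaging 52% on held-out games is roughly a coin flip after the vig.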

**LLXC** · 12-03-09, 01:51 PM

Is there a reason why you decided to use NN? NN used to be much more popular 5-10 years ago, and researchers still hope they will solve more unsupervised learning problems. However, with the results of games, you could consider that supervised learning. There are other algorithms out there that would work better IMO.

**Indecent** · 12-03-09, 02:11 PM

Originally posted by josie88

Without the benefit of prior information on what might work (there's nothing I can find yet), this is still a trial and error approach and could be a *very* long process to discover something that can work.

It is a long process. I found that topology is second to your stats representation. Try playing around with the data first: adjusting for opponent strength, or re-expressing stats as yards per play, yards per game, yards per minute, etc. I would recommend building a graphical tool of the network topology so you can take a quick glance at the network weights/biases and see the network training in real time. It will help pick out problems with your stats or topology earlier, and might help you start thinking outside your comfort zone.

To a certain extent, NN's are as much science as they are an art. You have to play around and find what works, and it takes time.

Originally posted by josie88

I can see how a NN consultant could make some good coin, if there's such a thing.

There is. I've done contract work in AI since college and met an absolutely brilliant man who did NN consultant work for a major car company and probably every investment company on Wall Street. I bet he was paid extremely well, and probably worth every penny.

**Indecent** · 12-03-09, 02:19 PM

Originally posted by LLXC

Is there a reason why you decided to use NN? NN used to be much more popular 5-10 years ago, and researchers still hope they will solve more unsupervised learning problems. However, with the results of games, you could consider that supervised learning. There are other algorithms out there that would work better IMO.

In theory NN's can be used to solve unsupervised learning problems, and you are probably thinking of Kohonen Self-Organizing Maps. They are a type of neural network with no hidden layer, where each input is mapped directly to each output. I've found them useful in trying to classify teams, but they have since been dropped from my handicapping algorithm for more advanced techniques.

Which algorithms would work better do you think? I've considered using several alternatives, but once I had the nn framework/training done for one sport it was trivial to change the code for new sports.

**Wrecktangle** · 12-03-09, 07:31 PM

Looks like an interesting tool. I can't remember the tool I was using for NN, but I agree with you on the "art" part. Perhaps my patience ran out before the painting was finished.

**Saab** · 12-13-09, 08:06 PM

I briefly tested NN's with some baseball stats, but I couldn't find a large and easy to manipulate database for basketball or football.

I am now more interested in genetic algorithms though. I really think if you spend enough time on some of these ai algorithms you can come up with something very very good...

**Indecent** · 12-13-09, 10:01 PM

Originally posted by Saab

I really think if you spend enough time on some of these ai algorithms you can come up with something very very good...

**Hybris** · 12-18-09, 07:33 PM

Neuroxl

http://www.neuroxl.com/white_paper_artificial_intelligence.htm

Any good? I would love to try this out but I'm a total newbie with all of this. Anything good I can read to get a better understanding of how to use stats and why?

**Indecent** · 12-20-09, 10:52 PM

If you aren't a programmer, I'm afraid you are in for an uphill battle.

**FreeFall** · 12-20-09, 11:19 PM

I don't see the advantage to using a NN in this situation to generate a line. They are used more for robots that are finding their way through dynamic environments.

**Indecent** · 12-21-09, 01:02 AM

Originally posted by FreeFall

I don't see the advantage to using a NN in this situation to generate a line. They are used more for robots that are finding their way through dynamic environments.

No offense, but you have no idea what you are talking about.