Originally Posted by Blax0r
I think the statistics he's trying to compute have a huge time-cost to generate per race; i.e., he has to loop through each horse's/rider's/whatever's entire history, do some calculation, and use that output as an input for the following races. One example would be some crazy decision tree that is "path-dependent", meaning he needs some representation of the route taken to reach the current node. You can definitely pull in the necessary data and do the entire computation in php, but with 200,000 historical samples (according to Travis), it'll be slow.
He was trying to minimize this cost through various channels, but it may ultimately be best to just precompute and store those stats (and only do the actual model output for "live" races in php). If you have a better solution, I'd be interested to hear it as well, since I have a similar problem.
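To make the "store the stats" idea concrete, here's a minimal sketch (in Python rather than php, purely for illustration) of the single-pass approach: fold through the history once in chronological order, keeping a running stats row per horse, so each race sees the stats as they stood *before* it ran (which is what makes it path-dependent) and the live model just reads the final stored rows. The record layout and stat fields here are hypothetical, not Travis's actual schema.

```python
from collections import defaultdict

# Hypothetical race records: (race_id, horse, finish_position),
# already sorted chronologically.
history = [
    (1, "A", 1), (1, "B", 3),
    (2, "A", 2), (2, "B", 1),
    (3, "A", 1),
]

stats = defaultdict(lambda: {"starts": 0, "wins": 0, "win_rate": 0.0})

# Single pass over the full history instead of a per-race rescan.
for race_id, horse, pos in history:
    s = stats[horse]
    # At this point, `s` holds the stats *before* this race -- this is
    # where they'd be used as a model input for the race in question.
    s["starts"] += 1
    s["wins"] += int(pos == 1)
    s["win_rate"] = s["wins"] / s["starts"]

# `stats` is what you'd persist in a table keyed by horse, so scoring a
# live race costs one lookup instead of a 200k-row scan.
```

The same fold works for fancier path-dependent features; the key property is that each stat can be updated incrementally from the previous value plus one new race, rather than recomputed from the whole history.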