how to learn to analyze data?

  • nivi
    SBR Hustler
    • 06-09-19
    • 70

    #1
    how to learn to analyze data?
    how can you build your own predictive model in Excel etc?

    is there anyone who can show us how?
  • Sanity Check
    SBR Posting Legend
    • 03-30-13
    • 10962

    #2
    You can look for "big data" courses @ places like udacity, coursera, udemy, khan academy et al.

    Many are free and offer decent content.

    Statistical analysis, big data and similar topics are part of the analytical models books use to generate odds.
    Comment
    • nivi
      SBR Hustler
      • 06-09-19
      • 70

      #3
      Originally posted by Sanity Check
      You can look for "big data" courses @ places like udacity, coursera, udemy, khan academy et al.

      Many are free and offer decent content.

      Statistical analysis, big data and similar topics are part of the analytical models books use to generate odds.
      thank you so much
      Comment
      • gojetsgomoxies
        SBR MVP
        • 09-04-12
        • 4222

        #4
        probability, econometrics, python..... maybe start with figuring out how to calculate power ratings. i may be wrong, this may not be simple.

        anyway, i think power ratings are a good place to start with analytics as they are ubiquitous, easy to understand and are logically consistent with how people think simply about quant stuff, e.g. rams are 7 points better than saints, but it's in NOLA so add back 4 points for the saints' strong HFA (home-field advantage) = 3 point rams spread.
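A minimal sketch of the arithmetic above, in Python. The ratings dictionary and the function name are made up for illustration; the only numbers taken from the post are the 7-point gap and the 4-point home-field adjustment.

```python
def projected_spread(rating_a: float, rating_b: float, hfa_b: float = 0.0) -> float:
    """Points by which team A is favored when the game is at team B's home.

    Positive means A is favored; negative means B is favored.
    """
    return (rating_a - rating_b) - hfa_b


# Hypothetical ratings, chosen so the Rams are 7 points better than the Saints.
ratings = {"Rams": 7.0, "Saints": 0.0}

# Game is in New Orleans, so the Saints get 4 points of home-field advantage.
spread = projected_spread(ratings["Rams"], ratings["Saints"], hfa_b=4.0)
print(f"Rams favored by {spread:g}")  # Rams favored by 3
```

The hard part is not this subtraction but producing the ratings themselves, which is where the actual modelling work goes.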

        another good place to learn a lot of stuff for free is MIT OpenCourseWare. thousands of free courses with lectures/course notes readily available.
        Comment
        • semibluff
          SBR MVP
          • 04-12-16
          • 1515

          #5
          Originally posted by gojetsgomoxies
          probability, econometrics, python..... maybe start with figuring out how to calculate power ratings. i may be wrong, this may not be simple.

          anyway, i think power ratings are a good place to start with analytics as they are ubiquitous, easy to understand and are logically consistent with how people think simply about quant stuff, e.g. rams are 7 points better than saints, but it's in NOLA so add back 4 points for the saints' strong HFA (home-field advantage) = 3 point rams spread.

          another good place to learn a lot of stuff for free is MIT OpenCourseWare. thousands of free courses with lectures/course notes readily available.
          Nice idea. Horrible example.
          Comment
          • nivi
            SBR Hustler
            • 06-09-19
            • 70

            #6
            mit open courses
            sounds amazing
            Comment
            • peacebyinches
              SBR MVP
              • 02-13-10
              • 1112

              #7
              Another thing to add to your toolkit, bayesian statistics: https://www.coursera.org/learn/bayesian-statistics
              Viewing statistics through the eyes of a bayesian framework can be a game changer for a lot of people. I wish I had adopted a bayesian approach in handicapping (and my real life job..) sooner.
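One simple way to see what the Bayesian view changes, sketched in Python: treat a win rate as unknown, start from a prior belief, and update it with results. This is a textbook Beta-Binomial update, not anything taken from the linked course; the prior strength and the 14-6 record are assumptions picked for illustration.

```python
def beta_update(prior_wins: float, prior_losses: float, wins: int, losses: int):
    """Return the posterior Beta parameters after observing wins and losses."""
    return prior_wins + wins, prior_losses + losses


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Assumed prior: roughly a coin flip, worth about 100 bets of evidence.
prior = (50.0, 50.0)

# Hypothetical sample: a system goes 14-6 over its first 20 bets.
posterior = beta_update(*prior, wins=14, losses=6)

print(f"raw win rate:   {14 / 20:.3f}")               # 0.700
print(f"posterior mean: {beta_mean(*posterior):.3f}")  # 0.533
```

The raw 70% looks like a gold mine; the posterior says a short hot streak should only nudge a coin-flip prior a little. How strong to make the prior is a judgment call.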
              Comment
              • nivi
                SBR Hustler
                • 06-09-19
                • 70

                #8
                Originally posted by peacebyinches
                Another thing to add to your toolkit, bayesian statistics: https://www.coursera.org/learn/bayesian-statistics
                Viewing statistics through the eyes of a bayesian framework can be a game changer for a lot of people. I wish I had adopted a bayesian approach in handicapping (and my real life job..) sooner.
                how?
                can you give us an example how it applies on sport betting?
                Comment
                • turtledoves
                  SBR MVP
                  • 08-27-17
                  • 3398

                  #9
                  based on my limited understanding, big data courses are for analyzing huge amounts of data, 100s of GB to TBs across a cluster of machines. you don't need to use tools like hadoop and apache spark when analyzing smaller datasets for sports analytics on a single machine.

                  might want to look into python, scikit-learn, pandas, numpy, scipy
                  tensorflow for machine learning is the hot tool
                  Comment
                  • peacebyinches
                    SBR MVP
                    • 02-13-10
                    • 1112

                    #10
                    Nate Silver (of fivethirtyeight.com) wrote an entire (very accessible) book on how it applies to sports betting, poker, finance, politics, weather, and even earthquakes called 'The signal and the noise'. It's a great place to start learning about statistics that is an easy read, not math intense, and entertaining in my opinion. If you prefer a more academic place to start there are a ton of cheap text books that show you the math behind the madness, such as this.
                    Comment
                    • Sanity Check
                      SBR Posting Legend
                      • 03-30-13
                      • 10962

                      #11
                      Originally posted by turtledoves
                      based on my limited understanding, big data courses are for analyzing huge amounts of data, 100s of GB to TBs across a cluster of machines. you don't need to use tools like hadoop and apache spark when analyzing smaller datasets for sports analytics on a single machine.

                      might want to look into python, scikit-learn, pandas, numpy, scipy
                      tensorflow for machine learning is the hot tool
                      It was confirmed people use big data analysis to win on fantasy sports platforms like fanduel and draftkings.

                      It's an old thing that has been around for years.

                      That's part of the reason many US states began banning fanduel and draftkings.
                      Comment
                      • nivi
                        SBR Hustler
                        • 06-09-19
                        • 70

                        #12
                        boys
                        give a Fuking example
                        a vid,a screenshot,wuteva

                        of how it analyzes data and gives an outcome


                        do not just post links and comment that they re great
                        Comment
                        • Bsims
                          SBR Wise Guy
                          • 02-03-09
                          • 827

                          #13
                          Originally posted by Sanity Check
                          It was confirmed people use big data analysis to win on fantasy sports platforms like fanduel and draftkings.

                          It's an old thing that has been around for years.

                          That's part of the reason many US states began banning fanduel and draftkings.
                          I believe the big data that was being used was done by insiders using all the customer picks. Hence the bans.
                          Comment
                          • Sanity Check
                            SBR Posting Legend
                            • 03-30-13
                            • 10962

                            #14
                            Originally posted by nivi
                            boys
                            give a Fuking example
                            a vid,a screenshot,wuteva

                            of how it analyzes data and gives an outcome


                            do not just post links and comment that they re great


                            I'll give you an example.

                            Imagine your goal was to increase the accuracy of points over/under plays. A person could write an algorithm to compile and analyze game stats over the last 5 to 10 years. Based on data, they might find saturday or sunday has a higher statistical probability of over games in comparison to mondays or tuesdays.

                            There might be certain days of the year when a high percentage of teams hit the over play.

                            The point is that a person can't predict what type of trends or patterns might emerge until after they analyze the data. Maybe there will be no patterns or trends. But sometimes a person might find something they can use to be more profitable.
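A rough Python sketch of the day-of-week idea described above, using only the standard library. The five games are invented, and a real check would need years of results plus some test of whether a difference is more than noise.

```python
from collections import defaultdict
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical game log: (game date, closing total, combined points scored).
games = [
    (date(2019, 1, 5), 210.5, 221),   # Saturday, over
    (date(2019, 1, 6), 208.0, 199),   # Sunday, under
    (date(2019, 1, 7), 215.0, 230),   # Monday, over
    (date(2019, 1, 12), 212.5, 220),  # Saturday, over
    (date(2019, 1, 14), 209.0, 201),  # Monday, under
]


def over_rate_by_weekday(games):
    """Return {weekday name: share of games that went over the total}.

    A game landing exactly on the total (a push) is counted as not over.
    """
    counts = defaultdict(lambda: [0, 0])  # weekday -> [overs, games played]
    for game_date, total, points in games:
        day = DAYS[game_date.weekday()]
        if points > total:
            counts[day][0] += 1
        counts[day][1] += 1
    return {day: overs / played for day, (overs, played) in counts.items()}


rates = over_rate_by_weekday(games)
for day, rate in rates.items():
    print(f"{day}: {rate:.0%} over")
```

With a sample this small the percentages mean nothing; the sketch only shows the shape of the grouping step.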



                            Originally posted by Bsims
                            I believe the big data that was being used was done by insiders using all the customer picks. Hence the bans.
                            Bro. You don't know what big data is. Go derail some other thread with your jibber jabber.
                            Comment
                            • nivi
                              SBR Hustler
                              • 06-09-19
                              • 70

                              #15
                              Originally posted by Sanity Check
                              I'll give you an example.

                              Imagine your goal was to increase the accuracy of points over/under plays. A person could write an algorithm to compile and analyze game stats over the last 5 to 10 years. Based on data, they might find saturday or sunday has a higher statistical probability of over games in comparison to mondays or tuesdays.

                              There might be certain days of the year when a high percentage of teams hit the over play.

                              i am looking for something easier
                              like team A v team B
                              calculate the odds for an over outcome

                              how can i accomplish that ?
                              Comment
                              • Sanity Check
                                SBR Posting Legend
                                • 03-30-13
                                • 10962

                                #16
                                Originally posted by nivi
                                i am looking for something easier
                                like team A v team B
                                calculate the odds for an over outcome

                                how can i accomplish that ?
                                There are many variables in sports, there isn't necessarily an easy way to be consistently accurate without putting in some work.

                                The basics would be looking at average points scored per game and avg points allowed per game as a basis for offense versus defense. Then looking at the last few games played to see if they're on a hot or cold streak. Then looking at how far into the season we are to get an indication of how motivated teams might be. Etc. Just really basic things most have likely tried and not found much success with.
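As a sketch of that first step (offense versus defense), here is one naive way to turn the averages into a projected total in Python. Averaging a team's scoring with its opponent's points allowed is an assumption made for illustration, not a method from this thread, and the team numbers are invented.

```python
def projected_points(avg_scored: float, opp_avg_allowed: float) -> float:
    """Blend a team's scoring average with what its opponent usually allows."""
    return (avg_scored + opp_avg_allowed) / 2


# Hypothetical per-game averages.
team_a = {"scored": 112.0, "allowed": 108.0}
team_b = {"scored": 105.0, "allowed": 110.0}

a_points = projected_points(team_a["scored"], team_b["allowed"])  # (112 + 110) / 2
b_points = projected_points(team_b["scored"], team_a["allowed"])  # (105 + 108) / 2
projected_total = a_points + b_points

print(a_points, b_points, projected_total)  # 111.0 106.5 217.5
```

Comparing the projected total with the posted over/under line is the manual check; streaks, schedule and motivation would be adjustments layered on top.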

                                There are a few really basic trends casual gamblers probably know. Like how NBA games in the playoffs have a tendency to be more tactical and slow paced, with higher pressure, resulting in lower point games than in regular season.

                                The idea behind big data is to take that type of observation to the next level. It's something I thought about but never really got around to doing, as I've been looking for ways to make money outside of gambling.
                                Comment
                                • nivi
                                  SBR Hustler
                                  • 06-09-19
                                  • 70

                                  #17
                                  Originally posted by Sanity Check
                                  The basics would be looking at average points scored per game and avg points allowed per game as a basis for offense versus defense. Then looking at the last few games played to see if they're on a hot or cold streak. Then looking at how far into the season we are to get an indication of how motivated teams might be. Etc. Just really basic things most have likely tried and not found much success with.

                                  yes i know how to analyze data of two teams manually
                                  thank you

                                  what i m looking for is a software to calculate many teams from many leagues
                                  using poisson distri, elo ratings etc
                                  in no time

                                  can we have that?
                                  Comment
                                  • HeeeHAWWWW
                                    SBR Hall of Famer
                                    • 06-13-08
                                    • 5487

                                    #18
                                    Originally posted by Sanity Check
                                    It was confirmed people use big data analysis to win on fantasy sports platforms like fanduel and draftkings.
                                    Big data has become a marketing term, flung around by journalists without much idea what it means. Mostly when it's used they mean machine learning - although again, that's lost its meaning as a buzzword.

                                    Something like "predictive modelling" is probably most accurate.
                                    Comment
                                    • turtledoves
                                      SBR MVP
                                      • 08-27-17
                                      • 3398

                                      #19


                                      Comment
                                      • Bsims
                                        SBR Wise Guy
                                        • 02-03-09
                                        • 827

                                        #20
                                        Originally posted by Sanity Check
                                        Bro. You don't know what big data is. Go derail some other thread with your jibber jabber.
                                        I was simply pointing out why these fantasy companies ran afoul of the legal system a few years ago.

                                        I agree with HeeeHAWWWW. "Big data has become a marketing term, flung around by journalists without much idea what it means."
                                        Comment
                                        • Sanity Check
                                          SBR Posting Legend
                                          • 03-30-13
                                          • 10962

                                          #21
                                          Originally posted by nivi
                                          yes i know how to analyze data of two teams manually
                                          thank you

                                          what i m looking for is a software to calculate many teams from many leagues
                                          using poisson distri, elo ratings etc
                                          in no time

                                          can we have that?

                                          Hopefully someone else will comment, that's something I know nothing about.


                                          Originally posted by HeeeHAWWWW
                                          Big data has become a marketing term, flung around by journalists without much idea what it means. Mostly when it's used they mean machine learning - although again, that's lost its meaning as a buzzword.

                                          Something like "predictive modelling" is probably most accurate.


                                          Machine learning has been around for decades. It's an old field that dates back to the 1950s, around Alan Turing's day. Big data is a more recent development which might include things like deep packet inspection and widespread collection of consumer and user metadata, which are not necessarily related to machine learning.

                                          AFAIK anyways.

                                          Originally posted by Bsims
                                          I was simply pointing out why these fantasy companies ran afoul of the legal system a few years ago.

                                          I agree with HeeeHAWWWW. "Big data has become a marketing term, flung around by journalists without much idea what it means."
                                          Fantasy sports and sports gambling are banned for the same reasons state based regulation disapproves of bitcoin exchanges and escorts advertising on craigslist.

                                          Fantasy sports, gambling, crypto and adult entertainment are all sources of revenue for small business owners in the USA which translates to economic growth and a degree of prosperity. The goal for legislators is to restrict economic growth and crack down on independent operators in those areas.
                                          Comment
                                          • Believe_EMT
                                            SBR Wise Guy
                                            • 03-31-19
                                            • 508

                                            #22
                                            Originally posted by nivi
                                            can we have that?
                                            bro, no one is going to hand you the keys to the kingdom. guys break their dikks working years to build models to find the slightest of edges. besides, we are all given this information every single fukkin day

                                            line opens with limited information
                                            more information becomes known
                                            line moves
                                            nearly 100% of knowable information is known
                                            line closes

                                            independent event happens

                                            a nearly infinite number of independent events happen that prove the closing line is far more efficient than the opener

                                            if you want to continue to reinvent the wheel, have it at. just don't come around demanding people hand you the answers. do some fukking work.
                                            Comment
                                            • nivi
                                              SBR Hustler
                                              • 06-09-19
                                              • 70

                                              #23
                                              Originally posted by Believe_EMT
                                              do some fukking work.

                                              i do fukking your whore mother daily
                                              now fukk off and stop spamming my threads you c*nt
                                              Comment
                                              • oilcountry99
                                                SBR Wise Guy
                                                • 08-29-10
                                                • 707

                                                #24
                                                Originally posted by nivi
                                                i do fukking your whore mother daily
                                                now fukk off and stop spamming my threads you c*nt
                                                Nivi.... EMT is correct, lazy won’t get you anywhere in this arena. Take a downer
                                                Comment
                                                • nivi
                                                  SBR Hustler
                                                  • 06-09-19
                                                  • 70

                                                  #25
                                                  Originally posted by oilcountry99
                                                  Nivi.... EMT is correct, lazy won’t get you anywhere in this arena. Take a downer
                                                  wanna chat with me?
                                                  send me a pm

                                                  but do not spam on the thread
                                                  Comment
                                                  • peacebyinches
                                                    SBR MVP
                                                    • 02-13-10
                                                    • 1112

                                                    #26
                                                    Fine, here is a solid example, step by step, of how to analyze data and pick winners.

                                                    First, assume we have Team A and Team B. Scrape the web for the lines and the past data relating to the 3 most predictive statistics or metrics (e.g. the 3 things that explain the most variance in outcomes), determined by either an independent component analysis or a factor analysis (I am more familiar with ICA but it is up to you). To be more specific, you need to somehow incorporate some measure of kurtosis (fancy word for the 'tailed-ness' of a distribution, not its skew) with which to recover the multiple source signals by finding the correct weight vectors with the use of projection pursuit (this will be important later). The kurtosis of the probability density function of a signal, for a finite sample, is computed as

                                                    $$K = \frac{\mathbb{E}[(y - \bar{y})^4]}{\left(\mathbb{E}[(y - \bar{y})^2]\right)^2} - 3$$

                                                    where $\bar{y}$ is the sample mean of $y$, the extracted signals. The constant 3 ensures that Gaussian signals have zero kurtosis, Super-Gaussian signals have positive kurtosis, and Sub-Gaussian signals have negative kurtosis. The denominator is the squared variance of $y$, and ensures that the measured kurtosis takes account of signal variance. The goal of projection pursuit is to maximize the kurtosis, and make the extracted signal as non-normal as possible.
                                                    Using kurtosis as a measure of non-normality, we can now examine how the kurtosis of a signal $y = w^T x$ extracted from a set of M mixtures $x = (x_1, x_2, \ldots, x_M)^T$ varies as the weight vector $w$ is rotated around the origin. Given our assumption that each source signal $s$ is super-gaussian we would expect:
                                                    1. the kurtosis of the extracted signal $y$ to be maximal precisely when $y = s$.
                                                    2. the kurtosis of the extracted signal $y$ to be maximal when $w$ is orthogonal to the projected axes $S_1$ or $S_2$, because we know the optimal weight vector should be orthogonal to a transformed axis $S_1$ or $S_2$.

                                                    For multiple source mixture signals, we can use kurtosis and Gram-Schmidt Orthogonalization (GSO) to recover the signals. Given M signal mixtures in an M-dimensional space, GSO projects these data points onto an (M-1)-dimensional space by using the weight vector. We can guarantee the independence of the extracted signals with the use of GSO.
                                                    In order to find the correct value of $w$, we can use a gradient method. We first of all whiten the data, and transform $x$ into a new mixture $z = (z_1, z_2, \ldots, z_M)^T$, which has unit variance. This process can be achieved by applying singular value decomposition to $x$,

                                                    $$x = U D V^T$$

                                                    Rescaling each vector $U_i \leftarrow U_i / \sqrt{\mathbb{E}[U_i^2]}$, and let $z = U$. The signal extracted by a weight vector $w$ is $y = w^T z$. If the weight vector $w$ has unit length, then the variance of $y$ is also 1, that is $\mathbb{E}[(w^T z)^2] = 1$, then the kurtosis can be written as:

                                                    $$K = \frac{\mathbb{E}[y^4]}{\left(\mathbb{E}[y^2]\right)^2} - 3 = \mathbb{E}[(w^T z)^4] - 3$$

                                                    The updating process for $w$ is:

                                                    $$w_{new} = w_{old} + \eta\, \mathbb{E}[z\, (w_{old}^T z)^3]$$

                                                    where $\eta$ is a small constant to guarantee that $w$ converges to the optimal solution. After each update, we normalize $w_{new} = w_{new} / |w_{new}|$, set $w_{old} = w_{new}$, and repeat the updating process till it converges.
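For anyone who wants to poke at the kurtosis measure described above, here is a small standard-library Python sketch of the excess kurtosis of a sample, i.e. E[(y - mean)^4] / (E[(y - mean)^2])^2 - 3. It covers only that one quantity, not the whitening or weight-vector updates, and the three test distributions are chosen just to show the sign behaviour.

```python
import random


def excess_kurtosis(y):
    """Sample excess kurtosis: E[(y - mean)^4] / (E[(y - mean)^2])^2 - 3."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n  # second central moment (variance)
    m4 = sum((v - mean) ** 4 for v in y) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


random.seed(0)  # fixed seed so the run is repeatable
n = 20_000
gaussian = [random.gauss(0, 1) for _ in range(n)]
uniform = [random.uniform(-1, 1) for _ in range(n)]  # sub-Gaussian
# The difference of two exponentials is Laplace distributed (super-Gaussian).
laplace = [random.expovariate(1) - random.expovariate(1) for _ in range(n)]

for name, sample in [("gaussian", gaussian), ("uniform", uniform), ("laplace", laplace)]:
    print(f"{name:8s} excess kurtosis = {excess_kurtosis(sample):+.2f}")
```

The Gaussian sample should come out near zero, the uniform one negative and the Laplace one positive, matching the description of what the constant 3 is for.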





                                                    Great, now that you've extracted this information using your preferred software (personally I am a MATLAB guy, but you can do this in python as well since I know that is waaayy more popular and intuitive for most people) you only have two more things left to do before you can determine which team to place your bet on. Next to last step: find a coin and flip it. If it is heads, bet on Team A, if it is tails, bet on Team B. Last step: take some xanax and chill out. No one here, or anywhere, is going to walk you through how to simply 'analyze data'. That's like asking us to teach you how to 'do physics'. Be more specific.
                                                    Last edited by peacebyinches; 07-01-19, 11:57 AM.
                                                    Comment
                                                    • nivi
                                                      SBR Hustler
                                                      • 06-09-19
                                                      • 70

                                                      #27
                                                      Originally posted by peacebyinches
                                                      Be more specific.
                                                      why all this negativity
                                                      i thought this thread would be a positive one for everyone


                                                      what i want is to learn how to do this

                                                      Comment
                                                      • peacebyinches
                                                        SBR MVP
                                                        • 02-13-10
                                                        • 1112

                                                        #28
                                                        Originally posted by nivi
                                                        what i want is to learn how to do this

                                                        https://www.youtube.com/watch?v=mUO2wPNthAw
                                                        Copy data into an excel file?? Control+c will copy, control+v will paste.
                                                        Comment
                                                        • nivi
                                                          SBR Hustler
                                                          • 06-09-19
                                                          • 70

                                                          #29
                                                          Originally posted by peacebyinches
                                                          Copy data into an excel file?? Control+c will copy, control+v will paste.
                                                          this is what it delivers


                                                          pretty much he applies Poisson distribution
                                                          and the fukker selling it for 50 euros

                                                          what i want to do is a way to insert all the data of a league in excel and to get the probabilities calculated by Poisson distribution and if possible Elo ratings as well,
                                                          for the coming matches


                                                          i can do it manually for a match with two teams but if i can do it for a whole league with excel that would save me soooooooo much time
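The "whole league at once" part can be sketched in plain Python rather than Excel. This is one common simplification, assumed here for illustration: each team gets an attack and a defence strength relative to the league average, home advantage is ignored, and total goals in a match are treated as Poisson. The results and fixtures are invented.

```python
from collections import defaultdict
from math import exp, factorial

# Hypothetical results: (home team, away team, home goals, away goals).
results = [
    ("A", "B", 2, 1), ("B", "C", 0, 0), ("C", "A", 1, 3),
    ("B", "A", 1, 1), ("C", "B", 2, 0), ("A", "C", 4, 1),
]
fixtures = [("A", "B"), ("B", "C"), ("C", "A")]  # hypothetical upcoming matches


def expected_goals(results, fixtures):
    """Return {(home, away): (home expected goals, away expected goals)}."""
    scored, conceded, played = defaultdict(int), defaultdict(int), defaultdict(int)
    for home, away, hg, ag in results:
        scored[home] += hg
        conceded[home] += ag
        scored[away] += ag
        conceded[away] += hg
        played[home] += 1
        played[away] += 1
    league_avg = sum(scored.values()) / sum(played.values())  # goals per team per game

    def rate(attacker, defender):
        attack = scored[attacker] / played[attacker] / league_avg
        defence = conceded[defender] / played[defender] / league_avg
        return attack * defence * league_avg

    return {(h, a): (rate(h, a), rate(a, h)) for h, a in fixtures}


def prob_over(total_rate, line=2.5):
    """P(total goals > line) when total goals ~ Poisson(total_rate).

    For a whole-number line, landing exactly on it counts as not over.
    """
    p_not_over = sum(exp(-total_rate) * total_rate ** k / factorial(k)
                     for k in range(int(line) + 1))
    return 1 - p_not_over


xg = expected_goals(results, fixtures)
for (home, away), (hg, ag) in xg.items():
    print(f"{home} v {away}: {hg:.2f}-{ag:.2f}, P(over 2.5) = {prob_over(hg + ag):.1%}")
```

Swapping the `results` list for a season of real scores is all it takes to run every fixture in a league in one pass. A fuller model would split home and away form and weight recent matches more heavily.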
                                                          Comment
                                                          • HeeeHAWWWW
                                                            SBR Hall of Famer
                                                            • 06-13-08
                                                            • 5487

                                                            #30
                                                            Originally posted by nivi
                                                            i can do it manually for a match with two teams but if i can do it for a whole league with excel that would save me soooooooo much time
                                                            Use R, import your match results, bit of format fiddling, then use the PlayerRatings package:
                                                            Implements schemes for estimating player or team skill based on dynamic updating. Implemented methods include Elo, Glicko, Glicko-2 and Stephenson. Contains pdf documentation of a reproducible analysis using approximately two million chess matches. Also contains an Elo based method for multi-player games where the result is a placing or a score. This includes zero-sum games such as poker and mahjong.



                                                            Gives you Elo (and various others like Glicko and Stephenson) for all rows in a few seconds.
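For anyone not using R, the basic Elo update is short enough to sketch in Python. This is the plain textbook formula, not the PlayerRatings implementation; the K-factor of 20, the 1500 starting rating and the three matches are assumptions for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo expected score for A against B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def run_elo(matches, k: float = 20.0, start: float = 1500.0):
    """Process matches in order; each match is (team_a, team_b, score_a)."""
    ratings = {}
    for team_a, team_b, score_a in matches:
        ra = ratings.setdefault(team_a, start)
        rb = ratings.setdefault(team_b, start)
        delta = k * (score_a - expected_score(ra, rb))
        ratings[team_a] = ra + delta  # winner gains what the loser gives up
        ratings[team_b] = rb - delta
    return ratings


# Hypothetical results: score_a is 1 for a win, 0.5 for a draw, 0 for a loss.
matches = [("A", "B", 1), ("B", "C", 0.5), ("C", "A", 0)]
ratings = run_elo(matches)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:.1f}")
```

Feeding in a full season of results, in date order, rates every team in one loop.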
                                                            Comment
                                                            • peacebyinches
                                                              SBR MVP
                                                              • 02-13-10
                                                              • 1112

                                                              #31
                                                              I second using R (might take a little effort to learn if you haven't used it before, but should be well worth it), and that looks like a very handy package HH. I can't believe I've never checked the R package repositories for anything similar! Looks like there's some NBA and MLB and NFL.

                                                              holy crap... looks like I'm not getting any actual work done for a while now
                                                              Comment
                                                              • nivi
                                                                SBR Hustler
                                                                • 06-09-19
                                                                • 70

                                                                #32
                                                                can anyone post a video of predicting probabilities for soccer matches with R?
                                                                Comment
                                                                • ChuckyTheGoat
                                                                  BARRELED IN @ SBR!
                                                                  • 04-04-11
                                                                  • 37274

                                                                  #33
                                                                  Originally posted by nivi
                                                                  can anyone post a video of predicting possibilities for soccer matches with R?
                                                                  Nivi, guys like Heehaw + Peace are doing a very good job in this thread. I can't pinpoint the link, but the Poisson distribution output is a good start. Here is the basic premise:

                                                                  We need to get to the point where we project TWO NUMBERS, E(HomeGoals) and E(AwayGoals). If we assume two INDEPENDENT Poisson distributions... then everything else falls out. Think of it like two columns:

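                                                                  The cumulative values F(0), F(1), F(2), etc. From that we get the exact-goal probabilities f(0), f(1), f(2), etc. Do that for both Home + Away. We can then calculate our Home Wins, Away Wins etc. by multiplying the Home and Away probabilities for each scoreline and adding up the scorelines that give each result. This is a very good starting point.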
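To make the premise above concrete, here is a rough Python sketch. It assumes you already have the two projected numbers; the 1.6 and 1.1 goal expectations, the 10-goal cutoff and the function names are all made up for illustration and don't come from any package mentioned in this thread.

```python
from math import exp, factorial


def poisson_pmf(k, mu):
    """f(k): probability of exactly k goals when the expected goals are mu."""
    return exp(-mu) * mu**k / factorial(k)


def match_probabilities(home_xg, away_xg, max_goals=10):
    """Return (home win, draw, away win) for two independent Poisson teams.

    max_goals is an assumed cutoff; scores above it are so unlikely for
    typical soccer inputs that we just renormalise the tiny leftover.
    """
    home = [poisson_pmf(k, home_xg) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, away_xg) for k in range(max_goals + 1)]

    home_win = draw = away_win = 0.0
    for h, p_home in enumerate(home):
        for a, p_away in enumerate(away):
            p = p_home * p_away  # independence: multiply the two columns
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p

    total = home_win + draw + away_win
    return home_win / total, draw / total, away_win / total


if __name__ == "__main__":
    # Hypothetical inputs: E(HomeGoals) = 1.6, E(AwayGoals) = 1.1
    hw, d, aw = match_probabilities(1.6, 1.1)
    print(f"home {hw:.3f}  draw {d:.3f}  away {aw:.3f}")
```

The same grid of scoreline probabilities can be summed differently to get other markets (totals, both teams to score), but the independence assumption is a simplification, so treat the output as a starting point only.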

                                                                  And I echo Peace's note above. Read Nate Silver's book. He's not going to give u a license to print money...but it's a very good starting point.
                                                                  Where's the fuckin power box, Carol?
                                                                  Comment
                                                                  • ChuckyTheGoat
                                                                    BARRELED IN @ SBR!
                                                                    • 04-04-11
                                                                    • 37274

                                                                    #34
                                                                    It's a link u posted in another thread:

                                                                    Unique tool for calculating various probabilities for sports events using Poisson distribution formula with many enhancements.


                                                                    Navigate those tabs on the top. Poisson is a really good basic approach for soccer game analysis.

                                                                    The real question is whether your INPUT will be close to correct. The output from that link is certainly good, if a little confusing to look at.
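To see why the input matters so much, here is a small Python sketch that nudges the home goal expectation and recomputes the home win chance under the same independent-Poisson assumption. The numbers (1.3, 1.5, 1.7 against 1.1) and the helper name are hypothetical, picked only to show the effect.

```python
from math import exp, factorial


def home_win_prob(home_xg, away_xg, max_goals=10):
    """P(home goals > away goals) for two independent Poisson teams.

    max_goals is an assumed cutoff; the probability mass above it is
    negligible for typical soccer goal expectations.
    """
    def pmf(k, mu):
        return exp(-mu) * mu**k / factorial(k)

    return sum(
        pmf(h, home_xg) * pmf(a, away_xg)
        for h in range(max_goals + 1)
        for a in range(h)  # only scorelines where away < home
    )


if __name__ == "__main__":
    # Same away projection, three slightly different home projections
    for home_xg in (1.3, 1.5, 1.7):
        p = home_win_prob(home_xg, 1.1)
        print(f"E(HomeGoals)={home_xg:.1f} -> home win {p:.3f}")
```

If a shift of 0.2 goals in the projection moves the win probability by more than the bookmaker's margin, the calculator is not the hard part; getting the two input numbers right is.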
                                                                    Comment
                                                                    • nivi
                                                                      SBR Hustler
                                                                      • 06-09-19
                                                                      • 70

                                                                      #35
                                                                      Originally posted by ChuckyTheGoat
                                                                      It's a link u posted in another thread:

                                                                      Unique tool for calculating various probabilities for sports events using Poisson distribution formula with many enhancements.


                                                                      Navigate those tabs on the top. Poisson is a really good basic approach for soccer game analysis.

                                                                      The real question is whether your INPUT will be close to correct. The output from that link is certainly good, if a little confusing to look at.
                                                                      yeah this calculator looks dope
                                                                      but i really have no idea how to use it

                                                                      here is a simpler one
                                                                      Poisson calculator using goal expectancy to calculate percentage chance of number of goals scored by home and away team
                                                                      Comment