1. #1
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Want your own college basketball database using Python?

    Step 1: Download anaconda from here: https://www.anaconda.com/products/individual

    Step 2: Open Spyder that is installed with it

    Step 3: Paste the code below into a blank script, pasting over the lines at the top of the script that get automatically generated

    Step 4: Click the run button

    You might get errors based on installation issues. Post them here and I'll tell you how to fix them.

    If you want more years, go to

    "years = ['2020']" in the code

    add any years you want to it, but put each in single quotes ex. '2019'

    Use a comma to separate them ex. ['2020','2019','2018']

    I haven't tested many years in the past, so older seasons could have formatting errors.

    Let me know if any of the data is incorrect. This is a year old, and I haven't used it (very tired of losing at college basketball).
    It will give you a csv file with the data, wherever you have the script saved.


    Code:
    # -*- coding: utf-8 -*-
    """
    Created on Mon Nov 30 20:10:35 2020
    
    @author: Waterstpub87
    """
    
    import numpy as np
    import pandas as pd
    
    years = ['2020']
    
    for year in years:
    
            schoolsurl = "https://www.sports-reference.com/cbb/seasons/" + year + "-school-stats.html"
    
            schools = pd.read_html(schoolsurl)
    
    
            df = schools[0]
    
            df = df[list(df)]
            
    
            scl = df['Overall']
    
            scl['School'] = scl['School'].str.replace('NCAA','')
            scl['School'] = scl['School'].str.strip()
            scl.index = scl['School']
            scl['URL'] = scl['School']
            scl['URL'] = scl['URL'].str.replace(' ','-')
            scl['URL'] = scl['URL'].str.replace('.','')
            scl['URL'] = scl['URL'].str.replace('&','')
            scl['URL'] = scl['URL'].str.replace('(','')
            scl['URL'] = scl['URL'].str.replace(')','')
            scl['URL'] = scl['URL'].str.replace("'",'')
            scl['URL'] = scl['URL'].str.replace("--",'-')
            scl['URL'] = scl['URL'].str.lower()
            scl['URL'] = scl['URL'].str.replace('little-rock','arkansas-little-rock')
            scl['URL'] = scl['URL'].str.replace('uc-','california-')
            scl['URL'] = scl['URL'].str.replace('university-of-california','california')
            scl['URL'] = scl['URL'].str.replace('purdue-fort-wayne','ipfw')
            scl['URL'] = scl['URL'].str.replace('fort-wayne','ipfw')
            scl['URL'] = scl['URL'].str.replace('omaha','nebraska-omaha')
            scl['URL'] = scl['URL'].str.replace('siu-edwardsville','southern-illinois-edwardsville')
            scl['URL'] = scl['URL'].str.replace('texas-rio-grande-valley','texas-pan-american')
            #scl['URL'] = scl['URL'].str.replace('vmi','virginia-military-institute')
            scl['URL'] = scl['URL'].str.replace('cal-state-long-beach','long-beach-state')
            scl.loc['Louisiana', 'URL'] = 'louisiana-lafayette'
            scl.loc['VMI', 'URL'] = 'virginia-military-institute'
            scl = scl[scl['School'] != 'Overall']
            scl = scl[scl['School'] != 'School']
            
            scl.index = scl['URL']
    
    
            for x in scl['URL']:
                try:
                    url = 'https://www.sports-reference.com/cbb/schools/' + x + '/' + year + '-gamelogs.html'
                    data = pd.read_html(url)
                    data = data[0]
                    data = data[list(data)]
                    
                    data['School1'] = scl.loc[x]['School']
                    if x == 'abilene-christian' and years[0]==year:
                        results = data
                    else:
                        results = pd.concat([results, data])
                except Exception:
                    # some schools 404 or have no gamelog table; skip them
                    pass
            results.to_csv('CBBDB.csv')
            cols = ['G','Date','Location','Opp','Results','Points','Points Against','FG','FGA','FG%','3P','3PA','3P%','FT','FTA','FT%','ORB','TRB','AST','STL','BLK','TOV','PF','Blank','OPPFP','OPFPA','OPFG%','OPP3P','OPP3PA','OPP3P%','OPPFT','OPPFTA','OPPFT%','OPPORB','OPPTRB','OPPAST','OPPSTL','OPPBLK','OPPTOV','OPPPF','School']
            results.columns = cols
            mid = results['School']
            results.drop(labels=['School'], axis=1, inplace=True)
            results.insert(2, 'School', mid)

            results.drop(labels=['Blank'], axis=1, inplace=True)
            results.drop(labels=['FG%'], axis=1, inplace=True)
            results.drop(labels=['3P%'], axis=1, inplace=True)
            results.drop(labels=['FT%'], axis=1, inplace=True)
            results.drop(labels=['OPFG%'], axis=1, inplace=True)
            results.drop(labels=['OPP3P%'], axis=1, inplace=True)
            results.drop(labels=['OPPFT%'], axis=1, inplace=True)
        
    
    
            results = results[results.Date != 'School']
            results = results[results.Date != 'Date']
            results= results.fillna(0)
            counter = 6
            cols = list(results)
            while counter < 34:
                column = cols[counter]
                results[column] = results[column].astype(int)
                counter = counter +1
    
    
            results['Pace'] = (.50*(results['FGA'] + (.49*results['FTA']) + results['TOV'] - results['ORB'])) + (.50*(results['OPFPA'] + (.49*results['OPPFTA']) - results['OPPORB'] + results['OPPTOV']))
            results['School'] = results['School'].str.replace('Cal State Long Beach','Long Beach State')
            results['School']= results['School'].str.replace('SIU Edwardsville','Southern Illinois-Edwardsville')
            results['School']= results['School'].str.replace('VMI','Virginia Military Institute')
        
            results['Opp']= results['Opp'].str.replace('UMBC','Maryland-Baltimore County')
            results['Opp']= results['Opp'].str.replace('UNLV','Nevada-Las Vegas')
            results['Opp']= results['Opp'].str.replace('Detroit','Detroit Mercy')
            results['Opp']= results['Opp'].str.replace('BYU','Brigham Young')
            results['Opp']= results['Opp'].str.replace('Southern Miss','Southern Mississippi')
            results['Opp']= results['Opp'].str.replace('UTEP','Texas-El Paso')
            results['Opp']= results['Opp'].str.replace('UTSA','Texas-San Antonio')
            results['Opp']= results['Opp'].str.replace('UCF','Central Florida')
            results['Opp']= results['Opp'].str.replace('LSU','Louisiana State')
            results['Opp']= results['Opp'].str.replace('Ole Miss','Mississippi')
            results['Opp']= results['Opp'].str.replace('LIU-Brooklyn','Long Island University')
            results['Opp']= results['Opp'].str.replace('UMass-Lowell','Massachusetts-Lowell')
            results['Opp']= results['Opp'].str.replace('California','University of California')
            results['Opp']= results['Opp'].str.replace('USC','Southern California')
            results['Opp']= results['Opp'].str.replace('UConn','Connecticut')
            results['Opp']= results['Opp'].str.replace('UMass','Massachusetts')
            results['Opp']= results['Opp'].str.replace('UCSB','UC-Santa Barbara')
            results['Opp']= results['Opp'].str.replace('UNC Wilmington','North Carolina-Wilmington')
            results['Opp']= results['Opp'].str.replace("St. Peter's","Saint Peter's")
            results['Opp']= results['Opp'].str.replace('UNC Asheville','North Carolina-Asheville')
            results['Opp']= results['Opp'].str.replace('NC State','North Carolina State')
            results['Opp']= results['Opp'].str.replace('UNC','North Carolina')
            results['Opp']= results['Opp'].str.replace('Central Connecticut','Central Connecticut State')
            results['Opp']= results['Opp'].str.replace('UT-Martin','Tennessee-Martin')
            results['Opp']= results['Opp'].str.replace('TCU','Texas Christian')
            results['Opp']= results['Opp'].str.replace("Saint Mary's","Saint Mary's (CA)")
            results['Opp']= results['Opp'].str.replace("Pitt","Pittsburgh")
            results['Opp']= results['Opp'].str.replace("VCU","Virginia Commonwealth")
            results['Opp']= results['Opp'].str.replace("UIC","Illinois-Chicago")
            results['Opp']= results['Opp'].str.replace("SMU","Southern Methodist")
            results['Opp']= results['Opp'].str.replace("Penn","Pennsylvania")
            results['Opp']= results['Opp'].str.replace("USC Upstate","South Carolina Upstate")
            results['Opp']= results['Opp'].str.replace("UMKC","Missouri-Kansas City")
            results['Opp']= results['Opp'].str.replace("UNC Greensboro","North Carolina-Greensboro")
            results['Opp']= results['Opp'].str.replace("St. Joseph's","Saint Joseph's")
            results['Opp']= results['Opp'].str.replace("ETSU","East Tennessee State")
            results['Opp']= results['Opp'].str.replace("Pennsylvania State","Penn State")
            results['Opp']= results['Opp'].str.replace("North Carolina Greensboro","North Carolina-Greensboro")
            results['Opp']= results['Opp'].str.replace("Southern California Upstate","South Carolina Upstate")
            results['Opp']= results['Opp'].str.replace("University of California Baptist","California Baptist")
            results['Opp']= results['Opp'].str.replace('SIU-Edwardsville','Southern Illinois-Edwardsville')
            results['Opp']= results['Opp'].str.replace('VMI','Virginia Military Institute')
                
            results.to_csv('CBBD'+year+'.csv')
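    The Pace line near the bottom estimates possessions per game, averaged over both teams. A minimal sanity check of that formula with made-up box-score numbers (not real game data):

```python
# Possession estimate used in the script, averaged over both teams:
# .50*(FGA + .49*FTA + TOV - ORB) + .50*(OppFGA + .49*OppFTA - OppORB + OppTOV)
def pace(fga, fta, tov, orb, opp_fga, opp_fta, opp_orb, opp_tov):
    team = fga + 0.49 * fta + tov - orb
    opp = opp_fga + 0.49 * opp_fta - opp_orb + opp_tov
    return 0.5 * team + 0.5 * opp

# Hypothetical game: 60 FGA, 20 FTA, 12 TOV, 10 ORB against
# an opponent with 58 FGA, 18 FTA, 9 ORB, 14 TOV
print(round(pace(60, 20, 12, 10, 58, 18, 9, 14), 2))  # 71.81
```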
    Last edited by Waterstpub87; 11-30-20 at 09:29 PM.

  2. #2
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Couple of errors in the paste:
    df = df

    [list(df)]

    the brackets should be next to the df, on the same line: df = df[list(df)]

  3. #3
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    This will throw warnings in the console. Don't worry about it. I never cared to fix it.

  4. #4
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    You get a spreadsheet that looks like this, with all information, roughly 40 columns:
    G Date School Location Opp Results Points Points Against FG FGA 3P
    1 11/5/2019 Abilene Christian 0 Arlington Baptist W 90 39 36 75 6
    2 11/10/2019 Abilene Christian @ Drexel L (1 OT) 83 86 30 65 9
    3 11/16/2019 Abilene Christian 0 Pepperdine L 69 73 20 50 5
    4 11/18/2019 Abilene Christian @ Nevada-Las Vegas L 58 72 22 59 7
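    Once the script finishes, the csv can be sliced any way you want in pandas. A sketch, using a tiny hand-typed frame in place of the real CBBD2020.csv (same column names as the sample above):

```python
import pandas as pd

# Stand-in for pd.read_csv('CBBD2020.csv'), trimmed to a few columns
df = pd.DataFrame({
    'Date': ['11/5/2019', '11/10/2019', '11/16/2019'],
    'School': ['Abilene Christian'] * 3,
    'Opp': ['Arlington Baptist', 'Drexel', 'Pepperdine'],
    'Points': [90, 83, 69],
    'Points Against': [39, 86, 73],
})
df['Margin'] = df['Points'] - df['Points Against']
drexel_game = df[df['Opp'] == 'Drexel']
print(drexel_game[['Date', 'Opp', 'Margin']])
```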

  5. #5
    jacksonstreet
    jacksonstreet's Avatar Become A Pro!
    Join Date: 10-19-20
    Posts: 182
    Betpoints: 114

    If you find a way to include the closing line and % bet on each side, I'll show you a way to predict point spread winners at a 70%+ clip.

  6. #6
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by jacksonstreet View Post
    If you find a way to include the closing line and % bet on each side, I'll show you a way to predict point spread winners at a 70%+ clip.
    Hard to get that data. Also hard to know if it's right. In the past I compared SBR to Vegas Insider (I think?), and the numbers were completely different.

  7. #7
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    For those who sent me PMs, my box was full. I have space now

  8. #8
    Fullkelly
    Fullkelly's Avatar Become A Pro!
    Join Date: 02-20-17
    Posts: 52
    Betpoints: 2017

    You're in high demand, the PM box is full again.

  9. #9
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by Fullkelly View Post
    You're in high demand, the PM box is full again.

    Should be good now.

  10. #10
    jacksonstreet
    jacksonstreet's Avatar Become A Pro!
    Join Date: 10-19-20
    Posts: 182
    Betpoints: 114

    Quote Originally Posted by Waterstpub87 View Post
    Hard to get that data. Also hard to know if it's right. In the past I compared SBR to Vegas Insider (I think?), and the numbers were completely different.
    Yeah - I think that's intentional. Closing lines are pretty easy to get, but if we had access to accurate %'s bet on each side, we'd be able to build a model that would win at such a high rate that books would be out of business eventually.

  11. #11
    jonal
    jonal's Avatar Become A Pro!
    Join Date: 06-01-09
    Posts: 772
    Betpoints: 2195

    Hello,

    I get the following error when I tried to run the script. Any suggestions?

    Thanks in advance,

    ---------------------------------------------------------------------------------------------------

    runfile('C:/Users/Jonathan/.spyder-py3/temp.py', wdir='C:/Users/Jonathan/.spyder-py3')
      File "C:\Users\Jonathan\.spyder-py3\temp.py", line 27
        scl = df['Overall']
        ^
    IndentationError: unexpected indent

  12. #12
    KVB
    It's not what they bring...
    KVB's Avatar SBR PRO
    Join Date: 05-29-14
    Posts: 74,849
    Betpoints: 7576

    Nice work Water St.


  13. #13
    KVB
    It's not what they bring...
    KVB's Avatar SBR PRO
    Join Date: 05-29-14
    Posts: 74,849
    Betpoints: 7576

    Quote Originally Posted by jacksonstreet View Post
    ...but if we had access to accurate %'s bet on each side, we'd be able to build a model that would win at such a high rate that books would be out of business eventually.
    This is absolutely not true. The books can control that action, it's the purpose of the line.

    They wouldn't dig their own graves. In fact, what they would do, and they do do, is exploit your thinking and take full advantage of running you, and bettors like you, in circles.

    We witness it every single day and I have helped write the programs to exploit the bettors reliably.



    One of the issues with the integrity of the information is that what many sources put out as betting percentages are simply a survey of their website members or traffic, and nothing more.

    Information that some of us get access to, from books, including back end from Vegas, can be misleading because the individual books are under no obligation to report the truth here.

    It's another thing we see every single day.

    While I often say we all have access to the same information and it's just how you use it, when it comes to some information it just isn't true. Some of us have access to info others don't, and we respect that privilege.


  14. #14
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by jonal View Post
    Hello,

    I get the following error when I tried to run the script. Any suggestions?

    Thanks in advance,

    ---------------------------------------------------------------------------------------------------

    runfile('C:/Users/Jonathan/.spyder-py3/temp.py', wdir='C:/Users/Jonathan/.spyder-py3')
      File "C:\Users\Jonathan\.spyder-py3\temp.py", line 27
        scl = df['Overall']
        ^
    IndentationError: unexpected indent
    Some of the indenting gets messed up in the paste. Hit tab once on line 27, see if that fixes it.

  15. #15
    jonal
    jonal's Avatar Become A Pro!
    Join Date: 06-01-09
    Posts: 772
    Betpoints: 2195

    Quote Originally Posted by Waterstpub87 View Post
    Some of the indenting gets messed up in the paste. Hit tab once on line 27, see if that fixes it.
    It fixed the error, but now when I run the script nothing appears in the console. Is there supposed to be a message returned when the script successfully runs?

  16. #16
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    No. Nothing should appear in the console. You should have 2 csv files wherever you have the script saved, if it ran successfully. You might get warnings depending on your python version.

  17. #17
    jonal
    jonal's Avatar Become A Pro!
    Join Date: 06-01-09
    Posts: 772
    Betpoints: 2195

    Quote Originally Posted by Waterstpub87 View Post
    No. Nothing should appear in the console. You should have 2 csv files wherever you have the script saved, if it ran successfully. You might get warnings depending on your python version.

    so the last line of your code is this:

    results.to_csv('CBBD'+year+'.csv')

    Am I supposed to add a file location? Sorry for all the questions.

  18. #18
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    No, it should be saved wherever the script is saved.

    If you want to add a directory to put it in a specific place, do this:

    results.to_csv('C:\\documents\\' plus the rest.

    Whatever folder you want. Put it in single quotes and double every backslash.

    So C:\documents
    becomes 'C:\\documents\\'
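    If counting backslashes gets annoying, pathlib builds the same path with forward slashes (the folder name here is made up):

```python
from pathlib import Path

out_dir = Path('C:/documents')                 # forward slashes work on Windows too
out_file = out_dir / ('CBBD' + '2020' + '.csv')
print(out_file.name)  # CBBD2020.csv
# then: results.to_csv(out_file)
```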

  19. #19
    mikmak
    mikmak's Avatar Become A Pro!
    Join Date: 05-03-13
    Posts: 29
    Betpoints: 239

    First of all, thank you so much for sharing this. I'm an IT professional but not a programmer. I've decided to teach myself Python for scraping and try to build a database so I can back test my models. Right now, I'm using Excel and it does an ok job using power query for scraping but it runs out of memory and is slow as dirt. I also want to automate so much of what I am currently doing and Excel just isn't going to cut the mustard.

    I experienced the indent error discussed above and had to play around with the tabs in the
    [list(df)] section. I'm past that issue, but now I'm getting the following error:

    Code:
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "C:\Users\mikma\.spyder-py3\temp.py", line 27, in <module>
        scl['School'] = scl['School'].str.replace('NCAA','')
      File "C:\Users\mikma\anaconda3\lib\site-packages\pandas\core\frame.py", line 2902, in __getitem__
        indexer = self.columns.get_loc(key)
      File "C:\Users\mikma\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
        raise KeyError(key) from err
    
    KeyError: 'School'

    Any ideas? And thank you again so much for trying to help others on this board. We need more people like you on the interwebz!
    Last edited by mikmak; 12-08-20 at 03:15 PM.

  20. #20
    Roscoe_Word
    Roscoe_Word's Avatar Become A Pro!
    Join Date: 02-28-12
    Posts: 4,000
    Betpoints: 8667

    One year went an entire NBA season and logged a 55% ATS mark.

    That was with a notebook, pen and calculator.

    Then got a computer and learned some code to automate things.

    Have never repeated that mark since.

    Waterspub....thanks for some past help you've given.......

    Ahh...sorry, posted in wrong thread.

    Meant to post in "How many people use coding" thread......

  21. #21
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by Roscoe_Word View Post
    One year went an entire NBA season and logged a 55% ATS mark.

    That was with a notebook, pen and calculator.

    Then got a computer and learned some code to automate things.

    Have never repeated that mark since.

    Waterspub....thanks for some past help you've given.......

    Ahh...sorry, posted in wrong thread.

    Meant to post in "How many people use coding" thread......
    Appreciate the kind words. Always good to help people. Plenty of people have helped me here with stuff.

  22. #22
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by mikmak View Post
    First of all, thank you so much for sharing this. I'm an IT professional but not a programmer. I've decided to teach myself Python for scraping and try to build a database so I can back test my models. Right now, I'm using Excel and it does an ok job using power query for scraping but it runs out of memory and is slow as dirt. I also want to automate so much of what I am currently doing and Excel just isn't going to cut the mustard.

    I experienced the indent error discussed above and had to play around with the tabs in the
    [list(df)] section. I'm past that issue, but now I'm getting the following error:

    Code:
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "C:\Users\mikma\.spyder-py3\temp.py", line 27, in <module>
        scl['School'] = scl['School'].str.replace('NCAA','')
      File "C:\Users\mikma\anaconda3\lib\site-packages\pandas\core\frame.py", line 2902, in __getitem__
        indexer = self.columns.get_loc(key)
      File "C:\Users\mikma\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
        raise KeyError(key) from err
    
    KeyError: 'School'

    Any ideas? And thank you again so much for trying to help others on this board. We need more people like you on the interwebz!
    Not sure, haven't been able to replicate this.

    What year are you running for? Did you change it?

    if not, can you do the following:

    In the console on the right (where it displayed your error)

    can you type:

    schools next to the In [2]: prompt and hit enter, and tell me if you get a table with columns and rows

    If not, that's an issue

    if you do,

    can you type list(schools) in the same place and hit enter, and tell me if you see the word 'School'
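    For anyone following along: the reason list(schools) matters is that the school-stats page has two header rows, so pandas can build two-level (tuple) column names, and df['Overall'] is what peels off the top level. A small self-contained illustration of what you should see:

```python
import pandas as pd

# Two-level columns, like ('Overall', 'School') in the scraped table
cols = pd.MultiIndex.from_tuples(
    [('Overall', 'School'), ('Overall', 'G'), ('Totals', 'STL')])
df = pd.DataFrame([['Akron', 31, 158]], columns=cols)

print(list(df))          # [('Overall', 'School'), ('Overall', 'G'), ('Totals', 'STL')]
overall = df['Overall']  # selecting the top level leaves plain names
print(list(overall))     # ['School', 'G']
```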

  23. #23
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    I got the same error.
    This is what was returned when I keyed in the list(schools) command:

    [ Unnamed: 0_level_0 Unnamed: 1_level_0 Overall ... Totals
    Rk School G W ... STL BLK TOV PF
    0 1 Abilene Christian 31 20 ... 293 81 436 661
    1 2 Air Force 32 12 ... 161 43 395 534
    2 3 Akron 31 24 ... 158 91 397 548
    3 4 Alabama A&M 30 8 ... 174 63 391 538
    4 5 Alabama-Birmingham 32 19 ... 191 93 471 527
    .. ... ... ... .. ... ... ... ... ...
    382 349 Wright State 32 25 ... 209 113 396 516
    383 350 Wyoming 33 9 ... 175 66 418 626
    384 351 Xavier 32 19 ... 201 114 446 535
    385 352 Yale 30 23 ... 188 101 389 449
    386 353 Youngstown State 33 18 ... 188 85 397 579

    [387 rows x 38 columns]]
    Last edited by gauchojake; 12-22-20 at 05:42 PM.

  24. #24
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    BTW thanks for posting this. I have been messing around with a few different iterations of Python and this one is the easiest to work with so far. I am not a programmer by any stretch and I am learning from scratch on the internet how to do this.

  25. #25
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    I'm updating my installation. I can't replicate the error you are getting. It reads the table correctly, but it isn't reading it like a table. You can literally see where the word school is.

    Do me a favor? Change
    scl = df['Overall']

    to
    scl = df['Overall'].copy()

  26. #26
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    I still get the same error

      File "C:\Users\jake\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
        raise KeyError(key) from err
    KeyError: 'School'

  27. #27
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    You are running 2020?

  28. #28
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    I looked and I saved the script with 2019 as the year. Edited to 2020 and this was the error returned

    Traceback (most recent call last):
      File "C:\Users\jake\.spyder-py3\Basketball Project.py", line 21, in <module>
        scl['School'] = scl['School'].str.replace('NCAA','')
      File "C:\Users\jake\anaconda3\lib\site-packages\pandas\core\frame.py", line 2902, in __getitem__
        indexer = self.columns.get_loc(key)
      File "C:\Users\jake\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
        raise KeyError(key) from err
    KeyError: 'School'

  29. #29
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    I have to update my version. If you have a fresh install, it might be causing issues. They update how things work behind the scenes, and sometimes it causes things to change. Will do later. My python was like 4 or so versions back.

  30. #30
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    Yeah I just installed it. Cool thanks for the help.

  31. #31
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Ok, now I get the same error:

    fix is
    Code:
            schools = pd.read_html(schoolsurl, header=[1])

            df = schools[0]

            #df = df[list(df)]

            scl = pd.DataFrame(df['School'].copy())
    in lines 17-25

    you also need to make an edit around line 89
    currently looks like
    Code:
             results = results[results.Date != 'School']
            results = results[results.Date != 'Date']
            results= results.fillna(0)
            counter = 6
            cols = list(results)
    You need to add a line after to make it look like:
    Code:
            results = results[results.Date != 'School']
            results = results[results.Date != 'Date']
            results = results.fillna(0)
            counter = 6
            cols = list(results)
            results = results[results['FG'] != 'School']
    the first lines should be in line with the others, the indentation gets messed up when I post it.
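    The header=[1] argument tells read_html to take the second header row as the column names, so 'School' comes back as a plain column instead of part of a two-level name. The equivalent operation on an already-read two-level frame, as a sketch:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([('Overall', 'School'), ('Overall', 'G')])
df = pd.DataFrame([['Akron', 31]], columns=cols)

# Keep only the second header row, like pd.read_html(url, header=[1]) would
df.columns = df.columns.droplevel(0)
print(list(df))  # ['School', 'G']
```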

  32. #32
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    I am getting different errors now but it's probably due to my lack of experience coding and not understanding the nuances. I'll play around a little more because I want to see if I can get it.

  33. #33
    Nappyx
    Nappyx's Avatar Become A Pro!
    Join Date: 11-05-17
    Posts: 652
    Betpoints: 616

    @Waterstpub87 why don't you just scrape all the information and post the file here so that the non-coders of the bunch don't have to monkey with your code to pull the results? Would probably save many folks in this thread a lot of time....

  34. #34
    Waterstpub87
    Slan go foill
    Waterstpub87's Avatar Become A Pro!
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by Nappyx View Post
    @Waterstpub87 why don't you just scrape all the information and post the file here so that the non-coders of the bunch don't have to monkey with your code to pull the results? Would probably save many folks in this thread a lot of time....
    Depending on whether Sports Reference feels that is proprietary information, it might get taken down. Also, not sure how to do that.

    Also, what fun is that? Teach a man to fish and all.
    Last edited by Waterstpub87; 12-23-20 at 02:18 PM.

  35. #35
    gauchojake
    Have Some Asthma
    gauchojake's Avatar SBR PRO
    Join Date: 09-17-10
    Posts: 33,724
    Betpoints: 13164

    Success!! I had to remove two lines of code but got the script to run. It's a little messy but I spot checked the data and it looks good. Thank you sir.
