1. #1
    sharpcat
    sharpcat's Avatar Become A Pro!
    Join Date: 12-19-09
    Posts: 4,516

    Help with excel data cleaning

    If I have a column in my database with the final score of the game grouped into one cell is there a function to split this into 3 columns easily?

    L 0-12
    L 6-23
    L 28-31
    L 20-23

    I would like to quickly change this to 3 columns without having to go through manually to eliminate both time consumption and human error.

    L 0 6
    L 6 23
    L 28 31
    L 20 23

    Any help would be greatly appreciated.

  2. #2
    TomG
    TomG's Avatar Become A Pro!
    Join Date: 10-29-07
    Posts: 500

    There are several ways to do this. The easiest way is to simply highlight the column and then go to Data -> Text to Columns and use delimiters.
    Points Awarded:

    sharpcat gave TomG 20 SBR Point(s) for this post.


  3. #3
    roasthawg
    roasthawg's Avatar Become A Pro!
    Join Date: 11-09-07
    Posts: 2,990

    Yeah, what Tom said... text to columns then use space and "-" as the delimiters.

  4. #4
    dr_wolf
    dr_wolf's Avatar SBR PRO
    Join Date: 07-20-10
    Posts: 417
    Betpoints: 19055

    or make an macro and you separate what you want (you have to know macro programming) if you have questions give me a PM

  5. #5
    arwar
    arwar's Avatar Become A Pro!
    Join Date: 07-09-09
    Posts: 208
    Betpoints: 1544

    as I create scrapers (and associated data manipulators) on a pretty much full time basis, one trick that is often overlooked is using the 'search and replace' function. i am not an EXCEL fan but if load your data file in Notepad you can convert it to comma delimited (which will be columns) by doing the following:

    1. search and replace all with <, >
    2. search and repalce all <-> with <, >

    so your data:
    L 0-12
    L 6-23
    L 28-31
    L 20-23

    becomes:
    L, 0, 12
    L, 6, 23
    L, 28, 31
    L, 20, 23

    you can also use this method to defeat some of the idiotic features of EXCEL like autocorrect and having scores that
    are formattted like 12-10 from being converted to dates.

    save the resulting Notepad file with an extension of .CSV rather than .TXT and it will be understandable to EXCEL as
    well as any other programs that use comma delimited file formats.

  6. #6
    Wrecktangle
    Wrecktangle's Avatar Become A Pro!
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    For those of you not used to notepad, you can do similar as arwar mentions in MS Word.

    Also, I might mention when you import into MS Excel, you can control each column as to import character type. That way you can get around the auto-importing of data that looks like: 1-15, 3-2, 6-5 as dates.

  7. #7
    That Foreign Guy
    I got sunshine in a bag
    That Foreign Guy's Avatar Become A Pro!
    Join Date: 07-18-10
    Posts: 432
    Betpoints: 3069

    Notepad is way better than word for data, it doesn't try any stupid autoformatting and handles special characters better.

  8. #8
    sharpcat
    sharpcat's Avatar Become A Pro!
    Join Date: 12-19-09
    Posts: 4,516

    Quote Originally Posted by arwar View Post
    as I create scrapers (and associated data manipulators) on a pretty much full time basis, one trick that is often overlooked is using the 'search and replace' function. i am not an EXCEL fan but if load your data file in Notepad you can convert it to comma delimited (which will be columns) by doing the following:

    1. search and replace all <spaces> with <, >
    2. search and repalce all <-> with <, >

    so your data:
    L 0-12
    L 6-23
    L 28-31
    L 20-23

    becomes:
    L, 0, 12
    L, 6, 23
    L, 28, 31
    L, 20, 23

    you can also use this method to defeat some of the idiotic features of EXCEL like autocorrect and having scores that
    are formattted like 12-10 from being converted to dates.

    save the resulting Notepad file with an extension of .CSV rather than .TXT and it will be understandable to EXCEL as
    well as any other programs that use comma delimited file formats.
    Thank you very much.

    I will give this a try Tom's technique worked for me although like you said it changed many of the scores to dates. I found that it worked better for me to use the "fixed width" command in "text to columns" to split the scores and than just came back and selected "edit" "replace" and replaced all of the " - " with " ". This worked better than "delimited" in this scenario because of the auto correct to date feature.

  9. #9
    Insoluble
    Insoluble's Avatar Become A Pro!
    Join Date: 05-29-10
    Posts: 71
    Betpoints: 3463

    You could also use the LEFT, MID, and RIGHT functions in Excel to help parse out the elements you need. This can be a little more difficult , especially with scores where you don't know whether they are single digit or double digit, but I think its a good tool to go over:

    Suppose you had this in cell A1: "L 0-12"
    If you wanted to separate the L, you could use the function "=LEFT(A1, 1)" This takes the 1 leftmost character from A1 and returns it in your cell. In order to get the number on the right side you could use "=RIGHT(A1,2)" and you would separate the 12 into your new cell. And finally to get the 0 separated you could use the mid function which goes a certain amount of characters from the left and returns the amount of character you specify. In this case we would need to go 3 over from the left and we only want to return 1 character so our formula would be "=MID(a1,3,1)" and this would return the 0.

    The only catch with this is that if you are dealing with a ton of scores, you might not always have single digit data and you might need to return a 10 instead of a 0, in which case you would need to ask for 2 digits in your formula. The way I deal with this is I always try to take as many digits as I need and then I create an IF statement to clean up any extra "-" or "," I might accidentally get from getting too many characters returned. I can go into this more if you'd like but I don't want to get too confusing for ya unless you're interested.

    Hope that helps, these functions have made it easy for me to parse tons of data really quickly into fields that I like.

  10. #10
    sharpcat
    sharpcat's Avatar Become A Pro!
    Join Date: 12-19-09
    Posts: 4,516

    Quote Originally Posted by Insoluble View Post
    You could also use the LEFT, MID, and RIGHT functions in Excel to help parse out the elements you need. This can be a little more difficult , especially with scores where you don't know whether they are single digit or double digit, but I think its a good tool to go over:

    Suppose you had this in cell A1: "L 0-12"
    If you wanted to separate the L, you could use the function "=LEFT(A1, 1)" This takes the 1 leftmost character from A1 and returns it in your cell. In order to get the number on the right side you could use "=RIGHT(A1,2)" and you would separate the 12 into your new cell. And finally to get the 0 separated you could use the mid function which goes a certain amount of characters from the left and returns the amount of character you specify. In this case we would need to go 3 over from the left and we only want to return 1 character so our formula would be "=MID(a1,3,1)" and this would return the 0.

    The only catch with this is that if you are dealing with a ton of scores, you might not always have single digit data and you might need to return a 10 instead of a 0, in which case you would need to ask for 2 digits in your formula. The way I deal with this is I always try to take as many digits as I need and then I create an IF statement to clean up any extra "-" or "," I might accidentally get from getting too many characters returned. I can go into this more if you'd like but I don't want to get too confusing for ya unless you're interested.

    Hope that helps, these functions have made it easy for me to parse tons of data really quickly into fields that I like.
    Very good info

    I will play around with this I am sure I can find many applications to use this with.

    I have been teaching myself all of this computer related stuff and have made tons of progress over the last year or two but it is very helpful to have a community to ask questions when you can not make sense of what you are reading in a book. Hopefully soon I can start understanding some programming languages.

  11. #11
    Wrecktangle
    Wrecktangle's Avatar Become A Pro!
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    Quote Originally Posted by That Foreign Guy View Post
    Notepad is way better than word for data, it doesn't try any stupid autoformatting and handles special characters better.
    I agree, but I'm always surprised by how many folks have not used it.

  12. #12
    uva3021
    uva3021's Avatar Become A Pro!
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    say you have "L 0-12" in A1, in B1 put "=LEFT(A1,1)", C1 put "=RIGHT(A1,LEN(A1)-FIND("-",A1))", D1 put "=TRIM(MID(A1,FIND("-",A1)-2,2))"

    copy down to last cell, will save a lot of time and resources

  13. #13
    uva3021
    uva3021's Avatar Become A Pro!
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    or actually reverse C1 and D1, my fault

Top