Most efficient test programmatically for goodness of fit?

  • MonkeyF0cker

    #1
    Most efficient test programmatically for goodness of fit?
    I'm looking for a recommendation on a normality test for frequency distributions. I am implementing it programmatically. Essentially, the purpose of the test is to remove erroneous data from consideration in my model. I want to test for the normal distribution and remove sample points that do not lie within four standard deviations of the mean. Does anyone recommend an efficient test for this? K-S? Shapiro-Wilk? These seem too cumbersome for what I'm trying to accomplish. There must be an easier test.
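
    The trimming step itself is cheap; something like this (a minimal NumPy sketch, with an illustrative function name, not anything from a library):

        import numpy as np

        def trim_beyond_k_sigma(samples, k=4.0):
            # Keep only points within k standard deviations of the mean;
            # k=4 is the Gaussian cutoff described above.
            samples = np.asarray(samples, dtype=float)
            mean, std = samples.mean(), samples.std()
            return samples[np.abs(samples - mean) <= k * std]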

    BTW, the reason for the test is that some of the distributions are not Gaussian, as the dataset may consist of a mixture of normal distributions. In that case, I'll simply have to utilize Chebyshev's inequality, remove data beyond seven standard deviations, and further dissect the data from there.
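
    For reference, the bound behind that seven-sigma cutoff is easy to sanity-check (a quick Python sketch of the arithmetic):

        # Chebyshev: P(|X - mean| >= k*std) <= 1/k**2 for ANY distribution
        # with finite variance, Gaussian or not.
        for k in (4, 7):
            print(f"k={k}: at most {100.0 / k**2:.2f}% of the mass beyond k std devs")
        # k=4: at most 6.25% of the mass beyond k std devs
        # k=7: at most 2.04% of the mass beyond k std devs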
  • MonkeyF0cker

    #2
    I think I found a good test for this actually. Anderson-Darling.
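
    For anyone searching later: SciPy ships an implementation, so a quick check looks roughly like this (synthetic data as a stand-in; just a sketch, not my actual code):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        data = rng.normal(size=10_000)  # stand-in for a real frequency sample

        # stats.anderson returns the A^2 statistic plus critical values at the
        # 15%, 10%, 5%, 2.5%, and 1% significance levels for dist='norm'.
        result = stats.anderson(data, dist="norm")
        crit_5 = result.critical_values[2]  # 5% level
        print("A^2 =", result.statistic,
              "| reject normality at 5%:", result.statistic > crit_5)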
    • Data

      #3
      Originally posted by MonkeyF0cker
      Does anyone recommend an efficient test for this? K-S? Shapiro-Wilk? These seem too cumbersome for what I'm trying to accomplish. There must be an easier test.
      AFAIK, the K-S is the simplest, to the point that it is considered to be useless. The D'Agostino-Pearson omnibus test is a good compromise between difficulty and quality. Regardless, why reinvent the wheel when you can test your data in (free) R, which already has almost a dozen normality tests available as functions?
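
      (And if you end up outside R anyway, the same omnibus test exists elsewhere; e.g., SciPy's normaltest is D'Agostino-Pearson. A rough sketch:)

          import numpy as np
          from scipy import stats

          rng = np.random.default_rng(1)
          data = rng.normal(size=5_000)

          # stats.normaltest implements the D'Agostino-Pearson omnibus test:
          # it combines skewness and kurtosis into one chi-squared statistic.
          stat, pvalue = stats.normaltest(data)
          print(f"K^2 = {stat:.2f}, p = {pvalue:.3f}")
          if pvalue < 0.05:
              print("reject normality at the 5% level")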
      • MonkeyF0cker

        #4
        I've considered R, but I'm looking to make the algorithm as efficient as possible, since I'll be handling datasets of approximately 3 million records every time it runs.
        • Data

          #5
          Originally posted by MonkeyF0cker
          I've considered R, but I'm looking to make the algorithm as efficient as possible, since I'll be handling datasets of approximately 3 million records every time it runs.
          I advise you to read up on applicability. In most cases, you should decide on a single way to treat the entire series instead of testing each dataset for normality.
          • MonkeyF0cker

            #6
            I've worked out the particulars of the efficiency issue now. I really only need to test for normality on occasion. I've built several tables that contain the frequency distribution bin counts, and I simply add to those counts when new game data is acquired. This should speed up the process considerably.
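
            Roughly, the bookkeeping looks like this (a simplified sketch; the class and names are illustrative, not the actual implementation):

                from collections import Counter

                class BinnedDistribution:
                    def __init__(self, bin_width):
                        self.bin_width = bin_width
                        self.counts = Counter()  # bin index -> frequency

                    def add(self, value):
                        # New game data only increments a counter -- O(1) per record.
                        self.counts[int(value // self.bin_width)] += 1

                    def moments(self):
                        # Approximate mean/std from bin midpoints only when a
                        # normality check is needed, instead of rescanning raw records.
                        n = sum(self.counts.values())
                        mid = lambda b: (b + 0.5) * self.bin_width
                        mean = sum(mid(b) * c for b, c in self.counts.items()) / n
                        var = sum(c * (mid(b) - mean) ** 2
                                  for b, c in self.counts.items()) / n
                        return mean, var ** 0.5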