Tuesday, March 27, 2012

Auto Create Statistics

Is it advisable to enable the auto create statistics for a database ? Why
may one not do so ? Using SQL 2K
Thanks"FR" <floydrev@.hotmail.com> wrote in message
news:O5ZGTGRSDHA.2152@.TK2MSFTNGP12.phx.gbl...
> Is it advisable to enable the auto create statistics for a database ? Why
> may one not do so ? Using SQL 2K
For OLAP systems, generally the answer is YES, but be aware that auto
create/update stats don't do fullscans on tables over 8Mb which can cause
bad optimizer decisions in larger tables and/or exceptionally skewed data.
For OLTP systems, it depends, partly for the reasons above for OLAP. But the
problems caused on an OLTP system can be far more significant, primarily
resulting in unexpected serialization due to a bad query plan caused by
non-representative statistics (becuase of not running a fullscan).
Auto update stats can also make debugging very difficult because the system
chooses to go ahead in a non-deterministic way (ie, how and when the user
operates the application) and regenerate stats. Personally, I prefer to have
auto create/update switched off on the large OLTP systems I deal with.
Experience has shown me that blindly going ahead and enabling auto update
stats can have some rather detrimental effects which take a long time to
diagnose and resolve because they are so subtle in their effects.
In addition there is a small but potentially significant overhead for the
auto update which may lead to serialization in some situation.
Fundamentally the problem of statistics which have not undergone a fullscan
is that they're not representative of the underlying data.
As an aside, statistics have two other problems.
Firstly, the optimizer assumes that data distributions between statistics
are mutually exclusive, although in reality this is sometimes not the case
at all.
Secondly, histogram statistics are only held for the first column of
composite indices, which can be a real pain. It's another consideration to
bear in mind when you choose columns in an index. For example, if you often
reference a table with certain columns as SArgable values (eg, Where
ActiveStatus=1 And PrintedStatus=0), you should consider whether there is
any benefit having both statuses in the same composite index. Only knowing a
likely distribution of your data will allow you to make an informed
decision. Sadly a lot of IT managers love to say "it could be any number of
rows" which doesn't help! It's one of those times when the DBA should be
speaking directly with the business.
For small databases you probably won't notice any problems when the auto
create/update is switched on.
Kind Regards, Howard

No comments:

Post a Comment