B-tree or cluster b-tree?
Date: 09-02-10
From: "David Rose" <drose108_at_gmail.com>
I read the stream of posts from 11/09 on the subject of v11 index usage and
would like to provide an example and get some feedback on whether to use a
b-tree or cluster b-tree index in this situation:
Records in table: 850,000
Number of distinct values in indexed Field 1: 700
Maximum number of same values in Field 1: 25,000
Minimum number of same values in Field 1: 1
B-tree or cluster b-tree?
David Rose
* Senior Software Engineer
* J.D. Clark & Co., a division of UMB Fund Services
* www.jdclark.com
* Email drose108_at_gmail.com
< From: Peter Jakobsson
< Subject: Re: v11 index usage
< Newsgroups: gmane.comp.lang.inug-4d.tech
< Date: 2009-11-09 11:56:52 GMT (13 weeks, 1 day, 5 hours and 5 minutes ago)
< Hi Thomas
< Thanks for posting that code which looks extremely useful for
< optimising a big database. Thats exactly the missing info when
< deciding whether to switch to cluster or not, so you've probably just
< saved a lot of people a bit of time (assuming your putting it in the
< public domain).
< 4D refer to this ratio as 'index selectivity'. When I went on the Luc
< Michel Six course for v11, he talked about this and I took a lot of
< notes but I don't seem to have any actual numbers there.
< I would add my own thoughts to your contribution as follows (Health
< Warning: - these are just based on my current understanding of how
< cluster indexes work - not authoritative in the least as I have no
< experience yet)
< I think that you might need to add a few more parameters to your
< results to get a full picture:
< - the average cluster index population
< - the minimum index population
< - the number of indexes where the cluster population is below a
< certain minimum (e.g. 10 entries) compared with the number of unique
< values in the table
< This is because you might get a high selectivity result (high number
< of unique values) but still end up with 200 keys per index, simply
< vbecause you've got a lot of data. In this case it may still be useful
< to use cluster.
< On the other hand if you have the same unique-value-ratio (relative
< index selectivity), but not a lot of data, you might only end up with
< 5 keys per index - not much benefit in switching to cluster. (4D will
< not use a cluster below a certain threshold anyway - it has it's own
< optimisation algorithm for this)
< These are just my current thoughts on the problem, but you raise a
< very relevant issue and it would be good to develop some collective
< thinking on this in my opinion.
< Regards
< Peter
**********************************************************************
The New 4D Partner Program - All the tools you need to be a successful 4D Developer.
Sign up Now: http://www.4d.com/community/partner/registration.html
4D Internet Users Group (4D iNUG)
FAQ: http://lists.4d.com/faqnug.html
Archive: http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:4D_Tech-Unsubscribe_at_lists.4D.com
**********************************************************************