Changes [Sep 25, 2007]
Categories
Category index page: http://wiki.tcl.tk/2187
jdc One of the requests on http://wiki.tcl.tk/17960 is a way to simplify Category tagging. Would it be useful to list the most used Categories at the bottom of the edit page with checkbuttons, and to add the checked categories to the page? Listing all categories (400+) is not possible, but having a number of them handy (20?) would improve the categorisation. Any opinions?
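To make the "most used Categories" idea above concrete, here is a minimal sketch in Python. It is not the WubWiki code: the `page_categories` mapping (page title to list of category names) and the function name `most_used_categories` are assumptions made up for illustration, and the default of 20 just follows the number floated above.

```python
from collections import Counter

def most_used_categories(page_categories, limit=20):
    """Return the `limit` most frequently used category names.

    `page_categories` is a hypothetical mapping of page title -> list of
    category names; the real wiki stores this differently.
    """
    counts = Counter()
    for categories in page_categories.values():
        # Count each category at most once per page.
        counts.update(set(categories))
    # Most used first; ties are broken alphabetically so the order is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]

# Made-up sample data, only to show the call.
pages = {
    "Page A": ["Example", "Chemistry"],
    "Page B": ["Example", "Application"],
    "Page C": ["Example", "Application", "Geography"],
}
print(most_used_categories(pages, limit=2))
```

The resulting list is what would be rendered as checkbuttons on the edit page. Anything outside the top entries would still have to be typed by hand.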
jdc I added the category selection to WubWiki. Look here for an example. This requires a list of the most used categories to work. For now I keep that list in a variable, but it could come from a protected page as well.
Posted at May 11/2007 09:58PM:
stevel: Jos, I prefer the way it is now. Sure, people will add their own categories, but that will just be a hint for the wikignomes, who will change them to a smaller and tighter set of categories, which in turn will drive the ToC.
The problem with general categories, like Examples or Applications, is that with 17,000+ pages and growing, the wiki only displays the first 100 hits on a search category, so the casual browser, who might want to use such categories, isn't going to see the majority of the relevant pages. Plus, having only a few categories leads to a frustrating searching algorithm for a browser - someone interested in browsing the site looking for programs dealing with tcl and land maps might end up finding that science (or data) was the only broad category available... and that's not going to productively lead to relevant issues.
The problem with even more highly specific categories ("geology applications written in tcl/tk script language only" or whatever...) is that the pendulum can swing so far that our categories become the page itself.
(Note that this is a totally separate problem from that of someone not knowing into what category a page should be placed, or not knowing that a category exists, or not knowing how to spell the category.)
I suppose one possible new feature would be a mechanism whereby a seed list of words literally used on the wiki pages, winnowed to remove the prepositions, articles, pronouns, etc. (and of course, other irrelevant words), might then be used to classify a page: software which grew to know (perhaps by some sort of Bayesian application) that if this set of words is used on a page, then it should suggest the following categories as possible candidates. However, I suspect that seed database is too large to reasonably include in the wikit starkit.
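A rough sketch of what such a mechanism could look like, written in Python as a small naive Bayes classifier with add-one smoothing. Nothing here exists in wikit: the `STOP_WORDS` list, the `tokenize` helper and the `CategorySuggester` class are all hypothetical, the stop-word list is far too short for real use, and a page carrying several categories is simply treated as one training example for each of them.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not a real wikit list).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or",
              "it", "is", "for", "with"}

def tokenize(text):
    """Lowercase the text, keep alphabetic words, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]

class CategorySuggester:
    """Hypothetical naive Bayes suggester: page words -> likely categories."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.page_counts = Counter()             # category -> pages seen
        self.vocabulary = set()

    def train(self, text, categories):
        words = tokenize(text)
        self.vocabulary.update(words)
        for category in categories:
            self.page_counts[category] += 1
            self.word_counts[category].update(words)

    def suggest(self, text, limit=3):
        words = tokenize(text)
        total_pages = sum(self.page_counts.values())
        # The extra 1 reserves probability mass for never-seen words.
        smoothing_size = len(self.vocabulary) + 1
        scores = {}
        for category, pages in self.page_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(pages / total_pages)  # log prior
            for word in words:
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log((counts[word] + 1) /
                                  (total_words + smoothing_size))
            scores[category] = score
        ranked = sorted(scores, key=lambda c: (-scores[c], c))
        return ranked[:limit]

# Made-up training pages, only to show the call sequence.
suggester = CategorySuggester()
suggester.train("Plotting land maps and geographic coordinates with a Tk canvas",
                ["Geography"])
suggester.train("Balancing chemical equations and computing molecular weight",
                ["Chemistry"])
suggester.train("Drawing contour maps from elevation data", ["Geography"])
print(suggester.suggest("A small script for rendering maps of coordinates", limit=1))
```

The per-category word counts are the "seed database" in question, so its size grows with vocabulary times number of categories, which is the reason for the worry about fitting it into the starkit.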
One thing of course that the wikit permits is the use of both broad general categories as well as more specialized categories. That's one of the beauties of the software. One can use a category of Example, along with a category of Chemistry. And if someone wants to tie together work being done with genetic engineering, it is relatively painless to create a new category and add it to the relevant pages.
Just a few thoughts on my mind this morning.
Posted at May 14/2007 06:34PM:
colin:
Virden, Larry W. wrote:
We need to find a middle ground with respect to categories used on the wiki.
The problem with general categories, like Examples or Applications, is that with 17,000+ pages and growing, the wiki only displays the first 100 hits on a search category, so the casual browser, who might want to use such categories, isn't going to see the majority of the relevant pages.
That can be parameterised, if it'd be useful.
Secondly, you don't need to search on a Category, you click the category link, then click the category title in the resultant page. That has no limit on number of results, IIRC.
Plus, having only a few categories leads to a frustrating searching algorithm for a browser - someone interested in browsing the site looking for programs dealing with tcl and land maps might end up finding that science (or data) was the only broad category available... and that's not going to productively lead to relevant issues.
What's wrong with letting Categories grow like pages do, and then weeding them out?
The problem with even more highly specific categories ("geology applications written in tcl/tk script language only" or whatever...) is that the pendulum can swing so far that our categories become the page itself.
That's why categorisation isn't such a good way to store data, IMHO.
(Note that this is a totally separate problem from that of someone not knowing into what category a page should be placed, or not knowing that a category exists, or not knowing how to spell the category.)
I think there's a book somewhere about categories in abstract, or in use in anthropology: Women, Fire and Dangerous Things.
I suppose one possible new feature would be a mechanism whereby a seed list of words literally used on the wiki pages, winnowed to remove the prepositions, articles, pronouns, etc. (and of course, other irrelevant words), might then be used to classify a page: software which grew to know (perhaps by some sort of Bayesian application) that if this set of words is used on a page, then it should suggest the following categories as possible candidates. However, I suspect that seed database is too large to reasonably include in the wikit starkit.
Well, why not look at the categories on a given page in the current system, and derive 'possibly related' pages using Bayesian methods? Or is that what you're suggesting?
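As a starting point for deriving 'possibly related' pages from the categories already on a page, here is a sketch in Python. Note that it is not Bayesian: it swaps in plain Jaccard overlap of category sets as a simpler stand-in, and the `page_categories` mapping and both function names are assumptions for illustration only.

```python
def jaccard(a, b):
    """Jaccard similarity of two category collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def related_pages(target, page_categories, limit=5):
    """Rank other pages by category overlap with `target`.

    `page_categories` is a hypothetical mapping of page title -> list of
    category names. Pages sharing no category are left out.
    """
    target_categories = page_categories[target]
    scored = []
    for page, categories in page_categories.items():
        if page == target:
            continue
        score = jaccard(target_categories, categories)
        if score > 0:
            scored.append((score, page))
    # Highest overlap first; ties are broken by page title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored[:limit]]

# Made-up sample data, only to show the call.
pages = {
    "Land map viewer": ["Application", "Geography"],
    "Contour plotter": ["Geography", "Example"],
    "Molecule editor": ["Application", "Chemistry"],
    "Regexp notes": ["Tutorial"],
}
print(related_pages("Land map viewer", pages))
```

Overlap of this kind only works when pages carry more than one or two categories, so it benefits from letting categories grow freely before weeding them out.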
One thing of course that the wikit permits is the use of both broad general categories as well as more specialized categories. That's one of the beauties of the software. One can use a category of Example, along with a category of Chemistry. And if someone wants to tie together work being done with genetic engineering, it is relatively painless to create a new category and add it to the relevant pages.
Just a few thoughts on my mind this morning.
Yeah, the fact that you can add Categories ad lib, and create new ones is a real strength, I think.