June 27, 2005
Sig Rinde talks about tags. And he has created a new gizmo that allows you to play with his ideas:
Even with tags we easily become overwhelmed and would require some data-structure to find our way. Technorati follows 1.3 million tags now!
Every person on this planet has a tag: a name, a social security number, etc. 6.45 billion of them.
This new gizmo uses multiple tag choices to choose and find.
And so what?
Using multiple tags, about 20 tags would cover the 1.3 million single-use tags at Technorati.
Using multiple tags, about 33 tags could give a unique identity to every person in the whole world.
(Quite a few years since I studied statistics, believe I'm in the ballpark, but anybody out there who could corroborate?)
And 20-30 tags are less cumbersome to navigate than 1.3 million, or 6 billion!
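A quick way to corroborate the ballpark, under one loud assumption that is not in Sig's text: each tag is a plain yes/no choice (present or absent), so n tags can tell apart at most 2^n items. The function name `tags_needed` is made up for this sketch.

```python
def tags_needed(item_count: int) -> int:
    """Smallest n with 2**n >= item_count, assuming yes/no (binary) tags."""
    if item_count < 1:
        raise ValueError("item_count must be at least 1")
    # (item_count - 1).bit_length() is an exact integer ceil(log2(item_count)).
    return (item_count - 1).bit_length()

print(tags_needed(1_300_000))      # Technorati's tags -> 21
print(tags_needed(6_450_000_000))  # world population  -> 33
```

Under that assumption the figure of 33 for the world's population holds, and the Technorati figure comes out at 21 rather than 20, so "about 20" is indeed in the ballpark. Real tags are not evenly distributed, so in practice more would be needed.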
Multiple tags can replace any single tag, however unique that is.
You're tagged with your name. That does not say much, does it? Unless I know you of course.
Now try multiple tags. Add 10 tags (red hair, tall, birthplace, etc.) and you may be one of 153,000 with the exact same tag set. Add yet another one that says more about you, say 'Italian speaking' - voilà, only 9,675 individuals have the same tags. Add one more: now 634 identicals. Add two more, and 'highlighting' exactly those 14 tags gives one return: you.
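The narrowing-down described here can be sketched in a few lines. The people and tags below are invented purely for illustration, and `matching` is a hypothetical helper, not anything from the gizmo itself.

```python
# Hypothetical people and tags, made up for this example.
people = {
    "anna": {"red hair", "tall", "born in Oslo", "Italian speaking"},
    "bjorn": {"red hair", "tall", "born in Oslo"},
    "carla": {"red hair", "short", "born in Rome", "Italian speaking"},
    "dmitri": {"tall", "born in Oslo", "Italian speaking"},
}

def matching(items, highlighted):
    """Return the names whose tag set contains every highlighted tag."""
    wanted = set(highlighted)
    return sorted(name for name, tags in items.items() if wanted <= tags)

# Highlight one more tag at a time and watch the result set shrink.
selection = []
for tag in ["tall", "red hair", "Italian speaking"]:
    selection.append(tag)
    print(selection, "->", matching(people, selection))
```

Each extra tag can only keep the result set the same size or shrink it, which is why a modest number of well-chosen tags ends at a single match.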
Ditto for plants, ditto for file structure on your computer, goodbye folders and search. Etc.
Add that a set of tags gives immediate (and complete!) information about the object, far beyond what a two-dimensional system may give (first and middle name plus family name does not give much information).
And that is what knowledge is all about. Expand on that.
Time for a remake of Carl von Linné's work?
The gizmo is built around the stuff he's building over at Thingamy.
We've all heard of prime numbers. Is there such a thing as "prime tags"?
And if so, what are they called, and how many of them are there?
[Disclaimer: Sig and I work together.]
Posted by hugh macleod at June 27, 2005 2:52 PM
Are there 20 or 30 keywords that, in combination, could describe any book ever written? Don't "folksonomies" exist because it's so much easier to freely tag things than to fit them into predefined buckets (whether it's one bucket or multiple buckets)?
Just some thoughts... any ideas how one would go about determining this "master list"?
Interesting question. We've all heard of prime numbers. Is there such a thing as "prime tags"?
And if so, what are they called, and how many of them are there?
Toby, you hit the real issue there!
And Hugh, another bull's-eye, "prime tags" I like a lot! A bit like common denominators...
And after all, why do people sometimes get into heated discussions, fights, even wars - semantics, cultural differences in how we understand words...
Most tags or keywords could probably be represented or replaced by another set of tags - perhaps tags that in themselves could shed more light on the "meaning" of the author's tag, keyword, or word?
Carl Linnaeus found that using family, form and so forth in the name would tell more about the plant than just "dandelion" (the French call it 'pissenlit', which tells a bit more :). And that is what is called 'knowledge'. Relationships between objects.
In that sense folksonomies do not advance knowledge as such, even if they are colourful, interesting etc.
Let's keep this discussion going, this I like a lot! :-D
Mine All Mine!!!!!
It's easy to visualize how you can create a limited set of tags for a given set of things that can be classified into a nice taxonomy (e.g. "People" or "Books"). But can you really come up with a small set of tags that would classify "Everything"?
Now you're talking. This is exactly what I was thinking was needed to classify blogs/webcasts so that like-minded people could find each other. I knew you'd know.
I suspect that prime tags are the tags that have more rather than less commonly shared meaning in defined areas of human activity or knowledge ... like "skates", "sticks", "rules", "teams" etc. in hockey, or "yarn", "stitches", "patterns" for knitting, and so on.
The confluence of structured taxonomies dancing back and forth with miscellaneous author-designated tags that reflect the participation and interaction of people sharing and searching for meanings or pertinent information ... now there's something I will watch develop with great interest.
There once was a paper I read titled "Cooperative Classification And Communication Using Shared Metadata" .. I believe the author was the person (aah .. the wonder of Google - Adam Mathes) who came up with the term folksonomies, or is one of the people in that tribe who follow and lead the development of the understanding of folksonomies.
Hugh, your ability to pick labels for ideas that are both provocative and meaningful is truly impressive.
That is a real skill... and 'Prime Tags' is an excellent example.
Sean McGrath recently posted on "semantic primes" (click my name for the link).
If you Google "fractal thicket" you'll find my approach to breaking down 'composite' semantics into basic concepts like person-place-thing.
A Flickr pic with two people and a pizza could be tagged "person1 person2 thing1" for starters...
That Clay Shirky article reminded me of the 'pathway problem' - where do you put a pathway across a park? If you build the path first, it's like a hierarchical categorisation - in Clay's words the 'Yahoo' approach. If you let the people use the park without pathways, they will create them for you - more the 'Google' approach, or del.icio.us tags where the 'pathways' to information will be formed by people tagging links.
Ric, will follow your suggestion, Junior promises that you (and everybody else) shall be able to add tags to any post and comment (but not remove them, of course) in the 'experiment'... in next version... soon (see geek definition of 'soon') :-)
That'll enable the pathway development nicely, perhaps...
There are an infinite number of prime numbers, of course, which is a reassuring thought when making the analogy between tags for information and these numerical building blocks (that even though information is categorised, it is not limited to finite categories).
The more I think about it, the less I think trying to limit the number of tags is going to work. Whatever set you come up with, there will always be that one situation where yet-another-tag is going to be required. I'm thinking a better approach might be to create classes of tags. (E.g. color: red-green-blue, genre: humor-drama-scifi, etc.) This would increase the ability to identify like things as they would tend to use tags from the same set of classes. For instance, something referencing a person would use tags that describe hair color (blond), ethnicity (Hispanic), and so on.
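One way the tag-class idea might look in practice, as a rough sketch only. The `class:value` tag format, the `TAG_CLASSES` table and both function names are assumptions made up for illustration.

```python
# Hypothetical tag classes; the names and values are illustrative only.
TAG_CLASSES = {
    "color": {"red", "green", "blue"},
    "genre": {"humor", "drama", "scifi"},
    "hair": {"blond", "brown", "red"},
}

def classes_used(tags):
    """Return the class names that an item's 'class:value' tags draw from."""
    used = set()
    for tag in tags:
        cls, _, value = tag.partition(":")
        if value in TAG_CLASSES.get(cls, set()):
            used.add(cls)
    return used

def alike(tags_a, tags_b):
    """Treat two items as 'like things' if they use the same non-empty set of classes."""
    a, b = classes_used(tags_a), classes_used(tags_b)
    return bool(a) and a == b

print(alike({"hair:blond"}, {"hair:red"}))     # both describe hair
print(alike({"hair:blond"}, {"genre:drama"}))  # different classes
```

The classes stay open-ended: a new value can be added to a class at any time, so yet-another-tag does not break the scheme.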
William, nobody's saying the number of tags should be limited to a certain number. Sig's point was how surprisingly few are needed in order to handle large amounts of information.
And the tags will be created from the bottom-up, not the top-down, I'm guessing.
William's idea of creating classes of tags might be called a tagsonomy. :-)
There are techniques for automatically gleaning "concepts" from a textual document that have been in use for quite some time by knowledge management software. The idea is to avoid requiring people to add their own tags. It has been a while since I followed the market, so all of the company names I used to know have disappeared in the Great Popping of the dot com Bubble, but Intellisophic has something like what I'm talking about:
As far as I know, my name is a prime tag, because as far as I can tell, I'm the only person on earth who has my name, or has ever had it. My last name is pretty rare, I know almost everyone who has it in the US, and there are very few in Europe from what I can gather.
So there can be simple prime tags that are unique identifiers.
It seems to me that an interesting solution would be to take the way people are tagging things and algorithmically determine the optimal set of tags.
The only algorithms I've seen run on folksonomies so far have been simple co-occurrence metrics, which tell you "related" tags. I have a couple of ideas of how individual tags can be aggregated, which I'm trying out. I'll keep you posted.
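For readers who have not seen one, a simple co-occurrence metric can be sketched like this: count how often two tags appear on the same item, then rank a tag's neighbours by that count. The sample posts and the function names are hypothetical, and this is not a description of any particular site's implementation.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(tagged_items):
    """Count how often each unordered pair of tags appears on the same item."""
    counts = Counter()
    for tags in tagged_items:
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts

def related(tag, counts, top=3):
    """Return up to `top` tags that most often co-occur with `tag`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == tag:
            scores[b] += n
        elif b == tag:
            scores[a] += n
    return [t for t, _ in scores.most_common(top)]

# Made-up posts, each with its own tag list.
posts = [
    ["hockey", "skates", "sticks"],
    ["hockey", "skates"],
    ["knitting", "yarn"],
    ["hockey", "teams"],
]
counts = cooccurrence(posts)
print(related("hockey", counts))
```

Raw counts favour popular tags, so a real system would probably normalise them, but the raw version is enough to show the idea.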
[NOTE TO SELF:] Stick to cartooning. This is so out of your league...
Learning which properties or tags act as good identifiers is big in machine learning, and has several algorithms. Finding the 'prime tags' for some finite group of items is a matter of finding which tag most effectively splits the overall group into smaller subgroups, recursively, until you are left with unique results. Decision trees are a popular type of machine learning classifier and are an excellent example of this technique.
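A toy version of that recursive splitting, as a sketch rather than a real decision-tree learner. It swaps the usual information-gain criterion for a simpler stand-in (pick the tag whose yes/no split is closest to half and half), and the item names, tags and `build_tree` function are all invented for the example.

```python
def build_tree(items, tags=None):
    """Recursively split items (name -> set of tags) on the most balanced tag."""
    if tags is None:
        tags = set().union(*items.values())
    if len(items) <= 1:
        return sorted(items)  # leaf: a unique result (or nothing)

    def count(tag):
        return sum(tag in item_tags for item_tags in items.values())

    # Only tags that actually separate this group are worth asking about.
    useful = [t for t in sorted(tags) if 0 < count(t) < len(items)]
    if not useful:
        return sorted(items)  # leaf: these items cannot be told apart by tags

    half = len(items) / 2
    best = min(useful, key=lambda t: abs(count(t) - half))
    yes = {n: s for n, s in items.items() if best in s}
    no = {n: s for n, s in items.items() if best not in s}
    rest = tags - {best}
    return {"tag": best, "yes": build_tree(yes, rest), "no": build_tree(no, rest)}

# Hypothetical items for illustration.
items = {
    "anna": {"tall", "red hair"},
    "bjorn": {"tall"},
    "carla": {"red hair"},
    "dmitri": set(),
}
tree = build_tree(items)
print(tree)
```

Here two yes/no questions are enough to single out each of the four items, which is the same 2^n arithmetic as in the post, reached from the bottom up.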
I think that if you consider a formal taxonomy, you'll have your idea of prime tags. In fact, it's one of the ways that Tim Berners-Lee intended digital marking (tags) to be used, I believe, when he wrote his original article on the Semantic Web.
An ontology is a superset of a taxonomy (taxonomy is the hierarchical, object oriented view of knowledge categories - your prime tags, especially if you leave off several layers of leaf nodes). A taxonomy becomes a knowledge representation only when each level of its hierarchy is filled with many enumerations of data. If you just take the taxonomy (and as I suggested, cut off several layers of leaf nodes), then you have basically a nice tree structure of prime tags.
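Cutting off layers of leaf nodes might look like the following sketch. It assumes the taxonomy is stored as nested dictionaries (name -> children, with an empty dict marking a leaf); that representation, the sample taxonomy and the function names are assumptions made for illustration.

```python
def cut_leaf_layer(tree):
    """Return a copy of the taxonomy with every current leaf node removed."""
    return {
        name: cut_leaf_layer(children)
        for name, children in tree.items()
        if children  # keep only nodes that still have children
    }

def prune(tree, layers=1):
    """Cut off the given number of leaf layers, one layer at a time."""
    for _ in range(layers):
        tree = cut_leaf_layer(tree)
    return tree

# A made-up taxonomy fragment.
taxonomy = {
    "hockey": {
        "equipment": {"skates": {}, "sticks": {}},
        "rules": {},
    },
}
print(prune(taxonomy, 1))  # the candidate "prime tags" that remain
```

What survives the pruning is the upper part of the tree, i.e. the broader categories this comment proposes as prime tags.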
It could work, you know . . .