Stephan Baumann presented my WhoAmI application at Semantic and Digital Media Technologies in Koblenz. See the full presentation here:
It is later than planned, but finally: the invitation to my final talk. The slides are in German only, but with a second chance to listen to…
Location: Germany, Berlin, computer science building (Informatikgebäude), Takustr. 9, room K137
Time: Wednesday, 5 March 2008, 3:15 pm
After a long period of silence but hard work, I am happy to announce the final presentation of my Diploma Thesis. Feel free to come and discuss it with me on Friday this week, 3 pm, in room 2.16 at DFKI Kaiserslautern.
— Mediated Identity – Semantic Data Mining through Social Media Web Services —
Due to recent Internet developments, the user is no longer just an information consumer: she interacts with other people through websites, publishes new content and shares her personal media online. This act of self-expression and presentation is part of her virtual identity and reveals much about her personality, interests and activities. What data have I published where? What does it tell about me? Who am I on the Internet? These are just some of the many interesting questions to answer. But as the data is decentralized and stored in different independent systems, there are hardly any interconnections. It is difficult for the user and for other people to get a complete overview of the content, and therefore to observe and further analyze her virtual identity. The thesis discusses an approach to this problem: the data mining web application ‘WhoAmI’ was developed to centralize the user’s media, followed by a data analysis and various visualizations. The presentation gives a deeper view into the thesis task and shows how the web application was implemented with the Ruby on Rails framework using a modern, agile web development approach. An application demo follows afterwards.
Asking the encyclopedia “what category does this tag belong to?” for each tag gives us a list of categories. But to what extent does each category count in the interest profile? The formula below tries to calculate a weight.
c stands for count, g for weight. The formula reads as follows: “The weight of a category is its inverse count of appearance multiplied by the sum, over all tags in all services, of the count of appearances of a tag in a service times the weight of the tag times the weight of the service.”
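Read literally, the sentence above can be written out as a formula. This is only a reconstruction from the verbal description: the index names k (category), t (tag) and s (service), and the restriction of the inner sum to the tags that were mapped to category k, are assumptions.

```latex
g_k = \frac{1}{c_k} \sum_{s \in S} \sum_{t \in T_k} c_{t,s} \, g_t \, g_s
```

Here c_k is the count of appearance of category k, c_{t,s} is the count of appearance of tag t in service s, g_t is the weight of the tag and g_s is the weight of the service.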
The weight of a service reflects whether tags are rarely used in that service (e.g. delicious vs. flickr).
The weight of a tag measures how important a tag is, but this needs to be improved: it is not exactly clear how to detect important tags. Maybe you have some ideas?
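The verbal formula above can be tried out with a few lines of Python. This is a minimal sketch, not the WhoAmI implementation: the function name, the dictionary layout and all numbers are made up for illustration, and it assumes that only tags mapped to a category contribute to that category's sum.

```python
def category_weight(category, category_count, tag_counts, tag_categories,
                    tag_weight, service_weight):
    """Inverse category count times the sum of count * tag weight * service weight."""
    total = 0.0
    for service, counts in tag_counts.items():
        for tag, count in counts.items():
            # Assumption: a tag only counts for the categories it was mapped to.
            if category in tag_categories.get(tag, ()):
                # Missing weights default to 1.0 (also an assumption).
                total += (count
                          * tag_weight.get(tag, 1.0)
                          * service_weight.get(service, 1.0))
    return total / category_count[category]


# Hypothetical example data: count of each tag per service.
tag_counts = {
    "delicious": {"ruby": 4, "rails": 2},
    "flickr": {"berlin": 10, "ruby": 1},
}
# Hypothetical result of the encyclopedia lookup.
tag_categories = {
    "ruby": {"Programming"},
    "rails": {"Programming"},
    "berlin": {"Geography"},
}
tag_weight = {"ruby": 1.0, "rails": 0.5, "berlin": 1.0}
service_weight = {"delicious": 1.0, "flickr": 0.5}
category_count = {"Programming": 2, "Geography": 1}

programming = category_weight("Programming", category_count, tag_counts,
                              tag_categories, tag_weight, service_weight)
geography = category_weight("Geography", category_count, tag_counts,
                            tag_categories, tag_weight, service_weight)
print(programming, geography)
```

How the tag weights themselves should be chosen is exactly the open question; in this sketch they are simply passed in.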
The approach of using only tags to determine interests raises some new questions.
Even tags that yield no usable data from encyclopedias may be useful in other contexts; for example, the geo-information provided via tags can be used to learn where users were when they published the data.
But what kinds of tags are there?
- numbers, e.g. years
- recommendations for other users beginning with “for:” (delicious)
- reminders like toRead, toDo (and similar notations such as 2Read)
- opinions, e.g. fun, funny, humor
- geographic data starting with “geo:” or “ge:”
- mechanical tags with prefixes like “filetype:”
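The kinds of tags listed above could be told apart with a few simple rules. The following is a rough sketch only: the function name, the label strings, the small opinion word list and the rule that any other “prefix:” counts as a mechanical tag are assumptions for illustration.

```python
import re

# Assumption: a tiny hand-made list; real opinion tags are far more varied.
OPINION_WORDS = {"fun", "funny", "humor"}


def tag_nature(tag):
    """Return a rough label for the kind of tag."""
    t = tag.strip().lower()
    if re.fullmatch(r"\d+", t):
        return "number"            # e.g. years
    if t.startswith("for:"):
        return "recommendation"    # delicious "for:" tags
    if t.startswith(("geo:", "ge:")):
        return "geography"
    if re.fullmatch(r"(to|2)(read|do)", t):
        return "reminder"          # toRead, toDo, 2Read
    if t in OPINION_WORDS:
        return "opinion"
    if re.fullmatch(r"[a-z]+:.+", t):
        return "mechanical"        # any other prefix such as "filetype:"
    return "plain"


for example in ["2008", "for:alice", "2Read", "funny", "geo:berlin",
                "filetype:pdf", "ruby"]:
    print(example, "->", tag_nature(example))
```

The order of the checks matters: “for:” and “geo:” are tested before the generic prefix rule, and pure numbers before the “2Read” style of reminder.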
Is this all? Are there more than these? Has anyone brought them into order? Maybe meta-tagging will do, but do you know?
After some hard weeks with only one hand on the keyboard but a mind full of ideas, here are some new thoughts about tags and their order.
Look at the following drawing. Some weeks ago I thought about how to bring some order to the tags. Slowly it is getting clearer: one has to use existing classifications. My first attempt will use Wikipedia to do so. Theoretically it is large enough to give satisfying results; let’s see what practice will bring.
One additional question is whether the result will be affected too much by the encyclopedia.
What do you think about this approach?
[Update] Some explanations of the image above. From left to right, the bubbles stand for the following concepts. Services are the origins of the data (e.g. delicious, flickr, 43things, technorati, upcoming, lastfm). The filter does what its name says: it cleans the results if necessary. The tag database stores all tags found for a user. The thesaurus cleans up again. The encyclopedia (e.g. Wikipedia, OpenCyc, Freebase) gives structure to the whole thing: by querying it, the various keywords are put in order through its categories. At the end, everything is combined into one profile of interests.
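The chain of bubbles can be sketched as a small pipeline. This is a toy version under stated assumptions: the services, thesaurus and encyclopedia are plain dictionaries standing in for the real web services, all names and data are invented, and tags are counted without any weighting.

```python
from collections import Counter


def filter_tags(raw_tags):
    """Filter stage: normalise case and drop empty entries."""
    return [t.strip().lower() for t in raw_tags if t.strip()]


def build_profile(services, thesaurus, encyclopedia):
    """Run tags through filter, tag database, thesaurus and encyclopedia."""
    # Tag database: every tag found for the user, with its count.
    tag_db = Counter()
    for raw_tags in services.values():
        tag_db.update(filter_tags(raw_tags))

    # Thesaurus: merge variants into one canonical tag.
    merged = Counter()
    for tag, count in tag_db.items():
        merged[thesaurus.get(tag, tag)] += count

    # Encyclopedia: map each tag to its categories; unknown tags are skipped.
    profile = Counter()
    for tag, count in merged.items():
        for category in encyclopedia.get(tag, ()):
            profile[category] += count
    return profile


# Hypothetical data standing in for the real services and lookups.
services = {
    "delicious": ["Ruby", "rails", " "],
    "flickr": ["berlin", "ruby", "photos"],
}
thesaurus = {"photos": "photo"}
encyclopedia = {
    "ruby": ["Programming languages"],
    "rails": ["Web frameworks"],
    "berlin": ["Cities in Germany"],
    "photo": ["Photography"],
}

profile = build_profile(services, thesaurus, encyclopedia)
print(profile.most_common())
```

In a real setup each dictionary would be replaced by a request to the corresponding service, which is where most of the work lies.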
Further questions? Sure, there is much more behind it, but this much can be discussed…
I’m wondering why social media apps always specialize in just one medium. Flickr for pictures, YouTube for videos, del.icio.us for bookmarks? Is there no system to get it all in one? Why?
Is it to gain higher usability? To create a hotspot?
This specialization has different reasons. One reason is that systems supporting multiple media need a clear structure with high usability. Otherwise, the user would be overwhelmed by too many functions and get lost, which lowers his interest in the service. This is not easy to achieve.
Another reason for specializing in just one medium is to become the hotspot provider in that field, attracting people with the same interest and building up a huge community. For example, photographers meet at the online photo gallery flickr, and music lovers at the music platform last.fm.
Finally, searching for media in specialized systems can be more precise and specific, which leads to more reliable results; imagine searching for music by artist, or pictures by color.
What do you say?
Anyone who has not read my thesis “Semantische Aggregation personenbezogener Daten” (Semantic Aggregation of Personal Data), published by XML Clearinghouse, can now study a more professional, more precise article, written in English, about “what do we get from virtual identities, if we do so in a much larger approach”. Written by Hugo Liu, Pattie Maes and Glorianna Davenport (MIT), “Unraveling the Taste Fabric of Social Networks” reveals what is possible with myspace.com, ontologies and many other cutting-edge technologies. Even Plato, the old Greek, comes into play.
So prepare yourself and your identity on myspace.com, or just be amazed at what is possible.
After reading many articles about the pros and cons of the world of folksonomy with tags and the world of taxonomies with ontologies and classification, I am not quite sure how they will fit together. The dashed line in the center of the picture marks the frontier. At the moment I do not believe the two can be brought together, but maybe something will push it forward.