
Wiki washers and grave diggers
by Larry Ketchersid
In using almost any Internet site, a typical user confronts the issues of identity, anonymity (or perceived anonymity) and authority at almost every turn. The recent announcement of wikiscanner [1], and the publicity around it, brings these issues to the forefront and shows both sides of the debate: from one perspective, media manipulators can no longer hide behind anonymity; from the other, privacy and anonymity have taken another blow. Charles Carrington, CEO of Livebolt Identity, describes the perception of online anonymity this way: "The availability of anonymity is assumed by many computer users. In fact, it’s a rarity. It's usually possible to track information back to a source computer, and the user of that computer is implicit (since most 'personal' computers are used by a single person). Our Federal government is diligent in tracking information over the internet, and the source and destination are key attributes. Within companies, it’s even more likely that you have no anonymity. 'Out of sight, out of mind' doesn't apply to computer networks."
Wikiscanner, by combining some computation with knowledge of which corporations or ISPs own which blocks of IP addresses, takes away some of the perceived anonymity associated with editing a Wikipedia article. Kudos to wikiscanner creator Virgil Griffith [2] for his creativity; because of his invention he can no longer remain anonymous, as media rags [3] show how his creation reveals corps like Diebold editing Wiki articles on e-voting machines.
To see how this works, go to www.wikipedia.org [4] and click on "recent changes" in the left-hand column under "interaction". I did this recently and followed a couple of entries to (interestingly enough) the page on Schizophrenia (you can see what I saw by going here [5]). Several entries there are by named users (the user ID appears after the timestamp), but the entries showing only an IP address are from anonymous users (or new users who did not realize that they could create an identity and log in).
IP addresses are reserved in blocks by corporations, ISPs and other organizations, so that their devices (e.g., desktop computers, servers, routers) can have an address to get to the Internet. There are many ways to search for the owner of a block of addresses; one is to go to www.ARIN.net (the American Registry for Internet Numbers), where you can find tools such as "whois" to determine who owns these address blocks. Wikiscanner takes these manual steps (looking up changes on wiki, finding anonymous changes by IP address) and automates them, programmatically searching for the owners of these addresses. According to Mr. Griffith’s website (http://virgil.gr [6]):
In the WikiScanner database, there are 34,417,493 anonymous edits dating from February 7th, 2002 to August 4th, 2007. The WikiScanner database was made by extracting all anonymous edits from the publicly available Wikipedia database dump (which is released about once a month).
There are 2,668,095 different organizations in the ip2location database which I am using to connect IP#'s to organization names.
Within the ip2location database, there are 187,529 different organizations with at least one anonymous Wikipedia edit
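The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Mr. Griffith's actual code: the block-to-owner table, the organization names and the sample edits are all hypothetical (the address ranges are ones reserved for documentation), and a real tool would load its ownership data from a registry such as ARIN or a database like ip2location.

```python
import ipaddress
from collections import Counter

# Hypothetical block-to-owner table. Real data would come from a registry
# (e.g. ARIN whois) or a database such as ip2location. These ranges are
# reserved for documentation and belong to no real organization.
BLOCK_OWNERS = {
    ipaddress.ip_network("192.0.2.0/24"): "Example Corp",
    ipaddress.ip_network("198.51.100.0/24"): "Example ISP",
}


def owner_of(ip_text):
    """Return the owner of the block containing ip_text, or None if unknown.

    Raises ValueError if ip_text is not a valid IP address.
    """
    ip = ipaddress.ip_address(ip_text)
    for block, owner in BLOCK_OWNERS.items():
        if ip in block:
            return owner
    return None


def edits_by_owner(edits):
    """Count anonymous edits per (owner, article).

    edits is a list of (editor, article) pairs. An editor that parses as an
    IP address is treated as anonymous; anything else is a named user.
    """
    counts = Counter()
    for editor, article in edits:
        try:
            owner = owner_of(editor)
        except ValueError:
            continue  # named user, not an IP address
        if owner is not None:
            counts[(owner, article)] += 1
    return counts


# Hypothetical sample of recent changes.
sample_edits = [
    ("192.0.2.17", "Electronic voting"),
    ("SomeNamedUser", "Electronic voting"),
    ("192.0.2.200", "Electronic voting"),
    ("198.51.100.5", "Schizophrenia"),
    ("203.0.113.9", "Schizophrenia"),  # anonymous, but block owner unknown
]

for (owner, article), n in edits_by_owner(sample_edits).items():
    print(owner, "edited", article, n, "time(s)")
```

Note that the result names an organization, not a person: two different employees at the same company are indistinguishable here.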
Identity and the concept of "authoritative" are interesting topics to consider in terms of Wiki, and also in terms of other community sites such as Digg, eBay, Amazon, MySpace, dating sites and others. With the wide open Internet, one characteristic we all love is that we can all voice our opinions, read the opinions of others and either agree or disagree with them. But the questions of "is this person really who they say they are" (identity) and "do they really know anything about what they are talking about" (authority) are difficult to answer on most sites.
Mr. Griffith’s Wikiscanner adds some identity to the anonymous posts. But if an anonymous Wiki editor representing a mischievous corporation hides at his house and uses AOL or some other ISP that has a monster block of addresses, identity is still in question.
Identity is hard to determine at any rate. Most people use email addresses as credentials on community web sites, but email addresses are easy to get, easy to use, and let you call yourself whatever you want. For example, an email address is all you need to "Digg" a story at digg.com, but users who "Bury" stories are kept anonymous. My friend, Dr. Paul Levinson [7], Professor and Chairman of Communications and Media Studies at Fordham University, talked a few days back about the "unofficial bury brigades" at Digg, who use their anonymity and votes to bury certain stories and raise others. "Like the exclusionists on Wikipedia, the unofficial 'bury brigades' on Digg think the system works best with less rather than more information, and therefore see it as a worthy quest to delete (on Wikipedia) or bury (on Digg) unworthy entries," Levinson says. "But does this approach court the greater damage of eliminating worthwhile entries? In the end, unless a story is outrightly false and misleading, what harm is done by its inclusion? This problem is further aggravated on Digg where, unlike on Wikipedia, those who bury stories do so in anonymity."
As many sites do, Digg.com uses an algorithm to determine which stories are placed where. According to the Digg.com site FAQs: "The promotion and burying of stories is managed by an algorithm developed by Digg. There is no hard number of Diggs/buries to promote or remove a story. It's based on a sliding scale that takes several factors into consideration, such as number of Diggs, reports, time of day, topic submitted to, Digging/burying diversity, etc." (Virgil Griffith, is de-anonymizing Digg your next post-grad project? Could have merit!) But without some type of official credential (and, no, I am not pushing for usage of some universal ID), anonymity will still be the rule on the Internet.
Without delving into biometrics or certificates (which most Internet users will not encounter on everyday consumer and social networking sites), one way to rank the validity of an identity is (from least identifiable to most): 1) anonymous users (identifiable only by IP address, as in the wikiscanner example); 2) email addresses (generic, as in gmail, msn, hotmail, other "free" services); 3) email addresses (corporate, which are spoofable but slightly more authoritative); 4) email addresses supported by a web of trust.
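As a rough illustration, the four-level ranking above could be encoded like this. It is a sketch under stated assumptions, not a recommendation: the names IdentityRank, FREE_EMAIL_DOMAINS and rank_identity are hypothetical, the list of "free" domains is deliberately tiny, and treating every other domain as corporate is a simplification.

```python
from enum import IntEnum


class IdentityRank(IntEnum):
    """Hypothetical encoding of the ranking, least identifiable first."""
    ANONYMOUS_IP = 1
    FREE_EMAIL = 2
    CORPORATE_EMAIL = 3
    WEB_OF_TRUST_EMAIL = 4


# Assumption: a short, incomplete list of "free" email providers.
FREE_EMAIL_DOMAINS = {"gmail.com", "msn.com", "hotmail.com"}


def rank_identity(email=None, vouched_for=False):
    """Rank an identity by the credential it presents.

    email is None for an anonymous user known only by IP address.
    vouched_for is True when a site or identity you already trust links
    to this address (a web of trust).
    """
    if email is None:
        return IdentityRank.ANONYMOUS_IP
    if vouched_for:
        return IdentityRank.WEB_OF_TRUST_EMAIL
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in FREE_EMAIL_DOMAINS:
        return IdentityRank.FREE_EMAIL
    # Simplification: any domain not on the free list counts as corporate.
    return IdentityRank.CORPORATE_EMAIL


print(rank_identity().name)
print(rank_identity("someone@gmail.com").name)
print(rank_identity("someone@corp.example").name)
print(rank_identity("someone@gmail.com", vouched_for=True).name)
```

Because the levels are ordered integers, two identities can be compared directly, for example to require at least a corporate address before accepting an edit.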
Web of trust is an important concept that can be applied with just a quick tracing of links and usage. Web of trust means that the identity in question is linked to either a site or another identity that you know and trust, or is linked to multiple sites using a similar identity. As a very esoteric example, I was curious whether the MySpace page of drummer Neil Peart of Rush was a true account or a fan or fake account. The developer of this MySpace page knew this question of authority would come up, so a notice was placed on the page:
The Official
N e i l P e a r t
MySpace Music Page
How do you know?
Rush.com leads to Neilpeart.net
Neilpeart.net leads to here.
Enjoy!
So if you trust that the website www.rush.com is the official website of the band Rush, and it points to www.neilpeart.net as the official website of Neil Peart, then you should be able to trust that this MySpace page is legit. Unfortunately, since the person that writes the page is still anonymous (there are no identity requirements), you cannot be certain that it is Neil Peart writing on the page.
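The chain of links in this example amounts to a path search: start from a site you already trust and see whether following "official" links leads to the page in question. A minimal sketch, assuming a hand-built link table (the page labels and the fan site entry are hypothetical placeholders, not real URLs):

```python
from collections import deque

# Hypothetical link table: each site maps to the pages it points to as
# official. The labels below are placeholders for illustration only.
LINKS = {
    "rush.com": ["neilpeart.net"],
    "neilpeart.net": ["neil-peart-myspace-page"],
    "fansite.example": ["fake-neil-myspace-page"],
}


def trust_path(trusted_root, target, links):
    """Breadth-first search from a trusted site to the target page.

    Returns the chain of links as a list, or None if the target cannot
    be reached from the trusted root.
    """
    queue = deque([[trusted_root]])
    seen = {trusted_root}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(trust_path("rush.com", "neil-peart-myspace-page", LINKS))
print(trust_path("rush.com", "fake-neil-myspace-page", LINKS))
```

A path only shows that the trusted site vouches for the page; as noted above, it says nothing about who is actually typing the posts.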
Even with all of this, how do you determine if a source is authoritative or not? John Scalzi, science fiction author of Old Man’s War and other novels, tried to post on Wikipedia a notice of SciFi author Fred Saberhagen’s death [8]. But the Wiki-"police" debated and decided he was not authoritative. Identity (I would assume) was not the question: with his Whatever and several other blogs, John’s identity is anything but anonymous. So the question of authority remains. Some sites (eBay) keep ratings on identities so that you can determine if they are credible (again, the web of trust concept, where although the eBay user may be anonymous to you, you can look at the eBay history to determine if the user is authoritative and even credible), but many sites do not.
Like identity, authoritativeness is determined on a community by community basis, usually by a point system or voting (as in Digg’s Digg/Bury system, eBay’s feedback or Amazon’s Reviewer rating system). Many communities count what they consider "loyalty votes" (votes that show up again and again from the same user) differently from other votes (Digg states in its FAQ that its algorithm looks for a "diversity" of users Digging a story in order to promote it, implying but not explicitly calling out "loyalty voting"; Amazon’s Reviewer ratings take into account more than ‘was this helpful’ yes/no votes, as you can see in this helpful Amazon "So You’d Like To:" guide. [9])
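To make the idea of discounting loyalty votes concrete, here is one possible weighting. It is purely an assumption for illustration: neither Digg nor Amazon publishes its formula, and the function name and the 1/n discount are hypothetical. Each voter's first vote counts fully, the second counts half, the third a third, and so on.

```python
from collections import Counter


def diversity_weighted_score(voters):
    """Score a list of votes, discounting repeat votes from the same user.

    voters is a list of user names, one entry per vote. The n-th vote
    from the same user is worth 1/n (a hypothetical discount, not any
    site's published rule).
    """
    seen = Counter()
    score = 0.0
    for voter in voters:
        seen[voter] += 1
        score += 1.0 / seen[voter]
    return score


diverse = ["ann", "bob", "cat", "dan"]   # four different users
loyal = ["ann", "ann", "ann", "ann"]     # one user voting four times

print(diversity_weighted_score(diverse))
print(diversity_weighted_score(loyal))
```

Four votes from four different users score higher than four votes from one devoted fan, which is the effect a "diversity" factor is presumably after.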
Those who comment the most, or vote the most, or buy the most with the best rating, even if their identity is anonymous or questionable, will continue to be considered authoritative and/or credible within their own community.
Larry Ketchersid is an entrepreneurial technologist and currently the CEO of Media Sourcery, Inc., a security software company. He is a martial artist, rugby player, writer, and family man. His first novel is Dusk Before the Dawn [10].