Virtualizing Social Networks – Part 1
Hello again. Well, I’d like to introduce one of my assistants, Todd. He stopped by to talk about how the same principles that apply to identity virtualization can also be applied to social network virtualization. I don’t like to let him out much, but I figured that one post here every so often can’t be that bad. I’m curious to hear what you think… In any case, everyone say hi to Todd!
- Charlie
Thanks, Charlie, for the introduction. As I thought about identity virtualization and how CoreBlox has used virtual directory technology for our Identity and Access Management (IAM) projects, it seemed this same logic could be applied to social network virtualization. There are several virtual directories out there, but for the purpose of this post I am going to use RadiantOne from Radiant Logic as the way to describe virtual directory technology.
Identity virtualization means the following to me:
- Layer of abstraction between identity consumers and identity providers
- Allows for mapping and correlation of identity information (e.g. one user identity across many systems)
- Application oriented views of identity information (e.g. what systems can be accessed by the user, and what permissions does the user have in each system)
When virtualization is applied to social networks, this is often referred to as a Social Graph. A good description of social graphs can be found on the ReadWriteWeb (RWW) site:
http://www.readwriteweb.com/archives/social_graph_concepts_and_issues.php
This is related to a post by Brad Fitzpatrick here:
http://bradfitz.com/social-graph-problem/
While these articles are a little on the older side, I think they are still fairly relevant. RWW defines a social graph as:
“Our society spawns one gigantic social graph. In this graph, each one of us is a node. There is an explicit connection, if we know each other. For example, two people can be connected because they work together or because they went to school together or because they are married.”
While there are APIs for constructing social graphs (Google’s: http://code.google.com/apis/socialgraph/), the virtual directory already has all the pieces necessary to identify and understand these relationships. To steal from an image I used in a recent presentation I did at TEC 2009, you can visualize identity virtualization as follows:

So, the virtual directory allows you to apply the following functions to populations of users:
Aggregation: Bring the user populations together so that they appear as the superset of all of the user populations. This creates one view of the users and surfaces this information through one access point (or protocol).
Correlation: Determine the overlap among the various user populations. By matching on common attributes, you can create the unique identity that represents a user across all of the populations.
Integration: Understand the context of identity information. This gives you that “Single version of the truth” that allows virtual directories to link back to the identity providers in the right context and the right process by fully understanding the relationships between the identities.
So, if we applied this same logic to social networks, the virtual directory can bring those relationships between users to light simply by leveraging capabilities that are already embedded in these products. The diagram for this might look something like this (the virtual networks listed there are just to show a sample of what could be included):

In this model, aggregation is accomplished through defined connectors into the various social networks to show the overall population across the networks. Since the enormous user populations would make viewing all identities across these networks impractical, you could instead provide a means of searching for a given user’s information in real-time.
Similarly, correlation is delivered through account mapping, where each user defines their accounts across all of the social networks in which they participate. Allowing users to set up their own accounts ensures that the data is accurate and also allows users to opt in by creating only the accounts that they would like to have included.
Finally, integration represents the relationships that a user has to other users on the subscribed social networks and, if possible, an indicator as to the type of relationship. So, for Facebook this would be my friends, for Twitter my followers and those that I am following, and on Flickr my friends and family.
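To make the three functions a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the model described above, not RadiantOne configuration or any social network's API; the class names (`Account`, `UnifiedProfile`), the sample handles, and the relationship labels are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Account:
    """One account on one social network (hypothetical model)."""
    network: str
    handle: str


@dataclass
class UnifiedProfile:
    """Correlation: the accounts a user has opted to map to one identity."""
    name: str
    accounts: list = field(default_factory=list)
    # Integration: (relationship type, other account) pairs.
    relationships: list = field(default_factory=list)

    def map_account(self, network, handle):
        account = Account(network, handle)
        self.accounts.append(account)
        return account

    def relate(self, kind, network, handle):
        self.relationships.append((kind, Account(network, handle)))


def aggregate(profiles):
    """Aggregation: one searchable view over every mapped account."""
    return {
        account: profile.name
        for profile in profiles
        for account in profile.accounts
    }


# Sample data is invented for illustration.
todd = UnifiedProfile("Todd")
todd.map_account("facebook", "todd.example")
todd.map_account("twitter", "@todd_example")
todd.relate("friend", "facebook", "charlie.example")
todd.relate("follower", "twitter", "@mike_example")

directory = aggregate([todd])
print(directory[Account("twitter", "@todd_example")])
```

Looking up a single account in `directory` mirrors the idea of searching for a given user in real time rather than browsing the whole population.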
Once I have this information and the mappings are defined in the virtual directory, I can easily use it to pull together information about people across those networks. Basically, my account mapping gives me a unified profile in the virtual directory, which can be used to pull in information about me from each of my subscribed networks. I can also derive the links between the defined accounts and others with accounts defined in the virtual directory by crawling the relationships. For example, if I select three people from the virtual directory with Facebook accounts, I can view the friend lists for each of those people. If I see that person A has a relationship with person B and person C has a relationship with person B, I can then start to understand the links between all of the users. Finally, my defined interests on each of the social networks allow me to understand my context relative both to the networks and to other users of the system. So, if on Facebook person A has interests in Apple and Skiing and on LinkedIn person A has interests in Social Media and IAM, I can relate the unified identities to other unified identities based upon their defined interests and their specific social network link. All of this information can be represented as my attributes in the virtual directory.
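The crawl described above can be sketched with plain Python sets. This assumes the friend lists and interests have already been read out of the virtual directory into dictionaries; the data below is invented for illustration (person A's interests follow the example in the text, while the friend lists and person C's interests are hypothetical), and the function names are my own.

```python
from itertools import combinations

# Hypothetical friend lists for three people selected from the directory.
friends = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "E"},
}

# Hypothetical interests per person, grouped by social network.
interests = {
    "A": {"facebook": {"Apple", "Skiing"}, "linkedin": {"Social Media", "IAM"}},
    "C": {"facebook": {"Skiing"}, "linkedin": {"IAM", "LDAP"}},
}


def shared_connections(friend_lists):
    """For each pair of people, collect the connections they have in common."""
    result = {}
    for a, b in combinations(sorted(friend_lists), 2):
        common = (friend_lists[a] & friend_lists[b]) - {a, b}
        if common:
            result[(a, b)] = common
    return result


def shared_interests(interest_map, a, b):
    """Interests two people share, keyed by the network where both list them."""
    shared = {}
    for network in interest_map[a].keys() & interest_map[b].keys():
        overlap = interest_map[a][network] & interest_map[b][network]
        if overlap:
            shared[network] = overlap
    return shared


links = shared_connections(friends)
common = shared_interests(interests, "A", "C")
print(links)
print(common)
```

Here A and C are linked through B, and they also share one interest on each network, which is the kind of context that could be stored as attributes on the unified identities.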
Data in the virtual directory is exposed through various protocols. So, applications could access this information through LDAP, SQL, or by making web service calls to pull the information needed to display the social graph. Leveraging the abstraction of the virtual directory allows the viewer to structure the data relationships by different attributes without modifying the underlying structure of the data itself. Additional social networks can be added in (or removed) on the fly without changing the consumers of this information. Since the virtual directory also provides LDAP’s model of security, its information could be secured at a fine-grained level of access which could help to alleviate privacy concerns with access to this information.
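As a rough illustration of the LDAP route, a consuming application might build a search filter like the one below. This is only a sketch: the attribute names `socialNetwork` and `interest` are hypothetical and would depend entirely on how the virtual directory's schema is defined. The escaping follows the rules for special characters in LDAP search filters (RFC 4515).

```python
def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_filter(network, interest):
    """Build an AND filter; both attribute names are hypothetical."""
    return "(&(socialNetwork={})(interest={}))".format(
        escape_filter_value(network),
        escape_filter_value(interest),
    )


search_filter = build_filter("facebook", "Skiing (alpine)")
print(search_filter)
```

Because the consumer only knows the filter and the attribute names, a social network could be added or removed behind the virtual directory without this code changing.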
Due to the sheer volume of the information available across social networks, it would need to be shown that virtual directory technology can scale to this amount of data and to the complexity of the joins required to create views into the various relationships. Leveraging the caching capabilities of the virtual directory should help, but a proof-of-concept would highlight where there are limitations.
Hopefully this provides a solid overview of the approach. I am definitely interested in hearing feedback and ideas. I will post additional information as we gain more clarity into this approach.
In Part Two I will follow up with some additional details on ways of putting this type of system to use (assuming Charlie will have me back!).
Thanks,
Todd




Mike C April 7, 2009 | 1:01pm
Impressive.
I need to talk to Charlie.