Claiming Your Online Identity

Defining and managing your identity online can be time consuming but considering how many of our social and professional relationships begin with a Google query, it probably makes sense for all of us to invest a little time pulling it all together and presenting the image to the World that we want displayed, rather than whatever just happens to show up on the Internet.

First, let me make a distinction. In SEO circles, there is a practice called ‘reputation management’ that generally involves creating numerous pages of external content that will rank for the given brand’s related searches.  All those external pages are intended to rank below the corporate website but before whatever negative reviews or remarks exist, hopefully pushing any negativity down to the 3rd or 4th page of results, where no one will notice.  In other words, it is a focused spamming effort on behalf of said brand, with the goal of manipulating search results.  Sometimes brands have no choice and have to engage in this level of online warfare. Pragmatism aside though, this is NOT what I am describing here.  Rather I am talking about a proactive and cooperative effort to help Google identify all of your identities and content online, in return for preferred placement of your content when people search for you.

Google implemented an important new feature, along with the release of Google+, that allows them to recognize social channels and owned content and properly attribute this to the brand or individual.  The implementation takes advantage of a new semantic tagging attribute (@rel) in the HTML5 spec that specifies origination and authorship of content.  If you create a Google profile and link your content and social identities using these tags, you can essentially let Google know which content is really you.  In a recent presentation, Matt Cutts (Google) called this new ability “author rank”, and if you’ve been following their recent talk about quality signals, you know that authenticity of content is a big deal in recent algorithmic updates. In fact, they’re not only giving preferred placement to authenticated content for brand and name searches, they’re even displaying the photo of the author in many cases, to help set these aside as authenticated content, from reputable authors.

Okay, sounds good so far right?  Now we just need to work on the confusing mess that is our social network and tie it all together in some meaningful way.  After a bit of homework and practice on my online identity, I distilled down what I feel is a best practice approach.  First, I created a simple website for myself under my namesake URL and created links to all of my social accounts from this location.  NealCabage.com is my literally my homepage now. Next, I linked all my enlogica articles back to my homepage.  And of course enlogica has a couple of its own social accounts to facilitate outreach and easier content socialization. So its important to keep the social accounts owned by the blog separate from those attributed to my own personal identity.  So whereas I link to the blog’s social sites on the sidebar of the blog, I keep all my personal social media accounts separate and only link to them from my own personal homepage.  Once all of that is clear, I create a Google profile page for my own personal identity link all the personal identities together.  Iterative.ly should probably also do this for its own identity, separate from my own.

Sync Up Your Online Profile

Now let’s get a little more specific about how its actually implemented and the use of authorship tagging. On my homepage, each of my social account links contains the new attribute: @rel=”me”. The homepage should have proper authenticity to make such a claim since you already liked to it from your Google profile account.  Next, on the blog that you are reading now (for example), I placed a small author block at the bottom of each article that provides a link to my homepage. This links back to my personal homepage, with the attribute: rel=”author”.  Just like the @rel=”me” helped Google to identify my social profiles, @rel=”author” will help it to recognize any content that I have created and authenticate it.

Since I know this can be a little confusing, here’s the same information in enumerated form:

1. Create Google profile (profiles.google.com)
1a Link to your website and all social accounts
1b Link to any blogs or online magazines that you contribute to
2. On your blog, provide a link on each article page back to your home base site.  Include rel=”author”
3. On your homepage, link to all of your social accounts.  Include rel=”me”

Connecting Accounts to Google Profile

The final step is to validate that everything is setup correctly. Google provides a tool for you to use. Unfortunately at the time of this writing, the tool is giving me an error for using the @rel=”me” attributes on my social account links from my homepage. This appears to be a bug however; even Matt Cutts blog is getting the same error for the same reason.

Anyway, despite that hiccup, if everything was done properly, you will soon start to see that your homepage, social accounts, and your content begin to dominate the search results for terms related to your name or brand.  You may even begin to notice for profile picture next to your content.

There you have it.  My homepage, my social accounts, and my blog contributions are all accounted for and properly attributed and validated.  Now hopefully with a little time, Google begins to treat my identities with a positive bias for searches related to my personal ‘brand’.

The Future Is the Semantic Web

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “Intelligent agent” people have touted for ages will finally materialize.
~Tim Berners-Lee

In the 1999, Tim Berners-Lee described his vision for the future of the Internet.  He described computers being able to parse and understand the same information that we can, and become personalized virtual assistants on our behalves.  Imagine the calendar application on your mobile device observing a schedule conflict when you begin to book a trip on Expedia.  Or what if your computer could do all of the research necessary to justify decision you task it with, and it compiles a detailed research report in support of a decision, ready for you in the morning.  To many, this is the true destination of the Internet, and all of the social sharing that caught everyone’s attention in web 2.0 is a mere preview of what is to come. One blog author I came across, described it by saying “Its a data orgy and your server is invited”.

For roughly a decade now, a movement called the Semantic Web (aka HyperMedia) has been seeking to enable this vision. It is thought that by properly annotating an HTML document, you will enable a computer to consume and understand the information, much like a human would, and thus be able to make informed decisions and act on our behalf.  To accomplish this, there must be structure and relational definitions, and several annotation protocols and descriptive vocabularies followed.  So far, adoption has been anemic however, for numerous reasons: the effort has so far been splintered by various factions creating competing technologies, and limited support by browsers and search engines, limiting incentives for early adopters to explore the technology more deeply.

But recently this has been changing.  As per the introduction of HTML5, there is now support for semantic tagging in all modern browsers, search engines have begun to reward websites with rich snippets in their search results when the semantic data is present, and reports have been written, detailing how BestBuy on others have seen lift in traffic as much as 30%, as a result of their recent RDFa implementations and resulting rich snippets in search results.

Others are also making use of semantic markup recently. Facebook has been using RDF in their social graph implementation since 2009 and recently began using the hCard and hCalendar markups of user profiles and events, respectively. Google has been making a push to ‘authenticate’ authors and authored works with the XFN rel=’me’ annotation, Yahoo Tech and LinkedIn are both using Microformats in their data as well. So now we have support, peer adoption, and recent evidence of positive ROI from the effort and we’re starting to see results; RDFa adoption increased 500% in the last 12 months alone! The only thing missing was significant corporate backing.

In response to this confusion and complexity, a consortium of the major search engines, including Google, Yahoo, Microsoft and Yandex, came together to create a standardized approach, called Schema.org. It assumes the use of Microdata rather than RDF/RDFa and provides a single resource for the major semantic vocabularies to be used.  The hope is that by simplifying the technology, and standardizing it in industry, adoption barriers will be reduced and the average development team will begin to embrace the technology.

So how does one implement semantic annotation?  There two primary annotation frameworks are the Resource Definition Language (RDF/RDFa) and Microformats.  RDF can be used markup with the HTML document with what are called Subject-Predicate-Object triplicates.  It is a robust language that is most commonly used for more robust datasets that require deep data linking.  It has been criticized for its complexity however and so a recent revision called RDFa focused on making it easier to use, with a focus on use of attributes. Here is an example of what it might look like if you were to markup a simple object in RDFa:

    <div xmlns:v=”http://rdf.semantic-vocabulary.org/#” typeof=”v:Person”>
        <p>Name: <span property=”v:name”>Neal Cabage</span></p>
        <p>Title: <span property=”v:title”>Technologist</span></p>
    </div>

Microformats meanwhile, was developed with simplicity in mind and a more natural extension of the HTML document.  MicroData is a subset of Microformats that Schema.org has settled on and even has its own DOM API in the HTML5 spec, so this is the presumed future standard for most websites:

    <div itemscope itemtype=”http://semantic-vocabulary.org/Person”>
         <p>Name: <span itemprop=”name”>Neal Cabage</span></p>
         <p>Title: <span itemprop=”title”>Technologist</span></p>
    </div>

At some level the concepts are quite simple, however there are numerous Ontologies and semantic vocabularies , which have been defined and are referenced in order to give meaning to your RDF or Microformat attributions.  Notice the semantic-vocabulary.org reference above?  It calls out to an externally defined schema for the Person hCard being defined here.  Schema.org has defined many of these directly for MicroData but there are others such as Dublic Core, DocBook, and Good Relations which is specifically very popular for eCommerce. There are plugins available for many of the major CMS and eCommerce platforms to assist with semantic markup as well.

The Semantic Vision
In the near term, the Semantic Web (or “Linked Data”) is merely an exercise of annotating your content more precisely with hope of getting the Google carrot at the end of the proverbial stick (e.g. rich snippets).  As a result, any short-term concrete gain main seem a bit hollow.  The real promise however, is in what is possible when semantic annotations in content reach critical mass.  This chart shows what some have been predicting in terms of the direction of Internet technology and proliferation.  The real intelligence and interoperations still very much lie ahead in what could be described as a Web 4.0 World.  This assumes that HTML5 marks the precipice of Web 3.0 and we all begin working on proper annotations now, to lay the foundation for those achievements in the future.

Web 4.0In the meantime, there are companies already beginning to do cool things, experimenting with the sort of predictive intelligence that one might expect from a “linked data” web 4.0 World.  Hunch.com in particular attempts to help you parse through excessive data on the Internet, by looking at your past interests and those of your friends, to determine what you might like and limit your choices accordingly.  Hunch.com’s CEO talks about how research exists to assert that fewer purchases are made when consumers are presented with too much information or too many choices.  It is thus worthwhile technology for retailer to pursue, in an effort to predictively get fewer precision choices in front of each consumer.

And with all of the data that currently exists in the World, we just continue to collect exponentially more, via social interactions, tags, various use profiles, online transactions and analytics data.  The buzz word in many organizations now is “big data” and they are looking to new tools such as Hadoop to help them address these problems.  In this growing cluster of data, we will eventually outgrow search as our most useful paradigm for how to access all of this data. And what will that look like?

I was asking myself this question the other day when I looked down at my iPhone and realized, the interface may already be here!  What would a computer interface look like that is largely Internet driven, but for which the user experience does not begin with an Internet browser or Google.com?  In fact, if the real vision of a semantic web, is intelligent consumption of data and that intelligence applied for specific applications as virtual agents, wouldn’t that manifest as a lightweight albeit more mature mashup style app, similar to what we already have on mobile phones and tablets today?  Its an interesting thought.  I can imagine a progression in that direction, with these applications continuing to grow in sophistication, intelligence, and awareness.

In closing, here is a video that I found while researching, that provides an excellent introduction to the topic.  If you’re looking to introduce these concepts to your team or stakeholders, this is a great place to start:

Web 3.0 from Kate Ray on Vimeo.

HTML5 Is Kind of a Big Deal

Have you looked at the new features that are part of HTML5?  I must admit that it took me some time to actually look at it because I dismissed it as just some new markup to have to deal with.  In that way, its really not correct for this to be labeled as HTML at all!   In fact, HTML5 is a collection of technologies that have been sorely missing from web browsers, and upgraded markup is a rather minor footnote.

The introduction of these technologies now means that we’re probably about to witness a major new architectural paradigm for web applications, toward that of a fat client, and away from the traditional static page model. Dare I say Web 3.0?  As someone who has always had an affinity for UI centric application, I must admit that I find this pretty exciting!

So let’s get into it. What is so exciting about HTML5? There are 5 major categories that I would us to describe these upgrades:

1. Dynamic Image Rendering

There are two major technologies to discuss here: SVG and Canvas. SVG is ‘retained mode’ and used mostly for data modeling or whereas the canvas ‘abstract mode’ and can be used for real-time painting and animations. In totality, these vector tools are what will allow the technology to replace what has otherwise been accomplished mostly with Flash up til now.

SVG – essentially a markup language for vector images that live in the document. That in itself is pretty cool which you consider the semantic web potential of the image data.   Mozilla has a really great example of SVG with their glow map that shows a glow dot on a map for each download of Firefox. The real time update of the graphic shows what can be done here.

Canvas – more for user interactivity and animations. So whereas SVG might be used in place of traditional Flex modeling, imagine the canvas replacing most uses of Flash.  A friend sent shared an example of an interactive animation in which the user draws a stick figure and then interacts with the stick figure as he begins walking across the screen and doing random activities.  The painting of the screen is all possible via the HTML5 canvas. The animation and interactivity, is the interplay of JavaScript with the canvas.

2. Native Media Support
Embedding an image or audio file is now as easy as embedding an image. This is probably the most famous feature of HTML5 as we all have heard that Steve Jobs pointed out that you no longer need Flash to embed videos, right?   But that is  just a tactical concern.  What really makes this cool are the implications of having natively supported video.  Imagine what you can do when you have a UI that can directly interact with the video, rather than simply play it.  For one thing, you can have multiple iterative hot spots throughout the document that trigger videos based upon user or application events. That’s true Flash or Director multimedia, directly in the DOM and SEO and Mobile friendly.

There are also interesting experimental technologies such as the Popcorn.js library that looks at triggering events within the UI or application via embedded footnotes in the video itself.  There are many people buzzing that Interactive/Internet TV is finally going to take off in 2012 because the technologies are finally going to be present to do so. Imagine the implication for TV content producers if they’re able to conduct the iterative events on an Internet TV video embedded directives in their video content?  They could instigate real time data and social media and coordinate it with their media experience.  That is very compelling.

3. Semantic Tagging
This is compelling from a Semantic Web perspective. So far in the evolution of the Internet, everyone has focused on creating websites and web applications that engage a user.  But what if a computer needed to interact wit your content?  The only significant example of this to date are the search engines and we should all be familiar with SEO and how pages are altered to make a document parseable by search engines.

But imagine taking it a level beyond that.  What if your computer acted as a virtual assistant for you and you were able to book a plan trip on Expedia, when your computer interrupts you to remind you that you have a scheduling conflict already.  These sort of aware systems would only be possible if they were themselves aware of the content we are interacting with.  So the idea with the semantic web, is that if we all properly tag our content with appropriate tags and meta data, we make it possible for such systems to be aware and to consume our content.

HTML5 takes a big step forward on this, both with semantically appropriate tagging, but also with formal adoption of meta tagging standards such as RDFa and the @rel attribute that can be used to map together authors and their contributions online, which I discussed a bit in another post.

4. Local Data Storage

Initially the HTML5 specification called for a location implementation of a SQL database.  Sadly, this was deprecated last year.  Many of the modern browsers have already implemented but it may not be there in future browsers.

What is there however, is an client-side key-value database solution for local storage. Using the localStorage API, you can store up to 5mb of data and it seems to persist indefinitely, or until the user manually purges their database.  So this can still be very useful. This is basically cookies on steroids.

This probably is a better solution than a local SQL database.  Consider the movement of NOSQL database systems toward non-structured document databases rather than tables and schemas; they essentially store JSON objects which are native and ideal for persisting state of a JavaScript application.  Given that an HTML5 JavaScript application would be the consumer of this database, it seems this might actually be the perfect solution for maintaining state, compared to a traditional SQL dB.

As for why this is a big deal, it has the potential to completely change the architectural paradigm of web applications!  Persistent state is one of the big issues that pushed traditional “fat client” application design toward a thin client/fat server model, since web applications relied on the server to remember everything.  If this issue of state is finally resolved, we could see a return to a fat client model, in which we’re doing way more development in JavaScript on the Client side, and much less on the server.  Many, many implications here!

5. Standardized Resources

Other resources have also been standardized here as well:

JS Web workers – this is a subtle yet bit one.  We’ve all probably experienced the occasional web application that seemed to load really slowly and kill usability because the JavaScript had a lot of work to do and ran away with the app.  With web workers, its possible to isolate certain JavaScript threads as background processes, similar to Unix Daemon processes.  That could he very helpful for large those pesky social 2.0 JavaScript includes that delay your document ready event or other data fetching or calculation intensives such as prime numbers, etc.

Cross Domain – AJAX can finally make calls cross domain, rather than being limited to the domain or origin.  This is again huge in terms of being able to build a fat client app, particularly in the days of  mashup APIs.

GEO Location – Also apparently each browser now provides a standardized Geo-location API. The implementation details are up to each browser; Firefox in particular apparently now implements Google’s Geo API.  So now we have standardized location approximation, for all computers not just mobile.  This has recently popped up in Google’s maps in fact.  Clearly this is an attempt to support creation of a single fat client app for mobile and the desktop.

Conclusion

So there you have it, HTML5. Each one of these respective upgrades is a big deal, but in my opinion the really big deal is that I think this will trigger a new architectural paradigm in web applications.

Imagine writing a single application that is equally engaging on your mobile app or desktop.  It retains its own state and doesn’t require page refreshes, so its an optimal experience with even a slow internet connection or lack of connectivity such as being on the rode or in airplane mode.  Imagine the client is where the majority of the application logic lives and only minimal server calls are required any more, and even those being written now in an event-model pattern using JavaScript via Node.js, rather than an entirely different server side technology.   Yep, this indeed has the potential to trigger a really exciting new era of web application development!