Updated 1/20/14: I made a couple changes based on feedback from Scott Young, who also suggested readers check out Linked Data Publishing for Libraries, Archives, and Museums: What Is the Next Step? and The like economy: Social buttons and the data-intensive web for further information.
Now that I’ve driven our blog’s SEO through the roof, it’s time to get nerdy. Social media doesn’t have to be all about memes and Katy Perry. Nope, it can be about metadata, too. Isn’t that wonderful?
In the dark olden days, when a piece of your web content was distributed on a social network, it was impossible to control how it would be presented. At its most simplistic, the destination network would take your URL, turn it into a hyperlink, and be done. There was no preview, no indication of what lay behind that HTTP request. If you used a mature social network, like Facebook, the service might scan through the URL, find some sample text, and offer that as a preview. If you were lucky, it’d find an image and offer that to your potential audience, too. Armed with a slight textual or visual preview, presumably users would be more likely to explore your content.
Nowadays social networks have mostly solved this problem. They try to make external content as appealing as possible. They want you to click on shared links, and thus they want shared links to be more than blue, underlined letters. Images and movies are springing up where there used to be only a stock-ticker timeline of text. This post will explore a couple popular methods of enhancing your web content’s appearance on social networks using embedded metadata.
Anyone who has written a little HTML has probably already set their web page up to be appealing when seen through a third-party service. In the administrative <head> of your websites, where various metadata and external resources like stylesheets lie, there’s probably a line like this:
<meta name="description" content="Innumerable cats followed by robots followed by more cats.">
That’s a self-closing <meta> tag, which the W3C says is for “various kinds of metadata that cannot be expressed using the title, base, link, style, and script elements.” In practice, this means that how <meta> tags are used is up to the whims of applications that consume HTML. Web browsers, for instance, do not display the “description” which we’ve added.
So what’s the point of <meta name="description">? Well, search engines use it. Search engine companies like Google have automated programs called bots that continuously crawl the web, downloading the contents of each web page and processing them. Google takes your description seriously and often uses it as the teaser text that appears alongside a search result. So this invisible, pretty much useless HTML element is now defining how our site appears when viewed through a massively popular third party. Now, if you don’t add a description, Google still shows some preview text. But it will guess; it will crawl the text of your page and display what it thinks are the most meaningful snippets. The result is often dismal: text that’s visually hidden using CSS might be chosen, or meaningless strings of navigation links might appear. Even this blog shows the lede from the latest post rather than a general description of what we’re about.
Here’s another made-up example, based on a search result that has no <meta name="description">:
Library ». Special Collections ». Reference ». Mutton Soup: More Adventures of Johnny Mutton by James Proimos — Why do people lie? Do gender and personality …
Pieces of navigation text, characters which are probably visual aids, and titles from a book carousel all appear as disconnected text.
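To make that guessing behavior concrete, here is a minimal Python sketch of a preview picker. It is not how Google actually works; the class and function names are my own, and the fallback rule (grab the first stretch of body text) is a deliberately naive assumption that reproduces the navigation-soup problem.

```python
from html.parser import HTMLParser


class DescriptionFinder(HTMLParser):
    """Collects <meta name="description"> and, as a fallback, body text."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.in_body = False
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "body":
            self.in_body = True

    def handle_data(self, data):
        # Keep every non-empty run of text found inside <body>.
        if self.in_body and data.strip():
            self.body_text.append(data.strip())


def preview_text(html, limit=80):
    """Return the author's description, or a naive guess if there is none."""
    finder = DescriptionFinder()
    finder.feed(html)
    finder.close()
    if finder.description:
        return finder.description
    # No description: fall back to whatever text comes first, navigation and all.
    return " ".join(finder.body_text)[:limit]
```

Feed it a page with a description and you get your own words back; feed it one without and the breadcrumb text leads the preview.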
Without further ado, let’s see how we might further enhance a web page using Twitter’s Cards, a special schema for <meta> tags. Here’s markup for this blog post:
<meta name="twitter:card" content="summary">
<meta name="twitter:creator" content="@phette23">
<meta name="twitter:site" content="@ALA_ACRL">
<meta name="twitter:title" content="Gearing Up Your Sites for Sharing with Twitter & Facebook Meta Tags">
<meta name="twitter:description" content="Add Twitter Cards and Facebook's Open Graph to your website's meta tags for fun and profit.">
<meta name="twitter:image" content="http://acrl.ala.org/techconnect/wp-content/uploads/2014/01/msu-twitter-card.png">
The overall idea should be apparent: the name attribute defines which metadata field you’re using while the content attribute fills in the value for that field. The “twitter:” prefix is a kind of namespacing, which may be familiar if you’ve worked with XML. It basically serves to say “hey, use the Twitter schema to interpret this field’s meaning.” This is useful because, as we’ve already seen, some metadata fields might collide: if Twitter just used “description” as a field name, another different service might do the same, and both applications would get confused when parsing a site catering to both. Namespacing solves this problem.
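A small Python sketch shows why the prefix matters. The function name is hypothetical and real consumers certainly do more than this, but the idea is the same: three fields all called "description" stay separate once you split on the prefix.

```python
from collections import defaultdict


def group_by_namespace(fields):
    """Group (name, content) pairs by the prefix before the first colon."""
    grouped = defaultdict(dict)
    for name, content in fields:
        prefix, sep, local = name.partition(":")
        if sep:
            grouped[prefix][local] = content
        else:
            # No prefix: a plain HTML field such as "description".
            grouped[""][name] = content
    return dict(grouped)


fields = [
    ("description", "Innumerable cats followed by robots followed by more cats."),
    ("twitter:description", "Add Twitter Cards and Facebook's Open Graph to your website's meta tags for fun and profit."),
    ("og:description", "How to use meta tags to control how web content appears when shared on social networks"),
]
grouped = group_by_namespace(fields)
```

Each service can then read only its own bucket, such as grouped["twitter"], without tripping over anyone else's "description".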
Let’s walk through the various fields we’ve used:
twitter:card is the type of card, of which there are eight, each with its own rationale and set of fields. The most pertinent ones center around displaying an image, gallery, app, or video alongside your content but you can read their documentation to discover all the types.
twitter:creator is the Twitter account of the creator of the content. This could be very useful: just because someone shares this post, it doesn’t mean they know my Twitter handle or have to properly attribute me. Now, when someone clicks a link to content with @phette23 as the “twitter:creator,” the expanded view automatically shows my username whether it’s included in the tweet text or not. That saves some precious characters, too!
twitter:site is optional and can be a larger organization. Twitter’s own example makes more sense here: when an NYT journalist writes a story, the story can be associated both with the author’s account as well as the publication’s.
twitter:title is the title of the content.
twitter:description is a brief description.
twitter:image is the image that appears alongside the shared content in an expanded view.
The big payoff is that, when someone selects a tweet in their timeline, they’ll suddenly get a much richer, expanded view with affiliated accounts, context, and an image. Here’s a real, live example courtesy of Scott Young and Montana State University:
Can you spot where the “twitter:site,” “twitter:description,” and “twitter:image” appear? What was just a bland URL has exposed its underlying resource in a much fuller manner.
Finally, you can run your page through a validator to ensure that you marked up the tags properly.
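If you want a quick local check before reaching for the official validator, something like the following Python sketch can flag fields you forgot. It is not Twitter's validator: the list of expected fields is simply the six used in this post, and which ones Twitter actually requires depends on the card type, so treat that list as an assumption.

```python
from html.parser import HTMLParser

# The six fields used in this post's markup. Which of them are truly
# required varies by card type, so this list is an assumption.
EXPECTED_FIELDS = ["card", "creator", "site", "title", "description", "image"]


class TwitterCardParser(HTMLParser):
    """Collects the content of every <meta name="twitter:..."> tag."""

    def __init__(self):
        super().__init__()
        self.card = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if tag == "meta" and name.startswith("twitter:"):
            self.card[name[len("twitter:"):]] = attrs.get("content") or ""


def missing_card_fields(html):
    """Return the expected fields that are absent or empty in the markup."""
    parser = TwitterCardParser()
    parser.feed(html)
    parser.close()
    return [field for field in EXPECTED_FIELDS if not parser.card.get(field)]
```

An empty list means every field from this post's example is present; anything else is a to-do list.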
Let’s move on to Facebook, the web’s aging but still dominant social network. Facebook does a sensational job of picking images out of a shared URL, but problems can still occur. The day before I drafted this post, I happened to spot this issue:
I’m not sure what happened here, but it appears as if Facebook picked a blown up, pixelated version of the Google logo for the link, which has nothing to do with Google. If I’m scanning through my feed, I might expect to see smiling students for this news item, but an indistinct, jumbled mess does little to attract my attention and I move right along.
Let’s fix that. Credit where it’s due, David Walsh’s post on Facebook meta tags is a great starting place and where I learned about them. Here’s an example:
<meta property="og:image" content="http://24.media.tumblr.com/0fc9023daa303558d036ecd63fd2c24e/tumblr_mjedslIPPH1qbyxr0o1_500.gif">
<meta property="og:title" content="Gearing Up Your Sites for Sharing with Twitter & Facebook Meta Tags">
<meta property="og:url" content="http://acrl.ala.org/techconnect/?p=4062">
<meta property="og:site_name" content="ACRL Tech Connect">
<meta property="og:type" content="article">
<meta property="og:description" content="How to use meta tags to control how web content appears when shared on social networks">
Most of this is pretty self-explanatory. Facebook calls this the “Open Graph Protocol,” thus all the “og:” prefixes which are namespacing these fields just like “twitter:” did above. The markup is straightforward: the property attribute defines what metadata field you’re talking about while the content attribute fills in the value for that field. Here’s a quick listing of the Open Graph fields and what you need to know:
og:image is an associated image. This is perhaps the most important field depending on your goals, since it gives you a chance to put your most eye-catching image alongside your content.
og:title is the title of the work at hand; notice how that means this particular blog post, not something larger (like the blog itself or ACRL)
og:description is a one or two sentence description, similar to a typical <meta name="description">.
og:url is the canonical URL of the item you’re sharing. This may not make much sense, but much web content is accessible at multiple URLs these days. Consider this post: if you’re reading shortly after publication, it might be on the Tech Connect home page. A few weeks later, it might be at http://acrl.ala.org/techconnect/?paged=2. But it will always be at http://acrl.ala.org/techconnect/?p=4062. That’s the canonical URL and the one we want associated with it on Facebook.
og:site_name is the larger website upon which a piece of content lives, so that’s the Tech Connect blog in our example. How far to go with this is subjective: is ACRL actually the “site” here? It’s mostly up to how you want content to appear on Facebook, not according to some indisputable web ontology.
og:type lets you categorize the type of content. The Open Graph has a list of types, which is limited to music, video, article, book, profile, and website at the time of this writing, but it also contains logic for defining your own types. The Open Graph standard notes that “[w]hen the community agrees on the schema for a type, it is added to the list of global types.” David Walsh’s example uses a type of “blog” which, as far as I can tell, is not one of the standard types.
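If you are generating these tags from a script rather than typing them by hand, a sketch like this Python one may help. The function name and the dictionary layout are my own invention; the one real point is that values get HTML-escaped, so the ampersand in this post's title cannot break the attribute, and that og:url carries the canonical ?p= address rather than whatever page the post happens to be on.

```python
from html import escape


def open_graph_tags(fields):
    """Render a dict of Open Graph fields as <meta property="og:..."> lines."""
    lines = []
    for key, value in fields.items():
        lines.append(
            '<meta property="og:{}" content="{}">'.format(
                escape(key, quote=True), escape(value, quote=True)
            )
        )
    return "\n".join(lines)


post = {
    "title": "Gearing Up Your Sites for Sharing with Twitter & Facebook Meta Tags",
    "type": "article",
    "url": "http://acrl.ala.org/techconnect/?p=4062",
    "site_name": "ACRL Tech Connect",
}
markup = open_graph_tags(post)
```

Printing markup gives one tag per line, ready to paste into the <head>.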
There’s a lot more to the Open Graph Protocol on their official website. We’ve covered it before, too: “Real World Semantic Web? Facebook’s Open Graph Protocol” goes beyond <meta> tags and talks more about the concept of linked data. Facebook’s own Open Graph documentation goes even further, which might be particularly useful if you want to associate your web content with a Facebook app or page. In terms of additional fields, the og:video and og:audio properties associate further media with your web content. For the most part though, simply using og:image and a few other metadata items gives you far greater control over your content’s appearance without too much added complexity. Furthermore, Facebook prefers content with Open Graph metadata; since Facebook has to guess how to format sites that lack OG information, those get downranked in the newsfeed.
While the Open Graph is publicized as a generic way for an object to be represented in any social graph, in practice it’s just for Facebook. However, since the protocol is public and widely used, I wouldn’t be surprised to see more startup social networks piggyback off of Open Graph rather than roll their own meta tags like Twitter did. In fact, I found a StackOverflow answer which suggests that Pinterest and Google+ do exactly this, opportunistically using particular pieces of Open Graph metadata (specifically og:image). So you may get pretty good bang for your buck with the Open Graph as opposed to Twitter’s more idiosyncratic “cards” which seem to be far too focused on Twitter and particular types of content to be generally useful for other applications.
If you read the StackOverflow link above, you’ll note that Google+ actually prefers Schema.org metadata over og:image, but will fall back to og:image if that’s not available. Google’s “Rich Snippets” technology is very similar to these social networks. Essentially, you tag certain pieces of your page with metadata so that Google can optimize how you’re displayed in search results, precisely paralleling the way social services optimize your shared content’s presentation. Schema.org, microformats, and linked data at large are a huge topic worth several posts of their own, so I won’t go into them too much.
As I note below in the Value Proposal section, it’s worth considering where your audience is and which approach will yield the most return on investment. A simple litmus test: where are your website’s referrals coming from? Organic Google search? Then Schema.org and rich snippets make sense. Facebook or Pinterest? Hello, Open Graph. Twitter? Time to card it up. In particular, you may detect patterns wherein certain types of content warrant customized approaches. It’s sensible that rich media content like images and videos would be shared heavily on social networks, but perhaps not inspire too much attention from Google’s text-based search engine. On the other hand, textual content may be precisely the opposite: there’s no flashy way to preview it on Twitter or Facebook, so don’t bother with enhancing that sharing vector, but consider how something like Schema.org could increase its exposure to search engines and other linked data applications.
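The litmus test is easy to rough out in Python if you can export referrer URLs from your analytics. Everything here is a sketch: the function names are mine, and the domain table is a hypothetical starting point (I'm assuming t.co links count as Twitter traffic) that you would extend to match your own logs.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from referrer domains to the metadata approach this
# post pairs them with; extend it to match your own analytics.
APPROACH_BY_DOMAIN = {
    "google.com": "Schema.org / rich snippets",
    "facebook.com": "Open Graph",
    "pinterest.com": "Open Graph",
    "twitter.com": "Twitter Cards",
    "t.co": "Twitter Cards",
}


def approach_for(referrer):
    """Map one referrer URL to a metadata approach, or "other"."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, approach in APPROACH_BY_DOMAIN.items():
        # Match the bare domain and any subdomain such as www.
        if host == domain or host.endswith("." + domain):
            return approach
    return "other"


def rank_approaches(referrers):
    """Count referrals per approach, most common first."""
    return Counter(approach_for(r) for r in referrers).most_common()
```

Whatever lands at the top of rank_approaches is probably where your first hour of markup work should go.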
One thing I feel I should note: <meta> tags, because they’re invisible and in the <head>, are pretty easy to neglect. Most content management systems don’t make it easy to alter <meta> tags, presumably because they think they know better than you or that it’s too niche for the majority of content authors. Remember when I mentioned that this blog doesn’t have a <meta name="description">? That’s because WordPress doesn’t make it easy to edit <meta> tags, particularly on a page-by-page basis. Neither does Drupal. Instead, these frameworks tend to configure a few helpful <meta> tags for you but don’t infer a description and certainly won’t fill in Open Graph details for you. Luckily, the advantage of these CMSs is their extensibility, and there are Open Graph and Twitter Cards extensions for WordPress. Hat-tip to Michael Schofield, who informed me that the WP SEO plugin does both. Drupal has a Metatag module which makes editing said tags easier, but doesn’t have anything specifically catering to Twitter or Facebook that I’ve found. One could edit Drupal’s node templates, however, inserting an og:title field on every page with PHP code like
<meta property="og:title" content="<?php print htmlspecialchars($node->title, ENT_QUOTES, 'UTF-8'); ?>">
Secondly, <meta> tags are a rather flawed solution because each web page has only one <head> and thus one set of <meta>data associated with it. Consider this blog again: when you visit the home page, the last few posts are presented. Each post has its own URL, images, topic, even the authors are distinct. Yet we can’t put a slew of <meta> tags embedded in the body of each post; we only have one <head> to work with, so at best we could place a set of generic information from the blog at large. This is maybe a bit of a false problem; users sharing the blog home page probably don’t want social sites to tease content from any given post. But it seems problematic as the web becomes more and more modular. There are so many interfaces that present not one self-contained piece of content but collections; Meghan’s recent post on a tiled, Pinterest-like digital library display comes to mind. The blunt simplicity of <meta> tags is showing here. This is where more robust linked data technologies come in, since they don’t necessarily rely on a single HTML element but can use attributes of tags (e.g. Schema.org uses the presence of an itemscope attribute, on any tag, to determine where an object begins and ends in markup) instead.
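To see why attribute-based markup handles collections better, here is a rough Python sketch that pulls two separate objects out of one page body. The parser class is my own toy, not a real microdata library, and the sample page (post titles, author names) is invented for illustration. It also ignores nested items, void elements like <img>, and properties whose values live in attributes such as href.

```python
from html.parser import HTMLParser


class ItemScopeParser(HTMLParser):
    """Collects one dict per element that carries an itemscope attribute."""

    def __init__(self):
        super().__init__()
        self.items = []
        self.depth = 0            # tag depth inside the current item; 0 = outside
        self.current_prop = None  # itemprop whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
        elif "itemscope" in attrs:
            # A new object begins wherever itemscope appears.
            self.items.append({"type": attrs.get("itemtype")})
            self.depth = 1
        if self.depth and "itemprop" in attrs:
            self.current_prop = attrs["itemprop"]

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1  # reaching 0 closes the current object
        self.current_prop = None

    def handle_data(self, data):
        if self.depth and self.current_prop and data.strip():
            self.items[-1][self.current_prop] = data.strip()


# Invented sample: a home page listing two posts.
page = """
<article itemscope itemtype="http://schema.org/BlogPosting">
  <h2 itemprop="headline">First post</h2>
  <span itemprop="author">Author A</span>
</article>
<article itemscope itemtype="http://schema.org/BlogPosting">
  <h2 itemprop="headline">Second post</h2>
  <span itemprop="author">Author B</span>
</article>
"""
parser = ItemScopeParser()
parser.feed(page)
parser.close()
```

After parsing, parser.items holds two dictionaries, one per post, each with its own headline and author. A single <head> full of <meta> tags cannot describe both posts separately.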
For my library, it just so happens there isn’t a lot of value in Twitter Cards. Twitter lets you search for URLs in tweets just like any other text; I put in my library’s fully qualified domain name and three tweets came up, two of which were from yours truly and the third was a link pointing to a syllabus PDF. People just aren’t sharing our sites very much and that’s rather predictable. We’re a small college without a huge social media presence and don’t have a unique digital collection. Picking which image appears when the library home page is shared that one time is an inefficient use of time.
How much you get out of this social metadata depends on how much your library’s web properties are shared on the web. For some, I imagine that’s a great deal and controlling how content appears could be very valuable. Do you share lots of unique digital collections through social media, or link them on pertinent Wikipedia pages? Are you actively engaged in a social media archiving or content creation project, like NCSU’s #HuntLibrary project on Instagram? Then investing time in optimizing how social networks understand your content is logical.