Visualizing DSpace Data with Google Fusion Tables & Viewshare

During my time as the Digital Resources Librarian at Kenyon College I had the opportunity to work with The Community Within collection, which explores black history in Knox County, Ohio.  At the beginning of the project, our goal for this collection was simple: to make a rich set of digitized materials publicly available through our DSpace repository, the Digital Resource Commons (DRC).  However, once the collection was published in the DRC, a new set of questions emerged. How do we drive people to the collection? Can we create more interesting interfaces or virtual exhibits for the collection? How do we tie it all together? To answer these questions, we started exploring the digital humanities landscape, looking for low-cost tools we could integrate with our existing DSpace collections.  We started to think about the collection and associated metadata as a data set, which contained elements we could use to create a display different from the standard list of items.  We wanted to facilitate the discovery of individual items by displaying them to our users in different visual contexts, such as maps or timelines.

Two tools that emerged from this exploration were Google Fusion Tables, a Google product, and Viewshare, which is provided by the National Digital Information Infrastructure and Preservation Program (NDIIPP) at the Library of Congress.  Google Fusion Tables provides a platform for researchers to upload and share data sets, which can then be displayed in seven different visualization formats (including map, scatter plot, and intensity map).  Various examples of the results can be seen in their gallery, which also illustrates the wide range of organizations using the tool, including academic research institutions, news organizations and government agencies.  Viewshare, according to their website, “is a free platform for generating and customizing views (interactive maps, timelines, facets, tag clouds) that allow users to experience your digital collections.”  While it does many of the same things as Google Fusion Tables in allowing users to create visualizations of data sets, it is more specifically geared towards cultural heritage collections.

Both tools are freely available and allow users to import data from a variety of sources.  Because the tools are easy to use, it is possible to get started quickly in manipulating and sharing your data.  Each tool provides a space for the uploaded data and accompanying views, but also allows you to embed this information in other web locations.  In the case of The Community Within, we created an exhibit which links to materials about churches in the collection using an embedded Google Fusion map display.

This blog entry will walk through how to successfully export and manipulate data from DSpace in order to take advantage of these tools, as well as how to embed the resulting interface components back into DSpace or other collection websites.

The How-To – DSpace and Google Fusion

1.  First, start with a DSpace collection.  Our example collection is a photo collection of art on the campus at Ohio State University.  In the screenshot below, we are already logged in as a collection administrator.

Note: Click the images to see them at full size.

A DSpace Collection

2.  We need to export the metadata.  So, click on “Export Metadata” (under Context).  This will download a .csv file.

Save the csv file.

3.  When you open the .csv, you may notice that metadata added to the collection at different times or in different ways is formatted inconsistently.  We want to fix these inconsistencies before we send this file anywhere.

CSV data, pre-edit

Edited CSV data

4.  Save the file as a .csv file.  If you are given a choice, be sure to select a comma as the separating punctuation.
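If you would rather script the cleanup in steps 3 and 4 than do it by hand, here is a minimal sketch in Python.  It assumes one particular kind of inconsistency: the same field exported under several column headers that differ only by a bracketed language tag, such as dc.title and dc.title[en_US].  The column names, the "||" separator for multiple values, and the helper names merge_language_columns and to_comma_csv are all assumptions made for illustration; your export may need different handling.

```python
import csv
import io
import re

# Matches a trailing bracketed language tag such as "[en_US]" or "[]".
LANG_TAG = re.compile(r"\[[^\]]*\]$")


def merge_language_columns(rows, joiner="||"):
    """Collapse columns like 'dc.title[en_US]' and 'dc.title' into 'dc.title'.

    rows is a list of dicts as produced by csv.DictReader.
    Distinct non-empty values from merged columns are joined with `joiner`.
    """
    merged_rows = []
    for row in rows:
        merged = {}
        for column, value in row.items():
            base = LANG_TAG.sub("", column)
            value = (value or "").strip()
            if base not in merged:
                merged[base] = value
            elif value and value not in merged[base].split(joiner):
                merged[base] = joiner.join(filter(None, [merged[base], value]))
        merged_rows.append(merged)
    return merged_rows


def to_comma_csv(rows):
    """Serialize rows as comma-separated text, whatever the source delimiter was."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), delimiter=",")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Small in-memory example standing in for a DSpace export (made-up data).
sample = io.StringIO(
    "id,dc.title,dc.title[en_US]\n"
    "1,Sculpture,\n"
    "2,,Mural\n"
)
cleaned = merge_language_columns(list(csv.DictReader(sample)))
print(to_comma_csv(cleaned))
```

For a real export you would read the downloaded file with csv.DictReader instead of the in-memory sample, then write the result of to_comma_csv to a new .csv file.  Writing the file explicitly with a comma delimiter also sidesteps the separator problem a spreadsheet editor can introduce.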

5.  Open Google Fusion by going to drive.google.com.  If you do not already use Google Drive (formerly Docs), you will need to log in with a Google account or sign up for one.

6.  Once you are logged in, click on Create > More > Fusion Table (experimental).

Select Create, Other, Fusion Table

7.  On the next screen, we’re going to select “From this computer”, then click on Browse to get to the csv we created above.  Once the file is in the Browse text box, click on Next.

Browse for file

8.  Check that your data looks OK, then click on Next again.  A common problem occurs here when your spreadsheet editor chooses a separator other than a comma.  This is easy to fix: just click Back and indicate the correct separator character.

Check your data

9.  On the next screen, describe your table, then click on Finish.

Describe your table, and click Finish

10.  We have a Fusion table.  Now, let’s create our visualization.  Click on Visualize > Map.

Click on Visualize, then Map

Because our collection already contained geocodes in the dc.coverage.spatial column, the map is automatically created.  However, if you would like to use a different column, you can change it by selecting the Location field at the top left of the map.  Google Fusion Tables can also create the map using an address instead of a latitude/longitude pair.  If the map is zoomed far out, zoom in before you get the embed code to make sure the zoom is appropriate on your DSpace page.

We have a map
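Since the map depends on each row having a usable location, it can be worth checking the location column before you upload.  The sketch below assumes the geocodes are stored as "latitude,longitude" decimal pairs in dc.coverage.spatial and that each row has an id column; both are assumptions about your export, and parse_lat_lng and rows_without_geocode are hypothetical helper names.

```python
def parse_lat_lng(text):
    """Return (lat, lng) as floats, or None if text is not a valid pair."""
    parts = [p.strip() for p in (text or "").split(",")]
    if len(parts) != 2:
        return None
    try:
        lat, lng = float(parts[0]), float(parts[1])
    except ValueError:
        return None
    # Reject values outside the valid latitude/longitude ranges.
    if -90 <= lat <= 90 and -180 <= lng <= 180:
        return lat, lng
    return None


def rows_without_geocode(rows, column="dc.coverage.spatial"):
    """List the ids of rows whose location column cannot be parsed as a pair."""
    return [row.get("id") for row in rows if parse_lat_lng(row.get(column)) is None]


# Made-up rows standing in for the exported metadata.
rows = [
    {"id": "1", "dc.coverage.spatial": "40.0012, -83.0164"},
    {"id": "2", "dc.coverage.spatial": "Columbus, Ohio"},
    {"id": "3", "dc.coverage.spatial": ""},
]
print(rows_without_geocode(rows))
```

A flagged row is not necessarily wrong, since an address can also be mapped, but empty or unexpected values are worth a look before they turn into missing pins.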

11.  Now, let’s embed our map back in DSpace.  In Google Fusion, click on “Get embeddable link” at the top of the map.  In the dialog that comes up, copy the text in the field “Paste HTML to embed in a website”.  Note: your table must be shared for this to work.  Google should prompt you to share the table if you try to get an embeddable link for an unshared table.  If not, just click on Share in your Fusion window and make the table public.

Copy the link text

12.  Now, back in DSpace, click on Edit Collection.  In one of the HTML fields (I usually use Introductory Text), paste the text you copied.

Paste the embed code

13.  Here’s a huge gotcha.  The first snippet below is the embed code exactly as copied.  If you paste it just like this and click on Save, the Collection page will disappear because there is nothing between the tags.  We need to add something between the opening <iframe> and closing </iframe> tags, as in the second snippet.  Usually, I use “This browser does not support frames.”

<iframe width="500" height="300" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=MAP&amp;q=select+col4+from+1Fqwl_ugZxBx3vCXLVEfnujSpYJa9F0IICVqHLYw&amp;h=false&amp;lat=40.00118408791957&amp;lng=-83.016412&amp;z=10&amp;t=1&amp;l=col4"></iframe>

<iframe width="500" height="300" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=MAP&amp;q=select+col4+from+1Fqwl_ugZxBx3vCXLVEfnujSpYJa9F0IICVqHLYw&amp;h=false&amp;lat=40.00118408791957&amp;lng=-83.016412&amp;z=10&amp;t=1&amp;l=col4">This browser does not support frames.</iframe>
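If you embed visualizations often, the fallback text is easy to forget.  Here is a small sketch of a helper that adds it for you.  The function name add_iframe_fallback and the example.org URL are made up for illustration, and the pattern only handles the simple case of an iframe with nothing (or only whitespace) between its tags.

```python
import re

# An iframe element whose body is empty or whitespace only.
EMPTY_IFRAME = re.compile(r"(<iframe\b[^>]*>)\s*(</iframe>)", re.IGNORECASE)


def add_iframe_fallback(embed_html, fallback="This browser does not support frames."):
    """Insert fallback text into any iframe that has nothing between its tags."""
    # A function is used as the replacement so that backslashes in the
    # fallback text are never interpreted as regex group references.
    return EMPTY_IFRAME.sub(lambda m: m.group(1) + fallback + m.group(2), embed_html)


# Placeholder embed code; paste your own copied embed code here instead.
embed = '<iframe width="500" height="300" src="https://example.org/map"></iframe>'
print(add_iframe_fallback(embed))
```

An iframe that already contains text is left unchanged, so it is safe to run the helper on embed code you have already fixed.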

14.  Now, click on Save.  This will take you back to your collection homepage, which now has a map.

Embedded Map

15.  One last thing – that info window in the map is not really user-friendly.  Let’s go back to Google Fusion and fix it.  Just click on “Configure info window” above the Fusion map.  It will bring up a dialog which allows you to choose which fields you want to show, as well as modify the markup so that, for example, links display as links.

Modify the info window

16.  No need to re-embed; just head back to your DSpace page and click refresh.

Final embedded map

Done!  You can play with the settings at various points along the way to make the map smaller or larger.
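One setting that is easy to change after the fact is the size of the embedded frame, which is set by the width and height attributes in the embed code.  You can simply edit those numbers by hand in the DSpace HTML field.  If you maintain several embeds, a sketch like the following can do it consistently; resize_iframe is a hypothetical helper, and it assumes the attributes are written with straight double quotes.

```python
import re


def resize_iframe(embed_html, width, height):
    """Return embed_html with the iframe's width and height attributes replaced."""

    def set_attr(html, name, value):
        # Capture everything up to the attribute's opening quote, then its closing quote.
        pattern = re.compile(r'(<iframe\b[^>]*?\b%s=")[^"]*(")' % name, re.IGNORECASE)
        return pattern.sub(lambda m: m.group(1) + str(value) + m.group(2), html, count=1)

    return set_attr(set_attr(embed_html, "width", width), "height", height)


# Placeholder embed code with a made-up URL.
embed = '<iframe width="500" height="300" scrolling="no" src="https://example.org/map">No frames.</iframe>'
print(resize_iframe(embed, 800, 450))
```

Only the frame size changes here; the zoom level and center of the map still come from the view you set up in Google Fusion before copying the embed code.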

The How-To – DSpace and Viewshare

We can complete the same process using Viewshare.  If you skipped to this section, go back and read steps 1-4 above.

Back?  Ok.  So we should have a .csv of exported metadata from our DSpace collection.

1.  Log into Viewshare.  You will have to request an account if you don’t have one.

2.  From the homepage, click on Upload Data.

Click on Upload Data

3.  There are a multitude of source options, but we’re going to use the .csv we created above, so we select “From a file on your computer.”

Select "from a file"

4.  Browse for the file, then click on Upload.

5.  In the Preview Window, you can edit the field names to more user-friendly alternatives.  Use the check box under Enabled to include or exclude individual fields.  You can also select field types, so that data is formatted correctly (links, for example) and can be used for visualizations (dates or locations, for example).

Edit the data
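A timeline is only as good as the dates behind it, so it can help to check the date column before uploading.  The sketch below flags rows whose date is missing or is not written as YYYY, YYYY-MM, or YYYY-MM-DD.  The column name dc.date.issued, the id column, and those three patterns are assumptions about your metadata, not a statement of what Viewshare requires; parse_date and undated_rows are hypothetical helper names.

```python
from datetime import datetime

# Date patterns we assume the collection is meant to follow.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m", "%Y")


def parse_date(text):
    """Return a datetime for YYYY-MM-DD, YYYY-MM, or YYYY values, else None."""
    text = (text or "").strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def undated_rows(rows, column="dc.date.issued"):
    """List the ids of rows whose date column is empty or not in an expected format."""
    return [row.get("id") for row in rows if parse_date(row.get(column)) is None]


# Made-up rows standing in for the exported metadata.
rows = [
    {"id": "1", "dc.date.issued": "1998-05-12"},
    {"id": "2", "dc.date.issued": "1998"},
    {"id": "3", "dc.date.issued": "circa 1950"},
    {"id": "4", "dc.date.issued": ""},
]
print(undated_rows(rows))
```

Rows that are flagged can be cleaned up in the .csv before upload, or simply noted as items that may not appear on the timeline.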

6.  When you have finished editing, click on Save.  You will now see the dataset in your list of Data.  Click on Build to build a visualization.

Select Build

7.  You can pick any layout, but I usually pick the One Column for simplicity’s sake.

Select a layout

8.  The view will default to List, but really, we already have a list.  Let’s click on the Add a View tab to create another visualization.  For this example, we’re going to select Timeline.

Select a Timeline View

9. There are a variety of settings for this visualization.  Select the field which contains the date (in our case, we just have one date, so we leave End Date blank), decide how you want to color the timeline and what unit you want to use.  Timeline lens lets you decide what is included in the pop-up.  Click on Save (top right) when you are finished selecting values.

Select options for View

10.  We have created a timeline.  Now we need to embed it back in DSpace. Click on Embed in the top menu.

Now we have a timeline

11.  Copy the embed code.

Copy the embed code

12.  Again, back in DSpace, we will click on Edit Collection and paste the embed code into one of the HTML fields.  And, again, it is essential that there is some text between the tags.

Paste the embed code

Now we have an embedded timeline!

An embedded timeline

Depending on the space available on your DSpace homepage, you may want to adjust the top and bottom time bands so that the timeline displays more cleanly.

Of course, there are a few caveats.  For example, this approach works best with collections that are complete.  If items are still being added to the collection, the collection manager will need to build in a workflow to refresh the visualization from time to time.  This is done by re-exporting, re-uploading, and re-embedding.  Also, Google Fusion Tables is officially an “experimental” product.  It is important to keep your data elsewhere as well, and to be aware that your Fusion visualizations may not be permanent.
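For a collection that is still growing, one optional way to decide when a refresh is due is to compare a fresh metadata export against the one you last uploaded.  This is only a sketch: it assumes the export has an id column identifying each item, and item_ids and export_changes are hypothetical helper names.

```python
import csv
import io


def item_ids(csv_text):
    """Return the set of values in the 'id' column of a metadata export."""
    return {row["id"] for row in csv.DictReader(io.StringIO(csv_text)) if row.get("id")}


def export_changes(old_csv, new_csv):
    """Compare two exports; return (added_ids, removed_ids) as sorted lists."""
    old_ids, new_ids = item_ids(old_csv), item_ids(new_csv)
    return sorted(new_ids - old_ids), sorted(old_ids - new_ids)


# Made-up exports: item 1 was withdrawn and item 3 was added.
old_export = "id,dc.title\n1,Sculpture\n2,Mural\n"
new_export = "id,dc.title\n2,Mural\n3,Fountain\n"
added, removed = export_changes(old_export, new_export)
print("added:", added, "removed:", removed)
```

This only notices items that were added or removed.  Edits to the metadata of existing items would need a field-by-field comparison, which is left out here to keep the example short.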

However, this solution provides an easy, code-free way to improve the user interface to a collection.  Similar approaches may also work using platforms not described here. For example, here’s a piece on using Viewshare with Omeka, another open source collection management system.  The goal is to let each tool do what it does best, then make the results play nicely together.  This is a free and relatively painless way to achieve that goal.

About our Guest Author: Meghan Frazer is the Digital Resources Curator for the Knowlton School of Architecture at The Ohio State University.  She manages the school archives as well as the KSA Digital Library, and spends lots of time wrangling Drupal for the digital library site. Her professional interests include digital content preservation and data visualization.  Before attending library school, Meghan worked in software quality assurance and training and has a bachelor’s degree in Computer Science.  You can send tweets in her direction using @meghanfrazer.