I write a lot of “how-to” posts. This is fine and I actually think it’s fun…until I have a month like this past one, in which I worry that I have no business telling anyone “how-to” do anything. In which I have written the following comment multiple times:
//MF - this is a bad way to do this.
I decided to write this post because I think that maybe it is important to share the narrative of “how we struggle” alongside the “how-to”. I’ll describe the original problem I needed to tackle, my work towards a solution, and the remaining issues that exist. There are portions of the solution that work fine and I utilize those portions to illustrate the original requirements. But, as you will see, there is an assortment of unfinished details. At the end of the post, I’ll give you the link to my solution and you can judge for yourself.
The Original Problem
We want to implement a new gallery to highlight student work on our department’s main website. My primary area of responsibility is a different website – the digital library – which is the archival home for student work. Given my experience in working with these items and also with Views Isotope, the task of developing a “proof of concept” solution fell to me. I needed to implement some of the technically trickier things in our proposed solution for the gallery pages in order to prove that the features are feasible for the main website. I decided to use the digital library for this development because
- the digital library already has the appropriate infrastructure in place
- both the digital library and our main website are based in Drupal 7
- the “proof of concept” solution, once complete, could remain in the digital library as a browse feature for our historical collections of student work
The full requirements for the final gallery are outside of the scope of this post, but our problem area is modifying the Views Isotope module to do some advanced things.
First, I need to take the existing Views Isotope module and modify it to use hash history on the same page as multiple filters. Hash history with Isotope is implemented using the jQuery BBQ plugin, as is demonstrated on the Isotope website. This essentially means that when a user clicks on a filter, a hash value is added to the URL and becomes a line in the browser’s history. This allows one to click the back button to view the grid with the previous filters applied.
Our specific use case is the following: when viewing galleries of student work from our school, a user can filter the list of works by several filter options, such as degree or discipline (e.g., Undergraduate Architecture). These filter options are powered by term references in Drupal, as we saw in my earlier Isotope post. If the user clicks on an individual work to see details, clicking on the back button should return them to the already filtered list – they should not have to select the filters again.
Let’s take a look at how the end of the URL should progress. If we start with:
Then, when we select Architecture as our discipline, we should see the URL change to:
Everything after the # is referred to as the hash. If we then click on Undergraduate, the URL will change to:
If we click our Back button, we should go back to
With each move, the Isotope animation should fire and the items on the screen should only be visible if they contain term references to the vocabulary selected.
Further, the selected filter options should be marked as being currently selected. Items in one vocabulary which require an item in another vocabulary should be hidden and shown as appropriate. For example, if a user selects Architecture, they should not be able to select PhD from the degree program, because we do not offer a PhD degree in Architecture, therefore there is no student work. Here is an example of how the list of filters might look.
A good real-world example of the types of features we need can be seen at NPR’s Best Books of 2013 site. The selected filter is marked, options that are no longer available are removed and the animation fires when the filter changes. Further, when you click the Back button, you are taken back through your selections.
It turns out that the jQuery BBQ plugin works quite nicely with Isotope, again as demonstrated on the Isotope website. It also turns out that support for BBQ is included in Drupal core for the Overlay module. So theoretically this should all play nicely together.
The existing views-isotope.js file handles filtering as the menu items are clicked. The process is basically as follows:
- When the document is ready
  - Identify the container of items we want to filter, as well as the class on the items and set up Isotope with those options.
  - Pre-select All in each set of filters.
  - Identify all of the possible filter options on the page.
  - If a filter item is clicked,
    - first, check to make sure it isn’t already selected, if so, escape
    - then remove the “selected” class from the currently selected option in this set
    - add the “selected” class to the current item
    - set up an “options” variable to hold the value of the selected filter(s)
    - check for other items in other filter sets with the selected class and add them all to the selected filter value
    - call Isotope with the selected filter value(s)
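The "add them all to the selected filter value" step above can be sketched in plain JavaScript. This is a minimal illustration, not the code in views-isotope.js: the helper name `combineFilters` and the `discipline`/`degree` keys are hypothetical. It relies only on the fact that Isotope's filter option takes a selector string, so concatenated class selectors match items that carry every class.

```javascript
// Hypothetical helper (not part of views-isotope.js): builds one filter
// string from the option selected in each filter set. "*" stands for "All".
function combineFilters(selectedBySet) {
  const parts = Object.values(selectedBySet).filter(function (value) {
    return value && value !== "*";
  });
  // ".architecture" + ".undergraduate" -> ".architecture.undergraduate",
  // a selector that matches only items carrying both classes.
  return parts.length > 0 ? parts.join("") : "*";
}

// Illustrative selection: one option per filter set.
const exampleSelection = { discipline: ".architecture", degree: ".undergraduate" };
console.log(combineFilters(exampleSelection));
```

The resulting string is what would be handed to Isotope as its filter value. When every set is still on All, the helper falls back to "*".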
To add the filter to the URL we can use bbq.pushState, which will add “a ‘state’ into the browser history at the current position, setting location.hash and triggering any bound hashchange event callbacks (provided the new state is different than the previous state).”
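To make the quoted behavior concrete, here is a small sketch of what pushing state amounts to: serialize the options into a fragment and only record a new entry when it differs from the current hash. The functions `toFragment` and `nextHash` are simplified stand-ins written for this post, not BBQ's implementation. In the browser the real call would be along the lines of `$.bbq.pushState({ filter: ".architecture" })`. The `filter` key is an assumption taken from the `data-option-key="filter"` attribute on the filter list.

```javascript
// Simplified stand-in for the serialization step; this is NOT BBQ's own code.
function toFragment(state) {
  return Object.keys(state)
    .map(function (key) {
      return encodeURIComponent(key) + "=" + encodeURIComponent(state[key]);
    })
    .join("&");
}

// Returns the hash to set, or null when the state is unchanged and no new
// history entry (and no hashchange event) is needed.
function nextHash(currentHash, state) {
  const fragment = "#" + toFragment(state);
  return fragment === currentHash ? null : fragment;
}

console.log(nextHash("", { filter: ".architecture" }));
```

Returning null for an unchanged state mirrors the "provided the new state is different than the previous state" clause in the quote.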
We then want to handle what’s in the hash when the browser back button is clicked, or if a user enters a URL with the hash value applied. So we add an option for handling the hashchange event mentioned above. Instead of calling Isotope from the link click function, we call it from the hashchange event portion. Now our algorithm looks more like this. The new steps are the BBQ include, the push to the URL in place of the direct Isotope call, and everything related to the hashchange event:
- Include the misc/jquery.ba-bbq.js for BBQ (I have to do this explicitly because I don’t use Overlay)
- When the document is ready
  - identify the container of items we want to filter, as well as the class on the items and set up Isotope with those options.
  - Pre-select All in each set of filters.
  - Identify all of the possible filter options on the page.
  - If a filter item is clicked,
    - first, check to make sure it isn’t already selected, if so, escape
    - then remove the “selected” class from the currently selected option in this set
    - add the “selected” class to the current item
    - set up an “options” variable to hold the value of the selected filter(s)
    - push “options” to URL and trigger hashchange (don’t call Isotope yet)
  - If a hashchange event is detected
    - create new “hashOptions” object according to what’s in the hash, using the deparam.fragment function from jQuery BBQ
    - manipulate CSS classes such as “not-available” (e.g., if Architecture is selected, apply it to PhD) and “selected” based on what’s in “hashOptions”
    - call Isotope with “hashOptions” as the parameter
  - trigger hashchange event to pick up anything that’s in the URL when the page loads
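The hashchange branch of the steps above can be sketched without a browser. The `parseFragment` function below is a simplified stand-in for what `$.deparam.fragment()` does with flat key=value pairs. It is not BBQ's code, and the `filter` key is an assumption based on the `data-option-key="filter"` attribute.

```javascript
// Decode one URL component, tolerating malformed escapes instead of throwing.
function safeDecode(text) {
  try {
    return decodeURIComponent(text.replace(/\+/g, " "));
  } catch (error) {
    return text;
  }
}

// Simplified stand-in for $.deparam.fragment(): turns "#a=1&b=2" into
// { a: "1", b: "2" }. Nested keys and type coercion are not handled.
function parseFragment(hash) {
  const result = {};
  const body = hash.replace(/^#/, "");
  if (body === "") {
    return result;
  }
  body.split("&").forEach(function (pair) {
    const index = pair.indexOf("=");
    const key = index === -1 ? pair : pair.slice(0, index);
    const value = index === -1 ? "" : pair.slice(index + 1);
    result[safeDecode(key)] = safeDecode(value);
  });
  return result;
}

// Decide which filter to hand to Isotope; an empty hash means "All".
function filterFromHash(hash) {
  const hashOptions = parseFragment(hash);
  return hashOptions.filter || "*";
}

console.log(filterFromHash("#filter=.architecture.undergraduate"));
```

Because the same function runs for a click, the Back button and a fresh page load, the grid state always follows the URL rather than the other way around.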
I also updated any available pager links so that they link not just to the appropriate page, but also so that the filters are included in the link. This is done by appending the hash value to the href attribute for each link with a class “pager”.
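The pager change can be expressed as one small function. This is a sketch under assumptions: `withHash` is a hypothetical name, and the example path and query string are illustrative rather than copied from the site.

```javascript
// Hypothetical helper: returns a pager href with the current hash appended,
// replacing any fragment the link already carries.
function withHash(href, hash) {
  const base = href.split("#")[0];
  if (!hash || hash === "#") {
    return base;
  }
  return base + (hash.charAt(0) === "#" ? hash : "#" + hash);
}

console.log(withHash("/student-work-archives?page=1", "#filter=.architecture"));
```

In the page itself, this would be applied to the href attribute of each link with the class “pager” whenever the hash changes, so the links never carry a stale filter.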
And it works. Sort of…
The Unfinished Details
Part of the solution described above only works on Chrome and – believe it or not – Internet Explorer. In all browsers, clicking the back button works just as described above, as long as one is still on the page with the list of works. However, when linking directly to a page with the filters included (as we are doing with the pager) or hitting back from a page that does not have the hash present (say, after visiting an individual item), it does not work on Firefox or Safari. I think this may have to do with the deparam.fragment function, because that appears to be where it gets stuck, but so far I can’t track it down. I could read window.location.hash directly, but I think that’s a security issue (what’s to stop someone from injecting something malicious after the hash?).
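On the injection worry, one possible mitigation is to accept a value from the hash only when it is built entirely from filter classes that are actually on the page. The sketch below is an idea I have not tested in the gallery; `sanitizeFilter` and the list of allowed classes are hypothetical.

```javascript
// Hypothetical guard: accept a filter string from the hash only if it is
// "*" or a chain of known, dot-prefixed class names. Anything else is
// treated as "All".
function sanitizeFilter(rawFilter, allowedClasses) {
  if (rawFilter === "*") {
    return "*";
  }
  // Only letters, digits and hyphens are allowed in each class name.
  if (!/^(\.[a-z0-9-]+)+$/i.test(rawFilter)) {
    return "*";
  }
  const classes = rawFilter.split(".").slice(1);
  const allKnown = classes.every(function (name) {
    return allowedClasses.indexOf(name) !== -1;
  });
  return allKnown ? rawFilter : "*";
}

// Illustrative list; in practice it would be collected from the filter links.
const knownClasses = ["architecture", "undergraduate", "phd"];
console.log(sanitizeFilter(".architecture.undergraduate", knownClasses));
```

With a check like this, whatever parses the hash matters less, because nothing reaches Isotope or the DOM unless it is one of the expected filter values.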
Also, in order to make sure the classes are applied correctly, it feels like I do a lot of “remove it from everywhere, then add it back”. For example, if I select Architecture, PhD is then hidden from the degree list by assigning the class “not-available”. When a user clicks on City and Regional Planning or All, I need that PhD to appear again. Unfortunately, the All filter is handled differently – it is only present in the hash if no other options on the page are selected. So, I remove “not-available” from all filter lists on hashchange and then reassign based on what’s in the hash. It seems like it would be more efficient just to change the one I need, but I can’t figure it out. Or maybe I should alter the way All is handled completely – I don’t know.
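One alternative to "remove it from everywhere, then add it back" is to compute the set of unavailable options from the hash, compare it with the previous set, and touch only what changed. This is a sketch of that idea, not what the gallery does today. The rule table and function names are hypothetical, and the only rule shown is the Architecture/PhD case described above.

```javascript
// Hypothetical rule table: for each selected class, the options that have
// no student work and should be marked "not-available".
const unavailableRules = {
  architecture: ["phd"]
};

// Collect every option that the current selection makes unavailable.
function notAvailable(selectedClasses, rules) {
  const hidden = [];
  selectedClasses.forEach(function (name) {
    (rules[name] || []).forEach(function (blocked) {
      if (hidden.indexOf(blocked) === -1) {
        hidden.push(blocked);
      }
    });
  });
  return hidden;
}

// Compare the previous and next sets so only changed options are touched.
function classChanges(previous, next) {
  return {
    add: next.filter(function (name) { return previous.indexOf(name) === -1; }),
    remove: previous.filter(function (name) { return next.indexOf(name) === -1; })
  };
}

const before = notAvailable(["architecture"], unavailableRules);
const after = notAvailable([], unavailableRules); // user clicked All
console.log(classChanges(before, after));
```

Because All simply produces an empty selection here, it needs no special handling: PhD shows up in the remove list and nothing else is touched.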
It is hard to have confidence in a solution when building while learning. When I run into a snag, I have to consider whether or not the problem is the entire approach, as opposed to a syntax error or a limitation of the library. Frequently, I find an answer to an issue I’m having, but have to look up something from the answer in order to understand it. I worry that the code contains rookie mistakes – or even intermediate mistakes – which will bite us later, but it is difficult to do an exhaustive analysis of all the available resources. Coding elegantly is an art which requires more than a basic understanding of how the pieces play together.
Inelegant code, however, can still help make progress. To see the progress I have made, you can visit https://ksamedia.osu.edu/student-work-archives and play with the filters. This solution is good because it proves we can develop our features using Isotope, BBQ and Views Isotope. The trick now is figuring out how to paint and put locks and doors on our newly built house, or possibly move a wall or two.
I also love Views Isotope, a Drupal 7 module that enabled me to create a dynamic image gallery for our school’s Year in Review. This module (paired with a few others) is instrumental in building our new digital library.
In this blog post, I will walk you through how we created the Year in Review page, and how we plan to extrapolate the design to our collection views in the Knowlton Digital Library. This post assumes you have some basic knowledge of Drupal, including an understanding of content types, taxonomy terms and how to install a module.
Year in Review Project
Our Year in Review project began over the summer, when our communications team expressed an interest in displaying the news stories from throughout the school year in an online, interactive display. The designer on our team showed me several examples of card-like interfaces, emphasizing the importance of ease and clean graphics. After some digging, I found Isotope, which appeared to be the exact solution we needed. Isotope, according to its website, assists in creating “intelligent, dynamic layouts that can’t be achieved with CSS alone.” This jQuery library provides for the display of items in a masonry or grid-type layout, augmented by filters and sorting options that move the items around the page.
At first, I was unsure we could make this library work with Drupal, the content management system we employ for our main web site and our digital library. Fortunately I soon learned – as with many things in Drupal – there’s a module for that. The Views Isotope module provides just the functionality we needed, with some tweaking, of course.
We set out to display a grid of images, each representing a news story from the year. We wanted to allow users to filter those news stories based on each of the sections in our school: Architecture, Landscape Architecture and City and Regional Planning. News stories might be relevant to one, two or all three disciplines. The user can see the news story title by hovering over the image, and read more about the news story by clicking on the corresponding item in the grid.
Views Isotope Basics
Views Isotope is installed in the same way as other Drupal modules. There is an example in the module and there are also videos linked from the main module page to help you implement this in Views. (I found this video particularly helpful.)
You must have the following modules installed to use Views Isotope:
You also need to install the Isotope jQuery library. It is important to note that Isotope is only free for non-commercial projects. To install the library, download the package from the Isotope GitHub repository. Unzip the package and copy the whole directory into your libraries directory. Within your Drupal installation, this should be in the /sites/all/libraries folder. Once the module and the library are both installed, you’re ready to start.
If you have used Drupal, you have likely used Views. It is a very common way to query the underlying database in order to display content. The Views Isotope module provides additional View types: Isotope Grid, Isotope Filter Block and Isotope Sort Block. These three view types combine to provide one display. In my case, I have not yet implemented the Sort Block, so I won’t discuss it in detail here.
To build a new view, go to Structure > Views > Add a new view. In our specific example, we’ll talk about the steps in more detail. However, there are a few important tenets of using Views Isotope, regardless of your setup:
- There is a grid. The View type Isotope Grid powers the main display.
- The field on which we want to filter is included in the query that builds the grid, but a CSS class is applied which hides the filters from the grid display and shows them only as filters.
- The Isotope Filter Block drives the filter display. Again, a CSS class is applied to the fields in the query to assign the appropriate display and functionality, instead of using default classes provided by Views.
- Frequently in Drupal, we are filtering on taxonomy terms. It is important that when we display these items we do not link to the taxonomy term page, so that a click on a term filters the results instead of taking the user away from the page.
With those basic tenets in mind, let’s look at the specific process of building the Year in Review.
Building the Year in Review
Armed with the Views Isotope functionality, I started with our existing Digital Library Drupal 7 instance and one content type, Item. Items are our primary content type and contain many, many fields, but here are the important ones for the Year in Review:
- Title: text field containing the headline of the article
- Description: text field containing the shortened article body
- File: File field containing an image from the article
- Item Class: A reference to a taxonomy term indicating if the item is from the school archives
- Discipline: Another term reference field which ties the article to one or more of our disciplines: Architecture, Landscape Architecture or City and Regional Planning
- Showcase: Boolean field which flags the article for inclusion in the Year in Review
The last field was essential so that the communications team liaison could curate the page. There are more news articles in our school archives than we necessarily want to show in the Year in Review, and the showcase flag solves this problem.
In building our Views, we first wanted to pull all of the Items which have the following characteristics:
- Item Class: School Archives
- Showcase: True
So, we build a new View. While logged in as administrator, we click on Structure, Views then Add a New View. We want to show Content of type Item, and display an Isotope Grid of fields. We do not want to use a pager. In this demo, I’m going to build a Page View, but a Block works as well (as we will see later). So my settings appear as follows:
Click on Continue & edit. For the Year in Review we next needed to add our filters – for Item Class and Showcase. Depending on your implementation, you may not need to filter the results, but likely you will want to narrow the results slightly. Next to Filter Criteria, click on Add.
If you click Update Preview at the bottom of the View edit screen, you’ll see that much of the formatting is already done with just those steps.
Note that the formatting in the image above is helped along by some CSS. To style the grid elements, the Views Isotope module contains its own CSS in the module folder ([drupal_install]/sites/all/modules/views_isotope). You can move forward with this default display if it works for your site. Or, you can override this in the site’s theme files, which is what I’ve done above. In my theme CSS file, I have applied the following styling to the class “isotope-element”
I use the Rendered File Formatter and select the Grid View Mode, which applies an Image Style to the file, resizing it to 180 x 140. Clicking Update Preview again shows that the image has been added to each item.
This is closer, but in our specific example, we want to hide the title until the user hovers over the item. So, we need to add some CSS to the title field.
In my CSS file, I have the following:
Note the opacity is 0 – which means the div is transparent, allowing the image to show through. Then, I added a hover style which just changes the opacity to mostly cover the image:
Now, if we update preview, we should see the changes.
The last thing we need to do is add the Discipline field for each item so that we can filter.
There are two very important things here. First, we want to make sure that the field is not formatted as a link to the term, so we select Plain text as the Formatter.
Second, we need to apply a CSS class here as well, so that the Discipline fields show in filters, not in the grid. To do that, check the Customize field HTML and select the DIV element. Then, select Create a class and enter “isotope-filter”. Also, uncheck “Apply default classes.” Click Apply.
Using Firebug, I can now look at the generated HTML from this View and see that the isotope-element <div> contains all the fields for each item, though the isotope-filter class loads Discipline as hidden.
<div class="isotope-element landscape-architecture" data-category="landscape-architecture">
  <div class="views-field views-field-title"> (collapsed for brevity) </div>
  <div class="views-field views-field-field-file"> (collapsed for brevity) </div>
  <div>
    <div class="isotope-filter">Landscape Architecture</div>
  </div>
</div>
You might also notice that the data-category for this element is assigned as landscape-architecture, which is our Discipline term for this item. This data-category will drive the filters.
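The relationship between a term name and its data-category can be approximated as follows. This is only a sketch of the observable output (“Landscape Architecture” becomes landscape-architecture). The function `termToClass` is hypothetical, and I have not checked how the Views Isotope module actually generates the value.

```javascript
// Hypothetical approximation of the term-to-class mapping seen in the
// rendered HTML; the module's real implementation may differ.
function termToClass(termName) {
  return termName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // runs of spaces or punctuation become one hyphen
    .replace(/^-+|-+$/g, "");    // trim hyphens at either end
}

console.log(termToClass("Landscape Architecture"));
```

The practical point is that the filter value and the item class have to agree exactly, which is why the formatting of the Discipline field matters so much.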
So, let’s save our View by clicking Save at the top and move on to create our filter block. Create a new view, but this time create a block which displays taxonomy terms of type Discipline. Then, click on Continue & Edit.
The first thing we want to do is adjust the view so that the default row wrappers are not applied. Note: this is the part I ALWAYS forget, and then when my filters don’t work it takes me forever to track it down.
Click on Settings next to Fields.
Next, we do not want the fields to be links to term pages, because a user click should filter the results, not link back to the term. So, click on the term name to edit that field. Uncheck the box next to “Link this field to its taxonomy term page”. Click on Apply.
Save the view.
The last thing is to make the block appear on the page with the grid. In practice, Drupal administrators would use Panels or Context to accomplish this (we use Context), but it can also be done using the Blocks menu.
So, go to Structure, then click on Blocks. Find our Isotope-Filter Demo block. Because it’s a View, the title will begin with “View:”
Click Configure. Set block settings so that the Filter appears only on the appropriate Grid page, in the region which is appropriate for your theme. Click save.
Now, let’s visit our /isotope-grid-demo page. We should see both the grid and the filter list.
It’s worth noting that here, too, I have customized the CSS. If we look at the rendered HTML using Firebug, we can see that the filter list is in a div with class “isotope-options” and the list itself has a class of “isotope-filters”.
<div class="isotope-options">
  <ul class="isotope-filters option-set clearfix" data-option-key="filter">
    <li><a class="" data-option-value="*" href="#filter">All</a></li>
    <li><a class="filterbutton" href="#filter" data-option-value=".architecture">Architecture</a></li>
    <li><a class="filterbutton selected" href="#filter" data-option-value=".city-and-regional-planning">City and Regional Planning</a></li>
    <li><a class="filterbutton" href="#filter" data-option-value=".landscape-architecture">Landscape Architecture</a></li>
  </ul>
</div>
I have overridden the CSS for these classes to remove the background from the filters and change the list-style-type to none, but you can obviously make whatever changes you want. When I click on one of the filters, it shows me only the news stories for that Discipline. Here, I’ve clicked on City and Regional Planning.
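What the filter click does can be modeled in a few lines. This is an illustration of the matching logic only, using made-up items. Isotope does the real work in the browser by testing each grid element against the selector in data-option-value.

```javascript
// Returns true when an item carrying the given classes matches a filter
// value such as "*" or ".city-and-regional-planning".
function matchesFilter(itemClasses, filterValue) {
  if (filterValue === "*") {
    return true;
  }
  const required = filterValue.split(".").filter(Boolean);
  return required.every(function (name) {
    return itemClasses.indexOf(name) !== -1;
  });
}

// Hypothetical news stories; titles and classes are for illustration only.
const stories = [
  { title: "Story A", classes: ["isotope-element", "architecture"] },
  { title: "Story B", classes: ["isotope-element", "city-and-regional-planning"] },
  { title: "Story C", classes: ["isotope-element", "architecture", "city-and-regional-planning"] }
];

function visibleTitles(items, filterValue) {
  return items
    .filter(function (item) { return matchesFilter(item.classes, filterValue); })
    .map(function (item) { return item.title; });
}

console.log(visibleTitles(stories, ".city-and-regional-planning"));
```

A story tagged with two disciplines, like Story C here, stays visible under either filter, which matches how our news stories can belong to more than one discipline.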
So, how do we plan to use this in our digital library going forward? So far, we have mostly used the grid without the filters, such as in one of our Work pages. This shows the metadata related to a given work, along with all the items tied to that work. Eventually, each of the taxonomy terms in the metadata will be a link. The following grids are all created with blocks instead of pages, so that I can use Context to override the default term or node display.
However, in our recently implemented Collection view, we allow users to filter the items based on their type: image, video or document. Here, you see an example of one of our lecture collections, with the videos and the poster in the same grid, until the user filters for one or the other.
There are two obstacles to using this feature in a more widespread manner throughout the site. First, I have only recently figured out how to implement multiple filter options. For example, we might want to filter our news stories by Discipline and Semester. To do this, we rewrite the sorting fields in our Grid display so that they all display in one field. Then, we create two Filter blocks, one for each set of terms. Implementing this across the site so that users can sort by say, item type and vocabulary term, will make it more useful to us.
Second, we have several Views that might return upwards of 500 items. Loading all of the image files for this result set is costly, especially when you add in the additional overhead of a full image loading in the background for a Colorbox overlay and Drupal performance issues. The filters will not work across pages, so if I use a pager, I will only filter the items on the page I’m viewing. I believe this can be fixed somehow using Infinite Scroll (as described in several ways here), but I have not tried yet.
With these two advanced options, there are many options for improving the digital library interface. I am especially interested in how to use multiple filters on a set of search results returned from a SOLR index.
What other extensions might be useful? Let us know what you think in the comments.
- Views Isotope: https://drupal.org/project/views_isotope
- Isotope jQuery library with examples: http://isotope.metafizzy.co/index.html
- If you want to go one step further, we have also implemented Colorbox, so when a user clicks on a tile, they get a popup overlay gallery, instead of going straight to the node. More information on Colorbox can be found at the Colorbox site (http://www.jacklmoore.com/colorbox/) and the Colorbox Drupal module (https://drupal.org/project/colorbox).
Once you have built a local development environment using an AMP stack, the next logical question is, “now what?” And the answer is, truly, whatever you want. As an example, in this blog post we will walk through installing Drupal and WordPress on your local machine so that you can develop and test in a low-risk environment. However, you can substitute other content management systems or development platforms and the goal is the same: we want to mimic our web server environment on our local machine.
The only prerequisite for these recipes is a working AMP stack (see our tutorials for Mac and Windows), and administrative rights to your computer. The two sets of steps are very similar. We need to download and unpack the files to our web root, create a database and point to it from a configuration file, and run an install script from the browser.
There are tutorials around the web on how to do both things, but I think there are two likely gotchas for newbies:
- There’s no “installer” that installs the platform to your system. You unzip and copy the files to the correct place. The “install” script is really a “setup” script, and is run after you can access the site through a browser.
- Setting up and linking the database must be done correctly, or the site won’t work.
So, we’ll step through each process with some extra explanation.
Drupal is an open source content management platform. Many libraries use it for their website because it is free and it allows for granular user permissions. So as the site administrator, I can provide access for staff to edit certain pages (e.g., the reference desk schedule) but not others (say, colleagues’ user profiles). In our digital library, my curatorial users can edit content, authorized student users can see content just for our students, and anonymous users can see public collections. The platform has its downsides, but there is a large and active user community. A problem’s solution is usually only a Google search (or a few) away.
The Drupal installation guide is a little more technical, so feel free to head there if you’re comfortable on the command line.
First, download the Drupal files from the Drupal core page. The top set of downloads (green background) are stable versions of the platform. The lower set are versions still in development. For our purposes we want the green download, and because I am on my Mac, I will download the tar.gz file for the most recent version (at the time of this writing, 7.23). If you are on a Windows machine, and have 7zip installed, you can also use the .tar.gz file. If you do not have 7zip installed, use the .zip file.
Now, we need to create the database we’re going to use for Drupal. In building the AMP stack, we also installed phpMyAdmin, and we’ll use it now. Open a browser and navigate to the phpMyAdmin installation (if you followed the earlier tutorials, this will be http://localhost/~yourusername/phpmyadmin on Mac and http://localhost/phpmyadmin on Windows). Log in with the root user you created when you installed MySQL.
The Drupal installation instructions suggest creating a user first, and through that process, creating the database we will use. So, start by clicking on Users.
Look for the Add user button.
Next, we want to create a username – which will serve as the user login as well as the name of the database. Create a password and select “Local” from the Host dropdown. This will only allow traffic from the local machine. Under the “Database for user”, we want to select “Create database with same name and grant all privileges.”
Next, let’s copy the Drupal files and configure the settings. Locate the file you downloaded in the first step above, move it to your web root folder and unpack it there. This is the folder you previously used to test Apache and install phpMyAdmin so you could access files through your browser. For example, on my Mac this is in mfrazer/sites.
You may want to change the folder name from drupal-7.23 to something a little more user friendly, e.g. drupal without the version number. Generally, it’s bad practice to have periods in file or folder names. However, for the purposes of this tutorial, I’m going to leave the example unchanged.
Now, we want to create our settings file. Inside your Drupal folder, look for the sites folder. We want to navigate to sites/default and create a copy of the file called default.settings.php. Rename the copy to settings.php and open in your code editor.
Each section of this file contains extensive directions on how to set the settings. At the end of the Database Settings section (line 213 as of this writing), we want to replace this
$databases = array();
with this:
$databases['default']['default'] = array(
  'driver' => 'mysql',
  'database' => 'sampledrupal',
  'username' => 'sampledrupal',
  'password' => 'samplepassword',
  'host' => 'localhost',
  'prefix' => '',
);
Remember, if you followed the steps above, ‘database’ and ‘username’ should have the same value. Save the file.
Go back to your unpacked folder and create a directory called “files” in the same directory as our settings.php file.
Now we can navigate to the setup script in our browser. The URL is comprised of the web root, the name of the folder you extracted the drupal files into and then install.php. So, in my case this is:
If I were on a Windows machine, and had changed the name of the folder to be mydrupal, then the path would be
Either way, you should get something that looks like this:
For your first installation, I would choose Standard, so you can see what the Standard install looks like. I use Minimal for many of my sites, but if it’s your first pass into Drupal it is good to see what’s there.
Next, pick a language and click Save and Continue. Now, the script is going to attempt to verify your requirements. You may run into an error that looks like this:
We need to make our files directory writable by our web server user. We can do this a bunch of different ways. It’s important to think about what you’re doing, because it involves file permissions, especially if you are allowing users in from outside your system.
On my Mac, I choose to make _www (which is the hidden web server user) the owner of the folder. To do this, I open Terminal and type in
sudo chown _www files
Remember, sudo will elevate us to administrator. Type in your password when prompted. The next command is chown, followed by the new owner and then the folder in question. So this command will change the owner to _www for the folder “files”.
In Windows, I did not see this error. However, if needed, I would handle the permissions through the user interface, by navigating to the files folder, right-clicking and selecting Properties. Click on the Security tab, then click on Edit. In this case, we are just going to grant permissions to the users of this machine, which will include the web server user.
Click on Users, then scroll down to click the check box under “Allow” and next to “Write.” Click on Apply and then OK. Click OK again to close the Properties window.
On my Windows machine, I got a PHP error instead of the file permissions error.
This is an easy fix: we just need to enable the gd2 and mbstring extensions in our php.ini file and restart Apache to pick up the changes.
To do this, open your php.ini file (if you followed our tutorials, this will be in your c:\opt\local directory). Beginning on Line 868, in the Windows Extensions section, uncomment (remove the leading semicolon from) the following lines (they are not right next to each other, they’re in a longer list, but we want these three uncommented):
Save the php.ini file. Restart Apache by going to Services, clicking on Apache 2.4 and clicking Restart.
Once you think you’ve fixed the issues, go back to your browser and click on Refresh. The Verify Requirements should pass and you should see a progress bar as Drupal installs.
Next you are taken to the Configure Site page, where you fill in the site name, your email address and create the first user. This is important, as there are a couple of functions restricted only to this user, so remember the user name and password that you choose. I usually leave the Server Settings alone and uncheck the Notification options.
Click Save and Continue. You should be congratulated and provided a link to your new site.
WordPress is a very common blogging platform; we use it at ACRL TechConnect. It can also be modified to be used as a content management platform or an image gallery.
Full disclosure: Until writing this post, I had never done a local install of WordPress. Fortunately, I can report that it’s very straightforward. So, let’s get started.
The MAMP instructions for WordPress advise creating the database first, and using the root credentials. I am not wild about this solution, because I prefer to have separate users for my databases and I do not want my root credentials sitting in a file somewhere. So we will set up the database the same way we did above: create a user and a database at the same time.
Open a browser and navigate to the phpMyAdmin installation (if you followed the earlier tutorials, this will be http://localhost/~yourusername/phpmyadmin on Mac and http://localhost/phpmyadmin on Windows). Log in with the root user you created when you installed MySQL and click on Users.
Look for the Add user button.
Next, we want to create a username – which will serve as the user login as well as the name of the database. Create a password and select “Local” from the Host dropdown. This will only allow traffic from the local machine. Under the “Database for user”, we want to select “Create database with same name and grant all privileges.”
Now, let’s download our files. Go to http://wordpress.org/download and click on Download WordPress. Move the .zip file to your web root folder and unzip it. This is the folder you previously used to test Apache and install phpMyAdmin so you could access files through your browser. For example, on my Mac this is in mfrazer/sites. If you followed our tutorial for Windows, it would be c:\sites.
Next, we need to create a config file. WordPress comes with a wp-config-sample.php file. Make a copy of it and rename it to wp-config.php and open it with your code editor.
Enter the database name, user name and password we just created. Remember, if you followed the steps above, the database name and user name should be the same. Verify that the host is set to localhost and save the file.
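If you would rather script this step than edit the file by hand, here is a minimal Python sketch. It assumes the sample file still uses the stock placeholder values (database_name_here, username_here and password_here). The function name fill_wp_config and the example credentials are made up for illustration. It does not escape quotes, so avoid single quotes or backslashes in the password.

```python
def fill_wp_config(sample_text: str, db_name: str, db_user: str, db_password: str) -> str:
    """Swap the placeholder values from wp-config-sample.php for real ones."""
    # These placeholders are assumed to match the stock sample file.
    replacements = {
        "database_name_here": db_name,
        "username_here": db_user,
        "password_here": db_password,
    }
    for placeholder, value in replacements.items():
        sample_text = sample_text.replace(placeholder, value)
    return sample_text


if __name__ == "__main__":
    # Hypothetical credentials; the database and user share a name,
    # as in the phpMyAdmin step above.
    sample = "define('DB_NAME', 'database_name_here');\n"
    print(fill_wp_config(sample, "wpdev", "wpdev", "example-password"))
```

In practice you would read wp-config-sample.php, pass its text through the function and write the result to wp-config.php. Doing it by hand in a code editor works just as well.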
Navigate in your browser to the WordPress folder. The URL consists of the web root and the name of the folder where you extracted the WordPress files. So, in my case this is:
If I was on a Windows machine, and had changed the name of the folder to be wordpress-dev, then the path would be
Either way, you should get something that looks like this:
Fill in the form and click on Install WordPress. It might take a few minutes, but you should get a success message and a Log In button. Log in to your site using the credentials you just created in the form.
You’re ready to start coding and testing. The next step is to think about what you want to do. You might take a look at the theming resources provided by both WordPress and Drupal. You might want to go all out and write a module. No matter what, though, you now have an environment that will help you abide by the cardinal rule of development: Thou shalt not mess around in production.
Let us know how it’s going in the comments!
When I was a kid, I cherished the Paula Danziger book Remember Me to Harold Square, in which a group of kids call themselves the Serendipities, named for the experience of making fortunate discoveries accidentally. Last week I found myself remembering the book over and over again as I helped develop Serendip-o-matic, a tool which introduces serendipity to research, as part of a twelve-person team attending One Week | One Tool at the Roy Rosenzweig Center for History and New Media at George Mason University (RRCHNM).
In this blog post, I’ll take you through the development of the “serendipity machine”, from the convening of the team to the selection and development of the tool. The experience turned out to be an intense learning experience for me, so along the way, I will share some of my own fortunate discoveries.
(Note: this is a pretty detailed play-by-play of the process. If you’re more interested in the result, please see the RRCHNM news items on both our process and our product, or play with Serendip-o-matic itself.)
The Eve of #OWOT
Approximately thirty people applied to be part of One Week | One Tool (OWOT), an Institute for Advanced Topics in the Digital Humanities, sponsored by the National Endowment for the Humanities. Twelve were selected and we arrive on Sunday, July 28, 2013 and convene in the Well, the watering hole at the Mason Inn.
Tom Scheinfeldt (@foundhistory), the RRCHNM director-at-large who organized OWOT, delivers the pre-week pep talk and discusses how we will measure success. The development of the tool is important, but so is the learning experience for the twelve assembled scholars. It’s about the product, but also about the process. We are encouraged to learn from each other, to “hitch our wagon” to another smart person in the room and figure out something new.
As for the product, the goal is to build something that is used. This means that defining and targeting the audience is essential.
The tweeting began before we arrived, but typing starts in earnest at this meeting and the #owot hashtag is populated with our own perspectives and feedback from the outside. Feedback, as it turns out, will be the priority for Day 1.
@DoughertyJack: “One Week One Tool team wants feedback on which digital tool to build.”
Mentors from RRCHNM take the morning to explain some of the basic tenets of what we’re about to do. Sharon Leon talks about the importance of defining the project: “A project without an end is not a project.” Fortunately, the one week timeline solves this problem for us initially, but there’s the question of what happens after this week?
Patrick Murray-John takes us through some of the finer points of developing in a collaborative environment. Sheila Brennan discusses outreach and audience, and continues to emphasize the point from the night before: the audience definition is key. She also says the sentence that, as we’ll see, would need to be my mantra for the rest of the project: “Being willing to make concrete decisions is the only way you’re going to get through this week.”
All of the advice seems spot-on and I find myself nodding my head. But we have no tool yet, and so how to apply specifics is still really hazy. The tool is the piece of the puzzle that we need.
We start with an open brainstorming session, which results in a filled whiteboard of words and concepts. We debate audience, we debate feasibility, we debate openness. Debate about openness brings us back to the conversation about audience – for whom are we being open? There’s a lot of conversation but at the end, we essentially have just a word cloud associated with projects in our heads.
So, we then take those ideas and try to express them in the following format: X tool addresses Y need for Z audience. I am sitting closest to the whiteboards so I do a lot of the scribing for this second part and have a few observations:
- there are pet projects in the room – some folks came with good ideas and are planning to argue for them
- our audience for each tool is really similar; as a team we are targeting “researchers”, though there seems to be some debate on how inclusive that term is. Are we including students in general? Teachers? What designates “research”? It seems to depend on the proposed tool.
- the problem or need is often hard to articulate. “It would be cool” is not going to cut it with this crowd, but there are some cases where we’re struggling to define why we want to do something.
A few group members begin taking the rows and creating usable descriptions and titles for the projects in a Google Doc, as we want to restrict public viewing while still sharing within the group. We discuss several platforms for sharing our list with the world, and land on IdeaScale. We want voters to be able to vote AND comment on ideas, and IdeaScale seems to fit the bill. We adjourn from the Center and head back to the hotel with one thing left to do: articulate these ideas to the world using IdeaScale and get some feedback.
The problem here, of course, is that everyone wants to make sure that their idea is communicated effectively and we need to agree on public descriptions for the projects. Finally, it seems like there’s a light at the end of the tunnel…until we hit another snag. IdeaScale requires a login to vote or comment and there’s understandable resistance around the table to that idea. For a moment, it feels like we’re back to square one, or at least square five. Team members begin researching alternatives but nothing is perfect, we’ve already finished dinner and need the votes by 10am tomorrow. So we stick with IdeaScale.
And, not for the last time this week, I reflect on Sheila’s comment, “being willing to make concrete decisions is the only way you’re going to get through this week.” When new information, such as the login requirement, challenges the concrete decision you made, how do you decide whether or not to revisit the decision? How do you decide that with twelve people?
I head to bed exhausted, wondering about how many votes we’re going to get, and worried about tomorrow: are we going to make a decision?
It turns out that I need not have worried. In the winnowing from 11 choices down to 2, many members of the team are willing to say, “my tool can be done later” or “that one can be done better outside this project.” Approximately 100 people weighed in on the IdeaScale site, and those votes are helpful as we weigh each idea. Scott Kleinman leads us in a discussion about feasibility for implementation and commitment in the room and the choices begin to fall away. At the end, there are four, but after a few rounds of voting we’re down to two with equal votes that must be differentiated. After a little more discussion, Tom proposes a voting system that allows folks to weight their votes in terms of commitment and the Serendipity project wins out. The drafted idea description reads:
“A serendipitous discovery tool for researchers that takes information from your personal collection (such as a Zotero citation library or a CSV file) and delivers content (from online libraries or collections like DPLA or Europeana) similar to it, which can then be visualized and manipulated.”
We decide to keep our project a secret until our launch and we break for lunch before assigning teams. (Meanwhile, #owot hashtag follower Sherman Dorn decides to create an alternative list of ideas – One Week Better Tools – which provides some necessary laughs over the next couple of days).
After lunch, it’s time to break out responsibilities. Mia Ridge steps up, though, and suggests that we first establish a shared understanding of the tool. She sketches on one of the whiteboards the image which would guide our development over the next few days.
This was a takeaway moment for me. I frequently sketch out my projects, but I’m afraid the thinking often gets pushed out in favor of the doing when I’m running low on time. Mia’s suggestion that we take the time despite being against the clock probably saved us lots of hours and headaches later in the project. We needed to aim as a group, so our efforts would fire in the same direction. The tool really takes shape in this conversation, and some of the tasks are already starting to become really clear. (We are also still indulging our obsession with mustaches at this time, as you may notice.)
Tom leads the discussion of teams. He recommends three: a project management team, a design/dev team and an outreach team. The project managers should be selected first, and they can select the rest of the teams. The project management discussion is difficult; there’s an abundance of qualified people in the room. From my perspective, it makes sense to have the project managers be folks who can step in and pinch hit as things get hectic, but we also need our strongest technical folks on the dev team. In the end, Brian Croxall and I are selected to be the project management team.
We decide to ask the remaining team members where they would like to be and see where our numbers end up. The numbers turn out great: 7 for design/dev and 3 for outreach, with two design/dev team members slated to help with outreach needs as necessary.
The teams hit the ground running and begin prodding the components of the idea. The theme of the afternoon is determining the feasibility of this “serendipity engine” we’ve elected to build. Mia Ridge, leader of the design/dev team, runs a quick skills audit and gets down to the business of selecting programming languages, frameworks and strategies for the week. They choose to work in Python with the Django framework. Isotope, a jQuery plugin I use in my own development, is selected to drive the results page. A private GitHub repository is set up under a code name. (Beyond Isotope, HTML and CSS, I’m a little out of my element here, so for more technical details, please visit the public repository’s wiki.) The outreach team lead, Jack Dougherty, brainstorms with his team on overall outreach needs and high priority tasks. The Google document from yesterday becomes a Google Drive folder, with shells for press releases, a contact list for marketing and work plans for both teams.
This is the first point where I realize that I am going to have to adjust to a lack of hands on work. I do my best when I’m working a keyboard: making lists, solving problems with code, etc. As one of the project managers, my job is much less on the keyboard and much more about managing people and process.
When the teams come back together to report out, there’s a lot of getting each side up to speed, and afterwards our mentors advise us that the meetings have to be shorter. We’re already at the end of day 2, though both teams would be working into the night on their work plans and Brian and I still need to set the schedule for tomorrow.
We’re past the point where we can have a lot of discussion, except for maybe about the name.
Wednesday is tough. We have to come up with a name, and all that exploration from yesterday needs to be a prototype by the end of the day. We are still hammering out the language we use in talking to each other and there’s some middle ground to be found on terminology. One example is the use of the word “standup” in our schedule. “Standup” means something very specific to developers familiar with the Agile development process whereas I just mean, “short update meeting.” Our approach to dealing with these issues is to identify the confusion and quickly agree on language we all understand.
I spend most of the day with the outreach team. We have set a deadline for presenting names at lunchtime and are hoping the whole team can vote after lunch. This schedule turns out to be folly as the name takes most of the day and we have to adjust our meeting times accordingly. As project managers, Brian and I are canceling meetings (because folks are on a roll, we haven’t met a deadline, etc) whenever we can, but we have to balance this with keeping the whole team informed.
Camping out in a living room type space in RRCHNM, spread out among couches and looking at a Google Doc being edited on a big-screen TV, the outreach team and various interested parties spend most of the day brainstorming names. We take breaks to work on the process press release and other essential tasks, but the name is the thing for the moment. We need a name to start working on branding and logos. Product press releases need to be completed, the dev team needs a named target and of course, swag must be ordered.
It is in this process, however, that an Aha! moment occurs for me. We have been discussing names for a long time and folks are getting punchy. The dev team lead and our designer, Amy Papaelias, have joined the outreach team along with most of our CHNM mentors. I want to revisit something dev team member Eli Rose said earlier in the day. To paraphrase, Eli said that he liked the idea that the tool automated or mechanized the concept of surprise. So I repeat Eli’s concept to the group and it isn’t long after that that Mia says, “what about Serendip-o-matic?” The group awards the name with head nods and “I like that”s and after running it by developers and dealing with our reservations (eg, hyphens, really?), history is made.
As relieved as I am to finally have a name, the bigger takeaway for me here is in the role of the manager. I am not responsible for the inspiration for the name or the name itself, but instead repeating the concept to the right combination of people at a time when the team was stuck. The project managers can create an opportunity for the brilliant folks on the team to make connections. This thought serves as a consolation to me as I continue to struggle without concrete tasks.
Meanwhile, on the other side of the building, the rest of the dev team is pushing to finish code. We see a working prototype at the end of the day, and folks are feeling good, but it’s been a long day. So we go to dinner as a team, and leave the work behind for a couple of hours, though Amy is furiously sketching at various moments throughout the meal as she tries to develop a look and feel for this newly named thing.
On the way home from dinner, I think, “there’s only two days left.” All of a sudden it feels like we haven’t gotten anywhere.
The decision to add the Flickr API to our work in order to access the Flickr Commons is made with the dev team, based on the feeling that we have enough time and the images located there enhance our search results and expand our coverage of subject areas and geographic locations.
We also spend today addressing issues. The work of both teams overlaps in some key areas. In the afternoon, Brian and I realize that we have mishandled some of the communication regarding language on the front page and both teams are working on the text. We scramble to unify the approaches and make sure that efforts are not wasted.
This is another learning moment for me. I keep flashing on Sheila’s words from Monday, and worry that our concrete decision making process is suffering from “too many cooks in the kitchen.” Everyone on this team has a stake in the success of this project and we have lots of smart people with valid opinions. But everyone can’t vote on everything and we are spending too much time getting consensus now, with a mere twenty-four hours to go. As a project manager, part of my job is to start streamlining and making executive decisions, but I am struggling with how to do that.
As we prepare to leave the center at 6pm, things are feeling disconnected. This day has flown by. Both teams are overwhelmed by what has to get done before tomorrow and despite hard work throughout the day, we’re trying to get a dev server and production server up and running. As we regroup at the Inn, the dev team heads upstairs to a quiet space to work and eat and the outreach team sets up in the lobby.
Then, good news arrives. Rebecca Sutton-Koeser has managed to get both the dev and production servers up and the code is able to be deployed. (We are using Heroku and Amazon Web Services specifically, but again, please see the wiki for more technical details.)
The outreach team continues to work on documentation and release strategy, and Brian and I continue to step in where we can. Everyone is working until midnight or later, but feeling much better about our status than we did at 6pm.
The final tasks are upon us. Scott Williams moves on from his development responsibilities to facilitate user testing, which was forced to slide from Thursday due to our server problems. Amanda Visconti works to get the interactive results screen finalized. Ray Palin hones our list of press contacts and works with Amy to get the swag design in place. Amrys Williams collaborates with the outreach team and then Sheila to publish the product press release. Both the dev and outreach teams triage and fix and tweak and defer issues as we move towards our 1pm “code chill”, a point at which we’re hoping to have the code in a fairly stable state.
We are still making too many decisions with too many people, and I find myself weighing not only the options but how attached people are to either option. Several choices are made because they reflect the path of least resistance. The time to argue is through and I trust the team’s opinions even when I don’t agree.
We end up running a little behind and the code freeze scheduled for 2pm slides to 2:15. But at this point we know: we’re going live at 3:15pm.
Jack Dougherty has arranged a Google hangout with Dan Cohen of the Digital Public Library of America and Brett Bobley and Jen Serventi of the NEH Office of Digital Humanities, which the project managers co-host. We broadcast the conversation live via the One Week | One Tool website.
The code goes live and the broadcast starts but my jitters do not subside…until I hear my teammates cheering in the hangout. Serendip-o-matic is live.
At 8am on Day 6, Serendip-o-matic had its first pull request and later in the day, a fourth API – Trove of Australia – was integrated. As I drafted this blog post on Day 7, I received email after email generated by the active issue queue and the tweet stream at #owot is still being populated. On Day 9, the developers continue to fix issues and we are all thinking about long term strategy. We are brainstorming ways to share our experience and help other teams achieve similar results.
I found One Week | One Tool incredibly challenging and therefore a highly rewarding experience. My major challenge lay in shifting my mindset from that of someone hammering on a keyboard in a one-person shop to that of a project manager for a twelve-person team. I write for this blog because I like to build things and share how I built them, but I have never experienced the building from this angle before. The tight timeline ensured that we would not have time to go back and agonize over decisions, so it was a bit like living in a project management accelerator. We had to recognize issues, fix them and move on quickly, so as not to derail the project.
However, even in those times when I became acutely more aware of the clock, I never doubted that we would make it. The entire team is so talented; I never lost my faith that a product would emerge. And, it’s an application that I will use, for inspiration and for making fortunate discoveries.
(More on One Week | One Tool, including other blog entries, can be found by visiting the One Week | One Tool Zotero Group.)
Previously, we discussed the benefits of installing a local AMP stack (Apache, MySQL & PHP) for the purposes of development and testing, and walked through installing a stack in the Mac environment. In this post, we will turn our attention to Windows. (If you have not read Local Dev Environments for Newbies Part 1, and you are new to the AMP stack, you might want to go read the Introduction and Tips sections before continuing with this tutorial.)
Much like with the Mac stack, there are Windows stack installers that will do all of this for you. For example, if you are looking to develop for Drupal, there’s an install package called Acquia that comes with a stack installer. There’s also WAMPserver and XAMPP. If you opt to go this route, you should do some research and decide which option is the best for you. This article contains reviews of many of the main players, though it is a year old.
However, we are going to walk through each component manually so that we can see how it all works together.
So, let’s get going with Recipe 2 – Install the AMP Stack on Windows 7.
Notepad and Wordpad come with most Windows systems, but you may want to install a more robust code editor to edit configuration files and eventually, your code. I prefer Notepad++, which is open source and provides much of the basic functionality needed in a code editor. The examples here will reference Notepad++ but feel free to use whichever code editor works for you.
For our purposes, we are not going to allow traffic from outside the machine to access our test server. If you need this functionality, you will need to open a port in your firewall on port 80. Be very careful with this option.
As a prerequisite to installing Apache, we need to install the Visual C++ 2010 SP1 Redistributable Package x86. As a prerequisite to installing PHP, we need to install the Visual C++ 2008 SP1 Redistributable Package x86.
I create a directory called opt\local in my C drive to house all of the stack pieces. I do this because it’s easier to find things on the command line when I need to and I like keeping development environment applications separate from Program Files. I also create a directory called sites to house my web files.
The last two prerequisites are more like common gotchas. The first is that while you are manipulating configuration and initialization files throughout this process, you may find the Windows default view settings are getting in your way. If this is the case, you can change it by going to Organize > Folder and search options > View tab.
This will bring up a dialog which allows you to set preferences for the folder you are currently viewing. You can select the option to “show hidden files” and uncheck the “hide file extensions” option, both of which make developing easier.
The other thing to know is that in our example, we will work with a Windows 7 installation – a 64-bit operating system. However, when we get to PHP, you’ll notice that their website does not provide a 64-bit installer. I have seen errors in the past when a 32-bit PHP installer and a 64-bit Apache version were both used, so we will install the 32-bit versions for both components.
Ok, I think we’re all set. Let’s install Apache.
We want to download the .zip file for the latest version. For Windows binaries, I use apachelounge, which builds Windows installer files. For this example we’ll download httpd-2.4.4-win32.zip to the Desktop of our Windows machine.
Next, we want to extract the files into our chosen location for the Apache directory, e.g. c:\opt\local\Apache24. You can accomplish this in a variety of ways, but if you have WinZip, you can follow these steps:
- Copy the .zip folder to c:\opt\local
- Right-click and select “Extract all files”.
- Open the extracted folder, right-click on the Apache24 folder and select Cut.
- Go back up one directory and right-click to Paste the Apache24 folder, so that it now resides inside c:\opt\local.
This extraction “installs” Apache; there is no installer to run, but we will need to configure a few things.
We want to open httpd.conf: this file contains all of the configuration settings for our web server. If you followed the directions above, you can find the file in C:\opt\local\Apache24\conf\httpd.conf – we want to open it with our code editor and make the following changes:
1. Find this line (in my copy, it’s line 37):
Change it to match the directory where you installed Apache. In my case, it reads:
You might notice that our slashes slant in the opposite direction from the usual Windows syntax. In Windows, backslash ( \ ) delineates different directories, but in Unix, it’s forward slash ( / ). Apache reads the configuration file in the Unix manner, even though we are working in Windows. If you get a “directory not found” error at any point, check your slashes.
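If you want to double-check a path before pasting it into httpd.conf, the conversion can be sketched in a few lines of Python. This is only an illustration and not part of the Apache setup. The function name to_apache_path is my own, and the example path assumes the c:\opt\local\Apache24 install directory used in this tutorial.

```python
from pathlib import PureWindowsPath


def to_apache_path(windows_path: str) -> str:
    """Return a Windows path rewritten with forward slashes for httpd.conf."""
    # PureWindowsPath parses backslash-separated paths on any operating system;
    # as_posix() re-joins the parts with forward slashes.
    return PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    # Assumed install directory from this tutorial.
    print(to_apache_path(r"c:\opt\local\Apache24"))
```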
2. At Line 58, we are going to change the listen command to just listen to our machine. Change
3. There are about 100 lines (roughly lines 72-172) that all start with LoadModule. Some of these are comments (they begin with a “#”). Later on, you may need to uncomment some of these for a certain web program to work, like SSL. For now, though, we’ll leave these as is.
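If you are curious which modules are currently switched on, a short Python sketch can sort the LoadModule lines into enabled and commented-out groups. The function name list_modules and the two sample lines are illustrative only. In practice you would pass in the text of your own httpd.conf.

```python
import re

# Matches lines such as "LoadModule ssl_module modules/mod_ssl.so",
# with an optional leading "#" when the module is commented out.
LOADMODULE = re.compile(r"^\s*(#?)\s*LoadModule\s+(\S+)\s+(\S+)")


def list_modules(conf_text: str) -> dict:
    """Split LoadModule lines into enabled and commented-out module names."""
    modules = {"enabled": [], "disabled": []}
    for line in conf_text.splitlines():
        match = LOADMODULE.match(line)
        if match:
            key = "disabled" if match.group(1) else "enabled"
            modules[key].append(match.group(2))
    return modules


if __name__ == "__main__":
    # Two sample lines standing in for the real configuration file.
    sample = (
        "LoadModule dir_module modules/mod_dir.so\n"
        "#LoadModule ssl_module modules/mod_ssl.so\n"
    )
    print(list_modules(sample))
```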
4. Next, we want to change our Document Root and the directory directive to the directory which has the web files. These lines (beginning on line 237 in my copy) read:
Later, we’ll want to change this to our “sites” folder we created earlier. For now, we’re just going to change this to the Apache installation directory for testing. So, it should read:
Save the httpd.conf file. (In two of our test cases, after saving the file, closing and re-opening, the file appeared unchanged. If you are having issues, try doing Save As and save the file to your desktop, then drag it into c:\opt\local\Apache24).
Next, we want to test our Apache configuration. To do this, we open the command line. In Windows, you can do this by going to the Start Menu, and typing
in the Search box. Then, press Enter. Once you’re in the command prompt, type in
(Note that the first part of this path is the install directory I used above. If you chose a different directory to install Apache, use that instead.) Next, we start the web server with a “-t” flag to test it. Type in:
If you get a Syntax OK, you’re golden.
Otherwise, try to resolve any errors based on the error message. If the error message does not make any sense after checking your code for typos, go back and make sure that your changes to httpd.conf did actually save.
Once you get Syntax OK, type in:
This will start the web server. You should not get a message regarding the firewall if you changed the listen command to localhost:80. But, if you do, decide what traffic you want to allow to your machine. I would click “Cancel” instead of “Allow Access”, because I don’t want to allow outside access.
Now the server is running. You’ll notice that you no longer have a C:\> prompt in the Command window. To test our server, we open a browser and type in http://localhost – you should get a website with text that reads “It works!”
Instead of starting up the server this way every time, we want to install it as a Windows service. So, let’s go back to our command prompt and press Ctrl+C to stop the web server. You should now have a prompt again.
To install Apache as a service, type:
httpd.exe -k install
You will most likely get an error that looks like this:
We need to run our command prompt as an administrator. So, let’s close the cmd.exe window and go back to our Start menu. Go to Start > All Programs > Accessories and right-click on Command Prompt. Select “Run As Administrator”.
(Note: If for some reason you do not have the ability to right-click, there’s a “How-To Geek” post with a great tip. Go to the Start menu and in the Run box, type in cmd.exe as we did before, but instead of hitting Enter, hit Ctrl+Shift+Enter. This does the same thing as the right-click step above.)
Click on Yes at the prompt that comes up, allowing the program to make changes. You’ll notice that instead of starting in our user directory, we are starting in Windows\system32. So, let’s go back to our bin directory with:
Now, we can run our
httpd.exe -k install
command again, and it should succeed. To start the service, we want to open our Services Dialog, located in the Control Panel (Start Menu > Control Panel) in the Administrative Tools section. If you display your Control Panel by category (the default), you click on System & Security, then Administrative Tools. If you display your control panel by small icon, Administrative Tools should be listed.
Double click on Services.
Find Apache2.4 in the list and select it. Verify that the Startup Type is set to Automatic if you want the Service to start automatically (if you would prefer that the Service only start at certain times, change this to Manual, but remember that you have to come back in here to start it). With Apache2.4 selected, click on Start Service in the left hand column.
Go back to the browser and hit Refresh to verify that everything is still working. It should still say “It works!” And with that affirmation, let’s move to PHP.
(Before installing PHP, make sure you have installed the Visual C++ 2008 Redistributable Package from the prerequisite section.)
For our purposes, we want to use the Thread Safe .zip from the PHP Downloads page. Because we are running PHP under Apache, but not as a CGI, we use the thread safe version. (For more on thread safe vs. non-thread safe, see this Wikipedia entry or this stackoverflow post)
Once you’ve downloaded the .zip file, extract it to your \opt\local directory. Then, rename the folder to simply “php”. As with Apache24, extracting the files does the “install”; we just need to configure everything to run properly. Go to the directory where you installed PHP (in my case, c:\opt\local\php) and find php.ini-development.
Make a copy of the file and rename the copy php.ini (this is one of those places where you may want to set the Folder and search options if you’re having problems).
Open the file in Notepad++ (or your code editor of choice). Note that here, comments are preceded by a “;” (without quotes) and the directories are delineated using the standard Windows format, with a “\”. Most of the document is commented out, and includes a large section on recommended settings for production and development, so if you’re not sure of the changes to make you can check in the file (in addition to the PHP documentation). For this tutorial, we want to make the following changes:
1. On line 708, uncomment (remove semi-colon) include_path under “Windows” and make sure it matches the directory where you installed PHP (if the line numbers have changed, just search for Paths and Directories).
3. Beginning on Line 868, in the Windows Extensions section, uncomment (remove the semi-colon) from the following lines (they are not right next to each other, they’re in a longer list, but we want these three uncommented):
extension=php_mysql.dll
extension=php_mysqli.dll
extension=php_pdo_mysql.dll
Save the php.ini file.
You may want to double-check that the .dll files we enabled above are actually in the c:\opt\local\php\ext folder before trying to run php, because you will see an error if they are not there.
Next, we want to add the php directory to our path environment variables. This section is a little tricky; be *extremely* careful when you are making changes to system settings like this.
First, we navigate to the Environment variables by opening the Control Panel and going to System & Security > System > Advanced System Settings > Environment Variables.
In the bottom scroll box, scroll until you find “Path”, click on it, then click on Edit.
Append the following to the end of the Variable Value list (the semi-colon ends the previous item, then we add our installation path).
;c:\opt\local\php
Click OK and continue to do so until you are out of the dialog.
Lastly, we need to add some lines to the httpd.conf so that Apache will play nice with PHP. The httpd.conf file may still be open in your text editor. If not, go back to c:\opt\local\Apache24\conf and open it. At the bottom of this file, we need to add the following:
LoadModule php5_module "c:/opt/local/php/php5apache2_4.dll"
AddHandler application/x-httpd-php .php
PHPIniDir "c:/opt/local/php"
This tells Apache where to find php and loads the module needed to work with PHP. (Note: php5apache2_4.dll must be installed in the directory you specified above in the LoadModule statement. It should have been extracted with the other files, but to download the file if it is not there, you can go to the apachelounge additional downloads page.)
While we’re in this file, we also want to tell Apache to look for an index.php file. We’ll need this for testing, but also for some content management systems. To do this, we change the DirectoryIndex directive on line 271. It should look like
<IfModule dir_module>
    DirectoryIndex index.html
</IfModule>
We want to change the DirectoryIndex line so it reads
DirectoryIndex index.php index.html
Before we restart Apache to pick up these changes, we’re going to do one last thing. To test our php, we want to create a file called index.php with the following text inside:
<?php phpinfo(); ?>
Save it to c:\opt\local\Apache24\htdocs
Restart Apache by going back to the Services dialog. (If you closed it, it’s Control Panel > System & Security > Administrative Tools > Services). Click on Apache2.4 and then click on Restart.
If you get an error, you can always go back to the command line, navigate to c:\opt\local\Apache24\bin and run httpd.exe -t again. This will check your syntax, which is most likely to be the problem. (This page is also helpful in troubleshooting PHP 5.4 and Apache if you are having issues.)
Open a browser window and type in http://localhost – instead of “It Works!” you should see a list of configuration settings for PHP. (In one of our test cases, the tester needed to close Internet Explorer and re-open it for this part to work.)
Now, we move to the database.
To install MySQL, we can follow the directions at the MySQL site. For the purposes of this tutorial, we’re going to use the most recent version as of this writing, which is 5.6.11. To download the files we need, we go to the Community Server download page.
Again, we can absolutely use the installer here, which is the first option. The MySQL installers will prompt you through the setup, and this video does a great job of walking through the process.
But, since the goal of this tutorial is to see all the parts, I’m going to run through the setup manually. First, we download the .zip archive. Choose the .zip file which matches your operating system; I will choose 64-bit (there’s no agreement issue here). Extract the files to c:\opt\local\mysql. We do this in the same way we did the Apache24 files above.
Since we’re installing to our opt\local drive, we need to tell MySQL to look there for the program files and the data. We do this by setting up an option file. We can modify a file provided for us called my-default.ini. Change the name to my.ini and open it with your code editor.
In the MySQL config files, we use the Unix directory separator “/” again, and the comments are again preceded by a “#”. So, to set our locations, we want to remove the # from the beginning of the basedir and datadir lines, and change them to our installation directory:
basedir = c:/opt/local/mysql
datadir = c:/opt/local/mysql/data
Then save my.ini.
As with Apache, we’re going to start MySQL for the first time from the command line, to make sure everything is working ok. If you still have it open, navigate back there. If not, remember to select the Run As Administrator option.
From your command prompt, type in
cd \opt\local\mysql\bin
mysqld --console
You should see a bunch of statements scroll by as the first database is created. You may also get a firewall popup. I hit Cancel here, so as not to allow access from outside my computer to the MySQL databases.
Press Ctrl+C to stop the server. Now, let’s install MySQL as a service. To do that, we type the command:
mysqld --install
Next, we want to start the MySQL service, so we need to go back to Services. You may have to Refresh the list in order to see the MySQL service. You can do this by going to Action > Refresh in the menu.
Then, we start the service by clicking on MySQL and clicking Start Service on the left hand side.
One thing about installing MySQL in this manner is that the initial root user for the database will not have a password. To see this, go back to your command line. Type in
mysql -u root
This will open the command line MySQL client and allow you to run queries. The -u flag sets the user, in this case, root. Notice you are not prompted for a password. Type in:
select user, host, password from mysql.user;
This command should show all the created user accounts, the hosts from which they can log in, and their passwords. The semi-colon at the end is crucial – it signifies the end of a SQL command.
Notice in the output that the password column is blank. MySQL provides documentation on how to fix this on the Securing the Initial Accounts documentation page, but we’ll also step through it here. We want to use the SET PASSWORD command to set the password for all of the root accounts.
Substituting the password you want for newpwd (keep the single quotes in the command), type in
SET PASSWORD FOR 'root'@'localhost' = PASSWORD('newpwd');
SET PASSWORD FOR 'root'@'127.0.0.1' = PASSWORD('newpwd');
SET PASSWORD FOR 'root'@'::1' = PASSWORD('newpwd');
You should get a confirmation after every command. Now, if you run the select user command from above, you’ll see that there are values in the password field, equivalent to encrypted versions of what you specified.
A note about security: I am not a security expert and for a development stack we are usually less concerned with security. But it is generally not a good idea to type in plain text passwords in the command line, because if the commands are being logged you’ve just saved your password in a plain text file that someone can access. In this case, we have not turned on any logging, and the SET PASSWORD should not store the password in plain text. But, this is something to keep in mind.
As before with Mac OS X, we could stop here. But then you would have to administer the MySQL databases using the command line. So we’ll install phpMyAdmin to make it a little easier and test to see how our web server works with our sites folder.
Download the phpmyadmin.zip file from the phpmyadmin page to the sites folder we created all the way at the beginning. Note that this does *not* go into the opt folder.
Extract the files to a folder called phpmyadmin using the same methods we’ve used previously.
Since we now want to use our sites folder instead of the default htdocs folder, we will need to change the DocumentRoot and Directory directives on lines 237 and 238 of our Apache config file. So, open httpd.conf again.
We want to change the DocumentRoot to sites, and we’re going to set up the phpMyAdmin directory.
Save the httpd.conf file. Go back to Services and Restart the Apache2.4 service.
We will complete the configuration through the browser. First, open the browser and try to navigate to http://localhost again. You should get a 403 error.
Instead, navigate to http://localhost/phpmyadmin/setup
Click on the New Server button to set up a connection to our MySQL databases. Double check that under the Basic Settings tab, the Server Name is set to localhost, and then click on Authentication. Verify that the type is “cookie”.
At the bottom of the page, click on Save. Now, change the address in the browser to http://localhost/phpmyadmin and log in with the root user, using the password you set above.
And that’s it. Your Windows AMP stack should be ready to go.
In the next post, we’ll talk about how to install a content management system like WordPress or Drupal on top of the base stack. Questions, comments or other recipes you would like to see? Let us know in the comments.
There are many cases where having a local development environment is helpful and it is a relatively straightforward thing to do, even if you are new to development. However, the blessing and the curse is that there are many, many tutorials out there attempting to show you how. This series of posts will aim to walk through some basic steps with detail, as well as pass on some tips and tricks for setting up your own local dev box.
First, what do I mean by a local development environment? This is a setup on your computer which allows you to code and tweak and test in a safe environment. It’s a great way to hammer on a new application with relatively low stakes. I am currently installing dev environments for two purposes: to test some data model changes I want to make on an existing Drupal site and to learn a new language so I can contribute to an application. For the purposes of this series, we’re going to focus on the AMP stack – Apache, MySQL and PHP – and how to install and configure those systems for use in web application development.
Apache is the web server which will serve the pages of your website or application to a browser. You may hear Apache in conjunction with lots of other things – Apache Tomcat, Apache Solr – but generally when someone references just Apache, it’s the web server. The full name of the project is the Apache HTTP Server Project.
PHP is a scripting language widely used in web development. MySQL is a database application also frequently used in web development. “Stack” refers to the combination of several components needed to run a web application. The AMP stack is the base for many web applications and content management systems, including Drupal and WordPress.
You may have also seen the AMP acronym preceded by an L, M or W. This merely stands for the operating system of choice – Linux, Mac or Windows. This can also refer to installer packages that purport to do the whole installation for you, like WAMP or MAMP. Employing the installer packages can be useful, depending on your situation and operating system. The XAMPP stack, distributed by Apache Friends, is another example of an installer package designed to set up the whole stack for you. For this tutorial though, we’ll step through each element of the stack, instead of using a stack installer.
So, why do it yourself if there are installers? To me, it takes out the mystery of how all the pieces play together and is a good way to learn about what’s going on behind the scenes. When working on Windows, I will occasionally use a .msi installer for an individual component to make sure I don’t miss something. But installing and configuring each component individually is actually helpful.
Before we begin, let’s look at some tips:
- You will need administrative rights to the computer on which you’re installing.
- Don’t be afraid of the command line. There are lots of tutorials around the web on how to use the basic commands – for both Mac (based on UNIX) and Windows. But, you don’t need to be an expert to set up a dev environment. Most tutorials give the exact commands you need.
- Try, if possible, to block off a chunk of time to do this. Going through all the steps may take awhile, from an hour to an afternoon, especially if you hit a snag. Several times during my own process, I had to step away from it because of a crisis or because it was the end of the day. When I was able to come back later, I had some trouble remembering where I left off or the configuration options I had chosen. If you do have to walk away, write down the last thing you did.
- When you’re looking for a tutorial, Google away. Search for the elements of your stack plus your OS, like “Apache MySQL PHP Mac OSX”. You’ll find lots, and probably end up referencing more than one. Use your librarian skills: is the tutorial recent? Does it appear to be from a reputable source? If it’s a blog, are there comments on the accuracy of the tutorial? Does it agree with the others you’ve seen?
- Once you’ve selected one or two to follow, read through the whole tutorial one time without doing anything. Full disclosure: I never do this and it always bites me.
Let’s get going with Recipe 1 – Install the AMP Stack on Mac OS X
Install the XCode Developer Tools
First, we install the developer tools for XCode. If you have Mac 10.7 and above, you can download the XCode application from the App Store. To enable the developer tools, open XCode, go to the XCode menu > Preferences > Downloads tab, and then click on “Install” next to the developer tools. This tutorial on installing Ruby by Moncef Belyamani has good screenshots of the XCode process.
If you have Snow Leopard (10.6) or below, you’ll need to track down the tools on the Apple Developer Downloads Page. You will need to register as a developer, but it’s free. Note: you can get pretty far in this process without using the XCode command line tools, but down the road as you build more complicated stacks, you’ll want to have them.
Configure Apache and PHP
Next we need to configure Apache and PHP. Note that I said “configure”, not “install”. Apache and PHP both come with OS X; we just need to configure them to work together.
Here’s where we open the Terminal to access the command line by going to Applications > Utilities > Terminal.
Once Terminal is open, a prompt appears where you can type in commands. The “~” character indicates that you are at the “home” directory for your user. This is where you’ll do a lot of your work. The “$” character delineates the end of the prompt and the beginning of your command.
Type in the following command:
cd /etc/apache2
“cd” stands for “change directory”. This is the equivalent of double-clicking on etc, then apache2, if you were in the Finder (but etc is a hidden folder in the Finder). From here, we want to open the necessary file in an editor. Enter the following command:
sudo nano httpd.conf
“sudo” elevates your permission to administrator, so that you can edit the config file for Apache, which is httpd.conf. You will need to type in your administrator password. The “nano” command opens a text editor in the Terminal window. (If you’re familiar with vi or emacs, you can use those instead.)
The bottom of your window will show the available commands. The “^” stands for the Control key. To search for the part we want to change, press Control + W, type php and press Enter. We are looking for this line:
#LoadModule php5_module libexec/apache2/libphp5.so
The “#” at the beginning of this line is a comment, so Apache ignores the line. We want Apache to see the line, and load the php module. So, change the text by removing the #:
LoadModule php5_module libexec/apache2/libphp5.so
Save the file by pressing Control + O (nano calls this “WriteOut”) and press Enter next to the file name. The number of lines written displays at the bottom of the window. Press Control + X to exit nano.
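If you would rather not make this change by hand in nano, the same edit can be sketched with sed. This is only an illustration under a few assumptions: it works on a scratch copy in a temporary directory (the two-line file is a stand-in, not the real /etc/apache2/httpd.conf), and the function name uncomment_php_module is made up for this example.

```shell
#!/bin/sh
# Sketch: uncomment the php5_module line in a scratch copy of httpd.conf.
# uncomment_php_module is a hypothetical helper, not part of Apache or OS X.
uncomment_php_module() {
  conf="$1"
  # Strip a leading "#" only from the LoadModule php5_module line.
  sed 's/^#\(LoadModule php5_module \)/\1/' "$conf" > "$conf.new" &&
    mv "$conf.new" "$conf"
}

# A two-line stand-in for the real config file.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/httpd-sketch.XXXXXX")
cat > "$scratch/httpd.conf" <<'EOF'
#LoadModule perl_module libexec/apache2/mod_perl.so
#LoadModule php5_module libexec/apache2/libphp5.so
EOF

uncomment_php_module "$scratch/httpd.conf"

# Show the result: only the php5_module line should have lost its "#".
cat "$scratch/httpd.conf"
```

To try this on the real file you would point the function at a copy of httpd.conf first, compare the result, and only then replace the original using sudo.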
Next, we need to start the Apache server. Type in the following command:
sudo apachectl start
Apache, as mentioned before, serves web files from a location we designate. By default, this is /Library/WebServer/Documents. If you have Snow Leopard (10.6) or below, Apache also automatically looks to username/sites, which is a convenient place to store and work with files. If you have OS 10.7 or above, creating the Sites folder takes a few steps. On 10.7, go to System Preferences > Sharing and click on Web Sharing. If there’s a button that says “Create Personal Web folder”, the folder has not been created yet, so go ahead and click that button. If it says, “Open Personal Website folder”, you’re good to go.
On 10.8, the process is a little more involved. First, go to the Finder, click on your user name and create your sites folder.
Next, we need to open the command line again and create a .conf file for that directory, so that Apache knows where to find it. Type in these commands:
cd /etc/apache2/users
ls
The ls at the end will list the directory contents. If you see a file that’s yourusername.conf (ie, mfrazer.conf) in this directory, you’re good to go. If you don’t, it’s easy to create one. Type the following command:
sudo nano yourusername.conf
So, mine would be sudo nano mfrazer.conf. This will create the file and take you into a text editor. Copy and paste the following, making sure to change YOURUSERNAME to your user name.
<Directory "/Users/YOURUSERNAME/Sites/">
Options Indexes MultiViews
AllowOverride None
Deny from all
Allow from localhost
</Directory>
The first directive, Options, can have lots of different…well, options. The ones we have here are Indexes and MultiViews. Indexes means that if a browser requests a directory and there’s no index.html or index.php file, it will serve a directory listing. MultiViews means that browsers can request the content in a different format if it exists in the directory (e.g., in a different language). AllowOverride determines if an .htaccess file elsewhere can override the configuration settings. For now, None will indicate that no part can be overridden. For Drupal or other content management systems, it’s possible we’ll want to change these directives, but we’ll cover that later.
The last two lines indicate that traffic can only reach this directory from the local machine, by typing http://localhost/~username in the browser. For more on Apache security, see the Apache documentation. If you would like to set it so that other computers on your network can also access this directory, change those last two lines to:
Order allow,deny
Allow from all
Either way, press Control + O to save the file and Control + X to exit. Restart Apache for the changes to take effect using this command:
sudo apachectl restart
You may also be prompted at some point by OS X to accept incoming network connections for httpd (Apache); I would deny these as I only want access to my directory from my machine, but it’s up to you depending on your setup.
We’ll test this setup with php in the next step.
If you want to check php, you can create a new text document using your favorite text editor. Type in:
<?php phpinfo(); ?>
Save the file as phpinfo.php in your username/sites directory (so for me, this is mfrazer > Sites).
Then, point your browser to http://localhost/~yourUserName/phpinfo.php. You should see a page of information regarding PHP and the web server.
Now, let’s install MySQL. There are two ways to do this. We could go to the MySQL downloads page and use the installers. The fantastic tutorials at Coolest Guy on the Planet both recommend this, and it’s a fine way to go.
But we can also use Homebrew, mentioned previously on this blog, which is a really convenient way to do things as long as we’re already using the command line.
First, we need to install homebrew. Enter this at the command prompt:
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
Next, type in
brew doctor
If you receive the message “Your system is raring to brew.”, you’re ready to go. If you get warnings, don’t lose heart. Most of them tell you exactly what you need to do to move forward. Correct the errors and type in brew doctor again until you’re raring to go. Then, type in the following command:
brew install mysql
That one’s pretty self-explanatory, no? Homebrew will download and install MySQL, as of this writing version 5.6.10, but pay attention to the download to see the version – it’s in the URL. After the installation succeeds, Homebrew will give some instructions on finishing the setup, including the commands we discuss below.
I’m going to pause for a second here and talk a little about permissions and directories. If you get a “permission denied” error, try running the command again using “sudo” at the beginning. Remember, this elevates your permission to the administrator level. Also, if you get a “directory does not exist” error, you can easily create the directory using “mkdir”. Before we move on, let’s try to check for a directory you’re going to need coming up. Enter:
cd /usr/local/var
If you are successfully able to change to that directory, great. If not, type in
sudo mkdir /usr/local/var
to create it. Then, let’s go back to our home directory by typing in
cd ~
Now, let’s continue with our procedure. First, we want to set up the databases to run with our user account. So, we type in the following two commands:
unset TMPDIR
mysql_install_db --verbose --user=`whoami` --basedir="$(brew --prefix mysql)" --datadir=/usr/local/var/mysql --tmpdir=/tmp
The second command here installs the system databases. The whoami wrapped in backticks is replaced automatically with your user name, so the above command should work verbatim. But it also works to type your user name directly, with no quotes or backticks (e.g., --user=mfrazer).
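The backticks matter here, and they are easy to mistake for ordinary quotes. As a small sketch of what the shell does with them, the lines below use echo mfrazer as a stand-in for whoami so that the result is predictable; the variable names are made up for the example.

```shell
#!/bin/sh
# Sketch: backticks and $(...) are two spellings of command substitution.
# The shell runs the inner command first and pastes its output into the line.
# "echo mfrazer" stands in for whoami so the output is predictable here.
with_backticks="--user=`echo mfrazer`"
with_dollar_paren="--user=$(echo mfrazer)"

echo "$with_backticks"
echo "$with_dollar_paren"

# Single quotes, by contrast, switch substitution off: the text is kept literally.
echo '--user=`echo mfrazer`'
```

The first two lines print the flag with the name filled in, while the single-quoted line prints the backticks unchanged. That is why retyping the command with straight or curly quotes around whoami will not work.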
Next, we want to run the “secure installation” script. This helps you set root passwords without leaving the password in plain text in your editor. First we start the mysql server, then we run the installation scripts and follow the prompts to set your root password, etc:
mysql.server start
sudo /usr/local/Cellar/mysql/5.6.10/bin/mysql_secure_installation
After the script is complete, stop the mysql server.
Next, we want to set up MySQL so it starts at login. For that, we run the following two commands:
ln -sfv /usr/local/opt/mysql/*.plist ~/Library/LaunchAgents
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.mysql.plist
The ln command, in this case, places a symbolic link to any .plist files in the mysql directory into the LaunchAgents directory. Then, we load the plist using launchctl to start the server.
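To see what the ln flags do without touching your real LaunchAgents folder, here is a minimal sketch. It assumes two scratch directories as stand-ins for /usr/local/opt/mysql and ~/Library/LaunchAgents, and the .plist file is an empty placeholder.

```shell
#!/bin/sh
# Sketch: link every .plist in one directory into another, as ln -sfv does above.
# Both directories are scratch stand-ins; the plist is an empty placeholder.
src=$(mktemp -d "${TMPDIR:-/tmp}/mysql-sketch.XXXXXX")
agents=$(mktemp -d "${TMPDIR:-/tmp}/agents-sketch.XXXXXX")
touch "$src/homebrew.mxcl.mysql.plist"

# -s makes a symbolic link, -f replaces an existing link of the same name,
# -v prints each link as it is made.
ln -sfv "$src"/*.plist "$agents"

# The entry in $agents is a pointer back to the original, not a copy.
ls -l "$agents"
```

Because the link is only a pointer, the file that launchctl loads is still the one that lives in the MySQL directory.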
One last thing – we need to create one more link to the mysql.sock file.
cd /var/mysql/
sudo ln -s /tmp/mysql.sock
This creates a link to the mysql.sock file, which MySQL uses to communicate, but which resides by default in a tmp directory. The first command places us in the directory where we want the link (remember, if it doesn’t exist, you can use “sudo mkdir /var/mysql/” to create it) and the second creates the link.
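The same two steps, plus the "create the directory if it is missing" check, can be tried out safely in a scratch area. In this sketch, scratch/var/mysql stands in for /var/mysql and scratch/tmp for /tmp, so no sudo is needed, and mysql.sock is an empty placeholder file rather than a real socket.

```shell
#!/bin/sh
# Sketch: make sure a directory exists, then link a file into it.
# The scratch paths are stand-ins for /var/mysql and /tmp.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/sock-sketch.XXXXXX")
mkdir "$scratch/tmp"
touch "$scratch/tmp/mysql.sock"

# mkdir -p creates the directory (and any missing parents) and does nothing
# if it is already there, so it is safe to run either way.
mkdir -p "$scratch/var/mysql"

cd "$scratch/var/mysql" || exit 1
# With only a source given, ln -s names the link after the source file and
# places it in the current directory.
ln -s "$scratch/tmp/mysql.sock"
ls -l
```

This is why the cd comes first in the commands above: the link ends up wherever you are standing when you run ln.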
MySQL is ready to go! And, so is your AMP stack.
But wait, there’s more…
One optional tool to install is phpMyAdmin. This tool allows you to interact with your database through your browser so you don’t have to continue to use the command line. I also think it’s a good way to test if everything is working correctly.
First, let’s download the necessary files from the phpMyAdmin website. These will have a .tar.gz extension. Place the file in your Sites directory, and double-click to unzip the file.
Rename the folder to remove the version number and everything after it. I’m going to place the next steps below, but the Coolest Guy on the Planet tutorial referenced earlier does a good job of this step for OS 10.8 (just scroll down to phpMyAdmin) if you need screenshots.
Go to the command line and navigate to your phpMyAdmin directory. Make a directory called config and change the permissions so that the installer can access the file. This should look something like:
cd ~/Sites/phpMyAdmin
mkdir config
chmod o+w config
Let’s take a look at that last command: chmod changes the permissions on a file or directory. The o+w adds write permission for “other” users, meaning anyone who is neither the directory’s owner nor a member of its group.
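If you would like to watch the permission change happen, the sketch below does it in a scratch directory (a stand-in for your phpMyAdmin folder) and prints the mode string before and after.

```shell
#!/bin/sh
# Sketch: see what chmod o+w changes, using a scratch directory.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/chmod-sketch.XXXXXX")
cd "$scratch" || exit 1
mkdir config
chmod o-w config              # start from a known state: no write for "other"

ls -ld config | cut -c1-10    # mode string before the change
chmod o+w config
ls -ld config | cut -c1-10    # the ninth character is now "w"
```

The mode string reads as one type character followed by three groups of three (owner, group, other), so the "w" that appears near the end is the write permission for everyone else.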
Now, in your browser, go to http://localhost/~username/phpmyadmin/setup and follow these steps:
- Click on New Server (button on bottom)
- Click on Authentication tab, and enter the root password in the password field.
- Click on Save.
- Click on Save again on the main page.
Once the setup is finished, go to the Finder and move the config.inc.php file from the config directory into the main phpmyadmin directory and delete the config directory.
Now, go to http://localhost/~username/phpmyadmin in your browser and log in with the root account.
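The earlier Finder step (moving config.inc.php up a level and deleting the config folder) can also be done from the Terminal. A minimal sketch, assuming a scratch directory in place of your phpmyadmin folder and a placeholder file in place of the real generated configuration:

```shell
#!/bin/sh
# Sketch: move the generated config file up one level, then remove the folder.
# The scratch directory stands in for the phpmyadmin folder in Sites, and the
# file content is a placeholder, not a real phpMyAdmin configuration.
pma=$(mktemp -d "${TMPDIR:-/tmp}/pma-sketch.XXXXXX")
mkdir "$pma/config"
echo '<?php /* placeholder */' > "$pma/config/config.inc.php"

mv "$pma/config/config.inc.php" "$pma/"
rmdir "$pma/config"   # rmdir only succeeds if the folder is empty

ls "$pma"
```

Using rmdir rather than a recursive delete is a small safety net: if anything unexpected is still in the config folder, the command refuses instead of removing it.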
You are ready to go! In future parts of this series, we’ll look at building the AMP stack on Windows and adding Drupal or WordPress on top of the stack. We will also look at maintaining your environment, as the AMP stack components will need updating occasionally. Any other recipes you’d like to see? Do you have questions? Let us know in the comments.
The following tutorials and pages were incredibly useful in writing this post. While none of these tutorials are exactly the same as what we did here, they all contain useful pieces and may be helpful if you want to skip the explanation and just get to the commands:
- Coolest Guy on the Planet AMP Tutorials (10.8, 10.7/10.6) by Neil Gee
- Using Homebrew to Support Drupal on OSX by Ron Golan
- Uninstall/Re-install MySQL from stackexchange.com
Everyone occasionally dives right into a problem without researching (gasp!) the best solution. For me, this once meant manually renaming hundreds of files and moving them into individual folders in preparation for upload to a digital repository. Then finally a colleague said to me, and rightly so, “Are you crazy? There’s scripts to do that for you.”
In my last post, I discussed file naming conventions and the best methods to ensure future access and use for files. However, as librarians and archivists, we don’t always create the files we manage. Donors bring hard drives and students bring USB drives and files get migrated…etc, etc. Renaming existing files to bring them in line with our standards is often a daunting prospect, but there are lots of methods available to save time and sanity.
In this post, I’ll review a few easy methods for batch renaming files:
- Automator for Mac OS X – a built-in tool which aids in building scripts for this type of task
- Column Editor for Notepad++ in Windows – edit a list of rename commands, then run in a batch
- Batch File with a Loop in Windows – create a batch file which loops through the files and executes rename commands
The first two methods do not require any knowledge of coding; the last is slightly more advanced. There are some caveats: if you are an experienced developer, it’s likely that you know a more efficient way. I also tried to avoid any third-party tools specifically touted as renaming applications, as I have not used them and therefore cannot recommend which is best. Lastly, while Photoshop and other photo editing software may help with this when working with image files, the options listed below should work with all file types.
In my example, I am using a set of 43 images waiting for upload to our digital library. The files originated on a faculty member’s camera, so the names are in the following format:
DSCN2956.jpg
DSCN2957.jpg
DSCN2958.jpg
...
The images are of the Olympic Stadium in Beijing, China, and I would like the file names to reflect that, i.e. Beijing-OlympicStadium-01.jpg.
One of the features included in Mac OS X (10.4 and above) is Automator, the “personal automation assistant”, according to Apple Support. The tool allows you to define a group of actions to take, automatically, on a given set of triggers. For example, after discovering this tool I created a script which, when prompted, quickly grabs a screenshot and saves it as a jpeg in a folder I specified.
For this post, let’s step through using the tool to batch re-name files. First, I found a tutorial online. These are everywhere, but specifically, I looked at “10 Awesome Uses for Automator Explained” by Matt Reich. Reich gives a good succinct tutorial, placed in the context of personal photos. We’re going to make a few changes in our steps, place it in the context of a digital collection and walk a little more slowly through the process. I’ll be using Mac OS 10.8 in the steps and screenshots.
1. Go to Finder, Open Applications and double-click on Automator.
2. We’re going to create an Application. Reich uses a Folder Action, which means that you would copy the items into the folder which would trigger the rename. That approach makes sense as you move personal photos from a camera into the same Photos folder over and over again (in fact, I plan to use it myself). However, in working with existing digital files that we just want to rename, which may need to live in many different folders, the Application is a more direct approach. This will allow us to act on the files in place. So, click on the Application Icon, and click on Choose.
3. Now we need to add some Actions. In the Library along the far left-hand pane, select “Files & Folders”. The middle pane will now show all of the options for acting on Files & Folders.
4. Click on “Rename Finder Items” and drag it to the large empty pane on the right.
5. The system will prompt you as to whether or not you want to “Copy the Finder items.” For this example, I opted not to, but if you prefer to make a copy, click on Add.
6. The window you’ve dragged over will default to settings for “Add Date or Time”. We want to do this eventually, but let’s start with changing the name and adding a sequence number. In the drop-down menu at the top of the window, change “Add Date or Time” to “Make Sequential”
7. Select the radio button next to “new name”, but don’t enter a default item name.
8. Set the rest of the parameters. For my purposes, I placed the number after the name, used a dash to separate, and used a three digit number set.
9. Click on “Options” at the bottom, and select “Show this action when the workflow runs.” The application will then prompt you to fill in the item name at runtime.
A note about the date: In cases where you’d like to append a system date (e.g. Created, Modified, Last Opened or Current), you would use “Add Date or Time”. To match our file naming conventions we have already established, we’ll want to select non-space and non-special characters as our separators, use Year Month Day as the format, and click the checkbox to “Use Leading Zeros”. I would use a dash to separate the name from the date and no separator for Year Month Day. Look at the example provided at the bottom to make sure the date looks correct.
However, in my case, I’m working with a set of files where the system dates aren’t much use to me. I want to know the date of the photo; this is especially likely if I were working with scanned files from a historical period. So, I’m going to use “Add Text” instead, and append my own date.
10. Repeat step 4: drag “Rename Finder Items” to the right pane. This time, select “Add Text” from the dropdown.
11. Leave the “Add Text” field blank, click on “Options” and select “Show this action when the workflow runs.” Then, when you run the application you’ll be prompted to add text and you can append 1950, for example, to the file name.
12. Click on File > Save As, and save your Application in a location where it is easy to drag and drop files, like the Desktop. For my example, I called the application BatchFileRename.
13. Navigate to the folder containing the files you want to rename, and select them all (you can use Cmd+A). Drag the whole selection to the Automator file you just created, fill in the prompts and click “Continue”.
You now have a set of renamed files. Note that the script did not modify the “Date Modified” value for the file. The script is now set up for future projects as well; any time you want to rename files, just repeat step 13.
One thing you might notice is that the date is appended after the index number. If you wanted it before the index number, you would append it to the “item name” field in the Make Sequential box and skip the Add Text section altogether.
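For readers who want to preview the names this workflow produces before running it, here is a minimal Python sketch of the same pattern: item name, dash, three digit index, then the appended text. The function name `planned_names` and the sample file names are hypothetical, and I am assuming the appended text is added as typed, so the separator is included in it. The sketch only builds a list of old and new names; it does not touch any files.

```python
from pathlib import Path


def planned_names(files, item_name, appended_text="", digits=3, start=1):
    """Return (old, new) pairs shaped like item_name-001<appended_text>.ext."""
    plan = []
    for index, old in enumerate(files, start=start):
        ext = Path(old).suffix  # keep the original extension, e.g. ".jpg"
        new = f"{item_name}-{index:0{digits}d}{appended_text}{ext}"
        plan.append((old, new))
    return plan


if __name__ == "__main__":
    # Hypothetical camera files; "-1950" mirrors the Add Text prompt above.
    for old, new in planned_names(["IMG_0001.jpg", "IMG_0002.jpg"],
                                  "SmithSculpture", "-1950"):
        print(old, "->", new)
```

To place the date before the index instead, pass it as part of `item_name` and leave `appended_text` empty, which mirrors skipping the Add Text action.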
A note from a paranoid librarian: I copied this set of files from its original location to do this example, so that if something went horribly wrong, I’d still have the originals. Until you get comfortable with batch renaming you might consider doing the same.
There are lots of other uses for the Automator tool. Check out “10 Awesome Uses for Automator Explained” by Matt Reich for more ideas, or do a search for Automator tutorials in your favorite search engine.
Windows – Notepad++ Column Editor
I started out hoping to accomplish this task the same way I did in Mac OS X – with no outside tools. However, the default renaming function in Windows lacks a few things for our purposes. If you select a group of files, right-click and select “Rename”, you can rename all of the files at once.
However, the resulting file names do not conform to our earlier standards. They contain spaces and special characters and the index number is not a consistent length, which can cause sorting headaches.
After some searching, I came across this stackoverflow page, which contained a very useful command:
dir /b *.jpg >file.bat
This command allows me to dump a directory’s files into a text file which I can edit into a series of rename commands to be run as a batch file. The editing of the text file is the most time-consuming part, but using the Column Editor in Notepad++ speeds up the process considerably. (This is where we break the “no third-party tool” convention. Notepad++ is a free text editor I use frequently for writing code and highly recommend, though this process may work with other text editors.)
1. Open a command prompt.
2. Navigate to the directory which contains the files that need to be renamed.
3. The command we found above is composed of several parts. “dir” lists the directory contents, “/b” indicates to only list the filenames, “*.jpg” means to grab only the jpg files, and “>file.bat” directs the output to a file called file.bat. We are going to keep everything the same except change the name of our output file.
dir /b *.jpg >rename.bat
4. In Windows Explorer, navigate to the directory and find the file you just created. Right click on it and select Edit with Notepad++ (or Open With > Your Text Editor).
5. Put the cursor before the first letter in the first line, and open the Column Editor (Edit > Column Editor or Alt+C).
6. This tool allows you to assign the same character to every line of text in the same space. We want to insert the beginning of the Windows rename command for each line. So, in the “text to insert” box, we type:
and click OK.
7. Open the editor again to add the portion of the rename command which goes after the old filename. Here is where we’ll designate our new name, again using the “text to insert” box. I typed:
(Note, if you are using file names of varying length, move to the column after the longest file name, then use Find & Replace at the end of the process to remove the extra spaces.)
8. Next, let’s append an index before the file extension. Open the Column Editor again and this time, select the number option. Start at 1, increment by 1, and check the leading zeros box. Click Ok.
9. Last, append the file extension and end the command for each line. Using the Column Editor’s “text to insert” box one more time, add:
10. The Column Editor adds one extra line at the bottom. Scroll down and delete it before saving the file.
11. Save the file and go back to the command prompt. (If you closed it, re-open it and navigate back to the directory before proceeding.)
12. Type in the full name of the batch file so it will execute, i.e. rename.bat
You’ll see the rename commands go by, and the files will each have a new name. Again, this doesn’t appear to affect the Date Modified on the file.
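If you would rather not step through the Column Editor each time, the same batch file contents can be generated with a few lines of code. This is only a sketch under my own assumptions: the function name `rename_commands` and the sample file names are hypothetical, and I have wrapped each name in double quotes so that the rename command also works on file names that contain spaces. It returns the lines of text; it does not rename anything itself.

```python
def rename_commands(filenames, root, digits=3):
    """Build one Windows rename command per file: rename "old" "root-001.ext"."""
    lines = []
    for index, old in enumerate(filenames, start=1):
        dot = old.rfind(".")
        ext = old[dot:] if dot > 0 else ""  # keep the extension if there is one
        lines.append(f'rename "{old}" "{root}-{index:0{digits}d}{ext}"')
    return lines


if __name__ == "__main__":
    # Hypothetical names like the ones the default Windows rename produces.
    for line in rename_commands(["photo (1).jpg", "photo (2).jpg"], "SmithSculpture"):
        print(line)
```

The printed lines could be saved as rename.bat and run from the command prompt exactly as in step 12.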
Windows – Batch File with Loop
It is possible to write your own batch file that will loop through the files in question and rename them. I have never written my own batch file, so in the interest of researching this post, I decided to give it a shot. There is lots of documentation available online to help in this effort. I consulted Microsoft’s documentation, DOS help documentation, and batch file examples (such as this stackoverflow post and a page on OhioLINK’s Digital Resource Management Committee wiki, which focuses on preparing files for DSpace batch upload).
A batch file just groups a number of Windows commands together in one file and executes them when the batch file is run, as we saw in our previous example. But, instead of writing the specific rename commands one by one using a text editor, a batch file can also be used to generate the commands on the fly. Save the following code to a file, place it in the same directory with the set of files and then double click to run it. Caveat: test this with sample files before you use it! I have tested on a few directories, but not extensively.
First, we use @echo off to stop the batch commands from printing to the command line window.
Then, we set EnableDelayedExpansion so that our index counter will work. Without it, a variable inside a parenthesized block such as our loop is expanded once, when the block is parsed, rather than each time a line executes, so the counter would appear never to change. This is why, when you see i in the loops, it is written !i! instead of the %i% form used for other variables.
Next, I set three prompts to ask the user for some information about the renaming. What’s the root name we want to use? What’s the file extension? Are there more than one hundred files? (Note, this will only work for under 1000 files.) The “/p” flag assigns the response to the prompt to a variable (r, e and n, respectively). When we reference these variables later, we’ll use the syntax %r%, %e% and %n%.
set /p r=Enter name root:
set /p e=Enter file extension (ie .jpg .tif):
set /p n=More than one hundred files? (y/n):
Next, we set the index counter, which allows us to add an incrementing index to our filenames.
set /a "i = 1"
If there are fewer than 100 files, we only need one leading zero in the index for our first nine files, and none for the remaining. If there are more than 100, we’ll want a three digit index. So, the following if statement allows us to fork to one of two loops – for two digits or three digits.
if %n%==y (GOTO three) else GOTO two
Our first segment handles three digit indexes for more than 100 files. %%v is the temporary variable that holds the current file on each pass through the loop. *%e% represents a wildcard plus the extension given by the user. So, if the user enters .jpg, we want to select *.jpg, or all files with a .jpg extension. Everything that follows “do” is a command.
:three
for %%v in (*%e%) do (
First, we want to see if, based on the index counter i, we need leading zeros. If i is less than ten, we want two leading zeros. If it’s less than 100, we want one leading zero. This affects the renaming statement that gets applied. All of the rename statements will rename the file currently in %%v to the root name (represented by %r%), followed by a hyphen, the correct number of leading zeros, the index number (represented by !i!) and the file extension (represented by %e%).
rem Quotes keep rename working when a file name contains spaces.
if !i! lss 10 (
  rename "%%v" "%r%-00!i!%e%"
) else (
  if !i! lss 100 (
    rename "%%v" "%r%-0!i!%e%"
  ) else (
    rename "%%v" "%r%-!i!%e%"
  )
)
Before we exit the loop, we want to increment the index to use with the next file. And, lastly, we need to add a “goto done” statement, so that we don’t execute the “two” segment.
set /a "i = i + 1"
)
goto done
The “two” section is basically the same, except that we only need two digit indexes since there are fewer than 100 files.
:two
for %%v in (*%e%) do (
  if !i! lss 10 (
    rename "%%v" "%r%-0!i!%e%"
  ) else (
    rename "%%v" "%r%-!i!%e%"
  )
  set /a "i = i + 1"
)
We end with our “done” label, which marks the exit point.
Here is the code as a whole:
@echo off
@setlocal enabledelayedexpansion
set /p r=Enter name root:
set /p e=Enter file extension (ie .jpg .tif):
set /p n=More than one hundred files? (y/n):
set /a "i = 1"
if %n%==y (GOTO three) else GOTO two
rem Quotes around the names keep rename working when a file name contains spaces.
:three
for %%v in (*%e%) do (
  if !i! lss 10 (
    rename "%%v" "%r%-00!i!%e%"
  ) else (
    if !i! lss 100 (
      rename "%%v" "%r%-0!i!%e%"
    ) else (
      rename "%%v" "%r%-!i!%e%"
    )
  )
  set /a "i = i + 1"
)
goto done
:two
for %%v in (*%e%) do (
  if !i! lss 10 (
    rename "%%v" "%r%-0!i!%e%"
  ) else (
    rename "%%v" "%r%-!i!%e%"
  )
  set /a "i = i + 1"
)
:done
I saved the file as BatchRename.bat, and then copied it to my test directory. Double click on the .bat file to run it. Answer the prompts and the batch file takes care of the rest.
The files are renamed and again, the Date Modified field was not changed by this action.
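As an aside, the leading zero logic in the batch file can be stated more compactly: the index needs as many digits as the file count has, with a minimum of two. Here is a short Python sketch of that idea, offered as an illustration rather than a replacement for the batch file. The function names `index_width` and `padded_names` are my own, and the sketch only builds names without renaming anything.

```python
def index_width(file_count, minimum=2):
    """Digits needed so every index from 1 to file_count has the same length."""
    return max(minimum, len(str(file_count)))


def padded_names(root, ext, file_count):
    """Build root-01.ext, root-02.ext, ... with a consistent index width."""
    width = index_width(file_count)
    return [f"{root}-{i:0{width}d}{ext}" for i in range(1, file_count + 1)]


if __name__ == "__main__":
    print(padded_names("SmithSculpture", ".jpg", 3))
    print(index_width(250))  # a set of 250 files needs three digits
```

Computing the width from the count removes the need for the y/n prompt and the two separate loops, and it is not limited to sets of under 1000 files.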
Of the three methods, I slightly prefer the Automator method, because of its simplicity and ability to be re-used: once the application is created it can be used over and over again with different sets of files. The batch file for Windows is similar in that it can be re-used once created, but does require some knowledge of coding concepts. With the Notepad++ method, we have simplicity, but you’ll need to step through the file editing with each new set. I love the Column Editor, however; the Insert Number function is incredibly useful for indexing files in file names without the pesky Windows parentheses.
All of the methods are quick and easy ways to rename a large set of files. And from personal experience, I will attest that all are preferable to doing it manually.
I’m curious to hear our readers’ thoughts – feel free to leave questions and other recommendations in the Comments section below.
As a curator and a coder, I know it is essential to use naming conventions. It is important to employ a consistent approach when naming digital files or software components such as modules or variables. However, when a student assistant asked me recently why it was important not to use spaces in our image file names, I struggled to come up with an answer. “Because I said so,” while tempting, is not really an acceptable response. Why, in fact, is this important? For this blog entry, I set out to answer this question and to see if, along the way, I could develop an “elevator pitch” – a short spiel on the reasoning behind file naming conventions.
As a habit, I implore my assistants and anyone I work with on digital collections to adhere to the following when naming files:
- Do not use spaces or special characters (other than “-” and “_”)
- Use descriptive file names. Descriptive file names include date information and keywords regarding the content of the file, within a reasonable length.
- Date information is in the following format: YYYY-MM-DD.
So, 2013-01-03-SmithSculptureOSU.jpg would be an appropriate file name, whereas Smith Jan 13.jpg would not. But, are these modern practices? Current versions of Windows, for example, will accept a wide variety of special characters and spaces in naming files, so why is it important to restrict the use of these characters in our work?
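The three rules above are simple enough to check automatically. The following Python sketch is one possible interpretation, not an official standard: it assumes the date leads the file name (as in the example above), and the names `NAME_PATTERN` and `follows_convention` are hypothetical.

```python
import re
from datetime import datetime

# Assumption: a leading YYYY-MM-DD date, then letters, digits, "-" or "_",
# then a file extension. No spaces or other special characters allowed.
NAME_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})-?[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")


def follows_convention(filename):
    """Return True if the file name matches the pattern and the date is real."""
    match = NAME_PATTERN.match(filename)
    if not match:
        return False
    try:
        datetime.strptime(match.group(1), "%Y-%m-%d")  # rejects e.g. month 13
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for name in ["2013-01-03-SmithSculptureOSU.jpg", "Smith Jan 13.jpg"]:
        print(name, follows_convention(name))
```

A check like this cannot judge whether a name is descriptive, of course; that part still needs a human.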
The Search Results
A quick Google search finds support for my assertions, though often for very specific cases of file management. For example, the University of Oregon publishes recommendations on file naming for managing research data. A similar guide is available from the University of Illinois library, but takes a much more technical, detailed stance on the format of file names for the purposes of the library’s digital content.
The Bentley Historical Library at University of Michigan, however, provides a general guide to digital file management very much in line with my practices: use descriptive directory and file names, avoid special characters and spaces. In addition, this page discusses avoiding personal names in the directory structure and using consistent conventions to indicate the version of a file.
The Why – Dates
The Bentley page also provides links to a couple of sources which help answer the “why” question. First, there is the ISO date standard (or, officially, “ISO 8601:2004: Data elements and interchange formats — Information interchange — Representation of dates and times”). This standard dictates that dates be ordered from largest term to smallest term, so instead of the month-day-year we all wrote on our grade school papers, dates should take the form year-month-day. Further, since we have passed into a new millennium, a four digit year is necessary. This provides a consistent format to eliminate confusion, but also allows for file systems to sort the files appropriately. For example, let’s look at the following three files:
If we expressed those dates in another format, say, month-day-year, they would not be listed in chronological order in a file system sorting alphabetically. Instead, we would see:
This may not be a problem if you are visually searching through three files, but what if there were 100? Now, if we only used a two digit year, we would see:
If we did not standardize the number of digits, we might see:
You can try this pretty easily on your own system. Create three text files with the names above, sort the files by name and check the order. Imagine the problems this might create for someone trying to quickly locate a file.
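If you would rather not create files to see the effect, the sorting behavior can be reproduced in a few lines of Python. The three dates and the “-Notes.txt” suffix below are hypothetical stand-ins chosen for illustration, and `sorted_names` is my own helper name.

```python
from datetime import date

# Hypothetical dates, listed here in chronological order.
SAMPLE_DATES = [date(2011, 6, 15), date(2012, 12, 1), date(2013, 1, 3)]


def sorted_names(dates, date_format):
    """Format each date into a file name, then sort the names alphabetically."""
    return sorted(d.strftime(date_format) + "-Notes.txt" for d in dates)


if __name__ == "__main__":
    print(sorted_names(SAMPLE_DATES, "%Y-%m-%d"))  # year-month-day
    print(sorted_names(SAMPLE_DATES, "%m-%d-%Y"))  # month-day-year
```

With year-month-day, the alphabetical order matches the chronological order. With month-day-year, the 2013 file sorts first because its name begins with 01.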
You might ask, why include the date at all, when dates are also maintained by the operating system? There are many situations where the operating system dates are unreliable. In cases where a file moves to a new drive or computer, for example, the Date Created may reflect the date the file moved to the new system, instead of the initial creation date. Or, consider the case where a user opens a file to view it and the application changes the Date Modified, even though the file content was not modified. Lastly, consider our earlier example of a photograph from 1960; the Date Created is likely to reflect the date of digitization. In each of these examples, it would be helpful to include an additional date in the file name.
The Why – Descriptive File Names
So far we have digressed down a date-specific path. What about our other conventions? Why are those important? Also linked from the Bentley Library and in the Google search results are YouTube videos created by the State Library of North Carolina which answer some of these questions. The Inform U series on file naming has four parts, and is intended to help users manage personal files. However, the rationale described in Part 1 for descriptive file names in personal file management also applies in our libraries.
First, we want to avoid the accidental overwriting of files. Image files can provide a good example here: many cameras use the file naming convention of IMG_1234.jpg. If this name is unique to the system, that works ok, but in a situation where multiple cameras or scanners are generating files for a digital collection, there is potential for problems. It is better to batch re-name image files with a more descriptive name. (Tutorials on this can be found all over the web, such as the first item in this list on using Mac’s Automator program to re-name a batch of photos).
Second, we want to avoid the loss of files due to non-descriptive names. While many operating systems will search the text content of files, naming files appropriately makes for more efficient access. For example, consider mynotes.docx and 2012-01-05WebMeetingNotes.docx – which file’s contents are easier to ascertain?
I should note, however, that there are cases where non-descriptive file names are appropriate. The use of a unique identifier as a filename is sometimes a necessary approach. However, in those cases where you must use a non-descriptive filename, be sure that the file names are unique and in a descriptive directory structure. Overall, it is important that others in the organization have the same ability to find and use the files we currently manage, long after we have moved on to another institution.
The Why – Special Characters & Spaces
We have now covered descriptive names and reasons for including dates, which leaves us with spaces and special characters to address. Part 3 of the Inform U video series addresses this as well. Special characters can carry special meaning in programming languages and operating systems, and might be misinterpreted when included in file names. For instance, the $ character designates the beginning of variable names in the PHP programming language and the \ character separates the parts of a file path in the Windows operating system.
Spaces may make things easier for humans to read, but systems generally do better without the spaces. While operating systems attempt to handle spaces gracefully and generally do so, browsers and software programs are not consistent in how they handle spaces. For example, consider a file stored in a digital repository system with a space in the file name. The user downloads the file and their browser truncates the file name after the first space. This equates to the loss of any descriptive information after the first space. Plus, the file extension is also removed, which may make it harder for less tech savvy users to use a file.
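For existing files that already contain spaces or special characters, a small clean-up function can suggest a safer name. This is a sketch under my own assumptions: runs of whitespace become a single dash, anything other than letters, digits, “-” and “_” is dropped, and the function name `safe_filename` is hypothetical. It only returns the suggested name; it does not rename the file.

```python
import re


def safe_filename(name):
    """Suggest a name without spaces or special characters, keeping the extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    stem = re.sub(r"\s+", "-", stem.strip())     # whitespace runs become one dash
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)   # drop remaining special characters
    return f"{stem}.{ext}" if ext else stem


if __name__ == "__main__":
    print(safe_filename("Smith Jan 13.jpg"))
    print(safe_filename("budget $2013 (final).xlsx"))
```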
That example leads us to the heart of the issue: we never know where our files are going to end up, especially files disseminated to the public. Our content is useless if our users cannot open it due to a poorly formatted file name. And, in the case of non-public files or our personal archives, it is essential to facilitate the discovery of items in the piles and piles of digital material accumulated every day.
So, do I have my elevator pitch? I think so. When asked about file naming standards in the future, I think I can safely reply with the following: “It is impossible to accurately predict all of the situations in which a file might be used. Therefore, in the interest of preserving access to digital files, we choose file name components that are least likely to cause a problem in any environment. File names should provide context and be easily understood by humans and computers, now and in the future.”
And, like a good file name, that is much more effective and descriptive than, “Because I said so.”
During my time as the Digital Resources Librarian at Kenyon College I had the opportunity to work with The Community Within collection, which explores black history in Knox County, Ohio. At the beginning of the project, our goal for this collection was simple: to make a rich set of digitized materials publicly available through our DSpace repository, the Digital Resource Commons (DRC). However, once the collection was published in the DRC, a new set of questions emerged. How do we drive people to the collection? Can we create more interesting interfaces or virtual exhibits for the collection? How do we tie it all together? To answer these questions, we started exploring the digital humanities landscape, looking for low cost tools we could integrate with our existing DSpace collections. We started to think about the collection and associated metadata as a data set, which contained elements we could use to create a display different than the standard list of items. We wanted to facilitate the discovery of individual items by displaying them to our users in different visual contexts, such as maps or timelines.
Two tools that emerged from this exploration were Google Fusion Tables, a Google product, and Viewshare, which is provided by the National Digital Information Infrastructure and Preservation Program (NDIIPP) at the Library of Congress. Google Fusion Tables provides a platform for researchers to upload and share data sets, which can then be displayed in seven different visualization formats (including map, scatter plot and intensity map). Various examples of the results can be seen in their gallery, which also illustrates the wide range of organizations using the tool, including academic research institutions, news organizations and government agencies. Viewshare, according to their website, “is a free platform for generating and customizing views (interactive maps, timelines, facets, tag clouds) that allow users to experience your digital collections.” While it does many of the same things as Google Fusion in allowing users to create visualizations of data sets, it is more specifically geared towards cultural heritage collections.
Both tools are freely available and allow users to import data from a variety of sources. Because the tools are easy to use, it is possible to get started quickly in manipulating and sharing your data. Each tool provides a space for the uploaded data and accompanying views, but also allows for you to embed this information in other web locations. In the case of The Community Within, we created an exhibit which links to materials about churches in the collection using an embedded Google Fusion map display.
This blog entry will walk through how to successfully export and manipulate data from DSpace in order to take advantage of these tools, as well as how to embed the resulting interface components back into DSpace or other collection websites.
The How-To – DSpace and Google Fusion
1. First, start with a DSpace collection. Our example collection is a photo collection of art on the campus at Ohio State University. In the screenshot below, we are already logged in as a collection administrator.
2. We need to export the metadata. So, click on “Export Metadata” (under Context). This will download a .csv file.
3. When you open the .csv, you may notice that metadata added to the collection at different times in different ways may show up differently. We want to fix this before we send this file anywhere.
4. Save the file as a .csv file. If you are given a choice, be sure to select a comma as the separating punctuation.
5. Open Google Fusion. If you do not use Google Drive (formerly Docs), you will need to log in with a Google account or sign up for one. Go to drive.google.com.
6. Once you are logged in, click on Create > More > Fusion Table (experimental).
7. On the next screen, we’re going to select “From this computer”, then click on Browse to get to the csv we created above. Once the file is in the Browse text box, click on Next.
8. Check that your data looks ok, then click on Next again. A common problem occurs here when your spreadsheet editor chooses a separator other than a comma. The fix is easy enough: just click Back and indicate the correct separator character.
9. On the next screen describe your table, then click on Finish.
10. We have a Fusion table. Now, let’s create our visualization. Click on Visualize > Map.
Because our collection already contained Geocodes in the dc.coverage.spatial column, the map is automatically created. However, if you would like to use a different column, you can change it by selecting the Location field to the top left of the map. Google Fusion tables can also create the map using an address, instead of a latitude/longitude pair. If the map is zoomed far back, zoom in before you get the embed code to make sure the zoom is appropriate on your DSpace page.
11. Now, let’s embed our map back in DSpace. In Google Fusion, click on “Get embeddable link” at the top of the map. In the dialog which comes up, copy the text in the field “Paste HTML to embed in a website” (Note: your table must be shared for this to work. Google should prompt you to share the table if you try to get an embeddable link for an unshared table. If not, just click on Share in your Fusion window and make the table public.)
13. Here’s a huge gotcha. I have pasted the embed code below. If you paste it just like this and click on Save, the Collection page will disappear because there is nothing between the tags. We need to add something between the opening and closing <iframe></iframe> tags. Usually, I use “This browser does not support frames.”
<iframe width="500" height="300" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=MAP&q=select+col4+from+1Fqwl_ugZxBx3vCXLVEfnujSpYJa9F0IICVqHLYw&h=false&lat=40.00118408791957&lng=-83.016412&z=10&t=1&l=col4"></iframe>
<iframe width="500" height="300" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=MAP&q=select+col4+from+1Fqwl_ugZxBx3vCXLVEfnujSpYJa9F0IICVqHLYw&h=false&lat=40.00118408791957&lng=-83.016412&z=10&t=1&l=col4">This browser does not support frames.</iframe>
14. Now, click on Save. This will take you back to your collection homepage, which now has a map.
15. One last thing – that info window in the map is not really user friendly. Let’s go back to Google Fusion and fix it. Just click on “Configure info window” above the Fusion map. It will bring up a dialog which allows you to choose which fields you want to show, as well as modify the markup so that, for example, links display as links.
16. No need to re-embed, just head back to your DSpace page and click refresh.
Done! You can play with the settings at various points along the way to make the map smaller or larger.
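The clean-up in steps 3 and 4 above can be done by hand in a spreadsheet editor, but for larger exports a short script may save time. The sketch below rests on an assumption about what the inconsistency looks like: that the same field appears in more than one column, differing only by a bracketed language suffix (for example dc.title and dc.title[en_US]). The function name `merge_language_columns` is hypothetical, and the sketch keeps the first non-empty value for each merged field.

```python
import csv
import io
import re


def merge_language_columns(csv_text):
    """Merge columns such as dc.title and dc.title[en_US] into one dc.title column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = []
    for column in reader.fieldnames or []:
        base = re.sub(r"\[.*\]$", "", column)  # strip a trailing [language] suffix
        if base not in fields:
            fields.append(base)
    rows = []
    for row in reader:
        merged = {}
        for column, value in row.items():
            if column is None:  # extra cells beyond the header row; skip them
                continue
            base = re.sub(r"\[.*\]$", "", column)
            if value and not merged.get(base):
                merged[base] = value  # keep the first non-empty value
        rows.append(merged)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty cells
    return out.getvalue()


if __name__ == "__main__":
    sample = "id,dc.title,dc.title[en_US]\n1,Oval,\n2,,Mirror Lake\n"
    print(merge_language_columns(sample))
```

The output always uses a comma as the separator, which avoids the problem described in step 8.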
The How-To – DSpace and Viewshare
We can complete the same process using Viewshare. If you skipped to this section, go back and read steps 1-4 above.
Back? Ok. So we should have a .csv of exported metadata from our DSpace collection.
1. Log into Viewshare. You will have to request an account if you don’t have one.
2. From the homepage, click on Upload Data.
3. There are a multitude of source options, but we’re going to use the .csv we created above, so we select “From a file on your computer.”
5. In the Preview Window, you can edit the field names to more user friendly alternatives. You can also click the check box under Enabled to include or not include certain fields. You can also select field types, so that data is formatted correctly (as in, links) and can be used for visualizations (as in dates or locations).
6. When you have finished editing, click on Save. You will now see the dataset in your list of Data. Click on Build to build a visualization.
7. You can pick any layout, but I usually pick the One Column for simplicity’s sake.
8. The view will default to List, but really, we already have a list. Let’s click on the Add a View tab to create another visualization. For this example, we’re going to select Timeline.
9. There are a variety of settings for this visualization. Select the field which contains the date (in our case, we just have one date, so we leave End Date blank), decide how you want to color the timeline and what unit you want to use. Timeline lens lets you decide what is included in the pop-up. Click on Save (top right) when you are finished selecting values.
10. We have created a timeline. Now we need to embed it back in DSpace. Click on Embed in the top menu.
11. Copy the embed code.
12. Again, back in DSpace, we will click on Edit Collection and paste the embed code into one of the HTML fields. And, again, it is essential that there is some text between the tags.
Now we have an embedded timeline!
Depending on the space available on your DSpace homepage, you may want to adjust the top and bottom time bands so that the timeline displays more cleanly.
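Since the empty tag problem applies to both the Google Fusion and Viewshare embed codes, it may be worth guarding against it before pasting. Here is a minimal Python sketch that adds fallback text to any empty iframe in a snippet of embed code. The names `EMPTY_IFRAME` and `add_iframe_fallback` are my own, and the regular expression is only meant for simple embed snippets like these, not for parsing HTML in general.

```python
import re

# Matches an opening iframe tag followed directly (or after whitespace only)
# by its closing tag, i.e. an iframe with nothing between the tags.
EMPTY_IFRAME = re.compile(r"(<iframe\b[^>]*>)\s*(</iframe>)", re.IGNORECASE)


def add_iframe_fallback(embed_code, fallback="This browser does not support frames."):
    """Insert fallback text into empty iframes; leave filled ones untouched."""
    return EMPTY_IFRAME.sub(lambda m: m.group(1) + fallback + m.group(2), embed_code)


if __name__ == "__main__":
    # A shortened, hypothetical embed snippet.
    print(add_iframe_fallback('<iframe width="500" height="300" src="https://example.org/map"></iframe>'))
```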
Of course, there are a few caveats. For example, this approach works best with collections that are complete. If items are still being added to the collection, the collection manager will need to build in a workflow to refresh the visualization from time to time. This is done by re-exporting, re-uploading, and re-embedding. Also, Google Fusion Tables is officially an “experimental” product. It is important to keep your data elsewhere as well, and to be aware that your Fusion visualizations may not be permanent.
However, this solution provides an easy, code-free way to improve the user interface to a collection. Similar approaches may also work using platforms not described here. For example, here’s a piece on using Viewshare with Omeka, another open source collection management system. The goal is to let each tool do what it does best, then make the results play nicely together. This is a free and relatively painless way to achieve that goal.
About our Guest Author: Meghan Frazer is the Digital Resources Curator for the Knowlton School of Architecture at The Ohio State University. She manages the school archives as well as the KSA Digital Library, and spends lots of time wrangling Drupal for the digital library site. Her professional interests include digital content preservation and data visualization. Before attending library school, Meghan worked in software quality assurance and training and has a bachelor’s degree in Computer Science. You can send tweets in her direction using @meghanfrazer.