C-Span Video Archives

16 03 2010

Researchers, political satirists and partisan mudslingers, take note: C-Span has uploaded virtually every minute of its video archives to the Internet. The archives, at C-SpanVideo.org, cover 23 years of history and five presidential administrations and are sure to provide new fodder for pundits and politicians alike. The network will formally announce the completion of the C-Span Video Library on Wednesday. Read more from the NYTimes.

Installing EPrints on Windows

12 03 2010

This is an updated manual for installing EPrints on a Windows system. The current manuals seem to be lacking in detail, so here is a consolidation of instructions that have worked. Total install time should be around 30-45 minutes, depending on your technical experience. So if you ever wanted to play with a digital repository system – have fun.

Required Software

Apache 2.2.15-win32

ActivePerl 5.10.1 1007-MSWIN32-x86-291969 http://downloads.activestate.com/ActivePerl/releases/

MySQL 5.1.44-win32 http://dev.mysql.com/downloads/mysql/

EPrints v3.2.0 Windows Installer http://files.eprints.org/494/1/eprints-3.2.0.tar.gz

Optional Software

GhostScript 8.61 http://mirror.cs.wisc.edu/pub/mirrors/ghost/GPL/gs861/gs861w32.exe

Catdoc 0.94.2 http://hpux.connect.org.uk/hppd/hpux/Text/catdoc-0.94.2/

ImageMagick 6.3.5-6 http://linux.wareseeker.com/Multimedia/imagemagick-6.3.5-6.zip/321889

Install Apache

Run the Apache .msi file that you downloaded. The .msi is a self-installer and will guide you through the process. Install Apache on port 80 as a service for all users. Set the server name (localhost), the domain name (localhost), and the administrator's email address (any address will do). By default, Apache installs into the C:\Program Files\Apache Software Foundation\Apache2.2 directory. Change the directory to C:\EPrints\Apache2.

After installation, Apache starts automatically. The Apache Monitor icon in the system tray shows the server status: a green arrow on the icon means Apache has started, while a red square means that the Apache Monitor is running but Apache is not.

Install ActivePerl

Run the ActivePerl .msi file. Install into the C:\EPrints\Perl directory. When the installation of ActivePerl is complete, you will need to install two additional PPM packages (DBD-mysql.ppd and mod_perl.ppd) from the command line. Open a command prompt (Command line 101: Start Menu – Run – type “cmd”) and enter:

ppm install http://capn.uwinnipeg.ca/PPMPackages/10xx/DBD-mysql.ppd
ppm install http://capn.uwinnipeg.ca/PPMPackages/10xx/mod_perl.ppd

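The mod_perl installer will prompt you for the Apache module path. Enter the modules directory of your Apache installation; with the install location chosen above, that should be C:\EPrints\Apache2\modules.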


You will now need to add mod_perl support to Apache. Locate and edit the Apache configuration file, C:\EPrints\Apache2\conf\httpd.conf. Open the file in a text editor and add the following lines:

LoadFile "C:/EPrints/Perl/bin/perl510.dll"
LoadModule perl_module modules/mod_perl.so

Configuring Apache and Perl

Configuring Apache and Perl requires you to set environment variables so EPrints can find Perl and its libraries. To set environment variables, use Control Panel – System – Advanced System Settings – Advanced – Environment Variables…

Locate the Path variable and edit it. Make sure both C:\EPrints\Perl\bin and C:\EPrints\Apache2\bin are included in the Path variable. Use a semicolon (;) to separate the entries.

Create a new variable PERL5LIB, with the value C:/EPrints/EPrints/perl_lib (note the forward slashes).
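To double-check the environment variables after setting them, a small script can help. The following is only a sketch in Python (Python is not part of the EPrints install and is my own choice here); the helper names are hypothetical, and the two directories assume the install locations used in this guide. Open a new command prompt before running it, so that the updated variables are picked up.

```python
import os

# Directories this guide expects on the Path (assumed install locations).
REQUIRED_PATH_ENTRIES = [r"C:\EPrints\Perl\bin", r"C:\EPrints\Apache2\bin"]


def normalize(entry):
    """Lower-case an entry and unify separators so comparisons are forgiving."""
    return entry.strip().strip('"').replace("/", "\\").rstrip("\\").lower()


def missing_path_entries(path_value, required=REQUIRED_PATH_ENTRIES):
    """Return the required directories absent from a semicolon-separated Path value."""
    present = {normalize(e) for e in path_value.split(";") if e.strip()}
    return [r for r in required if normalize(r) not in present]


if __name__ == "__main__":
    missing = missing_path_entries(os.environ.get("PATH", ""))
    print("Missing from Path:", missing or "nothing")
    print("PERL5LIB =", os.environ.get("PERL5LIB", "(not set)"))
```

The comparison ignores case, trailing slashes and slash direction, since Windows treats those paths as equivalent.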

Install MySQL

Now run the MySQL installer and choose a Custom installation in the directory C:\EPrints\MySQL. You will need to set the following options:

Install the server and client programs. The C include files / lib files are not needed. Skip the registration.

Configure MySQL

When the installation of MySQL completes, you will be prompted to configure the server. The configuration is simple and straightforward. You should accept most of the default settings.

When MySQL configuration has finished, you will need to set an option manually in MySQL’s configuration file by editing C:\EPrints\MySQL\my.ini in a text editor.

Remove the option NO_AUTO_CREATE_USER from the sql-mode setting in the my.ini file.
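If you would rather script this edit than do it by hand, here is a minimal sketch in Python. It assumes the option appears as one value in a comma-separated sql-mode line, as in a typical MySQL 5.1 my.ini; the function name strip_sql_mode is hypothetical. Back up my.ini before writing anything back to it.

```python
def strip_sql_mode(ini_text, mode="NO_AUTO_CREATE_USER"):
    """Return ini_text with `mode` removed from any active sql-mode line."""
    out = []
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        # Commented-out lines (key starts with # or ;) do not match and are kept as-is.
        if sep and key.strip().lower() in ("sql-mode", "sql_mode"):
            modes = [m.strip() for m in value.strip().strip('"').split(",")]
            kept = [m for m in modes if m and m.upper() != mode]
            line = '{}="{}"'.format(key.rstrip(), ",".join(kept))
        out.append(line)
    result = "\n".join(out)
    # splitlines() drops a trailing newline, so restore it if the input had one.
    if ini_text.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    sample = 'sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"'
    print(strip_sql_mode(sample))
```

To apply it, read C:\EPrints\MySQL\my.ini into a string, pass it through the function, and write the result back; all other lines are left untouched.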

Now restart MySQL so the new option will take effect: go to Control Panel – Administrative Tools – Services, select MySQL, and choose Restart.

Install optional components

Install GhostScript, ImageMagick, and catdoc. These tools are not essential to EPrints, but provide extra functionality.

Run the GhostScript executable and install in C:\EPrints\GhostScript.

Catdoc is a zip file.  Unzip the file and place the contents into the EPrints directory. The file path should be C:\EPrints\catdoc-0.94.2.

Run the ImageMagick executable and install in C:\EPrints\ImageMagick. Select the options “Update executable search path” and “Install PerlMagick for ActiveState Perl”. Other options can be deselected.

Install EPrints 3

Run the EPrints installer. This will install files into C:\EPrints\EPrints.

When the installer has finished copying files, it will prompt you for server SMTP information.

Configure EPrints 3

First open a command prompt and change directory to C:\EPrints\EPrints. Now you can run epadmin to configure the archive.

cd \EPrints\EPrints

To start the EPrints creation process, run:

perl bin/epadmin create

Note: Whenever you need to run an EPrints command line tool, it must be prefixed with perl.

Run epadmin and fill out the prompts. You will get the following prompts (note that when you see something in [square brackets], it’s the default value and can be selected by simply hitting Enter):

Archive ID – the system name for your archive. Once entered, an archives/<archive_id> directory will be created where the configuration files will be copied.

Configure vital settings – Hit enter to say ‘yes’. This will lead to more prompting about core settings:

Hostname – Since I am testing EPrints on my laptop, I chose to run EPrints locally, so my hostname is my computer’s default IP address. If you are pointing to a live web server, make sure that your IT department can set up the DNS.

Webserver Port – Which port do you want to serve the archive on? The default is 80, so unless you can think of a good reason not to, just hit Enter to accept the default.

Alias – I created no aliases. You can enter any number of aliases that will take users to this archive. Enter a ‘#’ when you don’t want to enter any more. You could have your archive served on eprints.myorganisation.org and eprints.myorg.org. As with the Hostname, your systems team need to be informed about these aliases too.

Administrator Email – Enter the email address of the repository administrator.

Archive Name – The full name of your archive. By default, this will be used on the header of the webpage and in the title bar of the browser.

Write these core settings – Enter ‘yes’.

Configure database – Enter ‘yes’.

Database Name – epadmin will create the database for you. By default, epadmin uses your Archive ID for database name.

MySQL Host – The address of the server that the database is running on. If the database is on the same machine as the EPrints installation, enter ‘localhost’.

MySQL Port – You probably don’t need to enter a value.

MySQL Socket – As with MySQL Port, it’s unlikely that you need to enter anything.

Database User – The username with which to log into the MySQL Database. You don’t need to create this user; epadmin will do it for you. If you enter a MySQL username that already exists, it will be overwritten by epadmin.

Database Password – The password for the Database User.

Write these database settings – Choose ‘yes’.

Create database <Database Name> – Choose ‘yes’, and epadmin can create the database.

MySQL Root Password – To create the database and the user, epadmin needs the MySQL Root Password.

Create database tables – Choose ‘yes’ to have epadmin create all the database tables.

Create an initial user – Choose ‘yes’.

Enter a username – The username you will use to log into EPrints in your browser. Epadmin defaults to admin.

Select a user type (user|editor|admin) – There are three levels of user in EPrints. You probably want to be an administrator, so enter ‘admin’.

Enter Password – Enter a password for this user.

Email – Enter your email address.

Important: Although you are prompted to build the static web pages, import the LOC subject headings and update the Apache config files, epadmin will FAIL to run these steps. Look above the message “That seemed to more or less work…” and you will see error messages such as “…not recognized as an internal or external command…”.

You must run generate_static, import_subjects, and generate_apacheconf manually from the command prompt according to the standard instructions. *Archive ID* should match the Archive ID entered when you ran epadmin.

perl bin/generate_static *Archive ID*
perl bin/import_subjects *Archive ID*
perl bin/generate_apacheconf
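If you end up repeating these three steps (for example while experimenting with several test archives), they can be wrapped in a small script. This is only a sketch in Python, which is not part of the EPrints install; the function names are hypothetical, and the working directory assumes the C:\EPrints\EPrints location used in this guide.

```python
import subprocess
import sys


def post_install_commands(archive_id):
    """Build the three commands from this guide, each prefixed with perl."""
    return [
        ["perl", "bin/generate_static", archive_id],
        ["perl", "bin/import_subjects", archive_id],
        ["perl", "bin/generate_apacheconf"],
    ]


def run_all(archive_id, cwd=r"C:\EPrints\EPrints"):
    """Run the commands in order from the EPrints directory, stopping at the first failure."""
    for cmd in post_install_commands(archive_id):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=cwd, check=True)


# Only runs when an archive ID is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    run_all(sys.argv[1])
```

Save it under any name you like and pass your Archive ID as the only argument. Because check=True is set, a failing step raises an error instead of silently carrying on to the next one.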

Finally you need to add the EPrints configuration file to Apache. Edit C:\EPrints\Apache2\conf\httpd.conf and add at the bottom of the file:

PerlPassEnv PERL5LIB
Include C:/EPrints/EPrints/cfg/apache.conf
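Because httpd.conf is edited twice during this install, it is easy to miss a line. As a rough sanity check, the sketch below (Python, with hypothetical helper names) lists which of the four directives from this guide are not yet active in the file. It compares whole lines exactly after collapsing whitespace, so it assumes you typed them as shown, with straight quotes.

```python
# The four lines this guide adds to httpd.conf.
REQUIRED_DIRECTIVES = [
    'LoadFile "C:/EPrints/Perl/bin/perl510.dll"',
    "LoadModule perl_module modules/mod_perl.so",
    "PerlPassEnv PERL5LIB",
    "Include C:/EPrints/EPrints/cfg/apache.conf",
]


def active_lines(conf_text):
    """Yield non-comment, non-empty lines with runs of whitespace collapsed."""
    for raw in conf_text.splitlines():
        line = " ".join(raw.split())
        if line and not line.startswith("#"):
            yield line


def missing_directives(conf_text, required=REQUIRED_DIRECTIVES):
    """Return the required directives that are not present as active lines."""
    present = set(active_lines(conf_text))
    return [d for d in required if d not in present]


if __name__ == "__main__":
    sample = 'LoadFile   "C:/EPrints/Perl/bin/perl510.dll"\nPerlPassEnv PERL5LIB\n'
    print(missing_directives(sample))
```

To check the real file, read C:\EPrints\Apache2\conf\httpd.conf into a string and pass it to missing_directives; an empty list means all four lines were found.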

Starting Apache

Control Apache from the Services panel. Stop and start the service before testing, to reload the configuration file.


EPrints should now be accessible from your browser at the hostname you specified in epadmin (for example, localhost).
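If the page does not load, it can help to test the server outside the browser. The sketch below uses only the Python standard library; archive_url and is_reachable are hypothetical helper names, and localhost on port 80 is an assumption you should replace with the hostname and port you entered in epadmin.

```python
import urllib.error
import urllib.request


def archive_url(hostname, port=80):
    """Build the archive's base URL, leaving out the port when it is the default 80."""
    if port == 80:
        return "http://{}/".format(hostname)
    return "http://{}:{}/".format(hostname, port)


def is_reachable(url, timeout=5):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and HTTP 4xx/5xx responses all end up here.
        return False


if __name__ == "__main__":
    url = archive_url("localhost")
    print(url, "is reachable" if is_reachable(url) else "is NOT reachable")
```

A "NOT reachable" result usually means the Apache service is stopped or failed to start; the Apache error log is the next place to look.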

Popular Science 137 Year Archive Scanned, Online, Free

4 03 2010

Gadget nerds: Prepare to lose the rest of your day to awesomeness. PopSci, the web-wing of Popular Science magazine, has scanned its entire 137-year archive and put it online for you to read, absolutely free. The archive, made available in partnership with Google Books, even has the original period advertisements. Read More

Search the PopSci Archives.

Digital Repository Management Uncovered

4 03 2010

Digital Repository Management Uncovered is a WEBWISE 2010 preconference presentation by Jessica Branco Colati and Sarah Shreeves. Colati and Shreeves provide a great primer for understanding digital repositories (DRs). They discuss the components of a DR management framework, including the key areas, functions, and policies that drive and sustain DRs. The six key components of DRs are (1) Hardware, (2) Software, (3) Content, (4) Relationships, (5) Controls, and (6) Trust. The abstract of the presentation reads:

“More and more libraries are establishing repository manager positions – either full time or as a piece of another position, but because of the newness of this area, the responsibilities of a repository manager are sometimes not well defined. This session will give an overview of the major areas of repository management institutions should be aware of and offer strategies and tools for participants. This session is platform agnostic and focuses on issues around preservation policies and activities, access and dissemination, and intellectual property of repository management, as well supporting sustainability and growth. The session will be useful whether or not your repository is in-house or hosted elsewhere.”

JISC Digital Repositories InfoKit covers the same ground as Colati & Shreeves’ presentation. It also contains information on a broad range of topics running from the initial idea of a digital repository and the planning process to the maintenance and ongoing management of the repository. The main focus is on institutional repositories.

Thanks to IDEALS for providing access to the presentation. IDEALS collects, disseminates, and provides persistent and reliable access to the research and scholarship of faculty, staff, and students at the University of Illinois at Urbana-Champaign. Faculty, staff, and graduate students can deposit their research and scholarship – unpublished and, in many cases, published – directly into IDEALS. Departments can use IDEALS to distribute their working papers, technical reports, or other research material. Contact Sarah Shreeves, IDEALS Coordinator, for more information.

Repositories and the cloud – useful links via JISC

4 03 2010

These links were prepared for the JISC and Eduserv meeting to discuss repositories and the cloud on Tuesday 23rd of February. Full details are on the event website, and Andy Powell has written a great blog post introducing the event and asking for people’s views. See also the repositories in the cloud report for a recap of the event.

Repository specific links:

General cloud information:

JISC links:

CDL Hosted Archivist Toolkit and Archon Service

3 03 2010

The California Digital Library now offers an Archivists’ Toolkit / Archon Hosted Service, as part of its collection management tools offerings. CDL offers to provide the technical backend support for institutions that do not have that capability.

Overview and Features

CDL-hosted versions of the Archivists’ Toolkit (AT) and Archon archival data management systems are available to contributors.

AT and Archon are popular open-source archival data management systems providing broad, integrated support for records and manuscript collections. They can be used to record information about collection workflows (appraisals, accessioning, processing, and deaccessioning); describe collection items, including digitized objects within a collection; generate print, HTML, and EAD collection guides; note the physical location of collection items; track reference questions; and more.

Both products offer support for describing and managing archival collections, but each has different features and functionality (see Appendixes 2-3 of this CLIR report).

EAD collection guides generated by both tools can be contributed to the OAC. The AT, in particular, also supports the creation of METS-encoded digital objects, which can also be contributed to OAC and Calisphere. UC campus repositories may additionally submit these METS to the CDL-hosted UC Libraries’ Digital Preservation Repository (DPR).

What We Offer

By hosting one or both of these applications, we seek to provide the technical infrastructure for institutions that do not have the capacity to host the backend databases locally. Institutions only need to implement the application clients to use the systems. Our goal is to 1) provide you with different options of applications, depending on what is best suited to your institution’s needs, and 2) help your institution avoid costs associated with creating or hosting an archival data management system locally.

The service currently includes the following support:

* Storage of your data on our servers, including data recovery and backup

* Centralized system and software upgrades

* Technical/database support and user support, including transitioning users to newer software versions as they are released

Read more here.