Authentication and Authorization   
@Creative Commons/ Non-commercial No Derivatives

A Discussion of Current Methods and a Vision
for Digital Libraries

by Ann Cary, Master of Science in Library & Information Science Candidate, Drexel University

Abstract

This paper explores authentication and authorization solutions for accessing online resources. Digital libraries are challenged with providing easy access to online information for authorized users, while protecting patron privacy. This paper presents the advantages and disadvantages of several current authentication and authorization systems used by libraries and other institutions, and discusses the Shibboleth architecture for federated, integrated access control. Shibboleth's advances and potential in digital library environments are also discussed.

Introduction

Traditionally, authentication and authorization have been relatively simple for libraries. A student ID card, government ID, or other form of identification is usually sufficient to authenticate the identity of patrons. A library card, issued at the time of authentication and perhaps defined by certain borrowing rules, provides authorization each time the patron uses materials or services.

With digital libraries, however, establishing patron identity and limiting access to valid patrons is more complex. Digital libraries require fast, unobtrusive, and secure authentication and authorization processes. A number of online access methods currently exist, although comprehensive, easy-to-use, and scalable solutions are still developing.

Comparison of Access Management Systems

Access management systems perform authentication and authorization for online resources. Although many systems perform the two at the same time, they are separate processes: they do not have to be performed simultaneously, by the same system, or in the same location.

Lynch (1998) defines authentication as "the process where a network user establishes a right to an identity -- in essence, the right to use a name", and authorization as "the process of determining whether an identity (plus a set of attributes associated with that identity) is permitted to perform some action, such as accessing a resource" (p. 3).
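
The distinction in Lynch's definitions can be illustrated with a minimal Python sketch. The credential store, the attribute directory, and the action name "read_ejournal" are hypothetical and exist only for illustration; the point is that the two steps are separate functions that could run on different systems.

```python
from typing import Optional

# Hypothetical credential store and attribute directory (illustration only).
CREDENTIALS = {"patron1": "library-card-pin"}
ATTRIBUTES = {"patron1": {"status": "student"}}


def authenticate(name: str, secret: str) -> Optional[str]:
    """Authentication: establish the right to use a name."""
    return name if CREDENTIALS.get(name) == secret else None


def authorize(identity: str, action: str) -> bool:
    """Authorization: decide from the identity's attributes whether an action is permitted."""
    status = ATTRIBUTES.get(identity, {}).get("status")
    return action == "read_ejournal" and status in {"student", "faculty"}
```

A caller would first obtain an identity from authenticate() and only then pass it to authorize(); nothing requires both calls to happen in the same place.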

Table 1 compares the most popular access management systems currently used in libraries for online resources. These include IP address validation, web proxy service, username and password authentication, virtual private networks (VPNs), centralized systems (such as the UK's Athens system), and distributed systems (such as the open source project Shibboleth).

IP Address Validation

Internet protocol (IP) address validation is a relatively simple access management system. In this system, resource providers compare the IP address of a user to a list of authorized IP addresses. If there is a match, the resource provider grants access. IP address validation does not require usernames or passwords, making it simple and transparent for patrons. The major drawback is that it does not grant access to users at remote locations unless individual IP addresses are entered into the database, which is a time-consuming process. IP address validation also poses a potential threat to the privacy of the user, since all information exchanges are tied directly to individual IP addresses.
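
As a rough illustration of the comparison a resource provider performs, the following Python sketch checks a client address against a list of authorized networks. The address ranges are reserved documentation ranges standing in for a campus network, and the function name is an assumption made for this example.

```python
import ipaddress

# Hypothetical allowlist: documentation-reserved ranges stand in for campus networks.
AUTHORIZED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_authorized(client_ip: str) -> bool:
    """Return True if client_ip falls inside any authorized network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are never authorized.
        return False
    return any(address in network for network in AUTHORIZED_NETWORKS)
```

Every new off-site address would have to be added to the list by hand, which is the maintenance burden described above.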

Web Proxy Service

Web proxy service was introduced to supplement IP address validation by allowing off-campus users to log in to a proxy server, whose address would then be recognized and approved by the resource provider. The main benefit of web proxy service is that it can be integrated with an existing IP address validation system. It also provides excellent anonymity for off-campus users.

Proxy servers have limitations, however. They are relatively inefficient, consume a high amount of bandwidth, and burden campus servers with an extra layer of data transmission. A library's technology staff must set up and maintain the proxy server or outsource this responsibility. Troubleshooting also represents a significant time burden; Covey (2003) found that 47% of survey respondents experienced daily or weekly problems with their proxy server and that staff routinely invested entire workdays per month dealing with these problems.

Proxy servers are also vulnerable to attacks by hackers. Unlike an attack within a traditional IP address validation system, an attack on a proxy server can take longer to detect. One major breach of proxy security occurred in 2002 when unauthorized users downloaded approximately 50,000 JSTOR articles through open campus proxy servers before being detected (Carnevale, 2002).
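
The proxy arrangement can be sketched in Python as follows. All names, addresses, and the in-memory account list are hypothetical, and passwords are kept in plain text only to keep the sketch short. The sketch shows why the resource provider sees only the proxy's address: the campus proxy authenticates the user locally and then forwards the request from its own approved address.

```python
from dataclasses import dataclass

PROXY_IP = "192.0.2.10"                  # hypothetical campus proxy address
PROVIDER_ALLOWLIST = {PROXY_IP}          # the provider approves only the proxy
CAMPUS_ACCOUNTS = {"patron1": "s3cret"}  # hypothetical local accounts


@dataclass
class Request:
    source_ip: str
    path: str


def provider_handle(request: Request) -> str:
    """Resource provider: plain IP address validation, no knowledge of the user."""
    if request.source_ip in PROVIDER_ALLOWLIST:
        return f"200 OK: {request.path}"
    return "403 Forbidden"


def proxy_fetch(username: str, password: str, path: str) -> str:
    """Campus proxy: authenticate locally, then forward from the proxy's own address."""
    if CAMPUS_ACCOUNTS.get(username) != password:
        return "401 Unauthorized"
    return provider_handle(Request(source_ip=PROXY_IP, path=path))
```

The same property explains the risk of an open proxy: any request that leaves the proxy is accepted by the provider, whoever sent it.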

Table 1
Methods of Access Control Currently Used in Libraries

Method: IP address validation
Description: Resource provider's web server captures the IP address from which the request originates, and checks it against a list of approved addresses.
Pros:
• Relatively simple for libraries to deploy.
• Efficient in terms of performance.
• Invisible to patrons - no login required.
Cons:
• IP address ranges must be constantly maintained.
• Adding off-site addresses is time consuming.
Security:
• Relatively secure.
• Security breaches are easily detectable.
Privacy:
• Data requests from IP addresses can be linked to personal workstations in offices.
Sources: Lynch, 1998; Mikesell, 2004; Robiette, 2001

Method: Web proxy service
Description: A proxy server is set up to enable off-campus access. Local authentication system controls which users have access to the proxy server. Resource provider uses IP address validation of requests coming from proxy server.
Pros:
• Can be integrated with existing IP address validation.
• Provides a solution for efficient off-campus access.
Cons:
• Considerable work involved in set up and maintenance.
• Does not scale well - must be updated as the community grows.
• High level of bandwidth consumption - creates extra layer of data transmission on campus server.
Security:
• Vulnerable to attacks on proxy server, which can be reconfigured. This may be more difficult to detect than an attack on the local system.
Privacy:
• Can provide true anonymity if set up properly.
Sources: Carnevale, 2002; Mikesell, 2004; Robiette, 2001

Method: Username and password authentication
Description: Resource provider validates some form of credential with a trusted server. Or, resource provider creates and maintains a list of usernames and passwords for all those entitled to use a resource.
Pros:
• Simple to design and implement on small scales.
• Users are familiar with this access method.
Cons:
• Considerable work involved in large institutions.
• Users must remember many passwords, each for a different service.
Security:
• Relatively weak.
• Depends on user ability to protect passwords (memorization vs. post-it on the monitor).
• Often uses web basic authentication protocol, so usernames and passwords are not encrypted during web transfer.
Privacy:
• Personal information is directly tied to each username and password.
Sources: Lynch, 1998; Robiette, 2001

Method: Virtual Private Network (VPN)
Description: An on-demand, private network that uses the Internet to connect users with remote servers. Users must register and configure their computers to use VPN service.
Pros:
• Operates more efficiently than proxy servers.
Cons:
• Requires technical expertise and specialized hardware to set up and maintain.
Security:
• Provides high level of security if configured properly. Uses encryption and other methods to ensure data is not intercepted.
Privacy:
• Provides high level of privacy from unauthorized viewers; however system administrator may have access.
Sources: Covey, 2003; Mikesell, 2004

Method: Centralized System - Athens
Description: Username and password information for all participating institutions and all access-controlled resources are held in a single centralized database. Administration is distributed to each institution, which must keep its profile of users and resources up-to-date.
Pros:
• Institutions are not required to maintain hardware and software for access control.
• Publishers and vendors have strong incentive to participate so that their resources are accessible.
• User administration is extremely simple for the resource provider.
• Scales well.
Cons:
• If a resource provider does not participate, this can be very problematic for libraries.
Security:
• Reduces security issue of users having too many passwords to remember.
Privacy:
• User identities are held in a centralized database outside of the individual institution. This may be cause of concern for privacy.
Sources: Mikesell, 2004; Robiette, 2001

Method: Distributed System - Shibboleth
Description: Uses campus local authentication system to verify users. Pseudo anonymous attributes about users are sent to resource providers. Resource provider determines authorization. Based on "federations of trust" between institutions.
Pros:
• Flexible, affordable, easy to use and implement.
• Open source software with good documentation from Shibboleth development team.
• Users only need to remember one set of passwords for potentially thousands of online services.
Cons:
• Libraries may lack technical expertise to implement - requires campus-wide authentication system and directory.
• Currently only works with Web-enabled applications.
• Does not include an authentication system.
• Currently not all resources are Shibboleth enabled, so an alternate system for access is also needed.
Security:
• Reduces security issue of users having too many passwords to remember.
Privacy:
• Protects privacy - user information is only held at the home institution, not with each resource provider.
• Goal is to release the minimum level of information necessary to make an authorization decision.
• Users can control level of privacy vs. access.
Sources: Covey, 2003; Klingenstein, 2004; Lemmons, 2004; Mikesell, 2004

 

Username and Password Authentication

In a username and password authentication system, a resource provider compares a username and password provided by the user with a list of valid username/password pairs. This list is either stored on a server that the resource provider trusts, or maintained by the resource provider themselves. This system is familiar to users who are accustomed to using usernames and passwords to gain access to resources.

Although this can be the simplest access management method on a very small scale, username and password authentication has numerous drawbacks in larger settings. It is considerable work for large institutions to maintain separate passwords for each individual and for each resource. Users must remember multiple passwords, leading to "password overload," which can be a serious risk to system security. As Duke University IT architect Michael Gettes says, "It’s no wonder some people keep their passwords in a desk drawer or slap sticky notes on their monitor. Password overload is a pain. More importantly, having an impossible number of passwords to remember is a sign that personal privacy is at risk" (Lemmons, 2004, ¶ 1).
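
A minimal Python sketch of the comparison against a list of valid username/password pairs follows. Storing salted hashes rather than the passwords themselves is an assumption added here as common practice; the paper does not specify how the list is stored. The function names and the iteration count are likewise illustrative.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the list never holds the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(store: dict, username: str, password: str) -> None:
    """Add a username with a fresh random salt and the derived hash."""
    salt = os.urandom(16)
    store[username] = (salt, hash_password(password, salt))


def verify(store: dict, username: str, password: str) -> bool:
    """Compare the submitted password with the stored record for that username."""
    record = store.get(username)
    if record is None:
        return False
    salt, expected = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, hash_password(password, salt))
```

A resource provider that keeps its own list needs one such record per user, so the list grows with every institution it serves.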

Virtual Private Networks

Virtual private networks (VPNs) are on-demand, private networks that use the Internet to connect users with remote servers. Users must register and configure their computers to use a VPN service. Once configured and connected, the VPN operates like a proxy server. VPNs provide much better security than proxy servers because data is encrypted during transfer and, if configured correctly, cannot be intercepted. However, VPNs require specialized hardware and software, and some institutions may lack the technical expertise to set up and maintain a VPN. In fact, Covey (2003) found that although 30% of academic libraries surveyed were running or testing a VPN, none of the libraries implemented the VPN themselves. To date, VPNs have not been as popular as proxy servers in academic institutions.

Centralized Systems - Athens

Centralized systems have also been used to manage access to restricted resources. In a centralized system, username and password information for all participating institutions and all access-controlled resources is held in a single centralized database. Administration is distributed to each institution, which must keep its profile of users and resources up-to-date. The most notable centralized system is the Athens system. Athens was developed in 1996 for the higher education environment in the UK, and has been expanded to encompass the UK’s National Health Service as well (Robiette, 2001).

One benefit of a centralized system is that individual institutions do not need to maintain hardware and software for access control. Thus, publishers and vendors have strong incentive to participate so that their resources are accessible to a wide range of institutions. At the same time, it can be very problematic for libraries if a resource provider does not choose to participate, because libraries may not be equipped to manage access to sources outside the system.

In 2004, the UK’s Joint Information Systems Committee (JISC) announced plans to migrate their authentication processing from Athens to Shibboleth, a distributed access management system (Chillingworth, 2004). JISC explains that Shibboleth provides significant benefits over Athens, such as flexibility, compatibility, and user friendliness (Morrow, Borda & Robiette, 2005).

Distributed Systems - Shibboleth

In 2003, Internet2 and the Middleware Architecture Committee for Education (MACE) introduced Shibboleth, a distributed, open source access management system (Lemmons, 2004). Shibboleth separates the processes of authentication and authorization between institutions requesting access and those providing access. In Shibboleth, the user's home institution (called an ‘origin site’ or ‘identity provider’) is responsible for storing user information and authenticating users. Institutions that provide online resources (called ‘target sites’ or ‘service providers’) request authentication statements from identity providers when a user requests access to their services (Cantor, 2005). The user is granted authorization to access a particular resource if the response from the identity provider meets the service provider’s requirements.

Shibboleth is the middleware that allows these transactions to take place. Built on SAML (Security Assertion Markup Language), Shibboleth facilitates the definition of relationships between institutions as well as the exchange of pseudo-anonymous information about users. In order to use Shibboleth, identity providers must run a campus-wide authentication system and maintain a database of user attributes (referred to as an Attribute Authority). It is recommended, but not required, that the identity provider also run a single sign-on (SSO) system.

Providers must be configured to support Shibboleth. Currently, implementation requires several software components, including the Apache Web server, Tomcat, and Sun’s Java (Carmody, 2001; Shibboleth origin deployment guide, section 3.a). Users must use a JavaScript-capable Web browser that supports HTTP-level POST and redirections (Klingenstein, 2002). Beyond these requirements, Shibboleth can be implemented by any institution with access to the public Internet.

A major benefit of Shibboleth is the protection of patron identity. The goal of Shibboleth is to release the minimum amount of information about the user to achieve authorization. Each user has a single identity which is controlled by a single institution. Identity providers only send attributes that authenticate the individual as a valid user to the service provider. Shibboleth allows sites that provide resources and sites that request access to negotiate and determine what attributes or credentials need to be exchanged in order to gain access.

Shibboleth also allows users to control the release of their personal information. The Draft Shibboleth Architecture document from May 2002 states that “Shibboleth is also a system for allowing user choice in what information gets released about the user and to which site. Thus, the job of balancing access and privacy lies ultimately with the user, where it belongs” (Needleman, 2004, p. 253).
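
The idea of releasing the minimum set of attributes, with the user able to withhold more, can be sketched in Python. This is not Shibboleth's actual policy format: the directory record, the per-provider policy table, the host names, and the function name are all hypothetical.

```python
# Hypothetical directory record held only at the identity provider.
USER_RECORD = {
    "name": "Pat Example",
    "email": "pat@example.edu",
    "affiliation": "student",
    "entitlement": "library-licensed-resources",
}

# Hypothetical per-provider release policy set by the home institution.
INSTITUTION_POLICY = {
    "journals.example.com": {"affiliation", "entitlement"},
    "courseware.example.com": {"name", "email", "affiliation"},
}


def release_attributes(record, service_provider, requested, user_blocked=frozenset()):
    """Release only attributes that are requested, permitted by policy, and not blocked by the user."""
    allowed = INSTITUTION_POLICY.get(service_provider, set())
    releasable = (set(requested) & allowed) - set(user_blocked)
    return {key: record[key] for key in sorted(releasable) if key in record}
```

In this sketch a journal provider that asks for a name receives only the affiliation it is entitled to, and a provider with no policy entry receives nothing.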

This model depends on trust relationships between identity and service providers. Individual institutions define these trust relationships and negotiate their complexity. To reduce the amount of negotiation involved, groups can form "federations of trust" where members agree to trust each other’s authentications. One example of a federation of trust is InCommon™, which consists of sixteen U.S. universities, institutions, and organizations (InCommon, 2005). SWITCHaai in Switzerland and HAKA in Finland are examples of international Shibboleth federations (Morrow, Borda & Robiette, 2005).

Figure 1 illustrates the steps involved in a Shibboleth-enabled request for access (Cantor, 2005).

Figure 1
Model of Shibboleth System

  1. User requests access-limited information on a website of a service provider.
  2. The service provider issues an authentication request for the user. The request is directed to a “Where Are You From” (WAYF) service or, if the service provider has taken on the role of WAYF, directly to an identity provider.
  3. If a WAYF is used, the WAYF directs the request to the appropriate identity provider (the user’s home institution).
  4. Identity provider performs local authentication of the user. This is outside the scope of Shibboleth.
  5. Identity provider issues an authentication assertion (with a handle for the user) in response to the service provider.
  6. If the service provider requires additional attribute information to make an authorization decision, the service provider sends an attribute request message (with the user’s handle) to an Attribute Authority associated with the identity provider.
  7. The Attribute Authority processes the attribute request message and issues a response message to the service provider. This response message could contain one or more assertions containing attributes that apply to the user, dependent upon restrictions set up by the user.
  8. The service provider denies or approves the user’s request to access the information. Either an error message or the requested resource is sent to the user.
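
The eight steps above can be sketched as a small Python simulation. This is a simplified model under stated assumptions, not the Shibboleth implementation: the class and function names are invented for illustration, the WAYF is reduced to a lookup table, messages are plain Python objects rather than SAML, and passwords are stored in plain text only to keep the example short.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Assertion:
    issuer: str
    handle: str  # opaque reference to the user; carries no personal data


class IdentityProvider:
    def __init__(self, name, accounts, attributes):
        self.name = name
        self._accounts = accounts      # username -> password
        self._attributes = attributes  # username -> attribute dict
        self._handles = {}             # handle -> username

    def authenticate(self, username, password):
        """Steps 4-5: local authentication, then an assertion carrying a handle."""
        if self._accounts.get(username) != password:
            return None
        handle = secrets.token_hex(8)
        self._handles[handle] = username
        return Assertion(issuer=self.name, handle=handle)

    def attribute_query(self, handle, requested):
        """Steps 6-7: the Attribute Authority resolves the handle and answers."""
        username = self._handles.get(handle)
        if username is None:
            return {}
        record = self._attributes.get(username, {})
        return {key: record[key] for key in requested if key in record}


class ServiceProvider:
    def __init__(self, required):
        self._required = required  # attribute name -> required value

    def decide(self, idp, assertion):
        """Steps 6-8: request attributes by handle, then approve or deny."""
        attrs = idp.attribute_query(assertion.handle, list(self._required))
        if all(attrs.get(k) == v for k, v in self._required.items()):
            return "resource"
        return "error: not authorized"


def request_access(sp, wayf, institution, username, password):
    """Plays the user's browser, carrying messages between the parties (steps 1-8)."""
    idp = wayf.get(institution)                       # steps 2-3: WAYF lookup
    if idp is None:
        return "error: unknown institution"
    assertion = idp.authenticate(username, password)  # password goes only to the identity provider
    if assertion is None:
        return "error: authentication failed"
    return sp.decide(idp, assertion)
```

In this model the service provider never sees the username or password; it receives only the handle and whatever attributes the Attribute Authority returns for it.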

Limitations of Shibboleth

In order to become a comprehensive access management system for digital libraries, Shibboleth needs to interact with non-web-based services. This capability would strengthen the digital library's role as a portal to an individual's entire collection of resources. Shibboleth must also improve user access speed. The Shibboleth Overview and Requirements document cautions that "a user should be willing to tolerate some amount of delay on their initial access to each remote site" (Carmody, 2001, p. 12). This is a significant drawback for users who expect instantaneous access to resources. Finally, until all resource providers are Shibboleth-enabled, institutions will need to maintain additional access management systems for non-compliant resource providers.

A Vision for Digital Libraries

Within the first six months of its release in August of 2003, Shibboleth version 1.1 was implemented by more than 30 universities, content providers, and international partners (Klingenstein, 2004). Organizations that currently support Shibboleth include the courseware systems Blackboard™ and WebCT™, and database providers EBSCOhost and Elsevier ScienceDirect (Internet2, 2005, April 25). Napster is also Shibboleth-enabled, and successfully implemented a free music file sharing system for Penn State students in 2004 (Shibboleth 1.1 successfully used by Penn State Students…, 2004). In addition, four European countries are testing Shibboleth at a national level (Becker, 2002).

The development of Shibboleth is exciting for digital libraries. Imagine a library which provides access to the entire range of online resources, such as databases, electronic books, consortia borrowing, online meeting software, web-based email, courseware, and more. For institutions with SSO, after a single authentication Shibboleth would work behind the scenes each time a patron accessed a restricted service or resource. There would be no additional passwords to remember and easy access regardless of user location. This solution could truly provide flexible and secure access to digital libraries from around the world.

For further information on technology and programs discussed in this paper, please visit the following websites. All websites were accessible as of May 2, 2005.

Access Management Services

Athens Access Management System: http://www.athens.ac.uk/
Cisco VPN Client: http://www.cisco.com/en/US/products/sw/secursw/ps2308/
EZproxy by Useful Utilities: http://www.usefulutilities.com/
Squid Web Proxy Cache: http://www.squid-cache.org/

Internet2 / Shibboleth

InCommon™: http://www.incommonfederation.org/
Internet2: http://www.internet2.edu/
Internet2 Middleware Initiatives: http://middleware.internet2.edu/
OpenSAML 1.0.1: http://www.opensaml.org/
Shibboleth: http://shibboleth.internet2.edu/

References

Becker, P. (2002, August 5). Shibboleth: identity the Internet way. Digital Id World. Retrieved May 2, 2005 from http://www.digitalidworld.com/article.php?id=90

Cantor, S. (Ed.). (2005, February 28). Shibboleth architecture: protocols and profiles. Retrieved May 2, 2005 from Shibboleth Project Documentation Web site: http://shibboleth.internet2.edu/docs/draft-mace-shibboleth-arch-protocols-latest.pdf

Carmody, S. (2001, February 20). Shibboleth working group overview and requirements document. Retrieved May 2, 2005 from Shibboleth Project Documentation Web site: http://shibboleth.internet2.edu/docs/draft-internet2-shibboleth-requirements-01.html

Carnevale, D. (2002, December 12). Security lapses on campuses permit theft from JSTOR database. The Chronicle of Higher Education. Retrieved May 2, 2005 from http://chronicle.com/free/2002/12/2002121201t.htm

Chillingworth, M. (2004). Future of Athens uncertain as JISC backs Shibboleth. Information World Review, (204). Retrieved March 17, 2005.

Covey, D. T. (2003). The need to improve remote access to online library resources: filling the gap between commercial vendor and academic user practice. portal: Libraries and the Academy, 3(4), 577-599.

InCommon™. (2005). InCommon™ participants. Retrieved May 2, 2005 from InCommon™ Web site: http://www.incommonfederation.org/participants.cfm

Internet2. (2005, April 25). Shibboleth Enabled Applications and Services. Retrieved May 2, 2005 from Shibboleth Project, Internet2 Middleware Web site: http://shibboleth.internet2.edu/seas.html

Internet2. (1996-2005). Internet2 Middleware Initiative. Retrieved May 2, 2005 from http://middleware.internet2.edu/

Klingenstein, N. (2004). Shibboleth. Computers in Libraries, 24(2), 30.

Klingenstein, N. (2002, May 13). DRAFT Shibboleth FAQ. Retrieved May 2, 2005 from Shibboleth Project Documentation Web site: http://shibboleth.internet2.edu/docs/draft-internet2-shibboleth-faq-v07.txt

Lemmons, P. (2004, September 3). 'Shibboleth' relieves password overload, enhances computer privacy. Retrieved May 2, 2005 from Duke University, News & Communications Web site: http://www.dukenews.duke.edu/2004/09/shibboleth_0904.html

Lynch, C. (Ed.). (1998, April 14). A white paper on authentication and access management issues in cross-organizational use of networked information resources. Retrieved March 17, 2005 from Coalition for Networked Information Web site: http://www.cni.org/projects/authentication/authentication-wp.html

Martin, M., et al. (2002). Federated digital rights management: a proposed DRM solution for research and education. D-Lib Magazine, 8(7/8). Retrieved May 2, 2005 from http://www.dlib.org/dlib/july02/martin/07martin.html

Mikesell, B. L. (2004). Anything, anytime, anywhere: proxy servers, Shibboleth, and the dream of the digital library. Journal of Library Administration, 41(1/2), 315-326.

Morrow, T., Borda, A., & Robiette, A. (2005, April). Shibboleth: connecting people and resources. Retrieved May 2, 2005 from JISC – The Joint Information Systems Committee Web site: http://www.jisc.ac.uk/uploaded_documents/JISC-BP-Shibboleth-v1-final.pdf

Needleman, M. (2004). The Shibboleth authentication/authorization system. Serials Review, 30, 252-253.

Robiette, A. (2001). Managing access to electronic information: progress and prospects. Serials, 14(3), 301-304.

Shibboleth 1.1 successfully used by Penn State students registering for Napster 2.0 Premium Service. (2004, January 27). PR Newswire. Retrieved March 17, 2005 from LexisNexis Academic database.

Shibboleth origin deployment guide: Shibboleth version 1.2.1. (2004, November 15). Retrieved May 2, 2005 from Shibboleth Project, Internet2 Middleware Web site: http://shibboleth.internet2.edu/guides/deploy-guide-origin1.2.1.html

Creative Commons License
This work is licensed under a Creative Commons License.

