bler's blog

Tuesday, November 16, 2010

Unit 12

Using a pre-installed VM presents different challenges than building one directly around the planned digital collections. The biggest factors in deciding which is preferable would be the culture of knowledge the institution fosters and the support (or lack thereof) from administration. I believe building one would be preferable in terms of the robustness of the repository. However, if there is not a shared sense of commitment and understanding among stakeholders, particularly those who decide policy and funding, it may prove too big a bite to take. In particular, preservation issues must be thoroughly addressed, and I can see the logic of using a pre-configured set-up if there were doubts about long-term funding. It is hard to say, in some ways, because I am learning this in an isolated environment, and in the real world more people and opinions would affect things a great deal. The argument that pre-configuration leaves more time to devote directly to the collection is a valid one, but the counter is that pre-planning should (or could) do the same on a self-built system.

Tuesday, November 9, 2010

Unit 11

Perhaps because it is my freshest memory, or because I have had more experience evaluating websites in this field, I was impressed with the Omeka site. In part I felt it was nicely geared toward the beginner, with both screencast and text-based documentation. The site had a logical set-up and I felt the general overview was complete. One negative was that the FAQ page had been deleted and had not been replaced for over a year. I particularly liked the Use Cases forum, in which developers explained how they used Omeka in different real-life institutions. This gave me both insight into how to use Omeka and a more general sense of what other types of people, beyond libraries, archives and museums, want to develop digital collections.

Drupal is aesthetically a little fussy for me, but its documentation was significantly more detailed on the technical side than Omeka's. It is also clearly much more popular and has been around longer, which means its background information and forums have tackled more problems and offer more solutions. The Drupal and DSpace home sites have a lot in common in terms of numerous links, depth of documentation and a general sense of being overly full. An example from DSpace is the feature of linking ‘child pages’ to each main section. There are reasons why this would help in following a topic throughout the website, but it adds numerous links to the page that are unnecessary or confusing for the general user.

JHOVE strikes a balance between the bustling atmosphere of Drupal and DSpace and the cleaner Omeka. It is clearly aimed at IT staff. A concern is that some of the information is a little old. For example, the news link has only two entries, both from 2008. Has nothing happened in two years, or is no one managing the website? Neither possibility inspires a sense of confidence.

The OAI-PMH main site is deep and shows that the project has history and is a major international endeavor. It is a little impersonal and expects a certain amount of previous knowledge from its users on things like acronyms. My experience with the install process was positive, but the website itself is a little intimidating.

One of the key features of all these systems is that they are open source. To me this means the sense of community, the communication and forum options, the documentation and the current news would be very important considerations in the choosing process. I like Omeka, but the appeal of DSpace or Drupal is the large and active community of users that could provide support for free. It is a balancing act, and I would give a lot of consideration to future support before making commitments.

Sunday, November 7, 2010

Unit 10

It is an interesting question how to determine how successful a service provider of harvested metadata is, as we are entirely dependent on their efforts for our results. Without the service providers, the information does not get found easily. That makes it curious to me that the service providers I was able to examine were such a mix of strange bedfellows.

Ex. 1 http://www.perseus.tufts.edu/hopper/search?redirect=true

The collections that provide the sources are disparate, to say the least. The dominant collection is on the Art and Artifacts of Greek and Roman Materials, while the second largest contributor is a collection of 19th century American history, including the digital archive of the Richmond Times Dispatch newspaper. Subject, time period, place, etc. are not held in common, and when searching, the two faces of the collection are very apparent. My assumption is that the host, Tufts University, finds this pairing convenient for its own reasons, but it is not a logical combination for the general user.

Ex. 2 http://re.cs.uct.ac.za/

This was a different style of searching than the norm. What the site does (through its frankly hideous-looking interface) is allow you to set metadata parameters that you can then apply to the OAI-compliant providers listed. There is no way to search by key words, and it requires knowledge of how OAI harvesting works to make sense. Again, the collection of providers comes from all sorts of institutions around the world, with little obviously in common. It is kind of an inside-out search tool. Another confusing point: the Open Archives list shows Virginia Tech as the host, but the site itself is from the University of Cape Town. Bit of a difference between the two!
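
For anyone who has not seen what such metadata parameters look like, here is a minimal sketch of how an OAI-PMH harvesting request is put together. The verb and argument names (ListRecords, metadataPrefix, from and so on) come from the OAI-PMH 2.0 specification. The base URL is a placeholder and the helper name build_oai_request is made up for illustration; neither refers to the site above or to any real provider.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}


def build_oai_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        if value is not None:
            # 'from' is a Python keyword, so callers pass 'from_' instead.
            params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, not a real provider.
url = build_oai_request(
    "http://example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",  # simple Dublin Core
    from_="2010-01-01",       # only records changed since this date
)
print(url)
```

The sketch only builds the URL and never contacts a server, which is enough to show why a search tool built on these parameters assumes some knowledge of how harvesting works.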

Ex. 3 http://hispana.mcu.es/es/estaticos/contenido.cmd?pagina=estaticos/presentacion

By far the most successful of the examples, in terms of relevancy of search results, was the Hispana site. In large part this is because it is a dual project, with one side a directory of digital projects from Spain and the other a harvester for those same projects. This two-in-one approach meant that searches for common terms gave relevant results. The limitation is that it is all related to Spain. However, I would rather go to more than one service provider and get relevant results, if they are all like the Hispana site, than go to the Perseus site and get an unusable mix.

Tuesday, October 26, 2010

Week 9

Ah, the metadata dilemma. Too much is too expensive, while too little makes the whole project pointless since no one will find it. For digital libraries in general I think this is an area that is both art and science. For my digital collection in particular, the potential audience and the known creators are who I have in mind when I am experimenting with cataloging. What I mean by that is I envision my collection of webcomics as being of popular culture interest and not for a specialized field or academia. For general users who are comfortable online, traditional subject terms are not adequate, as they can be old-fashioned or non-intuitive. Because of that I have been playing mainly with key words and tags. Neither is perfect. Key words have potential as a natural-language-based and well-known search method. I think it is the system most users will be comfortable with. I personally like tagging as a method of description, but without high user involvement and/or collection density it doesn’t necessarily work well. In both cases, consistency depends on me (the administrator) paying attention and keeping track of what I have chosen in previous cases. The only way around this problem I can think of is to include decision making for terminology in the planning stages, and then hope one is prescient enough to cast the net wide enough to give full coverage.
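
The consistency problem can be pictured as a small lookup, decided in the planning stage, that maps the variants an administrator might type onto a single preferred tag. This is only a sketch under stated assumptions: the tag names, the PREFERRED_TAGS table and the normalize_tag and normalize_tags helpers are all invented for illustration and are not part of Drupal, DSpace or any other system mentioned in these posts.

```python
# Hypothetical controlled list: each variant maps to one preferred tag.
PREFERRED_TAGS = {
    "webcomic": "webcomics",
    "web comic": "webcomics",
    "web comics": "webcomics",
    "sci-fi": "science fiction",
    "scifi": "science fiction",
}


def normalize_tag(raw):
    """Lowercase, collapse whitespace, then apply the preferred form if any."""
    cleaned = " ".join(raw.lower().split())
    return PREFERRED_TAGS.get(cleaned, cleaned)


def normalize_tags(raw_tags):
    """Normalize a list of tags, dropping duplicates but keeping order."""
    seen = []
    for tag in map(normalize_tag, raw_tags):
        if tag and tag not in seen:
            seen.append(tag)
    return seen


print(normalize_tags(["Web Comic", "webcomics", " Sci-Fi ", "humor"]))
# prints ['webcomics', 'science fiction', 'humor']
```

Tags that are not in the table pass through unchanged, so the list only has to cover the variants that actually cause trouble.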

Tuesday, October 19, 2010

Week 8

While I have become quite cavalier about setting up a new VM, actually installing EPrints made me more anxious. Each time we install a new system, there are new areas of confusion or grey spots in my previous understanding. I am, however, taking the lesson from IRLS 673 to heart. At the beginning of that class, command line work seemed beyond my understanding. And while I don't claim to understand it completely now, I have improved by orders of magnitude. I believe the same will happen with time spent on Drupal, DSpace and now EPrints. Comparing the installation experiences of the three is apples to oranges to kiwi. I currently favor Drupal for my semester project, in part because I was most comfortable with that installation. DSpace was a horrible experience, though to be fair most of the problems were externally generated. EPrints has been in between. The main issue with the install has been that it has taken more time, even though there have not been major problems. The aptitude upgrade and then the EPrints apt method both took a considerable amount of time for no reason I could deduce. It makes me fearful for my laptop, and if the hardware doesn't work then that is a very big problem for me. As for the configuration/branding exercise, when compared to the customization processes for Drupal and DSpace, I again would place it somewhere between the two. Drupal continues to be on top and DSpace in third. In non-technical terms, I am not feeling EPrints yet, and part of it is that I don't have an example library using EPrints that I have really liked. The listing on the ROAR site of repositories using EPrints was large and the examples I looked at were interesting, but nothing inspired me. I have not spent as much time with EPrints as I have with the other two, so I will be curious to see whether my opinion changes at the end of these two weeks.

Thursday, October 14, 2010

Week 7

I have had a week of discouragements with regard to the DSpace installation, as my late posts can attest. My beloved laptop has had sudden battery problems. The phone company has done a lot of repair work in my neighborhood after a big storm, which has meant random internet blackout periods. And I am missing something about DSpace. Intellectually I understand the hierarchical nature of the DSpace setup, but it is not a natural fit. When I try to apply those principles of organization to my digital collection, it does not work well. I struggled to pick a collection at the beginning of the semester and still feel that it is not well formed. My conceptualization is not very firm, as I don’t have any practical experience to draw on. In previous posts I have been quite critical of institutions choosing dull or not very relevant collections for digital projects, calling it a cop-out. I still think that, but I have a new appreciation for the difficulties inherent in the process.

On a more positive note, the readings for Unit 7 were both very interesting. I appreciate the perspective the Stanford authors laid out on the successes and failures of a major digital repository. It gave a new sense of the speed of change the digital preservation community is experiencing. The Johns reading about the context of repository software design gave insight into the root causes of differences between systems. The Greenstone open-source system is one I find particularly interesting because of its focus on multilingualism. The New Zealand Digital Library Project and the University of Waikato developed the project, and a partnership with UNESCO has helped make it an international community. A phrase from the website has particular resonance: increasing the “awareness of the social implications of information technology”. This is brought home by the use of Greenstone for bilingual digital collections for minority languages at risk of extinction: the New Zealand project has included Maori, and there are others in Welsh, Kazakh, Hawaiian and more. It pleases me to see a digital library have two preservation roles to play. The Greenstone project also focuses on its interoperability with OAI-PMH and METS. It is also able to import and export collections with DSpace. How that works is something I will be interested in exploring further.

Tuesday, October 5, 2010

Week 6

I had an unnervingly easy install of DSpace. I did not understand all the steps in the process, but I understood a fair number of them, which means I would feel confident contributing to a discussion about DSpace but would not make decisions without a systems librarian. What is funny is that I had difficulties downloading the WRF for the tutorials, which I am still trying to sort out. I was able to start the set-up of the site as administrator and uploaded a collection piece, but I will want to spend a lot more time experimenting before I make any final decision about whether it is the best fit for my collection project.