Monday, November 15, 2010

Comments- Week 11

http://acovel.blogspot.com/2010/11/unit-11-reading-notes.html?showComment=1289876001155#c6378281774192061552


http://pittlis2600.blogspot.com/2010/11/week-eleven-reading-notes.html?showComment=1289932658597#c6253541805544147950

Reading Notes- Week 11 (reposted)

Reading Notes- Week 11- I'm reposting the notes so my blog is somewhat in order.

1) David Hawking, Web Search Engines: Part 1 and Part 2, IEEE Computer, June 2006.

I found this article on search engines informative. I hadn’t really thought about the vast amount of space a search engine needs in order to be efficient. I also found it interesting how many different components a search engine needs in order to work. For example, a politeness delay is used so that the crawler does not send too many requests to any one web server at a time. When the article discussed duplicates, it mentioned that “sophisticated” methods were needed in some cases. Given how many duplicates turn up in a typical search, I wonder whether these methods are not being used or are simply not available yet. The second part of the article was more confusing to me. I didn’t completely understand how the search engine knows which documents to skip, or how it numbers different documents.
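As a rough illustration of the politeness delay idea, here is a minimal Python sketch. It is not taken from the article: the class name PolitenessScheduler, the two-second default, and the per-host bookkeeping are all my own assumptions about how such a delay could be tracked.

```python
import time
from urllib.parse import urlparse


class PolitenessScheduler:
    """Hypothetical helper: tracks when each host was last requested."""

    def __init__(self, delay_seconds=2.0, clock=time.monotonic):
        self.delay_seconds = delay_seconds  # assumed delay, not from the article
        self.clock = clock                  # injectable so it can be tested
        self.last_request = {}              # host -> time of the last request

    def wait_time(self, url):
        """Return how many seconds the crawler should still wait for this host."""
        host = urlparse(url).netloc
        last = self.last_request.get(host)
        if last is None:
            return 0.0  # never visited, so no need to wait
        elapsed = self.clock() - last
        return max(0.0, self.delay_seconds - elapsed)

    def record_request(self, url):
        """Remember that a request was just sent to this URL's host."""
        self.last_request[urlparse(url).netloc] = self.clock()
```

A crawler using this would call wait_time before each fetch, sleep for that long, and then call record_request. Requests to different hosts do not delay each other, which is how a crawler can stay polite to each server and still be fast overall.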

2) Shreeves, S. L., Habing, T. O., Hagedorn, K., & Young, J. A. (2005). Current developments and future trends for the OAI protocol for metadata harvesting. Library Trends, 53(4), 576-589.

This article discussed the Open Archives Initiative (OAI). This initiative works towards creating metadata standards that can be used universally. The creators of the initiative had hoped that people would use the standards and implement others alongside them. The Open Language Archives Community is one such community that has extended the standards it uses beyond OAI. The article discusses current developments, issues, and future developments for the OAI community. The issue of metadata formats made a lot of sense to me, since the more formats there are, the further things get from a single standard.
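To get a feel for what a harvesting request looks like, here is a small Python sketch that only builds the request URL and does not contact any server. The verb ListRecords and the metadata prefix oai_dc come from the OAI-PMH protocol; the function name and the example.org base URL are made up for illustration.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional: only ask for records changed since this date (YYYY-MM-DD).
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# example.org is a placeholder, not a real repository.
print(build_list_records_url("http://example.org/oai", from_date="2010-11-01"))
```

Changing metadata_prefix is where the format issue shows up: a harvester has to know which formats each repository supports beyond the basic one.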

3) Michael K. Bergman, “The Deep Web: Surfacing Hidden Value”

The deep web consists of all the web pages that cannot be accessed by “traditional” search engines. I was surprised by the statistic: “Eighty-five percent of Web users use search engines to find needed information.” What do the other 15 percent use to find information? I would have thought that everyone used search engines. I was also surprised that many deep websites are visited more often than some surface websites. I would have thought that a larger amount of traffic would make a site a surface site.

Muddiest Point- 11/15

I have no muddiest point for this week's lecture.