
Articles by Patrick Schmitz


Delphi 1.2 at PAHMA is out

Friday, July 17th, 2009 by Patrick Schmitz

Here’s what Michael Black, research and IT director of the Hearst Museum, said about it:

Hi everyone,

This is just a quick announcement, as fuller information should be upcoming in a campus press release.

Delphi 1.2 (the updated version of the Museum’s collections exploration and discovery tool) is now live and online.

In addition to the features released in version 1.1 two weeks ago — the ability to share sets with other people (whether or not they’re Delphi users), greatly improved ontologies (’concept trees’) for automatic object classification, vastly enlarged object data (thanks to the efforts of dedicated volunteers doing data entry on more than 140,000 objects), and the ability to view scans of catalog cards for the objects you find — Delphi 1.2 presents a couple of new user-oriented features.

Delphi 1.2 now fully supports user tagging of objects, including being able to search on either your own tags or across all tags submitted by the entire user community.  Starting with this release, the blue “tongue” will change the content it displays according to the experience level of the user.  For new users, a basic introductory text is displayed, while for more experienced users (here defined as those who have at least played around with the sets and/or tagging features), the displayed text is more of a “what’s new in Delphi” news item.

I invite you to try it out, to share it with family, friends, colleagues, etc.

http://pahma.berkeley.edu/delphi
Michael

Big Data issue of Nature: uneven, but worth reading

Thursday, September 11th, 2008 by Patrick Schmitz

The topic of Big Data and the associated trends for research are part of our future here at DS. The recent issue of Nature looks at issues and trends around the topic and, while uneven, has some good material in it that folks should check out. Here’s my blow-by-blow on the sections:

The opening editorial calls for a push to make annotating data a major component of research and of grants. Sound familiar? Let’s hope funders listen.

The section on the next Google trots out a lot of familiar and frankly pretty dull options. Skip it.

Big data: Data wrangling poses an important question about data collection. We might have the sense that there is so much data, it is just a matter of managing it. However, David Goldston notes that there are also huge holes in the dataverse, and these are a result of political policy. Further, if a political entity controls the data, politics can (and will) shape and filter the data in far-reaching ways.

Cory Doctorow’s Gee whiz piece is irritating (unless you’re into technoporn), and is easy to skip.

A piece on wikiomics is an excellent description of how community can make a difference, and the social dynamics of a collaboratory.

Cliff Lynch has a good piece on what data production projects must do to rationalize their data management, and what services groups like IST/DS must provide to support these projects.

Frankel & Reid present an interesting discussion of mining and visualization, and include a compelling, cautionary note:

“The ingrained habits of highly trained scientists make them rarely as adventurous as these young minds. We think we are on the path to insight when shading reveals contours in 3D renderings, or when bursts of red appear on heat maps, for example. But the algorithms used to produce the graphics may create illusions or embed assumptions. The human visual system creates in the brain an apparent understanding of what a picture represents, not necessarily a picture of the underlying science. Unless we know all the steps from hypothesis to understanding — by conversing with theorists, experimentalists, instrument and software developers, visualization scientists, graphic artists and cognitive psychologists — we cannot be sure whether a display is accurate or misleading.”

The closing essay is human interest and could be skipped in the interest of time. However, it is short, and like the best human interest stories, is surprising and inspiring.

BECHAMEL project at NCSA combines preservation and semantic services

Friday, August 8th, 2008 by Patrick Schmitz

U. of Illinois is getting a chunk of NDIIPP money to develop their BECHAMEL framework, which identifies semantic vulnerabilities in metadata as a means of supporting digital preservation services. What does this mean? Here’s a good quote:

“For example, the meta-data for a digital file—a photo or map or document—might include a field called “creator.” Putting a name like “John Smith” in this field might seem sufficient, but does that really identify the creator of the information? In 50 years will a future researcher be able to pinpoint which of the world’s many “John Smiths” created the information?

BECHAMEL flags risks like that one, or such as numerical values that aren’t accompanied by error ranges.”

There’s only a little more info in the article, but there are some papers on a research page at the UIUC site. David Dubin’s recent paper provides some better details. He describes their earlier BECHAMEL work as “a research environment for proposing and testing theories of the meaning of markup.” It is a Prolog app connected to an RDF store (Kowari, losing favor to Mulgara).

It sounds like some of what they’re doing is to recognize that lots of so-called structured markup (including, in my opinion, lots of RDF) is actually semantic-free and amounts to free text annotations with some weak hints (e.g., “dc:creator”). The question is whether the project will yield useful tools or more guidelines that are unrealistic in deployment. Their near-term goal seems to be the conversion of entity references in free text (e.g., in a dc:creator element) to RDF references to vocabularies. Is a reference to the concept of “San Francisco, CA” in a gazetteer more useful than the same free text? Probably. But will an RDF pointer to a FOAF description of “John Smith” be much more useful than the free text? I doubt it. Nevertheless, a project worth watching.
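
To make the kind of check described in the quote a little more concrete, here is a rough Python sketch. It is not BECHAMEL's code (which, per Dubin, is Prolog over an RDF store), and the field names, the `_error` suffix convention, and the `flag_risks` function are all assumptions made up for illustration. It only flags the two risks the article mentions: a creator given as free text rather than a resolvable identifier, and a numeric value with no accompanying error range.

```python
import re

# Assumption: a creator is "identified" only if the value is an http(s) URI.
URI_PATTERN = re.compile(r"^https?://\S+$")


def flag_risks(record):
    """Return (field, reason) pairs for values a future reader may misread.

    Hypothetical conventions, not taken from BECHAMEL:
    - the creator lives in a field named "creator"
    - the error range for numeric field "x" lives in field "x_error"
    """
    risks = []

    creator = record.get("creator")
    if creator is not None and not URI_PATTERN.match(str(creator)):
        risks.append(("creator", "free-text name, not a resolvable identifier"))

    for field, value in record.items():
        if field.endswith("_error"):
            continue  # error-range fields are not themselves checked
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and f"{field}_error" not in record:
            risks.append((field, "numeric value without an error range"))

    return risks


record = {
    "creator": "John Smith",
    "title": "Map of San Francisco, CA",
    "elevation_m": 52.0,
}

for field, reason in flag_risks(record):
    print(f"{field}: {reason}")
```

Note that a check like this only detects the free-text creator; it does nothing to settle whether pointing at a FOAF description would actually be more useful, which is the open question above.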

Nina Simon on IMLS Meeting on Museums and Libraries in the 21st Century

Monday, July 21st, 2008 by Patrick Schmitz

Nina Simon (who writes the Museum 2.0 blog) recently wrote about her impressions of the IMLS Meeting on Museums and Libraries in the 21st Century that took place last week. The meeting was preliminary to a large report that NAS is commissioning on the subject.  It is an interesting survey of the state of attitudes in the industry, from the perspective of someone who wants to see things move forward.

She includes notes on the six topics that the workshop discussed:

  1. How do you plan for the future?
  2. What are the essential differences and similarities between libraries and museums?
  3. How do you measure and articulate the value of museums and libraries?
  4. How can our expertise and assets be applied towards new ends?
  5. Who owns the stuff? Who controls the experience?
  6. How do we reimagine physical space and assets?

Her general observations:

  1. Some leaders are more radical than I hoped, and these people have a hard time advocating for change when their accountability is to those who have not changed.
  2. Some leaders are more conservative than I feared, and these people are alternately smug and desperate about maintaining their power.
  3. Meetings about the future end up being about the present. We were much less creative and forward-thinking than we could have been. Dream big, share it in the comments, and help this be a more productive study.

Read the post - it is interesting, and a good introduction to that blog, if you do not know it already.

Scholarz.net - an interesting collection of tools for scholarship

Wednesday, July 9th, 2008 by Patrick Schmitz

I just came across the http://scholarz.net/ project, which wants to be a destination for scholars doing research, collecting notes on projects, and sharing information about sources, projects, etc. At this point, it seems to combine basic Wiki functionality (including tagging) with a basic social network app (a la Facebook). Some pieces are kind of nice, but it lacks the workflow integration of Zotero, of which I am very fond. And Zotero is soon to release a more community-oriented version of their tools, which will make it much more powerful. Still, Scholarz.net looks like a project to watch.

