Abstract

Scientific investigation produces significant amounts of rich multimedia digital data. Digital libraries provide the ability to collect resources that were never part of traditional library collections, such as materials microstructures. However, for these resources to be included in digital libraries, end users acting as resource producers, such as bench scientists and science students, need to be able to record data, describe resources using appropriate scientific terminology, and tag them with metadata for better retrieval. New opportunities for collection development would arise by bringing national and international initiatives that bridge heterogeneous information streams through standards more directly to end users as creators of information. This poster presents preliminary findings on materials science researchers, both experts and novices, as creators of information using Dublin Core (DC) metadata to provide access to information at an early stage of the scientific process. Because DC has been specifically designed to address the problems of resource discovery on the Internet (Hillman, 2001), it is expected to provide materials science researchers with a relatively quick and easy means of producing fundamental metadata elements. DC is a general approach that can be used with all formats across domains. It has been designed to be simple enough that someone who is not an expert in cataloguing can independently create and maintain metadata records. The long-term goals of this investigation are to study three fundamental questions: (1) Is it feasible to use DC to describe laboratory data (i.e., does it adequately capture necessary information)? (2) How can the individual researcher best manage the information generated in the laboratory? (3) How can the information best be disseminated across different disciplines and institutions beyond the laboratory?
Answering the first question would necessitate examination of other metadata alternatives if DC were proven incapable of capturing essential descriptive information or supporting important functions. While DC is intended to promote cross-domain discovery and accessibility of networked information, it is recognized that more precise descriptions will often be required within different information communities (Hillman, 2001). To allow for extensibility, a modular approach may be implemented in which DC is used in conjunction with other schemas. Materials Markup Language (MatML) is one currently available means of associating very detailed descriptive information with materials property data (Begley, 2003). Answering the second and third questions would encourage development of information management tools for the end user to support author-generated submissions in digital libraries. The sciences face a significant information management dilemma (Greenberg, Sutton, & Campbell, 2003) resulting from the quantity, complexity, and diversity of scientific information resources currently being generated. One component of this considerable problem is the initial encoding of preliminary laboratory data by bench scientists and science students, acting as information creators, for use in information management and retrieval systems. Specific information must be described, organized, and disseminated quickly and easily in a systematic way so that researchers, educators, and students across the disciplines in the science community can have access to it. Tagging individual units of laboratory data as they are generated should ultimately allow individual researchers or members of interdisciplinary research teams to retrieve more meaningful information on different aspects of work in progress. Such enhanced capability should improve discovery, prevent duplication of effort, and increase productivity.
Using a metadata standard such as DC is also expected to allow other interested parties, such as researchers, educators, and students, to readily access an approved subset of the material. Furthermore, immediate treatment of lab output is expected to improve the organization and reusability of the data, and to facilitate the process of including items in digital libraries in the future. In this poster, we report qualitative results obtained during the beta testing of a metadata/markup generating/editing tool used to facilitate author-generated submissions to a materials microstructures repository. “The microstructure of a material is an image of the (usually) complex ensemble of polycrystalline grains, second phases, cracks, pores, and other features occurring on length scales large compared to atomic sizes” (Carter, Langer, & Fuller, 2002, ¶ 1). Authors contributing to the repository will include research scientists in the field (experts) at the National Institute of Standards and Technology as well as undergraduate and graduate materials science students in the classroom (novices). Recommendations of the participants regarding the functioning of the tool and the usefulness of the resulting output will be used to refine plans to use the tool. The novices will be drawn from the University of Michigan's “Computational Nanoscience of Soft Matter” course. The goal of the course is to introduce students to new approaches to materials design and fabrication through cutting-edge, simulation-based research in nanoscience and nanotechnology. The course is taught in a lecture/lab format with the class assigned in teams of two and three workstations.
The course introduces concepts of scientific computation, provides students with the background and skills to critically read and evaluate materials simulation literature, and gives students a fundamental understanding of the simulation methods needed to use commercial materials simulation software packages intelligently and appropriately, as well as to develop their own codes. Each nanostructure is characterized by a unique set of information that must accompany the nanostructure for it to be useful to a student, materials scientist, or engineer. Fully describing a particular simulation requires information on the specific simulation method and parameters, for example: force fields between nanoparticles, polymers, and solvent; approximations; geometric parameters of the nanoparticle and polymer; temperature and other thermodynamic variables; volume fraction of each species; initial condition; and cooling schedule and equilibration time. Historically, there has been no mechanism for conveniently and efficiently tagging nanostructures with these data and storing them in a logical organizational scheme that would allow for intelligent sorting and access. Providing students and research scientists with a practical means of contributing to the repository is expected to remedy this problem, allowing them to make their work available in both research and education settings. Since materials scientists interact closely with the fields of biology, chemistry, mathematics, and physics, the participants will also be able to integrate the digital output of their work into other research. Novices and experts will store, describe, and tag the microstructures generated in their respective laboratories using the metadata/markup generating/editing tool. Digital output from computer simulations relating to the modeling of a range of materials will be included. Both groups will use research and commercial simulation software.
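As an illustration of the modular approach described above, the following minimal sketch shows how a simulated microstructure might be tagged with DC elements for general discovery alongside domain-specific simulation parameters. It is not the metadata/markup tool reported in this poster. The DC namespace is the standard Dublin Core element set namespace; the `sim` namespace, the `build_record` helper, the `record` wrapper element, and all field values are hypothetical placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace for simulation parameters (assumption, not a real schema).
SIM_NS = "http://example.org/sim-params#"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("sim", SIM_NS)


def build_record(dc_fields, sim_params):
    """Build a <record> element holding DC elements plus simulation parameters."""
    record = ET.Element("record")
    # General-purpose discovery metadata, one DC element per field.
    for name, value in dc_fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    # Domain-specific detail kept in a separate namespace (modular extension).
    for name, value in sim_params.items():
        ET.SubElement(record, f"{{{SIM_NS}}}{name}").text = str(value)
    return record


# All values below are placeholders, not data from the study.
record = build_record(
    dc_fields={
        "title": "Simulated nanoparticle-polymer microstructure (placeholder)",
        "creator": "Student team (placeholder)",
        "subject": "materials microstructure; nanostructure; simulation",
        "type": "Image",
        "format": "image/png",
    },
    sim_params={
        "method": "molecular dynamics",
        "temperature": 1.0,
        "volumeFraction": 0.2,
    },
)

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the simulation parameters in their own namespace leaves the DC elements usable by any general-purpose harvester, while a more detailed schema such as MatML could take the place of the placeholder namespace where richer description is needed.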
The metadata will include processing and statistical information to characterize the microstructures as materials scientists would, in order to increase discovery and reusability of the data by the scientific community. In conclusion, large amounts of data will be tagged using the metadata/markup generating/editing tool. Feedback from experts will enable refinements to the prototype tool. The project will compare author-generated metadata describing materials microstructures by experts and novices. The essential questions to be examined are: (1) How effective is DC for general discovery and retrieval of laboratory data? (2) How well do authors with domain knowledge and no cataloguing training provide initial description of materials property information? This material is based upon work sponsored by the National Science Foundation National Science Digital Library Program under Award Number DUE 0121545 and the National Institute of Standards and Technology 70NANB1H0005. We would like to thank Professor Sharon C. Glotzer, University of Michigan, for her invaluable guidance and contributions to the project.
