
Wisconsin Public Television’s Media Library Database

How we use PBCore

In our production facility we use the PBCore metadata standard in our media library database. The database includes completed programs, masters of story segments, and b-roll that we’ve decided to keep because it is historically relevant, generic enough to be reused in other stories, or likely to support future revisits to a particular story.

The most detailed metadata is added to the assets at the end of the workflow, when the tape is given to the media library upon completion of the program. However, metadata is also added at other points in the lifecycle of the asset. For example, producers add important metadata (like title, creator, formatID, date created, aspect ratio, contributor, plus local fields) when they return with a tape from the field.

We’ve been experimenting with adding metadata to files that stay on the Avid Interplay in the Catalogs folder. This is b-roll that was previously acquired and includes footage of manufacturing, education, state legislature, etc.—material that will definitely be used again and will need quick accessibility. This metadata is valuable but not a complete PBCore record.

The software for our media library database is free and available at http://scout.wisc.edu/Projects/CWIS/. Setting up the database to include all the PBCore fields and local fields was not difficult. While PBCore now allows multiple instantiations to be added to one record, this software does not. Nor does it allow low-resolution viewing copies of the asset to be added. We are researching a replacement for CWIS so that we can incorporate PBCore’s new attributes more fully.

Why we use PBCore
PBCore provides a thorough and detailed metadata standard for cataloging assets and improving findability in our online database (http://wptmedialibrary.wisc.edu). This database is meant to:

  1. Serve in-house needs by providing production staff access to content in our archive
  2. Provide a deeper level of intellectual control of our assets
  3. Reflect the totality of our archival holdings (eventually)

PBCore metadata from this database can be exported in PBCore XML and shared with other databases.

Contact Information

Ann Wilkens
Wisconsin Public Television
608-263-4516
wilkens@wpt.org

NDIIPP’s Model Digital Video Preservation Repository

The prototype Preservation Repository (PR) at New York University was built as a major component of the Library of Congress-funded project Preserving Digital Public Television.

A partnership between WNET-TV, WGBH-TV, the Public Broadcasting Service, and NYU, the Preserving Digital Public Television project had several goals:

  • Build a model preservation repository for “born-digital” public television programs
  • Examine operating issues related to content selection, costs, and access
  • Promote system-wide support for digital preservation

The prototype Preservation Repository was developed at New York University by the Digital Library Technology Services team between 2006 and 2009. Focus was on submitting a selection of video files, ingesting them into the repository, and then retrieving them successfully.

The repository was built on DSpace and the operations were based on an ISO standard known as the “Reference Model for an Open Archival Information System (OAIS)” and commonly referred to as the “OAIS Reference Model”. The model is generic in terms of content – it describes the general data flow of files in and out of an OAIS and provides a vocabulary for discussing digital preservation repository concepts and a framework for structuring repository functions.

How we’re using PBCore

This framework is being adopted by many media archives for long-term storage. Using this model, files are packaged for input as Submission Information Packages (SIPs), then stored as Archival Information Packages (AIPs). Particular attention was given to determining the most appropriate metadata. PBCore was chosen as the primary descriptive and technical metadata schema for program files being packaged into an Archival Information Package. (Additional metadata was added to the AIP using other schema, including METS, MODS and PREMIS.)

A sample of over 80 hours of program files was submitted to the repository, including high-resolution (production quality) program masters from WNET and WGBH, and lower-resolution (broadcast quality) distribution versions of the same programs from PBS. These included samples from the national productions Nova, Frontline, Religion & Ethics Newsweekly and other programs. Although not a large collection, this allowed the repository to organize and manage a wide range of program file encoding formats, with different wrappers and metadata.

Each AIP contained multiple files and was required to have:

  • At least one Master Program file
  • A PBCore file containing descriptive and technical metadata
  • A PREMIS file containing creating application and rendering environment information
  • Plus the files specified by the core AIP:
    • A METS file containing structural metadata
    • A MODS file containing descriptive metadata
    • And a METSRights file containing rights information
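As an illustration only, the required contents listed above could be checked with a small script like the following. This is a minimal sketch, not the project's code: the file-name suffixes and the master-file extensions are hypothetical conventions chosen for the example, since the source does not say how files inside an AIP were named.

```python
from pathlib import Path

# Hypothetical naming conventions (assumptions, not taken from the project).
REQUIRED_SUFFIXES = {
    "PBCore": "_pbcore.xml",
    "PREMIS": "_premis.xml",
    "METS": "_mets.xml",
    "MODS": "_mods.xml",
    "METSRights": "_metsrights.xml",
}
MASTER_EXTENSIONS = {".mxf", ".mov"}  # assumed wrappers for master program files


def missing_aip_components(aip_dir):
    """Return a sorted list of required component labels absent from aip_dir."""
    names = [p.name.lower() for p in Path(aip_dir).iterdir() if p.is_file()]
    missing = [
        label
        for label, suffix in REQUIRED_SUFFIXES.items()
        if not any(name.endswith(suffix) for name in names)
    ]
    # At least one master program file is required.
    if not any(Path(name).suffix in MASTER_EXTENSIONS for name in names):
        missing.append("Master Program file")
    return sorted(missing)
```

An empty list means every required component was found; anything else names what still has to be supplied before the package is complete.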

OAIS Repository Functions

SIP = Submission Information Package – configuration of files going into the repository
AIP = Archival Information Package – configuration of files stored in the repository
DIP = Dissemination Information Package – configuration of files when they leave the repository to be used.

Steps to adopting PBCore

All AIPs had to contain a PBCore document. However, because program information is not collected systematically or uniformly by public television entities, the PBCore data had to be collected from multiple sources on a program-by-program basis. Also, because it was not standardized, the quality of the incoming metadata was idiosyncratic and inconsistent. The PBCore records were drawn from:

  • WNET: XML exports from the InMagic database and PBCore exports from ProTrack
  • WGBH: XML exports from the TEAMS database
  • PBS: PODS data submitted as PBCore

In order to generate the final PBCore files, the collected metadata had to be analyzed and mapped against the PBCore schema. Then software was developed to extract the metadata from the various files and insert it into a final PBCore document. Detailed descriptions of the OAIS framework and the selected metadata schemas, including XML, can be found at:

http://www.thirteen.org/ptvdigitalarchive/files/2010/03/PDPTV_ReposDesign_2010-03-19.pdf

The Mapping charts can be found in Appendices 6 – 8.
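To give a rough idea of what such a mapping step involves, here is a minimal sketch of a field crosswalk. It is not the software developed for the project. The source field names (ProgramTitle, Synopsis, AirDate) are invented for the example, and the output is not validated against the PBCore schema (for instance, it omits the identifier a complete record would carry).

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: (source field, PBCore element), in output order.
CROSSWALK = [
    ("AirDate", "pbcoreAssetDate"),
    ("ProgramTitle", "pbcoreTitle"),
    ("Synopsis", "pbcoreDescription"),
]


def to_pbcore(source_record):
    """Map a dict of source fields to a PBCore-style element.

    Returns (element, unmapped) where unmapped lists the source fields
    that the crosswalk does not cover, so they can be reviewed by hand.
    """
    doc = ET.Element("pbcoreDescriptionDocument")
    for source_field, pbcore_element in CROSSWALK:
        value = (source_record.get(source_field) or "").strip()
        if value:  # skip empty or missing values
            ET.SubElement(doc, pbcore_element).text = value
    mapped = {source_field for source_field, _ in CROSSWALK}
    unmapped = sorted(set(source_record) - mapped)
    return doc, unmapped
```

Returning the unmapped fields alongside the document is a deliberate choice here: when incoming metadata is inconsistent from one source to the next, it helps to see what the crosswalk silently left behind.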

The difficulty of collecting, processing and manually entering data into PBCore demonstrated how important it is to set uniform metadata and technical standards. Without them, it will simply not be feasible to automate the functions that extract and manage the metadata and file integrity of large collections.

Why we’re using PBCore

At the same time, the repository showed that, when PBCore metadata records are filled in, using PBCore is scalable and feasible for public broadcasters. Used this way, PBCore lets producers easily share data with any third party capable of interpreting it, e.g., repositories, other stations, or a wide range of other users.

Contact

Nan Rubin
Project Director – Preserving Digital Public Television
Community Media Services
4700 Broadway #2J
New York, NY 10040
212-569-3391
nanrubin@gmail.com

Illinois Public Media

How we use PBCore

In 2009 Illinois Public Media was one of 22 stations participating in the American Archive Pilot Project (AAPP). The AAPP provided funding to discover, digitize, and catalog public radio and TV content on the Civil Rights Movement produced from 1954 to 1975, and World War Two-related content produced in relation to the Ken Burns documentary The War. Illinois Public Media found in its archives some 270 hours of content in these two categories. After digitizing the analog audio and video materials, Project Director Jack Brighton built a cataloging tool based on the PBCore 1.2 schema using IPM’s website Content Management System, ExpressionEngine. EE is based on open source technologies (Linux, Apache, MySQL, PHP) and is extensible via third-party add-ons. Descriptive metadata was entered into the CMS by catalogers after viewing and listening to the content. Technical metadata was extracted automatically from the proxy digital media files, including file size, MIME type, bit rate, duration, and other elements contained in PBCore’s Instantiation-level records. The result was a complete catalog of all materials which was easily published as a public website, and as PBCore XML records.
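As a rough illustration of automatic technical metadata extraction, the sketch below reads two of the values mentioned above, file size and MIME type, using only the Python standard library. It is not the ExpressionEngine tool described here (which is PHP-based), and the dictionary keys are illustrative names rather than PBCore element names. Bit rate and duration are left out because reading them requires a media-aware parser.

```python
import mimetypes
from pathlib import Path


def basic_technical_metadata(path):
    """Return file size and a guessed MIME type for a media file.

    The MIME type is guessed from the file extension only; the file
    contents are not inspected.
    """
    p = Path(path)
    mime_type, _encoding = mimetypes.guess_type(p.name)
    return {
        "file_size_bytes": p.stat().st_size,
        # Fall back to a generic type when the extension is unknown.
        "mime_type": mime_type or "application/octet-stream",
    }
```

Values gathered this way could then be placed into the instantiation-level part of a record, so that catalogers only need to supply the descriptive fields.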

Both the html and xml expressions of Illinois Public Media’s AAPP materials are available here.

Why we use PBCore

Brighton says the purpose of creating the PBCore XML records was to allow automatic exchange of all metadata from Illinois Public Media’s CMS to the American Archive Pilot Project portal. To facilitate this exchange via HTTP, he created a collection “wrapper element” around the several hundred PBCore records he created. The AAPP portal built by Oregon Public Broadcasting was able to simply ingest IPM’s PBCore collection. Metadata was entered once into the IPM system, and PBCore served as an exchange format with the AAPP system.

A blog posting with more technical details on this case study is available here.

How PBCore changed our workflow

At the time of the AAPP, a root-level PBCoreCollection element wasn’t part of the PBCore schema. With the release of PBCore 2.0, this method of wrapping many PBCoreDescriptionDocuments in one collection element is valid and officially supported. Illinois Public Media is now developing the next generation of its PBCore cataloging tool based on the PBCore 2.0 schema, and is making this PBCore tool the center of its workflow for media producers. As they add content to the IPM website, producers are in essence (and without knowing anything about PBCore) cataloging each media asset based on the PBCore standard.
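The collection wrapper can be pictured with a short sketch. It assumes the PBCore 2.0 schema spells the element names in lower camel case (pbcoreCollection, pbcoreDescriptionDocument) and uses the namespace URI shown in the code. Both are stated from memory of the schema rather than taken from this case study, and the records built here are far from complete PBCore documents.

```python
import xml.etree.ElementTree as ET

# Assumed PBCore 2.0 namespace URI; check against the published schema.
PBCORE_NS = "http://www.pbcore.org/PBCore/PBCoreNamespace.html"


def make_document(title):
    """Build a deliberately tiny description document with only a title."""
    doc = ET.Element(f"{{{PBCORE_NS}}}pbcoreDescriptionDocument")
    ET.SubElement(doc, f"{{{PBCORE_NS}}}pbcoreTitle").text = title
    return doc


def wrap_in_collection(documents):
    """Wrap description documents in a single root collection element."""
    # Serialize with PBCore as the default namespace (no prefix).
    ET.register_namespace("", PBCORE_NS)
    collection = ET.Element(f"{{{PBCORE_NS}}}pbcoreCollection")
    for doc in documents:
        collection.append(doc)
    return collection
```

Calling ET.tostring(wrap_in_collection(docs), encoding="unicode") then yields one XML document that a receiving system can ingest in a single request, rather than several hundred separate files.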

Contact

Jack Brighton
Director of New Media & Innovation
Illinois Public Media
217-333-7300
jackb@illinois.edu

Broadway Video Digital Media

How we use PBCore

Broadway Video Digital Media (BVDM) used PBCore as part of their work for the Department of Transportation’s (DOT) Green Light for Midtown campaign. The primary goal of the project was to produce an edited video that could be used as a promotional tool for the campaign. The final production was approximately five minutes long, and can currently be viewed at the DOT’s website.

A secondary goal of the project was to provide the DOT with searchable access to the unedited footage. PBCore was used as the basis for their Library Access Platform’s data model.

Why we use PBCore

Because PBCore uses language and terminology appropriate to both the technical user and the non-broadcast users at the DOT, it fit their needs perfectly. The Library Access Platform allows users to search on PBCore-based terms and view proxies of the raw footage.

Contact

Dirk Van Dall
dvandall@broadwayvideo.com

WGBH Digital Asset Management

Why we use PBCore

The Media Library and Archives and Applied Technology departments helped inform the original metadata model that would become PBCore. For this reason, PBCore aligns well with our existing systems and we have worked to keep them PBCore compliant. Using PBCore allows us to interchange metadata easily with our own internal systems and with external initiatives such as the American Archive.

How we use PBCore

WGBH uses the Open Text Media Management (formerly Artesia) Digital Asset Management platform. All of the fields within the current DAM metadata model are mapped to PBCore. WGBH productions are required to deliver a PBCore-compliant FileMaker database describing the assets they are depositing into the archive, whether physical or digital. In the case of digital assets, the Media Library and Archives, working with WGBH productions, ingests the files into the DAM and retrieves the metadata from the delivered FileMaker databases.

As of December 2010, the WGBH DAM holds over 157,000 media files, each of which is described using PBCore-compliant fields.

Contact

Karen Colbron
Digital Archives Manager
617-300-3924
karen_colbron@wgbh.org

WGBH Open Vault

Why we use PBCore

The WGBH Media Library and Archives needed a format for exchange that would be sufficient to describe our multimedia assets yet not overly complex.

How we use PBCore

WGBH Interactive developed Open Vault. Open Vault uses the Apache Solr search engine, which indexes the PBCore fields and enables the faceted search and browse features available on Open Vault.
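For readers unfamiliar with faceting, the sketch below builds the kind of request a faceted search sends to Solr. The parameters q, fq, facet, facet.field and wt are standard Solr query parameters; the core URL and the field names (subject, genre) are made up for the example and do not describe Open Vault's actual index.

```python
from urllib.parse import urlencode


def faceted_search_url(base_url, text, facet_fields, filters=None):
    """Build a Solr select URL that returns hits plus facet counts.

    facet_fields: index fields to count values for (hypothetical names).
    filters: optional {field: value} pairs applied as filter queries,
             e.g. after a user clicks a facet value.
    """
    params = [("q", text), ("wt", "json"), ("facet", "true")]
    params += [("facet.field", field) for field in facet_fields]
    params += [
        ("fq", f'{field}:"{value}"') for field, value in (filters or {}).items()
    ]
    return f"{base_url}/select?{urlencode(params)}"
```

For example, faceted_search_url("http://localhost:8983/solr/example", "civil rights", ["subject", "genre"], {"genre": "Interview"}) asks for matching records restricted to one genre, along with counts for every subject and genre value among the results.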

While PBCore gives us a strong core to exchange data between our systems, we have added some private extensions to support our particular workflow needs.

Contact

Chris Beer
Web Developer
617-300-3769
chris_beer@wgbh.org

Secure Media Network: Dance Heritage Coalition & BAVC

How we use PBCore

The Secure Media Network is a collaborative project of the Dance Heritage Coalition and Bay Area Video Coalition. It starts with a union catalog of records from submitting dance archives and builds on that by integrating a digital repository (presently in a testbed phase) and a web interface for access.

The union catalog captures descriptive metadata from different submitting organizations (participating members of the Dance Heritage Coalition). To gather and organize all this complex and disparately located information, we decided to use METS as an XML structure to encapsulate the collective metadata, with PBCore used for descriptive and technical metadata, PREMIS used for administrative and preservation metadata, and METSRights to document any known rights information.

Using the OAIS (Open Archival Information System) model, BAVC receives descriptive metadata and physical videotapes from submitting institutions. This metadata is then put through a mapping tool which results in a PBCore record that is then put up on the web site. So far the catalog contains 26,000 records from 5 member institutions.

Why we use PBCore

PBCore was chosen as a standard because it allows for several facets of a relatively complex cataloging process to become “in line” in a relatively smooth way. There is a need for our project, the Secure Media Network, to be able to map from a variety of standards, and PBCore allows that mapping to happen by providing a breadth of fields needed to do so.

We anticipate that the release of PBCore 2.0 will help us in further developing the repository, in particular supporting multi-part instantiations, addressing the need to show relationships between instantiations, and addressing the need to be able to “bundle” PBCore XML records in order to export and import large groups of records between systems (since many of our organizations are geographically separated).

Contact

Lauren Sorensen
Preservation Technician
Bay Area Video Coalition
415-558-2130
laurens@bavc.org

New York Public Radio Archives (WNYC)

How we use PBCore

The New York Public Radio Archives encompasses collections from WNYC, WQXR, and the broadcasts from NYPR’s performance venue, The Jerome L. Greene Performance Space. Our catalog currently lists more than 44,000 assets with nearly 80,000 instantiations, covering a wide range of analog and digital audio formats, from lacquer discs to WAV files. The catalog also includes HD and DV CAM digital video tapes of material streamed at both WNYC.org and WQXR.org. In the near future we expect to begin adding both program guides (dating back to the 1930s) and photographs (as early as 1924) to the PBCore-compliant catalog.

Why we use PBCore

Our current SQL catalog database is an outgrowth of an older MS Access database. The MS Access database, although not fully PBCore-compliant, was already informed to some extent by PBCore. PBCore gives us a cataloging standard that was built specifically for the public broadcasting community, and that is documented well enough and used broadly enough to facilitate future exchanges and updates to our catalog, particularly within that community. Because PBCore is a web-based standard, the NYPR archives staff expects that it will also easily allow for the addition of more sophisticated features such as streaming, meta-tagging or “automated” cataloging. We look forward to exploring this further. We also believe that PBCore is a significant improvement over other, more print-oriented metadata schemas and will go a long way toward addressing the needs of growing multimedia organizations.

Contact

Andy Lanset
Director of Archives
New York Public Radio
646.829.4381
alanset@nypublicradio.org

WITNESS’s video catalog

How we use PBCore

WITNESS is a non-profit organization that collaborates with human rights groups around the world to co-produce and distribute videos that advance human rights causes. We maintain a Media Archive to collect, preserve and provide access to the audiovisual recordings made by WITNESS and our partners to support advocacy and the prosecution of justice, and for truth-telling and the historical record.

We use PBCore as a format to share information from our extensive video catalog, which is built with PBCore’s data structure in mind. Currently, we use PBCore to submit descriptive as well as some technical and structural metadata to our long-term preservation repository at the University of Texas Libraries. Along with a video master, a typical deposit includes a PBCore record, a MODS record and a MediaInfo report, packaged in BagIt format.
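To show what packaging in BagIt format amounts to, here is a minimal sketch that lays out a bag using only the Python standard library. It is not the WITNESS workflow: the BagIt version, the choice of SHA-256, and the function names are assumptions for the example, and a real deposit would more likely rely on an established BagIt tool that also writes optional tag files and validates the result.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large video masters fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(bag_dir, payload_files):
    """Copy payload files into bag_dir/data and write the required tag files."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True)
    manifest_lines = []
    for source in map(Path, payload_files):
        destination = data / source.name
        shutil.copy2(source, destination)
        # Manifest line: checksum, whitespace, path relative to the bag root.
        manifest_lines.append(f"{sha256_of(destination)}  data/{source.name}")
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag
```

A call such as make_bag("deposit_bag", ["master.mov", "record_pbcore.xml", "record_mods.xml", "mediainfo.txt"]) (file names hypothetical) would produce a directory in which the receiving repository can recompute each checksum and confirm that nothing changed in transit.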

We are also planning to use PBCore to exchange data between our video catalog and the new WITNESS website, built on Drupal. Data from the PBCore record will be imported to populate the Drupal database, and users will be able to search and view catalog records and video on our site. The structure and consistency of our data will allow us to implement user-friendly features, such as faceted browse, on the user interface.

Why we use PBCore

We use PBCore because its structure supports description relevant to our production-centered video archive, and because there is an active community of users that we can consult.

Contact

Yvonne Ng
Archivist
yvonne@witness.org