Slide deck from Henry Stewart DAM NY, May 8, 2015.
Theme: Cultural Heritage and DAMs
Presentation: Swimming through the Digital Flood
In the 21st Century, cultural heritage organizations’ reliance on digital content has grown exponentially. It can sometimes seem daunting: how on earth can we organize, catalog and provide access to everything that we create - digitally - day after day, on our limited budgets?
At the Minneapolis Institute of Arts, one of America’s largest fine art museums, we are hard at work building practical capacity for the full range of digital assets created by our staff. Recently, with the support of grant funding, we've developed new conceptual models and a new metadata specification that is driven by standards, which we are actively sharing with the field. Attendees at this session will hear first-hand accounts of how the cultural heritage sector is addressing media production workflows, metadata, cataloguing, DAM systems, multimedia challenges and one of our big “elephants in the room”: how are we to deal with the wave of born-digital contemporary art joining our collections?
While we’ve been making good progress toward effective solutions, the museum - along with the entire sector - continues to face many challenges - we’ll discuss them. If all cultural heritage organizations are becoming, in essence, media production houses, these challenges must be met if we are to sustain and enhance continued audience engagement across the field.
Moderator:
Douglas Hegley, Director of Media and Technology, Minneapolis Institute of Arts
Panelists:
Frances Lloyd-Baynes, Content Database Specialist, Minneapolis Institute of Arts
Joshua Lynn, Digital Media Specialist, Minneapolis Institute of Arts
4. [Diagram: born-digital content flow]
Systems: Art Objects (TMS); Digital Assets (DAMS); Website (CMS), Blogs, Mobile Web; Social Media; ePubs
Content — Created: Digital / Stored: Digital
Goal: Integrate/share all content using open API and/or web services
Goal: Access via web services, integrate with ECM content
DH
5. [Diagram: adds digitally created, analog-stored content]
Systems: Art Objects (TMS); Digital Assets (DAMS); Website (CMS), Blogs, Mobile Web; Social Media; ePubs
Created: Digital / Stored: Analog — Print (floorplans, brochures, etc.); Pubs (magazines, catalogs); Press (PR, marketing, news, etc.); Business Writing (reports, plans, grant apps, policies)
Created: Digital / Stored: Digital — the systems above
Goal: Integrate/share all content using open API and/or web services
Goal: Store in ECM in digital format, with appropriate access permissions in metadata
Goal: Access via web services, integrate with ECM content
DH
6. [Diagram: adds analog-created, analog-stored content]
Created: Analog / Stored: Analog — Archival (library, paper files)
Goal: Digitize as needed (no mass-digitization project); store digital versions in ECM
Plus all the systems, content groupings, and goals shown on the previous two slides.
DH
11. The Swimming Team
Douglas Hegley, Director of Media and Technology
Frances Lloyd-Baynes, Content Database Specialist
Joshua Lynn, Digital Media Specialist
DH
30. GOALS/GUIDING PRINCIPLES
• Maximize Resources: assets are valuable
• COPE: create once, publish everywhere (as needed)
“Get your content ready to go anywhere because it’s going to go everywhere.” – Brad Frost, 2011
DH
36. Our Starting Point
[Diagram] Collections Data (TMS); Digital Asset Mgmt (MediaBin); Web CMS (WordPress); Misc Digital Assets (network drives); Document Mgmt (network drives) — partially connected via API
FLB
57. Open Source or Not?
Open Source Software
• Free download
• No company or corporation behind it
• Community-provided support
• Can be modified – access to code
• Susceptible to security issues
• Requires technical skills on staff
Commercial Software
• License fees
• Maintenance fees
• Company-provided support
• Often cannot be modified
• Less vulnerable (?)
• May require fewer internal technical skills
“When the question of whether or not an organization should use open-source or commercial products arises, the discussion frequently seems to focus on extremes … Ultimately, the question is really one of requirements. Which one will best meet the needs of the organization?” – Michele Chubirka, Information Week, May 2014 (emphasis is ours)
JL
59. Where Are We?
Olympic swimmer Michael Phelps helped develop the swimsuit that used materials tested at NASA Langley. Credit: Speedo
http://phys.org/news/2008-08-olympic-swimmers-shattering-nasa-tested.html
FLB
Surfers ride the waves! Be aware, eyes up, don’t fight it when you can instead enjoy it. Or, of course, you’ve got another option …
For the MIA, we are faced with ever-growing and ever-more-complex content creation systems and workflows. We seek to tie these disparate systems and processes together in order to maximize the value of our assets, including those born and stored digitally …
… and those that have been traditionally stored in analog format …
… and even those that are analog-analog. All assets are valuable.
But we’ve gotten ahead of ourselves a little bit.
Let’s back up a bit, and clarify: What is the MIA?
The Minneapolis Institute of Arts is a 100-year-old museum that attracts over 600,000 visitors per year, who come to experience the art and the museum’s wide array of programs and activities, all drawn from its encyclopedic collection of nearly 90,000 objects spanning the entirety of human history and cultures.
Please allow us to introduce ourselves.
<individual introductions, rapid>
DH mention: we’re here to share a few things, but not to simply deliver a lecture of some sort. We encourage you to jump up and ask questions as you see fit – we’re happy to engage in a dialog. And don’t forget that the dialog continues immediately after our presentation, as we segue into a larger discussion about DAMs in the cultural heritage and non-profit sector.
I’m a firm believer in starting everything with “Why?”
At the MIA, our mission boils down to: we have stuff, we know about it, and we share both.
And what matters MOST is opening the doors and giving the public every opportunity to engage with us.
The museum’s current strategic plan puts Audience Engagement first and foremost – in the upper left-hand corner. It guides every decision we make.
Everything else we do at the MIA is in the service of THIS goal: Happy Visitors! Engaged, excited, inspired, and “attached” to our organizations. We want them to think of us as their “third space” – not home, not work, but a familiar and comfortable place for community, connection, learning and FUN.
At this point in history, museums have “gone digital”. This includes not just content creation – nearly everything is born on a computer – but also: devices in galleries (both BYOD and provided); pervasive onsite wifi; social media; websites (mobile-ready, responsive designs); interactive kiosks and digital screens; a variety of immersive experiences; location and context awareness (e.g. iBeacons); and ‘sensors’ of all kinds (to recognize faces, understand emotions/reactions, etc.). Whew!
Frances – that’s a lot, how on earth do we suppose we’ll manage all of these digital things?
Range of content being created: museums as virtual publishing houses.
Managing the content we’re generating as ‘digital publishing houses’ is daunting.
Important to protect our investments in tools and in people. Strategic decision-making is vital.
Josh – let’s take a look at this range of asset types in a little more detail.
Multiple audio formats, multiple video formats, multiple image formats, multiple document types. Compressed / delivery formats being stored, where we should have full res / uncompressed masters… And the duplication! And the derivatives!
Not only were we swimming in a flood of media assets, our metadata had become noisy, overgrown, dirty – just plain unruly over the years. It’s enough to drive you bonkers.
Oh no, it’s raining hard drives! The digital flood is everywhere! In other words, we have a plethora of unmanaged media assets living here, there and everywhere – from local drives to the cloud. Maybe even in this very PowerPoint!
I arrived at the MIA almost four years ago, and a grant opportunity was brought to my attention almost immediately. Given that we – collectively – sensed this impending tidal wave, we quickly pulled together an application for an Enterprise Content Management project.
We must express deep gratitude to the IMLS for recognizing the importance of the ECM effort and seeing fit to provide project funding.
Next up: we assembled our internal team, and created what we call our project team spiral. At the center is the audience, the Core Team gets the job done, the Steering Committee clears the way, and the Expanded Team and Stakeholders are pulled in as needed. We keep things “lean and agile”, and move fast.
We set our sights on solving a particular problem: We were (and still are) producing reams of digital content, but we had no consistent practice of storing and sharing those assets for wider use. This led to stunted workflows, person-to-person interruptions, staff using the “wrong” assets, and even to the need to completely re-create assets that had recently been produced. We could not afford to continue in this way.
We established a set of high-priority goals/guiding principles: First, content creation is valuable work, we must recognize that officially and treat it as such.
We strive to craft content that is ready for prime time, wherever that might be.
We needed to establish some kind of one stop shop, so that content “consumers” knew where to find assets – without making phone calls!
It had to be easy to access, and easy to use.
And staff who needed files had to be able to get them on their own.
Finally, we wanted to do our best to future-proof our efforts, and avoid locking content/assets into any proprietary formats or systems.
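The COPE principle behind these goals can be sketched as one canonical asset record rendered out to multiple channels. The names and paths below are purely illustrative, not our actual implementation:

```python
# Hypothetical sketch of COPE: one canonical asset record, many channel renditions.
from dataclasses import dataclass

@dataclass
class Asset:
    title: str
    caption: str
    master_uri: str  # full-resolution master; derivatives are made per channel

def render_for_web(asset: Asset) -> dict:
    # Web delivery: lightweight derivative plus display metadata
    return {"title": asset.title, "caption": asset.caption,
            "src": asset.master_uri.replace("masters/", "derivatives/web/")}

def render_for_print(asset: Asset) -> dict:
    # Print delivery: point at the uncompressed master
    return {"title": asset.title, "source_file": asset.master_uri}

a = Asset("Olive Trees", "van Gogh, 1889", "masters/olive_trees.tif")
print(render_for_web(a)["src"])  # derivatives/web/olive_trees.tif
```

The point is that the record is created once; each channel pulls the rendition it needs, rather than staff re-creating content per outlet.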
Frances, please tell us a bit about how we’ve organized our content types during this project.
Our Starting Point: We have
Both physical & digital content being generated and delivered
Goals to provide deep dives into wide range of content / types
Multiple channels of delivery: interactives, web, mobile; in gallery, off-site
AND WE HAVE
Disparate systems to hold and manage content: CIS, DAMs, Web CMS, our network drives
(only) Some connectivity via API (application program interface)
A Mix of material not easily accessible or re-purpose-able (hidden in network drives)
And Some content that’s not well documented (has little metadata; e.g. our website CMS content)
We knew we couldn’t access all the content we are creating and could not provide that deep dive we hoped for.
Grant proposal focused on
Implementing an Enterprise Content Management (ECM) system to store, organize, and make accessible the museum’s vast amount of intellectual property (significantly extending what we have now)
Creation of a unified interface to join up collections-related content for discoverability & delivery
Using an Application Program Interface (API) approach extensively.
Our refined goals:
Not replace existing Enterprise DAM
Acquire open source DAM to handle all homeless digital assets
This approach
Would allow us to demonstrate APIs with both an enterprise and an open source system.
Serve the grant goals “to address and improve the situation in a way our colleagues across the sector could benefit from.”
Open source meant metadata creation was possible, with the ability to see the code and build on it for our API;
Saved money on the product (open source = free up front) and put it into hiring a developer.
Also extended our Project Team: cross-functional team of stakeholders: Core Team, Extended Team, Steering Committee
Open Source DAMs:
There are hundreds of open source DAMs available; most are not focused on cultural heritage end users
We selected 20 open-source systems for a broad review
Selected ResourceSpace. Created by Montala for UK charity Oxfam.
Influencing factors on selection:
Our own capabilities: the MIA has no .NET developers, and Java-based applications risk conflicts with existing applications (such as our timecard tool)
Issues with systems we reviewed:
Lack of English language versions/documentation;
Incomplete technology (NotreDAM);
Lack of ongoing development (Gallery);
Focus on single resource type (e.g. video production/Tactic) or solely production focused (all about workflow); asset sales often the main driver
Not a true DAM;
Lack of metadata customization or standards;
Inability to develop controlled vocabulary standards beyond keyword lists.
Why ResourceSpace?
Came with its own API to get us started;
clean user interface ‘out-of-the-box’;
clear management of assets, renditions and versions;
infinitely extensible (with system plug-ins) and
a strong body of users, though few from the cultural heritage sector (in the US, only Historic New England and the Walters Art Museum).
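ResourceSpace’s API, which we mention above as a selection factor, is a simple HTTP interface whose requests are signed with the user’s private key. The sketch below shows that general scheme as we understand it; treat the parameter names and signing recipe as assumptions and verify them against the ResourceSpace documentation before relying on them:

```python
# Hedged sketch of a signed ResourceSpace-style API call. The signing scheme
# shown (SHA-256 of the private key concatenated with the query string) and the
# parameter names are assumptions to be checked against the ResourceSpace docs.
import hashlib
from urllib.parse import urlencode

def sign_request(private_key: str, user: str, function: str, **params) -> str:
    # Build the query string, then sign it with the user's private key
    query = urlencode({"user": user, "function": function, **params})
    sign = hashlib.sha256((private_key + query).encode()).hexdigest()
    return f"api.php?{query}&sign={sign}"

url = sign_request("secret-key", "admin", "do_search", search="van gogh")
print(url.startswith("api.php?user=admin&function=do_search"))  # True
```

Because the signature covers the whole query string, the server can verify that the request was not tampered with, without the private key ever traveling over the wire.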
Josh – whenever we talk about any kind of assets or content types, it’s vital that we also invest in the metadata around each.
So, after changing direction a few times, we did a little soul searching. We had identified the challenges, and we had decided that we wanted a simplified metadata model, to cover a core set of file formats. And the data had to travel well...
The “kitchen sink” approach was completely out of the question. … But where to find such a mythical beast? We went straight to the Feds to help our detective work…
You know, Library of Congress. The Federal Agencies Digitization Guidelines Initiative. Turns out that embracing best practices from the Feds was not only polite (we took money from IMLS), but also wise. Because LOC and FADGI provide a simple and sturdy foundation for establishing your core “archival master” file formats and minimal metadata set.
What are these minimal fields and formats? Message in a bottle metaphor
What is it? Who made this, or where did we get it? What was it for? How may it be used? Is it ready for prime time?
And we needed our metadata model to work not only for our technical staff and catalogers, but for our content creators –
Content creators need metadata to flow from point of capture / file creation, through the entire production process, into the DAMS, and back out again into the production flow
And of course, no single metadata standard does it all. So we took what’s good from VARIOUS STANDARDS and built our own specification – MIA CORE
Simplified diagram. Not exhaustive.
Effectively, we went panning for metadata properties – metadata gold, if you will – sifting through the flood of standards and specifications, sticking largely to the ISO standards along the way.
With MIA Core, we are capturing the basics including:
With xmpBasicJob, we’re capturing Project Names, Project References
With xmpRights, Creative Commons, and PLUS, Basic Rights Management, Including Embargo and Constraints Information
With VRA Core, we’re capturing tombstone information about works of art
With MIA Core fields, we’re capturing quality control tags, and a range of sync keys for our API. For example, ObjectID, or EventID for sync with TMS, our Collections Information System
MIA CORE is not perfect, it’s not complete, but it works, and it’s flexible. Which is good, because we can always ADD to – or REMOVE FROM – our specification as we learn more and needs evolve.
ITERATIVE, CONTINUOUS IMPROVEMENT = the way we work at the MIA
Now MIA Core is one piece of the data puzzle. And our goal was to keep this metadata – embedded metadata – as lightweight as possible. Again: minimal, “catastrophe-ready” metadata at the asset level – only enough data on the asset to ensure its survival in the harsh world outside of the DAMS.
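That “message in a bottle” minimum can be sketched as a small check on an embedded record. The property names below are simplified stand-ins (not the literal MIA Core field names), and the sample values are invented for illustration:

```python
# Illustrative sketch of a minimal "message in a bottle" embedded-metadata set.
# Field names are simplified stand-ins, not the literal MIA Core specification.
REQUIRED = ["title", "creator", "project", "rights", "qc_status"]

def survives_outside_dams(embedded: dict) -> bool:
    # The asset should answer: what is it, who made it, what was it for,
    # how may it be used, and is it ready for prime time?
    return all(embedded.get(field) for field in REQUIRED)

record = {
    "title": "Gallery installation view",   # what is it?
    "creator": "MIA Photo Studio",          # who made it / where did we get it?
    "project": "2015 reinstallation",       # what was it for? (xmpBasicJob-style)
    "rights": "CC BY-NC",                   # how may it be used? (xmpRights-style)
    "qc_status": "approved",                # is it ready for prime time?
    "object_id": 1226,                      # hypothetical sync key back to TMS
}
print(survives_outside_dams(record))  # True
```

Anything beyond this minimum lives in the database, not on the file, which keeps the embedded payload light while still letting a stray asset identify itself.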
All-embedded is problematic; all-database is problematic. We want data to flow in, out, and beyond, and we want to flow data between systems, aggregating with our search server (Elasticsearch).
Feeding the API what it wants…
In this way, each system can do what it does best, leaning on APIs to flow data and media between systems and, ultimately, out to channels
Frances, we’ve talked about wanting parsimony – a simple-as-possible approach. Do users have to sort through all of those systems?
Delivering a Unified Interface (sample screen alpha/preview version)
Federated search across all four collections-related systems
Using Elasticsearch – an “open source search and analytics engine” – to index ALL (every word of the) content. Used by Wikipedia and other big-data providers.
Allows a dynamic content feed: real time updates;
related content is made visible in real time as well, as it is added/changed.
“With Elasticsearch, all data is immediately made available for search and analytics.”
https://www.elastic.co/products/elasticsearch/
Elasticsearch is used with Kibana: “You use Kibana to search, view, and interact with data stored in Elasticsearch indices.”
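The federated-search idea underneath all this is an inverted index: every word of content from every system points back to its source documents. The toy sketch below is not Elasticsearch (which does this at scale, with analysis and relevance scoring), just an illustration of why one query can span TMS, the DAMS, the CMS, and document storage at once; the sample documents are invented:

```python
# Toy inverted index illustrating federated search across systems.
# Elasticsearch does this at scale; this only sketches the core idea.
from collections import defaultdict

index = defaultdict(set)  # term -> set of (system, doc_id)

def index_doc(system: str, doc_id: str, text: str) -> None:
    # Index every word of the document, tagged with its source system
    for term in text.lower().split():
        index[term].add((system, doc_id))

def search(term: str) -> list:
    # One query spans all indexed systems alike
    return sorted(index[term.lower()])

index_doc("TMS", "obj-1226", "Olive Trees van Gogh 1889")
index_doc("DAMS", "asset-88", "installation photo olive trees gallery")
index_doc("CMS", "post-42", "blog post about van Gogh conservation")
print(search("olive"))  # [('DAMS', 'asset-88'), ('TMS', 'obj-1226')]
```

Because new documents are indexed as they arrive, results (and related content) update in real time, which is the behavior we lean on for the dynamic content feed.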
Josh – this open source thing we keep referencing, that was just because it’s all free, right?
Well, let’s be honest: there is no such thing as a free puppy. Oh, it might *start out* as free … but there is care and feeding and training and … you get the picture.
There are strong reasons both for and against open source and commercial software. Ultimately, open source is LESS like FREE BEER and MORE like free puppies. And we are not advocating one way or the other: the choice is up to the organization, its needs, and its resources. Open source worked for us.
Frances, how are we doing? Are we making headway, or are we being pulled out to sea by the riptide?
Where are we?
Using Agile methodology to finalize development of ResourceSpace and Unified Interface
User testing and rollout in May 2015, with ResourceSpace training (less formal training for the interface, as it’s more intuitive)
Still need to
Extend API
Seed systems with content
Tweak systems based on user input as they are used
Public release of open source code: targeting late summer 2015
Evaluation of project – success of improving access to all assets (Summer/final report end Oct 2015)
I want to re-emphasize that we work in iterative cycles, aiming for continuous improvement. We aren’t claiming to have solved every puzzle.
But we do believe that we’ve taken a nice, big slice out of a very big pie.
… and if you haven’t visited Koronet up by Columbia University, then you’ve never really seen a big slice!
So far, we’ve tackled a high-priority need: wide access to production-ready assets needed by multiple staff to get things done fast (and right); a repository of ready-for-use files.
We’ve still got plenty of challenges on the horizon, here are just a few:
Long-term preservation & archival practices – how do we sustain our digital assets over long periods of time?
2. Workflow management tools – after all, process matters just as much as product in any organization that strives for efficiency.
3. Deployment + usage tracking + external access … We’d like to know what our assets are up to out there.
4. Born-digital works of art represent the 800 lb gorilla in the room – are they the responsibility of the DAMS and its manager(s)? Uh oh, the pressure just shot way up! Art acquisition is very, very expensive. Museums know how to tend to sculpture and paintings over time, but we are novices at really coping with conservation and preservation of born-digital works.
Luckily, we have top people working on it, as we speak. All of you!
Frances, what’s next?
What’s Next?
Linked Open Data as a worthy goal – joining data on a global scale, not just making our own website available