The software upgrade to the Dash service is intended to enhance the basic Dash functionality of:
- Providing an easy-to-use interface for individual researchers to deposit their data into an underlying content repository
- Improving the platform's search and browse capabilities so that data deposited into an underlying content repository is easier to discover
A second goal is to enable the overlay of the Stash data deposit platform onto any standards-compliant repository that supports the SWORD protocol for deposit and the OAI-PMH protocol for metadata harvesting.
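As a rough illustration of the harvesting half of that requirement, the sketch below builds an OAI-PMH `ListRecords` request URL and extracts identifiers and titles from a response. The verb, the `metadataPrefix`, `from`, and `resumptionToken` arguments, and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the function names, and the sample record are hypothetical and are not taken from the Stash code base.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build a ListRecords request URL (function name is hypothetical)."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, resumption_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier",
                                  default="", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=NS)
    return records, (token or None)


# Hypothetical response body, trimmed to a single record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:dataset-1</identifier>
        <datestamp>2015-06-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # repository.example.org is a placeholder, not a real endpoint.
    print(list_records_url("https://repository.example.org/oai",
                           from_date="2015-01-01"))
    print(parse_records(SAMPLE))
```

A harvester would typically repeat the request with the returned resumption token until none is returned. Deposit via SWORD works in the opposite direction (an HTTP POST to a repository collection) and is not shown here.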
The new Stash platform is intended to enhance the existing Dash service by providing individual researchers the self-service ability to:
- Preserve, manage and share datasets by uploading to appropriate existing content repositories
- Select datasets for deposit by means of local file browse or drag-and-drop operations
- Discover, retrieve and re-use datasets by means of the faceted search / browse of an aggregated data corpus
- Prepare datasets for deposit by reviewing best practice guidance for the creation or acquisition of research data
- Follow incoming links (from paper references, catalogs, web searches, etc.) to dataset landing pages.
Required deliverables include adding or enhancing the platform's capability to:
- Characterize datasets for deposit in terms of standards-compliant metadata schemes designed to facilitate data citation and scientifically meaningful description
- Identify datasets with persistent identifiers for permanent citation and reference
- Provide a simpler and more intuitive means of uploading datasets into a managed repository by a thorough redesign of the user interface / experience (UI/UX)
- Authenticate with InCommon/Shibboleth institutional or OAuth social credentials
- Self-service deposit through drag-and-drop or file browse
- Associate datasets with "scientifically-meaningful" metadata
- Associate datasets with ORCIDs of contributors
- Assign DOIs to datasets
- Mediate repository submission via the SWORD protocol
- Provide for versioning of datasets
- Provide some means of embargoing datasets
- Allow metadata harvesting via OAI-PMH
- Provide public faceted search and browse
- Facilitate the indexing of dataset metadata by abstracting and indexing (A&I) services
- Allow organizational / institutional branding and URLs in the user interface
- Make all major functions available through embeddable widgets
- Make all code available in GitHub under an open source license
- Pilot integration with external repositories beyond Merritt
- Encourage an engaged open source community for code contributions and further platform development
The Technical Work Plan for the Stash upgrade to the Dash service describes the overall approach to developing a value-added layer that mediates the submission of data to, and the discovery of data in, source repositories. The plan is to provide this functionality by loosely coupling public SWORD and OAI-PMH endpoints with four main components:
- Hosted website
- Submission subsystem
- Harvesting subsystem
- Discovery subsystem
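The loose coupling described above can be pictured with a small sketch in which the subsystems know the repository only through a deposit interface and a harvest interface. Everything here is an assumption made for illustration: the class and method names are hypothetical, an in-memory object stands in for a repository's SWORD and OAI-PMH endpoints, and the hosted website is omitted. It is not the design laid out in the Technical Work Plan.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Dataset:
    identifier: str
    title: str
    keywords: List[str] = field(default_factory=list)


class DepositEndpoint(Protocol):
    """What the submission subsystem needs (SWORD plays this role)."""
    def deposit(self, dataset: Dataset) -> str: ...


class HarvestEndpoint(Protocol):
    """What the harvesting subsystem needs (OAI-PMH plays this role)."""
    def list_records(self) -> List[Dataset]: ...


class InMemoryRepository:
    """Toy stand-in for a repository exposing both endpoints."""
    def __init__(self) -> None:
        self._records: Dict[str, Dataset] = {}

    def deposit(self, dataset: Dataset) -> str:
        self._records[dataset.identifier] = dataset
        return dataset.identifier

    def list_records(self) -> List[Dataset]:
        return list(self._records.values())


class SubmissionSubsystem:
    def __init__(self, endpoint: DepositEndpoint) -> None:
        self._endpoint = endpoint

    def submit(self, dataset: Dataset) -> str:
        return self._endpoint.deposit(dataset)


class DiscoverySubsystem:
    def __init__(self) -> None:
        self._index: Dict[str, Dataset] = {}

    def index(self, dataset: Dataset) -> None:
        self._index[dataset.identifier] = dataset

    def search(self, keyword: str) -> List[Dataset]:
        # Keyword match as a simple stand-in for faceted search.
        return [d for d in self._index.values() if keyword in d.keywords]


class HarvestingSubsystem:
    def __init__(self, endpoint: HarvestEndpoint,
                 discovery: DiscoverySubsystem) -> None:
        self._endpoint = endpoint
        self._discovery = discovery

    def harvest(self) -> int:
        """Copy every harvested record into the discovery index."""
        records = self._endpoint.list_records()
        for record in records:
            self._discovery.index(record)
        return len(records)


if __name__ == "__main__":
    repo = InMemoryRepository()
    discovery = DiscoverySubsystem()
    SubmissionSubsystem(repo).submit(
        Dataset("doi:10.5555/example", "Sample dataset", ["ocean"]))
    HarvestingSubsystem(repo, discovery).harvest()
    print([d.title for d in discovery.search("ocean")])
```

Because neither subsystem depends on the repository class itself, any repository that satisfies the two interfaces could be substituted without changing the submission, harvesting, or discovery code.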
More complete information about the subsystems and layers can be found in the full Technical Work Plan; they are visualized in the Stash Functional Architecture diagram below.
Stash Functional Architecture
The Dash platform was designed as a user-friendly data curation platform that organizations could offer to individual researchers, allowing them to deposit their datasets in self-service mode through a simple, intuitive interface built with those researchers in mind. Dash was designed to be layered on top of existing community repositories, so that researchers can document, preserve, and publicly share their data with minimal support from repository staff. Dash was itself based on a working prototype called DataShare, a simple archiving platform built specifically for UCSF researchers, who primarily generate biomedical data. DataShare was collaboratively developed by the University of California Curation Center (UC3) at the California Digital Library, the University of California, San Francisco (UCSF) Library and Center for Knowledge Management, and the UCSF Clinical and Translational Science Institute (CTSI). See the original Dash Proposal.
The Sloan-funded Stash upgrade to the Dash platform is being developed by the ace Stash development team: Stephen Abrams, Principal Investigator, providing overall project and technical oversight as well as technical design; Marisa Strong, Technical Development Manager; developers Scott Fisher, David Moles, and Bhavitavya Vedula; Jim Vanderveen, DevOps; John Kratz, UI/UX Designer; and Perry Willet and John Chodacki, consultants to the Dash / Stash project.