Provides an API for storing, retrieving and deleting streams of bits in a transactionally safe fashion. The main class is BitstreamStorageManager.
An example use of the Bitstore API is shown below:
    // Create or obtain a context object
    Context context;
    // Stream to store
    InputStream stream;

    try
    {
        // Store the stream
        int id = BitstreamStorageManager.store(context, stream);

        // Retrieve it
        InputStream retrieved = BitstreamStorageManager.retrieve(context, id);

        // Delete it
        BitstreamStorageManager.delete(context, id);

        // Complete the context object so changes are written
        context.complete();
    }
    // Error with I/O operations
    catch (IOException ioe)
    {
    }
    // Database error
    catch (SQLException sqle)
    {
    }
The BitstreamStorageManager stores files in one or more asset store directories. These can be configured in dspace.cfg. For example:
assetstore.dir = /dspace/assetstore
The above example specifies a single asset store.
assetstore.dir = /dspace/assetstore_0
assetstore.dir.1 = /mnt/other_filesystem/assetstore_1
The above example specifies two asset stores. assetstore.dir specifies asset store number 0 (zero); after that use assetstore.dir.1, assetstore.dir.2 and so on. The number of the asset store that holds each bitstream is recorded in the database, so do not move bitstreams between asset stores, and do not renumber the asset stores.
By default, newly created bitstreams are put in asset store 0 (i.e. the one specified by the assetstore.dir property). To change this, for example when asset store 0 is getting full, add a line to dspace.cfg like:
assetstore.incoming = 1
Then restart DSpace (Tomcat). New bitstreams will be written to the asset store specified by assetstore.dir.1, which is /mnt/other_filesystem/assetstore_1 in the above example.
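The numbering rules above can be sketched in a few lines of Java. This is only an illustration of how the property names map to asset store numbers; the class AssetStoreConfigSketch and its methods are hypothetical and are not part of the DSpace API.

```java
import java.util.Properties;

/** Hypothetical helper (not DSpace code): maps asset store numbers to directories. */
final class AssetStoreConfigSketch {

    /** Store 0 is configured by "assetstore.dir"; store N (N > 0) by "assetstore.dir.N". */
    static String keyFor(int storeNumber) {
        if (storeNumber < 0) {
            throw new IllegalArgumentException("Asset store numbers start at 0");
        }
        return storeNumber == 0 ? "assetstore.dir" : "assetstore.dir." + storeNumber;
    }

    /** Looks up the directory configured for the given asset store number. */
    static String directoryFor(Properties config, int storeNumber) {
        String dir = config.getProperty(keyFor(storeNumber));
        if (dir == null) {
            throw new IllegalStateException("No asset store configured for number " + storeNumber);
        }
        return dir.trim();
    }

    /** Number of the store that receives new bitstreams; 0 when assetstore.incoming is absent. */
    static int incomingStore(Properties config) {
        return Integer.parseInt(config.getProperty("assetstore.incoming", "0").trim());
    }

    /** Directory that new bitstreams would be written to. */
    static String incomingDirectory(Properties config) {
        return directoryFor(config, incomingStore(config));
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("assetstore.dir", "/dspace/assetstore_0");
        config.setProperty("assetstore.dir.1", "/mnt/other_filesystem/assetstore_1");
        config.setProperty("assetstore.incoming", "1");
        System.out.println(incomingDirectory(config));
    }
}
```

Treating a missing assetstore.incoming as 0 mirrors the default described above.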
You can move an asset store as a whole to a new location in the file system: stop DSpace (Tomcat), move all of the contents to the new location, change the appropriate line in dspace.cfg, and restart DSpace (Tomcat).
We will be providing administration tools for more sophisticated management of these asset stores in the future.
When given a stream of bits to store, the BitstreamStorageManager generates a unique key for the stream. The key takes the form of a long sequence of digits, which is transformed into a file path. The BitstreamStorageManager stores the contents of the stream in this path, creating parent directories as necessary.
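One way such a key-to-path transformation might look is sketched below. The exact scheme is not described here, so the layout is an assumption: the first three pairs of digits become nested directories and the full key becomes the file name. The class BitstreamPathSketch is hypothetical and is not part of the DSpace API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch (not DSpace code): turns a digit key into a nested file path. */
final class BitstreamPathSketch {

    // Assumed layout: three directory levels of two digits each.
    static final int LEVELS = 3;
    static final int DIGITS_PER_LEVEL = 2;

    /** Builds assetStoreDir/d1d2/d3d4/d5d6/key from the leading digits of the key. */
    static Path pathFor(String assetStoreDir, String key) {
        if (!key.matches("\\d+") || key.length() < LEVELS * DIGITS_PER_LEVEL) {
            throw new IllegalArgumentException("Key must be a sequence of at least "
                    + (LEVELS * DIGITS_PER_LEVEL) + " digits: " + key);
        }
        Path path = Paths.get(assetStoreDir);
        for (int level = 0; level < LEVELS; level++) {
            int start = level * DIGITS_PER_LEVEL;
            path = path.resolve(key.substring(start, start + DIGITS_PER_LEVEL));
        }
        return path.resolve(key);
    }

    /** Writes the stream to the computed path, creating parent directories as necessary. */
    static Path store(String assetStoreDir, String key, InputStream content) throws IOException {
        Path target = pathFor(assetStoreDir, key);
        Files.createDirectories(target.getParent());
        Files.copy(content, target);
        return target;
    }

    public static void main(String[] args) {
        System.out.println(pathFor("/dspace/assetstore", "1234567890"));
    }
}
```

Spreading files over nested directories keeps any single directory from accumulating a very large number of entries, which is a common reason for this kind of layout.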
The bitstore is designed to prevent data loss by using a transactional flag in the database. Before a bitstream is actually stored, a metadata entry with the unique bitstream id is committed to the database with its deleted flag set. If the storage operation fails or is aborted, the deleted flag remains set. The bitstore API then ensures that the bitstream cannot be retrieved, and after an hour the bitstream is eligible for cleanup. The bitstream becomes accessible only after all database operations have been successfully committed.
Similarly, bitstreams are deleted by simply setting the deleted flag. If a deletion operation is rolled back, the bitstream is still present in the asset store.
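The role of the deleted flag can be illustrated with a small in-memory model. This is a simplified sketch, not the real implementation: the database and the asset store are replaced by maps, a failed write is simulated with a boolean parameter, and the class DeletedFlagSketch and all of its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model (not DSpace code) of the deleted-flag protocol. */
final class DeletedFlagSketch {

    // Stand-in for the database: bitstream id -> deleted flag.
    private final Map<Integer, Boolean> deletedFlags = new HashMap<>();
    // Stand-in for the asset store: bitstream id -> stored bytes.
    private final Map<Integer, byte[]> assetStore = new HashMap<>();
    private int nextId = 1;

    /**
     * Stores content and returns the new id. When failDuringWrite is true the
     * write is skipped, so the entry keeps the deleted flag it was created with.
     */
    int store(byte[] content, boolean failDuringWrite) {
        int id = nextId++;
        deletedFlags.put(id, true);           // entry is recorded first, flagged deleted
        if (failDuringWrite) {
            return id;                        // simulated failure: flag remains set
        }
        assetStore.put(id, content.clone());  // write the bits
        deletedFlags.put(id, false);          // only now does the bitstream become visible
        return id;
    }

    /** A bitstream is retrievable only if its entry exists and is not flagged deleted. */
    boolean isRetrievable(int id) {
        return Boolean.FALSE.equals(deletedFlags.get(id));
    }

    /** Deleting only sets the flag; the stored bytes are left for a later cleanup. */
    void delete(int id) {
        if (deletedFlags.containsKey(id)) {
            deletedFlags.put(id, true);
        }
    }

    /** Reports whether bytes for this id are still held in the simulated asset store. */
    boolean physicallyPresent(int id) {
        return assetStore.containsKey(id);
    }

    public static void main(String[] args) {
        DeletedFlagSketch bitstore = new DeletedFlagSketch();
        int id = bitstore.store(new byte[] {1, 2, 3}, false);
        bitstore.delete(id);
        System.out.println(bitstore.isRetrievable(id) + " " + bitstore.physicallyPresent(id));
    }
}
```

Because deletion never touches the stored bytes, undoing a delete in this model only requires clearing the flag again.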
As noted above, files will sometimes be physically present in the asset store even though they are marked deleted in the database. You can use the command-line utility class org.dspace.storage.bitstore.Cleanup (which is invoked via /dspace/bin/cleanup) to remove the bitstreams which are marked deleted from the asset store. To prevent accidental deletion of bitstreams which are in the process of being stored, cleanup only removes bitstreams which are more than an hour old.
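The eligibility rule used by cleanup can be expressed as a single predicate. The sketch below assumes that a bitstream's age is measured from a last-modified timestamp, which is not stated above; the class CleanupEligibilitySketch is hypothetical and is not the Cleanup class itself.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch (not DSpace code) of the one-hour cleanup rule. */
final class CleanupEligibilitySketch {

    // Bitstreams younger than this may still be in the process of being stored.
    static final Duration MINIMUM_AGE = Duration.ofHours(1);

    /** True only for bitstreams that are marked deleted and more than an hour old. */
    static boolean eligibleForCleanup(boolean deleted, Instant lastModified, Instant now) {
        Duration age = Duration.between(lastModified, now);
        return deleted && age.compareTo(MINIMUM_AGE) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(eligibleForCleanup(true, now.minus(Duration.ofMinutes(90)), now));
        System.out.println(eligibleForCleanup(true, now.minus(Duration.ofMinutes(10)), now));
    }
}
```

Passing the current time as a parameter, rather than reading the clock inside the method, keeps the rule easy to test.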