Provides an API for storing, retrieving and deleting streams of bits in a transactionally safe fashion. The main class is BitstreamStorageManager.

Using the Bitstore API

An example use of the Bitstore API is shown below:

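    // Create or obtain a context object (initialization omitted here)
    Context context;
    // Stream to store (initialization omitted here)
    InputStream stream;

    try
    {
        // Store the stream
        int id = BitstreamStorageManager.store(context, stream);
        // Retrieve it
        InputStream retrieved = BitstreamStorageManager.retrieve(context, id);
        // Delete it
        BitstreamStorageManager.delete(context, id);

        // Complete the context object so changes are written
        context.complete();
    }
    // Error with I/O operations
    catch (IOException ioe)
    {
        // Handle or log the error; abort the context so nothing is committed
        context.abort();
    }
    // Database error
    catch (SQLException sqle)
    {
        // Handle or log the error; abort the context so nothing is committed
        context.abort();
    }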

Storage mechanism

The BitstreamStorageManager stores files in one or more asset store directories. These can be configured in dspace.cfg. For example:

assetstore.dir = /dspace/assetstore

The above example specifies a single asset store.

assetstore.dir = /dspace/assetstore_0
assetstore.dir.1 = /mnt/other_filesystem/assetstore_1

The above example specifies two asset stores. assetstore.dir specifies asset store number 0 (zero); additional stores are specified with assetstore.dir.1, assetstore.dir.2 and so on. The database records which asset store each bitstream is stored in, so do not move bitstreams between asset stores, and do not renumber the stores.

By default, newly created bitstreams are put in asset store 0 (i.e. the one specified by the assetstore.dir property). To change this, for example when asset store 0 is getting full, add a line to dspace.cfg like:

assetstore.incoming = 1

Then restart DSpace (Tomcat). New bitstreams will be written to the asset store specified by assetstore.dir.1, which is /mnt/other_filesystem/assetstore_1 in the above example.
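
The numbering convention described above can be illustrated with a short sketch. This is not DSpace code: the class AssetStoreConfigSketch and its methods are hypothetical names, and the sketch only assumes that the dspace.cfg properties shown above have been loaded into a java.util.Properties object.

```java
import java.util.Properties;

// Hypothetical helper, not part of DSpace: resolves asset store directories
// from properties laid out like the dspace.cfg examples above.
final class AssetStoreConfigSketch {

    /** Returns the directory configured for the given store number, or null if none is set. */
    static String directoryFor(Properties config, int storeNumber) {
        // Store 0 uses the unsuffixed property; store N uses assetstore.dir.N
        String key = (storeNumber == 0)
                ? "assetstore.dir"
                : "assetstore.dir." + storeNumber;
        return config.getProperty(key);
    }

    /** Returns the store number for new bitstreams; 0 when assetstore.incoming is absent. */
    static int incomingStore(Properties config) {
        return Integer.parseInt(config.getProperty("assetstore.incoming", "0").trim());
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("assetstore.dir", "/dspace/assetstore_0");
        config.setProperty("assetstore.dir.1", "/mnt/other_filesystem/assetstore_1");
        config.setProperty("assetstore.incoming", "1");

        // Prints the directory that would receive new bitstreams
        System.out.println(directoryFor(config, incomingStore(config)));
    }
}
```

With assetstore.incoming left unset, incomingStore returns 0 and the directory named by assetstore.dir is used, matching the default behaviour described above.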

Moving an Asset Store

You can move an asset store as a whole to a new location in the file system: stop DSpace (Tomcat), move all of the contents to the new location, change the appropriate line in dspace.cfg, and restart DSpace (Tomcat).

We will be providing administration tools for more sophisticated management of these asset stores in the future.

When given a stream of bits to store, the BitstreamStorageManager generates a unique key for the stream. The key takes the form of a long sequence of digits, which is transformed into a file path. The BitstreamStorageManager stores the contents of the stream in this path, creating parent directories as necessary.
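
The text above does not say exactly how a key is turned into a path, so the following is only a sketch of one plausible scheme, not the BitstreamStorageManager implementation. It assumes that the first three pairs of digits name nested directories and that the full key is the file name; BitstreamPathSketch, pathFor and store are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: maps a digit-string key to a nested file path.
// The directory layout (three two-digit levels) is an assumption.
final class BitstreamPathSketch {

    /** Maps a key such as "123456789" to <store>/12/34/56/123456789. */
    static Path pathFor(Path assetStoreDir, String key) {
        if (key.length() < 6 || !key.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("Key must be at least six digits: " + key);
        }
        return assetStoreDir
                .resolve(key.substring(0, 2))
                .resolve(key.substring(2, 4))
                .resolve(key.substring(4, 6))
                .resolve(key);
    }

    /** Writes the stream to the path for the key, creating parent directories as necessary. */
    static Path store(Path assetStoreDir, String key, InputStream in) throws IOException {
        Path target = pathFor(assetStoreDir, key);
        Files.createDirectories(target.getParent());
        Files.copy(in, target); // fails if the target file already exists
        return target;
    }
}
```

Spreading files over nested directories in this way keeps any single directory from accumulating a very large number of entries, which is a common reason for deriving paths from key prefixes.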

The Bitstore and Transactions

The bitstore is carefully engineered to prevent data loss, using transactional flags in the database. Before a bitstream is actually stored, a metadata entry with the unique bitstream id and the deleted flag set is committed to the database. If the storage operation fails or is aborted, the deleted flag remains set. The bitstore API then ensures that the bitstream cannot be retrieved, and after an hour, the bitstream is eligible for cleanup. The bitstream is accessible only after all database operations have been successfully committed.

Similarly, bitstreams are deleted by simply setting the deleted flag. If a deletion operation is rolled back, the bitstream is still present in the asset store.
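
The role of the deleted flag can be modelled in a few lines. The sketch below is an in-memory illustration only, not DSpace code: a map stands in for the database table, DeletedFlagSketch and its methods are hypothetical names, and it assumes that a new entry starts with the deleted flag set and that the flag is cleared only once the store has been committed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the deleted flag; a map stands in for the database.
final class DeletedFlagSketch {
    // bitstream id -> deleted flag
    private final Map<Integer, Boolean> deletedFlags = new HashMap<>();
    private int nextId = 1;

    /** Registers a new bitstream before its bits are written; it starts out flagged deleted. */
    int beginStore() {
        int id = nextId++;
        deletedFlags.put(id, true);
        return id;
    }

    /** Clears the flag once the bits are stored and the surrounding work has committed. */
    void commitStore(int id) {
        deletedFlags.put(id, false);
    }

    /** Deletion only sets the flag; the stored bits are left in place. */
    void delete(int id) {
        deletedFlags.put(id, true);
    }

    /** A bitstream is retrievable only if it is known and not flagged deleted. */
    boolean isRetrievable(int id) {
        return Boolean.FALSE.equals(deletedFlags.get(id));
    }
}
```

If a store is abandoned between beginStore and commitStore, the entry simply stays flagged, so a failed operation never exposes a partially written bitstream.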

Cleaning up the Asset Store

As noted above, sometimes files will be physically present in the Asset Store even though they are marked deleted in the database. You can use the command-line utility class org.dspace.storage.bitstore.Cleanup (which is invoked via /dspace/bin/cleanup) to remove the bitstreams which are marked deleted from the Asset Store. To prevent accidental deletion of bitstreams which are in the process of being stored, cleanup only removes bitstreams which are more than an hour old.
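
The eligibility rule described above can be written as a small predicate. This is a sketch under stated assumptions, not the code of org.dspace.storage.bitstore.Cleanup: CleanupEligibilitySketch and eligible are hypothetical names, and the age of a bitstream is assumed to be measured from a timestamp such as the file's last-modified time.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical predicate for the cleanup rule: marked deleted AND more than an hour old.
final class CleanupEligibilitySketch {
    static final Duration MIN_AGE = Duration.ofHours(1);

    /**
     * @param deleted      whether the bitstream is marked deleted in the database
     * @param lastModified assumed timestamp of the stored file
     * @param now          the time at which cleanup runs
     */
    static boolean eligible(boolean deleted, Instant lastModified, Instant now) {
        Duration age = Duration.between(lastModified, now);
        // Strictly older than one hour, so in-progress stores are left alone
        return deleted && age.compareTo(MIN_AGE) > 0;
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the rule easy to check against fixed timestamps.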