inmem_txnexample_c.html [plain text]

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>In-Memory Transaction Example</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.62.4" />
    <link rel="home" href="index.html" title="Getting Started with Berkeley DB Transaction Processing" />
    <link rel="up" href="wrapup.html" title="Chapter 6. Summary and Examples" />
    <link rel="previous" href="txnexample_c.html" title="Transaction Example" />
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">In-Memory Transaction Example</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="txnexample_c.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 6. Summary and Examples</th>
          <td width="20%" align="right"> </td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="inmem_txnexample_c"></a>In-Memory Transaction Example</h2>
          </div>
        </div>
        <div></div>
      </div>
      <p>
        DB is sometimes used for applications that simply need to cache
        data retrieved from some other location (such as a remote database
        server). DB is also often used in embedded systems.
    </p>
      <p>
        In both cases, applications may want to use transactions for
        atomicity, consistency, and isolation guarantees, but they may also want
        to forgo the durability guarantee entirely. In doing so, they can keep
        their DB environment and databases entirely in-memory so
        as to avoid the performance impact of unneeded disk I/O.
    </p>
      <p>
        To do this:
    </p>
      <div class="itemizedlist">
        <ul type="disc">
          <li>
            <p>
                Refrain from specifying a home directory when you open your
                environment. The exception to this is if you are using the
                <tt class="literal">DB_CONFIG</tt> configuration file — in
                that case you must identify the environment's home
                directory so that the configuration file can be found.
            </p>
          </li>
          <li>
            <p>
                Configure your environment to back your regions from
                system memory instead of the filesystem.
            </p>
          </li>
          <li>
            <p>
                Configure your logging subsystem such that log files are kept
                entirely in-memory.
            </p>
          </li>
          <li>
            <p>
                Increase the size of your in-memory log buffer so that it
                is large enough to hold the largest set of concurrent write operations.
            </p>
          </li>
          <li>
            <p>
                Increase the size of your in-memory cache so that it can
                hold your entire data set. You do not want your cache to
                page to disk.
            </p>
          </li>
          <li>
            <p>
                Do not specify a file name when you open your database(s).
            </p>
          </li>
        </ul>
      </div>
      <p>
        As an example, this section takes the transaction example provided
        in <a href="txnexample_c.html">Transaction Example</a>
        and it updates that example so that the environment, database, log
        files, and regions are all kept entirely in-memory. 
    </p>
      <p>
        For illustration purposes, we also modify this example so that 
        uncommitted reads are no longer used to enable the 
            
            <tt class="function">countRecords()</tt>
        function. Instead, we simply provide a transaction handle to
            
            <tt class="function">countRecords()</tt>
        so as to avoid the self-deadlock. Be aware that using a transaction handle here rather than
        uncommitted reads will work just as well as if we had continued to use uncommitted reads. However,
            the usage of the transaction handle here will 
            probably cause more deadlocks than using read-uncommitted does, because more locking is being performed in
            this case.
    </p>
      <p>
        To begin, we simplify the beginning of our example a bit. Because
        we no longer need an environment home directory, we can remove all
        the code that we used to determine path delimiters
        and include the <tt class="function">getopt</tt> function. We can also
        remove our <tt class="function">usage()</tt> function because we no
        longer require any command line arguments. 
    </p>
      <pre class="programlisting">// File TxnGuideInMemory.cpp

// We assume an ANSI-compatible compiler
#include &lt;db_cxx.h&gt;
#include &lt;pthread.h&gt;
#include &lt;iostream&gt;

// Run 5 writers threads at a time.
#define NUMWRITERS 5

// Printing of pthread_t is implementation-specific, so we
// create our own thread IDs for reporting purposes.
int global_thread_num;
pthread_mutex_t thread_num_lock;

// Forward declarations
int countRecords(Db *, DbTxn *);
int openDb(Db **, const char *, const char *, DbEnv *, u_int32_t);
int usage(void);
void *writerThread(void *);  </pre>
      <p>
    Next, in our <tt class="function">main()</tt>, we also eliminate some
    variables that this example no longer needs. In particular, we are able to remove
    the 
         
        <tt class="literal">dbHomeDir</tt> 
    and 
        
        <tt class="literal">fileName</tt>
    variables. We also remove all our <tt class="function">getopt</tt> code.
</p>
      <pre class="programlisting">int
main(void)
{
    // Initialize our handles
    Db *dbp = NULL;
    DbEnv *envp = NULL;

    pthread_t writerThreads[NUMWRITERS];
    int i;
    u_int32_t envFlags;

    // Application name
    const char *progName = "TxnGuideInMemory";  </pre>
      <p>
        Next we create our environment as always. However, we add
        <tt class="literal">DB_PRIVATE</tt> to our environment open flags. This
        flag causes our environment to back regions using our
        application's heap memory rather than by using the filesystem.
        This is the first important step to keeping our DB data
        entirely in-memory.
    </p>
      <p>
        We also remove the <tt class="literal">DB_RECOVER</tt> flag from the environment open flags. Because our databases,
        logs, and regions are maintained in-memory, there will never be anything to recover.
    </p>
      <p>
        Note that we show the additional code here in
        <b class="userinput"><tt>bold.</tt></b>
    </p>
      <pre class="programlisting">    // Env open flags
    envFlags =
      DB_CREATE     |  // Create the environment if it does not exist
      DB_INIT_LOCK  |  // Initialize the locking subsystem
      DB_INIT_LOG   |  // Initialize the logging subsystem
      DB_INIT_TXN   |  // Initialize the transactional subsystem. This
                       // also turns on logging.
      DB_INIT_MPOOL |  // Initialize the memory pool (in-memory cache)
      <b class="userinput"><tt>DB_PRIVATE    |  // Region files are not backed by the filesystem.
                       // Instead, they are backed by heap memory.</tt></b>
      DB_THREAD;       // Cause the environment to be free-threaded

    try {
        // Create the environment 
        envp = new DbEnv(0); </pre>
      <p>
        Now we configure our environment to keep the log files in memory,
        increase the log buffer size to 10 MB, and increase our in-memory
        cache to 10 MB. These values should be more than enough for our
        application's workload.
      </p>
      <pre class="programlisting">
        <b class="userinput">
          <tt>        // Specify in-memory logging
        envp-&gt;log_set_config(DB_LOG_IN_MEMORY, 1);

        // Specify the size of the in-memory log buffer.
        envp-&gt;set_lg_bsize(10 * 1024 * 1024);

        // Specify the size of the in-memory cache
        envp-&gt;set_cachesize(0, 10 * 1024 * 1024, 1); </tt>
        </b>
      </pre>
      <p>
    Next, we open the environment and setup our lock detection. This is
    identical to how the example previously worked, except that we do not
    provide a location for the environment's home directory.
 </p>
      <pre class="programlisting">        // Indicate that we want db to internally perform deadlock 
        // detection.  Also indicate that the transaction with 
        // the fewest number of write locks will receive the 
        // deadlock notification in the event of a deadlock.
        envp-&gt;set_lk_detect(DB_LOCK_MINWRITE);

        // Open the environment
        envp-&gt;open(<b class="userinput"><tt>NULL</tt></b>, envFlags, 0); </pre>
      <p>
        When we call 
             
            <span><tt class="function">openDb()</tt>,</span> 
        which is what we use
        to open our database, we no not provide a database filename for the
        third parameter. When the filename is <tt class="literal">NULL</tt>, the database is not
        backed by the filesystem.
    </p>
      <pre class="programlisting">        // If we had utility threads (for running checkpoints or 
        // deadlock detection, for example) we would spawn those
        // here. However, for a simple example such as this,
        // that is not required.

        // Open the database
        openDb(&amp;dbp, progName, <b class="userinput"><tt>NULL</tt></b>,
            envp, DB_DUPSORT);
        </pre>
      <p>
    After that, our <tt class="function">main()</tt> function is unchanged,
    except that when we 
        
        <span>check for exceptions on the database open,</span>
    we change the error message string so as to not reference the database filename.
  </p>
      <pre class="programlisting">        // Initialize a pthread mutex. Used to help provide thread ids.
        (void)pthread_mutex_init(&amp;thread_num_lock, NULL);

        // Start the writer threads.
        for (i = 0; i &lt; NUMWRITERS; i++)
            (void)pthread_create(
                &amp;writerThreads[i], NULL,
                writerThread,
                (void *)dbp);

        // Join the writers
        for (i = 0; i &lt; NUMWRITERS; i++)
            (void)pthread_join(writerThreads[i], NULL);

    } catch(DbException &amp;e) {
        <b class="userinput"><tt>std::cerr &lt;&lt; "Error opening database environment: "</tt></b>
                  &lt;&lt; std::endl;
        std::cerr &lt;&lt; e.what() &lt;&lt; std::endl;
        return (EXIT_FAILURE);
    }

    try {
        // Close our database handle if it was opened.
        if (dbp != NULL)
            dbp-&gt;close(0);

        // Close our environment if it was opened.
        if (envp != NULL)
            envp-&gt;close(0);
    } catch(DbException &amp;e) {
        std::cerr &lt;&lt; "Error closing database and environment."
                  &lt;&lt; std::endl;
        std::cerr &lt;&lt; e.what() &lt;&lt; std::endl;
        return (EXIT_FAILURE);
    }

    // Final status message and return.

    std::cout &lt;&lt; "I'm all done." &lt;&lt; std::endl;
    return (EXIT_SUCCESS);
} </pre>
      <p>
        That completes <tt class="function">main()</tt>. The bulk of our 
        <tt class="function">writerThread()</tt> function implementation is
        unchanged from the initial transaction example, except that we now pass
        <tt class="function">countRecords</tt> a transaction handle, rather than configuring our
        application to perform uncommitted reads. Both mechanisms work well-enough
            for preventing a self-deadlock. However, the individual count
            in this example will tend to be lower than the counts seen in
            the previous transaction example, because
            <tt class="function">countRecords()</tt> can no longer see records
            created but not yet committed by other threads.
    </p>
      <pre class="programlisting">// A function that performs a series of writes to a
// Berkeley DB database. The information written
// to the database is largely nonsensical, but the
// mechanism of transactional commit/abort and
// deadlock detection is illustrated here.
void *
writerThread(void *args)
{
    Db *dbp = (Db *)args;
    DbEnv *envp = dbp-&gt;get_env(dbp);

    int j, thread_num;
    int max_retries = 20;   // Max retry on a deadlock
    char *key_strings[] = {"key 1", "key 2", "key 3", "key 4",
                           "key 5", "key 6", "key 7", "key 8",
                           "key 9", "key 10"};

    // Get the thread number
    (void)pthread_mutex_lock(&amp;thread_num_lock);
    global_thread_num++;
    thread_num = global_thread_num;
    (void)pthread_mutex_unlock(&amp;thread_num_lock);

    // Initialize the random number generator 
    srand((u_int)pthread_self());

    // Perform 50 transactions
    for (int i=0; i&lt;50; i++) {
        DbTxn *txn;
        bool retry = true;
        int retry_count = 0;
        // while loop is used for deadlock retries
        while (retry) {
            // try block used for deadlock detection and
            // general db exception handling
            try {

                // Begin our transaction. We group multiple writes in
                // this thread under a single transaction so as to
                // (1) show that you can atomically perform multiple 
                // writes at a time, and (2) to increase the chances 
                // of a deadlock occurring so that we can observe our 
                // deadlock detection at work.

                // Normally we would want to avoid the potential for 
                // deadlocks, so for this workload the correct thing 
                // would be to perform our puts with auto commit. But 
                // that would excessively simplify our example, so we 
                // do the "wrong" thing here instead.
                txn = NULL;
                envp-&gt;txn_begin(NULL, &amp;txn, 0);
                // Perform the database write for this transaction.
                for (j = 0; j &lt; 10; j++) {
                    Dbt key, value;
                    key.set_data(key_strings[j]);
                    key.set_size((strlen(key_strings[j]) + 1) *
                        sizeof(char));

                    int payload = rand() + i;
                    value.set_data(&amp;payload);
                    value.set_size(sizeof(int));

                    // Perform the database put
                    dbp-&gt;put(txn, &amp;key, &amp;value, 0);
                }

                // countRecords runs a cursor over the entire database.
                // We do this to illustrate issues of deadlocking
                std::cout &lt;&lt; thread_num &lt;&lt;  " : Found "
                          &lt;&lt;  countRecords(dbp, <b class="userinput"><tt>txn</tt></b>)
                          &lt;&lt; " records in the database." &lt;&lt; std::endl;

                std::cout &lt;&lt; thread_num &lt;&lt;  " : committing txn : " &lt;&lt; i
                          &lt;&lt; std::endl;

                // commit
                try {
                    txn-&gt;commit(0);
                    retry = false;
                    txn = NULL;
                } catch (DbException &amp;e) {
                    std::cout &lt;&lt; "Error on txn commit: "
                              &lt;&lt; e.what() &lt;&lt; std::endl;
                }
            } catch (DbDeadlockException &amp;de) {
                // First thing we MUST do is abort the transaction.
                if (txn != NULL)
                    (void)txn-&gt;abort();

                // Now we decide if we want to retry the operation.
                // If we have retried less than max_retries,
                // increment the retry count and goto retry.
                if (retry_count &lt; max_retries) {
                    std::cout &lt;&lt; "############### Writer " &lt;&lt; thread_num
                              &lt;&lt; ": Got DB_LOCK_DEADLOCK.\n"
                              &lt;&lt; "Retrying write operation."
                              &lt;&lt; std::endl;
                    retry_count++;
                    retry = true;
                 } else {
                    // Otherwise, just give up.
                    std::cerr &lt;&lt; "Writer " &lt;&lt; thread_num
                              &lt;&lt; ": Got DeadLockException and out of "
                              &lt;&lt; "retries. Giving up." &lt;&lt; std::endl;
                    retry = false;
                 }
           } catch (DbException &amp;e) {
                std::cerr &lt;&lt; "db put failed" &lt;&lt; std::endl;
                std::cerr &lt;&lt; e.what() &lt;&lt; std::endl;
                if (txn != NULL)
                    txn-&gt;abort();
                retry = false;
           } catch (std::exception &amp;ee) {
            std::cerr &lt;&lt; "Unknown exception: " &lt;&lt; ee.what() &lt;&lt; std::endl;
            return (0);
          }
        }
    }
    return (0);
} </pre>
      <p>
    Next we update 
        
        <span><tt class="function">countRecords()</tt>.</span>
    The only difference
    here is that we no longer specify <tt class="literal">DB_READ_UNCOMMITTED</tt> when
    we open our cursor. Note that even this minor change is not required.
    If we do not configure our database to support uncommitted reads,
    <tt class="literal">DB_READ_UNCOMMITTED</tt> on the cursor open will be silently
    ignored. However, we remove the flag anyway from the cursor open so as to
    avoid confusion.
</p>
      <pre class="programlisting">int
countRecords(Db *dbp, DbTxn *txn)
{

    Dbc *cursorp = NULL;
    int count = 0;

    try {
        // Get the cursor
        dbp-&gt;cursor(txn, &amp;cursorp, <b class="userinput"><tt>0</tt></b>);

        Dbt key, value;
        while (cursorp-&gt;get(&amp;key, &amp;value, DB_NEXT) == 0) {
            count++;
        }
    } catch (DbDeadlockException &amp;de) {
        std::cerr &lt;&lt; "countRecords: got deadlock" &lt;&lt; std::endl;
        cursorp-&gt;close();
        throw de;
    } catch (DbException &amp;e) {
        std::cerr &lt;&lt; "countRecords error:" &lt;&lt; std::endl;
        std::cerr &lt;&lt; e.what() &lt;&lt; std::endl;
    }

    if (cursorp != NULL) {
        try {
            cursorp-&gt;close();
        } catch (DbException &amp;e) {
            std::cerr &lt;&lt; "countRecords: cursor close failed:" &lt;&lt; std::endl;
            std::cerr &lt;&lt; e.what() &lt;&lt; std::endl;
        }
    }

    return (count);
} </pre>
      <p>
        Finally, we update 
             
            <span><tt class="function">openDb()</tt>.</span> 
        This involves
        removing <tt class="literal">DB_READ_UNCOMMITTED</tt> from the
        open flags. 

        
    </p>
      <pre class="programlisting">// Open a Berkeley DB database
int
openDb(Db **dbpp, const char *progname, const char *fileName,
  DbEnv *envp, u_int32_t extraFlags)
{
    int ret;
    u_int32_t openFlags;

    try {
        Db *dbp = new Db(envp, 0);

        // Point to the new'd Db
        *dbpp = dbp;

        if (extraFlags != 0)
            ret = dbp-&gt;set_flags(extraFlags);

        // Now open the database
        <b class="userinput"><tt>openFlags = DB_CREATE   |      // Allow database creation
                    DB_THREAD        |        
                    DB_AUTO_COMMIT;    // Allow auto commit</tt></b>

        dbp-&gt;open(NULL,       // Txn pointer
                  fileName,   // File name
                  NULL,       // Logical db name
                  DB_BTREE,   // Database type (using btree)
                  openFlags,  // Open flags
                  0);         // File mode. Using defaults
    } catch (DbException &amp;e) {
        std::cerr &lt;&lt; progname &lt;&lt; ": openDb: db open failed:" &lt;&lt; std::endl;
        std::cerr &lt;&lt; e.what() &lt;&lt; std::endl;
        return (EXIT_FAILURE);
    }

    return (EXIT_SUCCESS);
} </pre>
      <p>
    This completes our in-memory transactional example. If you would like to
    experiment with this code, you can find the example in the following
    location in your DB distribution:
</p>
      <pre class="programlisting"><span class="emphasis"><em>DB_INSTALL</em></span>/examples_cxx/txn_guide</pre>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="txnexample_c.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="wrapup.html">Up</a>
          </td>
          <td width="40%" align="right"> </td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Transaction Example </td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> </td>
        </tr>
      </table>
    </div>
  </body>
</html>