List:Commits
From:knielsen Date:February 20 2007 9:34am
Subject:bk commit into 5.1 tree (knielsen:1.2436)
Below is the list of changes that have just been committed into a local
5.1 repository of knielsen. When knielsen does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-02-20 10:34:41+01:00, knielsen@ymer.(none) +15 -0
  WL#2223: NdbRecord.
  Implement distribution key optimization for index range scans.
  Allocate single scan row buffer, fix batch size calculation.
  Misc. minor bug fixes.
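
The distribution key optimization can be illustrated roughly as follows: when the low and high bounds of a single range agree on an index prefix that covers every distribution key column, all matching rows live in one fragment, so only that fragment needs to be scanned. This is a standalone sketch, not the NDB API code from the patch. `KeyPrefix`, `minDistkeyPrefixLength` and `boundsSingleDistKey` are hypothetical names, and key values are simplified to optional integers (an empty optional standing in for NULL).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified bound: one value per index key column, in index
// order. std::nullopt stands in for an SQL NULL.
using KeyPrefix = std::vector<std::optional<int>>;

// Smallest index prefix length that contains every distribution key column,
// or 0 if the index does not contain all of them.
// isDistKey[i] tells whether index column i is part of the distribution key.
std::size_t minDistkeyPrefixLength(const std::vector<bool> &isDistKey,
                                   std::size_t tableDistKeyCount)
{
  std::size_t found = 0;
  std::size_t prefix = 0;
  for (std::size_t i = 0; i < isDistKey.size(); i++)
  {
    if (isDistKey[i])
    {
      found++;
      prefix = i + 1;  // one more than the position of the last one seen
    }
  }
  assert(prefix <= isDistKey.size());
  return found == tableDistKeyCount ? prefix : 0;
}

// True if both bounds specify at least `prefix` columns and agree on all of
// them, i.e. the range bounds a single distribution key value.
bool boundsSingleDistKey(const KeyPrefix &low, const KeyPrefix &high,
                         std::size_t prefix)
{
  if (prefix == 0 || low.size() < prefix || high.size() < prefix)
    return false;
  for (std::size_t i = 0; i < prefix; i++)
  {
    if (low[i] != high[i])
      return false;
  }
  return true;
}
```

In this sketch two NULLs compare as equal, mirroring the fall-through in compare_index_row_prefix(); the patch itself only attempts the optimization when the scan has exactly one range.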

  storage/ndb/include/kernel/signaldata/KeyInfo.hpp@stripped, 2007-02-20 10:34:36+01:00, knielsen@ymer.(none) +1 -1
    Fix comment.

  storage/ndb/include/ndbapi/NdbDictionary.hpp@stripped, 2007-02-20 10:34:36+01:00, knielsen@ymer.(none) +5 -0
    createRecord method for index now also takes base table pointer.

  storage/ndb/include/ndbapi/NdbOperation.hpp@stripped, 2007-02-20 10:34:36+01:00, knielsen@ymer.(none) +0 -1
    Need to set table id and version differently depending on whether it is a
    normal PK/UK operation or a scan lock take-over operation.

  storage/ndb/include/ndbapi/NdbReceiver.hpp@stripped, 2007-02-20 10:34:36+01:00, knielsen@ymer.(none) +7 -3
    Allocate a single row buffer for table/index scan, and fix batch size
    calculations.

  storage/ndb/include/ndbapi/NdbScanOperation.hpp@stripped, 2007-02-20 10:34:36+01:00, knielsen@ymer.(none) +3 -0
    Allocate a single row buffer for table/index scan, and fix batch size
    calculations.

  storage/ndb/include/ndbapi/NdbTransaction.hpp@stripped, 2007-02-20 10:34:36+01:00, knielsen@ymer.(none) +8 -0
    Docs.

  storage/ndb/src/ndbapi/NdbDictionary.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +18 -0
    createRecord method for index now also takes base table pointer.

  storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +50 -4
    createRecord method for index now also takes base table pointer.
    Pre-compute some more table aggregated information, for use in setting
    distribution key during index range scans.

  storage/ndb/src/ndbapi/NdbDictionaryImpl.hpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +28 -2
    createRecord method for index now also takes base table pointer.
    Pre-compute some more table aggregated information, for use in setting
    distribution key during index range scans.

  storage/ndb/src/ndbapi/NdbOperationExec.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +10 -13
    Need to set table id and version differently depending on whether it is a
    normal PK/UK operation or a scan lock take-over operation.

  storage/ndb/src/ndbapi/NdbReceiver.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +32 -24
    Allocate a single row buffer for table/index scan, and fix batch size
    calculations.

  storage/ndb/src/ndbapi/NdbScanOperation.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +38 -15
    Allocate a single row buffer for table/index scan, and fix batch size
    calculations.

  storage/ndb/src/ndbapi/NdbTransaction.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +138 -10
    Add some error checks for user error in NdbRecord calls.
    Implement distribution key optimization for NdbRecord index range scans.

  storage/ndb/src/ndbapi/ndberror.c@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +3 -1
    Add error codes.

  storage/ndb/test/ndbapi/flexBench.cpp@stripped, 2007-02-20 10:34:37+01:00, knielsen@ymer.(none) +1 -1
    Allow testing with 128 attributes.
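
The "single row buffer" change listed for NdbReceiver and NdbScanOperation can be pictured with the sketch below: compute one word-aligned row size, allocate a single buffer for the whole scan, and give each receiver its own slice. This is a hedged illustration only. `scanRowSize`, `ReceiverSlice` and `carveScanBuffer` are hypothetical names; the 4 bytes reserved for range_no and the way the buffer is divided among receivers are assumptions, since that part of the patch is not shown here. The `8 + key_size*4` term and the rounding to a multiple of 4 follow the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Size of one received row, rounded up to a multiple of 4 bytes.
// Assumption: 4 bytes are reserved for range_no when it is read.
std::uint32_t scanRowSize(std::uint32_t recordRowSize,
                          std::uint32_t keyWords,
                          bool readRangeNo)
{
  std::uint32_t size = recordRowSize;
  if (readRangeNo)
    size += 4;
  if (keyWords > 0)
    size += 8 + keyWords * 4;          // key info plus its header
  return (size + 3) & ~std::uint32_t(3);
}

// Hypothetical view of the part of the shared buffer owned by one receiver.
struct ReceiverSlice
{
  char *rows;
  std::uint32_t rowSize;
};

// Allocate one buffer for all receivers and hand out disjoint slices of
// batchSize rows each (assumed layout).
std::vector<ReceiverSlice> carveScanBuffer(std::vector<char> &buffer,
                                           std::uint32_t receivers,
                                           std::uint32_t batchSize,
                                           std::uint32_t rowSize)
{
  const std::size_t perReceiver = std::size_t(batchSize) * rowSize;
  buffer.assign(perReceiver * receivers, 0);
  std::vector<ReceiverSlice> slices;
  for (std::uint32_t i = 0; i < receivers; i++)
    slices.push_back({buffer.data() + perReceiver * i, rowSize});
  assert(slices.size() == receivers);
  return slices;
}
```

One allocation owned by the scan operation means there is a single place to free the memory, which matches the patch moving the delete[] from NdbReceiver::release() to NdbScanOperation::release().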

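The KeyInfo.hpp comment fix and the new range_no check in scanIndex() both concern the first word of each range bound: bits 16-31 hold the bound length in words and bits 4-15 hold RANGE_NO, so RANGE_NO must be below 1 << 12. A minimal sketch of that packing follows. `packBoundHead`, `boundLength` and `boundRangeNo` are hypothetical helper names, and treating bits 0-3 as pre-existing content of the word is an assumption based on the patch OR-ing into the existing value.

```cpp
#include <cassert>
#include <cstdint>

// Largest RANGE_NO that fits in bits 4-15 of the bound head word.
constexpr std::uint32_t kMaxRangeNo = (1u << 12) - 1;

// OR the bound length (in words) and RANGE_NO into a head word whose low
// 4 bits are already set. Returns false if a field would overflow.
bool packBoundHead(std::uint32_t lowBits, std::uint32_t lengthWords,
                   std::uint32_t rangeNo, std::uint32_t &head)
{
  if (lowBits > 0xF || lengthWords > 0xFFFF || rangeNo > kMaxRangeNo)
    return false;
  head = lowBits | (lengthWords << 16) | (rangeNo << 4);
  return true;
}

// Bits 16-31: length of the bound in words.
std::uint32_t boundLength(std::uint32_t head)
{
  return head >> 16;
}

// Bits 4-15: RANGE_NO as specified by the application.
std::uint32_t boundRangeNo(std::uint32_t head)
{
  return (head >> 4) & kMaxRangeNo;
}
```

A range number of 4096 or more would spill into the length field, which is why the patch rejects it (with error 4286) before building the bound.
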
# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	knielsen
# Host:	ymer.(none)
# Root:	/usr/local/mysql/mysql-5.1-wl2223

--- 1.7/storage/ndb/include/kernel/signaldata/KeyInfo.hpp	2007-02-20 10:34:49 +01:00
+++ 1.8/storage/ndb/include/kernel/signaldata/KeyInfo.hpp	2007-02-20 10:34:49 +01:00
@@ -62,7 +62,7 @@ private:
   SCAN_TABREQ and associated ATTRINFO stream (using
   NdbIndexScanOperation::end_of_bound(RANGE_NO) ). In this case, the first word
   of each range bound contains additional information: bits 16-31 holds the
-  length of this bound, in words of ATTRINFO data, and bits 4-11 holds a
+  length of this bound, in words of ATTRINFO data, and bits 4-15 holds a
   number RANGE_NO specified by the application that can be read back from the
   RANGE_NO pseudo-column.
 

--- 1.59/storage/ndb/include/ndbapi/NdbTransaction.hpp	2007-02-20 10:34:49 +01:00
+++ 1.60/storage/ndb/include/ndbapi/NdbTransaction.hpp	2007-02-20 10:34:49 +01:00
@@ -641,6 +641,14 @@ public:
     bound_index and just return the values for the next bound (for example
     if data is kept in a linked list).
 
+    Note that for multi-range, the IndexBound::low_key and IndexBound::high_key
+    pointers must be unique, i.e. it is not permissible to re-use the same row
+    buffer for several different range bounds within a single scan. It is
+    however permissible to use the same row pointer as low_key and high_key (to
+    specify an equals bound), and it is also permissible to re-use the rows
+    after the scanIndex() method returns (i.e. they need not remain valid until
+    execute() time, like the NdbRecord pointers do).
+
     The callback can return 0 to denote success, and -1 to denote error (the
     latter causing the creation of the NdbIndexScanOperation to fail).
 

--- 1.91/storage/ndb/include/ndbapi/NdbDictionary.hpp	2007-02-20 10:34:49 +01:00
+++ 1.92/storage/ndb/include/ndbapi/NdbDictionary.hpp	2007-02-20 10:34:49 +01:00
@@ -1945,6 +1945,11 @@ public:
       Create an NdbRecord for use in index operations.
     */
     NdbRecord *createRecord(const Index *index,
+                            const Table *table,
+                            const RecordSpecification *recSpec,
+                            Uint32 length,
+                            Uint32 elemSize);
+    NdbRecord *createRecord(const Index *index,
                             const RecordSpecification *recSpec,
                             Uint32 length,
                             Uint32 elemSize);

--- 1.45/storage/ndb/include/ndbapi/NdbOperation.hpp	2007-02-20 10:34:49 +01:00
+++ 1.46/storage/ndb/include/ndbapi/NdbOperation.hpp	2007-02-20 10:34:49 +01:00
@@ -949,7 +949,6 @@ protected:
   Uint32 fillTcKeyReqHdr(TcKeyReq *tcKeyReq,
                          Uint32 connectPtr,
                          Uint64 transId,
-                         const NdbRecord *rec,
                          AbortOption ao);
   int    allocKeyInfo(Uint32 connectPtr, Uint64 transId,
                       Uint32 **dstPtr, Uint32 *remain);

--- 1.23/storage/ndb/include/ndbapi/NdbReceiver.hpp	2007-02-20 10:34:49 +01:00
+++ 1.24/storage/ndb/include/ndbapi/NdbReceiver.hpp	2007-02-20 10:34:49 +01:00
@@ -78,13 +78,17 @@ private:
   void getValues(const NdbRecord*, char*);
   void do_get_value(NdbReceiver*, Uint32 rows, Uint32 key_size, Uint32 range);
   void prepareSend();
-  void calculate_batch_size(Uint32, Uint32, Uint32&, Uint32&, Uint32&);
+  void calculate_batch_size(Uint32, Uint32, Uint32&, Uint32&, Uint32&,
+                            const NdbRecord *);
   /*
     Set up buffers for receiving TRANSID_AI and KEYINFO20 signals
     during a scan using NdbRecord.
   */
-  int do_setup_ndbrecord(const NdbRecord *ndb_record, Uint32 batch_size,
-                         Uint32 key_size, Uint32 read_range_no);
+  void do_setup_ndbrecord(const NdbRecord *ndb_record, Uint32 batch_size,
+                          Uint32 key_size, Uint32 read_range_no,
+                          Uint32 rowsize, char *buf);
+  Uint32 ndbrecord_rowsize(const NdbRecord *ndb_record, Uint32 key_size,
+                           Uint32 read_range_no);
 
   int execKEYINFO20(Uint32 info, const Uint32* ptr, Uint32 len);
   int execTRANSID_AI(const Uint32* ptr, Uint32 len); 

--- 1.47/storage/ndb/include/ndbapi/NdbScanOperation.hpp	2007-02-20 10:34:49 +01:00
+++ 1.48/storage/ndb/include/ndbapi/NdbScanOperation.hpp	2007-02-20 10:34:49 +01:00
@@ -401,6 +401,9 @@ protected:
   NdbRecAttr *m_curr_row;
   bool m_multi_range; // Mark if operation is part of multi-range scan
   bool m_executed; // Marker if operation should be released at close
+
+  /* Buffer for rows received during NdbRecord scans, or NULL. */
+  char *m_scan_buffer;
 };
 
 inline

--- 1.76/storage/ndb/src/ndbapi/NdbTransaction.cpp	2007-02-20 10:34:49 +01:00
+++ 1.77/storage/ndb/src/ndbapi/NdbTransaction.cpp	2007-02-20 10:34:49 +01:00
@@ -2094,10 +2094,19 @@ NdbTransaction::setupRecordOp(NdbOperati
     implementing WL#3707.
   */
   if (key_record->flags & NdbRecord::RecIsIndex)
+  {
     op= getNdbIndexOperation(key_record->table->m_index, key_record->table,
                              NULL, true);
+  }
   else
+  {
+    if (key_record->tableId != attribute_record->tableId)
+    {
+      setOperationErrorCodeAbort(4287);
+      return NULL;
+    }
     op= getNdbOperation(key_record->table, NULL, true);
+  }
   if(!op)
     return op;
 
@@ -2248,8 +2257,9 @@ NdbTransaction::scanTable(const NdbRecor
                           Uint32 batch)
 {
   /*
-    For some reason, normal scan operations are created as index scan
-    operations ... :-(
+    Normal scan operations are created as NdbIndexScanOperations.
+    The reason for this is that they can then share a pool of allocated
+    objects.
   */
   NdbIndexScanOperation *op_idx;
   NdbScanOperation *op;
@@ -2308,6 +2318,84 @@ NdbTransaction::scanTable(const NdbRecor
   return NULL;
 }
 
+/*
+  Compare two rows on some prefix of the index.
+  This is used to see if we can determine that all rows in an index range scan
+  will come from a single fragment (if the two rows bound a single distribution
+  key).
+ */
+static int
+compare_index_row_prefix(const NdbRecord *rec,
+                         const char *row1,
+                         const char *row2,
+                         Uint32 prefix_length)
+{
+  Uint32 i;
+
+  for (i= 0; i<prefix_length; i++)
+  {
+    const NdbRecord::Attr *col= &rec->columns[rec->key_indexes[i]];
+
+    bool is_null1= col->is_null(row1);
+    bool is_null2= col->is_null(row2);
+    if (is_null1)
+    {
+      if (!is_null2)
+        return -1;
+      /* Fall-through to compare next one. */
+    }
+    else
+    {
+      if (is_null2)
+        return 1;
+
+      Uint32 offset= col->offset;
+      Uint32 maxSize= col->maxSize;
+      const char *ptr1= row1 + offset;
+      const char *ptr2= row2 + offset;
+      void *info= col->charset_info;
+      int res=
+        (*col->compare_function)(info, ptr1, maxSize, ptr2, maxSize, true);
+      if (res)
+      {
+        assert(res != NdbSqlUtil::CmpUnknown);
+        return res;
+      }
+    }
+  }
+
+  return 0;
+}
+
+static void
+set_distribution_key_from_range(NdbIndexScanOperation *op,
+                                const NdbRecord *record,
+                                const char *row,
+                                Uint32 distkey_max)
+{
+  Uint64 tmp[1000];
+  char *dst= (char *)(&tmp[0]);
+  Uint32 i;
+  Uint32 len, padded_len, total_len;
+
+  total_len= 0;
+  for (i= 0; i<record->distkey_index_length; i++)
+  {
+    const NdbRecord::Attr *col= &record->columns[record->distkey_indexes[i]];
+    /* Distribution key cannot be NULL. */
+    col->get_var_length(row, len);
+    padded_len= (len+3)&~3;
+    total_len+= padded_len;
+    if (total_len > sizeof(tmp))
+      break;
+    memcpy(dst, row + col->offset, len);
+    if (padded_len != len)
+      bzero(dst + len, padded_len - len);
+    dst+= padded_len;
+  }
+  op->setPartitionHash(tmp, total_len>>2);
+}
+
 NdbIndexScanOperation *
 NdbTransaction::scanIndex(
             const NdbRecord *key_record,
@@ -2342,7 +2430,9 @@ NdbTransaction::scanIndex(
     setOperationErrorCodeAbort(4279);
     return NULL;
   }
-  if (key_record->tableId != result_record->tableId)
+  if (!(key_record->flags & NdbRecord::RecIsIndex) ||
+      !(result_record->flags & NdbRecord::RecIsIndex) ||
+      key_record->tableId != result_record->tableId)
   {
     setOperationErrorCodeAbort(4283);
     return NULL;
@@ -2415,7 +2505,9 @@ NdbTransaction::scanIndex(
   {
     NdbIndexScanOperation::IndexBound bound;
     int res;
-    Uint32 key_count;
+    Uint32 key_count, common_key_count;
+    Uint32 range_no;
+    Uint32 bound_head;
 
     res= get_key_bound_callback(callback_data, i, bound);
     if (res==-1)
@@ -2423,20 +2515,30 @@ NdbTransaction::scanIndex(
       setOperationErrorCodeAbort(4280);
       goto giveup_err;
     }
+    range_no= bound.range_no;
+    if (range_no >= (1 << 12))
+    {
+      setOperationErrorCodeAbort(4286);
+      goto giveup_err;
+    }
+
     if ( (scan_flags & NdbScanOperation::SF_ReadRangeNo) &&
          (scan_flags & NdbScanOperation::SF_OrderBy) )
     {
-      if (i>0 && previous_range_no >= bound.range_no)
+      if (i>0 && previous_range_no >= range_no)
       {
         setOperationErrorCodeAbort(4282);
         goto giveup_err;
       }
-      previous_range_no= bound.range_no;
+      previous_range_no= range_no;
     }
 
     key_count= bound.low_key_count;
+    common_key_count= key_count;
     if (key_count < bound.high_key_count)
       key_count= bound.high_key_count;
+    else
+      common_key_count= bound.high_key_count;
 
     if (key_count > key_record->key_index_length)
     {
@@ -2464,12 +2566,38 @@ NdbTransaction::scanIndex(
                                      NdbIndexScanOperation::BoundGT);
       }
     }
-    op->end_of_bound(bound.range_no);
+
+    /* Set the length of this bound. */
+    bound_head= *op->m_first_bound_word;
+    bound_head|=
+      (op->theTupKeyLen - op->m_this_bound_start) << 16 | (range_no << 4);
+    *op->m_first_bound_word= bound_head;
+    op->m_first_bound_word= op->theKEYINFOptr + op->theTotalNrOfKeyWordInSignal;
+    op->m_this_bound_start= op->theTupKeyLen;
+
     /*
-      ToDo: If both lower and upper, non-strict bounds, compare for
-      equality, to transform into a single 'equal' constraint, and possibly
-      set distributionKey.
+      Now check if the range bounds a single distribution key. If so, we need
+      scan only a single fragment.
+
+      ToDo: we do not attempt to identify the case where we have multiple
+      ranges, but they all bound the same single distribution key. It seems
+      not really worth the effort to optimise this case, better to fix the
+      multi-range protocol so that the distribution key could be specified
+      individually for each of the multiple ranges.
     */
+    if (num_key_bounds==1)
+    {
+      Uint32 distkey_min= key_record->m_min_distkey_prefix_length;
+      if (distkey_min > 0 &&
+          common_key_count >= distkey_min &&
+          bound.low_key &&
+          bound.high_key &&
+          0==compare_index_row_prefix(key_record,
+                                      bound.low_key,
+                                      bound.high_key,
+                                      distkey_min))
+        set_distribution_key_from_range(op, key_record, bound.low_key, distkey_min);
+    }
   }
 
   return op;

--- 1.71/storage/ndb/src/ndbapi/NdbDictionary.cpp	2007-02-20 10:34:49 +01:00
+++ 1.72/storage/ndb/src/ndbapi/NdbDictionary.cpp	2007-02-20 10:34:49 +01:00
@@ -1499,11 +1499,29 @@ NdbDictionary::Dictionary::createRecord(
 
 NdbRecord *
 NdbDictionary::Dictionary::createRecord(const Index *index,
+                                        const Table *table,
                                         const RecordSpecification *recSpec,
                                         Uint32 length,
                                         Uint32 elemSize)
 {
   return m_impl.createRecord(&NdbIndexImpl::getImpl(*index),
+                             &NdbTableImpl::getImpl(*table),
+                             recSpec,
+                             length,
+                             elemSize);
+}
+
+NdbRecord *
+NdbDictionary::Dictionary::createRecord(const Index *index,
+                                        const RecordSpecification *recSpec,
+                                        Uint32 length,
+                                        Uint32 elemSize)
+{
+  const NdbDictionary::Table *table= getTable(index->getTable());
+  if (!table)
+    return NULL;
+  return m_impl.createRecord(&NdbIndexImpl::getImpl(*index),
+                             &NdbTableImpl::getImpl(*table),
                              recSpec,
                              length,
                              elemSize);

--- 1.161/storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp	2007-02-20 10:34:49 +01:00
+++ 1.162/storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp	2007-02-20 10:34:49 +01:00
@@ -4530,10 +4530,11 @@ NdbRecord *
 NdbDictionaryImpl::createRecord(const NdbTableImpl *table,
                                 const NdbDictionary::RecordSpecification *recSpec,
                                 Uint32 length,
-                                Uint32 elemSize)
+                                Uint32 elemSize,
+                                const NdbTableImpl *base_table)
 {
   NdbRecord *rec= NULL;
-  Uint32 numKeys, tableNumKeys;
+  Uint32 numKeys, tableNumKeys, numIndexDistrKeys, min_distkey_prefix_length;
   Uint32 oldAttrId;
   bool isIndex;
   Uint32 i;
@@ -4567,15 +4568,23 @@ NdbDictionaryImpl::createRecord(const Nd
         tableNumKeys++;
     }
   }
+  Uint32 tableNumDistKeys;
+  if (base_table->m_noOfDistributionKeys != 0)
+    tableNumDistKeys= base_table->m_noOfDistributionKeys;
+  else
+    tableNumDistKeys= base_table->m_noOfKeys;
+
   /*
     We need to allocate space for
      1. The struct itself.
      2. The columns[] array at the end of struct (length #columns).
      3. An extra Uint32 array key_indexes (length #key columns).
+     4. An extra Uint32 array distkey_indexes (length #distribution keys).
   */
   rec= (NdbRecord *)calloc(1, sizeof(NdbRecord) +
                               (length-1)*sizeof(NdbRecord::Attr) +
-                              tableNumKeys*sizeof(Uint32));
+                              tableNumKeys*sizeof(Uint32) +
+                              tableNumDistKeys*sizeof(Uint32));
   if (!rec)
   {
     m_error.code= 4000;
@@ -4583,14 +4592,19 @@ NdbDictionaryImpl::createRecord(const Nd
   }
   Uint32 *key_indexes= (Uint32 *)((unsigned char *)rec + sizeof(NdbRecord) +
                                   (length-1)*sizeof(NdbRecord::Attr));
+  Uint32 *distkey_indexes= (Uint32 *)((unsigned char *)rec + sizeof(NdbRecord) +
+                                      (length-1)*sizeof(NdbRecord::Attr) +
+                                      tableNumKeys*sizeof(Uint32));
 
   rec->table= table;
   rec->tableId= table->m_id;
   rec->tableVersion= table->m_version;
   rec->flags= 0;
   rec->noOfColumns= length;
+  rec->m_no_of_distribution_keys= tableNumDistKeys;
 
   Uint32 max_offset= 0;
+  Uint32 max_transid_ai_bytes= 0;
   for (i= 0; i<length; i++)
   {
     const NdbDictionary::RecordSpecification *rs= &recSpec[i];
@@ -4617,6 +4631,8 @@ NdbDictionaryImpl::createRecord(const Nd
     recCol->maxSize= col->m_attrSize*col->m_arraySize;
     if (recCol->offset+recCol->maxSize > max_offset)
       max_offset= recCol->offset+recCol->maxSize;
+    /* Round data size to whole words + 4 bytes of AttributeHeader. */
+    max_transid_ai_bytes+= (recCol->maxSize+7) & ~3;
     recCol->charset_info= col->m_cs;
     recCol->compare_function= NdbSqlUtil::getType(col->m_type).m_cmp;
     recCol->flags= 0;
@@ -4646,8 +4662,11 @@ NdbDictionaryImpl::createRecord(const Nd
     {
       isVarCol= false;
     }
+    if (col->m_distributionKey)
+      recCol->flags|= NdbRecord::IsDistributionKey;
   }
   rec->m_row_size= max_offset;
+  rec->m_max_transid_ai_bytes= max_transid_ai_bytes;
 
   /* Now we sort the array in attrId order. */
   qsort(rec->columns,
@@ -4662,10 +4681,14 @@ NdbDictionaryImpl::createRecord(const Nd
 
     Also test for duplicate columns, easy now that they are sorted.
     Also set up key_indexes array.
+    Also compute if an index includes all of the distribution key.
+    Also set up distkey_indexes array.
   */
 
   oldAttrId= ~0;
   numKeys= 0;
+  min_distkey_prefix_length= 0;
+  numIndexDistrKeys= 0;
   for (i= 0; i<rec->noOfColumns; i++)
   {
     NdbRecord::Attr *recCol= &rec->columns[i];
@@ -4688,6 +4711,13 @@ NdbDictionaryImpl::createRecord(const Nd
         key_indexes[key_idx]= i;
         recCol->index_attrId= table->m_columns[key_idx]->m_attrId;
         numKeys++;
+
+        if (recCol->flags & NdbRecord::IsDistributionKey)
+        {
+          if (min_distkey_prefix_length <= key_idx)
+            min_distkey_prefix_length= key_idx+1;
+          distkey_indexes[numIndexDistrKeys++]= i;
+        }
       }
     }
     else
@@ -4701,6 +4731,13 @@ NdbDictionaryImpl::createRecord(const Nd
   }
   rec->key_indexes= key_indexes;
   rec->key_index_length= tableNumKeys;
+  if (numIndexDistrKeys==rec->m_no_of_distribution_keys)
+    rec->m_min_distkey_prefix_length= min_distkey_prefix_length;
+  else
+    rec->m_min_distkey_prefix_length= 0;
+  rec->distkey_indexes= distkey_indexes;
+  rec->distkey_index_length= numIndexDistrKeys;
+
   /*
     Since we checked for duplicates, we can check for primary key completeness
     simply by counting.
@@ -4712,7 +4749,14 @@ NdbDictionaryImpl::createRecord(const Nd
       rec->flags|= NdbRecord::RecIsKeyRecord;
   }
   if (isIndex)
+  {
     rec->flags|= NdbRecord::RecIsIndex;
+    rec->m_keyLenInWords= base_table->m_keyLenInWords;
+  }
+  else
+  {
+    rec->m_keyLenInWords= table->m_keyLenInWords;
+  }
 
   return rec;
 
@@ -4724,11 +4768,13 @@ NdbDictionaryImpl::createRecord(const Nd
 
 NdbRecord *
 NdbDictionaryImpl::createRecord(const NdbIndexImpl *index_impl,
+                                const NdbTableImpl *base_table_impl,
                                 const NdbDictionary::RecordSpecification *recSpec,
                                 Uint32 length,
                                 Uint32 elemSize)
 {
-  return createRecord(index_impl->getIndexTable(), recSpec, length, elemSize);
+  return createRecord(index_impl->getIndexTable(), recSpec, length, elemSize,
+                      base_table_impl);
 }
 
 void

--- 1.75/storage/ndb/src/ndbapi/NdbDictionaryImpl.hpp	2007-02-20 10:34:49 +01:00
+++ 1.76/storage/ndb/src/ndbapi/NdbDictionaryImpl.hpp	2007-02-20 10:34:49 +01:00
@@ -623,7 +623,9 @@ public:
       The flags are mutually exclusive.
     */
     IsVar1ByteLen= 0x08,
-    IsVar2ByteLen= 0x10
+    IsVar2ByteLen= 0x10,
+    /* Flag for column that is a part of the distribution key. */
+    IsDistributionKey= 0x20
   };
 
   struct Attr
@@ -686,6 +688,12 @@ public:
 
   Uint32 tableId;
   Uint32 tableVersion;
+  /* Copy of table->m_keyLenInWords. */
+  Uint32 m_keyLenInWords;
+  /* Total maximum size of TRANSID_AI data (for computing batch size). */
+  Uint32 m_max_transid_ai_bytes;
+  /* Number of distribution keys (usually == number of primary keys). */
+  Uint32 m_no_of_distribution_keys;
   /* Flags, or-ed from enum RecFlags. */
   Uint32 flags;
   /* Size of row (really end of right-most defined attribute in row). */
@@ -699,7 +707,23 @@ public:
   const Uint32 *key_indexes;
   /* Length of key_indexes array. */
   Uint32 key_index_length;
+  /*
+    Array of index (into columns[]) of distribution keys, in attrId order.
+    This is used to build the distribution key, which is the concatenation
+    of key values in attrId order.
+  */
+  const Uint32 *distkey_indexes;
+  /* Length of distkey_indexes array. */
+  Uint32 distkey_index_length;
 
+  /*
+    m_min_distkey_prefix_length is the minimum length of an index prefix
+    needed to include all distribution keys. In other words, it is one more
+    than the index of the last distribution key in the index order.
+    If the index does not include all distribution keys, it is set to 0.
+    This member is only valid for an index NdbRecord.
+  */
+  Uint32 m_min_distkey_prefix_length;
   /* The real size of the array at the end of this struct. */
   Uint32 noOfColumns;
   struct Attr columns[1];
@@ -814,8 +838,10 @@ public:
   NdbRecord *createRecord(const NdbTableImpl *table,
                           const NdbDictionary::RecordSpecification *recSpec,
                           Uint32 length,
-                          Uint32 elemSize);
+                          Uint32 elemSize,
+                          const NdbTableImpl *base_table= 0);
   NdbRecord *createRecord(const NdbIndexImpl *index,
+                          const NdbTableImpl *base_table,
                           const NdbDictionary::RecordSpecification *recSpec,
                           Uint32 length,
                           Uint32 elemSize);

--- 1.30/storage/ndb/src/ndbapi/NdbOperationExec.cpp	2007-02-20 10:34:49 +01:00
+++ 1.31/storage/ndb/src/ndbapi/NdbOperationExec.cpp	2007-02-20 10:34:49 +01:00
@@ -541,12 +541,11 @@ NdbOperation::prepareSendNdbRecord(Uint3
 
   const NdbRecord *key_rec= m_key_record;
   const char *key_row= m_key_row;
-  const NdbRecord *result_rec, *upd_rec;
+  const NdbRecord *attr_rec= m_attribute_record;
   const char *updRow;
 
   TcKeyReq *tcKeyReq= CAST_PTR(TcKeyReq, theTCREQ->getDataPtrSend());
-  Uint32 hdrSize= fillTcKeyReqHdr(tcKeyReq, aTC_ConnectPtr, aTransId,
-                                  m_key_record, ao);
+  Uint32 hdrSize= fillTcKeyReqHdr(tcKeyReq, aTC_ConnectPtr, aTransId, ao);
   keyInfoPtr= theTCREQ->getDataPtrSend() + hdrSize;
   remain= TcKeyReq::MaxKeyInfo;
 
@@ -554,6 +553,8 @@ NdbOperation::prepareSendNdbRecord(Uint3
   if (!key_rec)
   {
     /* This means that key_row contains the KEYINFO20 data. */
+    tcKeyReq->tableId= attr_rec->tableId;
+    tcKeyReq->tableSchemaVersion= attr_rec->tableVersion;
     res= insertKEYINFO_NdbRecord(aTC_ConnectPtr, aTransId, key_row,
                                  m_keyinfo_length*4, &keyInfoPtr, &remain);
     if (res)
@@ -561,6 +562,8 @@ NdbOperation::prepareSendNdbRecord(Uint3
   }
   else
   {
+    tcKeyReq->tableId= key_rec->tableId;
+    tcKeyReq->tableSchemaVersion= key_rec->tableVersion;
     theTotalNrOfKeyWordInSignal= 0;
     for (Uint32 i= 0; i<key_rec->key_index_length; i++)
     {
@@ -601,13 +604,12 @@ NdbOperation::prepareSendNdbRecord(Uint3
   if ((tOpType == InsertRequest) || (tOpType == WriteRequest) ||
       (tOpType == UpdateRequest))
   {
-    upd_rec= m_attribute_record;
     updRow= m_attribute_row;
-    for (Uint32 i= 0; i<upd_rec->noOfColumns; i++)
+    for (Uint32 i= 0; i<attr_rec->noOfColumns; i++)
     {
       const NdbRecord::Attr *col;
 
-      col= &upd_rec->columns[i];
+      col= &attr_rec->columns[i];
       Uint32 attrId= col->attrId;
 
       if (!(attrId & AttributeHeader::PSEUDO) &&
@@ -644,12 +646,11 @@ NdbOperation::prepareSendNdbRecord(Uint3
   }
   else if (tOpType == ReadRequest)
   {
-    result_rec= theReceiver.m_record.m_ndb_record;
-    for (Uint32 i= 0; i<result_rec->noOfColumns; i++)
+    for (Uint32 i= 0; i<attr_rec->noOfColumns; i++)
     {
       const NdbRecord::Attr *col;
 
-      col= &result_rec->columns[i];
+      col= &attr_rec->columns[i];
       Uint32 attrId= col->attrId;
 
       if (!(attrId & AttributeHeader::PSEUDO) &&
@@ -702,7 +703,6 @@ Uint32
 NdbOperation::fillTcKeyReqHdr(TcKeyReq *tcKeyReq,
                               Uint32 connectPtr,
                               Uint64 transId,
-                              const NdbRecord *rec,
                               AbortOption ao)
 {
   Uint32 hdrLen;
@@ -716,8 +716,6 @@ NdbOperation::fillTcKeyReqHdr(TcKeyReq *
   /* We will setAttrinfoLen() later when AttrInfo has been written. */
   tcKeyReq->attrLen= attrLen;
 
-  tcKeyReq->tableId= rec->tableId;
-
   UintR reqInfo= 0;
   TcKeyReq::setSimpleFlag(reqInfo, theSimpleIndicator);
   TcKeyReq::setCommitFlag(reqInfo, theCommitIndicator);
@@ -735,7 +733,6 @@ NdbOperation::fillTcKeyReqHdr(TcKeyReq *
   /* We will setAIInTcKeyReq() and setKeyLength() later. */
   tcKeyReq->requestInfo= reqInfo;
 
-  tcKeyReq->tableSchemaVersion= rec->tableVersion;
   tcKeyReq->transId1= (Uint32)transId;
   tcKeyReq->transId2= (Uint32)(transId>>32);
 

--- 1.24/storage/ndb/src/ndbapi/NdbReceiver.cpp	2007-02-20 10:34:49 +01:00
+++ 1.25/storage/ndb/src/ndbapi/NdbReceiver.cpp	2007-02-20 10:34:49 +01:00
@@ -89,7 +89,6 @@ NdbReceiver::release(){
   }
   else
   {
-    delete[] m_record.m_row_buffer;
     m_record.m_row_buffer= NULL;
   }      
 }
@@ -137,20 +136,28 @@ NdbReceiver::calculate_batch_size(Uint32
                                   Uint32 parallelism,
                                   Uint32& batch_size,
                                   Uint32& batch_byte_size,
-                                  Uint32& first_batch_size)
+                                  Uint32& first_batch_size,
+                                  const NdbRecord *record)
 {
   TransporterFacade *tp= m_ndb->theImpl->m_transporter_facade;
   Uint32 max_scan_batch_size= tp->get_scan_batch_size();
   Uint32 max_batch_byte_size= tp->get_batch_byte_size();
   Uint32 max_batch_size= tp->get_batch_size();
   Uint32 tot_size= (key_size ? (key_size + 32) : 0); //key + signal overhead
-  /* ToDo: Use the NdbRecord to calculate size here instead for NdbRecord operation. */
-  NdbRecAttr *rec_attr= m_recattr.theFirstRecAttr;
-  while (rec_attr != NULL) {
-    Uint32 attr_size= rec_attr->getColumn()->getSizeInBytes();
-    attr_size= ((attr_size + 7) >> 2) << 2; //Even to word + overhead
-    tot_size+= attr_size;
-    rec_attr= rec_attr->next();
+
+  if (record)
+  {
+    tot_size+= record->m_max_transid_ai_bytes;
+  }
+  else
+  {
+    NdbRecAttr *rec_attr= m_recattr.theFirstRecAttr;
+    while (rec_attr != NULL) {
+      Uint32 attr_size= rec_attr->getColumn()->getSizeInBytes();
+      attr_size= ((attr_size + 7) >> 2) << 2; //Even to word + overhead
+      tot_size+= attr_size;
+      rec_attr= rec_attr->next();
+    }
   }
   tot_size+= 32; //include signal overhead
 
@@ -256,9 +263,22 @@ NdbReceiver::do_get_value(NdbReceiver * 
   return;
 }
 
-int
+void
 NdbReceiver::do_setup_ndbrecord(const NdbRecord *ndb_record, Uint32 batch_size,
-                                Uint32 key_size, Uint32 read_range_no)
+                                Uint32 key_size, Uint32 read_range_no,
+                                Uint32 rowsize, char *row_buffer)
+{
+  m_using_ndb_record= true;
+  m_record.m_ndb_record= ndb_record;
+  m_record.m_row= row_buffer;
+  m_record.m_row_buffer= row_buffer;
+  m_record.m_row_offset= rowsize;
+  m_record.m_read_range_no= read_range_no;
+}
+
+Uint32
+NdbReceiver::ndbrecord_rowsize(const NdbRecord *ndb_record, Uint32 key_size,
+                               Uint32 read_range_no)
 {
   Uint32 rowsize= ndb_record->m_row_size;
   /* Room for range_no. */
@@ -272,19 +292,7 @@ NdbReceiver::do_setup_ndbrecord(const Nd
     rowsize+= 8 + key_size*4;
   /* Ensure 4-byte alignment. */
   rowsize= (rowsize+3) & 0xfffffffc;
-
-  char *row_buffer= new char[batch_size*rowsize];
-  if (!row_buffer)
-    return -1;
-
-  m_using_ndb_record= true;
-  m_record.m_ndb_record= ndb_record;
-  m_record.m_row= row_buffer;
-  m_record.m_row_buffer= row_buffer;
-  m_record.m_row_offset= rowsize;
-  m_record.m_read_range_no= read_range_no;
-
-  return 0;
+  return rowsize;
 }
 
 NdbRecAttr*

--- 1.109/storage/ndb/src/ndbapi/NdbScanOperation.cpp	2007-02-20 10:34:49 +01:00
+++ 1.110/storage/ndb/src/ndbapi/NdbScanOperation.cpp	2007-02-20 10:34:49 +01:00
@@ -50,6 +50,7 @@ NdbScanOperation::NdbScanOperation(Ndb* 
   m_array = new Uint32[1]; // skip if on delete in fix_receivers
   theSCAN_TABREQ = 0;
   m_executed = false;
+  m_scan_buffer= NULL;
 }
 
 NdbScanOperation::~NdbScanOperation()
@@ -257,7 +258,13 @@ NdbScanOperation::fix_receivers(Uint32 p
   if(parallel > m_allocated_receivers){
     const Uint32 sz = parallel * (4*sizeof(char*)+sizeof(Uint32));
 
+    /* Allocate as Uint64 to ensure proper alignment for pointers. */
     Uint64 * tmp = new Uint64[(sz+7)/8];
+    if (tmp == NULL)
+    {
+      setErrorCodeAbort(4000);
+      return -1;
+    }
     // Save old receivers
     memcpy(tmp, m_receivers, m_allocated_receivers*sizeof(char*));
     delete[] m_array;
@@ -889,6 +896,11 @@ void NdbScanOperation::release()
   for(Uint32 i = 0; i<m_allocated_receivers; i++){
     m_receivers[i]->release();
   }
+  if (m_scan_buffer)
+  {
+    delete[] m_scan_buffer;
+    m_scan_buffer= NULL;
+  }
 
   NdbOperation::release();
   
@@ -947,7 +959,10 @@ int NdbScanOperation::prepareSendScan(Ui
    */
   theReceiver.prepareSend();
   bool keyInfo = m_keyInfo;
-  Uint32 key_size = keyInfo ? m_currentTable->m_keyLenInWords : 0;
+  Uint32 key_size= keyInfo ?
+    (m_attribute_record ? m_attribute_record->m_keyLenInWords :
+                          m_currentTable->m_keyLenInWords) :
+    0;
   /**
    * The number of records sent by each LQH is calculated and the kernel
    * is informed of this number by updating the SCAN_TABREQ signal
@@ -959,7 +974,8 @@ int NdbScanOperation::prepareSendScan(Ui
                                    theParallelism,
                                    batch_size,
                                    batch_byte_size,
-                                   first_batch_size);
+                                   first_batch_size,
+                                   m_attribute_record);
   ScanTabReq::setScanBatch(req->requestInfo, batch_size);
   req->batch_byte_size= batch_byte_size;
   req->first_batch_size= first_batch_size;
@@ -975,20 +991,27 @@ int NdbScanOperation::prepareSendScan(Ui
   
   if (theStatus == UseNdbRecord)
   {
-    for (Uint32 i = 0; i<theParallelism; i++)
+    if (theParallelism > 0)
     {
-      /*
-        ToDo: Allocate receive buffers here in one big chunk, and hand it over
-        to receivers in pieces.
-        Needs some delicate handling of the way the memory is freed to avoid
-        dangling pointers and leaks in error case.
-      */
-      res= m_receivers[i]->do_setup_ndbrecord(m_attribute_record, batch_size,
-                                              key_size, m_read_range_no);
-      if (res==-1)
+      Uint32 rowsize= m_receivers[0]->ndbrecord_rowsize(m_attribute_record,
+                                                        key_size,
+                                                        m_read_range_no);
+      Uint32 bufsize= batch_size*rowsize;
+      char *buf= new char[bufsize*theParallelism];
+      if (!buf)
       {
         setErrorCodeAbort(4000); // "Memory allocation error"
-        return res;
+        return -1;
+      }
+      assert(!m_scan_buffer);
+      m_scan_buffer= buf;
+
+      for (Uint32 i = 0; i<theParallelism; i++)
+      {
+        m_receivers[i]->do_setup_ndbrecord(m_attribute_record, batch_size,
+                                           key_size, m_read_range_no,
+                                           rowsize, buf);
+        buf+= bufsize;
       }
     }
   }
@@ -2400,13 +2423,13 @@ NdbIndexScanOperation::end_of_bound(Uint
      than one range is specified */
   if (no > 0 && !m_multi_range)
     DBUG_RETURN(-1);
-  if(no < (1 << 13)) // Only 12-bits no of ranges
+  if(no < (1 << 12)) // Only 12-bits no of ranges
   {
     Uint32 bound_head = * m_first_bound_word;
     bound_head |= (theTupKeyLen - m_this_bound_start) << 16 | (no << 4);
     * m_first_bound_word = bound_head;
     
-    m_first_bound_word = theKEYINFOptr + theTotalNrOfKeyWordInSignal;;
+    m_first_bound_word = theKEYINFOptr + theTotalNrOfKeyWordInSignal;
     m_this_bound_start = theTupKeyLen;
     DBUG_RETURN(0);
   }

--- 1.18/storage/ndb/test/ndbapi/flexBench.cpp	2007-02-20 10:34:49 +01:00
+++ 1.19/storage/ndb/test/ndbapi/flexBench.cpp	2007-02-20 10:34:49 +01:00
@@ -60,7 +60,7 @@ Arguments:
 #include <NdbTest.hpp>
 
 #define MAXSTRLEN 16 
-#define MAXATTR 64
+#define MAXATTR 128
 #define MAXTABLES 128
 #define MAXATTRSIZE 1000
 #define MAXNOLONGKEY 16 // Max number of long keys.

--- 1.84/storage/ndb/src/ndbapi/ndberror.c	2007-02-20 10:34:49 +01:00
+++ 1.85/storage/ndb/src/ndbapi/ndberror.c	2007-02-20 10:34:49 +01:00
@@ -623,7 +623,9 @@ ErrorBundle ErrorCodes[] = {
   { 4282, DMEC, AE, "range_no not strictly increasing in ordered multi-range index scan" },
   { 4283, DMEC, AE, "key_record and result_record in index scan are not for the same index" },
   { 4284, DMEC, AE, "Cannot mix NdbRecAttr and NdbRecord operations for scan take-over" },
-  { 4285, DMEC, AE, "NULL NdbRecord pointer" }
+  { 4285, DMEC, AE, "NULL NdbRecord pointer" },
+  { 4286, DMEC, AE, "Invalid range_no (must be < 4096)" },
+  { 4287, DMEC, AE, "The key_record and attribute_record in primary key operation do not belong to the same table" }
 };
 
 static
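
Note on the scan buffer change above: the prepareSendScan() hunk replaces one allocation per receiver with a single allocation that is handed out in equal slices of batch_size*rowsize bytes, where rowsize has already been rounded up to a multiple of 4. The following is only a minimal standalone sketch of that arithmetic, not the NDB API code; align4() and slice_offsets() are hypothetical names introduced for illustration, and the sketch returns offsets instead of touching real receivers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Round a byte count up to the next multiple of 4, the same masking
// trick as "rowsize= (rowsize+3) & 0xfffffffc" in ndbrecord_rowsize().
inline std::uint32_t align4(std::uint32_t bytes)
{
  return (bytes + 3u) & 0xfffffffcu;
}

// Hypothetical helper (not part of the patch): compute the byte offset at
// which each receiver's slice starts inside one shared scan buffer.
// Receiver i owns [offsets[i], offsets[i] + batch_size*rowsize).
std::vector<std::size_t> slice_offsets(std::uint32_t parallelism,
                                       std::uint32_t batch_size,
                                       std::uint32_t rowsize)
{
  assert(rowsize % 4 == 0);  // caller is expected to pass an aligned row size
  // Widen before multiplying so large batches cannot wrap a 32-bit value.
  const std::size_t bufsize = static_cast<std::size_t>(batch_size) * rowsize;
  std::vector<std::size_t> offsets;
  offsets.reserve(parallelism);
  for (std::uint32_t i = 0; i < parallelism; i++)
    offsets.push_back(i * bufsize);
  return offsets;
}
```

With parallelism 3, batch_size 4 and an unaligned row size of 13 bytes, align4() gives 16 and the slices start at offsets 0, 64 and 128. The sketch widens to size_t before multiplying; the patch itself keeps the products in Uint32, which is presumably fine for the batch sizes the kernel allows, but that is an assumption rather than something this diff shows.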