List: Commits
From: Ole John Aske
Date: November 9 2011 8:55am
Subject: bzr push into mysql-5.1-telco-7.0 branch (ole.john.aske:4640 to 4641)
 4641 Ole John Aske	2011-11-09
      Fix for better load balancing of SPJ operations and reduced execution latency:
      
      Add a deterministic skew to the fragment list such that it is rotated
      one place for each SPJ request. As a result, a non-parallel scan is
      no longer started on the same subset of fragments by all SPJ
      executors, which gives better load balancing and reduced latency.
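
The rotation described above can be illustrated in isolation. The following is a minimal sketch, not the Dbspj code itself: `rotatedFragmentOrder` is a hypothetical helper, and plain `std::uint32_t` values stand in for the kernel's own integer types. It only shows how an offset derived from the root fragment id changes which fragment comes first in the list.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Dbspj): returns the order in which
// fragment numbers would be added to the list when the list is rotated
// by an offset derived from the root fragment id.
std::vector<std::uint32_t> rotatedFragmentOrder(std::uint32_t rootFragId,
                                                std::uint32_t fragCount)
{
  assert(fragCount > 0);  // precondition: avoids modulo by zero

  std::vector<std::uint32_t> order;
  order.reserve(fragCount);

  // The offset selects the first fragment in the list.
  const std::uint32_t offset = rootFragId % fragCount;
  for (std::uint32_t i = 0; i < fragCount; i++)
  {
    // Wrap around so every fragment still appears exactly once.
    order.push_back((offset + i) % fragCount);
  }
  return order;
}
```

With four fragments, a root fragment id of 0 yields the order 0, 1, 2, 3, while a root fragment id of 2 yields 2, 3, 0, 1, so requests rooted in different fragments start on different fragments.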

    modified:
      storage/ndb/src/kernel/blocks/dbspj/Dbspj.hpp
      storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp
 4640 Ole John Aske	2011-11-09
      SPJ: fix buffered rows possibly referring to garbage after memory pages are reorganized:
      
      When the SPJ block buffers rows, it also converts the RowPtr from RT_SECTION
      to RT_LINEAR type as part of ::storeRow(). After ::storeRow() has allocated
      memory for the row itself, the row is put into a 'map' by add_to_map().
      
      However, add_to_map() also does its own memory allocation, which may
      trigger a reorg of the memory page in order to reclaim freed memory blocks.
      
      Such a reorg may cause the previously allocated row to be moved, and
      the previously fetched row pointer to become invalid, thus referring to garbage.
      
      The solution is to refetch the pointer after add_to_map() and fill in the
      returned RowPtr& as the final action.
      
      No MTR test, as I am only able to reproduce this with my rewritten
      calculate_batch_size() plus some additional hacks for testing it. These hacks
      effectively force a really tiny 'BatchByteSize' to be calculated, which in
      turn causes lots of repeated fetching (and buffering) from bushy scan queries.
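
The pattern behind this fix can be shown with a small stand-alone sketch. Everything in it is an assumption made for illustration: `MovingPage`, `RowRef` and this `storeRow` are hypothetical stand-ins, and a growing `std::vector` plays the role of a memory page whose contents may move when a later allocation is made. The point is only that a pointer taken before the second allocation must not be returned; it is refetched from a stable position as the final action.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a page whose contents may move on allocation.
class MovingPage
{
public:
  // Allocates 'words' words and returns a stable position (not a pointer).
  std::size_t alloc(std::size_t words)
  {
    const std::size_t pos = m_words.size();
    m_words.resize(pos + words);  // may move all previously stored words
    return pos;
  }

  // Translates a position into a pointer that is only valid until the
  // next call to alloc().
  std::uint32_t* ptr(std::size_t pos)
  {
    assert(pos < m_words.size());
    return &m_words[pos];
  }

private:
  std::vector<std::uint32_t> m_words;
};

// Hypothetical result type: stable position plus a currently valid pointer.
struct RowRef
{
  std::size_t pos;
  std::uint32_t* data;
};

RowRef storeRow(MovingPage& page, const std::vector<std::uint32_t>& row)
{
  assert(!row.empty());

  // First allocation: space for the row itself.
  const std::size_t rowPos = page.alloc(row.size());
  std::uint32_t* data = page.ptr(rowPos);
  for (std::size_t i = 0; i < row.size(); i++)
    data[i] = row[i];

  // Second allocation: a one-word 'map' entry pointing at the row.
  // After this call 'data' may dangle, so it must not be used again.
  const std::size_t mapPos = page.alloc(1);
  *page.ptr(mapPos) = static_cast<std::uint32_t>(rowPos);

  // Refetch the pointer as the final action and return that one.
  return RowRef{rowPos, page.ptr(rowPos)};
}
```

Returning `data` instead of the refetched pointer would reproduce the kind of stale reference described above whenever the second allocation moves the storage.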

    modified:
      storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp
=== modified file 'storage/ndb/src/kernel/blocks/dbspj/Dbspj.hpp'
--- a/storage/ndb/src/kernel/blocks/dbspj/Dbspj.hpp	2011-09-29 11:43:27 +0000
+++ b/storage/ndb/src/kernel/blocks/dbspj/Dbspj.hpp	2011-11-09 08:54:55 +0000
@@ -871,6 +871,7 @@ public:
     Uint32 m_senderRef;
     Uint32 m_senderData;
     Uint32 m_rootResultData;
+    Uint32 m_rootFragId;
     Uint32 m_transId[2];
     TreeNode_list::Head m_nodes;
     TreeNodeCursor_list::Head m_cursor_nodes;

=== modified file 'storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp'
--- a/storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp	2011-11-09 08:39:50 +0000
+++ b/storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp	2011-11-09 08:54:55 +0000
@@ -482,6 +482,7 @@ Dbspj::do_init(Request* requestP, const 
   requestP->m_outstanding = 0;
   requestP->m_transId[0] = req->transId1;
   requestP->m_transId[1] = req->transId2;
+  requestP->m_rootFragId = LqhKeyReq::getFragmentId(req->fragmentData);
   bzero(requestP->m_lookup_node_data, sizeof(requestP->m_lookup_node_data));
 #ifdef SPJ_TRACE_TIME
   requestP->m_cnt_batches = 0;
@@ -777,6 +778,7 @@ Dbspj::do_init(Request* requestP, const 
   requestP->m_transId[0] = req->transId1;
   requestP->m_transId[1] = req->transId2;
   requestP->m_rootResultData = req->resultData;
+  requestP->m_rootFragId = req->fragmentNoKeyLen;
   bzero(requestP->m_lookup_node_data, sizeof(requestP->m_lookup_node_data));
 #ifdef SPJ_TRACE_TIME
   requestP->m_cnt_batches = 0;
@@ -4630,12 +4632,17 @@ Dbspj::execDIH_SCAN_TAB_CONF(Signal* sig
   Ptr<Request> requestPtr;
   m_request_pool.getPtr(requestPtr, treeNodePtr.p->m_requestPtrI);
 
+  // Add a skew in the fragment lists such that we don't scan
+  // the same subset of frags from all SPJ requests in case of
+  // the scan not being 'T_SCAN_PARALLEL'
+  Uint16 fragNoOffs = requestPtr.p->m_rootFragId % fragCount;
+
   Ptr<ScanFragHandle> fragPtr;
   Local_ScanFragHandle_list list(m_scanfraghandle_pool, data.m_fragments);
   if (likely(m_scanfraghandle_pool.seize(requestPtr.p->m_arena, fragPtr)))
   {
     jam();
-    fragPtr.p->init(0);
+    fragPtr.p->init(fragNoOffs);
     fragPtr.p->m_treeNodePtrI = treeNodePtr.i;
     list.addLast(fragPtr);
   }
@@ -4701,10 +4708,11 @@ Dbspj::execDIH_SCAN_TAB_CONF(Signal* sig
     {
       jam();
       Ptr<ScanFragHandle> fragPtr;
+      Uint16 fragNo = (fragNoOffs+i) % fragCount;
       if (likely(m_scanfraghandle_pool.seize(requestPtr.p->m_arena, fragPtr)))
       {
         jam();
-        fragPtr.p->init(i);
+        fragPtr.p->init(fragNo);
         fragPtr.p->m_treeNodePtrI = treeNodePtr.i;
         list.addLast(fragPtr);
       }

No bundle (reason: useless for push emails).