March 17th, Jiandan


Implementation notes:

getNext() needs to grab the UpdateLog lock to synchronize access to
the per-writer-log and to read per-writer-log items.
Its methods are tightly bound to the per-writer-log.

So we implement InMemLogIterator as a private inner class
that implements the general Iterator interface.

All of the functions of InMemLog are implemented in
the current UpdateLog.java.

tbd: clean up UpdateLog.
     It is currently too complicated to change safely.
     Mixing the basic log functions with garbage collection is
     a bad idea.



Main changes:
  add SingleWriterLogPointers.java
  add InMemLogIterator.java
  add registeredIters to SingleWriterLogUncommitted.java
     - add function registerAll(InMemLogIterator[])
     - add function register(InMemLogIterator)
     - add function remove(InMemLogIterator)
 
  add inner class InMemLogInternalIterator to UpdateLog.java
      - add member activeIters
      - add function makeInMemLogIterator()
      - add function removeInMemLogIterator()
      - initialize activeIters
        register activeIters with any new per-writer log added to perWriterLog
        add/remove iter to/from activeIters
        register iter with perWriterLog[wid] whenever writer wid has no new
        item to return

----------------------------------------------------------------------------

Option 1 does not work, because nextCausal keeps changing.

Consider the following scenario:
-  A: 1, 2, 3
-  B: 1, 4

If A hasn't talked to B yet, the causal linked list is A1, A2, A3,
and II.getNext(A2) returns A3.

If A has talked to B, the causal linked list becomes A1 B1 A2 A3 B4.

If an InvalIterator returns A3 before A talks to B, then after A talks to B,
getNext(A3) returns B4 and omits B1.
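
A minimal sketch of the scenario above, assuming entries are ordered by
(accept stamp, writer id), which matches the list A1 B1 A2 A3 B4. Entry,
causalList and nextAfter are made-up names for illustration only; they are
not code from UpdateLog.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CausalListSketch {
    /** Hypothetical stand-in for one log entry: writer id plus accept stamp. */
    static final class Entry {
        final String writer;
        final long stamp;
        Entry(String writer, long stamp) { this.writer = writer; this.stamp = stamp; }
        @Override public String toString() { return writer + stamp; }
    }

    static Entry e(String writer, long stamp) { return new Entry(writer, stamp); }

    /** Rebuilds the causal list from everything known so far (assumed order: stamp, then writer). */
    static List<Entry> causalList(List<Entry> known) {
        List<Entry> sorted = new ArrayList<>(known);
        sorted.sort(Comparator.comparingLong((Entry x) -> x.stamp)
                              .thenComparing((Entry x) -> x.writer));
        return sorted;
    }

    /** Link-following getNext: the entry right after `last` in the current list, or null. */
    static Entry nextAfter(List<Entry> list, Entry last) {
        for (int i = 0; i + 1 < list.size(); i++) {
            Entry cur = list.get(i);
            if (cur.writer.equals(last.writer) && cur.stamp == last.stamp) {
                return list.get(i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Entry> known = new ArrayList<>(List.of(e("A", 1), e("A", 2), e("A", 3)));
        Entry last = nextAfter(causalList(known), e("A", 2)); // iterator has returned A3
        known.add(e("B", 1));                                 // now A talks to B
        known.add(e("B", 4));
        System.out.println(causalList(known));                  // [A1, B1, A2, A3, B4]
        System.out.println(nextAfter(causalList(known), last)); // B4, so B1 is never returned
    }
}
```

Following a single "next" link from the last returned entry is what loses B1;
keeping one pointer per writer avoids it.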



Two options for maintaining the InMemLogIterator:
(1) inside InMemLog

    Whenever the SingleWriterLog applies a new per-writer-log item,
    it checks each SingleWriterLogIterator to see whether the new item must
    be put into SingleWriterLogIterator.nextPointers, so that nextPointers
    reflects the new per-writer-log immediately.
    
    Cons: write/applyInval performance involves O(#iterator) updates
    Pros: done once; iterator.getNext() only needs an O(#iterator) search

    
    
(2) outside of InMemLog
    
    Whenever iterator.getNext() is called, it needs to check whether
    anything new has happened: an O(#Iterator) search to identify new updates,
    then another O(#Iterator) pass to find the min.
    
    Cons: more overhead for each getNext(), even if no new updates
          have happened.

    Pros: SingleWriterLog is simpler.
          No effect on local write/read performance.
    
Pick (1), because it does not impose unnecessary computation:
     it only does the check when there really are new items.
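
A rough sketch of option (1) for a single writer, under simplifying
assumptions: Item, WriterLog and Iter are hypothetical stand-ins, a new
iterator starts at the current end of the log, and locking is left out. It
shows that the write path only touches iterators that have caught up and
registered themselves.

```java
import java.util.ArrayList;
import java.util.List;

final class RegisterOnExhaustSketch {
    /** Hypothetical log item, linked to the next newer item of the same writer. */
    static final class Item {
        final long stamp;
        Item newer;
        Item(long stamp) { this.stamp = stamp; }
    }

    static final class WriterLog {
        private Item newest;
        private final List<Iter> registered = new ArrayList<>();
        int handOffs = 0; // pointer updates performed on the write path

        /** Write path: O(#registered iterators); zero work when nobody is waiting. */
        void append(long stamp) {
            Item item = new Item(stamp);
            if (newest != null) newest.newer = item;
            newest = item;
            for (Iter it : registered) {
                it.next = item;
                handOffs++;
            }
            registered.clear(); // they hold a pointer again and follow links from here
        }
    }

    static final class Iter {
        private final WriterLog log;
        private Item next; // null means "caught up with this writer"

        Iter(WriterLog log) {
            this.log = log;
            log.registered.add(this); // assumption: start at the end of the log
        }

        /** Non-blocking: returns the next stamp, or null when there is nothing new. */
        Long getNext() {
            if (next == null) return null;
            Item ret = next;
            next = ret.newer;
            if (next == null) log.registered.add(this); // wait for the next append
            return ret.stamp;
        }
    }

    public static void main(String[] args) {
        WriterLog log = new WriterLog();
        Iter it = new Iter(log);
        log.append(1);
        log.append(2);
        log.append(3);
        System.out.println(log.handOffs); // 1: only the first append touched the iterator
        System.out.println(it.getNext() + " " + it.getNext() + " " + it.getNext()); // 1 2 3
        System.out.println(it.getNext()); // null
    }
}
```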

Refer to NewInvalIteratorDesign_March10.txt for implementing option (2)

Note:
  Potential scalability issue:
     When all outgoing streams are current, i.e. they have reached the end
     of the per-writer-log, every single-writer log has #active iterators
     registered. Whenever a new update arrives, it takes O(#active iterators)
     work to release those iterators from the single-writer log.

     As the number of active iterators grows, the cost of a local
     write/applyInval becomes linear in the number of outgoing connections.


Summary of changes needed:
==========================

(1) create InMemLogIterator class:
    An iterator that returns all invalidations in the per-writer-log
    in causal order via getNext();
    returns null at the end.

    Difference between InvalIterator and InMemLogIterator:
      - performance: Besides cvv, it keeps a reference to the
          next item to return for each writer, so that getNext() starts
          from the last pointer instead of from the very beginning.

          What's more, we precalculate the nextItem for each writer.
          Whenever iter.getNext() is called, it directly returns the min
          of all current nextItems and prefetches the next item to replace
          the one being returned.

      - semantics:
         . InMemLogIterator is block-free.
           It won't block. If there is no more item to return, it simply
           returns null.

         . InMemLogIterator won't accumulate SingleWriterInvs into
           a general imprecise invalidation.
           It only faithfully reports all the SingleWriterInvs in the
           current per-writer-log, ordered by their
           start acceptStamp.
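
A sketch of the "return the min of the per-writer nextItems, then prefetch"
idea, assuming ties on start are broken by writer id. Inv and MergeIterator
are hypothetical names; the real iterator works on InvalListItem links rather
than java.util iterators.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class MinMergeSketch {
    /** Hypothetical stand-in for a SingleWriterInv: writer id plus start stamp. */
    static final class Inv {
        final String writer;
        final long start;
        Inv(String writer, long start) { this.writer = writer; this.start = start; }
        @Override public String toString() { return writer + start; }
    }

    static Inv inv(String writer, long start) { return new Inv(writer, start); }

    static final class MergeIterator {
        private final Map<String, Iterator<Inv>> rest = new HashMap<>();
        private final Map<String, Inv> nextItem = new HashMap<>(); // precomputed head per writer

        MergeIterator(Map<String, List<Inv>> perWriterLogs) {
            for (Map.Entry<String, List<Inv>> e : perWriterLogs.entrySet()) {
                Iterator<Inv> it = e.getValue().iterator();
                rest.put(e.getKey(), it);
                if (it.hasNext()) nextItem.put(e.getKey(), it.next());
            }
        }

        /** Returns the pending item with the smallest start, or null; never blocks. */
        Inv getNext() {
            Inv min = null;
            for (Inv cand : nextItem.values()) {
                if (min == null || cand.start < min.start
                        || (cand.start == min.start && cand.writer.compareTo(min.writer) < 0)) {
                    min = cand;
                }
            }
            if (min == null) return null;
            // Prefetch the same writer's next item to replace the one being returned.
            Iterator<Inv> it = rest.get(min.writer);
            if (it.hasNext()) nextItem.put(min.writer, it.next());
            else nextItem.remove(min.writer);
            return min;
        }
    }

    public static void main(String[] args) {
        MergeIterator it = new MergeIterator(Map.of(
            "A", List.of(inv("A", 1), inv("A", 2), inv("A", 3)),
            "B", List.of(inv("B", 1), inv("B", 4))));
        for (Inv next = it.getNext(); next != null; next = it.getNext()) {
            System.out.print(next + " ");
        }
        // prints: A1 B1 A2 A3 B4
    }
}
```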


(2) create SingleWriterLog class:
    move all the SingleWriterLog code in UpdateLog to SingleWriterLog

    --> clean up UpdateLog

(3) OutgoingConnectionEventMailbox:
    Coordinate the OutgoingConnectionWorker 
    between UpdateLog and OutgoingConnection. 

    Why?

    Issues with current design:
    
     OutgoingConnectionWorker is a single thread handling three tasks:
     1. deal with addSubscription/removeSubscription requests
     2. send catchup streams
     3. send the next new invalidation by calling UpdateLog.getNext()

    Currently, the only place it can block is
     UpdateLog.getNext(), which waits (with a timeout) when there are no new updates.
    
    While it is blocked, newly arriving addSub/removeSub requests can't be
    applied; if no new update is ever issued, they wait until the timeout expires.


    Another issue with the worker concerns accumulating imprecise invalidations:
    currently, the accumulation is done in InvalIterator.
    So it is possible that while InvalIterator is accumulating an impreciseInv
    for a certain accumulatingTimeout period, new addSub/removeSub requests
    arrive. The result is that the processing of addSub/removeSub
    is delayed by a large amount.


    Solution:
   
    add OutgoingConnectionEventMailbox to coordinate the worker, UpdateLog,
    and OutgoingConnection.

    OutgoingConnectionEventMailbox:
 
    1. block: 
       the worker only waits on OutgoingConnectionEventMailbox.getNextEvent() if 
       there are no pending requests and no new updates.
 
    2. notify:
       OutgoingConnection notifies the mailbox when there is a new request.
       UpdateLog notifies the mailbox when there are new updates.

    Move accumulation to the Worker.
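
    A minimal sketch of the block/notify protocol, assuming requests take
    priority over updates and that updates come from a non-blocking source
    (a Supplier standing in for InMemLogIterator.getNext()). EventMailboxSketch
    and its Object-typed events are placeholders, not the real classes.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

final class EventMailboxSketch {
    private final Queue<Object> requests = new ArrayDeque<>();
    private final Supplier<Object> nextUpdate; // must return null instead of blocking

    EventMailboxSketch(Supplier<Object> nextUpdate) { this.nextUpdate = nextUpdate; }

    /** Blocks only while there is neither a pending request nor a new update. */
    synchronized Object getNextEvent() throws InterruptedException {
        while (true) {
            if (!requests.isEmpty()) return requests.remove();
            Object update = nextUpdate.get();
            if (update != null) return update;
            wait(); // woken by addNewRequest() or newUpdate()
        }
    }

    /** Called by the connection when an add/remove subscription request arrives. */
    synchronized void addNewRequest(Object request) {
        requests.add(request);
        notifyAll();
    }

    /** Called by the log after a new update has been applied. */
    synchronized void newUpdate() { notifyAll(); }

    public static void main(String[] args) throws InterruptedException {
        Queue<Object> updates = new ConcurrentLinkedQueue<>();
        EventMailboxSketch box = new EventMailboxSketch(updates::poll);
        updates.add("inv1");
        box.addNewRequest("addSub");
        System.out.println(box.getNextEvent()); // addSub
        System.out.println(box.getNextEvent()); // inv1

        Thread writer = new Thread(() -> { updates.add("inv2"); box.newUpdate(); });
        writer.start();
        System.out.println(box.getNextEvent()); // inv2, possibly after waiting
        writer.join();
    }
}
```

    Because the worker re-checks both sources after every wake-up, a request
    that arrives while the worker is idle is handled immediately instead of
    after a timeout.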

(4) IncomingConnection keeps a SingleWriterLogPointer to the last item just applied
    to reduce the overhead of applying one item. Right now it's O(N).



InvalIterator's actual functions are put inside SingleWriterLog because
we have to grab a SingleWriterLog to access any item in the per-writer-log.

So the outside InvalIterator needs to get a token at the very beginning
and forward every function call to SingleWriterLog, because all the state
is maintained inside SingleWriterLog.




Assumption: an InvalListItem's invalidation member can change, but
the invariant is that item.start always stays the same.

How to enforce this invariant?
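
One possible way to enforce it, sketched under the assumption that the start
stamp can be captured when the item is created: keep start in a final field
and reject any replacement invalidation whose start differs. Inv and Item
below are simplified, hypothetical stand-ins for the real invalidation and
InvalListItem classes.

```java
final class InvalListItemSketch {
    /** Hypothetical minimal invalidation: only the start and end stamps. */
    static final class Inv {
        final long start;
        final long end;
        Inv(long start, long end) { this.start = start; this.end = end; }
    }

    static final class Item {
        private final long start; // fixed at construction, never reassigned
        private Inv inv;

        Item(Inv inv) {
            this.start = inv.start;
            this.inv = inv;
        }

        long getStart() { return start; }
        Inv getInv() { return inv; }

        /** Replaces the invalidation, rejecting any value that would move the start. */
        void setInv(Inv replacement) {
            if (replacement.start != start) {
                throw new IllegalArgumentException(
                    "start must stay " + start + ", got " + replacement.start);
            }
            inv = replacement;
        }
    }

    public static void main(String[] args) {
        Item item = new Item(new Inv(5, 5));
        item.setInv(new Inv(5, 9)); // fine: same start, wider range
        System.out.println(item.getStart() + ".." + item.getInv().end); // 5..9
        try {
            item.setInv(new Inv(6, 9));
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```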

InMemLogIterator:
=============
  
  InMemLog log;


  //must be created from an InMemLog
  public 
  InMemLogIterator(InMemLog log){
    this.log = log;
  }


  public AcceptVV
  getCVV()
  throws InvalidIteratorException{
    try{
      return this.log.getIteratorCVV(this);
    }catch(InvalidTokenException ite){
      throw new InvalidIteratorException(ite.toString());
    }
  }

  public SingleWriterInv
  getNext()
  throws InvalidIteratorException{
    try{
      return this.log.getNext(this);
    }catch(InvalidTokenException ite){
      throw new InvalidIteratorException(ite.toString());
    }
  }


/*
 * 
 */

InMemLog:
============
HashMap<NodeId, SingleWriterLogUncommitted> perWriterLogs;
HashMap<InMemLogIterator, IteratorState> activeIters;

//
// called by outgoingConnection
// this is the only way for outside world to create an invalIterator
//
public InMemLogIterator
makeInMemLogIterator(AcceptVV excludedStartVV){
  //populate the next items for each writer
  nextPointers = this.getNextPointers(excludedStartVV);
  ret = new InMemLogIterator(this);
  IteratorState iter = new IteratorState(excludedStartVV,
                                  nextPointers);
  activeIters.put(ret, iter);
  return ret;
}




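public AcceptVV
getIteratorCVV(InMemLogIterator iter)
throws InvalidTokenException{
  if(!activeIters.containsKey(iter)){
    throw new InvalidTokenException(iter.toString());
  }
  //read the cvv from the internal state; calling iter.getCVV() here
  //would recurse straight back into this method
  return activeIters.get(iter).getCVV();
}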


//
// assumption
// all nextItems are already stored in the corresponding
// InMemLogIterator
//
// just get the item with the minimum start
// advance cvv
// prefetch the next item on the same writer to replace
// the one about to be sent
//
public SingleWriterInv
getNext(InMemLogIterator iter)
throws InvalidTokenException{
  if(!activeIters.containsKey(iter)){
    throw new InvalidTokenException(iter.toString());
  }
  IteratorState iterState = activeIters.get(iter);
  nextItem = iterState.getMin();
  if(nextItem == null){//no more new invalidates to return
    return null;
  }

  ret = nextItem.getInv();

  //update the pointers
  iterState.advanceCVV(ret.getEndVV());
  newNextItem = nextItem.getNewer(); 
  if(newNextItem != null){
    iterState.advancePointer(nextItem.getNodeId(), newNextItem);
  }else{//no new inval --> reach the end of the writer's updates

    //register in per-writer-log so that it will get a new one
    //whenever the writer gets a new update 
    log = perWriterLogs.get(nextItem.getNodeId());
    log.register(iterState);
    iterState.removePointer(nextItem.getNodeId(), nextItem);
  }

  return ret;
}





public void
removeIter(InMemLogIterator iter){
  activeIters.remove(iter);
}

private SingleWriterLogPointers 
getNextPointers(AcceptVV startVV){
  ret = new SingleWriterLogPointers();
  for each pwLog in perWriterLogs
    if pwLog.nodeId exists in startVV
      start = startVV[pwLog.nodeId]
    else
      start = -1;

    item = pwLog.getNextItemByStart(start);
    ret.add(pwLog.nodeId, item);
  return ret;    
}
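
A small sketch of the pointer lookup, assuming the start VV is exclusive
(the first item returned for a writer has a start strictly greater than the
writer's entry) and that writers with nothing newer are simply left out of the
result. The maps and the TreeSet of start stamps are placeholders for
AcceptVV and the per-writer linked lists.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class NextPointersSketch {
    /**
     * For each writer, finds the first logged start stamp strictly greater than
     * the writer's entry in excludedStartVV, or greater than -1 when the writer
     * is absent from the VV.
     */
    static Map<String, Long> nextPointers(Map<String, TreeSet<Long>> perWriterLogs,
                                          Map<String, Long> excludedStartVV) {
        Map<String, Long> pointers = new HashMap<>();
        for (Map.Entry<String, TreeSet<Long>> e : perWriterLogs.entrySet()) {
            long start = excludedStartVV.getOrDefault(e.getKey(), -1L);
            Long next = e.getValue().higher(start); // least element > start, or null
            if (next != null) pointers.put(e.getKey(), next);
        }
        return pointers;
    }

    public static void main(String[] args) {
        Map<String, TreeSet<Long>> logs = new HashMap<>();
        logs.put("A", new TreeSet<>(List.of(1L, 2L, 3L)));
        logs.put("B", new TreeSet<>(List.of(1L, 4L)));

        Map<String, Long> vv = new HashMap<>();
        vv.put("A", 2L); // everything from A up to 2 has been seen; nothing from B
        Map<String, Long> p = nextPointers(logs, vv);
        System.out.println(p.get("A") + " " + p.get("B")); // 3 1
    }
}
```

In the design above, a writer with no newer item would instead have the
iterator registered with its per-writer log.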


//
// Following functions are needed for UpdateLog
// to replace perWriterLog in UpdateLog by
// InMemLog
//

//
// replace perWriterLogs.put(nodeId, l)
// in UpdateLog.java
// 
public void 
addSingleWriterLog(NodeId nid, 
                   SingleWriterLogUncommitted log){
  log.registerAll(activeIters.values());
  perWriterLogs.put(nid, log);
}


//
// replace perWriterLogs.get(nodeId) 
// by perWriterLogs.getSingleWriterLog(nodeId)
// in UpdateLog.java
//
public SingleWriterLogUncommitted 
getSingleWriterLog(NodeId id){
  return perWriterLogs.get(id);
}

//
// replace perWriterLogs.keySet()
//
public NodeId[]
getNodeIds(){
  return perWriterLogs.keySet().toArray(new NodeId[0]);
}

SingleWriterLogUncommitted:
===========================
ArrayList registeredIters;

public long
merge(GeneralInv inv){
  ...
  current.setOlder(newest);
  for each iter in registeredIters
      //sanity check
      assert iter.getPointer(myNodeId) == null
             && iter.getEndVV()[myNodeId] == maxCounter;
      iter.addPointer(myNodeId, current);
  registeredIters.clear();
}

public void
registerAll(Collection<IteratorState> iters){
  registeredIters.addAll(iters);
} 

public void
register(IteratorState iter){
  registeredIters.add(iter);
}


public InvalListItem
getNextItemByStart(long start){
  //similar to getNextByStart
  //except that here we return the entire item
  // not just inv
}




/*
 * inner class of InMemLog
 * 
 * stores internal state for external active InMemLogIterator
 * with the same token
 *
 * It's unsynchronized.
 */

IteratorState:
===================


SingleWriterLogPointers nextPointers;

CounterVV cvv; //summarize all sent invals

private
IteratorState(AcceptVV excludedStartVV, 
              SingleWriterLogPointers pointers){
  cvv = new CounterVV(excludedStartVV);
  this.nextPointers = pointers;
}



public AcceptVV
getCVV(){
  return cvv.cloneAcceptVV();
}


  private void
  advanceCVV(AcceptVV vv){
    cvv.advance(vv);
  }

  public void
  advancePointer(NodeId nodeId, InvalListItem item){
    nextPointers.advance(nodeId, item);   
  }  

  public void
  removePointer(NodeId nodeId, InvalListItem item){
    nextPointers.remove(nodeId, item);
  }

  public void
  addPointer(NodeId nodeId, InvalListItem item){
    nextPointers.add(nodeId, item);
  }


//if empty return null
private InvalListItem getMin(){
  //return the item with the minimum start among nextPointers
}


public void
sanityCheck(InMemLog log){
  SingleWriterLogPointers expectedPointers = log.getNextPointers(cvv.cloneAcceptVV());
  //compare contents, not object identity
  assert expectedPointers.equals(this.nextPointers);
}



/*
 * util class
 * keep per-writer InvalListItem references for a number of nodes 
 */
public SingleWriterLogPointers:
====================
HashMap<NodeId, InvalListItem> pointers;

public SingleWriterLogPointers(){
  pointers = new HashMap<NodeId, InvalListItem>();
}

public NodeId[]
getAllNodeIds(){

}

public InvalListItem
getInvalListItem(NodeId nid)
throws NullPointerException{
  return pointers.get(nid);
  
}

public boolean
contains(nodeId){
  return pointers.containsKey(nodeId);
}

public void
advance(NodeId newNID, InvalListItem newItem){
  assert newNID matches newItem.getNodeId();
  assert pointers.containsKey(newNID);
  assert currentItem.start < newItem.start; //currentItem = pointers.get(newNID)
  pointers.put(newNID, newItem);
}

public void
add(NodeId newNID, InvalListItem newItem){
  assert !pointers.containsKey(newNID);
  pointers.put(newNID, newItem);
}

public void
remove(NodeId id, InvalListItem item){
  assert pointers.get(id) == item; //contains and matches
  pointers.remove(id);
}
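
A compact sketch of the add/advance/remove contract, using exceptions in
place of the asserts so the checks always run. PointersSketch is a
hypothetical simplification: it tracks only the start stamp per writer instead
of an InvalListItem reference, and minWriter() breaks ties by writer id, which
is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class PointersSketch {
    private final Map<String, Long> pointers = new HashMap<>(); // writer id -> start of next item

    void add(String writer, long start) {
        if (pointers.containsKey(writer)) {
            throw new IllegalStateException("already tracking " + writer);
        }
        pointers.put(writer, start);
    }

    void advance(String writer, long newStart) {
        Long current = pointers.get(writer);
        if (current == null) throw new IllegalStateException("not tracking " + writer);
        if (newStart <= current) throw new IllegalArgumentException("pointer must move forward");
        pointers.put(writer, newStart);
    }

    void remove(String writer, long start) {
        Long current = pointers.get(writer);
        if (current == null || current != start) {
            throw new IllegalStateException("no matching pointer for " + writer);
        }
        pointers.remove(writer);
    }

    /** Returns the writer whose pointer has the smallest start, or null when empty. */
    String minWriter() {
        String best = null;
        for (Map.Entry<String, Long> e : pointers.entrySet()) {
            if (best == null) { best = e.getKey(); continue; }
            long cand = e.getValue();
            long cur = pointers.get(best);
            if (cand < cur || (cand == cur && e.getKey().compareTo(best) < 0)) {
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        PointersSketch p = new PointersSketch();
        p.add("A", 1);
        p.advance("A", 2);
        p.add("B", 1);
        System.out.println(p.minWriter()); // B
        p.remove("B", 1);
        System.out.println(p.minWriter()); // A
    }
}
```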


//-----------------------------------------------------------------
//-----------------------------------------------------------------
//-----------------------------------------------------------------
//-----------------------------------------------------------------
// new locking algorithms
// for the OutgoingConnectionWorker
//-----------------------------------------------------------------
//-----------------------------------------------------------------
//-----------------------------------------------------------------
//-----------------------------------------------------------------

//changes to OutgoingConnection
// 
// previous protocol
// initialize a stream by assuming ss = empty, startvv=sender.cvv
//
// new protocol
// only initialize a stream by the first call of addSubscriptionSet(startVV, ss); 
// it does not make sense to process a removeSubscriptionSet without a stream
//
// it does not make sense to have an empty stream initially;
// rather, let the policy layer decide whether it wants an empty stream
// by first issuing a request: addSubscriptionSet(startVV, ss);
//

OutgoingConnection::
==================

SubscriptionSet ss;
OutgoingConnectionNextEventMailBox mailBox;

OutgoingConnection(SubscriptionSet ss, AcceptVV startVV,
                   core, controller, receiverId, receiverDNS, invalport){
    ...
    mailBox = core.makeOutgoingConnectionMailBox(startVV);
}

synchronized private void
startNewWorker(){
  worker = new OutgoingConnectionWorker(core, controller,
                                        receiverId, receiverDNS, 
                                        portInval, 
                                        mailBox, this);
					

}


public synchronized void
addSubscriptionSet(SubscriptionSet newSubset, 
                                              AcceptVV newStartVV,
                                              boolean includeBodiesIfCPSent,
					      boolean catchupWithCP){

//see current implementation
//replace pendingReq.add
//by mailBox.addNewRequest(newRequest);
}

protected synchronized void addSubscriptionRequest(SubscriptionRequest sr){
//see current implementation
//replace pendingReq.add
//by mailBox.addNewRequest(newRequest);
}

public synchronized void removeSubscriptionSet(SubscriptionSet newSubset){
//see current implementation
//replace pendingReq.add
//by mailBox.addNewRequest(newRequest);
}


//rethink the "synchronized" of all methods



OutgoingConnectionWorker::
========================

public
OutgoingConnectionWorker(core, controller,
                         receiverId, receiverDNS, 
                         portInval, 
                         mailBox, outgoingConnection){ //outgoingConnection: the caller's "this"


}


void run(){
  //initialize connection
  while(!workQueue.getTimeToDie()){
    if(accumulated != null){    
      //set a timeout for getNext so that the sender
      //can push all the previously accumulated
      //invalidations
      nextEvent = mailBox.getNext(timeout);
    }else{
      //nothing accumulated, so there is no deadline to meet
      nextEvent = mailBox.getNext();
    }

    switch(nextEvent){//process nextEvent
    //  (1) add sub
    //      copy current code
    //  (2) remove sub
    //      copy current code
    
    //  (3) send new inv -- accumulate invalidation
    //      copy current InvalIterator's accumulation code
    //      here
    }
  }
}


OutgoingConnectionNextEventMailBox
==================================
//for now unbounded

queue, ii;

public
OutgoingConnectionNextEventMailBox(InMemLogIterator ii){
  queue = new queue();
  this.ii = ii;
}

//called by OutgoingConnectionWorker
//a NextEvent is either an add/remove request or the next invalidation to send
synchronized NextEvent getNext()
throws InvalidIteratorException, InterruptedException{
  SingleWriterInv gi = null;
  if(queue.isEmpty()){
    gi = ii.getNext();
  }

  while (queue.isEmpty() && (gi == null)){
     wait();
     if(queue.isEmpty()){
      gi = ii.getNext();//may wait for the UpdateLog lock
     }
  }

  if(!queue.isEmpty()){
    return queue.getNext();
  }else{
    assert gi != null;
    return gi;
  }

}


//called by OutgoingConnectionPool
synchronized void addNewRequest(newRequest){
   queue.add(newRequest);
   notifyAll();
}

//called by UpdateLog
//Note: do not call this while holding the UpdateLog lock --> deadlock
synchronized void newUpdate(){
   notifyAll();
}



UpdateLog
==============================
mailboxList; //synchronized

   add(mailbox){
      mailboxList.add(mailbox);
   }
   private getNewUpdates(){//call newUpdate() on every mailbox
     for each mailbox in mailboxList
        mailbox.newUpdate();
   }
   remove(mailbox){
     mailboxList.remove(mailbox);
   }

   private void
   applyInvalInternal(inv){
    lock.lock()
      ....
    lock.unlock();
    getNewUpdates();
   }
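
A sketch of the "notify only after unlocking" rule. The reasoning: the mailbox
holds its own monitor while it calls into the log for the next item, so a log
that called newUpdate() while still holding its lock would take the two locks
in the opposite order. The ReentrantLock, the Mailbox interface and the
CopyOnWriteArrayList are illustrative assumptions, not the actual UpdateLog
fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

final class NotifyOutsideLockSketch {
    /** Hypothetical callback; the real mailbox would call notifyAll() inside. */
    interface Mailbox { void newUpdate(); }

    private final ReentrantLock lock = new ReentrantLock();
    private final List<Mailbox> mailboxes = new CopyOnWriteArrayList<>();
    private final List<String> log = new ArrayList<>(); // guarded by lock

    void addMailbox(Mailbox m) { mailboxes.add(m); }
    void removeMailbox(Mailbox m) { mailboxes.remove(m); }

    /** True if the calling thread currently holds the log lock. */
    boolean heldByCaller() { return lock.isHeldByCurrentThread(); }

    void applyInval(String inv) {
        lock.lock();
        try {
            log.add(inv);
        } finally {
            lock.unlock();
        }
        // Notify only after the log lock is released, so a mailbox that calls
        // back into the log never waits on a lock this thread still holds.
        for (Mailbox m : mailboxes) {
            m.newUpdate();
        }
    }

    public static void main(String[] args) {
        NotifyOutsideLockSketch updateLog = new NotifyOutsideLockSketch();
        updateLog.addMailbox(() ->
            System.out.println("lock held during notify: " + updateLog.heldByCaller()));
        updateLog.applyInval("A1"); // lock held during notify: false
    }
}
```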



Plans:

  (0) proofread the pseudocode

      incrementally plug the new code into the existing code.
  
  (1) implement InMemLogIterator

  (2) replace log.getNext() in the original InvalIterator.getNext()
      with the new InMemLogIterator.getNext()

  (3) Test the performance for the subscription

  (4) change OutgoingConnection to use the new MailBox
      and use the new InMemLogIterator

  (5) run all related unit tests

  (6) clean up old code completely, including the UnbindMsg and DebargoMsg,
      old comments, etc. See codeReviewWithPrinceSangmin.txt

  (7) rewrite the UpdateLog garbage collection code so that it does not go
      beyond any active iterator's cvv, so that the iterator doesn't throw
      VVGap during the streams. Makes life easier.
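
A sketch of how the bound in (7) might be computed, assuming each active
iterator's cvv can be read as a map from writer id to the last stamp sent, and
that a writer missing from a cvv means nothing has been sent yet (-1).
safeTruncationPoint is a hypothetical helper, not an existing UpdateLog
method.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GcBoundSketch {
    /**
     * Per writer, the largest stamp that every active iterator has already passed.
     * Garbage collection would keep everything newer than this bound.
     * With no active iterators the bound is Long.MAX_VALUE (nothing is pinned).
     */
    static Map<String, Long> safeTruncationPoint(Collection<String> writers,
                                                 Collection<Map<String, Long>> iteratorCvvs) {
        Map<String, Long> bound = new HashMap<>();
        for (String w : writers) {
            long min = Long.MAX_VALUE;
            for (Map<String, Long> cvv : iteratorCvvs) {
                min = Math.min(min, cvv.getOrDefault(w, -1L));
            }
            bound.put(w, min);
        }
        return bound;
    }

    public static void main(String[] args) {
        Map<String, Long> fast = new HashMap<>();
        fast.put("A", 3L);
        fast.put("B", 1L);
        Map<String, Long> slow = new HashMap<>();
        slow.put("A", 2L); // has not sent anything from B yet

        Map<String, Long> bound = safeTruncationPoint(List.of("A", "B"), List.of(fast, slow));
        System.out.println(bound.get("A") + " " + bound.get("B")); // 2 -1
    }
}
```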

