<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Statistically incorrect</title>
	<atom:link href="http://anomalizer.net/statistically-incorrect/feed/" rel="self" type="application/rss+xml" />
	<link>http://anomalizer.net/statistically-incorrect</link>
	<description>Statistically incorrect</description>
	<lastBuildDate>Sun, 18 Dec 2011 17:26:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Object composition implementation styles</title>
		<link>http://anomalizer.net/statistically-incorrect/2011/12/object-composition-2/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2011/12/object-composition-2/#comments</comments>
		<pubDate>Sun, 18 Dec 2011 17:26:55 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[OOP]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=102</guid>
		<description><![CDATA[In the first part, we looked at conceptual implications of the various styles of object composition. Now we shall look at a few common implementations of object composition in conjunction with the concepts presented in the earlier post. Association v/s composition First off, we shall start with an example in C to understand the [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a title="Implications of object composition styles" href="http://anomalizer.net/statistically-incorrect/2011/10/object-composition/">first part</a>, we looked at the conceptual implications of the various styles of object composition. Now we shall look at a few common implementations of object composition in light of the concepts presented in that post.</p>
<h3>Association v/s composition</h3>
<p>First off, we shall start with an example in C to understand the difference:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">struct</span> node <span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">int</span> data<span style="color: #339933;">;</span>
  <span style="color: #993333;">struct</span> node <span style="color: #339933;">*</span>next<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span></pre></div></div>

<p>This is a fairly common definition of a node in a singly linked list of integers. Each node contains two elements: an integer and a pointer (logical reference) to the next node. The object layout looks as follows:</p>
<p><a href="http://anomalizer.net/statistically-incorrect/2011/12/object-composition-2/c-ll-node/" rel="attachment wp-att-120"><img class="alignnone size-full wp-image-120" title="c-ll-node" src="http://anomalizer.net/statistically-incorrect/wp-content/uploads/c-ll-node.png" alt="" width="78" height="61" /></a></p>
<p>It is interesting to note that this logical layout would still be correct in most programming language implementations that have managed memory. Things get interesting when we go back to the definition of a rectangle as seen in the <a title="Implications of object composition styles" href="http://anomalizer.net/statistically-incorrect/2011/10/object-composition/">first post</a>. The two diagrams below show its layout in C and in Java respectively.</p>
<p><a href="http://anomalizer.net/statistically-incorrect/2011/12/object-composition-2/c-rect-node/" rel="attachment wp-att-123"><img class="alignnone size-full wp-image-123" title="c-rect-node" src="http://anomalizer.net/statistically-incorrect/wp-content/uploads/c-rect-node.png" alt="" width="211" height="141" /></a></p>
<p><a href="http://anomalizer.net/statistically-incorrect/2011/12/object-composition-2/java-rect-node/" rel="attachment wp-att-133"><img class="alignnone size-full wp-image-133" title="java-rect-node" src="http://anomalizer.net/statistically-incorrect/wp-content/uploads/java-rect-node.png" alt="" width="431" height="181" /></a></p>
<p>The difference should be fairly obvious: the rectangle in C contains two points, whereas the rectangle in Java contains references to two free-standing point objects. Again, note that the layout shown for <em><strong>Java</strong></em> holds true for any system where memory management is taken care of by some runtime. A non-exhaustive list includes Perl, PHP, Python and C++ code built on boost's shared_ptr.</p>
<h3>Implications of indirection</h3>
<h4>Additional usage of memory</h4>
<p>The most obvious impact is that the rectangle class, as implemented in the memory managed style, now uses two additional pointers to store references. This is the user visible overhead. There could also be a user invisible overhead in managing additional objects on the heap. To get an idea of what such overheads could be, one can look at the implementation of heap allocation described in &#8220;The C Programming Language&#8221; by K&amp;R. Note that the overheads can be amortized by some very clever implementations or subsidized elsewhere, but they do exist in some form or the other. The deeper the composition (i.e. the object/struct tree), the more overhead we pay (both visible &amp; invisible).</p>
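<p>To make the user visible overhead concrete, here is a minimal C++ sketch. The names <em>inline_rect</em>, <em>indirect_rect</em> and the two footprint helpers are illustrative and do not come from the original example; raw pointers stand in for whatever reference type a managed runtime would use, and allocator bookkeeping (the invisible overhead) is deliberately left out.</p>

```cpp
#include <cstddef>

struct point { int x; int y; };

// Inline composition: both points are stored inside the rectangle.
struct inline_rect { point tl; point br; };

// Indirection: the rectangle stores only references to free-standing points.
struct indirect_rect { point *tl; point *br; };

// Bytes needed for one rectangle, counting the points it refers to.
constexpr std::size_t inline_footprint() {
    return sizeof(inline_rect);
}

// Same logical data, plus the two pointers that hold the references.
constexpr std::size_t indirect_footprint() {
    return sizeof(indirect_rect) + 2 * sizeof(point);
}
```

<p>On common platforms the two footprints differ by exactly two pointer sizes; whatever the heap allocator adds per object comes on top of that.</p>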
<h4>Slower access of fields</h4>
<p>This is a far more worrisome aspect than a larger footprint. Loading the rectangle object into memory does not mean that the referenced objects tl &amp; br are also loaded into the same part of memory. Here, the term memory is used generically, as opposed to RAM, to signify any class of memory. For example, it is possible that the rectangle object is sitting in the L1 data cache of the CPU whereas tl is actually sitting on disk since the page containing that object has been swapped out of RAM! In addition, an access to the x coordinate of the logical top-left corner goes as follows in case of indirection:</p>
<ul>
<li>Load contents of base address of rectangle + offset to tl into say &#8220;r&#8221;</li>
<li>Load contents of base address of point (r) + offset to x</li>
</ul>
<p>In case of inline composition, it would simply be &#8220;Load contents of base address of rectangle + (offset to tl + offset to x)&#8221;.</p>
<p>Again, this overhead of jumping across various memory addresses due to a lack of locality of reference grows with the complexity of the object composition.</p>
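<p>The two access paths can be sketched in C++ as follows. The type and function names are illustrative assumptions; what the compiler actually emits depends on the target and optimisation level, so treat the comments as a description of the logical steps rather than of generated code.</p>

```cpp
#include <cstddef>

struct point { int x; int y; };
struct inline_rect { point tl; point br; };
struct indirect_rect { const point *tl; const point *br; };

// Inline: one load at a fixed offset from the rectangle's base address.
int top_left_x(const inline_rect &r) {
    return r.tl.x;
}

// Indirection: load the pointer stored in the rectangle, then load x
// through it. The second load depends on the result of the first.
int top_left_x(const indirect_rect &r) {
    return r.tl->x;
}

// The fixed offset used in the inline case, known at compile time.
constexpr std::size_t inline_x_offset =
    offsetof(inline_rect, tl) + offsetof(point, x);
```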
<h4>Object copying/serialization</h4>
<p>Another common problem with the indirection scheme is that it introduces the notion of shallow copy v/s deep copy. Most common implementations of object copying tend to do shallow copies. Explicit effort is usually needed to provide deep copy semantics. In case you have been reading this article as a C v/s Java thing, now is a good time to wake up. C++ programmers, for example, have always had to deal with the &#8220;how deeply should we copy&#8221; problem whenever they chose to use an indirection scheme. In fact, the same is true even of C programmers, except that they never had the option to overload the assignment operator and the copy constructor; hence, it always had to be controlled using documentation + special functions.</p>
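<p>A small C++ sketch of the distinction, assuming the points are held through <em>std::shared_ptr</em>. The helper names <em>shallow_copy</em> and <em>deep_copy</em> are hypothetical and exist only for this illustration.</p>

```cpp
#include <memory>

struct point { int x; int y; };

// A rectangle that refers to its corners instead of containing them.
struct rect {
    std::shared_ptr<point> tl;
    std::shared_ptr<point> br;
};

// The default copy duplicates the references, not the points:
// both rectangles end up referring to the same two point objects.
rect shallow_copy(const rect &r) {
    return r;
}

// A deep copy has to be written out explicitly: every referenced
// point is cloned so the new rectangle shares nothing with the old.
rect deep_copy(const rect &r) {
    return rect{std::make_shared<point>(*r.tl),
                std::make_shared<point>(*r.br)};
}
```

<p>After a shallow copy, a change made through one rectangle is visible through the other; after a deep copy it is not.</p>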
<p>A variant of object copying is serialization. Serializing an object with indirections is usually a recursive descent problem given the non-contiguous memory layout. Fully inlined objects, on the other hand, can be serialized in one logical instruction that transfers a block of memory from one location to another.</p>
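<p>The contrast can be sketched in C++ like this. It is only an in-process illustration: the byte layout is whatever the compiler chose, so nothing here addresses endianness or portability between machines, and the function names are assumptions made for the example.</p>

```cpp
#include <cstring>
#include <vector>

struct point { int x; int y; };
struct inline_rect { point tl; point br; };
struct indirect_rect { const point *tl; const point *br; };

// Inline composition: the whole object is one contiguous block,
// so a single block copy captures all of its state.
std::vector<unsigned char> serialize(const inline_rect &r) {
    std::vector<unsigned char> out(sizeof r);
    std::memcpy(out.data(), &r, sizeof r);
    return out;
}

// Helper: append the bytes of one point to the output.
void append(std::vector<unsigned char> &out, const point &p) {
    const unsigned char *b = reinterpret_cast<const unsigned char *>(&p);
    out.insert(out.end(), b, b + sizeof p);
}

// Indirection: copying the rectangle itself would only capture
// pointer values, so each reference has to be followed in turn.
std::vector<unsigned char> serialize(const indirect_rect &r) {
    std::vector<unsigned char> out;
    append(out, *r.tl);
    append(out, *r.br);
    return out;
}
```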
<h4>No memory sharing</h4>
<p>Interprocess shared memory is a powerful concept in latency sensitive applications. Multiple processes can get a consistent view (including immediate write visibility) without the overheads of IPC. The presence of pointers/references however makes it near impossible to share such objects across processes without implementing a userspace level virtual address manager. This is so because it is usually infeasible to map a given object at the same virtual address in multiple processes. Failure to do so means the references/pointers are no longer valid.</p>
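<p>One simple form such an address manager can take is to store offsets from the start of the shared region instead of absolute pointers. The C++ sketch below is an assumption-laden illustration of that idea: a plain byte array stands in for the shared segment, copying the array stands in for mapping it at a different address, and all names are made up for the example.</p>

```cpp
#include <array>
#include <cstddef>
#include <cstring>

struct point { int x; int y; };

// References are stored as byte offsets from the start of the region,
// so they stay meaningful wherever the region happens to be mapped.
struct shared_rect { std::size_t tl_off; std::size_t br_off; };

using region = std::array<unsigned char, 64>;

// Lay out a rectangle header followed by its two points in the region.
void store_rect(region &mem, point tl, point br) {
    shared_rect header{sizeof(shared_rect),
                       sizeof(shared_rect) + sizeof(point)};
    std::memcpy(mem.data(), &header, sizeof header);
    std::memcpy(mem.data() + header.tl_off, &tl, sizeof tl);
    std::memcpy(mem.data() + header.br_off, &br, sizeof br);
}

// Resolve the top-left reference relative to this mapping's base.
point load_top_left(const region &mem) {
    shared_rect header;
    std::memcpy(&header, mem.data(), sizeof header);
    point p;
    std::memcpy(&p, mem.data() + header.tl_off, sizeof p);
    return p;
}
```

<p>Had the header stored raw pointers, a copy of the region at another address would still point back into the original one.</p>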
<h3>So why go through all of this?</h3>
<p>The most common reason these days for choosing indirection over inlining is that most programming languages no longer offer the choice: automatic memory management becomes easier once we assume each sub-object has a life of its own. Should you happen to have a choice, then the question is whether the sub-object is really a part of the main object or just something that the main object collaborates with. In case of a collaboration, again we have to resort to indirection. That being said, there clearly are situations where it does help to have inlined objects.</p>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2011/12/object-composition-2/feed/</wfw:commentRss>
		<slash:comments>138</slash:comments>
		</item>
		<item>
		<title>Implications of object composition styles</title>
		<link>http://anomalizer.net/statistically-incorrect/2011/10/object-composition/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2011/10/object-composition/#comments</comments>
		<pubDate>Sun, 02 Oct 2011 07:03:05 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[OOP]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=84</guid>
		<description><![CDATA[Object composition is a concept older than object oriented programming itself. Let us start off with a well understood example of how to represent a rectangle on a Cartesian plane: struct point &#123; int x; int y; &#125; ; struct rectangle &#123; point tl; /* top left corner */ point br; /* bottom right corner [...]]]></description>
			<content:encoded><![CDATA[<p>Object composition is a concept older than object-oriented programming itself. Let us start off with a well understood example of how to represent a rectangle on a Cartesian plane:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">struct</span> point <span style="color: #009900;">&#123;</span>
   <span style="color: #993333;">int</span> x<span style="color: #339933;">;</span>
   <span style="color: #993333;">int</span> y<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span> <span style="color: #339933;">;</span>
<span style="color: #993333;">struct</span> rectangle <span style="color: #009900;">&#123;</span>
   <span style="color: #993333;">struct</span> point tl<span style="color: #339933;">;</span> <span style="color: #808080; font-style: italic;">/* top left corner */</span>
   <span style="color: #993333;">struct</span> point br<span style="color: #339933;">;</span> <span style="color: #808080; font-style: italic;">/* bottom right corner */</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Here the <em>rectangle</em> object is a <strong>composition</strong> of two <em>point</em> objects and tl &amp; br are <strong>sub-objects</strong> (not to be confused with sub-classing) of <em>rectangle</em>.</p>
<h3>Revealing the sub-structure</h3>
<p>Most objects end up revealing the state captured by their sub-objects in some shape or form. Writing a &#8220;<em>java bean</em>&#8221; that marks the sub-objects as private only to reveal them via setters &amp; getters is effectively the same as having no visibility controls in place. The interesting question to ask is if the composing object allows programmers to interact with a snapshot of the sub-object or the actual sub-object itself. Let us once again go back to some code written in two different languages and see if you can spot the difference.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">struct</span> point <span style="color: #008000;">&#123;</span> <span style="color: #0000ff;">int</span> x<span style="color: #008080;">;</span> <span style="color: #0000ff;">int</span> y<span style="color: #008080;">;</span> <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span>
<span style="color: #0000ff;">class</span> rectangle <span style="color: #008000;">&#123;</span>
   point tl<span style="color: #008080;">;</span> point br<span style="color: #008080;">;</span>
<span style="color: #0000ff;">public</span><span style="color: #008080;">:</span>
   <span style="color: #0000ff;">void</span> set_top_left<span style="color: #008000;">&#40;</span>point a<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>tl <span style="color: #000080;">=</span> a<span style="color: #008080;">;</span> <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span>
   point get_top_left<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><span style="color: #0000ff;">return</span> tl<span style="color: #008080;">;</span><span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> point <span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">int</span> x<span style="color: #339933;">;</span> <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">int</span> y<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">class</span> rectangle <span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> point tl<span style="color: #339933;">;</span> <span style="color: #000000; font-weight: bold;">private</span> point br<span style="color: #339933;">;</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> set_top_left<span style="color: #009900;">&#40;</span>point a<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>tl <span style="color: #339933;">=</span> a<span style="color: #339933;">;</span><span style="color: #009900;">&#125;</span>
  <span style="color: #000000; font-weight: bold;">public</span> point get_top_left<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><span style="color: #000000; font-weight: bold;">return</span> tl<span style="color: #339933;">;</span><span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>The first snippet is written in C++ and the second one is in Java. While the code looks nearly identical, the behaviour of the two programs is completely different.</p>
<p>In the C++ version, the <em>setter</em> accepts a point and copies its state onto the sub-object that represents the top-left point. Any manipulation of the argument by the caller after <em>set_top_left</em> has been invoked does not affect the state of the rectangle. Likewise, the getter returns a copy of the state of the top-left point. Any changes made to the top-left point in the rectangle after the <em>getter</em> has returned are not visible to the caller. The reverse insulation also exists in both cases, i.e. any changes made by the rectangle after the setter has been invoked are not reflected in the variable that was passed to the setter, and any changes made to the variable holding the return value of the getter do not affect the state of the rectangle.</p>
<p>In the Java version, no snapshotting occurs. For example, if the caller chooses to update the abscissa of the value that it passed to the setter, then the rectangle would also see the update. The assignment is not a transfer of state. Instead, it is merely a transfer of responsibility to some other object.</p>
<p>It is perfectly possible to implement either kind of behaviour in both languages; it just so happens that the most simplistic (and natural) style of coding produces different results. The exact means to mimic the <em>&#8220;other behaviour&#8221;</em> is left as an exercise to the reader.</p>
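<p>As a hint for the C++ half of that exercise, here is one possible sketch. The accessor <em>top_left_ref</em> is a hypothetical name introduced for this illustration; returning a reference is only one of several ways to expose the live sub-object.</p>

```cpp
struct point { int x; int y; };

class rectangle {
    point tl{0, 0};
    point br{0, 0};
public:
    void set_top_left(point a) { tl = a; }

    // Snapshot: the caller receives a copy of the sub-object's state.
    point get_top_left() const { return tl; }
    point get_bottom_right() const { return br; }

    // Live access: the caller receives the sub-object itself, so writes
    // through the returned reference change the rectangle.
    point &top_left_ref() { return tl; }
};
```

<p>Changing the value returned by <em>get_top_left</em> leaves the rectangle untouched, whereas a write through <em>top_left_ref</em> is seen by every later call on the same rectangle.</p>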
<h3>Are sub-objects the same as free standing objects?</h3>
<p>Oftentimes, programmers tend to forget the difference between a class and an object, and also fail to separate implementation details from core concepts. Try to answer the following question to see if you understand the difference: <em>&#8220;Does the top-left corner of a rectangle have an identity and existence outside of the rectangle?&#8221;</em></p>
<p>For starters, the notion of a top-left corner itself is very questionable. It comes into play only if we assume that the edges of the rectangle are parallel to either the x or the y axis. In effect, any point that needs to be treated as the top-left corner of the rectangle is only valid as far as a specific implementation of the rectangle is concerned. Unfortunately, this is not the main topic of the post. We shall instead spend more time on the existence part of things. <em><strong>Point</strong></em>, as a concept, can clearly exist without rectangles as a concept. A specific instance of point, say (3, 5), can also exist without a rectangle. The same co-ordinates (3, 5) can also be a vertex of a rectangle. However, it does not mean that a free standing point (3, 5) has the same utility as a vertex (3, 5) that is the top-left corner of a rectangle.</p>
<p>A piece of code that is logically expecting the top-left corner of a rectangle can potentially fail if it is given an arbitrary point. Protection against such failures is usually managed by carefully structuring code paths. One must realize that the semantic types of these entities are different even though the user-defined data type happens to be the same. This causes a lot more problems than one would anticipate. The most perplexing question is: can a vertex of a rectangle outlive the rectangle itself?</p>
<p>The answer is: not unless it is semantically cast to something else. As mentioned earlier, it is usually not possible to express this, at least conveniently. This causes weird situations if the designer of the rectangle class has chosen either to use an otherwise general purpose point to represent its vertex or to directly expose the underlying vertex to the outside world. The former case is not as troubling as the latter one, and here is why. Try answering the question <em>&#8220;What happens to the leaked/revealed vertex after the rectangle dies?&#8221;</em>. If we are still able to meaningfully use that object, then it is because we have done a semantic cast that enables us to use it in a different context. If not, the subsequent operations that we perform are gibberish. Having a memory managed runtime usually makes it very hard to appreciate this aspect, since the object as understood by the language continues to remain valid and hence one finds it that much harder to spot the issue.</p>
<h3>What does it really mean?</h3>
<p>In short, if you choose to directly reveal a live sub-object as opposed to a snapshot, care must be taken to ensure that code operating on the sub-object understands the lifecycle of the containing/composing object. Just because your runtime guarantees that no memory corruption or leaks will occur does not mean that you can be sloppy. If you are still a non-believer, see the code below and spot the trouble for yourself.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> java.<span style="color: #006633;">util</span>.<span style="color: #003399;">ArrayList</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> java.<span style="color: #006633;">util</span>.<span style="color: #003399;">List</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">class</span> TopMostPointSeeker <span style="color: #009900;">&#123;</span>
   <span style="color: #666666; font-style: italic;">/* Holds the live top-left sub-objects of the rectangles, not snapshots. */</span>
   <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #003399;">List</span><span style="color: #339933;">&lt;</span>point<span style="color: #339933;">&gt;</span> pts <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">ArrayList</span><span style="color: #339933;">&lt;</span>point<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #000000; font-weight: bold;">public</span> TopMostPointSeeker<span style="color: #009900;">&#40;</span><span style="color: #003399;">List</span><span style="color: #339933;">&lt;</span>rectangle<span style="color: #339933;">&gt;</span> rs<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span>rs <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span> <span style="color: #339933;">||</span> rs.<span style="color: #006633;">size</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #000000; font-weight: bold;">throw</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">RuntimeException</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Need non-empty collection&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
      <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span>rectangle r <span style="color: #339933;">:</span> rs<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         pts.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span>r.<span style="color: #006633;">get_top_left</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
   <span style="color: #009900;">&#125;</span>
&nbsp;
   <span style="color: #000000; font-weight: bold;">public</span> point getTopMost<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #666666; font-style: italic;">/* Implementation delinked from list of active rectangles
       * May not return the topmost point at a given instance.
       *
       * Nothing in the externally visible interface reveals this feature/bug.
       */</span>
      point retval <span style="color: #339933;">=</span> pts.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span>point p <span style="color: #339933;">:</span> pts<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span>p.<span style="color: #006633;">y</span> <span style="color: #339933;">&gt;</span> retval.<span style="color: #006633;">y</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            retval <span style="color: #339933;">=</span> p<span style="color: #339933;">;</span>
         <span style="color: #009900;">&#125;</span>
      <span style="color: #009900;">&#125;</span>
      <span style="color: #000000; font-weight: bold;">return</span> retval<span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Wait for part 2 of this story, where we look at the most common implementations and their implications.</p>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2011/10/object-composition/feed/</wfw:commentRss>
		<slash:comments>142</slash:comments>
		</item>
		<item>
		<title>In search of a job queue</title>
		<link>http://anomalizer.net/statistically-incorrect/2010/12/in-search-of-a-job-queue/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2010/12/in-search-of-a-job-queue/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 16:40:13 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[job queue]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=59</guid>
		<description><![CDATA[Job queues are an essential element in internet application design to execute long running tasks. This stems from the fact that web servers and consequently web applications are best suited for interactive applications. If a particular operation that needs to be carried out fits the description below, then it makes a good case for the [...]]]></description>
			<content:encoded><![CDATA[<p>Job queues are an essential element in internet application design for executing long-running tasks. This stems from the fact that web servers, and consequently web applications, are best suited for interactive applications. If a particular operation that needs to be carried out fits the description below, then it makes a good case for the usage of job queues:</p>
<ul>
<li>For the most part, these tasks need to be carried out in near-instantaneous fashion. Specifically, the commencement of execution of the task should happen at the earliest possible time. The expectation of completion however depends on the nature of the task.</li>
<li>If the task cannot be executed immediately for some reason, it should be queued up and processed later.</li>
<li>Executing the task is resource intensive in some form or the other.</li>
<li>Tasks once submitted must not be dropped to the extent possible.</li>
</ul>
<p>The desirable features of a job queue are as follows:</p>
<ol>
<li>The job queue should have durability.</li>
<li>The queue should support multiple producers &amp; consumers. These queue operations would be performed across hosts.</li>
<li>The draining of the queue should happen as quickly as possible. If a new task gets added to the system and there are idle consumers, the consumption should commence at the earliest.</li>
</ol>
<p>The first two points are well addressed by an RDBMS solution. However, it struggles to achieve the third point since there is no inherent notification mechanism; aggressive polling is the closest substitute, but it does not scale very well. A message queue is good at supporting the last two points, but trying to maintain credible state in it is exceptionally hard. Interestingly, the popular job queue solutions out there choose to use either an RDBMS (example: <a href="http://gearman.org/">gearman</a>) or an MQ (example: <a href="http://celeryproject.org/">celery</a>). However, a mix of both seems to be the right answer. I shall briefly describe what it looks like.</p>
<h3>Adding a new task</h3>
<ul>
<li>Add a new element to your data store. This element should represent every aspect of the task such as the task type, the task details and also task management data such as execution status. A unique id must also be generated by the producer before adding the task to the store. Failure to make this entry is considered as failure to accept the job.</li>
<li>A notification event is sent out on a message queue. The notification contains the task type and task id.</li>
</ul>
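<p>The two steps above can be sketched in C++ as follows. This is only an in-memory illustration under loud assumptions: a <em>std::map</em> stands in for the durable data store, a <em>std::deque</em> stands in for the message queue, and every type and function name is made up for the example.</p>

```cpp
#include <deque>
#include <map>
#include <stdexcept>
#include <string>

enum class status { pending, running, done, failed };

// Everything about the task lives in the store, including its status.
struct task {
    std::string type;
    std::string details;
    status state = status::pending;
};

// The notification carries only the task type and the task id.
struct notification { std::string type; std::string id; };

struct job_queue {
    std::map<std::string, task> store;   // stand-in for the data store
    std::deque<notification> mq;         // stand-in for the message queue

    // The producer supplies the unique id. If the store entry cannot be
    // made, the job is not accepted and no notification is sent.
    void submit(const std::string &id, const std::string &type,
                const std::string &details) {
        task t;
        t.type = type;
        t.details = details;
        if (!store.emplace(id, t).second) {
            throw std::runtime_error("job not accepted: duplicate id");
        }
        mq.push_back(notification{type, id});
    }
};
```

<p>Writing to the store before notifying means a lost notification can delay a task but cannot lose it.</p>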
<h3>Processing a task, the normal case</h3>
<ul>
<li>A pool of consumers is actively waiting for notification of a new task and starts working the moment it gets a notification. The delivery mechanism of the notification can be configured to reach either exactly one or at least one consumer based on what looks like the right trade-off.</li>
<li>The consumer checks with the data store and manipulates it accordingly to indicate that it has volunteered to perform the task.</li>
<li>When it is done processing the task (either successfully or unsuccessfully), it updates the store with the outcome.</li>
</ul>
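<p>The volunteering step is essentially a conditional update: only the consumer that moves the task out of the pending state gets to run it. Below is a single-threaded C++ sketch with a <em>std::map</em> standing in for the data store; all names are illustrative, and a real store would have to perform the check and the update atomically.</p>

```cpp
#include <map>
#include <string>

enum class status { pending, running, done, failed };

struct task {
    std::string type;
    status state = status::pending;
    std::string owner;   // consumer that volunteered, empty if none
};

using store_t = std::map<std::string, task>;

// Returns true only for the consumer that moves the task from pending
// to running. With at-least-one delivery, the others get false and skip.
bool try_claim(store_t &store, const std::string &id,
               const std::string &consumer) {
    auto it = store.find(id);
    if (it == store.end() || it->second.state != status::pending) {
        return false;
    }
    it->second.state = status::running;
    it->second.owner = consumer;
    return true;
}

// Record the outcome once processing has finished, either way.
void record_outcome(store_t &store, const std::string &id, bool succeeded) {
    store.at(id).state = succeeded ? status::done : status::failed;
}
```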
<h3>Processing a task, the abnormal cases</h3>
<ul>
<li>The notification message could have gotten lost and not reached any consumer for a variety of reasons. It is necessary to <em>sweep</em> the job queue periodically for any unprocessed tasks and trigger their execution. The latencies associated with this are comparable to those of a pure RDBMS based queue. Specifically, the need to scan by the value of a field (task status) in addition to the normal access pattern based on id is what makes an RDBMS a convenient choice.</li>
<li>Semi-completed and also failed tasks may have to be retried depending upon the semantics of the task at hand. This might require a back-off mechanism which will effectively need a scheduler. In such situations, the scheduler needs to be held outside of the job queue to achieve clear separation of responsibilities.</li>
</ul>
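<p>The sweep amounts to a scan by status. Here is a minimal C++ sketch, again with a <em>std::map</em> standing in for the store. The <em>submitted_at</em> field and the grace period are assumptions added for the illustration, so that tasks whose notification is probably still in flight are not picked up prematurely.</p>

```cpp
#include <map>
#include <string>
#include <vector>

enum class status { pending, running, done, failed };

struct task {
    std::string type;
    status state = status::pending;
    long submitted_at = 0;   // seconds, on whatever clock the store uses
};

// Return the ids of tasks that are still pending after the grace period.
// The caller would re-trigger execution for each of them.
std::vector<std::string> sweep(const std::map<std::string, task> &store,
                               long now, long grace) {
    std::vector<std::string> stale;
    for (const auto &entry : store) {
        const task &t = entry.second;
        if (t.state == status::pending && now - t.submitted_at >= grace) {
            stale.push_back(entry.first);
        }
    }
    return stale;
}
```

<p>In an RDBMS the same scan would be a query on the status column, which is the access pattern the first bullet refers to.</p>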
<p>So far, I have not been able to find any open source solution that seems to follow the above approach. If you know of any, do let me know. Otherwise, I shall get down to implementing one.</p>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2010/12/in-search-of-a-job-queue/feed/</wfw:commentRss>
		<slash:comments>137</slash:comments>
		</item>
		<item>
		<title>An introduction to the elements of thrift serialization</title>
		<link>http://anomalizer.net/statistically-incorrect/2010/11/introduction-thrift-serialization/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2010/11/introduction-thrift-serialization/#comments</comments>
		<pubDate>Mon, 01 Nov 2010 16:55:51 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[thrift]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=45</guid>
		<description><![CDATA[Apache thrift is a surprisingly popular data serialization and RPC library. I say surprising because at the time of this writing, there is hardly any decent documentation out there that explains the elements of thrift serialization. It is however easy to find tutorials that help you hit the ground running very fast. This article assumes [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://wiki.apache.org/thrift/">Apache Thrift</a> is a surprisingly popular data serialization and RPC library. I say surprising because, at the time of this writing, there is hardly any decent documentation out there that explains the elements of thrift serialization. It is however easy to find tutorials that help you hit the ground running very fast. This article assumes you are conversant with the simple examples.</p>
<h3>TBase objects</h3>
<p>Data structures are defined using the thrift IDL. Code is then generated using this IDL in some <a href="http://wiki.apache.org/thrift/ThriftUsage">programming language</a>. The generated class usually inherits from a type known as TBase defined in the thrift library. Objects of type TBase are the ones that can be serialized or deserialized.</p>
<h3>Protocols (Serializers)</h3>
<p>Strictly speaking, modern day thrift (0.5 at the time of this writing) is a pluggable serialization library rather than one specific serialization format. The original format is simply called the <strong>binary protocol</strong> (class name TBinaryProtocol). Other available protocols are the compact protocol and the JSON protocol. The job of a protocol is to convert a TBase object to and from a byte stream.</p>
<h3>Transports</h3>
<p>In thrift speak, a transport is a place where the serialized representation of an object is written to (or read from in the case of deserialization). Since the motivation for serialization tends to be either persistence or a network transfer, it is not surprising to find a transport for a generic stream (I/O stream transport) and more specialized versions of it for various kinds of network endpoints. A <em>memory transport</em> is also available for applications that wish to work off an in-memory representation of the serialized data. While most transports will pass on the output of a serializer as is, a transport may choose to alter the byte stream as it pleases. The most common usage is byte addition for message framing. Serializers, in general, do not produce message boundary markers. If multiple objects need to be used in conjunction with streams, it becomes convenient to have message boundary markers. The framed transport is a good example of a transport that does exactly this kind of byte addition.</p>
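<p>The idea of byte addition for message framing can be sketched in a few lines of Python. This is only an illustration and not thrift's own code: it assumes each message is prefixed with a 4-byte big-endian length, and the names <em>frame</em> and <em>unframe</em> are made up for this example.</p>

```python
import struct

# Assumption for this sketch: a frame is a 4-byte big-endian length
# followed by the serialized message itself.
HEADER = struct.Struct("!I")

def frame(payload):
    """Prefix one serialized message with its length."""
    return HEADER.pack(len(payload)) + payload

def unframe(stream):
    """Split a byte string of concatenated frames back into messages."""
    messages = []
    offset = 0
    while offset != len(stream):
        (size,) = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        messages.append(stream[offset:offset + size])
        offset += size
    return messages

wire = frame(b"first object") + frame(b"second object")
print(unframe(wire))
```

<p>The serializer never sees the length prefix; the reader uses it to know where one object ends and the next one begins.</p>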
<h3>Piecing it all together</h3>
<p>A transport object is created and associated with a protocol object. This protocol object can now be used in conjunction with TBase objects. TBase objects have read() &amp; write() functions that take a protocol object as an argument. Thrift also comes with seemingly tempting utility classes called <em>TDeserializer</em> &amp; <em>TSerializer</em>. They are however not the best choice since they tend to be restrictive in terms of the choices of transports and protocols. Here is a pseudocode sample to illustrate the usage pattern:</p>
<pre>class Point inherits TBase
{
  int x
  int y
}

Point obj

FileStream ifs = open("path to file", "r")
TBinaryProtocol tp1(TIOStream(ifs))
obj.read(tp1)

obj.y = 100

FileStream ofs = open("path to file", "w")
TBinaryProtocol tp2(TIOStream(ofs))
obj.write(tp2)</pre>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2010/11/introduction-thrift-serialization/feed/</wfw:commentRss>
		<slash:comments>328</slash:comments>
		</item>
		<item>
		<title>The fair coin paradox</title>
		<link>http://anomalizer.net/statistically-incorrect/2010/10/fair-coin-paradox/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2010/10/fair-coin-paradox/#comments</comments>
		<pubDate>Sun, 03 Oct 2010 05:12:49 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=43</guid>
		<description><![CDATA[One of the most commonly accepted yet fundamentally flawed ideas in probability theory is that of a fair coin. A fair coin is defined as a coin that, when tossed a sufficient number of times, will return roughly the same number of heads and tails. The trick here is that both the number of trials required [...]]]></description>
			<content:encoded><![CDATA[<p>One of the most commonly accepted yet fundamentally flawed ideas in probability theory is that of a <strong>fair coin</strong>. A fair coin is defined as a coin that, when tossed a <em>sufficient number of times</em>, will return <em>roughly</em> the same number of heads and tails. The trick here is that both the number of trials required to establish the fairness and the accuracy are left open-ended.</p>
<h3>The evil coin</h3>
<p>Let us say that we define some number &#8220;<em>n</em>&#8221; as the number of trials for which the deviation between heads and tails is &#8220;<em>ε</em>&#8221;. The first problem is that past performance is not indicative of future performance. Let us say that the actual number of trials needed to establish the fairness of a coin is &#8220;<em>m</em>&#8221; where <em>m &gt;&gt; n</em>. In such a case, the permitted deviation becomes <em>(ε/n)*m</em>. It is perfectly possible that in all the remaining <em>m-n</em> trials, we get exclusively heads or exclusively tails, and we just happened to look at a small enough subsequence of the trials when we made our initial conclusion. In effect, an <em><strong>evil coin</strong></em> that wishes to deceive us can always do so. So we can never really comment on the fairness of a coin without making the assumption that what we have seen thus far is not an aberration of the true nature of the coin.</p>
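<p>To make the evil coin concrete, here is a small Python sketch. The function names and the values of <em>n</em> and <em>m</em> are made up for illustration: the coin alternates perfectly for the first <em>n</em> tosses and returns only heads for the remaining <em>m-n</em>.</p>

```python
def imbalance(tosses):
    """Absolute difference between the number of heads and tails."""
    heads = tosses.count("H")
    return abs(heads - (len(tosses) - heads))

def evil_coin(n, m):
    """Hypothetical adversary: looks fair for n tosses, then always shows heads."""
    fair_looking = ["H" if i % 2 == 0 else "T" for i in range(n)]
    return fair_looking + ["H"] * (m - n)

tosses = evil_coin(n=100, m=10000)
print(imbalance(tosses[:100]))  # what we observed in the first n trials
print(imbalance(tosses))        # what the full m trials look like
```

<p>The first 100 tosses show no deviation at all, while the full run of 10000 tosses is off by 9900. Nothing in the first <em>n</em> trials could have warned us.</p>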
<h3>Independent trials</h3>
<p>Another interesting problem is one of independent trials. If what is deemed a fair coin has been returning many more heads than tails in recent trials, then the probability of the next trial returning tails has to go up every time it returns heads from this point onwards until the number of heads and tails roughly match. Either that, or we are dealing with an unfair coin. I first <a href="http://logicalreligion.blogspot.com/2008/11/random-ness-predicting-future.html">encountered</a> this problem about a couple of years ago. As one of the comments points out, we are dealing with an independent and identically distributed trial problem. In the short run, getting heads in a certain trial using a fair coin does not imply that it has increased the odds of getting tails in the next trial, but as mentioned earlier, a sufficiently long trial sequence that indicates a bias in one direction does have some implications for future trials.</p>
<h3>Subset v/s subsequence</h3>
<p>While there seems to be some expectation over fairness being exhibited over a subsequence of trials, the question to ask is if it can be expected over an arbitrary subset of trials from a universal set where we can observe that the coin is indeed fair. For simplicity let us restrict ourselves to both universal sets and subsets whose cardinality is even. It becomes fairly obvious that an arbitrary subset can result in an arbitrary conclusion. What can however be said is that for every subset that is biased a certain way, there is another subset that is biased exactly the other way. In effect, they cancel each other out.</p>
<p>A subsequence happens to be just one arbitrary subset. If so, then why do we expect it to exhibit fairness? The answer lies in the subset that we did not choose. Given that we have fixed the size of a universal set <em>and</em> deemed it to exhibit a 50:50 distribution, we can pick any subset of our choice, but by doing so, we have defined the outcome of the remaining trials to the extent of what their distribution is going to be.</p>
<h3>Conclusion</h3>
<p>All troubles seem to originate from the assumption that the coin can demonstrate fairness over the next &#8220;n&#8221; trials. That it will actually do so is not guaranteed and, more importantly, if it doesn&#8217;t do so it means nothing, i.e. statements like &#8220;it has a 99% chance of doing so&#8221; mean nothing since you tend to reinvent the problem by now making the unit of your trial the original &#8220;n&#8221; trials and start all over again.</p>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2010/10/fair-coin-paradox/feed/</wfw:commentRss>
		<slash:comments>327</slash:comments>
		</item>
		<item>
		<title>Strings, lists &amp; gatherers</title>
		<link>http://anomalizer.net/statistically-incorrect/2010/04/strings-syscalls-lists/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2010/04/strings-syscalls-lists/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 18:29:01 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scatter gather]]></category>
		<category><![CDATA[string operation]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=36</guid>
		<description><![CDATA[Problem statement: You need to send a dynamically constructed stream of bytes over some I/O channel. An illustrative example As always, I shall take an example in the internet space and more specifically in HTML. Let us say that you have a classical two dimensional array whose dimensions are not known at compile time. The [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Problem statement:</strong> You need to send a dynamically constructed stream of bytes over some I/O channel.</p>
<h3>An illustrative example</h3>
<p>As always, I shall take an example in the internet space and more specifically in HTML. Let us say that you have a classical two dimensional array whose dimensions are not known at compile time. The contents of this array happen to be strings. This 2d array needs to be printed in the form of a valid HTML table. To make things more interesting, each cell in the table may be given a CSS class based on the content of the cell. The output could look as follows:<br />
<code> &lt;table&gt;<br />
  &lt;tr&gt;<br />
   &lt;td class="ve"&gt;one&lt;/td&gt;<br />
   &lt;td class="ve"&gt;two&lt;/td&gt;<br />
   &lt;td class="ve"&gt;three&lt;/td&gt;<br />
  &lt;/tr&gt;<br />
  &lt;tr&gt;<br />
   &lt;td&gt;four&lt;/td&gt;<br />
   &lt;td class="ve"&gt;five&lt;/td&gt;<br />
   &lt;td&gt;six&lt;/td&gt;<br />
  &lt;/tr&gt;<br />
  &lt;tr&gt;<br />
   &lt;td&gt;seven&lt;/td&gt;<br />
   &lt;td&gt;eight&lt;/td&gt;<br />
   &lt;td class="ve"&gt;nine&lt;/td&gt;<br />
  &lt;/tr&gt;<br />
&lt;/table&gt;</code><br />
The rule for having the optional CSS class is left as a puzzle for the bored reader.</p>
<h3>A naïve approach</h3>
<p>The most straightforward technique is to keep writing one chunk at a time, peeking into the array as necessary. The reality is that very few people do it since everyone is taught that <em>unbuffered</em> I/O is expensive. The origin of this problem is the fact that in most mainstream operating systems, only the kernel is allowed to do actual I/O and hence a system call needs to be made. The cost of switching from user space to kernel space and back is considered to be high.</p>
<h3>The &#8220;fix&#8221;</h3>
<p>Hence, the strategy is to accumulate a fair number of bytes that need to be transmitted and then do fewer system calls. I/O libraries in programming languages usually provide transparent APIs where this accumulation happens. It usually comes with the burden of expecting the programmer to indicate the end of the stream so that any leftover bytes can be <em>flushed</em> by making one last syscall.</p>
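<p>A minimal Python sketch of this accumulation follows. It is a toy model and not how any particular I/O library is implemented: <em>CountingSink</em> merely stands in for the kernel, and the 4096 byte capacity is an arbitrary assumption.</p>

```python
class CountingSink:
    """Stands in for the kernel: every write() here models one system call."""
    def __init__(self):
        self.calls = 0
        self.data = bytearray()

    def write(self, chunk):
        self.calls += 1
        self.data += chunk

class BufferedWriter:
    """Accumulates bytes and forwards them to the sink in larger chunks."""
    def __init__(self, sink, capacity=4096):
        self.sink = sink
        self.capacity = capacity
        self.pending = bytearray()

    def write(self, chunk):
        self.pending += chunk
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        """Push out whatever is left; the caller must do this at end of stream."""
        if self.pending:
            self.sink.write(bytes(self.pending))
            self.pending.clear()

cells = [b"one", b"two", b"three"] * 1000

unbuffered = CountingSink()
for cell in cells:
    unbuffered.write(cell)

buffered_sink = CountingSink()
writer = BufferedWriter(buffered_sink)
for cell in cells:
    writer.write(cell)
writer.flush()  # the "one last syscall" for the leftover bytes

print(unbuffered.calls, buffered_sink.calls)
```

<p>Both sinks end up with the same bytes, but the buffered one gets there with a handful of calls instead of one per cell. Note that every chunk is copied into the pending buffer on its way out.</p>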
<h3>Memory copy</h3>
<p>One of the sources of overhead (not the largest) of making a syscall arises from having to copy data from user space to kernel space. The same kind of data copy operation happens very often in user space when strings are concatenated. If either the user program directly concatenates or the buffering implementation does so in the end, a cost similar to that part of the syscall overhead is incurred. The plausible reason is that the most common form of output API that programmers are exposed to takes a single stream/string.</p>
<h3>Fewer copies</h3>
<p>A far less known technique called scatter/gather I/O exists that sort of addresses this problem. The <em>gather</em> operation is used for writing output in a single shot from multiple input byte buffers. This API (in both POSIX and Windows) accepts an array of buffers over which it sequentially iterates and writes the output. The problem is now reduced to having an array with pointers/references to all the buffers. If you have come down to having to operate at this level, chances are, you might not want to deal with magically expanding arrays. Your option at that point would be to use a linked list to accumulate all the references to the buffers and then turn it into an array just before performing the write operation.</p>
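<p>Here is a hedged sketch of a gather write using Python's os.writev, which wraps the POSIX writev call on Unix-like systems. The helper <em>gather_json_array</em> is made up for this example and assumes the strings need no JSON escaping. A pipe is used only so that the output can be read back.</p>

```python
import os

def gather_json_array(strings):
    """Return a list of buffers that together spell a JSON array of strings.

    Assumes the strings need no JSON escaping. The payload buffers are
    referenced as they are and never copied into one big string.
    """
    buffers = [b"["]
    for index, payload in enumerate(strings):
        if index:
            buffers.append(b",")
        buffers.extend((b'"', payload, b'"'))
    buffers.append(b"]")
    return buffers

read_end, write_end = os.pipe()
buffers = gather_json_array([b"one", b"two", b"three"])
written = os.writev(write_end, buffers)  # one system call for all the buffers
os.close(write_end)
output = os.read(read_end, 1024)
os.close(read_end)
print(written, output)
```

<p>A Python list is exactly the kind of magically expanding array mentioned above, so the linked list step is not needed here. Real code would also have to handle a partial write and the system limit on the number of buffers per call.</p>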
<h3>But why all this you ask</h3>
<p>&#8230; because unless you are coding at what is considered fairly low levels these days (such as POSIX &amp; C), you do not realize how many times you are ripping up and recreating little byte streams in multiple layers of code using both direct and indirect constructs. This problem becomes very evident when trying to generate outputs in formats such as XML or JSON where there is a mix of a lot of what I call gluing bytes sprinkled liberally between the actual payload. Given an arbitrarily nested variable in a loosely typed language like, say, perl/php/python/javascript, I am wondering what the most elegant way is to arrive at a representation like JSON and perform an output operation without mindless string concatenation. Thoughts are welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2010/04/strings-syscalls-lists/feed/</wfw:commentRss>
		<slash:comments>235</slash:comments>
		</item>
		<item>
		<title>Why you can&#8217;t always just throw more hardware at it</title>
		<link>http://anomalizer.net/statistically-incorrect/2009/02/throwing-in-more-hardware-is-not-panacea/</link>
		<comments>http://anomalizer.net/statistically-incorrect/2009/02/throwing-in-more-hardware-is-not-panacea/#comments</comments>
		<pubDate>Fri, 06 Feb 2009 14:30:20 +0000</pubDate>
		<dc:creator>Arvind Jayaprakash</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cs fundamentals]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://anomalizer.net/statistically-incorrect/?p=4</guid>
		<description><![CDATA[A long time ago, people used to worry about the efficiency of the software they wrote. Then came a time when processors just kept getting faster every month, and the pace wouldn&#8217;t slow down even after crossing the 500MHz mark. Somewhere around this time, people started writing exceptionally bloated software and the bloat started to [...]]]></description>
			<content:encoded><![CDATA[<p>A long time ago, people used to worry about the efficiency of the software they wrote. Then came a time when processors just kept getting faster every month, and the pace wouldn&#8217;t slow down even after crossing the 500MHz mark. Somewhere around this time, people started writing exceptionally bloated software and the bloat started to grow at a phenomenal pace. Then came the new catchphrase <q>hardware is cheap, we can throw more hardware at it</q>. And in one magic swoop, all bloatware became perfectly acceptable since the bloat now seemed to be affordable. And this was precisely the point at which most people forgot their CS fundamentals. If you have done a course on CPU scheduling, you would know these metrics:</p>
<ol>
<li>CPU utilisation</li>
<li>Throughput</li>
<li>Turnaround time</li>
<li>Waiting time</li>
<li>Response time</li>
</ol>
<p>I will take up the web application space as an example in the remainder of the discussion since it has a fairly large development community and also because it is littered with the bloatware + hardware is cheap mentality. In web applications, the consumer is usually worried about response times and turnaround times. Let us say there is a solution <em>A</em> wherein it takes a full second for the server to process a single web request and a solution <em>B</em> that takes 50 milliseconds to process a single web request. A very misplaced number that people chase is <em>requests/second</em> and this is solved using the now infamous <strong>throw more hardware</strong> approach. A focus on throughput works in businesses where your consumers have nowhere else to go and your notion of increasing business is increasing volumes. You don&#8217;t hear of people switching banks because of how fast (or slow) their websites load, and the reason is that the main product offering is a banking service and not a website, i.e. you would worry more about interest rates than website response times. Businesses whose primary offering is the website itself cannot take such liberties.</p>
<h3>Turnaround time</h3>
<p>Turnaround time is the total time taken to service a request. So, if you have a slow running web page, you can keep adding more hardware to take on more volume (assuming the solution can be scaled out infinitely) but the experience of each individual user is not going to improve. Also, real world experience suggests that left to itself, a system starts to slow down as you scale out. A knee-jerk fix is to do things in parallel and use <em>threads</em>. That also usually doesn&#8217;t get you too far thanks to what a certain <a href="http://en.wikipedia.org/wiki/Amdahl's_law">Amdahl had to say</a>. This is where all those classes on algorithms, architecture and the abstinence from bloatware begin to make some difference.</p>
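<p>Amdahl&#8217;s law is easy to put into numbers. The Python sketch below uses the standard formula; the assumption that 80% of the work can run in parallel is picked purely for illustration.</p>

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Assumed workload: 80% of the request processing can be parallelized.
for workers in (1, 2, 8, 64, 1024):
    print(workers, round(amdahl_speedup(0.8, workers), 2))
```

<p>No matter how many threads or machines are added, the speedup in this example never reaches 5, because the serial 20% of the work still has to be done by someone.</p>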
<h3>Response time</h3>
<p>Response time is what is usually called <a href="http://blog.browsermob.com/2009/04/understanding-time-to-first-byte/">time to first byte</a> in the internet world. In trying to solve the turnaround time problem, one of the speedup areas that people work on is minimizing the context switches from user space to kernel space. <a href="http://www.ibm.com/developerworks/library/j-zerocopy/index.html">Zero copy</a> is an example of one such technique. The most common example however happens to be buffered files (or streams if you are from the Java world). Some people (and their software creations) take this to the extreme and try to send out the entire HTTP response in one shot, hoping to minimize the number of system calls needed to get the job done. It turns out that this makes for a worse user experience. Put another way, it is better to start sending something to the user after 200 milliseconds (ms) and finish it in the next 4 seconds rather than start sending something 2 seconds after the request was issued and get done in the next 500 ms. In fact this is a harder problem to solve for two reasons:</p>
<ul>
<li>Left to themselves, most web servers aren&#8217;t eager to push back smaller chunks of data (easier problem to solve)</li>
<li>Dynamic pages, especially the ones generated by MVC frameworks, do not make the response available to the web server until they have fully constructed the response body. Some of these solutions offer no straightforward way to push out data in parts while others have explicit mechanisms for achieving this effect.</li>
</ul>
<p>For those of you who are still wondering why something that puts extra load on the server <strong>and</strong> takes longer to finish is considered better by the user, there are two reasons:</p>
<ul>
<li><strong>Psychological</strong>: Giving the user an early indication of some progress creates some incentive for the user to wait, unlike sending no information at all. Even getting the status bar to say <q><em>receiving from &#8230;</em></q> as opposed to <q><em>sending request to &#8230;</em></q> makes a difference.</li>
<li><strong>Pipeline effect</strong>: An average web page has references to various resources (images, external css files, etc. etc.) that are needed to completely render a page. It turns out that most browsers can initiate the retrieval of those resources before the page loads up completely. Pushing out a partial response early on gives the browsers a chance to get started with other things early on. So while the additional flushes done on the server side might have slowed down the turnaround time for basic page transmission, the overall turnaround time as seen by the user can still drop with this technique.</li>
</ul>
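<p>The pipeline effect can be illustrated with some back of the envelope arithmetic in Python. This is a deliberately crude model: it assumes the browser discovers all referenced resources in the first chunk and fetches them in parallel with the rest of the page, and the 3 second resource fetch time is a made up number. The page timings are the ones from the example above.</p>

```python
def perceived_finish_ms(first_byte_ms, page_done_ms, resource_fetch_ms):
    """Time until both the page and its referenced resources have arrived.

    Assumes the resources are discovered in the first chunk and fetched
    in parallel with the remainder of the page.
    """
    return max(page_done_ms, first_byte_ms + resource_fetch_ms)

# Early flush: first byte at 200 ms, page complete 4 seconds later.
early = perceived_finish_ms(200, 4200, 3000)
# Single shot: first byte at 2 seconds, page complete 500 ms later.
late = perceived_finish_ms(2000, 2500, 3000)
print(early, late)
```

<p>Under these assumptions the page that took longer to transmit is fully usable at 4200 ms, while the page that was sent in one shot is only usable at 5000 ms.</p>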
<h3>Throughput</h3>
<p>Since throughput signifies the total amount of work that gets done in a unit of time, it turns out that throwing more hardware can sort of solve this problem. As I had mentioned earlier, if your solution scales infinitely, then the hardware addition technique works. The reasons why things do not scale infinitely are:</p>
<ul>
<li>There end up being some components that are hard to scale infinitely, such as the top level load balancer and the pipes that it is connected to</li>
<li>Amdahl&#8217;s law</li>
</ul>
<p>One of the most common fixes that is a borderline superstition is to run more threads. In a CPU bound world, having any more threads of execution than the number of compute units slows things down. In a <a href="NUMA">NUMA</a> based world, certain workloads can be detrimental even when the number of threads matches the number of compute units available. However, for workloads that are I/O bound, threads do help as long as the different threads are not contending for the same underlying I/O resource. The one exception is <a href="http://www.cs.jhu.edu/~yairamir/cs418/os8/sld022.htm">rotating storage media</a> where the amortized performance increases as concurrent requests increase but only up to a certain point.</p>
<p>In effect, the reasons for throughput not increasing beyond a certain point, whether by increasing the concurrency level of task execution or by throwing in more hardware, are very real.</p>
<h3>Closing remarks</h3>
<p>We are now in an age where people not only believe that hardware is cheap but also in cloud computing that promises provisioning of infinite hardware (i.e. more than you can afford). The thing to remember is that you might extend the life of a given solution for quite some time by throwing in hardware (at a diminishing rate of return), but if you are chasing response times, you will have to constantly improve your design as opposed to relying on hardware.</p>
]]></content:encoded>
			<wfw:commentRss>http://anomalizer.net/statistically-incorrect/2009/02/throwing-in-more-hardware-is-not-panacea/feed/</wfw:commentRss>
		<slash:comments>179</slash:comments>
		</item>
	</channel>
</rss>

