<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>pwnetics</title>
	<atom:link href="http://pwnetics.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://pwnetics.wordpress.com</link>
	<description>post-gradschool blog offerings to the index</description>
	<lastBuildDate>Sat, 21 Jan 2012 14:16:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='pwnetics.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>pwnetics</title>
		<link>http://pwnetics.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://pwnetics.wordpress.com/osd.xml" title="pwnetics" />
	<atom:link rel='hub' href='http://pwnetics.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Creating a Basic Webpage with html5-boilerplate</title>
		<link>http://pwnetics.wordpress.com/2011/12/30/creating-a-basic-webpage-with-html5-boilerplate/</link>
		<comments>http://pwnetics.wordpress.com/2011/12/30/creating-a-basic-webpage-with-html5-boilerplate/#comments</comments>
		<pubDate>Fri, 30 Dec 2011 18:48:40 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=458</guid>
		<description><![CDATA[Here&#8217;s how I put together a basic personal homepage that is accessible, reads well on mobile devices, and follows current HTML5 and CSS3 best practices. Framework The first step was to survey the current crop of HTML5/CSS/Mobile frameworks. Leading the pack is html5-boilerplate. This is a full-site solution, including things like a .htaccess with smart [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s how I put together a basic <a href="http://pwnetic.com">personal homepage</a> that is accessible, reads well on mobile devices, and follows current HTML5 and CSS3 best practices.</p>
<p><span id="more-458"></span></p>
<h2>Framework</h2>
<p>The first step was to survey the current crop of HTML5/CSS/Mobile frameworks. Leading the pack is <a href="http://html5boilerplate.com/">html5-boilerplate</a>. This is a full-site solution, including things like a <tt>.htaccess</tt> with smart defaults and a build script that automatically optimizes html/css/js/images for the web. It has decent documentation, including several videos which I couldn&#8217;t bring myself to sit through, so I just dove in and started working.</p>
<p>Obtain <a href="http://github.com/h5bp/html5-boilerplate">html5-boilerplate from GitHub</a> and create a branch for your site. The GitHub version is up-to-date and branching makes it easier to merge future updates. My branch is called &#8220;homepage&#8221;.</p>
<pre>git clone https://github.com/h5bp/html5-boilerplate.git
cd html5-boilerplate
git checkout -b homepage</pre>
<h2>Content and Styling</h2>
<p>The main files you&#8217;ll modify for a simple static site are <tt>index.html</tt>, <tt>css/style.css</tt>, all the favicon-y images, and <tt>humans.txt</tt>. For <tt>index.html</tt> be sure to add your information to the title and description tags and modify the Google Analytics id section.</p>
<p>The CSS is based on progressive enhancement: the default CSS is minimal and targeted at a low-resolution, low-bandwidth mobile device. These defaults form the first section of <tt>style.css</tt>; for larger-screen devices (presumably with more bandwidth and processing power), extra styling is introduced later in the <tt>@media</tt> sections.</p>
<p>My site has a top-to-bottom linear layout of all items by default. As the browser window width increases, an image is floated alongside the text and we accept the extra overhead of loading a background image tile. I chose the background from the site <a href="http://subtlepatterns.com">Subtle Patterns</a> and the color palette with the help of <a href="http://colorschemedesigner.com">Color Scheme Designer</a>.</p>
<h2>Publish</h2>
<p>When the site is ready to go, use the ant build script to generate the web-optimized site. This will create a <tt>publish/</tt> directory that is ready to be placed on the server. On Ubuntu, the build required the Oracle JDK, rather than the OpenJDK that comes installed by default, to avoid seeing an error message.</p>
<p>I wanted to manage deployment via git, so that I would have a fast way to roll back changes that didn&#8217;t require rebuilding the site. I created a git repository in the <tt>publish</tt> directory (it is <tt>.gitignore</tt>&#8216;d by the parent project) and pushed it to my remote server. On the server, I cloned the repository directly into my web directory. The html5-boilerplate <tt>.htaccess</tt> prevents access to dotfiles, so the <tt>.git</tt> information is safe. I use the hash from the development repository branch as the commit message in the publish repository.</p>
<p>Unfortunately, by default, the html5-boilerplate build script deletes the <tt>publish</tt> directory, wiping out the git publishing repository. To fix this, I replaced the line in <tt>build.xml</tt>:</p>
<pre>&lt;delete dir="./${dir.publish}/"/&gt;</pre>
<p>with:</p>
<pre>&lt;delete dir="./${dir.publish}/" includeemptydirs="true" defaultexcludes="false"&gt;
&lt;exclude name=".git/**" /&gt;
&lt;/delete&gt;</pre>
<h2>Workflow</h2>
<p>The process for updating my webpage is:</p>
<ol>
<li>Make changes to development branch</li>
<li><tt>ant minify</tt></li>
<li>local: <tt>git add/commit/push</tt></li>
<li>remote: <tt>git pull</tt></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/12/30/creating-a-basic-webpage-with-html5-boilerplate/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>Naive Sort in Prolog</title>
		<link>http://pwnetics.wordpress.com/2011/10/15/naive-sort-in-prolog/</link>
		<comments>http://pwnetics.wordpress.com/2011/10/15/naive-sort-in-prolog/#comments</comments>
		<pubDate>Sat, 15 Oct 2011 05:26:02 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[technical]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[prolog]]></category>
		<category><![CDATA[sorting]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=372</guid>
		<description><![CDATA[The simplest way to code a sort in Prolog is to describe what a sorted list looks like.  A sorted list is a permutation of some given list such that elements that appear earlier in the list are smaller than elements that appear later in the list.  In SWI Prolog: srt(X,Y) :- permutation(X,Y), isSrt(Y). isSrt([]). [...]]]></description>
			<content:encoded><![CDATA[<p>The simplest way to code a sort in Prolog is to describe what a sorted list looks like.  A sorted list is a permutation of some given list such that elements that appear earlier in the list are no larger than elements that appear later in the list.  In SWI Prolog:</p>
<pre style="border:solid 1px;font-size:140%;padding:5px;"><span style="color:#660000;">srt</span><strong>(</strong>X,Y<strong>)</strong> :- <span style="color:#006600;">permutation</span><strong>(</strong>X,Y<strong>)</strong>, <span style="color:#660000;">isSrt</span><strong>(</strong>Y<strong>)</strong>.

<span style="color:#660000;">isSrt</span><strong>(</strong><strong>[</strong><strong>]</strong><strong>)</strong>.
<span style="color:#660000;">isSrt</span><strong>(</strong><strong>[</strong>_<strong>]</strong><strong>)</strong>.
<span style="color:#660000;">isSrt</span><strong>(</strong><strong>[</strong>H,I|J<strong>]</strong><strong>)</strong> :- H=&lt;I, <span style="color:#660000;">isSrt</span><strong>(</strong><strong>[</strong>I|J<strong>]</strong><strong>)</strong>.</pre>
<p>The beauty of this solution is that Prolog does all the dirty work of finding a sorted <tt>Y</tt> for our given <tt>X</tt>. The darker side is that Prolog&#8217;s satisfaction mechanism carries out a brute force search through all possible permutations of the given list. This is dangerous; compare the runtimes of our naive sort and the Prolog built-in sort (which is decidedly less naive):</p>
<pre style="border:solid 1px;font-size:120%;padding:5px;">% <strong>5 seconds vs. something &lt; 0.01 seconds</strong>
<span style="color:#006600;">reverse</span><strong>(</strong><strong>[</strong>1,2,3,4,5,6,7,8,9,10<strong>]</strong>,X<strong>)</strong>, <span style="color:#006600;">profile</span><strong>(</strong><span style="color:#660000;">srt</span><strong>(</strong>X,Y<strong>)</strong><strong>)</strong>.
<span style="color:#006600;">reverse</span><strong>(</strong><strong>[</strong>1,2,3,4,5,6,7,8,9,10<strong>]</strong>,X<strong>)</strong>, <span style="color:#006600;">profile</span><strong>(</strong><span style="color:#006600;">msort</span><strong>(</strong>X,Y<strong>)</strong><strong>)</strong>.</pre>
<p>Not great, and notice how fast the worst case time of O(n!) grows:</p>
<pre style="border:solid 1px;font-size:120%;padding:5px;">% <strong>55 seconds vs. something &lt; 0.01 seconds</strong>
<span style="color:#006600;">reverse</span><strong>(</strong><strong>[</strong>1,2,3,4,5,6,7,8,9,10,11<strong>]</strong>,X<strong>)</strong>, <span style="color:#006600;">profile</span><strong>(</strong><span style="color:#660000;">srt</span><strong>(</strong>X,Y<strong>)</strong><strong>)</strong>.
<span style="color:#006600;">reverse</span><strong>(</strong><strong>[</strong>1,2,3,4,5,6,7,8,9,10,11<strong>]</strong>,X<strong>)</strong>, <span style="color:#006600;">profile</span><strong>(</strong><span style="color:#006600;">msort</span><strong>(</strong>X,Y<strong>)</strong><strong>)</strong>.</pre>
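<p>The factorial growth can also be seen by counting candidates instead of timing them. The following is a minimal Python sketch of the same generate-and-test idea, not a translation of SWI-Prolog&#8217;s search: the names <tt>naive_sort</tt> and <tt>is_sorted</tt> are made up for illustration, and <tt>itertools.permutations</tt> stands in for <tt>permutation/2</tt> without any claim that the two enumerate candidates in the same order.</p>

```python
from itertools import permutations
from math import factorial


def is_sorted(items):
    """True when every element is <= its successor (mirrors isSrt/1)."""
    return all(a <= b for a, b in zip(items, items[1:]))


def naive_sort(items):
    """Generate-and-test sort: return the first sorted permutation found.

    Also returns how many candidate permutations were examined.
    """
    for examined, candidate in enumerate(permutations(items), start=1):
        if is_sorted(candidate):
            return list(candidate), examined
    raise AssertionError("unreachable: some permutation is always sorted")


if __name__ == "__main__":
    # Reversed input is the worst case for this enumeration order:
    # the sorted arrangement is the very last candidate.
    for n in range(2, 9):
        reversed_input = list(range(n, 0, -1))
        _, examined = naive_sort(reversed_input)
        print(n, examined, factorial(n))
```

<p>For a reversed list of distinct values, this sketch examines all n! candidates before it reaches the sorted one, so each extra element multiplies the work by the new list length.</p>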
<p>This is to be expected; if we want to be picky about how the job gets done, we can <a href="http://kti.mff.cuni.cz/~bartak/prolog/sorting.html">be more explicit in our description</a>. So let&#8217;s ignore performance and see how to write a declarative &#8220;permutation&#8221; predicate from scratch.</p>
<p><span id="more-372"></span></p>
<h2>Ugly Permutations</h2>
<p><span style="color:#888888;">Skip this section unless you&#8217;re interested in how a simple idea is translated into a complex and unattractive Prolog solution.</span></p>
<p>One descriptive definition of a permutation of a list: each item in the given list is in the permutation and each item in the permutation is in the list. </p>
<p>I started by trying to define a <tt>membr/2</tt> predicate such that it would succeed when all items in the first list are in the second list. Attempts at extending this into something that would produce permutations caused infinite loops and stack overflows until I settled on this definition:</p>
<pre style="border:solid 1px;font-size:140%;padding:5px;"><span style="color:#660000;">membr</span><strong>(</strong><strong>[</strong>H|T<strong>]</strong>,Y<strong>)</strong> :-
    <span style="color:#006600;">length</span><strong>(</strong>Y,L<strong>)</strong>, <span style="color:#006600;">length</span><strong>(</strong>T,M<strong>)</strong>, M&lt;L,
    <span style="color:#006600;">member</span><strong>(</strong>H,Y<strong>)</strong>, <span style="color:#660000;">membr</span><strong>(</strong>T,Y<strong>)</strong>.
<span style="color:#660000;">membr</span><strong>(</strong><strong>[</strong><strong>]</strong>,Y<strong>)</strong> :- <span style="color:#660000;">allNonVar</span><strong>(</strong>Y<strong>)</strong>. 

<span style="color:#660000;">allNonVar</span><strong>(</strong><strong>[</strong><strong>]</strong><strong>)</strong>.
<span style="color:#660000;">allNonVar</span><strong>(</strong><strong>[</strong>H|T<strong>]</strong><strong>)</strong> :- <span style="color:#006600;">nonvar</span><strong>(</strong>H<strong>)</strong>, <span style="color:#660000;">allNonVar</span><strong>(</strong>T<strong>)</strong>.

<span style="color:#660000;">srt</span><strong>(</strong>X,Y<strong>)</strong> :- <span style="color:#660000;">membr</span><strong>(</strong>X,Y<strong>)</strong>, <span style="color:#660000;">isSrt</span><strong>(</strong>Y<strong>)</strong>, !.</pre>
<p>Where <tt>member/2</tt> is a built-in predicate with a straightforward Prolog implementation.</p>
<p>The length goals ensure that <tt>Y</tt> is at least as long as <tt>X</tt>; without them, <tt>len(Y) &lt; len(X)</tt> would be possible when <tt>X</tt> contains duplicates. The predicate <tt>allNonVar/1</tt> prevents <tt>Y</tt> from having extra, un-unified anonymous variables. Such variables can be left unbound when <tt>Y</tt> is forced to be a certain length but the duplicates in <tt>X</tt> all match the first occurrence of the duplicated value in <tt>Y</tt>. They can also be left unbound when the <tt>member/2</tt> goal posits extra anonymous variables; <tt>membr(X,[Y|_junk1,_junk2])</tt> would otherwise be true.</p>
<p>Finally, the cut at the end of <tt>srt/2</tt> prevents the consideration of equivalent permutations of the input list when it contains duplicates.</p>
<h2>Pretty Permutations</h2>
<p>The implementation of a simple definition of permutation ended up looking rather ugly because it had to fight the Prolog search mechanism.</p>
<p>A prettier implementation of a permutation predicate is given in Clocksin and Mellish, <em>Programming in Prolog</em>, Springer, 2003:</p>
<pre style="border:solid 1px;font-size:140%;padding:5px;"><span style="color:#660000;">perm</span><strong>(</strong><strong>[</strong><strong>]</strong>,<strong>[</strong><strong>]</strong><strong>)</strong>.
<span style="color:#660000;">perm</span><strong>(</strong>L,<strong>[</strong>H|T<strong>]</strong><strong>)</strong> :-
    <span style="color:#006600;">append</span><strong>(</strong>V,<strong>[</strong>H|U<strong>]</strong>,L<strong>)</strong>,
    <span style="color:#006600;">append</span><strong>(</strong>V,U,W<strong>)</strong>,
    <span style="color:#660000;">perm</span><strong>(</strong>W,T<strong>)</strong>.</pre>
<p>This states that a permutation of L is something that has as its head an element H drawn from L, and which is followed by something that is a permutation of the remaining elements of L.  Most of the permutation definitions on the web are built like this one, but I like its use of <tt>append/3</tt>.</p>
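<p>The same reading can be sketched as a Python generator. This is only an illustration under my own naming (the function <tt>perm</tt> and its local variables are hypothetical, chosen to echo the clause above); list slicing plays the role of the two <tt>append/3</tt> goals.</p>

```python
def perm(items):
    """Yield every permutation of items, mirroring the two perm/2 clauses."""
    if not items:
        # perm([], []).
        yield []
        return
    for i, head in enumerate(items):
        # append(V, [H|U], L): split the list around the chosen head H.
        before, after = items[:i], items[i + 1:]
        # append(V, U, W): the remaining elements, in their original order.
        rest = before + after
        # perm(W, T): the tail is any permutation of what is left.
        for tail in perm(rest):
            yield [head] + tail


if __name__ == "__main__":
    for p in perm([1, 2, 3]):
        print(p)
```

<p>As with the Prolog version, duplicate input values yield equivalent permutations more than once, which is why the earlier <tt>srt/2</tt> needed a cut.</p>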
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/10/15/naive-sort-in-prolog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>An Empirical Analysis of QuickSelect and QuickMedian</title>
		<link>http://pwnetics.wordpress.com/2011/09/28/an-empirical-analysis-of-quickselect-and-quickmedian/</link>
		<comments>http://pwnetics.wordpress.com/2011/09/28/an-empirical-analysis-of-quickselect-and-quickmedian/#comments</comments>
		<pubDate>Wed, 28 Sep 2011 05:32:54 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[technical]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[java]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=351</guid>
		<description><![CDATA[The QuickSelect algorithm is an efficient way to find the N-th largest element in a list. The QuickMedian algorithm uses QuickSelect to efficiently find the middle element(s) of the array. This blog post sketches how QuickSelect and QuickMedian work in general and in my Java implementation. The bulk of the post discusses the plot of [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Selection_algorithm#Partition-based_general_selection_algorithm">QuickSelect</a> algorithm is an efficient way to find the N-th largest element in a list. The QuickMedian algorithm uses QuickSelect to efficiently find the middle element(s) of the array.</p>
<p>This blog post sketches how QuickSelect and QuickMedian work in general and in <a href="https://github.com/romanows/QuickSelect">my Java implementation</a>. The bulk of the post discusses the plot of QuickMedian vs. SortMedian&#8217;s runtime shown below (click the image for more details).</p>
<p><a href="http://github.com/romanows/QuickSelect/raw/master/doc/QuickSelectEval_large1.png"><img class="aligncenter" title="Runtime of QuickMedian and SortMedian Trials" src="https://github.com/romanows/QuickSelect/raw/master/doc/QuickSelectEval_preview1.png" alt="Runtime of QuickMedian and SortMedian Trials, sorted by trial conditions to show regularities" width="577" height="341" /></a></p>
<h2><span id="more-351"></span>QuickSelect Efficiency</h2>
<p>QuickSelect uses the partitioning step of <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Quicksort">QuickSort</a> to divide an array into two sections around a pivot.  All elements smaller than the pivot move before the pivot and all elements larger than the pivot move after the pivot.  A new pivot is chosen <em>on the side of the partition that will contain the N-th element</em>, and the partitioning step is run again.  It turns out that the number of elements visited on successive iterations of partitioning decreases so quickly in the average case that the overall runtime is expected <img src='http://s0.wp.com/latex.php?latex=O%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O(n)' title='O(n)' class='latex' />, although worst case is <img src='http://s0.wp.com/latex.php?latex=O%28n%5E2%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O(n^2)' title='O(n^2)' class='latex' />.</p>
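<p>The loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the Java implementation linked earlier: the function name <code>quickselect</code>, the random pivot, and the Lomuto-style partition are all my assumptions here, and the sketch selects the k-th <em>smallest</em> element (counting from zero) to keep the indexing simple.</p>

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (k = 0 is the minimum).

    Works on a copy, so the caller's list is left untouched.
    """
    a = list(values)
    if not 0 <= k < len(a):
        raise IndexError("k out of range")
    lo, hi = 0, len(a) - 1
    while lo < hi:
        # Move a randomly chosen pivot to the end, then partition (Lomuto).
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        # The pivot now sits at its final sorted position, `store`.
        if k == store:
            break
        if k < store:
            hi = store - 1  # keep only the side that contains index k
        else:
            lo = store + 1
    return a[k]


if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 7, 3]
    print([quickselect(data, k) for k in range(len(data))])
```

<p>Only one side of each partition is visited again, which is where the expected linear runtime comes from; a consistently bad pivot shrinks the range by just one element per pass and gives the quadratic worst case.</p>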
<h2>QuickMedian Implementation</h2>
<p>QuickMedian was originally intended to be a demonstration of the QuickSelect algorithm.  It turns out to be surprisingly tricky to write a robust median function!  Much of the trouble comes from handling even-length arrays in the case of large or infinite values.  When an array is of even length and the middle two values have large magnitudes, we need an averaging formula that avoids overflow.  When the middle two elements are infinity, in some cases we must return <code>NaN</code>.</p>
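<p>One way to average the middle two values without overflow is to halve each value before adding. The Python sketch below shows that idea only; <code>safe_midpoint</code> is a hypothetical helper, and I am not claiming it is the formula used in the Java code.</p>

```python
import math


def safe_midpoint(a, b):
    """Average two floats without overflowing for large finite inputs."""
    if a == b:
        # Covers equal infinities: the midpoint of inf and inf is inf.
        return a
    # Halve first, then add: the sum of the halves cannot exceed the
    # larger magnitude, so finite inputs give a finite result.
    # Opposite infinities still produce NaN, since no midpoint exists.
    # (Halving can lose the last bit of subnormal inputs.)
    return a / 2 + b / 2


if __name__ == "__main__":
    big_a, big_b = 1.6e308, 1.7e308
    print((big_a + big_b) / 2)          # the naive formula overflows
    print(safe_midpoint(big_a, big_b))  # stays finite
    print(safe_midpoint(math.inf, -math.inf))
```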
<p>Presenting arrays containing <code>NaN</code> to QuickSelect produces undefined behavior, which seems like the correct behavior.  What seems like incorrect behavior is present in <a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html">Numpy</a>, where <code>NaN</code> sorts before other values.  Here, the median value of a list can be decreased by inserting things which are not even numbers! Incorrect behavior may also result from a median calculation that uses Python&#8217;s native list <code>sort()</code> method.  Here, <code>NaN</code> is not ordered with respect to other numbers, so the median of an array can change from call to call!</p>
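<p>The Python case is easy to reproduce. In this small sketch (the helper <code>sort_median</code> is hypothetical and deliberately naive), two lists holding the same values give different medians purely because of where the <code>NaN</code> starts out.</p>

```python
import math


def sort_median(values):
    """Median via sorting, with no special handling for NaN."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


nan = math.nan
# Every comparison against NaN is False, so sorted() treats both of
# these lists as already in order and returns them unchanged.
front = [nan, 1.0, 2.0]
back = [1.0, 2.0, nan]

if __name__ == "__main__":
    print(sort_median(front), sort_median(back))
```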
<p>The final problem faced by QuickMedian was one of inefficiency. Even-length arrays require two calls to QuickSelect to find the middle two elements. This is inefficient if implemented naively, because the two calls operate independently over the full array.  It should not be the case that runtimes vary greatly between arrays of length 1,000,000 and 1,000,001. To improve performance, QuickSelect was refactored to return the array indexes defining the smallest partition found that contains the N-th element. This partition is usually very small, and so the second QuickSelect call over this sub-array has negligible cost.</p>
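<p>A rough Python sketch of the idea follows. The names <code>select_in_range</code> and <code>quick_median</code>, the middle-pivot choice, and the fallback used when the narrowed range does not cover the second index are my assumptions for illustration; the Java code may differ in all of these details.</p>

```python
import random
import statistics


def select_in_range(a, k, lo, hi):
    """Partially sort a[lo..hi] in place so that a[k] holds its sorted value.

    Returns (lo, hi): the last, smallest sub-range examined that contains k.
    Everything left of that range is <= its elements; everything right is >=.
    """
    while lo < hi:
        mid = (lo + hi) // 2  # "middle" pivot choice
        a[mid], a[hi] = a[hi], a[mid]
        pivot, store = a[hi], lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        if k == store:
            return lo, hi
        if k < store:
            hi = store - 1
        else:
            lo = store + 1
    return lo, hi


def quick_median(values):
    """Median of a non-empty sequence using at most two selections."""
    a = list(values)
    n = len(a)
    if n == 0:
        raise ValueError("median of an empty sequence")
    upper = n // 2
    lo, _hi = select_in_range(a, upper, 0, n - 1)
    if n % 2:
        return a[upper]
    lower = upper - 1
    # Reuse the narrowed range when it still covers the lower middle index;
    # otherwise fall back to everything left of the upper middle.
    start = lo if lo <= lower else 0
    select_in_range(a, lower, start, lower)
    return (a[lower] + a[upper]) / 2


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.randint(0, 99) for _ in range(10)]
    print(quick_median(data), statistics.median(data))
```

<p>The second call only ever looks at elements between the start of the narrowed range and the lower middle index, so it never repeats the full-array pass of the first call unless the fallback is taken.</p>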
<h2>QuickMedian versus SortMedian</h2>
<p>I wanted to examine the performance of QuickMedian, prove that it worked better than the sorting median method, and analyze the effects of different pivot picking methods.  The wall clock time was accumulated over the course of several randomized trials, divided and sorted carefully, then plotted.</p>
<h3>Method</h3>
<p>The evaluation consisted of 100 repetitions of trials in which several conditions were exhaustively varied.  In each trial, the median methods processed a total of 1e6 elements, which were divided up into arrays of constant length.  This number of elements was used because processing just one array of length 1000 was too quick to measure using the wall clock system time on my machine.  The conditions which were varied across trials were:</p>
<ul>
<li>Length of arrays into which the 1e6 elements were divided</li>
<li>Even or odd-length arrays</li>
<li>Sorting of the arrays: random, sorted, reverse-sorted, mostly-sorted, mostly-reverse-sorted</li>
<li>QuickSelect pivot-picking method: middle, random, median-of-three, median-of-randomized-three</li>
<li>Whether the arrays contained duplicates</li>
</ul>
<div>Array element values were consecutive integers; each integer was repeated under the &#8220;duplicate&#8221; condition.  The mostly-sorted arrays were sorted but for the very middle 10%.  Trials were completely randomized in a paranoid effort to eliminate JVM runtime optimizations.</div>
<h3>Analysis</h3>
<p>The runtimes were divided into series and sorted in such a way as to show interesting systemic effects.</p>
<p>Sorting by array length addressed the main question of computational complexity and runtime speed.  The QuickMedian runtimes remain roughly constant, while the SortMedian runtimes increase.  Remember that the <img src='http://s0.wp.com/latex.php?latex=N%3D1e6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N=1e6' title='N=1e6' class='latex' /> elements in each trial are divided into <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> arrays each of length <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=n+%3D+N%2Fk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n = N/k' title='n = N/k' class='latex' />.  QuickMedian has an expected runtime of <img src='http://s0.wp.com/latex.php?latex=k+%2A+O%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k * O(n)' title='k * O(n)' class='latex' />, which is <img src='http://s0.wp.com/latex.php?latex=O%28N%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O(N)' title='O(N)' class='latex' /> and thus expected to be constant across trials.  SortMedian has an expected runtime of <img src='http://s0.wp.com/latex.php?latex=k+%2A+O%28n+%5Clog+n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k * O(n &#92;log n)' title='k * O(n &#92;log n)' class='latex' />, which is <img src='http://s0.wp.com/latex.php?latex=O%28N+%5Clog+n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O(N &#92;log n)' title='O(N &#92;log n)' class='latex' /> and therefore expected to increase as the array length increases!  The array lengths grow exponentially (something like 1e2, 1e3, 1e4, and 1e5), so the SortMedian runtimes increase linearly from one array length to the next.  This (and the fact that QuickMedian&#8217;s runtimes lie well below most of the SortMedian runtimes) is perhaps the central finding of the evaluation.</p>
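<p>The arithmetic behind this can be checked with an idealized cost model. The Python sketch below ignores constant factors and uses a hypothetical helper, <code>model_costs</code>, so only the trend is meaningful: the linear-time total stays at N, while the sorting total grows by the same fixed step each time the array length is multiplied by ten.</p>

```python
import math

N = 10**6  # total elements per trial, as in the post


def model_costs(n):
    """Idealized operation counts for k = N / n arrays of length n.

    Constant factors are ignored, so only the trend is meaningful.
    """
    k = N // n
    quick = k * n                # k * O(n)       = O(N)
    sort = k * n * math.log2(n)  # k * O(n log n) = O(N log n)
    return quick, sort


if __name__ == "__main__":
    for n in (10**2, 10**3, 10**4, 10**5):
        quick, sort = model_costs(n)
        print(f"n={n:>6}  quick={quick:.2e}  sort={sort:.2e}")
```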
<p>Then, splitting by the most interesting intra-algorithm condition showed further systemic effects.  The runtimes from the QuickMedian algorithm were divided into series based upon the pivot picking method used, while the SortMedian runtimes were divided by how the array was initially sorted.</p>
<p>The SortMedian results show that my platform&#8217;s Java sorting algorithm works fastest on reverse-sorted arrays.  Interestingly, the ripples in these series are due to the presence or absence of duplicates.</p>
<p>The QuickMedian results confirm that the fastest runtime is given by the deterministic pivot picking methods on sorted arrays, where the middle element is almost immediately found.  I was surprised to see that the randomized median-of-three pivot picking method never outperforms the deterministic median-of-three pivot picking method.  However, it makes sense because median-of-three killer sequences are probably unlikely.  I haven&#8217;t looked into this behavior yet, but I am assuming that the randomized median-of-three pivot picking method is more resistant to naturally-occurring killer sequences, so it will remain the default pivot picking method in QuickMedian.</p>
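<p>For reference, the four pivot picking methods can be sketched as follows. This Python sketch rests on assumptions that the post does not spell out: the deterministic median-of-three is taken to sample the first, middle, and last positions, the randomized variant is taken to sample three random positions, and the name <code>pick_pivot</code> is hypothetical.</p>

```python
import random


def median_of_three_index(a, i, j, k):
    """Index (among i, j, k) of the median of a[i], a[j], a[k]."""
    return sorted((i, j, k), key=lambda idx: a[idx])[1]


def pick_pivot(a, lo, hi, method, rng=random):
    """Return a pivot index in [lo, hi] using the named method."""
    if method == "middle":
        return (lo + hi) // 2
    if method == "random":
        return rng.randint(lo, hi)
    if method == "median-of-three":
        # Assumed sample positions: first, middle, last.
        return median_of_three_index(a, lo, (lo + hi) // 2, hi)
    if method == "median-of-randomized-three":
        # Assumed sample positions: three independent random indexes.
        i, j, k = (rng.randint(lo, hi) for _ in range(3))
        return median_of_three_index(a, i, j, k)
    raise ValueError(f"unknown pivot method: {method}")


if __name__ == "__main__":
    already_sorted = list(range(11))
    print(pick_pivot(already_sorted, 0, 10, "median-of-three"))
```

<p>On an already sorted array, both deterministic methods in this sketch pick the middle index on the first pass, which is consistent with the fast runtimes seen for sorted input.</p>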
<h2>Conclusion</h2>
<div id="LC54">In these trials, QuickMedian was clearly the faster way to find the median, and it should be preferred particularly when input arrays have length 100 or more.  If throughput is important, you are operating on safe and mostly-sorted data, and the possibility of a worst-case O(n^2) runtime won&#8217;t dissuade you, consider switching to the non-randomized median-of-three pivot picking method when constructing the QuickMedian object.</div>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/09/28/an-empirical-analysis-of-quickselect-and-quickmedian/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>

		<media:content url="https://github.com/romanows/QuickSelect/raw/master/doc/QuickSelectEval_preview1.png" medium="image">
			<media:title type="html">Runtime of QuickMedian and SortMedian Trials</media:title>
		</media:content>
	</item>
		<item>
		<title>An Epoch Poem</title>
		<link>http://pwnetics.wordpress.com/2011/09/21/an-epoch-poem/</link>
		<comments>http://pwnetics.wordpress.com/2011/09/21/an-epoch-poem/#comments</comments>
		<pubDate>Wed, 21 Sep 2011 21:49:45 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[fun]]></category>
		<category><![CDATA[genetic algorithm]]></category>
		<category><![CDATA[poetry]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=336</guid>
		<description><![CDATA[A folk explanation of genetic algorithms for my evolutionary computing class, to be recited with creative pronunciation: A farmer was planting to reap, Wished to profit the max on the cheap. Trade-off soil, spacing, and blight, Planting time, market value, and light, But: too-complex his goals were to meet. &#8220;Perhaps I can ask my dear [...]]]></description>
			<content:encoded><![CDATA[<p>A folk explanation of genetic algorithms for my evolutionary computing class, to be recited with creative pronunciation:</p>
<hr />
<p>
A farmer was planting to reap,<br />
Wished to profit the max on the cheap.<br />
Trade-off soil, spacing, and blight,<br />
Planting time, market value, and light,<br />
But: too-complex his goals were to meet.
</p>
<p><span id="more-336"></span></p>
<p>
&#8220;Perhaps I can ask my dear friends,<br />
To plant blindly what I recommends.<br />
Select crops and place them by chance,<br />
Grow, harvest, and then sell those plants.<br />
The free market enforces the trends!
</p>
<p>
Some will make it and some will not,<br />
Dupe success in the fallow lot,<br />
But to all I’ll suggest<br />
To mingle, and for zest,<br />
To randomly swap partial plots.
</p>
<p>
Farmhands are a hard working breed.<br />
They plant and they tend and they weed.<br />
While it may be rarer,<br />
A job done in error,<br />
Begats a season novelty’d.
</p>
<p>
Perhaps not me nor my son,<br />
Nor his daughter nor boss ADM,<br />
But eventually one day,<br />
The min work for max pay<br />
Will pervade, then we’re finally done!&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/09/21/an-epoch-poem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>Bash one-liner using sox to batch convert the sampling frequency of audio files</title>
		<link>http://pwnetics.wordpress.com/2011/07/12/bash-one-liner-using-sox-to-batch-convert-the-sampling-frequency-of-audio-files/</link>
		<comments>http://pwnetics.wordpress.com/2011/07/12/bash-one-liner-using-sox-to-batch-convert-the-sampling-frequency-of-audio-files/#comments</comments>
		<pubDate>Tue, 12 Jul 2011 19:41:17 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[speech]]></category>
		<category><![CDATA[technical]]></category>
		<category><![CDATA[audio]]></category>
		<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=314</guid>
		<description><![CDATA[A bash one-liner to batch convert the sampling rate of WAV files using the SoX tool.  The example will resample *.wav files in the current directory to 8000Hz and place the output in an existing subdirectory called 8000Hz. The one-liner below is overkill for this task, but the extra arguments provide a starting point for [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pwnetics.wordpress.com&amp;blog=9228334&amp;post=314&amp;subd=pwnetics&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A bash one-liner to batch convert the sampling rate of WAV files using the <a href="http://sox.sourceforge.net/">SoX</a> tool.  The example will resample <tt>*.wav</tt> files in the current directory to 8000Hz and place the output in an existing subdirectory called <tt>8000Hz</tt>.</p>
<p>The one-liner below is overkill for this task, but the extra arguments provide a starting point for modification for related tasks.  The use of <code><a href="http://en.wikipedia.org/wiki/Find">find</a></code>/<code><a href="http://en.wikipedia.org/wiki/Xargs">xargs</a></code> should help the one-liner deal with very large numbers of audio files and filenames that contain whitespace.</p>
<h3>Code</h3>
<hr />
<pre><strong>find . -maxdepth 1 -name '*.wav' -type f -print0 | xargs -0 -t -r -I {} sox {} -r 8000 8000Hz/{}</strong></pre>
<hr />
<p><span id="more-314"></span></p>
<h3>Explanation</h3>
<p>An explanation of commands and flags from left-to-right:</p>
<dl>
<dt><span style="color:darkgreen;"><code>find .</code></span></dt>
<dd>searches the current directory subtree</dd>
<dt><span style="color:darkgreen;"><code>-maxdepth 1</code></span></dt>
<dd>excludes subdirectories</dd>
<dt><span style="color:darkgreen;"><code>-name '*.wav' -type f</code></span></dt>
<dd>outputs regular files whose names match the glob <code>*.wav</code></dd>
<dt><span style="color:darkgreen;"><code>-print0</code></span></dt>
<dd>first step in handling filenames with spaces: terminates each filename with a NUL character instead of a newline</dd>
<dt><span style="color:darkgreen;"><code>xargs</code></span></dt>
<dd>executes a command given the output of <code>find</code></dd>
<dt><span style="color:darkgreen;"><code>-0</code></span></dt>
<dd>second step in handling filenames with spaces: tells <code>xargs</code> to split its input on NUL characters rather than on whitespace</dd>
<dt><span style="color:darkgreen;"><code>-t</code></span></dt>
<dd>prints on stderr the command that is to be executed</dd>
<dt><span style="color:darkgreen;"><code>-r</code></span></dt>
<dd>runs the SoX command only if <code>find</code> generates output (a GNU <code>xargs</code> extension)</dd>
<dt><span style="color:darkgreen;"><code>-I {}</code></span></dt>
<dd>specifies that the pattern &#8220;{}&#8221; is replaced by each filename output by <code>find</code></dd>
<dt><span style="color:darkgreen;"><code>sox {} -r 8000 8000Hz/{}</code></span></dt>
<dd>is the resampling command</dd>
</dl>
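<p>As a cautious way to try the one-liner before converting anything, the sketch below swaps <code>sox</code> for <code>echo sox</code> so that the commands are only printed.  The scratch directory, the sample filenames, and the <code>sort -z</code> step (a GNU option, added only to make the output order predictable) are assumptions for illustration and are not part of the original one-liner.</p>

```shell
# Dry run: print the sox commands that would be executed, without running sox.
# The scratch directory and its filenames are made up for this demonstration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/8000Hz"
touch "$demo_dir/plain.wav" "$demo_dir/with space.wav" "$demo_dir/notes.txt"

dry_run_output=$(
  cd "$demo_dir" &&
  find . -maxdepth 1 -name '*.wav' -type f -print0 |
    sort -z |
    xargs -0 -r -I {} echo sox {} -r 8000 8000Hz/{}
)
printf '%s\n' "$dry_run_output"
rm -rf "$demo_dir"
```

<p>Only the two <tt>*.wav</tt> files should appear in the printed commands, including the one whose name contains a space.  Once the printed commands look right, removing <code>echo</code> restores the real conversion.</p>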
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/07/12/bash-one-liner-using-sox-to-batch-convert-the-sampling-frequency-of-audio-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>Sphinx 4 Acoustic Model Adaptation</title>
		<link>http://pwnetics.wordpress.com/2011/07/01/sphinx-4-language-model-adaptation/</link>
		<comments>http://pwnetics.wordpress.com/2011/07/01/sphinx-4-language-model-adaptation/#comments</comments>
		<pubDate>Fri, 01 Jul 2011 22:43:59 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[speech]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[sphinx]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=283</guid>
		<description><![CDATA[This is a writeup of the steps I took to perform acoustic model adaptation for an acoustic model to be used in Sphinx 4.  I followed the well-written CMU howto.  I performed all steps on a mostly-new Ubuntu 11.04 install and adapted the Communicator acoustic model for use in Sphinx 4.  Keep an eye out [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pwnetics.wordpress.com&amp;blog=9228334&amp;post=283&amp;subd=pwnetics&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This is a writeup of the steps I took to adapt an acoustic model for use in Sphinx 4.  I followed the well-written <a href="http://cmusphinx.sourceforge.net/wiki/tutorialadapt">CMU howto</a>.  I performed all steps on a mostly-new Ubuntu 11.04 install and adapted the <a href="http://www.speech.cs.cmu.edu/sphinx/models/">Communicator acoustic model</a>.  Keep an eye out for paths that may be different on your system and any error messages that pop up when running these commands.</p>
<p>I also generated a <a href="https://github.com/romanows/Sphinx-4-Acoustic-Model-Adaptation-Data">new, full set of adaptation prompt data</a> from the CMU ARCTIC prompts.</p>
<h2><span id="more-283"></span>Build SphinxBase</h2>
<p>First download and build SphinxBase.  Since I had a relatively new Ubuntu install, I had to <code>sudo apt-get install autoconf libtool bison</code>.</p>
<pre>svn co https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/sphinxbase
cd sphinxbase
./autogen.sh
./configure
make
make check
sudo make install
sudo ldconfig -v</pre>
<p>Thank you to <a href="//mnemonicplace.blogspot.com/2010/06/cmu-sphinx-error-wave2feat-error-while.html">Mnemonic Place</a> for figuring out the last step (with <code>ldconfig</code>).  Without this, you&#8217;ll see errors when trying to run the <code>sphinx_fe</code> command, later.</p>
<h2>Build SphinxTrain</h2>
<pre>svn co https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/SphinxTrain
cd SphinxTrain
./configure
make</pre>
<h2>Record Adaptation Training Data</h2>
<p>Record/obtain the 20 Arctic WAV files that are used in the CMU howto. Because the Communicator acoustic model is 8kHz, I recorded the WAVs as 8kHz, 16-bit in Audacity and used the &#8220;multiple export&#8221; function to batch save. I <code>rename</code>&#8216;d them to match the format that is compatible with their howto, e.g., &#8220;arctic_0001.wav&#8221;.  (Not quite as much fun as the Dave Barry passage that you get to read during Dragon&#8217;s adaptation process :] )</p>
<h2>Miscellaneous File Downloading/Arranging</h2>
<p>Download the four howto training files from the <a href="http://cmusphinx.sourceforge.net/wiki/tutorialadapt">CMU wiki page</a>.  One of the files is listed as &#8220;arctic20.fileids&#8221; but saves as &#8220;arctic20.listoffiles&#8221;; I just renamed it to &#8220;arctic20.fileids&#8221; so as to not cause confusion with later scripts and instructions.  If you need more than 20 prompts for adaptation, you may like to draw them from the <a href="https://github.com/romanows/Sphinx-4-Acoustic-Model-Adaptation-Data">full set of CMU ARCTIC prompts I generated</a>.</p>
<p>Now move the four training files, your 20 arctic WAV files, and the acoustic model you&#8217;re going to use into one directory.  My acoustic model is in a subdirectory named <code>Communicator_40.cd_cont_4000/</code>.  In my case, that one directory is a sibling of the sphinxbase and SphinxTrain directories.</p>
<h2>Generate Features</h2>
<pre>sphinx_fe -argfile Communicator_40.cd_cont_4000/feat.params \
   -samprate 8000 -c arctic20.fileids -di . -do . \
   -ei wav -eo mfc -mswav yes</pre>
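<p>If <code>sphinx_fe</code> complains about missing input, it may help to confirm that every entry in the control file has a matching WAV.  The sketch below assumes one utterance ID per line in the control file, without the <tt>.wav</tt> extension (e.g., &#8220;arctic_0001&#8221;); <code>missing_wavs</code> is a hypothetical helper and the scratch files are made up for the demonstration.</p>

```shell
# Hypothetical helper: print each ID in a control file that has no matching WAV.
# Assumes one whitespace-free utterance ID per line, without the .wav extension.
missing_wavs() {
  ctl_file=$1
  wav_dir=$2
  for id in $(cat "$ctl_file"); do
    [ -f "$wav_dir/$id.wav" ] || echo "$id"
  done
}

# Demonstration on a scratch directory with one recording deliberately absent.
fe_demo=$(mktemp -d)
printf '%s\n' arctic_0001 arctic_0002 arctic_0003 > "$fe_demo/arctic20.fileids"
touch "$fe_demo/arctic_0001.wav" "$fe_demo/arctic_0002.wav"
missing_report=$(missing_wavs "$fe_demo/arctic20.fileids" "$fe_demo")
echo "missing: $missing_report"
rm -rf "$fe_demo"
```

<p>In the working directory described above, the equivalent call would be <code>missing_wavs arctic20.fileids .</code>; no output means every listed recording is present.</p>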
<h2>Gather Statistics</h2>
<p>First copy over binaries from SphinxTrain:</p>
<pre>cp ../SphinxTrain/bin.i686-pc-linux-gnu/bw .
cp ../SphinxTrain/bin.i686-pc-linux-gnu/map_adapt .
cp ../SphinxTrain/bin.i686-pc-linux-gnu/mk_s2sendump .</pre>
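<p>The <tt>bin.i686-pc-linux-gnu</tt> directory name reflects my machine and will probably differ on other platforms.  As a sketch, assuming SphinxTrain places its binaries in some <tt>bin.*</tt> directory at the top of the checkout, a small hypothetical helper (<code>find_sphinxtrain_bin</code>, not part of SphinxTrain) can locate it; the fake checkout below exists only for the demonstration.</p>

```shell
# Hypothetical helper: print the first bin.* directory under a SphinxTrain
# checkout that contains an executable bw; return non-zero if none is found.
find_sphinxtrain_bin() {
  for candidate in "$1"/bin.*; do
    if [ -x "$candidate/bw" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Demonstration against a fake checkout; the platform suffix is made up.
train_demo=$(mktemp -d)
mkdir -p "$train_demo/SphinxTrain/bin.example-platform"
printf '#!/bin/sh\n' > "$train_demo/SphinxTrain/bin.example-platform/bw"
chmod +x "$train_demo/SphinxTrain/bin.example-platform/bw"
found_bin=$(find_sphinxtrain_bin "$train_demo/SphinxTrain")
echo "binaries in: ${found_bin#"$train_demo"/}"
rm -rf "$train_demo"
```

<p>With a real checkout, something like <code>bin_dir=$(find_sphinxtrain_bin ../SphinxTrain) &amp;&amp; cp "$bin_dir/bw" "$bin_dir/map_adapt" "$bin_dir/mk_s2sendump" .</code> would replace the three hard-coded paths.</p>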
<p>Find a copy of the &#8220;fillerdict&#8221; you use and copy it to the acoustic model directory, renaming it to the filename &#8220;noisedict&#8221;.</p>
<p>Run the Baum-Welch program:</p>
<pre>./bw \
   -hmmdir Communicator_40.cd_cont_4000 \
   -moddeffn Communicator_40.cd_cont_4000/mdef \
   -ts2cbfn .cont. \
   -feat 1s_c_d_dd \
   -cmn current \
   -agc none \
   -dictfn arctic20.dic \
   -ctlfn arctic20.fileids \
   -lsnfn arctic20.transcription \
   -accumdir .</pre>
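<p>Since <code>bw</code> is given both a control file and a transcription file, a mismatch between the two is an easy mistake to make.  The sketch below is one way to check them against each other, assuming each transcription line ends with its utterance ID in parentheses, e.g., &#8220;(arctic_0001)&#8221;, and that the control file lists the same IDs in the same order.  Both helper names and the sample prompts are hypothetical.</p>

```shell
# Print the utterance ID taken from the trailing "(id)" on each transcript line.
# Assumes that format; lines without a trailing "(id)" are silently skipped.
transcript_ids() {
  sed -n 's/.*(\([^()]*\))[[:space:]]*$/\1/p' "$1"
}

# Succeed only if the transcript IDs equal the control-file IDs, in order.
ids_match() {
  [ "$(transcript_ids "$1")" = "$(cat "$2")" ]
}

# Demonstration with two made-up utterances.
bw_demo=$(mktemp -d)
printf '%s\n' 'first made up prompt (arctic_0001)' \
              'second made up prompt (arctic_0002)' > "$bw_demo/demo.transcription"
printf '%s\n' arctic_0001 arctic_0002 > "$bw_demo/good.fileids"
printf '%s\n' arctic_0002 arctic_0001 > "$bw_demo/swapped.fileids"
if ids_match "$bw_demo/demo.transcription" "$bw_demo/good.fileids"; then
  good_result=match
else
  good_result=mismatch
fi
if ids_match "$bw_demo/demo.transcription" "$bw_demo/swapped.fileids"; then
  swapped_result=match
else
  swapped_result=mismatch
fi
echo "good: $good_result, swapped: $swapped_result"
rm -rf "$bw_demo"
```

<p>For the files in this post the check would be <code>ids_match arctic20.transcription arctic20.fileids</code>, run before <code>bw</code>.</p>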
<h2>Perform MLLR training</h2>
<p>Note that MLLR isn&#8217;t supported in Sphinx 4, so skip this step if Sphinx 4 is your target.  &#8220;The Hieroglyph&#8221; document makes a point that you can iteratively build the mllr_matrix by running <code>bw</code>, then <code>mllr_solve</code>, then <code>bw</code>, then <code>mllr_solve</code>, etc.  If you&#8217;re using a system that can take advantage of the mllr_matrix, you can run <code>bw</code> after you have your final mllr_matrix to generate MLLR-adapted statistics for the MAP adaptation described later.</p>
<pre>cp ../SphinxTrain/bin.i686-pc-linux-gnu/mllr_solve .
./mllr_solve \
   -meanfn Communicator_40.cd_cont_4000/means \
   -varfn Communicator_40.cd_cont_4000/variances \
   -outmllrfn mllr_matrix -accumdir .</pre>
<p>When the mllr_matrix exists, you can re-calculate statistics as mentioned above with the command:</p>
<pre>./bw \
   -hmmdir Communicator_40.cd_cont_4000 \
   -moddeffn Communicator_40.cd_cont_4000/mdef \
   -ts2cbfn .cont. \
   -feat 1s_c_d_dd \
   -cmn current \
   -agc none \
   -mllrmat mllr_matrix \
   -dictfn arctic20.dic \
   -ctlfn arctic20.fileids \
   -lsnfn arctic20.transcription \
   -accumdir .</pre>
<h2>Perform MAP training</h2>
<p>Sphinx 4 can benefit from MAP training, but it is labor-intensive at the moment.  MAP training requires accurate transcripts of the adaptation prompts, and may degrade performance if the wrong dictionary pronunciation is used for an audio recording.  For example, there are two pronunciations for &#8220;a&#8221; in the cmudict: &#8220;uh&#8221; and &#8220;ay&#8221;.  Unless the transcription is annotated with the correct variant (&#8220;A&#8221; or &#8220;A(2)&#8221;), the MAP training can degrade the acoustic models involving that phoneme.</p>
<p>The suggested solution to this problem is to either perform forced alignment or hand-transcribe the data.  I haven&#8217;t tried methods for force-alignment, yet.  Human annotators could use a file included in my version of the ARCTIC prompts: it lists all alternative pronunciations inline in the transcription file.  This is slow-going and error-prone, but it may be useful if you&#8217;re trying to adapt an acoustic model for personal use?</p>
<p>In any case, the commands to perform the MAP adaptation are below.</p>
<pre>cp ../SphinxTrain/bin.i686-pc-linux-gnu/map_adapt .
cp -r Communicator_40.cd_cont_4000/ Communicator_40.cd_cont_4000.adapted
./map_adapt \
    -meanfn Communicator_40.cd_cont_4000/means \
    -varfn Communicator_40.cd_cont_4000/variances \
    -mixwfn Communicator_40.cd_cont_4000/mixture_weights \
    -tmatfn Communicator_40.cd_cont_4000/transition_matrices \
    -accumdir . \
    -mapmeanfn Communicator_40.cd_cont_4000.adapted/means \
    -mapvarfn Communicator_40.cd_cont_4000.adapted/variances \
    -mapmixwfn Communicator_40.cd_cont_4000.adapted/mixture_weights \
    -maptmatfn Communicator_40.cd_cont_4000.adapted/transition_matrices</pre>
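<p>A quick, rough check that <code>map_adapt</code> wrote something new is to compare the adapted parameter files against the originals.  The sketch below uses a hypothetical helper, <code>changed_params</code>, with made-up file contents; it says nothing about whether the adaptation helped, only whether the files differ.</p>

```shell
# Hypothetical helper: print each parameter file that differs between two
# model directories.  A file missing from either directory is also reported.
changed_params() {
  for name in means variances mixture_weights transition_matrices; do
    cmp -s "$1/$name" "$2/$name" || echo "$name"
  done
}

# Demonstration with made-up file contents in scratch directories.
map_demo=$(mktemp -d)
mkdir "$map_demo/orig" "$map_demo/adapted"
for name in means variances mixture_weights transition_matrices; do
  echo "original $name" > "$map_demo/orig/$name"
  cp "$map_demo/orig/$name" "$map_demo/adapted/$name"
done
echo "adapted means" > "$map_demo/adapted/means"
echo "adapted variances" > "$map_demo/adapted/variances"
changed_list=$(changed_params "$map_demo/orig" "$map_demo/adapted")
printf '%s\n' "$changed_list"
rm -rf "$map_demo"
```

<p>For the directories in this post the call would be <code>changed_params Communicator_40.cd_cont_4000 Communicator_40.cd_cont_4000.adapted</code>; an empty result would suggest that the adaptation changed nothing.</p>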
<h2>Done!</h2>
<p>And that&#8217;s it! All of these commands should terminate within about one second.</p>
<h2>Testing</h2>
<p>To test whether or not these commands actually did anything, I generated some new/old recognition results.  Test data were the WAV files used for adaptation and I used the 5K NVP 3-gram ARPA language model available from <a href="http://www.keithv.com/software/giga/">Keith Vertanen&#8217;s site</a>.  The acoustic models were the original Communicator model and the MAP adapted model (but I didn&#8217;t use the MLLR transform).</p>
<p>This doesn&#8217;t tell us anything definite about the performance of the acoustic model in our application, especially since the test data were also the adaptation data, but it shows that the adaptation did do something. Below are the first few recognition results: the first line is using the original acoustic model (OLD), the second line is using the MAP-adapted acoustic model (NEW), and the third line is the gold-standard transcription (GLD).</p>
<p><code style="font-size:smaller;"><br />
OLD: all circuit egypt around philip still successor<br />
NEW: author of the danger trail philip feels it set or a<br />
GLD: author of the danger trail, philip steels et cetera<br />
</code></p>
<p><code style="font-size:smaller;"><br />
OLD: not efficiency killer case tom unfair work<br />
NEW: not at this particular case thomas politics with more<br />
GLD: not at this particular case tom apologized whittemore<br />
</code></p>
<p><code style="font-size:smaller;"><br />
OLD: further twentieth time at evening into mexican<br />
NEW: for the twentieth time that evening the two men sugar<br />
GLD: for the twentieth time that evening the two men shook hands<br />
</code></p>
<h2>Other Resources</h2>
<p>A document called &#8220;<a href="http://www.cs.cmu.edu/~archan/sphinxDoc.html">The Hieroglyph</a>&#8221; talks a bit more about adaptation and makes a few good points about the number of utterances and the care with which they are transcribed.  Some suggestions from that document were incorporated into this post.</p>
<p>Good luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/07/01/sphinx-4-language-model-adaptation/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>Gnome &#8220;Show Desktop&#8221; applet that only minimizes all windows</title>
		<link>http://pwnetics.wordpress.com/2011/06/28/gnome-show-desktop-applet-that-only-ever-minimizes-all-windows/</link>
		<comments>http://pwnetics.wordpress.com/2011/06/28/gnome-show-desktop-applet-that-only-ever-minimizes-all-windows/#comments</comments>
		<pubDate>Tue, 28 Jun 2011 19:33:53 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[technical]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=275</guid>
		<description><![CDATA[The &#8220;show desktop&#8221; applet in Gnome is more like a toggle button that switches between minimizing all windows and restoring all windows. I&#8217;ve never found the restore all windows behavior useful, it requires an extra mouse click, and I&#8217;m always jolted by the temporary flash of restoring windows. For those few souls who share this [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pwnetics.wordpress.com&amp;blog=9228334&amp;post=275&amp;subd=pwnetics&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The &#8220;show desktop&#8221; applet in Gnome is more like a toggle button that switches between minimizing all windows and restoring all windows. I&#8217;ve never found the restore-all-windows behavior useful; it requires an extra mouse click, and I&#8217;m always jolted by the temporary flash of restoring windows.</p>
<p>For those few souls who share this peeve there is <a href="http://www.reddit.com/r/linux/comments/dj7w8/hey_gnome_userswant_a_show_desktop_button_that/">a simple way to replace the applet</a> with a version that always and only shows the desktop.  Reproducing ruizscar&#8217;s instructions here:</p>
<ol>
<li>Install <code>"wmctrl"</code> (i.e., <code>sudo apt-get install wmctrl</code>)</li>
<li>Add a &#8220;Custom App Launcher&#8221; to the desktop panel bar with the command <code>"wmctrl -k on"</code></li>
</ol>
<p>If you want to completely reproduce the &#8220;show desktop&#8221; applet, you&#8217;ll need to also set the icon.  For the default Ubuntu/Gnome theme (as of 11.04), the icon is located at: <code>"/usr/share/icons/Humanity/places/24/gnome-ccdesktop.svg"</code>.  Hopefully that&#8217;ll get you in the right neighborhood depending on the theme/panel size you run.</p>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/06/28/gnome-show-desktop-applet-that-only-ever-minimizes-all-windows/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>PowerPoint Projector Attack?</title>
		<link>http://pwnetics.wordpress.com/2011/06/23/powerpoint-projector-attack/</link>
		<comments>http://pwnetics.wordpress.com/2011/06/23/powerpoint-projector-attack/#comments</comments>
		<pubDate>Thu, 23 Jun 2011 19:50:53 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[fun]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=264</guid>
		<description><![CDATA[Our projector broke during a presentation today, apparently a problem with the bulb. I started to wonder, &#8220;why today, why this presentation?&#8221; H0: Random hardware failure. H1: It so happens that the presentation used all black slides with white text. Perhaps the LCD blocking the projector light got too hot, which got the bulb too [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pwnetics.wordpress.com&amp;blog=9228334&amp;post=264&amp;subd=pwnetics&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Our projector broke during a presentation today, apparently a problem with the bulb.  I started to wonder, &#8220;why today, why this presentation?&#8221; </p>
<p>H0: Random hardware failure.</p>
<p>H1: It so happens that the presentation used all black slides with white text. Perhaps the LCD blocking the projector light got too hot, which got the bulb too hot, which destroyed our super-cheap projector bulb?</p>
<p>That&#8217;s my clever hypothesis; something to test if you have to give a presentation that you haven&#8217;t had time to finish :)</p>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/06/23/powerpoint-projector-attack/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>Negation in Prolog</title>
		<link>http://pwnetics.wordpress.com/2011/04/10/negation-in-prolog/</link>
		<comments>http://pwnetics.wordpress.com/2011/04/10/negation-in-prolog/#comments</comments>
		<pubDate>Sun, 10 Apr 2011 03:02:17 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[technical]]></category>
		<category><![CDATA[prolog]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=207</guid>
		<description><![CDATA[The first Prolog example in Bruce Tate&#8217;s Seven Languages in Seven Weeks has a bug.  Tate defines the predicate \+ to be &#8220;logical negation&#8221;, but this is incorrect.  It&#8217;s like logical negation, but differs in ways that cause straightforward queries to fail. Problem with the First Example The first example involves facts about Wallace and Grommit.  [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pwnetics.wordpress.com&amp;blog=9228334&amp;post=207&amp;subd=pwnetics&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The first Prolog example in Bruce Tate&#8217;s <a href="http://pragprog.com/titles/btlang/seven-languages-in-seven-weeks">Seven Languages in Seven Weeks</a> has a bug.  Tate defines the predicate <code>\+</code> to be &#8220;logical negation&#8221;, but this is incorrect.  It&#8217;s <em>like</em> logical negation, but differs in ways that cause straightforward queries to fail.</p>
<h2><span id="more-207"></span> Problem with the First Example</h2>
<p>The first example involves facts about <a href="http://www.youtube.com/watch?v=mk6zbY8i4_8">Wallace and Grommit</a>.  Here is the program that I load into <a href="http://www.gprolog.org/">GNU Prolog 1.3.0</a>.</p>
<h4>Listing 1</h4>
<pre style="border:solid 1px;font-size:120%;padding:5px;">likes(wallace, cheese).
likes(grommit, cheese).
likes(wendolene, sheep).
friend(<strong>X</strong>, <strong>Y</strong>) :- \+(<strong>X</strong> = <strong>Y</strong>), likes(<strong>X</strong>, <strong>Z</strong>), likes(<strong>Y</strong>, <strong>Z</strong>).</pre>
<p>This is a straightforward attempt at encoding the knowledge we&#8217;d express in English as: &#8220;Wallace and Grommit like cheese; Wendolene likes sheep&#8221; and &#8220;Two different things are friends if they both like at least one of the same thing&#8221;.  Evaluation 1 shows the results of querying this knowledge base.</p>
<h4>Evaluation 1</h4>
<pre style="border:solid 1px;font-size:120%;padding:5px;">| ?- likes(wallace, cheese).
 yes

| ?- friend(wallace, grommit).
 yes

| ?- friend(wallace, wallace).
 no

| ?- likes(wallace, <strong>Q</strong>).
Q = cheese
 yes

| ?- likes(<strong>Q</strong>, cheese).
Q = wallace ? a
Q = grommit
 no

| ?- friend(wallace, <strong>Q</strong>).
<span style="background-color:#ffcccc;"> no</span></pre>
<p>The first three queries look reasonable.  The fourth and fifth queries make Prolog search and report correct assignments of atoms to the variable <strong>Q</strong>. The sixth query attempts this same type of query; however, it fails to find an assignment. This is unexpected, as we know from our second query that <code>Q=grommit</code> is a consistent assignment!</p>
<h2>Improved First Example</h2>
<p>Listing 2 gives a version of the first example that behaves as expected.  The fix is to move the constraint <code>\+(<strong>X</strong> = <strong>Y</strong>)</code> to the end of the rule.</p>
<h4>Listing 2</h4>
<pre style="border:solid 1px;font-size:120%;padding:5px;">likes(wallace, cheese).
likes(grommit, cheese).
likes(wendolene, sheep).
friend(<strong>X</strong>, <strong>Y</strong>) :- likes(<strong>X</strong>, <strong>Z</strong>), likes(<strong>Y</strong>, <strong>Z</strong>), \+(<strong>X</strong> = <strong>Y</strong>).</pre>
<h4>Evaluation 2</h4>
<pre style="border:solid 1px;font-size:120%;padding:5px;">| ?- friend(wallace,<strong> Q</strong>).
<span style="background-color:#ccffcc;"><strong>Q</strong> = grommit</span> ? ;
 no</pre>
<h2><code>\+</code> is Not Logical Negation</h2>
<p>The fact that Listing 2 produces the correct output by reordering the predicates might be unsettling if one&#8217;s mental model of a Prolog program is that of a set of logical statements comprehended by a black box theorem prover.  However, <code>\+</code> is not logical negation and, more broadly, it is dangerous to ignore the details of how Prolog goes about searching for solutions.</p>
<p>The predicate <code>\+(M)</code> is implemented as <a href="http://cs.union.edu/~striegnk/learn-prolog-now/html/node90.html">&#8220;negation as failure&#8221; (NAF)</a>.  The first thing this predicate does is attempt to prove the goal <code>M</code>.  If this can be satisfied, then the next thing the predicate does is to make a <a href="http://en.wikibooks.org/wiki/Prolog/Cuts_and_Negation">cut</a>.  Cuts freeze all variable assignments made in a rule, from the beginning up to the point at which the cut appears.  Finally, the predicate fails explicitly.  All of this machinery effectively causes <code>\+(M)</code> to be false if <code>M</code> is true.</p>
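<p>As a rough illustration of the steps just described, the textbook formulation of negation as failure can be written as an ordinary two-clause predicate.  This is only a sketch: the name <code>naf/1</code> is made up for this example, and a real Prolog system may implement the built-in <code>\+</code> differently internally.</p>
<pre style="border:solid 1px;font-size:120%;padding:5px;">% naf/1 is a hypothetical stand-in for \+, for illustration only.
naf(<strong>M</strong>) :- call(<strong>M</strong>), !, fail.   % M is provable: commit with a cut, then fail
naf(_).                       % reached only when M could not be proven</pre>
<p>With this definition loaded, <code>naf(wallace = grommit)</code> succeeds and <code>naf(wallace = wallace)</code> fails, just as with <code>\+</code>.</p>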
<p>The incorrect behavior in the first program follows from this machinery.  When the rule is called with a variable left unbound, the <code>M</code> goal in <code>\+(M)</code> contains a free variable, and it is easy to find an assignment for that variable such that <code>M</code> is true.  The cut then commits to that one assignment, so no other assignments of the variable to atoms are ever tried; <code>\+(M)</code> fails, and the rule can never succeed for such a query.</p>
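<p>The effect can be seen by querying <code>\+</code> directly at the prompt, without the <code>friend</code> rule.  The session below is a sketch of what these queries should report; the exact prompt formatting may differ between Prolog systems and versions.</p>
<pre style="border:solid 1px;font-size:120%;padding:5px;">| ?- \+(wallace = <strong>Q</strong>).
 no

| ?- \+(wallace = grommit).
 yes

| ?- \+(wallace = wallace).
 no</pre>
<p>The first query asks whether <code>wallace = <strong>Q</strong></code> is unprovable.  Since <strong>Q</strong> can simply be bound to <code>wallace</code>, the goal is provable and the negation fails.</p>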
<p>For Listing 1, Prolog does something like the following:</p>
<ol>
<li>Goal 1: <code>friend(wallace, <strong>Y</strong>)</code></li>
<li>Goal 2: <code>\+(wallace = <strong>Y</strong>)</code></li>
<li>Goal 3: <code>wallace = <strong>Y</strong></code></li>
<li>Succeed 3: <code><strong>Y</strong> &lt;-- wallace</code></li>
<li>Cut and prevent reassignment of <code><strong>Y</strong></code></li>
<li>Fail 2 with no possibility of success because, with <code><strong>Y</strong></code> bound to <code>wallace</code>, <code>\+(wallace = wallace)</code> is always false</li>
<li>Fail 1</li>
</ol>
<p>Moving the negation predicate to the end of the rule makes Prolog search for assignments for the variables in the goal <code>M</code> <strong>before</strong> the negation predicate is evaluated.  Prolog does something like the following for Listing 2:</p>
<ol>
<li>Goal 1: <code>friend(wallace, <strong>Y</strong>)</code></li>
<li>Goal 2: <code>likes(wallace, <strong>Z</strong>)</code></li>
<li>Succeed 2: <code><strong>Z</strong> &lt;-- cheese</code></li>
<li>Goal 3: <code>likes(<strong>Y</strong>, cheese)</code></li>
<li>Succeed 3: <code><strong>Y</strong> &lt;-- wallace</code></li>
<li>Goal 4: <code>\+(wallace = wallace)</code></li>
<li>Fail 4, because <code>wallace = wallace</code> is true</li>
<li>Backtrack to 3 (cuts don&#8217;t freeze variables set in other goals)</li>
<li>Goal 3: <code>likes(<strong>Y</strong>, cheese)</code></li>
<li>Succeed 3: <code><strong>Y</strong> &lt;-- grommit</code></li>
<li>Goal 4: <code>\+(wallace = grommit)</code></li>
<li>Succeed 4, because <code>wallace = grommit</code> is false</li>
<li>Succeed 1, since we've succeeded in all sub-goals</li>
</ol>
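<p>The same ordering applies if the rule is written with the standard <code>\=</code> operator, which succeeds exactly when its two arguments cannot be unified and so behaves like <code>\+(X = Y)</code>.  Below is a minimal sketch; the predicate name <code>friend2</code> is hypothetical, and the facts from Listing 2 are repeated so that the file can be loaded on its own.</p>
<pre style="border:solid 1px;font-size:120%;padding:5px;">likes(wallace, cheese).
likes(grommit, cheese).
likes(wendolene, sheep).
% friend2/2 is a hypothetical variant of friend/2 that uses \= for the test.
friend2(<strong>X</strong>, <strong>Y</strong>) :- likes(<strong>X</strong>, <strong>Z</strong>), likes(<strong>Y</strong>, <strong>Z</strong>), <strong>X</strong> \= <strong>Y</strong>.</pre>
<p>Here too the test comes last, after <code>likes</code> has bound both variables.  Placing <code><strong>X</strong> \= <strong>Y</strong></code> first would make <code>friend2(wallace, <strong>Q</strong>)</code> fail for the same reason as in Listing 1.</p>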
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/04/10/negation-in-prolog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
		<item>
		<title>CRAPL &#8212; Community Research and Academic Programming License</title>
		<link>http://pwnetics.wordpress.com/2011/04/07/crapl-community-research-and-academic-programming-license/</link>
		<comments>http://pwnetics.wordpress.com/2011/04/07/crapl-community-research-and-academic-programming-license/#comments</comments>
		<pubDate>Thu, 07 Apr 2011 15:30:45 +0000</pubDate>
		<dc:creator>romanows</dc:creator>
				<category><![CDATA[technical]]></category>
		<category><![CDATA[copyright]]></category>
		<category><![CDATA[license]]></category>

		<guid isPermaLink="false">http://pwnetics.wordpress.com/?p=193</guid>
		<description><![CDATA[The Community Research and Academic Programming License (CRAPL) is an open source license drafted by Matt Might and is directed at researchers who produce code that is a byproduct of research.  This license is tuned to the current academic environment and its emphasis on publishing, more so than licenses like the GPL, LGPL, and BSD. Might&#8217;s writeup [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://matt.might.net/articles/crapl/">Community Research and Academic Programming License (CRAPL)</a> is an open source license drafted by <a href="http://matt.might.net/">Matt Might</a> and directed at researchers who produce code as a byproduct of research.  This license is tuned to the current academic environment and its emphasis on publishing, more so than licenses like the GPL, LGPL, and BSD.</p>
<p><span id="more-193"></span>Might&#8217;s writeup of the license is great and the license itself is very readable (and funny!).  But in three bullet points: what does the CRAPL (version 0, beta 0) do for you?</p>
<ul>
<li>Preemptive social protection: the author can signal that the published code is crap and that <em>he or she knows this, so calm down already</em></li>
<li>Attempt to balance GPL-like sharing of code <em>and data</em> with idiosyncrasies of the publication process</li>
<li>Does not itself allow either commercial or community use</li>
</ul>
<p>What might the CRAPL lifecycle look like?</p>
<p>Stage 1:</p>
<ol>
<li>Researcher tackles a problem, sketches a beautiful system architecture diagram on the whiteboard</li>
<li>Weeks pass, the stack of coffee filters shrinks, and another xkcd comic appears on the wall</li>
<li>&#8220;Oh smeg,&#8221; says the researcher, &#8220;five days until the conference deadline.&#8221;</li>
<li>Code is hacked, commented in and out, copy-pasted, and&#8230; this part never gets old&#8230; made to work and produce results</li>
<li>Paper <strong>and CRAPL&#8217;d code</strong> are submitted for peer review</li>
<li>Peer-reviewers read the paper and have their grad students examine the source code</li>
<li>Paper is published.  Huzzah!</li>
</ol>
<p>Stage N+1:</p>
<ol>
<li>On the other side of the globe, a lone grad student decides to extend the result</li>
<li>Student finds the CRAPL&#8217;d code, which is lucky for them because you can&#8217;t fit everything into an 8-page paper</li>
<li>Student fixes some bugs and reports them to the original researcher (as per at least the spirit of the CRAPL)</li>
<li>Student extends the code, writes a paper, and submits the paper and CRAPL&#8217;d code</li>
<li>Paper doesn&#8217;t make the cut, but the new code remains confidential so the student can try again at their leisure (as per the CRAPL)</li>
<li>One month later, the revised paper is accepted, and the student is required to publish the new code, also CRAPL&#8217;d (as per the CRAPL)</li>
</ol>
<p>Might clearly describes the social benefits of the CRAPL.  I&#8217;ll pick on some of the issues with the use-control aspects of the license.</p>
<p>The second permission #2 (heh, I will report this bug) is vague.  It requires someone who extends CRAPL&#8217;d code to make a &#8220;good-faith effort&#8221; to contact the original author.  As long as the license is tailored to our current academic model, why not define a minimum &#8220;good-faith effort&#8221;?  I propose that sending an email to the authors&#8217; email addresses (specified in the CRAPL license file), with the code modifications or bugfixes relevant to the original code attached, should count as sufficient.  (<a href="http://fluca1978.blogspot.com/2010/08/crapl-or-crap-of-university.html">Luca Ferrari pointed out</a> this issue)</p>
<p>The last permission, concerning the release of any input data used to generate published results, is contentious.  Acceptable or not, there are a lot of research results generated from data that cannot be released to the public.  Company-confidential data may be used in a publication from Google or Microsoft.  Data from experimental subjects is often protected by confidentiality requirements, as in some medical and psychological studies.  Data may be available only for a hefty fee from a third party, or may require several hard drives and hours of a researcher&#8217;s time to share.  I would modify the license to allow the original author to decide whether input data must be released and how, with a default of no restrictions.  Perhaps a later version of the CRAPL should fight the open data battle.</p>
<p>Where does the CRAPL fit into my licensing philosophy?  As a first approximation, I feel that the GPL is appropriate for end-user software, the LGPL or BSD/Apache is appropriate for libraries or &#8220;good&#8221; code, and the CRAPL is appropriate for means-to-an-end research code.  Apart from social signaling, the CRAPL lets you release code for scientific reasons while still holding the door open for later commercialization or alternative licensing.  For the benefit of science, it encourages peer review of code, communication of bugs and issues, and a kind of parallel continuity between code and published papers as scientific ideas wend their way through the literature.  While it may need a lot of work to be a legally bulletproof license, it is probably worthwhile to use as-is because (1) it doesn&#8217;t seem to give up the author&#8217;s copyright and (2) the scientific community has a decent sense of right and wrong concerning the correct use of other scientists&#8217; materials.</p>
]]></content:encoded>
			<wfw:commentRss>http://pwnetics.wordpress.com/2011/04/07/crapl-community-research-and-academic-programming-license/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d415e223f7bb2407cd53114ef2d0aff5?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">romanows</media:title>
		</media:content>
	</item>
	</channel>
</rss>
