<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>idefex.net &#187; Project</title>
	<atom:link href="http://idefex.net/category/project/feed/" rel="self" type="application/rss+xml" />
	<link>http://idefex.net</link>
	<description>Almost an obsession</description>
	<lastBuildDate>Thu, 24 Dec 2009 16:39:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Co(mic)incidence</title>
		<link>http://idefex.net/2009/08/comicincidence/</link>
		<comments>http://idefex.net/2009/08/comicincidence/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 07:27:57 +0000</pubDate>
		<dc:creator>area</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Project]]></category>

		<guid isPermaLink="false">http://www.idefex.net/?p=189</guid>
		<description><![CDATA[So, bizarrely, in two days I&#8217;ve been mentioned on the front page of two webcomics. The first is Dinosaur Comics which I wrote this mashup for, which Ryan North graciously linked. It seems to have been one of the more popular things I&#8217;ve done; I&#8217;ve enjoyed reading people&#8217;s thoughts on it over the course of [...]]]></description>
			<content:encoded><![CDATA[<p>So, bizarrely, in two days I&#8217;ve been mentioned on the front page of two webcomics. The first is <a href="http://www.qwantz.com/index.php?comic=1532">Dinosaur Comics</a>: I wrote this <a href="http://www.idefex.net/projects/qwantztwitter">mashup</a> for it, which Ryan North graciously linked. It seems to have been one of the more popular things I&#8217;ve done; I&#8217;ve enjoyed reading people&#8217;s thoughts on it over the course of the last few days &#8211; I think my favourite was someone who announced that it was now going to be their Twitter client of choice.  Someone also saw sufficient worth in it to submit it to <a href="http://www.reddit.com/r/funny/comments/9bgqv/dinosaur_comics_twitter_sweet/">Reddit</a>, where it did by far the best of anything I&#8217;ve made. It didn&#8217;t make the front page, but 26 points is a personal record. Unfortunately, it wasn&#8217;t my idea, so I can&#8217;t really take credit for <a href="http://cakebomb.co.uk/bing/?p=400">Chris Bingham&#8217;s masterstroke</a>. I was just glad I was able to give the idea the realisation it deserved. I should also say gracious thanks to my <a href="http://www.puffinhost.com">hosters</a>, who don&#8217;t seem to have blinked after the image generating script got inlined on the front page of a site that gets 70k+ page views a day. Given that they host me for free, that&#8217;s pretty impressive stuff.</p>
<p>The second webcomic is <a href="http://www.reallifecomics.com/archive/090818.html">Real Life Comics</a>. The author, Greg Dean, is having people submit scripts each day this week for how to continue the week-long (and non-canon) adventure. Today, he deemed my script the best. I&#8217;ve been reading his comic for over six years now (I think), and to see my name at the bottom of the comic is more than a little bizarre. I&#8217;m half tempted to buy a print. Of course, I now have even more of an interest in how this storyline turns out!</p>
]]></content:encoded>
			<wfw:commentRss>http://idefex.net/2009/08/comicincidence/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>In the summertime when the weather is high&#8230;</title>
		<link>http://idefex.net/2009/07/in-the-summertime-when-the-weather-is-high/</link>
		<comments>http://idefex.net/2009/07/in-the-summertime-when-the-weather-is-high/#comments</comments>
		<pubDate>Tue, 28 Jul 2009 09:02:40 +0000</pubDate>
		<dc:creator>area</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Project]]></category>

		<guid isPermaLink="false">http://www.idefex.net/?p=170</guid>
		<description><![CDATA[Well into the summer now (half-way, if we count from the end of exams) and having a real whale of a time. It&#8217;s my last &#8216;long&#8217; holiday for a long, long time, so coming in I consciously decided that I wanted to try and make the most of it. The month that I&#8217;ve been at [...]]]></description>
			<content:encoded><![CDATA[<p>Well into the summer now (half-way, if we count from the end of exams) and having a real whale of a time. It&#8217;s my last &#8216;long&#8217; holiday for a long, <em>long</em> time, so coming in I consciously decided that I wanted to try and make the most of it. The month that I&#8217;ve been at home so far compares surprisingly well to the month leading up to graduation.</p>
<p>A long-term goal of mine had been to build an arcade stick for a games console. It took around ten days of near-solid work, but it&#8217;s finally done and is a joy to use. <img class="alignleft" src="http://farm3.static.flickr.com/2596/3728695045_b73e420382.jpg?v=1247834175" border="1px" alt="Arcade Stick" width="250px" />It&#8217;s already getting a lot of use in Street Fighter IV (and I&#8217;ve only received a single piece of abuse on Live so far, which is less than I was expecting), and come Wednesday it&#8217;ll be getting a lot of love through the medium of Marvel vs. Capcom 2. The electronics were, ironically, the easy bit, despite the fact I was soldering to an existing PCB and setting up two 25 pin harnesses. It was all done in under three days and worked correctly first time. Cutting six pieces of wood out of a floorboard, drilling a few holes, and screwing them together? Took the remainder of the time. I hate working with wood. I&#8217;m very happy with how it turned out, though &#8211; certainly at the lowest point I wouldn&#8217;t have expected it to turn out this well. It ended up costing around the same as buying a Madcatz Street Fighter IV SE stick, but is at least future-proof &#8211; I can build other breakaway project boxes that will let it work on other consoles, if I am struck by the urge. </p>
<p>Much to my delight, there are enough people around without jobs just yet (mostly people who took a gap year and so have only just graduated along with me), so there&#8217;s more than enough support for fun times. Notable mentions so far are a black-tie dinner party, sailing on Ben&#8217;s boat (where we saw seals), the beach, and going out having dressed up in owl material (oooooooo indeed).</p>
<p>I also trekked into London with some excellent co-conspirators who were game enough to see Derren Brown&#8217;s Enigma show. We&#8217;re asked to &#8216;keep the mysteries mysterious, and the surprises surprising&#8217;, so in that spirit I won&#8217;t delve into too much detail. However, speaking loosely, I think I can safely say that we were all extremely impressed. We managed to deduce how some of the effects were done to our satisfaction, though many still eluded us. I believe I know loosely how the final effect was done (at least, I have a method that I would use if demanded to reproduce it at gunpoint), and I had a particular song stuck in my head for a couple of days after the show. The most shocking thing about the show was just how impressive it is to watch Derren put people into a dissociative state; there was one girl he kept putting under and pulling back, and she just went completely limp instantly, each time.</p>
<p>I&#8217;m conscious of the fact that compared to a lot of people I know, I&#8217;m not doing anything too extravagant with my prolonged time off. Phil&#8217;s off trekking South America, and I know a few other people who are travelling to similar extents. While doing something along those lines would be fun, I&#8217;m not pining for it to any great extent. I&#8217;m away for a week towards the end of next month, but it&#8217;s primarily Ph.D. related rather than outright fun &#8211; though I&#8217;m anticipating some of that as well; after all, all work and no play&#8230;</p>
<p>For a competition I wrote a program to &#8216;solve&#8217; any Countdown numbers game (assuming a solution exists). The source is <a href="http://idefex.net/countdown/countdown.pl">here</a>, though it is very slow (a couple of minutes to exhaustively search a tree). Poking around the internet after I wrote it shows that there are <a href="http://www.cs.nott.ac.uk/~gmh/countdown.pdf">much faster</a> implementations of what is essentially the same algorithm, just with more shortcuts taken and a stricter adherence to the rules (it transpires that intermediate fractions are not allowed in Countdown, so by checking a % b before dividing you can save yourself a lot of time). It took me most of the day, but only because I don&#8217;t use Perl on a regular basis; I was initially unfamiliar with how it displays numbers, truncating the printed representation when there are enough 0s after the decimal point. Frustrating, but good to know.</p>
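<p>As a rough illustration of that divisibility shortcut, here is a minimal sketch in Python rather than the Perl linked above. It is not the original program: the function name <code>solve</code> and the search order are assumptions made for the example. It repeatedly combines two of the remaining numbers and only divides when <code>a % b</code> is zero, so no fractions ever appear.</p>

```python
from itertools import combinations


def solve(numbers, target):
    """Return an expression string that evaluates to target, or None."""

    def search(items):
        # items is a list of (value, expression) pairs still available.
        for value, expr in items:
            if value == target:
                return expr
        for i, j in combinations(range(len(items)), 2):
            (a, ea), (b, eb) = items[i], items[j]
            if b > a:  # order the pair so that a is the larger value
                a, ea, b, eb = b, eb, a, ea
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            candidates = [(a + b, f"({ea} + {eb})"), (a * b, f"({ea} * {eb})")]
            if a > b:  # skip differences of zero, which never help
                candidates.append((a - b, f"({ea} - {eb})"))
            if b > 0 and a % b == 0:  # no intermediate fractions
                candidates.append((a // b, f"({ea} / {eb})"))
            for candidate in candidates:
                found = search(rest + [candidate])
                if found is not None:
                    return found
        return None

    return search([(n, str(n)) for n in numbers])
```

<p>For example, <code>solve([3, 6], 2)</code> returns <code>(6 / 3)</code>, while <code>solve([2, 3], 7)</code> returns <code>None</code> because no combination reaches the target.</p>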
]]></content:encoded>
			<wfw:commentRss>http://idefex.net/2009/07/in-the-summertime-when-the-weather-is-high/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twitter Compression</title>
		<link>http://idefex.net/2009/05/twitter-compression/</link>
		<comments>http://idefex.net/2009/05/twitter-compression/#comments</comments>
		<pubDate>Sun, 10 May 2009 10:15:20 +0000</pubDate>
		<dc:creator>area</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Project]]></category>

		<guid isPermaLink="false">http://www.idefex.net/?p=111</guid>
		<description><![CDATA[Take one arbitrary limitation (Twitter only lets you send 140 characters). Note that Twitter allows you to use UTF-8 characters. Add a course in information theory. Roast under the heat of procrastination for several hours until juicy.  
In English: I wondered just how much information you could send in a single tweet, and decided [...]]]></description>
			<content:encoded><![CDATA[<p>Take one arbitrary limitation (Twitter only lets you send 140 characters). Note that Twitter allows you to use UTF-8 characters. Add a course in information theory. Roast under the heat of procrastination for several hours until juicy.  </p>
<p>In English: I wondered just how much information you could send in a single tweet, and decided to find out.</p>
<p>Armed with the knowledge from an Information Theory course given by the singular <a href="http://www.inference.phy.cam.ac.uk/mackay/">David MacKay</a>, I set to work.  So, it&#8217;s common knowledge that Arithmetic Coding is the best way to compress information, getting you to within a couple of bits of the Shannon Limit so long as you have an accurate modelling system. The &#8216;best&#8217; bit of the compression comes from having suitably subtle and ingenious models of the text that you&#8217;re encoding; without such a model, you&#8217;re probably not going to see the benefits of Arithmetic Coding. However, I&#8217;m always up for a challenge &#8211; especially one set by someone with an Erdős number of two. </p>
<p>Admittedly, I&#8217;d attempted to make an arithmetic encoder for a homework, which I almost achieved. However, I cheated by using the arbitrary precision maths module in PHP which would break after a suitably long string. As I presented it to him, I felt a pang of shame even before I explained&#8230; so reinvigorated with the fire of project procrastination, I thought I&#8217;d try and get one working properly.  Admittedly, my first instinct was to just find one written in PHP, to get onto the meat of this project. But there didn&#8217;t seem to be one on the internet &#8211; until now! And it really does work properly, much to my amazement. Technically, it&#8217;s a range encoder, but the two are mathematically identical. As an added bonus, a range encoder isn&#8217;t protected by IBM patents, whereas an arithmetic coder is.  </p>
<p>So, how do we transmit our bitstream over <a href="http://twitter.com">Twitter</a>? Happily, Twitter supports UTF-8 characters, which can occupy up to four bytes each. For a four byte character, 21 bits are under the control of the user. If we were to use just uppercase characters and space, for English this corresponds to about four bits per character we wish to send &#8211; so we could send up to 700 characters in such a case! We like punctuation though, so this means we can send fewer total characters, but it&#8217;s probably worth it.  And so began my trial with Unicode.</p>
<ol>
<li>First attempt: split the binary string into 21 bit blocks, pad the last one and send the corresponding UTF-8 characters. Unfortunately, not all 21 bit long binary strings map to valid characters &#8211; those above 0x10FFFF are invalid. So drop down to 20 bits per UTF-8 character, and have the first bit as 0 &#8211; we&#8217;re now definitely going to map onto a valid Unicode character, right?</li>
<li>Turns out, no. The points 0xD800 to 0xDFFF are reserved for <a href="http://en.wikipedia.org/wiki/UTF-16">UTF-16</a> surrogate pairs. So look for these, and if we would be trying to send one of these, send a three byte character instead (which will only be 16 bits, but still a bargain).</li>
<li>Unicode has canonical decompositions &#8211; so the character á is canonically identical to the characters representing a´ in sequence, for example. These are defined to be identical, and no application should function differently when presented with one sequence or the other. So now check to see that a character we are trying to send is fully decomposed &#8211; if not, drop to a three byte character if we were four, and a two byte character if we were three. I&#8217;m not convinced that there&#8217;s a PHP library that does a complete job of this anywhere, so this is currently only a &#8216;probably&#8217; step of the process &#8211; if a message fails to send, it&#8217;s probably failed here.</li>
<li>Line Feed / Carriage Return. Different OSes introduce the other &#8211; or not &#8211; after seeing one. So if the character we&#8217;re trying to send is actually in the bottom 128 of characters, only take seven bits. Set the 32s bit to 1, so we&#8217;re not using the awkward part of the ASCII table.</li>
</ol>
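<p>The arithmetic behind the simplest case can be sketched in Python. This is a simplified variant rather than the scheme in the list above: it adds an assumed offset of 0x10000 to every 20 bit block, so each block lands on a four byte code point and the surrogate range never comes up. It ignores the normalisation and line ending problems entirely, and the names <code>pack</code> and <code>unpack</code> are made up for the example.</p>

```python
BITS_PER_CHAR = 20
OFFSET = 0x10000  # first code point that needs four bytes in UTF-8


def pack(bits):
    """Turn a string of '0'/'1' characters into a string of code points."""
    padding = -len(bits) % BITS_PER_CHAR
    bits = bits + "0" * padding  # pad the final block with zeros
    blocks = [bits[i:i + BITS_PER_CHAR] for i in range(0, len(bits), BITS_PER_CHAR)]
    # The largest block value is 0xFFFFF, so the result never exceeds 0x10FFFF.
    return "".join(chr(OFFSET + int(block, 2)) for block in blocks)


def unpack(text):
    """Recover the zero-padded bit string from packed text."""
    return "".join(format(ord(ch) - OFFSET, "020b") for ch in text)
```

<p>With this layout, 140 characters carry 140 &#215; 20 = 2800 bits, and at roughly four bits per letter that is the 700 character figure estimated earlier. The padding zeros look just like data, so the receiver needs some other way to know where the message ends.</p>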
<p>I then set about trying to send some information over Twitter. <a href="http://twitter.com/area/status/1728233100">The first test</a> worked well. I decided to push it to the limit, however, and encoded nearly 600 characters as a string of 138 UTF-8 characters. Dropped my clipboard into Tweetie, and was informed that I was over the limit. Perplexed, I did some poking around and found that, in fact, <a href="http://groups.google.com/group/twitter-development-talk/browse_thread/thread/9d9d16d55e2e1e67/4fb990b472f3e812?lnk=gst&amp;q=bytes#4fb990b472f3e812">Twitter isn&#8217;t sure</a> what they mean when they say 140 characters. They think they probably mean bytes, not characters, but there is <a href="http://twitter.com/atebits/status/1286199010">evidence to the contrary.</a> No further updates as yet that I can see, but the problem seems to be at least partly due to how Ruby counts characters. Until this is fixed, however, the amount of information that can be sent in a single tweet is going to be somewhat limited. They&#8217;re looking into it.</p>
<p>The upshot is that there is now a proof of concept of the <a href="http://idefex.net/compressor/twitter/">compressor</a> up along with <a href="http://idefex.net/compressor/twitter/rangecoder.zip">the source</a>. I&#8217;ve included the UTF-8 libraries with it which are from a couple of sources, but mostly MediaWiki. I based the range encoder on the pseudocode over at <a href="http://en.wikipedia.org/wiki/Range_encoding">the wiki article</a>, changing it to use binary rather than base 10. It&#8217;s entirely for use at your own risk.</p>
<p>Lastly, it would seem that I am not the <a href="http://anirudhsanjeev.org/how-to-send-420-characters-per-twitter-message/">first</a> to have an idea along these lines, and some have even gone much, much <a href="http://www.flickr.com/photos/quasimondo/3518306770/in/photostream/">further</a>. Arguably <a href="http://lukehatcher.com/2009/05/storing-binary-data-in-twitter/">too far</a>. Truly, the internet is a wonderful enabler for people who want to do things simply to see if they can be done. I find it amusing how all of these seem to have sprung up in the last month!</p>
]]></content:encoded>
			<wfw:commentRss>http://idefex.net/2009/05/twitter-compression/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
