a blog by victor powell (vicapow)

Simpson's Paradox Visualized in D3

July 24, 2013

In probability and statistics, Simpson's paradox, or the Yule-Simpson effect, is a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data. This result is often encountered in social-science and medical-science statistics, and is particularly confounding when frequency data are unduly given causal interpretations. -- Wikipedia

Feel free to play with the code and plug in your own data with requirebin or GitHub.

Note: all examples are taken from the Wikipedia page on Simpson's paradox.

What is this?

Context is one of the most complicated things for people just starting out learning JavaScript to wrap their brains around. Even if you've been programming in JS for a while, you may often run into bugs related to context. In this post, I'm going to do my best to explain context with as many examples and as few words as possible. Here goes!

First off, what is it? Well, before we talk about context, you need to make sure you understand how scope works.

var a = 10;
var foo = function(a){
  console.log(a);
};
foo(20); // prints `20`
console.log(a); // prints `10`

what if I want to print the outer a in foo? well, we can either rename a outside of foo or rename a inside of foo.

var a = 10;
var foo = function(b){
  console.log(a);
  console.log(b);
};
foo(20); // prints `10` then `20`
console.log(a); // prints `10`

Now for context. You can think of the context of a function as an additional, hidden argument called this which, by default (outside of strict mode), is the window object in the browser.

console.log(this === window); // prints `true`
var foo = function(){
  console.log(this);
};
foo(); // prints the `window` object to the console

However, we can change this hidden argument (aka the context) if we call a function with an object to the left of the dot.

var a = {};
a.foo = function(){
  console.log(this);
};

a.foo(); // prints the `a` object to the console

notice how in the following example we're not calling f with an object to the left of the dot?

var f = a.foo;
console.log(f === a.foo); // prints `true`
f(); // prints the `window` object to the console

Another way in which a function can be called with a different context is by using call or apply. With these functions we can explicitly set the context to whatever we want.

var f = function(){
  console.log(this);
}
var a = { foo : 'bar' }
f(); // prints `window` to the console
f.call(a); // prints `{ foo : 'bar'}` to the console
f.apply(a); // prints `{ foo : 'bar'}` to the console

The only difference between call and apply is how the additional arguments are passed. call can take any number of additional arguments, listed one after another, that will get passed to the function being called. apply optionally takes those same arguments as an array in its second argument.

f.call(theContext, arg1, arg2, arg3, ...);
f.apply(theContext, [arg1, arg2, arg3, ...]);
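Here's a small runnable sketch of the same idea. The greet function and the person object are made up for illustration; only call and apply themselves come from the language.

```javascript
// hypothetical function that uses both its context and its arguments
var greet = function(greeting, punctuation){
  return greeting + ', ' + this.name + punctuation;
};
var person = { name: 'Ada' };

// `call` takes the extra arguments one by one
var viaCall = greet.call(person, 'hello', '!');
// `apply` takes the same extra arguments as a single array
var viaApply = greet.apply(person, ['hello', '!']);

console.log(viaCall);  // prints `hello, Ada!`
console.log(viaApply); // prints `hello, Ada!`
```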

The most common issue people run into with context is when passing a function as a callback to some other function that doesn't take an additional context argument. setTimeout is a great example.

var a = {
  foo : function(){
    console.log(this);
  }
};
setTimeout(a.foo, 1000); // will print `window` to the console after 1 second. WTF?!?

The expression a.foo returns a function; it doesn't call a function in a particular context. When setTimeout actually calls the function it gets, it doesn't call it with any context, so it defaults to the window object. Well crap. How can we make sure our function gets called in the right context? Magic. More specifically, the magic of closures.

var a = {
  foo : function(){
    console.log(this);
  }
};
setTimeout(function(){
  a.foo();
}, 1000); // will print the object `a` to the console
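If you'd rather not write the wrapper function yourself, Function.prototype.bind is another option: it returns a new function whose context is fixed. This is a minimal sketch, not the only way to do it. The callBare helper is hypothetical and just stands in for something like setTimeout that calls whatever function it's handed with nothing to the left of the dot.

```javascript
var a = {
  whoAmI : function(){
    return this;
  }
};

// `bind` returns a new function whose context is always `a`
var bound = a.whoAmI.bind(a);

// hypothetical stand-in for `setTimeout`: calls `fn` without any context
var callBare = function(fn){
  return fn();
};

console.log(callBare(bound) === a);    // prints `true`
console.log(callBare(a.whoAmI) === a); // prints `false`
```

In the browser, setTimeout(a.whoAmI.bind(a), 1000) would then call the function with a as its context.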

Why You Should Learn Git Even If You Don't Program (Yet)

Jun 20, 2013

Git is an amazing tool for collaborating. Because it started as the version control mechanism for managing the Linux kernel, it's most often associated with large programming projects, and few people outside (and probably inside) the development community realize its applicability in other areas.

But that's changing. For example, a group of about twenty mathematics professors recently put together a textbook on 'Homotopy Type Theory' using git in combination with an online tool called GitHub. Passing around word documents, images, and equations might have worked for a team of two, but with more team members, the communication channels became too complicated for email. Git was the glue that made the collaboration possible. You can watch the team's collaboration unfold in this video.

Back in April of 2012, Twitter also put their employee patent agreement, the Innovator's Patent Agreement, on GitHub with...

"the hope that you will take a look, share your feedback and discuss with your companies. And, of course, you can #jointheflock and have the IPA apply to you."

Some people have even started using git with GitHub to version control their cooking recipes.

The really amazing thing about git projects is that anyone can easily contribute to these works by what's called forking followed by a pull request. Forking allows you to have your very own version of these documents that you can edit and change any way you like. A pull request allows you to recommend your changes back to the original authors, essentially saying "hey! how about you accept this change I made back into your version of the project?"

This blog is even controlled with git. Fork it and fix my typos!

To learn more about how to use git from the command line, you can check out GitHub's excellent tutorial at try.github.io. In most cases, though, you can get by just using GitHub's web interface.

Happy gitting!

The Mean Visualized

June 13th, 2013

This is a simple visualization of the Arithmetic mean (aka, the average.) Click and drag the balls around and watch the mean (the larger yellow ball) update.

The mean is often used to describe the central tendency of a set of data values. It's one possible answer to the question "what is a typical value for the data set?"

It's important to remember that the mean is not a robust statistic, meaning that outliers will have a large effect on the mean. You can see this in the visualization by adding several balls to one side in a concentrated area and then adding a single ball to the opposite side.
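The same effect in numbers, as a small sketch. The mean function and the data values here are made up for illustration and aren't taken from the visualization's code.

```javascript
// arithmetic mean: the sum of the values divided by how many there are
var mean = function(values){
  var sum = 0;
  for(var i = 0; i < values.length; i++){
    sum += values[i];
  }
  return sum / values.length;
};

// five values bunched together on one side
var clustered = [10, 11, 12, 13, 14];
// the same values plus a single far away outlier
var withOutlier = clustered.concat([120]);

console.log(mean(clustered));   // prints `12`
console.log(mean(withOutlier)); // prints `30`
```

One extra value moves the mean from 12 to 30, well outside the range of the original cluster.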

Note that the size of the balls is meaningless in the visualization. They're only large enough to be clicked on. The mean ball has twice the radius of the others only so you can still see it when the other balls are in front of it.

You can find a copy of the code to this visualization here: http://bl.ocks.org/vicapow/5778069

Central Limit Theorem Visualized in D3

May 29, 2013

In probability theory, the central limit theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables, each with a well-defined mean and well-defined variance, will be approximately normally distributed.

-- Central Limit Theorem - Wikipedia

which is what we see here. at every triangle, the ball has a 50/50 shot of going to the left or to the right. you can also think of it like coin flips, where the number of coin flips is (bins - 1).

if we assign heads to 0, and tails, 1 (or 0 for left, 1 for right as in the case of the visualization above)

for 1 coin flip, the possible sums of coin flips are:

0 -> 0
1 -> 1

so for 1 coin flip, there are 2 different possible outcomes, each equally likely. the expected percentage of the possible outcomes is then:

0: 50%
1: 50%

for 2 coin flips, all possible outcomes are:

0 + 0 -> 0
0 + 1 -> 1
1 + 0 -> 1
1 + 1 -> 2

so for 2 coin flips, there are only 3 different possible coin sums. But unlike the other outcomes, the outcome where the coins total 1 is possible with two different coin combinations, so its probability is double. Our expected percentages would then look like:

0: 25%
1: 50%
2: 25%
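The same counting can be done for any number of flips. Here's a brute force sketch that enumerates every possible sequence of flips and tallies the sums. The sumDistribution function is hypothetical and isn't part of the visualization's code.

```javascript
// returns the percentage of outcomes for each possible sum of `flips` coin flips
var sumDistribution = function(flips){
  var counts = [];
  for(var s = 0; s <= flips; s++) counts.push(0);
  var outcomes = Math.pow(2, flips); // every sequence is equally likely
  for(var o = 0; o < outcomes; o++){
    // treat each bit of `o` as one coin flip (0 or 1) and add them up
    var sum = 0;
    for(var bit = 0; bit < flips; bit++){
      if(o & (1 << bit)) sum++;
    }
    counts[sum]++;
  }
  return counts.map(function(count){
    return 100 * count / outcomes;
  });
};

console.log(sumDistribution(1)); // 50, 50
console.log(sumDistribution(2)); // 25, 50, 25
console.log(sumDistribution(4)); // 6.25, 25, 37.5, 25, 6.25
```

As the number of flips goes up, the percentages pile up in the middle, which is the bell shape the balls settle into.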

give it a try!

note: this visualization was inspired by: http://vis.supstat.com/2013/04/bean-machine/

Test Your Javascript Skillz

May 25, 2013

Hoisting

will it throw an error?

foo()
function foo(){
  console.log('foo')
}

answer

no


bar()
var bar = function(){
  console.log('bar')
}

answer

yes TypeError: undefined is not a function


if(false){
  var konnichiwa = 'bonjour'
}
console.log(konnichiwa)

answer

no


if(false){
  var hallo = 'privet'
}
console.log(nihao)

answer

yes ReferenceError: nihao is not defined
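One way to make sense of these answers is to rewrite the code roughly the way the engine treats it: var declarations move to the top of their scope, but the assignments stay where they are. This is an illustrative rewrite of the bar question, not literally what the engine produces. The typeBefore and typeAfter variables are added just to show the state at each step.

```javascript
var bar;                      // the declaration is hoisted; its value is `undefined`
var typeBefore = typeof bar;  // calling bar() at this point would throw a TypeError
bar = function(){             // the assignment stays where it was written
  return 'bar';
};
var typeAfter = typeof bar;

console.log(typeBefore, typeAfter); // prints `undefined function`
```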


The Event Loop, Asynchronicity, And Closure Scope

what is the entire output to the console after 100ms for the following code?

var array = []
for(var i = 0; i < 5; i++){
  setTimeout(function(){
    array.push(i)
  }, 100)
}

console.log('array: ', array)
setTimeout(function(){
  console.log('array: ', array)
},  100)

answer

array: []

array: [5,5,5,5,5]


what if we changed the 100 in the second setTimeout to 99 ?

answer

array: []

array: []


how could we write the code above to produce the following output:

array: []

array: [0,1,2,3,4]

answer

var array = []
for(var i = 0; i < 5; i++){
  (function(i){ // closure
    setTimeout(function(){
      array.push(i)
    }, 100)
  })(i)
}

console.log('array: ', array)
setTimeout(function(){
  console.log('array: ', array)
},  100)

what would be the output if we changed i to i * 2 in the argument to the closure?

var array = []
for(var i = 0; i < 5; i++){
  (function(i){
    setTimeout(function(){
      array.push(i)
    }, 100)
  })(i * 2) // used to be just `i`
}

console.log('array: ', array)
setTimeout(function(){
  console.log('array: ', array)
},  100)

answer

array: []

array: [0,2,4,6,8]
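In engines that support ES2015, declaring the loop variable with let is another way to get a fresh i per iteration without the wrapper function. This is a sketch under that assumption. To keep it synchronous, the callbacks are stored in a hypothetical callbacks array and called after the loop, instead of going through setTimeout.

```javascript
var callbacks = [];
for(let i = 0; i < 5; i++){ // `let` creates a new `i` for every iteration
  callbacks.push(function(){
    return i;
  });
}

// call the functions only after the loop has finished, like a timeout would
var array = callbacks.map(function(callback){
  return callback();
});

console.log(array.join(',')); // prints `0,1,2,3,4`
```

With var instead of let, every callback would share one i and the result would be 5,5,5,5,5, just like the first question.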


what is the output?

function foo(baz){
  var hola = 'moshi moshi!'
  baz()
}

function bar(){
  console.log(hola)
}

foo(bar)

answer

ReferenceError: hola is not defined
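The error happens because a function can only see the variables in the scope where it was written, not the scope it happens to be called from. If bar needs the value, one option is to hand it over as an argument. This is a sketch of that fix; the greeting parameter name is made up.

```javascript
function foo(baz){
  var hola = 'moshi moshi!';
  return baz(hola); // pass the value along explicitly
}

function bar(greeting){
  return greeting;
}

console.log(foo(bar)); // prints `moshi moshi!`
```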


Automatic Semicolon Insertion

what is printed to the console?

function foo(){
  return 
  {
    foo : "bar"
  }
}
console.log( typeof foo() === 'undefined')

answer

true


var foo = function(){
  console.log('foo')
}
(function(){
  console.log('bar')
})()

answer

TypeError: undefined is not a function


How would you fix the code above?

answer

var foo = function(){
  console.log('foo')
}
;(function(){ // leading semicolon
  console.log('bar')
})()

Responsive D3

May 15, 2013

So what techniques make this possible?

1. don't use svg!

I know this seems to go against idiomatic d3 but remember that d3 is

...not a monolithic framework that seeks to provide every conceivable feature. Instead, D3 solves the crux of the problem: efficient manipulation of documents based on data.

And that's exactly what we're doing here except with non-svg DOM elements because of the lack of support for percentage based positioning of svg elements. (if you've found a better technique, please let me know!)

update: After writing this blog post, I found an example that does achieve a similar effect with svg elements using the preserveAspectRatio="none" <svg> property in combination with vector-effect: non-scaling-stroke on <path> elements.
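The core of percentage based positioning is just mapping a data value onto a 0% to 100% range. Here's a rough sketch. The toPercent helper and the 0 to 200 data range are assumptions for illustration and aren't taken from the original example's code.

```javascript
// hypothetical helper: where does `value` sit between `min` and `max`, as a CSS percentage?
var toPercent = function(value, min, max){
  return ((value - min) / (max - min) * 100) + '%';
};

console.log(toPercent(50, 0, 200));  // prints `25%`
console.log(toPercent(200, 0, 200)); // prints `100%`

// with d3 this could be used on absolutely positioned, non-svg elements, e.g.
// selection.style('left', function(d){ return toPercent(d.x, 0, 200) })
```

Because the positions are percentages of the parent, the browser re-lays everything out on resize without any extra code.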

2. use container divs

A common pattern that I often see is trying to center an element at a percentage distance from the edge of its parent element. For these cases, you can use an extra wrapper or container div.

.label-container{
  width: 1px;
  height: 1px;
  position: absolute;
  left: 50%;
}
.label-container .label{
  width: 100px;
  position: absolute;
  left: -50px; /* half of the width */
  text-align: center;
}

3. Use 'em' not 'px' !

This will allow text elements to resize automatically when the font-size of their parent element changes. Then, when the window resizes, you can update the font size of the parent element like so:

  window.onresize = function(){
    var graph = document.getElementsByClassName('graph')[0]
    graph.style.fontSize = (graph.offsetWidth / 75) + 'px'
  }

When Semicolons Are Not Optional

May 15, 2013

Use semicolons. But if you don't use semicolons, know when they're not optional.

// here we want to create a function and call it immediately
(function(msg){
  console.log(msg)
})('wow')
// but what if we have two of these in a row?
(function(msg){
  console.log(msg)
})('wow')
(function(msg){
  console.log(msg)
})('wow')
// produces a TypeError: undefined is not a function

wha!?

;(function(msg){
  console.log(msg)
})('wow')
;(function(msg){
  console.log(msg)
})('wow')

ahh.. much better! so basically, if you're not going to do anything with a result wrapped in parentheses, put a semicolon in front of it. that's it!
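The same thing can bite you when a line starts with a square bracket, since the parser will happily treat it as a property lookup on the previous line. A small sketch, with made-up variable names:

```javascript
var seen = []
var msg = 'wow'
// without the leading semicolon this would be parsed as 'wow'[1, 2, 3].forEach(...)
// and throw a TypeError
;[1, 2, 3].forEach(function(n){
  seen.push(msg + n)
})
console.log(seen.join(' ')) // prints `wow1 wow2 wow3`
```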

Technical Discrepancies In The Movie Oblivion (Spoiler Alert)

April 27, 2013

  1. Why were the clones soulless when they first invaded the planet but not while working as drone repair people?

  2. Why didn't the robot grab more human samples when it first invaded? Why just stick to two?

  3. Why would a massive computer be so stupid as to contain its entire self within a floating cube the size of a house? Wouldn't it be safer to distribute itself among several components in as wide an array as possible?

  4. What the fuck does a robot want with water? Seems like a sun would be a much more convenient source of energy. Or wouldn't a robot that's smart enough to clone people also be smart and powerful enough to create a fusion reactor?

  5. How did it get its power from the water back to the god/mother robot?

  6. If the base station for the repair crews had to be out of communication range from the mother robot during long periods, wouldn't that mean the drones were also able to operate autonomously? So then why did the drones just drop immediately when the mother robot died?

  7. Why would the mother robot live inside of the cloning center of the large base station?

  8. How did the first repair ship stop running after it entered the radiation zone but the second ship entered the previous zone without any problems?

  9. What happens when the other 50-100 robot repair men find out Julia is alive and living with clone 52? Is she going to have that many husbands? They all loved her equally as much as the original.

  10. Why didn't clone 52 try to help Julia after he untied himself?

  11. How did clone 52 make it out of the desert and why didn't he try to take Julia?

  12. Why didn't the repair ship have a black box recording all of Jack's activity?

Fibonacci In Rust

April 14, 2013

// (the slow, recursive way)

fn fib(x: u64) -> u64 {
  match x {
    0 => 0,
    1 => 1,
    _ => fib(x - 1) + fib(x - 2),
  }
}

fn main() {
  // print the first 40 fibonacci numbers
  for n in 0..40 {
    println!("fib {}: {}", n, fib(n));
  }
}

What Do I Want?

January 2, 2013

A while ago I was asked the question "what do you want?" It took me aback. It wasn't something I had really thought about much. So I took the time to write a few things down, and a few days ago I came across that text file. Here it is below:

glad to see my opinions haven't changed :)

Thoughts On The Movie '12 Monkeys'

December 12, 2012

I know I'm a little late to the party but I just watched the movie 12 Monkeys and was so blown away, I was inspired enough to write down some of my thoughts. There was one scene in particular I couldn't wrap my head around; the final scene, when the woman scientist (astrophysicist) is seen on the plane next to the bio terrorist, she mentions she's "in insurance."

My first thought was that it was simply a coincidence. That she unknowingly sat next to the bio terrorist in the past. But she wasn't as young as she should have been, as suggested by the age difference between young and old Cole. This small scene completely changes my original interpretation of the film because up until that point, it seemed Cole was very much unable to affect his past and that he was always destined to witness his own death.

After some Googling, it turns out I wasn't the only one perplexed by this scene. One theory was that the past in the movie was, in fact, changeable and the scientist on the plane was there to "ensure" the bio terrorist went through with his plan in light of Cole's actions. I remained in disbelief of this opinion until the commenter posted a link to the original script, which describes Jose's reaction to hearing the details of the bio terrorist and what plane he's flying out of. The script reads specifically, "JOSE, having heard this, steps back into the crowd as RAILLY grabs COLE and pulls him toward the Security Check Points." This means Jose was after that information for the scientists so they could make sure he made it onto the plane! The scientists had no intention of changing the past, and the past was actually changeable.

Another scene in the movie that supports this theory is that in an early flashback to Cole's memory of the incident at the airport, Cole sees Jeffrey as the man with the ponytail and suitcase running for the gate, implying that Jeffrey may have been the original terrorist until the past was altered.

The Appreciability Of Code As Art

December 02, 2012

Code is art. If you google "what is art?" you get the following response

"The expression or application of human creative skill and imagination"

By this definition, code should rightfully be considered art, but few self-described artists give it this rightful designation. I think this has to do with what I call code's lack of "appreciability." Take music as an example. It's fairly easy to argue that any one of Beethoven's works can be considered art. This is because it's easy for almost everyone (at some level) to appreciate music. You need no formal education to enjoy it. (Although, I'm sure having a deeper understanding of music helps you to better appreciate the level of detail and mastery expressed in his work.) So with music, everyone's born with the ability to appreciate it, but only a few go on to master it. Programming is different, in that to appreciate it, you have to have mastered it (or at least be proficient at it.)

I'm Dyslexic And I Program

November 18, 2012

I'm dyslexic. And no, that doesn't mean I read backwards. It also doesn't mean I'm stupid. From Wikipedia:

"Dyslexia is a brain-based type of learning disability that specifically impairs a person's ability to read. These individuals typically read at levels significantly lower than expected despite having normal intelligence." - National Institute of Neurological Disorders and Stroke

But in middle school, my teachers all just about gave up on me ever being able to read, write or spell at a proficient level and so did I. They gave me a laptop that came with software that would translate text into speech and the audio versions of my reading assignments and sent me on my way. And so, for the longest time, I believed I would never need to read or write. What was the point? Technology could already do this task for me. So I almost never read or wrote anything. Nor did I want to. It was hard and frustrating and everyone could do it better than me.

With the computer I was given, I learned to type. And better and more quickly than any of my peers. Typing, it turns out, was a great way to learn how to spell. I could memorize the motion of my fingers instead of trying to remember the order of letters.

Then, another interesting thing began to happen. I learned to program. In art class, we used Adobe Flash to create animations. I found myself wanting to create interactions and more complicated motions, so I learned ActionScript. As long as I could find tutorials with several examples, and relatively little text between them, I could get my animations to do what I wanted. And so by accident, I found an entire world inside of ActionScript for doing all kinds of neat things. I could even make video games!

Because of all the time I was spending trying to follow tutorials, my ability to read slowly improved. After I got to college and decided to major in Computer Engineering, a really great thing happened. Someone suggested a book for me to read. The book was called Outliers. It was the fastest I've ever read a book and probably the first book I ever read that I truly enjoyed reading. So I read Blink and The Tipping Point just as quickly after that. Somehow reading wasn't a struggle while reading those books. It was like riding a current downstream, instead of feeling like a fight upstream. I reached some sort of critical mass, where the enjoyment factor outweighed the chore.

There's a lot of research in the area of neuroplasticity that suggests increasing remedial reading can offset the effects of dyslexia. Let me say that another way: reading more makes you less of a dyslexic. The prevailing wisdom on dyslexia up until recently has basically been that if you've got it, you're stuck with it. Which makes sense, since the one thing dyslexics don't want to do is read more, so they never improve.

It seems more and more that reading and writing are like the internet itself, and being better at both is analogous to improving your download and upload bandwidth. I feel very strongly that reading and writing on the web isn't just a fad. It's going to be the primary communication protocol for us humans on the internet for a very long time to come. It should be our priority, as a society, to teach the fundamentals of communication above all else in school. As long as kids are able to communicate, they'll gravitate toward their interests and fields of choice on their own, teaching themselves along the way.

Zookeeper with node.JS on OS X - Part 2: Setting up the Node.JS Client

July 16, 2012

At first I thought node-zookeeper was the best module out there for working with zk and node but the API is gross. I was glad to find out I wasn't alone. Mark Cavage from Joyent was kind enough to write a wrapper around it that makes it feel more like the native node file system api. You can find it here: https://github.com/mcavage/node-zkplus or just install it via npm.

  npm install zkplus

then just write your client code. Here's the usage example taken from the github project:

var assert = require('assert')
  , zkplus = require('zkplus')

var client = zkplus.createClient({
  servers: [{
    host: 'localhost'
    , port: 2181
  }]
});

client.on('connect', function () {
  client.mkdirp('/foo/bar', function (err) {
    assert.ifError(err);
    client.rmr('/foo', function (err) {
      assert.ifError(err);
      client.close();
    });
  });
});

Zookeeper with node.JS on OS X - Part 1: Installing Zookeeper

July 16, 2012

To get started, you'll need to install zookeeper. I'll assume you're using OS X, in which case you can install zookeeper via Homebrew.

brew install zookeeper

if that breaks because of a permissions issue mentioning this directory:

/usr/local/var/run/zookeeper

just go ahead and create that folder using sudo:

sudo mkdir /usr/local/var/run/zookeeper

after that, make sure to change the owner to the current user

sudo chown "$(whoami)" /usr/local/var/run/zookeeper

Now just try running brew install zookeeper again. zookeeper should now be installed in:

/usr/local/Cellar/zookeeper/

You'll also want to setup the zookeeper configuration file in

/usr/local/etc/zookeeper/zoo.cfg

I was able to just copy the example cfg to zoo.cfg from within that directory. tl;dr, your cfg file should look like this: https://gist.github.com/3126340

To clean up all this mess, I also added a small shell script that I put in /usr/local/bin that looks like this: https://gist.github.com/3126356. The script is there so that zookeeper will find our cfg file and use the proper directory to store its data. Make sure to also set the file mode to executable using chmod +x zookeeper. This will also let us do:

zookeeper start

to start the server or

zookeeper stop

to stop it. yay!!