the programmer language,

txus /t∫us/
general enthusiast

  1. Programmer based in Berlin.
  2. Obsessed with programming language design, compilers, virtual machines. Some things I like working on:
    • Lambra: Distributed programming language inspired by Erlang and Clojure.
    • TerrorVM: Language-agnostic VM for dynamic programming languages.
  3. Co-founder of Codegram and conference organizer (Barcelona Ruby Conference, Barcelona Future JS).
  4. Musician (MRKL).
  5. Writer (Breves sinsentidos, prose in Spanish)
  6. Successful astronaut.

I'm @txustice on Twitter and txus on GitHub.


Jun 28, 2014
A thought experiment: teaching programming

This post was going to be a couple of tweets, but I guess it got out of hand. Let's make a little thought experiment and forget what we know about teaching programming.

The thing about teaching is that it is usually incremental over generations of teachers-learners: we teach very much in the way we were taught, with some improvements though, maybe some new metaphors, adding some new insights. But we stick to what we know worked when we were being taught.

An example: I know nothing about parenting myself, but if I were to raise future children, I would definitely start from how my parents raised me. A pervasive reading culture, music and books being omnipresent, musical instruments, cats, dogs and other pets.

None of these elements necessarily contribute to good parenting, but that's my reference point, all I know.

Back to teaching programming. While teaching at a number of Rails Girls events I gained a couple of insights about how absolute beginners learn programming. Their typical beginners workshop consists of following a tutorial where the learner builds a small web application with Rails. The steps are pretty detailed and carefully designed to provide wow moments periodically. Reinforcing feedback is seemingly invaluable to beginners.

Some abstract concepts, however, are more difficult to teach. As the beginners are taught how to poke the system to accomplish what they want, understanding underlying notions such as Model View Controller or the request lifecycle is necessary to build an accurate mental model of what they are building. And so I learned a lot just by trying different ways of explaining such concepts, seeing what kind of metaphors made them click better.

As I've stated before, the way I teach programming is not very different from the way I learned it. The only thing I do differently is trying to come up with better metaphors thanks to my wider perspective. I never thought about whether that is actually the best way to learn programming. I just know it was "good enough" for me, and "good enough" almost invariably means being stuck in a local maximum.

Let's now look at some common assumptions when teaching beginners to program.

Common assumptions when teaching programming

When we teach beginners to program, programming is usually defined somewhat along the lines of describing precise instructions for the computer to execute in order. That means we start with the notion of an algorithm.

A common metaphor, for example, is that of a kitchen recipe. From there it's easy to teach the most primitive way of code reuse, namely the goto and if statements. Those two concepts are naturally shown in the form of a for loop (for n potatoes, peel a potato). This is more generally defined as control flow.

Then we may introduce the concept of variables, baskets or drawers where we can store values and retrieve them. We are teaching about memory, more specifically mutable memory.

And so the very first thing we are telling a beginner when learning to program is that programming is about algorithms, control flow and mutable memory. But is programming really about that?

Programming is better than that

Seasoned professional programmers will usually answer no. They'll come up with a much higher level perspective, throwing in words such as design, architecture, interacting components, modularity. None of this is even mentioned to the beginner.

Chris Granger recently picked an interesting definition of programming: data transformation.

What if we taught beginners that programming is about transforming data into other data? Or if we told them, more specifically: programming is about representing a real-world problem with data in the computer, then describing transformations on this data until its shape solves the problem.

With this definition, beginners approach programming not as an imperative kitchen recipe, but as a higher-level, declarative description of a problem in terms of data and its transformations.
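To make the contrast concrete, here is a minimal sketch in Ruby. The basket data and variable names are made up for illustration; the first half is the "kitchen recipe" version with a loop and a mutable counter, the second half describes the same result as a chain of transformations.

```ruby
# Hypothetical data: a shopping basket represented as plain hashes.
basket = [
  { name: "potato",  price_cents: 30,  quantity: 4 },
  { name: "onion",   price_cents: 50,  quantity: 2 },
  { name: "saffron", price_cents: 900, quantity: 1 }
]

# Imperative recipe: step through the items, mutating a counter.
total_imperative = 0
for item in basket
  total_imperative += item[:price_cents] * item[:quantity]
end

# Data transformation: each step turns data into other data.
total_declarative = basket
  .map { |item| item[:price_cents] * item[:quantity] } # items -> line totals
  .sum                                                  # line totals -> total

puts total_imperative  # 1120
puts total_declarative # 1120
```

Both versions compute the same number, but only the second one reads as "basket, then line totals, then total".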

But as good as this sounds, the current tools are not good enough to accomplish that. The cognitive load associated with using a command-line, a text editor, a compiler and a REPL for the first time is too much, even for teaching for loops. We need better tools for teaching this kind of high-level programming.

Chris Granger's Aurora seems not only like a very interesting and novel approach to professional programming, but also a very good fit for teaching it. Go watch the video if you haven't yet. I'm really looking forward to its public release and the ripple of new ideas that it will bring.

Conclusion

I would really like to experiment with such an approach to teaching programming as data transformation, without a single mention of variable assignment or for loops. I wonder what kind of effect it would have on the following generations of programmers.

And if you are currently teaching somebody to program, why not give it a shot yourself?


Jun 22, 2014
On dogma

Two weeks ago I published some thoughts about Apple's latest programming language explaining why it let me down through a seemingly contrived analogy. In that post I derived some arguments from a dogmatic principle I hold, namely:

All corporations, big or small, have a moral imperative to advance the state of the art of their industry. The bigger the corporation, the more absolute resources it is expected to dedicate to that end. This makes leading corporations in a given industry very likely to direct the most significant paradigm shifts in it.

A few people asked me on Twitter why I hold that principle. I replied and told them that it is a dogma, so there is no why. A dogma is the belief in an axiom, a self-evident proposition. It cannot be derived from other logical propositions. Other propositions may stem from it though.

One of them promptly dismissed the claim as untruthy, to be ignored, as soon as he read the word dogma. Let me explain why I think dogma is not a bad thing by definition, and why it should not be a reason to systematically dismiss an argument.

Dogma and belief systems

Back in college I used to have very interesting discussions with my friend Cuauhtémoc. We were both majoring in Political Science at that time. Even though we held quite different cosmovisions, we were able to inspect the differences in our respective political points of view very calmly, using what we knew about philosophy and history of political thought.

By deconstructing our arguments to understand why we had a specific opinion on a subject, we almost invariably reached the roots of our respective belief systems -- their axioms. In our case, the conflicting axiom from which all the completely different world views stemmed was almost always the following: I believed that the individual should prevail over the group, and he believed exactly the opposite.

And we knew that we had reached our dogmatic axioms because we could no longer answer the question: why?

In conclusion, any belief system needs a set of axioms, or dogmatic propositions. Those are the very root of it. And they can't be judged truthy or falsy, they can just be believed or not.

Dogma is not a matter of truth -- it is a matter of faith. In my opinion, that doesn't make it dismissable as myth, but rather useful to understand the logical roots of our reasoning.


Jun 10, 2014
How Apple failed miserably to get us closer to the Moon: What's wrong with Swift?

A few days ago Apple unveiled a new programming language of their own: Swift. As I see it, it is a great step up for developers of the Apple platforms (iOS and OSX). Compared to Objective-C, it seems like a fairly modern language with a nice compromise between safety and expressiveness. Apple seemed to give special attention to its tooling and user experience (its so-called Interactive Playgrounds). All developers for the Apple platforms should be seriously excited.

However, the announcement of Swift made me upset for other reasons. This post is an attempt to structure my thoughts and explain what made me upset and why.

First of all, my logic stems from a dogmatic principle I hold. That principle is:

All corporations, big or small, have a moral imperative to advance the state of the art of their industry. The bigger the corporation, the more absolute resources it is expected to dedicate to that end. This makes leading corporations in a given industry very likely to direct the most significant paradigm shifts in it.

Humans on the moon

I took this analogy from an article about the development of AI technology, and liberally adapted it. Unfortunately I don't remember what the original article was anymore. Let me know if you do.

Let's imagine that putting a human on the moon is a big deal for human advancement (and let's ignore that it already happened). Let's imagine as well that there's a big corporation who has the resources and the talent to figure out how to do it. We will call this fictional corporation Happle.

Taking the dogmatic principle I stated earlier as truth, we can say that Happle has a moral imperative to try to put a human on the moon. There are at least two ways Happle can fail to fulfill that moral imperative. It can either choose to ignore it, or (even worse) deceptively pursue a shorter-term goal instead, marketing it as if it were an attempt to put a human on the moon.

Doing the former is common among some corporations whose goals dismiss long-term thinking in favor of shorter-term ROI.

Choosing the latter, however (masking a short-term goal as a long-term, game-changing endeavor) is pure evil, and is motivated only by self-aggrandizing brand marketing. Let's imagine a scenario where our fictional Happle chose to pursue this path.

Wooden crates

Imagine that Happle presents its latest product: a beautifully designed wooden crate. Dozens of possibilities come to mind quickly with this new product: people can pile a few of those crates, stand on them and reach the higher walls of their 2-floor houses to paint them. It's great! The wood Happle chose for their product is the best of breed, and their varnish is a patented formula that even most woodworkers can only dream of.

It's all good until you see the advertisement on Youtube, featuring a middle-aged man piling a bunch of crates up to a few meters high, then standing on top of it. Then the product motto appears:

A little bit closer to the moon.

Back to Apple

Returning from the analogy, Swift is the beautiful wooden crate and its broad spectrum of usefulness. Even though it is a good tool, much better than many present mainstream programming languages, it is not "a product of the latest research in programming languages". It doesn't bring us any closer to the moon. You just can't keep piling wooden crates on top of each other until you reach the moon -- you need to dedicate millions to research in physics and engineering.

Then again, a good tool is a good tool, and I'd be glad if any developer like me or my friends had developed and marketed it as what it is. However it didn't take me even 5 minutes reading the Swift guide until I found an example of a for loop mutating a counter, which made it look like any other Algol-derived programming language from the 80s or the 90s.

Swift is just a better hammer. It doesn't represent a new class of tools. It doesn't reshape the way we think about programming as a whole or even an area of programming, as we should expect from a new tool developed by such a powerful industry leader.

Sadly, Apple isn't the only would-be world-changer to leave us cold. Facebook and Google's Hack and Go are also just better hammers. And people love having better hammers for their nails. But as long as the expectations of our industry remain this low, a paradigm shift is not likely to occur.

If Apple, Google and Facebook spend their resources and talent as industry leaders in making better hammers and more beautiful wooden crates, how are we ever going to reach the moon?

(Big thanks to Chad Fowler for proofreading and feedback on this article!)


May 03, 2014
Why I am excited about Clojure

I've been meaning to write about Clojure for some time now. Unfortunately, as it often happens, I felt the urge to rewrite my entire blog in Clojure first, and that delayed me a bit. So let's get to it!

My first real language was Ruby. I still do Ruby every day when doing client work, mostly Rails apps. I still use Ruby to prototype lots of things -- it's a pretty nice language, although it requires a great deal of discipline on the programmer side to avoid common pitfalls.

I had been looking at Clojure for a while now, read a couple of books, watched a couple of videos, but mostly just played around with it. In the past few weeks I got the opportunity to write a small, simple service in Clojure for my current client, and that made all the difference. From there on, prototyping and writing other things with Clojure felt so much more natural. Here are some thoughts and opinions that I got from that experience and why, right now, Clojure is my favorite practical programming language. Bear in mind that I'm neither a Clojure expert nor am I claiming any of this as fact -- these are just my perceptions from having used it for a relatively short time in a rather narrow set of problems.

It feels designed

Coming from Ruby, that's one of Clojure's most shocking traits. Clojure took 2 solid years to be designed (before being even built). And one can tell from the first impression already.

When I start using a language, there are usually some situations where I can't understand why my code isn't doing what I expect it to do. When that happens to me in a language like JavaScript, for example, finding out what the problem was is generally a very frustrating experience -- mainly because when I finally ask a more experienced JS developer they tell me:

"Hahaha of course! You see, in JS there's this quirk. You have to work around it by doing so and so."

And then I lose it. What I "learn" when that happens is a workaround. It doesn't feel like learning.

In Clojure on the other hand, I end up finding out I was doing it wrong (for example, dealing with lazy sequences as if they were normal sequences), and that the language provides a way to do it right which is also cleaner. It feels as if I had asked Rich Hickey (the creator of Clojure) and he had told me:

"Hahaha of course! Think again -- everything works like it's supposed to. You're using the wrong function or the wrong data structure."

That truly does feel like learning, and it makes me happy.

It is extremely concise and elegant

Clojure is not a pure functional programming language, which makes it easier to learn for everybody in general. But it is still very functional, and that makes for very elegant, concise and powerful programs.

I find myself building programs from very small, reusable functions, nicely composed together. Its super simple module system makes it really easy to reason about both my program modules and their dependencies.

Plus, whenever some interface I've built feels awkward or there's some duplication, often times I find that, while thinking and trying to refactor it, the language tries to drive me to the cleanest solution. It feels completely the opposite of fighting a language. It is there to help you reach the cleanest, most elegant solution. For me, that is unprecedented, having used Ruby, C, and JavaScript quite a lot.

The workflow is awesome

In my opinion, one of the worst problems in the craft of programming is how we waste our brain power with terribly slow feedback loops. The common workflow in Clojure aims to fix this. Note for the reader: this will not surprise you if you're used to Lisps.

With Clojure your editor (be it Vim, Emacs, Light Table...) is permanently connected to a live REPL. You continually develop, test and modify functions with subsecond feedback. Continuously. All the cores in your brain are lit, as you have literally no time to think about anything else. That's not only deeply satisfying, but also leads you to certain thought paths that slow feedback and its inevitable lower focus would have simply blocked. That's one of my favorite parts of Clojure.

So, those are my current thoughts and feelings about Clojure, summarized. If you haven't tried it, I highly recommend that you do: check out their website to learn how to get started.


Nov 06, 2012
Traitor - an implementation of traits for Ruby 2.0

Refinements are the most buzzed-about new feature in Ruby 2.0. Admittedly, they're probably a bad idea. But honestly I couldn't resist trying them to implement traits!

What are traits?

Traits are like Ruby modules in the sense that they can be used to define composable units of behavior, but they are not included hierarchically. They are truly composable, meaning that they are pieces that must either fit perfectly, or the host object must provide a way for them to fit, normally resolving conflicts by explicitly redefining the conflicting methods.

Since I first read about traits, I found them better than Ruby mixins. That's why I implemented them natively in Noscript, my programming language running on the Rubinius VM. But having traits in our beloved Ruby turned out to be less trivial than expected.

A while ago I tried to implement traits with pure Ruby and gave up. The problem basically was the way in which a Ruby module is included in a class or extended in an object. One of the power features of traits is the explicit conflict resolution between conflicting implementations of the same method, and that turned out to be a pain in the ass with modules, so I gave up for a while.
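To illustrate the problem with plain modules, here is a small sketch using only standard Ruby. The module and class names mirror the examples used later in this post, but here they are ordinary mixins: when two modules define the same method, the one included last silently wins, and nothing forces the host class to resolve the conflict.

```ruby
module Colorable
  attr_accessor :color

  def ==(other)
    other.color == color
  end
end

module Shapeable
  attr_accessor :sides

  def ==(other)
    other.sides == sides
  end
end

class Rectangle
  include Colorable
  include Shapeable # included last, so its #== silently shadows Colorable's
end

blue, red = Rectangle.new, Rectangle.new
blue.color, red.color = :blue, :red
blue.sides, red.sides = 4, 4

p Rectangle.ancestors.take(3) # => [Rectangle, Shapeable, Colorable]
p blue == red                 # => true: only sides are compared, color is ignored
```

Swapping the two include lines changes the answer to false, which is exactly the kind of implicit, order-dependent behavior that traits are meant to avoid.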

Introducing Traitor

So when I heard that MRI 2.0 had a release candidate with refinements, I thought: well let's give it a try. FUN!!!

And so I did! Traitor is the result. Let's see how it works:

Let's say we want to have Rectangle objects that have color and shape. Those two behaviors will be composed as traits. Let's start with Colorable:

Colorable = Trait.new do
  attr_accessor :color

  def ==(other)
    other.color == color
  end
end

Easy. For now, Colorable only knows how to compare itself to other Colorable objects. Let's try and use it from Rectangle:

class Rectangle
  uses Colorable
end

blue, red  = Rectangle.new, Rectangle.new
blue.color = :blue
red.color  = :red

blue == red
# => false

Now let's implement the Shapeable trait:

Shapeable = Trait.new do
  attr_accessor :sides

  def ==(other)
    other.sides == sides
  end
end

Shapeable knows how to compare itself to other Shapeable objects, through the number of sides that it has.

Our Rectangle needs to be both. The problem is that if we use both traits, since they have no hierarchy, a rectangle won't know how to respond to #==. What implementation should it use, the Colorable or the Shapeable? No way of knowing. When in doubt, Rectangle will always raise a trait conflict error:

class Rectangle
  uses Colorable
  uses Shapeable

  # A Rectangle has 4 sides, thank God.
  def sides
    4
  end
end

Rectangle.new == Rectangle.new
# TraitConflict: Conflicting methods: #==

Resolving conflicts explicitly

We must provide a mechanism to resolve the conflict in Rectangle, our host class. Fortunately, it is as easy as defining our own version of #==:

class Rectangle
  uses Shapeable
  uses Colorable

  # A Rectangle has 4 sides, thank God.
  def sides
    4
  end

  def ==(other)
    colorable_equal = trait_send(Colorable, :==, other)
    shapeable_equal = trait_send(Shapeable, :==, other)
    colorable_equal && shapeable_equal
  end
end

Now a Rectangle knows how to compare itself to other rectangles, via both its shape and color.

The cool thing is that we have granular access to any implementation of our traits via trait_send. That allows us to compose all implementations, ignore some, or do whatever we want with them.
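For readers who want to poke at the idea without Ruby 2.0 refinements, here is a toy sketch of explicit conflict detection in plain Ruby. It is not how Traitor is implemented: ToyTrait, ToyTraitHost and the call method are hypothetical names invented for this example, and they only imitate the behavior shown above.

```ruby
# Toy sketch, NOT Traitor's implementation. All names here are hypothetical.
class TraitConflict < StandardError; end

class ToyTrait
  def initialize(&body)
    @behavior = Module.new(&body) # the block's methods live in an anonymous module
  end

  def method_names
    @behavior.instance_methods(false)
  end

  # Run this trait's own implementation of `name` on `receiver`.
  def call(receiver, name, *args)
    @behavior.instance_method(name).bind(receiver).call(*args)
  end
end

module ToyTraitHost
  def uses(trait)
    @traits   ||= []
    @dispatch ||= Module.new.tap { |mod| include mod }
    taken = @traits.flat_map(&:method_names)
    @traits << trait

    trait.method_names.each do |name|
      if taken.include?(name)
        # Two traits define the same method: refuse to pick one.
        @dispatch.define_method(name) { |*_args| raise TraitConflict, "Conflicting methods: ##{name}" }
      else
        @dispatch.define_method(name) { |*args| trait.call(self, name, *args) }
      end
    end
  end
end

Colorable = ToyTrait.new do
  attr_accessor :color
  def ==(other)
    other.color == color
  end
end

Shapeable = ToyTrait.new do
  attr_accessor :sides
  def ==(other)
    other.sides == sides
  end
end

class Rectangle
  extend ToyTraitHost
  uses Colorable
  uses Shapeable

  def sides
    4
  end
end

conflict = begin
  Rectangle.new == Rectangle.new
rescue TraitConflict => e
  e.message
end
puts conflict # Conflicting methods: #==

# The host resolves the conflict by defining #== itself and composing both traits.
class Rectangle
  def ==(other)
    Colorable.call(self, :==, other) && Shapeable.call(self, :==, other)
  end
end

blue, red = Rectangle.new, Rectangle.new
blue.color, red.color = :blue, :red
p blue == red # => false
```

The design choice worth noticing is that a method defined in the host class always takes precedence over the dispatch module, so resolving a conflict is nothing more than writing the method yourself.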


Oct 19, 2012
Version your Ruby objects with Aversion

During the past few months I've been often daydreaming about functional programming, persistent data structures, and so on. It's something that probably came from learning a bit of Clojure and getting familiar with traditional concepts of the functional paradigm.

One cool thing that I took from that is the concept of immutability. In programs, mutable state is a rich source of all kinds of problems. For one, your ability to reason about your program becomes impaired -- you cannot trust values anymore. Variables are containers of ever-changing chaos, and especially bad programmers seem to be always out to find new ways of enhancing the insanity of any program through nonsensical mutation.

Now I've developed a bit of this sixth sense, or aversion towards mutation. When I see mutation in code, my danger sense goes nuts. I might just accept it, but I recognize it and question it.

What does all of this have to do with versioning objects? Well not much, apart from the fact that versioning objects is a cool thing you can do when your objects are immutable. Of course after all these random thoughts I needed to code something up to see it in action, and there you go!!

Versioning with Aversion

When you include Aversion in your Ruby objects, every state mutation is explicit and, instead of actually mutating the object, it returns a new instance with the transformation, keeping a history of all the states it went through.

Let's see how it works. Say we have a Person class:

class Person
  include Aversion
  attr_reader :hunger

  def initialize(hunger)
    @hunger = hunger
  end

  def eat
    transform do
      @hunger -= 5
    end
  end
end

See the transform part? Here's an explicit change of state. Instead of subtracting 5 from our current hunger, what it will do is return a new version of the object where this transformation happened. The cool thing is, you can go back too!

So, our Person instances will be immutable. Every mutation must be explicitly wrapped within a transform block, and will return the new instance:

john       = Person.new(100)

new_john   = john.eat
new_john.hunger # => 95

newer_john = new_john.eat
newer_john.hunger # => 90

Of course, you can roll back to a previous state:

new_john_again = newer_john.rollback
new_john_again.hunger # => 95

And finally one of the nicest things is that you can also compute the difference between two versions, expressed as an array of transformations, and apply it onto an arbitrary object:

difference = newer_john - john
newer_john_again = john.replay(difference)
newer_john_again.hunger # => 90
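If you wonder how something like this can work at all, here is a toy sketch of the idea in plain Ruby. It is not Aversion's actual implementation: ToyVersioning and its internals are hypothetical, and it assumes the transformations only touch instance variables holding immutable values such as numbers.

```ruby
# Toy sketch of the idea, NOT Aversion's implementation. Names are hypothetical.
module ToyVersioning
  # Transformations applied so far, oldest first.
  def history
    @history || []
  end

  # The version this one was derived from (nil for the original).
  def rollback
    @previous
  end

  # Run the block against a copy and return the copy; the receiver keeps its state.
  def transform(&change)
    copy = dup
    copy.instance_variable_set(:@history, history + [change])
    copy.instance_variable_set(:@previous, self)
    copy.instance_exec(&change)
    copy
  end

  # Apply a list of transformations, each step producing a new version.
  def replay(changes)
    changes.reduce(self) { |version, change| version.transform(&change) }
  end

  # Transformations present in this version but not in an older one.
  def -(older)
    history.drop(older.history.size)
  end
end

class Person
  include ToyVersioning
  attr_reader :hunger

  def initialize(hunger)
    @hunger = hunger
  end

  def eat
    transform { @hunger -= 5 }
  end
end

john       = Person.new(100)
newer_john = john.eat.eat

puts john.hunger                            # 100
puts newer_john.hunger                      # 90
puts newer_john.rollback.hunger             # 95
puts john.replay(newer_john - john).hunger  # 90
```

The trick is that the block is evaluated with instance_exec on the copy, so the assignment inside it changes the copy's instance variable and leaves the original alone.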

So, if you're curious, just grab the github repo and play with it! You surely can find interesting use cases of immutability and versioning in your own programs.

And also, if you're really curious about persistent data structures but you don't want to learn Clojure just yet, try out Hamster, a Ruby library that implements a ton of persistent data structures.


Oct 13, 2012
Expressing Ruby code in natural language

In my morning shower I'm normally still half asleep so my process of thought is still pretty bizarre and dreamy. This explains why that is when I usually come up with the weirdest ideas.

The other day I thought about the layer of translation that we apply when we read code. Even with a language with a nice syntax like Ruby, we still translate when reading.

Think of the typical situation: you're stuck with a bug. You've already spent 30 minutes staring at the code, unable to detect what's wrong. In your frustration, you ask a colleague to come and figure this out together. The first thing you do is explain the code to your partner out loud: and then you immediately realize where the problem is. This is called rubber duck debugging, because you could have solved the problem by explaining it to a rubber duck on your desk.

Knowing this (because it happened to me seven thousand million times), and thinking simultaneously about the Isla programming language (an educational programming language for young children), I thought about the cognitive difficulty of learning to program. Harder and more complex syntax equals more cognitive load, and that slows down people when learning to program. That's why I think it's by far easier to start learning to program with a LISP, or even with Ruby, rather than with Erlang.

So I coded this up:

Explain, a Ruby source-to-natural-language compiler

Explain is a special kind of a source-to-source compiler: it translates Ruby code to English. This might be used by beginners to gain more insight into what a given piece of code is doing. Let's see an example. Given this Ruby code:

class Person
  def walk(distance)
    @distance += distance
    @hunger += 2
  end

  def eat(food)
    @hunger -= food.nutritional_value
  end
end

When we run explain on it we get this:

$ explain person.rb

Let's describe the general attributes and behavior of any Person.

A Person can **walk**, given a specific distance. This is described as
follows: its distance will be its distance plus what we previously defined as
`distance`. Finally we return its hunger will be its hunger plus the number
2..

A Person can **eat**, given a specific food. This is described as follows:
Finally we return its hunger will be its hunger minus the result of calling
**nutritional_value** on what we previously defined as `food`..
And with this we're done describing a Person.

The quality of the translation is not very good yet, but it's a start.

In the future, Explain will also distinguish builtin Ruby methods (such as map, each, puts) and explain them, so the description of the program will be much more high level. Also, it will be able to output different formats, and it might be a good idea to build a web service using it (so beginners can access it even more easily).

If you're curious about the implementation, it uses the Rubinius builtin parser (Melbourne), which means that it runs only on Rubinius. You can check the code at the github repo and contribute with issues, ideas or whatever! :)
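The general technique, walking a syntax tree and emitting a phrase for each node, can be sketched in a few lines of plain Ruby. This is only an illustration: the node shapes below are made up for the example and are not the ones Melbourne produces, and describe is a hypothetical helper rather than part of Explain.

```ruby
# Toy sketch: hypothetical node shapes, NOT Melbourne's AST or Explain's code.
def describe(node)
  type, *rest = node
  case type
  when :int
    "the number #{rest[0]}"
  when :ivar
    "its #{rest[0]}"
  when :local
    "what we previously defined as `#{rest[0]}`"
  when :op
    operator, left, right = rest
    word = { :+ => "plus", :- => "minus" }.fetch(operator)
    "#{describe(left)} #{word} #{describe(right)}"
  when :assign
    target, value = rest
    "#{describe(target)} will be #{describe(value)}"
  else
    raise ArgumentError, "unknown node type: #{type.inspect}"
  end
end

# `@hunger += 2`, written out as `@hunger = @hunger + 2`
tree = [:assign, [:ivar, "hunger"], [:op, :+, [:ivar, "hunger"], [:int, 2]]]
puts describe(tree) # its hunger will be its hunger plus the number 2
```

Each node type knows how to phrase itself and delegates to its children, which is why the output reads a bit mechanically.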

For now it is pretty basic, but I think it's a good idea to build upon, and might help people who are new to programming and to Ruby.


Apr 15, 2012
The three cancers in the Ruby community

UPDATE for Russian readers: Kyrylo Silin has translated this article into Russian and put it up in his GitHub repo, check it out here!

It's time for a non-technical post. Today I'm going to talk about the part of the developer community I'm most familiar with: the Ruby community. And its three cancers.

First of all, feel free to despise my thoughts wielding the fact that I'm not a long-time member of the developer community. I've been programming for the past three years (four if CSS counts as programming, which I hope it does not). That implies that I'm missing a whole lot of internal insight that you, my dear reader, surely have -- but in my opinion it also gives me the benefit of an external perspective.

In my humble opinion, there are three cancers of the Ruby community that we should get rid of: faction mentality, scarcity of critical thinking, and the damn Pareto rule. (They may apply to other communities as well, but I'm just talking about the one I feel part of.)

Faction mentality

As a non-graduate in Political Science, I've spent a lot of time watching people debating. They would debate about fascism, communism, conservatism, liberalism and whatnot. Except that I just lied: they weren't debating, they were confronting each other with labels. They all had one thing in common: they felt strongly identified with a label and despised the ideas of the rest, even when those ideas were the same as theirs.

That was fun, actually. Whenever I chose to enter those debates, I would confuse everyone by analyzing their ideas separately from their labels. For example, I would talk with radically atheist, left-wing Spanish students about how their ideas (and deeply engraved feelings!) connected perfectly with Catholic morality. Or how the Spanish social security system, which they defended, was founded by a pseudo-fascist dictator named Franco. Most of them just ended up being pissed off at me: they thought I was trying to convince them of something, but I wasn't, because I hadn't even expressed my opinion about those things, just shown them that they weren't thinking for themselves, but just defending a label someone had set up for them.

The thing that would make them really uncomfortable was actually that I had no label to defend. Are you left-wing or right-wing? --they'd ask. None. But that can't be possible! The flavor of strawberry, is it blue or green? Doesn't apply. How do you describe a circle in a one-dimensional reality? You just can't. Because the categories used don't fit the reality being analyzed.

I've seen plenty of this in the Ruby community as well. At first I thought: well, this community is pretty different from regular university students --they're utterly smart people. They don't fall for labels. WRONG!

Authority leaders and social debt

In the Ruby community there are notorious authority leaders. Whenever DHH says "A is X", the community splits in two: YESH and BULLSHIT. And it's not just DHH, but many other community leaders. People feel compelled to take part in one of the two factions, even when the debate itself might be irrelevant. I call this social debt.

Think of it as technical debt, but socially: you borrow the opinion of a faction or its counter-faction, and that makes you feel secure, but you might end up paying for your mistake later on (your mistake being not thinking about this yourself in the first place).

Scarcity of critical thinking

And that leads me to the second cancer: scarcity of critical thinking. It's true that we, as young members of the community, are condemned to reinvent the wheel over and over, to some extent. That's not exactly good, so to avoid this we may take one of these two paths: either learning from older developers' experience, or following trends blindly. Guess which way is easier, and sadly more common.

We seriously need to read a fucking ton more and take advantage of the expertise of older developers. We just can't live anymore with the old "threads suck, reactor pattern will save the world", or "processes suck, threads will save the world", or "multithreading is hard, processes will save the world". Any of these might be partially true, but religiously sticking to one of those is just stupid. Try out things yourself, talk to other people, consider different use cases for every one of them: HELL, it's just such a simple pattern to compare technologies, it applies to almost everything! It just takes a little effort. And there's no shame in telling someone you don't know about a particular topic or you're not sure about what best fits a given problem. Stop sticking to trends blindly, seriously.

The damn Pareto rule

The third and last cancer I've been noticing in the Ruby community, and for the most part in Open Source, is the Pareto rule. 80% of the work is done by 20% of the people, while the other 80% of the people are either passively consuming technology or even worse: blaming their authors and maintainers, criticising their work or in some extreme cases laughing at it. If you feel you're part of the 20% of the people who do the work, put an end to this bullshit. Don't let people step over your work. And most importantly, if you feel you're part of the other 80%, you need to stop whining and start reading, learning, coding and doing something meaningful with your life.

Conclusion

I believe that these three cancers, if they keep growing, won't kill the Ruby community, but they surely will make it a dark, sad place to be in. We need to stop these, start respecting other people's work, learning from it, yet thinking for ourselves. We're supposed to build the future. We should know better.

We're motherfucking programmers, for fuck's sake.


Apr 06, 2012
Learning with Terror... VM

Hello! I promised to write about what I've been learning these past few months, so here I am. Today I'm going to write about TerrorVM. Dragons ahead! Not really. If you don't know the first thing about how Virtual Machines work, then you're at the same point I was a few months ago, so continue reading.

How did I get interested in this?

I started playing with language design last year, mostly thanks to Rubinius being such an easy platform to target. I implemented a Brainfuck compiler and eventually I started building the Noscript programming language, an object-oriented, class-less programming language running on the Rubinius VM.

For the record, building a programming language is certainly one of the most rewarding experiences I've had in the programming world. It's challenging, creative, and extremely fun.

Eventually I wondered: targeting a Virtual Machine is fun, but how do Virtual Machines work? What's all the low-level, dirty work they do underneath all those colourful balloons of joy? So I decided to try and build something simple, just to learn the basics. As a little detour, whenever I want to learn something new in programming I follow these steps:

My way of learning new things

  1. Read about the problem, but just superficially. Imagine how it could work internally, even if you have no idea.
  2. Try and build an extremely simple version of that yourself, intuitively.
  3. Whenever you encounter a design or conception problem, go read some more or ask someone who knows their shit.
  4. Shape your prototype accordingly and try to expand it with new features.
  5. Go to step 3.

It works, trust me.

TerrorVM

The first problem I encountered: I didn't know C. So I went and read Zed Shaw's Learn C the Hard Way book. I learned the basics and moved on!

So I opened a text editor and started coding a simple C program with a while loop and a switch statement. No memory management, no real objects (just integers), no local variables, no literal pool, just basic arithmetic operations on a simple stack data structure.
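To make the "loop plus switch" idea concrete, here is a minimal sketch in Ruby rather than C (so it is not TerrorVM's code). The opcode names and the program are made up for illustration; the only things taken from the description above are the fetch loop, the dispatch on the opcode, and the integer-only stack.

```ruby
# A toy stack machine: a loop fetches one instruction at a time and a
# `case` dispatches on its opcode. Opcode names are hypothetical.
def run(program)
  stack = []
  ip = 0 # instruction pointer
  while ip < program.size
    op, arg = program[ip]
    case op
    when :push
      stack.push(arg)
    when :add
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when :mul
      right = stack.pop
      left = stack.pop
      stack.push(left * right)
    when :halt
      break
    else
      raise ArgumentError, "unknown opcode: #{op.inspect}"
    end
    ip += 1
  end
  stack
end

# (2 + 3) * 4
program = [[:push, 2], [:push, 3], [:add], [:push, 4], [:mul], [:halt]]
p run(program) # prints [20]
```

Everything a real VM adds (objects, locals, a literal pool, memory management) grows out of this same loop.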

At first it was stack-based, but then I took the Lua Virtual Machine (a register-based virtual machine) as a reference and started reading about it. I read The Implementation of Lua 5.0, and from there I just read its source code directly to understand what was going on. The fun thing is: I understood nothing at first, but progressively, and by repeatedly bothering Jeremy Tregunna (thanks Jeremy!!) and reading about the things he'd tell me, I was able to understand a bit more every time I read it.

TerrorVM had been born. Technically it is a register-based, (naively) garbage-collected, stackless Virtual Machine aimed at running dynamic languages compiled down to its own (relatively compact) bytecode format.

A proof-of-concept Ruby-to-Terror compiler

Once I had basic functionality working, I wondered: but I want to see a real language running on my VM! So I wrote a simple Terror compiler: a program written for Rubinius that compiles a subset of Ruby to my bytecode format. Its design is heavily inspired by the Rubinius compiler itself. It's easy to use!

It basically takes this:

a = 123
b = false
if b
  print 'Goodbye world!'
else
  print "Hello world!"
end

And outputs this:

_main
:10:2:4:17
123
"print
"Goodbye world!
"Hello world!
0x1000000
0x51000000
0x9010000
0x51010100
0x50020100
0x21060200
0x30030000
0x2040100
0x2050200
0x80030405
0x20050000
0x30060000
0x2070100
0x2080300
0x80060708
0x8090000
0x90090000

Then you can put this into a .tvm file and run it with TerrorVM! (Spoiler alert: it will print Hello world! to the standard output). Read the Readme in the Github repo to learn more about how to try it yourself.
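If you are curious how such hexadecimal words could be picked apart, here is a small sketch. It rests on an assumption that this post does not confirm: that each 32-bit word packs one opcode byte followed by three one-byte operands. The field names (opcode, a, b, c) are hypothetical, and no meaning is assigned to any opcode value.

```ruby
# ASSUMPTION: a word is laid out as [opcode, a, b, c], one byte each.
# This layout is a guess for illustration, not TerrorVM's documented format.
def decode(word)
  [(word >> 24) & 0xff, (word >> 16) & 0xff, (word >> 8) & 0xff, word & 0xff]
end

# Three of the words from the listing above.
[0x1000000, 0x51000000, 0x80030405].each do |word|
  opcode, a, b, c = decode(word)
  puts format('0x%08x -> opcode 0x%02x, operands %d %d %d', word, opcode, a, b, c)
end
```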

Future plans

My idea is that TerrorVM should be able to run on both desktop computers and Android devices, since it would be really cool to have dynamic languages on Android.

I'd also like to have a decent, generational Garbage Collector, but blah blah now I'm just being boring. Whatever comes to my mind as I learn further!

Anyway, take a look at the Github repo if you like!

Conclusion

As you can see, nearly every topic you'd like to learn about is approachable if you take it easy, aren't afraid of it, and just do it. TerrorVM may well not be the Virtual Machine of tomorrow, but what I'm learning while building it -- that I'll keep forever.


Feb 27, 2012
Mutation testing with Mutant

Hi everyone! I know it has been a while since my last post (more than half a year). I've spent this time learning a ton of interesting stuff, and done most of my writing on the Codegram blog. I'm preparing a series of blogposts about what I've learned these past few months, so expect a bunch of updates to my RSS feed :) Anyway, today I'm going to talk about Mutant.

What is Mutant?

As stated in the project description:

Mutant is a mutation tester. It modifies your code and runs your tests to make sure they fail. The idea is that if code can be changed and your tests don't notice, either that code isn't being covered or it doesn't do anything.

The first time I read about mutation testing, it was in the RSpec book. It talked about Heckle, a mutation testing library that runs on Ruby 1.8.x.

Last year Justin Ko (from the RSpec core team) started a rewrite running on Rubinius: Mutant was born!

A few weeks ago Justin invited me to partner up on the project, and since he's got a lot of work to do on the upcoming RSpec 3, I eventually took over Mutant to bring it to 1.0!

In fact today I released 0.1.1, which is pretty much usable, and I encourage you to try it in your projects! It works in rubinius-head, either 1.8 or 1.9 mode, and for now it supports only the RSpec testing framework (although support for MiniTest and other frameworks will land in Mutant very soon).

Anyway, let's see a practical example about how to use it.

Mutation testing

Mutation testing consists of programmatically modifying the code of a method and asserting that the tests consequently fail.

First of all, let's install Rubinius head and the Mutant gem. To do it with RVM, just type on the terminal:

$ rvm install rbx-head
$ rvm use rbx-head
$ gem install mutant

Now imagine you had this Worker class in a file and another file with its spec:

# worker.rb
class Worker
  attr_reader :job

  def get_job
    @job = :some_job
  end
end

# worker_spec.rb
$: << '.' # Hack for 1.9 mode
require 'rspec/autorun'
require 'worker'

describe Worker do
  before do
    @worker = Worker.new
  end

  it 'is free by default' do
    @worker.job.should be_nil
  end

  context 'when it has a job' do
    it 'should have :some_job' do
      @worker.get_job
      @worker.job.should_not be_nil
    end
  end
end

Let's run Mutant now:

$ mutate "Worker#get_job" worker_spec.rb

It will first run the tests, making sure they pass, then mutate the Worker's #get_job method, and run the tests again. In this case, the method is simple, so it performs a single mutation, as we can see in the output:

Mutating line 6
  @job = :some_job >>> @job = :BcNKAyqRSfzMPGeiLmGekgAIxYua

And the test still passes after that, because it isn't checking for @job being specifically :some_job, but just any truthy value. This means Mutant will fail, because your test isn't asserting what it should.

If we changed the test to check that @worker.job.should eq(:some_job), then Mutant wouldn't complain and life would be good :)
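The same reasoning can be replayed by hand, without Mutant or RSpec. The sketch below is only an illustration: MutatedWorker, WEAK_CHECK, STRICT_CHECK and killed? are hypothetical names, and the "mutation" is written manually instead of being generated.

```ruby
class Worker
  attr_reader :job

  def get_job
    @job = :some_job
  end
end

# A hand-made mutant: the symbol literal is swapped for another one,
# which is the kind of change shown in the output above.
class MutatedWorker < Worker
  def get_job
    @job = :not_the_original_job
  end
end

# Two checks standing in for the two versions of the spec.
WEAK_CHECK = lambda do |worker|
  worker.get_job
  !worker.job.nil?
end

STRICT_CHECK = lambda do |worker|
  worker.get_job
  worker.job == :some_job
end

# A mutant is "killed" when the check passes on the original code
# and fails on the mutated code.
def killed?(check)
  check.call(Worker.new) && !check.call(MutatedWorker.new)
end

puts "weak check kills the mutant:   #{killed?(WEAK_CHECK)}"
puts "strict check kills the mutant: #{killed?(STRICT_CHECK)}"
```

The weak check lets the mutant survive, which is exactly the situation Mutant reports as a failure.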

How can you help out?

Now that Mutant 0.1.1 is out, the best way to help out is to try and run it on your projects. For now those should be projects using RSpec, but soon I'll implement support for other frameworks (or maybe you will!). So you know what to do:

$ rvm install rbx-head
$ rvm use rbx-head
$ gem install mutant

And MUTATE ALL OF THE METHODS!

If you want to contribute code or even just suggestions, I suggest you check out the development roadmap. It is open to comments and votes, so you can see what's going on and participate.

I hope you guys enjoyed reading this post. See you on Github!


Jun 18, 2011
Rexpl - an interactive bytecode console for Rubinius

A bit of a background

Since the beginning of the year I've been gradually becoming more and more interested in language design. Last December I implemented Brainfuck, the most trivial Turing-complete programming language, in pure Ruby.

In January I discovered parslet, a really smart piece of software written by Kaspar Schiess. Having tried a different approach at generating PEG parsers before, I found parslet much smarter and friendlier to use, so I decided that the best way of giving it a try would be rewriting my Brainfuck implementation with it.

Since I liked it very much, in February I started an attempt to implement the Scheme programming language in Ruby, using what I already knew about parslet and trying to figure out the rest (not without the help of this awesome book and a refreshing trip to Paris). Although it's far from finished, it turned out to be an enlightening experience and it deepened my interest in language design.

Rubinius

A few months earlier I had become aware of Rubinius, which up until that point was nothing more than a strange word that came up whenever I typed rvm list known in the terminal. It turned out to be an amazing project to implement Ruby... in Ruby itself! Taking advantage of modern research about language design, they are working hard to make it both faster and more user-friendly than the most widely used implementation, MRI.

That was last October, when I attended Ruby And Rails Conf 2010, a really awesome Ruby conference held in Amsterdam, where there were not one but two talks about Rubinius. The first one was given by Dirkjan Bussink about the Rubinius VM itself. The other was given by Christopher Bertels, and he talked about Fancy, his own programming language, which was bootstrapped on the Rubinius VM.

I found those two talks really inspiring, and in May I rewrote (again) my Brainfuck implementation targeting the Rubinius VM. It was a lot of fun! I also freaked out a bit when I realized that every programming language, well, everything in computing, could be represented by a stack and a memory heap. Man! My educational background is in Political Science, where everything is sooo high level :) So that was kind of shocking to me.

Pen and paper

While tinkering with the Rubinius VM, I found it useful to have pen and paper around whenever I was coding. Given a set of VM instructions, I'd go instruction after instruction, drawing what the stack should look like at each step. I'd freak out at net stack underflow errors, try to follow along the instructions that were being executed, keeping track of the stack size and spotting that extra (or missing) pop. As a last resort, I'd also go crying in the #rubinius IRC channel on Freenode :)

Once, in that very channel, I asked if there was a way to print the stack after a particular instruction. Since there wasn't, I started reading the Rubinius source code and came up with a short script that seemed to do exactly that, except that it didn't work at all, so I abandoned the idea, drowned in sadness.

Then Jeremy Tregunna, the guy behind Metis (an implementation of Io targeting the Rubinius VM), tweeted that he had tried to do the same, and that he wanted to take it further, like an IRB for bytecode. I thought this was even better than what I had in mind, so I started hacking on it and came up with rexpl, a REPL for Rubinius bytecode.

Installing and playing with rexpl

Rexpl is built to be a fun tool to use when learning how to use Rubinius bytecode instructions, for example when bootstrapping a new language targeting the Rubinius VM for the first time.

Its main feature is stack introspection, which means you can inspect what the stack looks like after each step of your instruction set.
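To give a feel for what stack introspection means, here is a tiny stand-alone sketch. It is not rexpl's implementation and it does not use real Rubinius instructions: the four instruction names, OPERANDS_NEEDED, trace and draw are all hypothetical. It simply replays a program and keeps a snapshot of the stack after every step, complaining if an instruction needs more values than the stack holds.

```ruby
# Hypothetical mini instruction set with the number of stack values
# each instruction consumes. These are NOT Rubinius bytecode names.
OPERANDS_NEEDED = { push: 0, pop: 1, dup: 1, add: 2 }.freeze

# Returns one [instruction_name, stack_snapshot] pair per instruction.
def trace(instructions)
  stack = []
  instructions.map do |name, arg|
    needed = OPERANDS_NEEDED.fetch(name) do
      raise ArgumentError, "unknown instruction: #{name.inspect}"
    end
    raise "stack underflow at #{name}" if stack.size < needed

    case name
    when :push then stack.push(arg)
    when :pop  then stack.pop
    when :dup  then stack.push(stack.last)
    when :add  then stack.push(stack.pop + stack.pop)
    end
    [name, stack.dup]
  end
end

# Prints the stack after each step, one line per instruction.
def draw(instructions)
  trace(instructions).each do |name, stack|
    puts format('%-5s | %s', name, stack.inspect)
  end
end

draw([[:push, 1], [:push, 2], [:add], [:dup], [:pop]])
```

This is the pen-and-paper exercise from the previous section, automated.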

After you have installed Rubinius (you can find how to do so on their website), open a terminal and type:

$ gem install rexpl
$ rexpl

Now you should see a welcome banner and an IRB-like prompt, and you're good to go! Just start typing some VM instructions and see what happens.

There are three extra commands to take advantage of the stack introspection:

  • list lists the instruction set of the current program.
  • reset empties the instruction set and starts a new program.
  • draw prints a visual representation of the stack after each instruction of your program.

Here's a screenshot of what it looks like:

rexpl in action

Contribute!

The first version of Rexpl was just released yesterday, so it should contain a number of unforeseen use cases and unexpected behaviors.

If you have never played with the Rubinius VM, now it's time to start! You can tinker around, make yourself comfortable and who knows, maybe you'll be the author of the next mainstream programming language!

And if you already are a language master and dream in bytecode, you can surely find a lot of edge cases where rexpl would need more work, which I will surely be glad to do.

Make sure to check out the Github repo and create a bunch of issues! I love them :)


Jan 06, 2011
Micetrap - Catch evil hackers on the fly!

What if you could set traps for hackers and script kiddies trying to scan ports on your computer?

As any valuable hacker knows, better information leads to better attacks. Therefore, the first thing any potential attacker will do is collect information about her victim: your machine. This can be performed thanks to some really impressive port scanning tools, Nmap probably being the most popular among them.

Ok, but what is micetrap?

Micetrap opens a server on either a given or random port, emulating fake vulnerable services. Port scanners such as Nmap, when fingerprinting ports to discover service names and versions, will get apparently legitimate responses from common services such as FTP, HTTP or MySQL servers, therefore misleading potential attackers with false information.

Depending on the operating system you are using, micetrap will try its best to look plausible by choosing the appropriate fake services and versions to emulate. Whenever possible, micetrap will provide slightly outdated versions which are more likely to be vulnerable, thus making the attacker focus on those ports. While the attacker tries to exploit these ports, she is essentially sending certain packets -- which get properly captured and logged by micetrap. This information might be useful to discover what kind of attacks are being tried against your machine, therefore giving you time and the opportunity to defend appropriately.

Running micetrap with sudo will allow it to use default, unsuspicious ports, which may give you an advantage in tricking a smart attacker.
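The core trick fits in a few lines of Ruby. The sketch below is not micetrap's code: FakeTrap and its methods are hypothetical, and the banners are made up, so there is no claim that a real scanner would recognise them. It only shows the shape of the idea: pick one fake banner at startup, greet every client with it, and log whatever the client sends.

```ruby
require 'socket'

# Made-up banners for illustration only; they are not taken from
# micetrap or from Nmap's probe database.
FAKE_FTP_BANNERS = [
  "220 example-ftpd 1.0 ready\r\n",
  "220 Sample FTP server (version 0.9) ready\r\n"
].freeze

class FakeTrap
  attr_reader :banner, :log

  def initialize(banners)
    @banner = banners.sample # picked once, so repeated scans agree
    @log = []
  end

  # Greets one connected client, then records whatever it sent
  # (an empty string when the client sent nothing within 0.2 seconds).
  def handle(client, peer = 'unknown')
    client.write(@banner)
    ready = IO.select([client], nil, nil, 0.2)
    probe = ready ? client.read_nonblock(1024, exception: false) : ''
    probe = '' unless probe.is_a?(String)
    @log << "probe from #{peer}: #{probe.inspect}"
    probe
  end

  # Accepts connections forever on the given local port.
  def serve(port)
    server = TCPServer.new('127.0.0.1', port)
    loop do
      client = server.accept
      handle(client, client.peeraddr.last)
      client.close
    end
  end
end

# To try it by hand: FakeTrap.new(FAKE_FTP_BANNERS).serve(8765)
```

Choosing the banner in the constructor rather than per connection is what keeps the fake service consistent from one scan to the next.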

An example

First we need to install micetrap as a gem:

$ gem install micetrap

...or, if you want to be able to use it with sudo:

$ sudo gem install micetrap

Micetrap currently runs on Ruby versions 1.8.7 and 1.9.2.

Then we fire up the server with some fake service, such as an ftp server:

$ micetrap ftp --port 8765

If everything is ok, you will see something like this:

(some timestamp) ::: Ftp trap listening on ::ffff:0.0.0.0:8765

TL;DR: Most port scanners such as nmap have some kind of fingerprinting capabilities. This means that, in order to discover which services and versions run behind a specific port, they send special packets or probes which make different services and versions react differently. By capturing the response and matching it against a database, most of the time they can reliably determine what service and version is running behind that port.

Port scanners usually start by sending a blank probe, since many servers respond with a welcome banner telling interesting stuff about them. Micetrap only responds to those early blank probes. Let's try to port-scan this fake ftp service with nmap fingerprinting:

$ nmap 127.0.0.1 -p 8765 -A

We are scanning localhost, port 8765, and -A means service version detection and OS guessing. After a while, in our micetrap server terminal we see:

(timestamp) Recorded a probe coming from ::ffff:127.0.0.1:51082
containing the following: (empty line)

(timestamp) ::: Responded misleadingly: let's drive those
hackers nuts! :::

These lines get logged inside a .log file within the current directory. And in the nmap terminal:

Starting Nmap 5.35DC1 ( http://nmap.org ) at (timestamp)
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00017s latency).
PORT     STATE SERVICE VERSION
8765/tcp open  ftp     Mac OS X Server ftpd

The faked service/version is random (you can start an ftp server which looks like lukemftpd, Mac OS X server ftpd or PureFTPd for example), but it is consistent within the same server, so that every scan reports the same service and version.

U mad? Evil hackers

Probably :)

Available services

For now there are a bunch of ftp, http, torrent, mysql and samba services, mostly Mac-ish. But you can always...

Contribute!

If you want to contribute with more services and versions to empower micetrap and be a hacking superhero, you shall follow these steps:

  • Fork the project in the Github repo.
  • Install nmap and look for a file called nmap-service-probes in your system. This file contains regexes used to match responses from scanned services.
  • You only have to devise a string which fits one of these regexes and then add it to the corresponding service file (lib/micetrap/services/ftp.rb, for example, if it's an ftp server).
  • Commit, do not mess with rakefile, version, or history. If you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull.
  • Send me a pull request. Bonus points for topic branches.
  • Profit!

Dec 05, 2010
Hijacker 0.2

After some busy weeks I finally found some time to do a major rewrite of Hijacker, fixing some nasty bugs with 1.8.7, cleaning and refactoring the code and making it more robust in a number of ways.

The main enhancement is that now you can use your own handlers from your project and home directories! It's easy peasy. You just write a ruby file defining your handler, put it in one of these paths:

./.hijacker/path/to/my/file.rb
~/.hijacker/path/to/my/file.rb

...and it will be automatically loaded by hijacker, allowing you to start the server with your handler like this:

$ hijacker my_handler --my-option --bla-bla

If you think your handler might be useful to others, please send me a pull request in the Github repo!
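For the curious, automatic loading of this kind can be sketched with nothing but Dir.glob and load. This is a guess at the general mechanism, not hijacker's actual loader: HANDLER_DIRS and load_handlers are hypothetical names.

```ruby
# Directories to search, mirroring the two locations named above.
HANDLER_DIRS = [
  File.join(Dir.pwd, '.hijacker'),
  File.join(Dir.home, '.hijacker')
].freeze

# Loads every Ruby file found under the given directories (at any
# depth) and returns the list of paths that were loaded.
def load_handlers(dirs = HANDLER_DIRS)
  dirs.flat_map do |dir|
    Dir.glob(File.join(dir, '**', '*.rb')).sort.each { |file| load file }
  end
end

puts "loaded #{load_handlers.size} handler file(s)"
```

Directories that do not exist are simply skipped, because the glob returns an empty list for them.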

Other enhancements

There are some minor enhancements as well. For example, now blocks passed to hijacked objects are reported as arguments. So:

Hijacker.spy(my_object)

my_object.my_method(arg1, arg2) do
  # something
end

Will be reported as calling :my_method with arg1, arg2 and a Proc.

If the method raised something, it will be reported as well.

Contribute!

Now that the code is cleaner and more organized, feel free to contribute if you find something missing (or if you know Rubinius really well and can easily figure out the metaprogramming issues preventing hijacker from working with it!)

Or maybe just file an issue :)


Nov 21, 2010
Introducing Hijacker - spy on your ruby objects!

Ok, so yesterday I woke up with two things in mind: doing something with DRb, and doing something with metaprogramming (just for the lulz). So I first implemented a little spy pattern to spy on objects, classes and their instances, mixed it with DRb, and came up with the hijacker idea.

You can check out hijacker's source code in the Github repo.

So, let's get started. Hijacker is actually two things: a server and a spy.

Hijacker: the spy

On the client side (your ruby code), hijacker is a utility that lets you spy on any object or class. (Spying on a class also implies spying on its instances, unless you override this behavior.) The first step is telling hijacker where to send its spy reports. This can be done at three levels: either globally...

Hijacker.configure do

  # Sets the DRb uri to which the reports are sent.
  # (Ideally, where your hijacker server listens)
  uri 'druby://localhost:8787'
end

...or specifying a different uri for a particular spied object...

Hijacker.spy(my_object,
             :uri => 'druby://localhost:9999')

...or specifying a uri for a particular spied object WITHIN a block:

Hijacker.spying(my_object,
                :uri => 'druby://localhost:8787') do
  # do something with my_object
end

What I mean by spying is registering every method call on the spied object, and reporting the method name, the arguments and the return value.

Every time any activity on a spied object is registered, it is immediately sent to the specified URI (be it the global one, or one specified by any other means). There, on that URI, a patient hijacker server is waiting...
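A bare-bones version of that idea, with no DRb involved, could look like the sketch below. It is not hijacker's source: TinySpy and Greeter are hypothetical, reports are plain Hashes pushed onto an Array, and keyword arguments are ignored for brevity.

```ruby
# A stand-alone sketch of the spy idea; hijacker's real implementation
# and report format may differ.
module TinySpy
  # Wraps every public method defined by the object's own class so that
  # each call is recorded into `reports` before the value is returned.
  def self.spy(object, reports)
    object.class.public_instance_methods(false).each do |name|
      original = object.method(name)
      object.define_singleton_method(name) do |*args, &block|
        retval = original.call(*args, &block)
        reports << { method: name, args: args, retval: retval }
        retval
      end
    end
    object
  end
end

class Greeter
  def greet(name)
    "Hello, #{name}!"
  end
end

reports = []
greeter = TinySpy.spy(Greeter.new, reports)
greeter.greet('world')

reports.each do |report|
  args = report[:args].map(&:inspect).join(', ')
  puts "#{report[:method]}(#{args}) returned #{report[:retval].inspect}"
end
# prints: greet("world") returned "Hello, world!"
```

Only the spied instance is affected, since the wrappers are defined on its singleton class; other Greeter objects keep their original behaviour.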

Hijacker: the server

The hijacker server is fired up serving a particular handler, which will receive those spy reports and handle them somehow. For now the only handler implemented is Logger, so we start hijacker like this:

$ hijacker logger

And the server is started with the Logger handler! It will tell you the DRb uri you have to connect to from your ruby code.

The Logger handler receives the report and shows it in a colourful way, just like a nice logger is expected to.
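If you have never used DRb, the plumbing underneath is small. The sketch below uses only Ruby's standard drb library and is an assumption about the general shape, not hijacker's wire format: TinyLogger and its handle signature are hypothetical. Server and client live in one process here just to keep the example runnable; normally they would be separate programs.

```ruby
require 'drb/drb'

# Stand-in for a handler: it only collects and prints reports.
class TinyLogger
  attr_reader :lines

  def initialize
    @lines = []
  end

  def handle(method, args, retval)
    line = "#{method}(#{args.join(', ')}) => #{retval}"
    @lines << line
    puts line
    true
  end
end

# Server side: expose the handler. Port 0 asks the OS for any free port.
handler = TinyLogger.new
DRb.start_service('druby://127.0.0.1:0', handler)
puts "listening on #{DRb.uri}"

# Client side (normally another process): send one report.
remote = DRbObject.new_with_uri(DRb.uri)
remote.handle(:foo, ['3', '"string"'], ':bar')

DRb.stop_service
```

Arguments are sent here as already-inspected strings, which sidesteps the question of whether arbitrary objects can be marshalled across processes.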

Creative uses for the Logger handler

Ok, Logger is a very simple handler, but combined with the power of DRb and Hijacker, it can be quite useful! Imagine you have two sensitive classes in some ruby code, let's call them PaymentGateway and PlanUpgrader. To avoid clutter in our logging server output we could use hijacker like this:

# We are only spying on instances of PaymentGateway and
# sending their activity to druby://localhost:2222
Hijacker.spy(PaymentGateway,
             :only => :instance_methods,
             :uri => 'druby://localhost:2222')

# Let's spy on PlanUpgrader's class methods, which upgrade
# users' paid plans, and send the activity to a different
# hijacker logging server in druby://localhost:3333
Hijacker.spy(PlanUpgrader,
             :only => :singleton_methods,
             :uri => 'druby://localhost:3333')

# If we had a veeery sensitive part of code where the user
# plan is sent some methods, we could spy *only* that part
# like this: (in this case the server uri is the global one)
Hijacker.spying(user.plan) do
  user.plan.change_with(untrusted_args)
  # do something else
end

Extending Hijacker with your own handlers!

What if you wanted a handler that calculates the average number of arguments sent to methods of a class? Or maybe one that keeps track of the most used object types as arguments, to see if you could replace them with more lightweight objects?

It is really easy to write your own handlers. Inside hijacker code, handlers live here:

lib/hijacker/handlers/your_handler.rb

They are autoloaded and automatically registered, so all you have to do is write them like this:

module Hijacker
  # You only have to subclass Hijacker::Handler...
  class MyHandler < Handler

    # You must implement a class method named cli_options
    # which must return a Trollop-friendly Proc, for
    # command-line options parsing.
    #
    # These options can be accessed from within the
    # #handle method by calling the `opts` method.
    #
    def self.cli_options
      Proc.new {
        opt :without_foo,
            "Don't use foo to handle the method name"
        opt :using_bar,
            "Use bar as much as you can"
      }
    end

    # This is the most important method. This is what
    # is called every time a method call is performed
    # on a hijacked object. The received params look
    # like this:
    #
    #   method    :foo
    #
    #   args      [{:inspect => '3',
    #               :class => 'Fixnum'},
    #              {:inspect => '"string"',
    #               :class => 'String'}]
    #
    #   retval    [{:inspect => ':bar',
    #               :class => 'Symbol'}]
    #
    #   object    [{:inspect => '#<MyClass:0x003457>',
    #               :class => 'MyClass'}]
    #
    def handle(method, args, retval, object)
      # Do what you want with these!
    end

  end
end
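As one illustration, the "average number of arguments" idea mentioned earlier could look roughly like this. To keep the sketch runnable on its own, ArgumentCounter does not subclass Hijacker::Handler and defines no cli_options; a real handler would do both, as in the skeleton above. The class name and the average method are hypothetical.

```ruby
# Stand-alone for the sake of a runnable sketch: a real handler would
# subclass Hijacker::Handler instead of being a plain class.
class ArgumentCounter
  def initialize
    @calls = 0
    @total_args = 0
  end

  # Same parameters as the #handle method in the skeleton; only the
  # number of arguments is used here.
  def handle(method, args, retval, object)
    @calls += 1
    @total_args += args.size
  end

  # Average number of arguments per reported call.
  def average
    @calls.zero? ? 0.0 : @total_args.to_f / @calls
  end
end

counter = ArgumentCounter.new
one_arg = [{ :inspect => '3', :class => 'Fixnum' }]
three_args = one_arg * 3
counter.handle(:foo, one_arg, [], [])
counter.handle(:bar, three_args, [], [])
puts counter.average # prints 2.0
```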

Try to think of creative uses of hijacker, write your own handlers and send them to me ZOMG I CAN HAZ MOAR HENDLARZ :3

Feel free to send me feedback on what you think!


Oct 31, 2010
Stendhal 0.1.2 released

A brief post about stendhal's latest updates! My little code kata is growing slowly :)

Admittedly, the way in which I develop this gem is by forcing myself not to write anything I don't need right now. Last version (0.1.0) introduced nested example groups, which forced me to rewrite the way examples were run. I believe this made me rethink some stuff and come up with better code.

By 0.1.0, the only way to "assert" something inside an example was to primitively do it like this:

describe "something" do
  it "does something" do
    assert(3 + 4)
  end
end

This little method was implemented in an Assertions module which I no longer use: this new release introduces RSpec-like expectations and matchers. This means you can do this kind of thing now:

describe "something" do
  it "does something" do
    (3+4).must eq(7)
  end
  it "does something else" do
    "string".must_not be_frozen
  end
  it "does something else" do
    "string".must be_a(String)
  end
end

For now there are a limited number of matchers, of course.
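If you wonder how little it takes to get this style of expectation working, here is a minimal sketch. It is a hypothetical re-implementation written for this explanation, not stendhal's source: Matcher, ExpectationFailed and the way eq and be_a are defined are all assumptions.

```ruby
# Hypothetical re-implementation; stendhal's own code may look different.
class ExpectationFailed < StandardError; end

# A matcher is just a description plus a predicate on the actual value.
class Matcher
  attr_reader :description

  def initialize(description, &predicate)
    @description = description
    @predicate = predicate
  end

  def matches?(actual)
    @predicate.call(actual)
  end
end

# Matcher constructors, made available everywhere through Kernel.
module Kernel
  def eq(expected)
    Matcher.new("equal #{expected.inspect}") { |actual| actual == expected }
  end

  def be_a(klass)
    Matcher.new("be a #{klass}") { |actual| actual.is_a?(klass) }
  end
end

# Every object learns `must` and `must_not`.
class Object
  def must(matcher)
    return true if matcher.matches?(self)

    raise ExpectationFailed, "expected #{inspect} to #{matcher.description}"
  end

  def must_not(matcher)
    return true unless matcher.matches?(self)

    raise ExpectationFailed, "expected #{inspect} not to #{matcher.description}"
  end
end

(3 + 4).must eq(7)
'string'.must be_a(String)
'string'.must_not eq('other')
puts 'all expectations passed'
```

A failing expectation raises ExpectationFailed, which a test runner can rescue and report per example.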

Where is this going then? What will 1.0 have?

My idea is to bring dogfooding to stendhal by 1.0: all stendhal code should be tested with stendhal itself, rather than with RSpec.

This means that hopefully these features will have to be available by then:

  • Test doubles
  • Mocks
  • Some more matchers
  • A decent reporter with multiple formatters

And it would be nice to have these too:

  • Support for 3rd party mocking libraries (mocha, rr, flexmock, rspec-mocks)
  • Support for using stendhal with rspec-expectations

Wouldn't that be great? Check the Github source and report issues if you find them! This will speed up the development process a lot :)


Oct 27, 2010
Stay fit with code katas

Hello! This is my first post. I'm writing this because I just started a personal kata project to improve my ruby: it's called stendhal.

It's a small test framework which tries to mimic the RSpec DSL, implemented from scratch with my narrow understanding of metaprogrammizing. w00t!

Why does the world need another test framework?

It doesn't. But since I recently got very interested in contributing to RSpec, I thought it would be enlightening to try and build something from scratch, so that I can get familiar with problems and patterns I would not use otherwise simply because I don't know them.

You can check the source at github.com/txus/stendhal and criticize as you wish. In the end, it's all about learning!