Anarchic programming in Artificial Intelligence

The problem

The center cannot hold – William B. Yeats, The Second Coming

What is good software engineering without a product?

As any programmer knows, code can quickly get messy. Fortunately, there is a fair amount of literature out there to help us keep it in good shape (more than 4000 search hits for “software engineering” books on Amazon at the time of writing). However, those methods usually make one hidden assumption: that you have a fairly good idea of where you want to go. This is usually the case: the product is your compass. But what if there is no product?

In Artificial Intelligence programming, is mathematics really the compass of coding?

Research is about exploring uncharted territory, and research in AI programming, such as the work done at DeepMind, is no exception. As such, computer science research is really about trying out various bits of code and measuring what works. In this context, talking about a product, or more generally a compass, is tricky. Some might argue that mathematics is the true language of computer science and that it provides the ultimate guideline for writing code. This argument carries some weight when writing code to simulate reality (solid-state physics, fluid dynamics, quantum mechanics…). But what about simulating intelligence?

Universe is the biggest computer

While a handful of equations is seemingly enough to fundamentally describe the whole of reality, intelligence does not seem to adhere to a clean model. Sure, it is based on the same fundamental equations, but so far only one computer is powerful enough to deduce intelligence from “basic principles”: the universe itself. As mere humans we are left with an infinitesimal fraction of the computational power of nature and we need to find empirical shortcuts toward intelligence. In this context, mathematics is arguably little more than a neat and useful way to write pseudocode.

The brain both as target and starting point of artificial intelligence

That’s not to say that we are totally blind when programming AI. For a start, we have an example: the human brain. Its study has already proven to be of great help in our quest for artificial intelligence. Other crucial ingredients are hardware and infrastructure, compilers and software toolchains, optimization techniques, rich world simulations, ever-growing datasets… And given all that, perhaps most importantly, we have lots of ideas. I think this is really what it means to be an artificial intelligence researcher: trying out as many ideas as possible.

From ideas to Python code in A.I….

More precisely, AI deep learning research is about translating ideas into code. This is a messy business: ideas are ill-defined. They depend on your mood, the movie you watched last night and, quite possibly, literally the wind. And you might have plenty of them: the game is extremely permissive, and possibly the only rule is to temporarily avoid trying out an idea that has already been tried extensively.

… and towards a creative mess

So AI deep learning research code is messy, and in the legitimate absence of a good compass, it should be. Still, it suffers from exactly the same curse as traditional software engineering: code reusability. If, as a good AI programmer, you keep your code messy, you will be forever doomed to try out simple ideas, just beyond the boundary of what has already been tried. You might occasionally reach out and try something more ambitious. But it will quickly become unmanageable: if it doesn’t work, is it because the idea is a bad one, or is it a consequence of one of the many bugs hidden in this spaghetti code of yours (yes, they are there, lurking in the dark)?

When do you consolidate code if you do not know your way?

Unfortunately, applying standard by-the-book software engineering good practices to deep learning code is not the remedy either. Sure, you can make your spaghetti code look nicer, with nice classes, lots of unit tests and fancy documentation. But remember: you don’t know where you’re going. All this is very likely to be ditched in a couple of months’ time. So whenever you solidify your code without a good reason, on average, you’re just wasting your time. This is worth repeating: whenever you solidify your code without a good reason, on average, you’re just wasting your time.

Sketch of a solution: anarchic programming

Anarchy means “without leaders”, not “without order” – Alan Moore, V for Vendetta

Anarchy coding

As is often the case, the answer lies somewhere in the middle. In this case the two extremes are over-engineering and under-engineering. Let’s try to narrow this down a bit. I am now going to argue that researchers in computer science should adopt an anarchic style of programming. But before going any further, let’s refresh our memory on a few styles of government:

  • Chaos: absence of order. Complete unchecked freedom, inefficient since freedom is a subjective concept inevitably conflicting between individuals (metaphor for under-engineering).
  • Authoritarian government: centralized order. Rigid and centralized rules, inefficient since reducing the scope of what’s possible and harming creativity (metaphor for over-engineering).
  • Anarchy: emergent order in the absence of authority. See Wikipedia if you are curious.

We’ve established that when it comes down to AI research, a healthy dose of both chaos and order is required. Anarchy provides a viable compromise: individuals are free to do what they want until they collectively identify an issue and implement a solution to fix it.

What does it mean for your coding at first?

Concretely, it means that the forefront of research code should be as simple and minimalist as it gets. Do not try to predict the future and solve problems other than the one you are actively working on, as doing so will result in over-complicated code and useless puzzling indirections. Early optimization is harmful, don’t do it.

How to iterate A.I. coding?

In the near future, however, you’ll find that the code you’ve written is not suitable anymore. This is a good sign: it shows that your original code was indeed minimalist. But now you need to refactor it. Anarchic programming comes with lots of refactoring. Embrace this, along with its controversial corollary: don’t unit-test your code in too much detail, as it will slow down refactoring. Choosing the right testing granularity is essential to keep the creative juices flowing. When too fine-grained, unit-testing can be like an authoritarian government: it restricts your freedom. For AI research, it is better to break things than to be paralyzed by the past. Prefer coarse, high-level behavioural tests or simply integration tests, regularly testing the whole system end-to-end.
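To make the idea of coarse testing a bit more concrete, here is a minimal sketch in plain Python. Everything in it is a hypothetical stand-in (the toy dataset, the `make_data`, `mse` and `train` helpers, the 10x threshold): the point is only that a single end-to-end check ("does training reduce the loss?") keeps passing while you freely rewrite the internals, whereas a test pinned to each helper would have to be rewritten at every refactoring.

```python
import random


def make_data(n, seed=0):
    """Toy dataset y = 3x + 1 plus small noise (hypothetical stand-in for real data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 1.0 + rng.gauss(0.0, 0.01) for x in xs]
    return xs, ys


def mse(w, b, xs, ys):
    """Mean squared error of the linear model w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, steps=200, lr=0.1):
    """Plain gradient descent on the MSE; the internals are free to change."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def test_pipeline_end_to_end():
    """Coarse behavioural test: training must reduce the loss by a wide margin."""
    xs, ys = make_data(100)
    loss_before = mse(0.0, 0.0, xs, ys)
    w, b = train(xs, ys)
    loss_after = mse(w, b, xs, ys)
    # Deliberately loose threshold: we check behaviour, not exact numbers.
    assert loss_after < 0.1 * loss_before


if __name__ == "__main__":
    test_pipeline_end_to_end()
```

Note what the test does not check: the exact gradients, the number of steps or the learning rate. You could swap the optimizer entirely and the test would still be a valid description of what the system is supposed to do.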

Anarchic programming: the real fun begins

You’ve come a long way and the real fun begins. You are now several programmers hammering on the same codebase and some interesting results are coming out of this. This is where anarchy truly starts: look out for the signs. Talk to each other, feel and share the pain. Incrementally come up with a simple and minimalist solution and implement it. You will break the code of some users. Again, this is a good sign: your code was never meant to be future-proof or solidified. It’s meant to be in constant flux, for the sake of creativity. Help them fix their code. Explain to them how things work now. Some users might decide to fork (i.e. copy) your code. It’s all fine as long as they are aware of the cost: you only support a single version of the code, the latest one, for speed’s sake.

Migrating useful A.I. Libraries down the stack

As more users join in and the codebase grows in size, new opportunities to factor out shared code will appear. This is where order comes from. Seize any genuine opportunity to reduce code duplication and create new untested, undocumented libraries. Over time, the important libraries will emerge and should be gradually migrated down the stack. There they should enjoy a fair share of unit-testing and documentation.

It is very important to understand that chaos belongs at the edge of research, and order further down the stack, once things have proven useful and stable to some degree. Enforcing order at the forefront will reduce creativity, and enforcing chaos further down the stack will lead to general instability (the latter is well documented). Migrating unproven libraries too fast down the stack will result in a bloated and barely used set of libraries (in other words, a maintenance burden).
