Last Thursday, our very promising Space Wars Coding Battle took place at BeCentral.

The location, the participants, the game, all the ingredients were there to have another great time together.

In practice, things happened differently.

For the first time in 15 editions, we had to stop the game prematurely due to technical issues. Hopefully, it was also the last time.

Developers come to the Coding Battles to share and have fun with like-minded people, improve their skills and discover how companies use technologies in their own projects.

Testimonial Vaclav Meetup

This time we didn’t deliver as promised and we’re deeply sorry for that.

At Hack League, community is at the core of what we do. We aim to provide thrilling and insightful content, bring passionate people together and encourage them to learn from and share with each other.

These are the key principles that helped us grow this amazing community from 0 to 1500+ passionate developers in the last couple of months.

Community gathering

That’s why we feel we owe all the participants and our community a clear and transparent explanation.

The part where it went wrong

It’s not one but two things that went wrong last Thursday.

The first problem: unexpected wifi firewall

To run our Coding Battles, we’ve built an online platform with a continuous system that enables you to see your results straight in the browser.

To do that, we’re using Git, GitLab CI and AWS. All the content and starting points are stored on our own servers and are accessible via a specific URL.

The great part of organizing events at different locations is that you let your community discover some great working environments, from coworking spaces to companies.

Take Eat Easy's workplace

The less convenient part is that you’re never sure about the strength of the wifi, the setup of the place or any other odd element that could come up during the event (noise, heat, electricity...).

Over the Coding Battles, we have experienced several surprises, the classic one being wifi hiccups due to heavy bandwidth requirements.

This time, it was different.

When we arrived at the location, everything worked fine.

After we introduced the game and gave the starting point’s URL to participants, the wifi’s firewall freaked out. Receiving far too many requests to connect to an unknown URL led the firewall to blacklist that URL, preventing participants from forking and pushing their code.

All we got was a simple and unsweet ‘forbidden’ message when trying to access the page.

But we’re techies, right? We’re the ones building the future. It’s not a wifi problem that would stop us.

So like MacGyvers, participants all took out their smartphones to bypass the firewall problem and get access to the starting points. It was slow and the files were heavy to download, but nothing impossible to overcome.

That’s when problem #2 came up, the big one.

The second problem: server overloading

With the continuous integration system, participants get to push their code and get the feedback directly on the challenge’s page.

In other words, our own git servers are running participants’ code and checking it in the background against the Coding Battle’s tests.

As always, we had run load tests before the Coding Battle to make sure everything would run smoothly.

All set… Tests running… All passed… OK, seems perfect!

That was a bit too hasty.

The event quickly proved that our tests had not gone deep enough in mirroring the load that the real game would generate.

The architecture of the game was putting too much workload on the servers, which made the game almost unplayable.

That’s when we decided to stop the Coding Battle.

Here is the full technical explanation behind it:

As mentioned, the platform is essentially composed of four services:

  • the website (hlweb)
  • the git server (gitlab)
  • the executor service (gitlab CI)
  • the persistence service (hlapi)

[Figure: Hack League technical architecture]

Each time participants push code to their remote repository, gitlab triggers an execution in gitlab CI. The code is run and tested there to generate results, and those results are then sent to the hlapi service, which stores them in a database and recomputes the scores (more on that later).

In parallel, the website makes requests to hlapi to get the current status of the battle. That allows the platform to give each participant automatic feedback on their code and to show the progress of the other players.

During previous Coding Battles, this worked smoothly. But this time, we were playing a real-time game.

Each game would last for 1000 ticks at most, and each tick could potentially generate more than 1kB of data (position of the spacecraft, speeds, console output…).
We were running several games for each submission before sending the results to hlapi.
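To get a feel for these numbers, here is a rough back-of-the-envelope sketch. The 1000 ticks and roughly 1 kB per tick come from the figures above; the number of games per submission is a made-up value used only for illustration, and the function name is hypothetical.

```python
def estimate_result_bytes(ticks: int, bytes_per_tick: int, games: int) -> int:
    """Rough size of the results produced by one submission."""
    return ticks * bytes_per_tick * games


# 1000 ticks and ~1 kB per tick are taken from the text.
# 5 games per submission is an assumption, not the real setting.
estimate = estimate_result_bytes(ticks=1000, bytes_per_tick=1024, games=5)
print(f"{estimate / 2**20:.1f} MiB per submission")
```

Since each tick could produce more than 1 kB, real results could end up well above such an estimate.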

In the end, some results weighed 23MB in the database. You can see the 20 largest result sizes below.

id      LENGTH(results)
1117    23829587
1120    23600711
1113    21694887
1109    12059755
1107    9788772
1085    3743104
1084    3739179
1095    3698676
1102    3698567
1099    3694600
1116    2364465
1118    2355260
1123    2293010
1126    2292234
1121    2289201
1125    2190604
1115    2178775
1112    1483726
1134    534675
1133    532679

As mentioned above, the hlapi service has to recompute the scores at each submission. To do that, it takes all the submissions and sorts each submission to identify the player that finished first.

And that is where the system broke. The resulting load was far heavier than anything our previous tests had shown.

We had done some previous testing which proved the load to be heavy but manageable.
It led us to discuss possible ways to lighten the burden.

However, based on the tests’ results, we decided to allocate our time to improving the game rather than going deeper into performance testing. We were wrong on this one…

As you can see from the table, at the start of the Coding Battle, the 20 largest results alone added up to about 122MB. That amount of data had to be fetched from the database, processed, sorted and then sent to the database again, after each submission.
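For anyone who wants to check the total, the sizes from the table can simply be summed. This is only a small verification sketch; the variable names are ours.

```python
# Result sizes in bytes, copied from the table above.
result_sizes = [
    23829587, 23600711, 21694887, 12059755, 9788772,
    3743104, 3739179, 3698676, 3698567, 3694600,
    2364465, 2355260, 2293010, 2292234, 2289201,
    2190604, 2178775, 1483726, 534675, 532679,
]

total_bytes = sum(result_sizes)
total_mib = total_bytes / 2**20  # 1 MiB = 1,048,576 bytes

print(f"{len(result_sizes)} results, {total_bytes} bytes, {total_mib:.1f} MiB")
```

The figure quoted above matches this total when it is expressed in units of 1,048,576 bytes.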

Even if it passed in the beginning, each new submission would increase the size of the data even further. A server overload was inevitable.
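To illustrate why the overload was only a matter of time, here is a minimal sketch of how the amount of data read from the database grows when every recomputation fetches all stored results again. The function names and the example sizes are hypothetical; only the fetch-everything behaviour comes from the description above.

```python
from itertools import accumulate


def bytes_fetched_per_submission(result_sizes):
    """Bytes read at each recomputation, assuming all stored results are fetched again."""
    return list(accumulate(result_sizes))


def total_bytes_fetched(result_sizes):
    """Total bytes read over the whole battle."""
    return sum(accumulate(result_sizes))


# Hypothetical battle: 50 submissions of 2 MB each.
sizes = [2_000_000] * 50
print(bytes_fetched_per_submission(sizes)[-1], "bytes read for the last submission")
print(total_bytes_fetched(sizes), "bytes read in total")
```

With submissions of equal size, the total grows with the square of the number of submissions, which is why a setup that survives the first pushes can still collapse later on.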

We overlooked this fact when we designed this Coding Battle. New types of battles imply new kinds of problems. Clearly, we should have paid a lot more attention to this fact.

Be transparent (especially) when things go bad, people will thank you for it

Even though we had to stop the Coding Battle, we felt we couldn’t let participants just go like that. We had to give them further information and the chance to talk about it over a beer (we’re in Belgium… right?).

So that’s exactly what we did.

After explaining the problems and promising a follow-up article with deeper insights into what went wrong, we invited all the participants to join us for beers.

And… What a surprise when we saw participants’ reactions.

Most of them were not angry. On the contrary, the large majority was actually very supportive.

Most of them shared their feedback and ideas to improve the platform. We had great “after Coding Battle” discussions with community members to define ways to make it easier for them to contribute.

Two of the participants even said they really liked the concept and would love to see it running at their company’s monthly meetup.

We’re so fortunate to have such an amazing community. Having the chance to work with passionate people like that around you is just so motivating. It gives you all the courage you need to push your limits further and give them the most you can.

Amazing community

The obvious lessons from this event:

  • check in advance for any potential problems that could arise
  • make sure you have backup plans set up in advance (e.g. bring your own routers^^)
  • improve the platform’s robustness
  • fix the score computation so it will not need to fetch all the submissions anymore
  • test, test and retest and take the time to do it in depth, especially if the game is of a new kind
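As an illustration of the score computation lesson above, here is a minimal sketch of a leaderboard that keeps only a small summary per player, so that the heavy game results never need to be fetched again. It is not our actual implementation: the class, the method names and the scoring rule (earliest finishing tick wins) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Leaderboard:
    """Keeps one small summary per player instead of full game results."""

    best_finish_tick: Dict[str, int] = field(default_factory=dict)

    def record(self, player: str, finish_tick: int) -> None:
        """Update a player's best result without loading any stored results."""
        previous = self.best_finish_tick.get(player)
        if previous is None or finish_tick < previous:
            self.best_finish_tick[player] = finish_tick

    def ranking(self) -> List[Tuple[str, int]]:
        """Players ordered by earliest finishing tick, ties broken by name."""
        return sorted(self.best_finish_tick.items(), key=lambda item: (item[1], item[0]))


board = Leaderboard()
board.record("alice", 420)
board.record("bob", 380)
board.record("alice", 350)  # a later, better submission replaces the earlier one
print(board.ranking())
```

With this kind of approach, the work done per submission depends on the number of players rather than on the total size of all stored results.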

The less obvious lesson: Failure s*cks. But don’t let it ruin your efforts. Try to turn it into an occasion to:

  • bond even more with your community
  • elaborate new ways to do things
  • build better and more robust products

And most importantly, always remember to be transparent and true about the reasons that led you to that situation. Only greater trust and engagement will come out of it.

Closing words

To all our community members, thank you for your support and your invaluable feedback. We’ll continue to work hard to provide you with great, insightful and thrilling experiences.

In the coming months, we will be working on alternative ways to process a submission, on new games and topics, and on open-sourcing parts of our platform.
If you have ideas or suggestions on that, we’d be more than happy to hear them. So don’t hesitate to leave a comment or reach us by mail, Facebook or Twitter.

For the next Coding Battle, we’ll be re-running the Space Wars Coding Battle. Don’t worry, this time the platform will have been tested and retested extensively. Just fun and thrills on the menu. It will take place on the 25th of July at MIC Brussels. You can already grab your spot here.