Blog

David Wybourn

Data Engineering

For the last few months we've been working on a very DevOps focused project. As such we've used AWS, infrastructure as code, Docker and microservices. The different microservices were initially running all on one box, each with a different port. This solution wasn't scalable or very practical. We couldn't have all our services on one machine and it was getting tiresome and error prone having to remember/lookup which port each service was on. We needed our services to run on separate machines, and we needed a way to communicate with them without having to hard-code IP addresses or port numbers. What we needed was service discovery. As we had already been using Docker for each service, Docker Swarm was a natural candidate.

David Wybourn · 17^th Jun 2016 · 5 min read

Data Engineering

Introduction to Hadoop and MapReduce

Big data is one of those buzz phrases that gets thrown round a lot, companies love saying they work with ‘Big’ data, but what is ‘Big’ data?

David Wybourn · 13^th Jan 2016 · 9 min read