Apache Spark is the major talking point in Big Data pipelines, boasting performance 10-100x faster than comparable tools. But how achievable are these speeds and what can you do to avoid memory errors? In this blog I will use a real example to introduce two mechanisms of data movement within Spark and demonstrate how they form the cornerstone of performance.
Spark is well known in Big Data for its incredible performance and expressive API. However, it takes just one small misstep to transform a massively parallel powerhouse into a pathetically poor performer. This post presents an example and the steps that can be taken to identify the problem.
Following on from my recent article on Machine Learning with Scikit Learn, I decided to experiment with the library that is most loved by developers today: TensorFlow. As in the Scikit Learn article, this post walks through a simple TensorFlow example that categorises handwritten digits.
DevOps culture is a critical part of successfully delivering an enterprise-scale project. Getting that culture into a company requires a mindset change. In this article, I explore the journey to a DevOps culture and mindset, and provide some practical advice based on my experiences.
Can the perfect team evolve beyond the need for sprint retrospectives?
This post is based on my experience as a tester and how I got into testing. It also describes my time at Scott Logic so far.
Testing can become a bottleneck within an agile delivery pipeline, resulting in delays and poorer-quality software being released. This guide provides simple but effective ideas and techniques to successfully embed testing into the agile culture, eliminating those bottlenecks and increasing confidence in your software quality.