Last year I wrote about Python for the Web - a boilerplate project to kick off my web development. It's been a year since I wrote that post, and since then I have started a multitude of projects. Along the way I have had the need to update and upgrade the project …Continue reading
Having worked on so many Python Web projects recently, it became obvious I needed some kind of boilerplate to get new projects started, hence flask-boilerplate.
A lot of this is based on the excellent cookiecutter-flask - I've learned a lot from it, but made enough modifications and additions to call this …Continue reading
PostgreSQL version 10 brings a much-anticipated feature: (Native) Table Partitioning.
(Emphasis on the NATIVE: PostgreSQL supported partitioning in previous versions by other means.)
There are a few gotchas you have to keep in mind when using this new feature: no PK allowed, no ON CONFLICT clauses, etc. In my …Continue reading
I've witnessed some of the online debate when it comes to Go and OOP. Some people say the language is object-oriented, others don't share that view. I think it's safe to say that Go isn't a traditional object-oriented language.
Lately, I've been trying to learn more about Go from an …Continue reading
A couple of weeks ago I wrote a post on Data Pipelines with Go, Kafka and Cassandra. Towards the end of the post I presented a very rudimentary (and wrong) approach to rate-limiting (in the specific case of Cassandra write timeouts).
I wanted to emphasize how performant the solution is …Continue reading
Last year I started working on a 'Big Data' exercise. It's an ongoing project that mixes large amounts of web traffic, data ingestion and analytics. It's also really fun. We get to play with an array of new technologies - sometimes on a bet, granted - but most of the time it …Continue reading
Valgrind is a code profiling tool for Linux; it is widely known for its memory leak detection and debugging capabilities.
At the time of writing, Valgrind is still not compatible with macOS Sierra (10.12); you can read more about it here.
Docker to the rescue
To get around this …Continue reading
Disclaimer: After reading up on the limited amount of information available about Apache Spark, I've drawn my own conclusions on what the project aims to fix. If any of the information below is incorrect, shoot me an email or post a comment so that I can rectify any mistakes!
In my previous post I went over Running Apache Spark on a cluster.
Spark can read from and write to many data sources, including Apache Cassandra.
Cassandra is a distributed database management system. It is considered a NoSQL database (the usage of that term is questionable, albeit outside of the …Continue reading
Apache Spark is a general-purpose data processing and analysis engine.
On the surface, it helps developers working with large data sets by providing easy-to-use libraries and modules. Spark integrates with various data sources (CSV, HDFS, remote databases, etc.), and actions can then be performed against the data.
In the …Continue reading