Do not apply data science methods without understanding them

I heard a joke from a friend; here is an adaptation. Three software engineers and three data scientists meet to take a train to a conference. The software engineers buy three tickets; the data scientists buy only one for all three. “Don’t worry,” they explain to the engineers, “we have a method.” On the train, the three data scientists pack themselves into the bathroom. When …

Trying to Replace Cassandra with DynamoDB? Not so fast

In November last year I pointed out how tempted I was to replace Cassandra with DynamoDB. Since then I have done some research, and things are not as straightforward as they may seem at first. I’d like to revisit my post and clarify a few things. On the elasticity of Cassandra, I said the following: Scaling a Cassandra cluster involves adding new nodes. Each additional node …

Cassandra: Lessons Learned

After using Cassandra for 3 years, since version 0.8.5, I thought I’d put together a blurb on lessons learned. Here it goes! Use cases, what works: anything that involves high-speed collection of data for analysis in the background or via batch. For example: logging and data collection, web servers, mobile devices, Internet of things, sensors, finance, market data logging, transaction logging, trading activity, record …