Two developers choose to take a class

Here is a Wednesday funny for you:

Two developers decide to take a class to improve their skills. One takes a day-long class on “Building scalable web applications with MongoDB.” The other one takes a class on “Basket Weaving.”

Next day they compare their notes. One of them is an expert basket weaver, the other one still can’t build a scalable web application with MongoDB.

Food for thought: PostgreSQL outperforms MongoDB.

Microsoft and Apple Have Everything to Lose if Chromebooks Succeed

Desperate times call for desperate measures at Microsoft. Forbes reports:

Despite a lacklustre start, Chromebooks are becoming relatively popular in the super-budget end of the portable market. This has worried Microsoft for some time. After all, with a Google-centric experience, not to mention an operating system in the form of Chrome OS, there’s little if anything to be gained here by Microsoft and everything to lose. That’s why it’s targeting the Chromebook specifically, with a most likely Windows 10-based $149 laptop.

In the 1990s, computers running MS-DOS, Windows 3.x, and Windows 95 were expensive and unreliable and, most importantly, cost-prohibitive to build applications for if you were a startup. Students who were curious about technology and wanted to learn and contribute could not afford the tools (Visual Studio), and certainly lacked access to the underlying source code to learn more about operating systems.

In 1997 my friends and I founded the Linux Users Group at Clarkson University and raised awareness of open-source software technologies. We had the backing of many professors and many students. We even had an opportunity to influence IBM’s business direction with Linux and open-source software. After my friends and I graduated, the students and faculty who remained formed a group called “Clarkson Open Source Institute” (COSI), which now has a bright and nicely equipped lab in the science center.

COSI is now as much of a fixture of the Clarkson campus as, say, the cafeteria is. How many computer science graduates were influenced by COSI? How many of them went on to the industry and carried their ideas with them?

Groups like this influenced not only their peers but also an entire industry. An entire generation of computer science graduates went on to Amazon, Google, Facebook, and Red Hat, and succeeded at getting the biggest companies out there, like IBM and Microsoft, to do what they previously laughed at. That generation of developers went on to enterprises and enterprise vendors. They ordered Linux servers, deployed open-source software, and contributed to open-source projects.

What does it have to do with Microsoft, Google, Chromebooks and Windows? Well, everything. Microsoft, once again, is late to the game. Before you accuse me of being a Microsoft-basher, let’s get one thing out of the way: Apple has had its head stuck in the sand for the last few years as well.

Last week I bought a used Samsung Chromebook for $120 on eBay. I wanted to play with it and see what the deal is all about. It turns out that I was able to get myself set up in literally 30 seconds simply by entering my Google credentials. Once in, I got a familiar user interface in the form of Chrome, and all of my Chrome bookmarks, apps and extensions neatly in place.

Moreover, my 8-year-old daughter was able to use her school credentials to get onto the Chromebook without any fuss. All of her apps became immediately available, including MIT Scratch, which she likes to experiment with. Without much difficulty she put together a lab report, which she then published on a website. To compose that report she took pictures with her iPad, and the pictures magically made it into her Google Drive and onto her Chromebook. I’ve never seen iCloud work so smoothly.

Since she is so knowledgeable with her Chromebook, she set up a “supervised” account for her younger brother, including all the bookmarks he typically uses.

I have never seen a device that was so easy for anyone in the family to begin using, at any skill level. In my entire life not a single computer, PDA or tablet was as easy to set up as a Chromebook: not Apple, not Microsoft, not Palm, with the possible exception of the Sinclair ZX Spectrum I owned in the 1980s.

Just as my generation of technologists influenced the direction of the entire industry, my daughter’s generation will influence technology when they come of age. They will expect ubiquitous broadband access, just like electricity or water. They will expect computing power to be ubiquitous, cheap and plentiful. They will expect their work, apps, and projects to be available anywhere, anytime, on any device they own. These devices could be Apple, Microsoft, or Google; they will all have to talk to each other.

This is why companies like Microsoft and Apple should be paying close attention to all this. They know what’s coming, and they have no control over it. If they don’t pay attention, they won’t survive.

Week of 3/23/2015 in review

Most Demanded: Replacing Cassandra With DynamoDB

The most searched-for topic on this blog remains Replacing Cassandra with DynamoDB, or with something else for that matter.

My prediction still stands: any development team considering Cassandra in AWS must also evaluate DynamoDB. The devops costs of Cassandra clusters are astronomical, and Datastax is not doing itself any favors by not offering a managed alternative to DynamoDB. Cassandra as a managed service in the cloud could compete against DynamoDB.

Cry for Humanity in the Age of Big Data

Last weekend, I was at a local “Barnes and Noble” store where I saw vinyl records (yes, vinyl) sold alongside B&N Nooks. All of these records were made in the last 1-2 years and they are far from vintage. In the age of Big Data, cloud, etc., this seems like a cry for humanity. These days a lot of the media we purchase or produce is digital. Our pictures are digital. Our music and books are in the cloud. These records are tangible. We can touch them. We can hold them. We can collect them. We can lend them to a friend.

Microsoft’s Uphill Battle

I attended Docker’s 2nd Birthday Party in NYC, hosted by Microsoft, of all companies. Somehow Microsoft decided to embrace open source, only 20 years late to the game. Ironically, very few attendees were using Microsoft Windows or hardware. Most people ran either Ubuntu or Mac OS X. A few of us joked about how we made careers out of avoiding Microsoft products, and don’t see it changing. Microsoft still has an uphill battle to fight.

There is Not Enough Investment in Security in the Mobile App Space

In an effort to deliver apps on Internet time, companies neglect to invest in security:

“Building security into mobile apps is not top of mind for companies, giving hackers the opportunity to easily reverse engineer apps, jailbreak mobile devices and tap into confidential data,” Caleb Barlow, vice president of mobile management and security at IBM, said in a statement. “Industries need to think about security at the same level on which highly efficient, collaborative cyber criminals are planning attacks.”

Laptops Don’t Need Touch

Anyone who has tried to use an iPad with a keyboard knows: Laptops don’t need touch. The very effort of lifting your arm to touch the screen goes against every ergonomics theory out there. Microsoft acknowledges that with Windows 10.

I bought a Chromebook

Speaking of laptops, I bought a used Chromebook on eBay for $120. As a cloud computing advocate I am intrigued by the idea of an affordable, safe, reliable computing device that anybody can own. As a Linux zealot I am excited to see a consumer-grade desktop Linux that actually works for people.

I happen to think that the Web-based method for delivery of apps to devices is going to have a longer-lasting impact on software engineering and the computing industry in general than self-contained apps built for proprietary platforms like iOS or Android.

With Chrome OS, Chrome Apps and extensions, Google is conducting an experiment to prove just that point. It is orders of magnitude easier to build an app for Chrome OS than it is for iOS or for Android. Just as 20 years ago an entire generation of computer-literate students came out of schools knowing how to use PCs and Macs, 10 years from now we will have an entire generation of young people expecting 100% online connectivity and the ability to access their stuff from anywhere, on any device. An entire generation of software developers will be building simple, self-contained apps, fast.

Data Virtualization

From a recruiter email I learned a new term – Data Virtualization. I hope it is a typo, but it probably is not, and that is sad. What does it mean ? Is it something that we previously called “” ?

A Critical Approach to Design Thinking

Vish Canaran, CEO of Liquid Analytics, writes:

If we rush, we will fail. We will end up touching the code multiple times to every developer’s frustration. So how do we meet the 18-month timeline and still follow a process that will lead to a successful product? We need to change how we approach the design issue, and how we solve the problem. We need to ensure that everyone on the team has a grasp of the company design process, and how we use it to solve the problem of designing great products that people will use.

This is an excellent post and a must read for anyone interested in improving their software development practices.

Cloud Apps are a Challenge to IT Departments

Forbes reports:

Gone are the days when you had full control over the infrastructure. Back then, business users outside of IT had to use whatever you made available. They lacked the knowledge and resources to acquire tools themselves.

But today, “shadow IT” services are making their way into the workplace. They help employees work faster, better, and avoid struggling through the red-tape of traditional IT processes.

That really is the crux of the issue. Cloud apps are helping business users work faster, better and avoid the red tape of traditional IT processes. I have written on this topic before. Cloud helps not just user productivity, but also developer productivity. IT departments should stay ahead of the game and leverage cloud services, rather than impeding progress.

Net Neutrality Hit With Lawsuits

Instead of innovating, the telecoms are suing. I am told that in Kiev, Ukraine, 1 gigabit residential broadband costs under $10/month. Meanwhile I pay Comcast $60 for 30 megabit down, 10 megabit up. American telecoms need to innovate, reduce costs, and improve services rather than sue to maintain the status quo. We need to break up regional monopolies.

Data Science and Politics

As the American 2016 election season gains steam, various prognosticators are beginning to make themselves known.

Cool New Stuff from AWS

Have a Good Weekend All!

This was a great and productive week. Enjoy your weekend, and use your time away from work wisely.

Do not apply data science methods without understanding them

I heard a joke from a friend. Here is an adaptation:

Three software engineers and three data scientists meet to take a train to a conference. The software engineers buy three tickets; the data scientists buy only one for all three. “Don’t worry,” they explain to the engineers, “We have a method.”

On the train the three data scientists pack themselves into the bathroom. When the conductor knocks on the door they stick the ticket out. The conductor takes the ticket and moves on. All six friends reach their destination without any problems.

On the way back, the three engineers buy one ticket while the data scientists buy none. The engineers are baffled. “We have a method,” explain the data scientists.

On the train, the engineers pack themselves into one bathroom stall, and the scientists into another. Right before departure one of the scientists comes out and knocks on the door where the engineers are hiding. An engineer sticks their ticket out, the scientist grabs it and goes back to his colleagues.

The moral of this story is: Don’t apply data science methods without understanding them.

Finding Unused Elastic Load Balancers

AWS imposes limits on the number of Elastic Load Balancers. Before asking for a limit increase, it is worthwhile to check whether your load balancers are actually used and have healthy instances.

Using the excellent boto framework for Python, I built a simple script to find all ELBs where there is an instance in the OutOfService state or where there are no instances at all:

from optparse import OptionParser
import boto.ec2.elb

# Read AWS credentials from the command line.
parser = OptionParser()

parser.add_option("-i", "--awsKeyId",
    dest="awsKeyId", type="string", action="store",
    help="AWS Access Key Id")
parser.add_option("-k", "--awsSecretKey",
    dest="awsSecretKey", type="string", action="store",
    help="AWS Secret Access Key")

(options, args) = parser.parse_args()

# Connect to the ELB service in us-east-1 with the supplied credentials.
elb = boto.ec2.elb.connect_to_region('us-east-1',
    aws_access_key_id=options.awsKeyId,
    aws_secret_access_key=options.awsSecretKey)

allElbs = elb.get_all_load_balancers()
for lb in allElbs:
    instances = lb.get_instance_health()
    # No registered instances at all: the load balancer is unused.
    if len(instances) == 0:
        print(lb)
    # At least one unhealthy instance: report the load balancer once.
    for instanceState in instances:
        if instanceState.state == 'OutOfService':
            print(lb)
            break
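The decision rule in the script above can also be pulled out into a small pure function, which makes it easy to check without AWS credentials. This is only a sketch: the function name `find_unused_elbs` and the sample data are made up for illustration. With boto, the input mapping could be built from `lb.name` and `[s.state for s in lb.get_instance_health()]`.

```python
def find_unused_elbs(health_by_elb):
    """Return the names of load balancers that look unused.

    health_by_elb maps a load balancer name to the list of instance
    states reported for it (for example 'InService', 'OutOfService').
    A load balancer is reported when it has no instances at all or
    when at least one instance is OutOfService.
    """
    unused = []
    for name, states in sorted(health_by_elb.items()):
        if not states or any(s == 'OutOfService' for s in states):
            unused.append(name)
    return unused


# Hypothetical sample data, not taken from a real account.
sample = {
    'web': ['InService', 'InService'],
    'api': ['InService', 'OutOfService'],
    'old': [],
}
print(find_unused_elbs(sample))
```

Each load balancer is reported at most once, even when several of its instances are unhealthy.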

Where AWS Elastic Beanstalk Could be Better

Amazon describes their AWS Elastic Beanstalk service as follows:

AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.

You can simply upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, auto-scaling to application health monitoring. At the same time, you retain full control over the AWS resources powering your application and can access the underlying resources at any time.

Over the past year it mostly met our expectations: it automatically creates and maintains all the pieces necessary to run a web app; it simplifies deployments and monitoring of apps; and it abstracts some of the more mundane aspects of EC2. However, there are a few areas where the service leaves much to be desired. I’ll get straight to it.

With many developers on the team, each responsible for their own app, and with multiple environments under the same account (dev, qa and prod), there is no way to configure IAM properly to restrict a developer’s access to only the resources related to the application they are responsible for.

My attempt to configure a correct IAM policy to restrict a developer to only one AWS Elastic Beanstalk application resulted in nothing but hours of frustration. Amazon offers a bit of documentation:

The following policy is an example. It gives a broad set of permissions to the AWS products that AWS Elastic Beanstalk uses to manage applications and environments. For example, ec2:* allows an IAM user to perform any action on any Amazon EC2 resource in the AWS account. These permissions are not limited to the resources that you use with AWS Elastic Beanstalk. As a best practice, you should grant individuals only the permissions they need to perform their duties.

There is a reason why their example does not show correct policies for other AWS products. As it turns out Amazon made it nearly impossible to follow the best practice they recommend. The issue is that simply giving permissions to EB resources is not enough.

Each operation in EB ends up performing tasks on the underlying EC2, auto scaling, S3, RDS, and pretty much every other AWS service. If I could just compose an ARN for those resources that says “any resource that may be generated by the EB infrastructure that is related to this app,” it would be easy. However, AWS EB creates obscure IDs for EB environments that are literally impossible to determine from looking at the EB dashboard or running some command line tool.
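To make the limitation concrete, here is a rough sketch of what a policy scoped to a single EB application might look like, built as a Python dictionary. The helper name `eb_app_policy`, the account ID, the application name, and the short action list are all placeholders for illustration. The ARN layout follows the application and environment formats that I understand AWS documents for Elastic Beanstalk, so verify it against the current IAM documentation before relying on it.

```python
import json


def eb_app_policy(region, account_id, app_name):
    """Build an IAM policy document limited to one EB application.

    Only Elastic Beanstalk ARNs are listed here. Nothing in this
    policy covers the EC2, S3, or auto scaling resources that EB
    creates on the application's behalf.
    """
    prefix = 'arn:aws:elasticbeanstalk:%s:%s' % (region, account_id)
    return {
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            # Illustrative subset of actions, not a complete list.
            'Action': [
                'elasticbeanstalk:DescribeEnvironments',
                'elasticbeanstalk:UpdateEnvironment',
            ],
            'Resource': [
                '%s:application/%s' % (prefix, app_name),
                '%s:environment/%s/*' % (prefix, app_name),
            ],
        }],
    }


# Placeholder region, account ID, and application name.
policy = eb_app_policy('us-east-1', '123456789012', 'my-app')
print(json.dumps(policy, indent=2))
```

A policy like this names only EB resources, and that is exactly the gap described above: the deployment still needs permissions on underlying resources whose identifiers are not known in advance.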

What I would like to see from AWS, and what would make EB that much more useful to us, is the ability to hierarchically control an IAM policy for a developer simply by specifying which operations they can perform. AWS can then cascade that policy down to EC2, S3, etc. In the meantime, a solid piece of documentation on determining the resources on my own would go a long way towards saving me time.

Amazon says in their EB documentation: “you retain full control over the AWS resources powering your application and can access the underlying resources at any time.” Well, it works lovely if you have only one or two application environments. But as I said above, EB ends up spawning other AWS resources with obscure names that are impossible to identify! So how am I supposed to retain full control over underlying AWS resources if I cannot find them?

This problem is exacerbated when there is an issue with one of the resources EB spins up. For example, yesterday I experienced a problem where EB could not deploy a new version to an environment because it thought there was something wrong with an instance. After 15 minutes, the error message simply stated something like this: “There was an error deploying to this environment because an instance timed out. See troubleshooting documentation.” Seriously? What am I supposed to do with that?

If I am to micromanage every aspect of Elastic Beanstalk environments and track the resources that it uses, then I have no use for it. I am better off using EC2 instances directly and coming up with CloudFormation templates for my applications. If AWS is going to market EB as a valuable tool then they also need to fix the following:

  • Abstract and hierarchically control IAM policies, such that a single policy controlling access to an EB application environment also controls access to underlying resources that EB may spin up on behalf of the application.
  • Abstract full control over the AWS resources spun up by EB so I don’t need to look for them – or make them easier to identify.
  • Abstract error conditions that happen in the underlying AWS resources. If during a deployment an instance doesn’t respond – just terminate and recreate it.

I hope someone from AWS sees this post and acts on it, because the above issues make EB less useful to me by the day.


On apprenticeship

When I was a freshman at Clarkson in 1996 there was a work-study program they called Student-Directed Computing Services. It was an effort to recruit students and get their help in wiring the campus for high speed Internet. It was thanks to that program that by the end of that year I had a real world paid experience in UNIX administration and networking.

In the summer after freshman year I took another paid opportunity at Clarkson, building educational software for Windows in C++. I continued working there throughout my sophomore year. By the time I got an offer to take a semester off and join IBM as a co-op intern, I had already accumulated a year of paid experience. After my junior year I took another internship at IBM Research.

Overall, by the time I graduated and got my first job out of college I had four years of practical, paid experience in addition to my studies.

I see college graduates these days who come out of four-year degree programs not understanding basic practical concepts. I contrast that with my experience and I realize just how lucky I was that opportunities to get paid work experience in my field were presented to me early on.

Software engineering is an engineering discipline that could benefit from a concept of apprenticeship. Computer science students in four-year degree programs should be asked, as part of their graduation requirements, to seek out apprenticeship opportunities. Colleges need to actively offer these opportunities – after all, IT on a university campus is no different from enterprise IT in the private sector.