Amazon’s managed services are easy to use and don’t require any special operations skills. You can set up a load balancer, speed up content delivery with CloudFront, or store enormous amounts of data in S3 in 2 clicks. But what if we want to get to the bottom of how they all work? Do they work correctly in our case?
The bigger your system becomes, the harder it is to maintain all the services it includes. Some parts fail after an unsuccessful operation, and some can get stuck after a server reboot. Some processes can simply exit because of a panic.
Four words for every Linux system administrator: config, console, backup and log. Pay close attention to the logs. With properly configured log management you can listen closely to your infrastructure. By listening I mean spotting strange processes and unusual statistics. Even more important is logging from your own service applications. It’s also great to route alerts to several destinations, such as e-mail or chat.
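To make the idea of routing alerts to several destinations concrete, here is a minimal sketch using Python's standard `logging` module. It assumes a hypothetical `CallbackHandler` class and `build_logger` helper, and it uses plain lists in place of real e-mail and chat senders so that it runs without any network access. The service name `billing` is also just an illustration.

```python
import io
import logging


class CallbackHandler(logging.Handler):
    """Hypothetical handler: passes each formatted record to a callable
    that stands in for an e-mail or chat sender."""

    def __init__(self, send, level=logging.ERROR):
        super().__init__(level)
        self.send = send

    def emit(self, record):
        try:
            self.send(self.format(record))
        except Exception:
            # Let the logging module report the failure instead of crashing.
            self.handleError(record)


def build_logger(name, log_stream, destinations):
    """Write every record to log_stream; fan errors out to all destinations."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    formatter = logging.Formatter("%(levelname)s %(name)s: %(message)s")

    # Full log: everything at INFO and above.
    stream_handler = logging.StreamHandler(log_stream)
    stream_handler.setFormatter(formatter)
    logger.addHandler(stream_handler)

    # Alerts: only ERROR and above, one handler per destination.
    for send in destinations:
        alert_handler = CallbackHandler(send)
        alert_handler.setFormatter(formatter)
        logger.addHandler(alert_handler)
    return logger


# Stand-ins for real destinations (assumptions for this sketch).
log_stream = io.StringIO()
email_outbox, chat_outbox = [], []

logger = build_logger(
    "billing", log_stream, [email_outbox.append, chat_outbox.append]
)
logger.info("service started")
logger.error("worker exited after panic")
```

Here the informational message only reaches the full log, while the error also lands in both outboxes. In a real setup the callables could wrap something like `logging.handlers.SMTPHandler` or an HTTP call to a chat webhook; those details depend on your infrastructure and are left out on purpose.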