Infrastructure from scratch. Part 4: log management


Four words for every Linux system administrator: config, console, backup, and log. Pay close attention to the log part. With properly configured log management you can really listen to your infrastructure; by listening I mean spotting strange processes and gathering statistics. Even more important is the application's own logging. It's also great to redirect alerts to multiple destinations, such as e-mail or chat.

Ok, let's skip the boring notes about importance and get down to business!

Tool choice: one clear option.


The easiest way to deal with logs is to set up a complete system with a dashboard. I could have tried Loggly or Graylog, which are free, but I settled on the filebeat-logstash-elasticsearch-kibana chain.

It doesn't need any extra integration; it works out of the box.

Everything in this chain is an Elastic product, so the pieces link together nicely. And with the suggested manuals the setup is clear and easy to automate. For example, my automation solution was an Ansible playbook with four roles.

Why logstash?


It's not only easy to deploy. Logstash is also built on the grok pattern engine, which is a popular and declarative way to split your events into fields. And the configuration is flexible and powerful because it is essentially programmable.
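For a sense of what that configuration looks like, here is a minimal pipeline sketch (the beats port, the pattern, and the hosts are placeholders, not my real setup):

    input {
        beats {
            port => 5044    # filebeat ships records here
        }
    }

    filter {
        grok {
            # split the raw line into named fields
            match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
    }

    output {
        elasticsearch {
            hosts => ["localhost:9200"]
        }
    }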

Why Elasticsearch & Kibana?


Because it's in the same chain as logstash! I'm only partly kidding. Really, why would you agonize over document-oriented databases when logstash simply works with Elasticsearch? Admittedly, I have never faced high-load logging, so in some cases Elasticsearch might be a bad fit. But logstash does have plugins for integrating with other NoSQL databases.

Kibana is a genuinely beautiful dashboard, and it stays declarative. Its graphs, pies, maps, and dashboards are really enough to keep all your events at your fingertips. I did the setup a few months ago, so I'm on the 4.x version, while 5.x is out and already used in many places. One more task for the list: upgrade Kibana!

Writing grok patterns for logstash.


With this cheatsheet you need no more than half an hour to get into the grok basics. I also wrote a post about this process. The typical workflow:

  1. write some drafts
  2. test them online
  3. put them into a dedicated file (like /etc/logstash/patterns/mypattern) and name them
  4. reference them in the config file (just put the pattern name in match; see the sketch after this list)
  5. test the config:
    sudo service logstash configtest
  6. restart logstash:
    sudo service logstash restart

    This might be painful, because there is no way to reload the configuration other than a restart.

  7. set the type in the clients' filebeat configuration and restart it.
  8. go to Kibana and check that the records are coming in, split by tags.
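To illustrate steps 3 and 4, here is what a named pattern and its reference might look like (the MYAPP name and the log format are made up for the example, not my real patterns):

    # /etc/logstash/patterns/mypattern
    MYAPP %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log_level} %{GREEDYDATA:msg}

and in the filter section of the logstash config:

    grok {
        patterns_dir => "/etc/logstash/patterns"
        match => { "message" => "%{MYAPP}" }
    }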

Skip invalid records


Here is a workaround to drop all broken records before they reach Elasticsearch: add a special tag (say, 'valid') that grok only applies when it successfully parses a record.

if [type] == "postgresql" {
    grok {
        add_tag => [ "valid" ]
        patterns_dir => "/etc/logstash/patterns"
        match => { "message" => "%{POSTGRESQL}" }
        named_captures_only => true
    }

    date {
        match => [ "postgresql_timestamp", "yyyy-MM-dd HH:mm:ss" ]
    }

    # grok adds the tag only on a successful match,
    # so anything without it failed to parse
    if "valid" not in [tags] {
        drop { }
    }

    mutate {
        remove_tag => [ "valid" ]
    }
}

Deploying the GeoIP feature


Yes, GeoIP! With the ELK stack you can feel like a hacker from an action movie! It's such a handy thing when you're trying to analyze your service audience's location. Here is my simple map based on Nginx proxy logs.

I won't describe the installation process; there is a guy from DigitalOcean who has dealt with it beautifully.
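I'll only note that once the GeoIP database is installed, the filter side is tiny; assuming your Nginx grok pattern extracts the visitor address into a clientip field, it's just:

    geoip {
        source => "clientip"    # field that holds the visitor IP
    }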

Here is my application!


Everything up to this point was just a warm-up until the developers polished the app logs. That's the most important part: tracking every step the service takes while doing its job.

First I wrote patterns for each kind of record. We have a Node.js app, so there are HTTP requests and SQL statements from Sequelize. I split the log record parsing into three kinds: HTTP records, Sequelize records, and the rest of the events. With Kibana filters it all reads perfectly. And yes, of course, I attached the GeoIP feature to the HTTP events.
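A rough sketch of that split (HTTP_RECORD and SEQUELIZE_RECORD are hypothetical stand-ins for my real patterns); grok tries the patterns in order, so the last one catches everything else:

    grok {
        patterns_dir => "/etc/logstash/patterns"
        # specific patterns first, catch-all last
        match => { "message" => [ "%{HTTP_RECORD}", "%{SEQUELIZE_RECORD}", "%{GREEDYDATA:event}" ] }
    }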

After this I switched to the client side. I worked a little with logrotate, putting in a config to rotate and drop obsolete logs:

/var/log/myapp/myapp.log {
    dateext
    daily
    rotate 7
    compress
    delaycompress
    missingok
}

Logrotate should pick this up on the next cron run, but it's better to test it first; the -d flag runs logrotate in debug mode without actually rotating anything:

sudo logrotate -d /etc/logrotate.d/myapp

Finally I went back and configured the notification feature. Alerts go to my e-mail and the project's Slack channel. A simple condition tells logstash to perform the output operation:

if [log_level] == "error" or [log_level] == "warn" {
    mutate {
        add_tag => "notification"
    }
}

It's not easy to attach logstash directly to the Google SMTP server. Even if the credentials are right, you are likely to run into TLS connection failures. It's better to install an SMTP relay and point logstash at the local port 25, which is what the logstash email output does by default.
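If you prefer to be explicit about it, the email output accepts the relay address and port; the values below are the plugin defaults:

    email {
        address => "localhost"    # local SMTP relay
        port    => 25
        to      => "app.notice@gmail.com"
        body    => "%{message}"
    }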

For Slack you have to set up an output plugin. One of logstash's features is built-in integration: it can be attached to many kinds of software, like mail, monitoring, and project management tools. But there is no native output plugin for Slack. This post was written in August, so newer logstash versions may include this integration.

Ok, I picked this plugin, and nothing was easier than installing it and configuring it with a Slack webhook. Finally, we get messages of this kind.
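For the record, installing a community plugin is a one-liner; on my 2.x package install the binary lives under /opt/logstash (the path may differ on your system):

    sudo /opt/logstash/bin/plugin install logstash-output-slack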

Here is the total output config snippet for e-mail and Slack:

    if "notification" in [tags]  {
        email  {
            from => "logstash"
            subject => "logstash alert"
            to => "app.notice@gmail.com"
            body => "%{message}"
        }

        slack {
            url => "https://hooks.slack.com/services/Tt/B1t6/MVpGZhLwUUyY"
            channel => "service-alerts"
            username => "logstash"
format => "%{[beat][hostname]}: %{message}"
            icon_emoji => ":shit:"
        }
    }

Advice


1 Pick open source tools. This matters in every operations field, but I can't skip the thought here. Open source solutions are not only free: they're endlessly customizable. You can do whatever you want with them: dig into the application code, write plugins, and so on. And even if you're not competent enough to do it yourself (like me), somebody has already done the job and shared the result on GitHub, and you're free to grab it and use it. That's exactly my story with the Slack plugin for logstash.

2 Talk to developers. Remember, you're doing this not only for yourself but for your team in particular. The question is how comfortable your developers are reading the log information. Understand what kind of research they want to do, which fields matter most, and so on. When nobody has any idea of how your log management works, the result will be no more than bullshit.

3 Check the logstash logs after a restart. Even if your configtest succeeded, there may still be a logical error in your configs. Even while logstash.stdout shows 'sending logs', logstash.log may print an error, and as a result the service will stop.
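A quick way to catch this (log locations from a standard package install; yours may differ):

    sudo tail -f /var/log/logstash/logstash.log /var/log/logstash/logstash.stdout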
