Logstash GROK pattern for nginx log from syslog

December 14, 2016
Nginx Logging Elasticsearch CentOS

Hi!

I have an ELK setup. A customer is sending nginx logs via syslog, and part of each log line can be dropped. I need help writing a grok pattern for Logstash.

Customer's log line:

2016-12-14T09:07:25.633Z 83.145.1.94 <13>Dec 14 09:07:25 128215238 442052907    406581698       -       ftp     14/Dec/2016:09:07:25 +0000      128215238       87.96.217.166   GET     live.tve.teracom.se     /live/ramdisk/TV_SDjk/LIVE_TVE_MSS/QualityLevels(4000000)/Fragments(video=25576240718010)  200     1001146 HIT     0.004   0.004   34242   442052907       376743631       live.tve.teracom.se     83.145.1.94     Fluendo SmoothStreaming demuxer/0.10.31  -       -       1000514 V7

The part that can be dropped:

2016-12-14T09:07:25.633Z 83.145.1.94 <13>Dec 14 09:07:25 128215238 442052907    406581698       -       ftp

The part that should be kept:

14/Dec/2016:09:07:25 +0000      128215238       87.96.217.166   GET     live.tve.teracom.se     /live/ramdisk/TV_SDjk/LIVE_TVE_MSS/QualityLevels(4000000)/Fragments(video=25576240718010)  200     1001146 HIT     0.004   0.004   34242   442052907       376743631       live.tve.teracom.se     83.145.1.94     Fluendo SmoothStreaming demuxer/0.10.31  -       -       1000514 V7

Log format

$time_local<TAB>$edge_server_id<TAB>$remote_addr<TAB>$request_method<TAB>$http_host<TAB>$request_uri<TAB>$status<TAB>$bytes_sent<TAB>$upstream_http_x_cache<TAB>$upstream_response_time<TAB>$request_time<TAB>$tcpinfo_rtt<TAB>$for_operator<TAB>$resource_id<TAB>$server_name<TAB>$server_addr<TAB>$http_user_agent<TAB>$http_referer<TAB>$http_range<TAB>$body_bytes_sent<TAB>V7
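
Something along these lines is what I'm after (an untested sketch on my part; the field names follow the nginx variables in the format above, and tabs are written as \t):

filter {
  grok {
    # Skip the droppable syslog prefix with a non-greedy match, then pull
    # out the tab-separated access-log fields. Untested sketch.
    match => { "message" => "^.*?%{HTTPDATE:time_local}\t%{NUMBER:edge_server_id}\t%{IP:remote_addr}\t%{WORD:request_method}\t%{NOTSPACE:http_host}\t%{NOTSPACE:request_uri}\t%{NUMBER:status}\t%{NUMBER:bytes_sent}\t%{NOTSPACE:upstream_http_x_cache}\t%{NOTSPACE:upstream_response_time}\t%{NOTSPACE:request_time}\t%{NUMBER:tcpinfo_rtt}\t%{NOTSPACE:for_operator}\t%{NOTSPACE:resource_id}\t%{NOTSPACE:server_name}\t%{IP:server_addr}\t%{DATA:http_user_agent}\t%{NOTSPACE:http_referer}\t%{NOTSPACE:http_range}\t%{NUMBER:body_bytes_sent}\tV7$" }
  }
}

Not sure this is the right approach, so any corrections are welcome.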
2 Answers

This tutorial is a great starting place for understanding how to work with GROK patterns to filter logs coming into Logstash. It includes this sample GROK pattern for Nginx's default access log:

NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer}) %{QS:agent}
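
If you adapt that pattern, custom patterns like these are typically saved to a file in a patterns directory and referenced from the grok filter, roughly like this (the directory path here is just an example):

filter {
  grok {
    # Directory containing a file with the NGUSERNAME/NGUSER/NGINXACCESS lines above
    patterns_dir => ["/etc/logstash/patterns"]
    match => { "message" => "%{NGINXACCESS}" }
  }
}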

I always find Grok Constructor extremely helpful when I need to build new patterns or debug existing ones. It lets you interactively test patterns and see exactly what is being matched and what is being dropped.

by Mitchell Anicas
One way to increase the effectiveness of your Logstash setup is to collect important application logs and structure the log data by employing filters. This guide is a sequel to the [How To Use Logstash and Kibana To Centralize Logs On Ubuntu 14.04](https://www.digitalocean.com/community/tutorials/how-to-use-logstash-and-kibana-to-centralize-and-visualize-logs-on-ubuntu-14-04) tutorial, and focuses primarily on adding filters for various common application logs.

I think it's easier to use the mutate filter to delete all the fields that you don't want before the event is sent to Elasticsearch. Configure it in your Logstash config file:

filter {
  mutate {
    remove_field => [ "field1", "field2" ]
  }
}
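
For this particular log, a sketch of that approach could grok the droppable syslog prefix into temporary fields and then remove them in the same filter block (the field names below are placeholders, not anything predefined):

filter {
  grok {
    # Split off the droppable syslog prefix into temporary fields.
    match => { "message" => "^%{TIMESTAMP_ISO8601:syslog_ts} %{IP:relay_ip} <%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_time} %{GREEDYDATA:access_line}" }
  }
  mutate {
    # Drop the prefix fields before the event reaches Elasticsearch.
    remove_field => [ "syslog_ts", "relay_ip", "syslog_pri", "syslog_time" ]
  }
}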