By: thiru

how to reload the skipped old files in ELK

December 17, 2014

How do I reload the skipped old files in ELK? I also need to know where the file
information is saved in Elasticsearch.

2 comments
  • Hi! Could you be a little more specific? Any additional information that you could provide would help us figure out your problem. How are you inputting logs into logstash? Are you using logstash-forwarder or the file input? This blog post shows how you can load old logs using logstash-forwarder.
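
    In case the file input is what's in play: a minimal sketch of a file input
    that reads old files from the beginning (the paths are this thread's example
    paths, and the sincedb_path value is a hypothetical throwaway location):

    input {
      file {
        path => [ "/home/lab/logs/mp.log", "/home/lab/data/samplexml.log" ]
        # Read from the start of each file instead of only tailing new lines
        start_position => "beginning"
        # Hypothetical throwaway sincedb so repeated test runs start fresh
        sincedb_path => "/tmp/sincedb-backfill"
        type => "logfile"
      }
    }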

  • Hi,

    Thanks for the reply @asb.

    I am using logstash-forwarder as the input, and I deleted all of the old data files in
    Elasticsearch under /var/lib/elasticsearch/nodes/0/indices. When I try to reload a few
    files with logstash-forwarder, they are not loaded; the forwarder skips them as old files.

    The setup: Ubuntu 14.04, elasticsearch 1.1.1, kibana 3.0.1, logstash 1.4.2,
    logstash-forwarder 0.3.1.

    Logstash Forwarder configuration file

    {
      "network": {
        "servers": [ "localhost:5000" ],
        "timeout": 15,
        "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
      },
      "files": [
        {
          "paths": [
           "/home/lab/logs/mp.log","/home/lab/data/samplexml.log"      
           ],
          "fields": { "type": "logfile" }
        }
       ]
    }
    

    These lines appear in the syslog:

    Dec 16 07:18:10 logstash-forwarder[15910]: 2014/12/15 07:18:10.889356 Loading registrar data
    Dec 16 07:18:10 logstash-forwarder[15910]: 2014/12/15 07:18:10.889520 Skipping old file: /home/lab/logs/mp.log
    Dec 16 07:18:10 logstash-forwarder[15910]: 2014/12/15 07:18:10.889626 Loading registrar data
    Dec 16 07:18:10 logstash-forwarder[15910]: 2014/12/15 07:18:10.889723 Skipping old file: /home/lab/data/samplexml.log
    Dec 16 07:18:10 logstash-forwarder[15910]: 2014/12/15 07:18:10.889802 Setting trusted CA from file: /etc/pki/tls/certs/logstash-forwarder.crt
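
    The "Skipping old file" messages mean the files' modification times are
    older than the forwarder's dead time (24 hours by default). The "Loading
    registrar data" lines refer to the forwarder's registry file
    (.logstash-forwarder, written in the directory the forwarder runs from),
    which records per-file read offsets; the indexed data itself lives under
    the Elasticsearch data directory mentioned above. One simple workaround,
    sketched here with this thread's paths, is to make the files look fresh
    again before restarting the forwarder:

    # Update mtime so the forwarder no longer treats the files as "old"
    touch /home/lab/logs/mp.log /home/lab/data/samplexml.log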
    

    input & filter configuration

    input {
      lumberjack {
        port => 5000
        type =>"logs"
        ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
        ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
      }
    }
    
    filter {
      if [type] == "logfile" {
        grok {
          match => { "message" => "%{POSINT:ident} \[%{TIMESTAMP_ISO8601:time}\] %{GREEDYDATA:message}" }
          add_field => [ "received_at", "%{time}" ]
          add_field => [ "received_from", "%{host}" ]
        }
    
      }
    
    }
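
    For reference, with this grok pattern a made-up line such as the one below
    would parse, capturing 123 as ident, the bracketed timestamp as time, and
    the rest as message:

    123 [2014-12-15T07:18:10] sample message text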
    
2 Answers

Check out the blog post that I linked. By default, logstash-forwarder doesn't ingest old logs, but we can backfill them.

cat /home/lab/data/samplexml.log | /opt/logstash-forwarder/bin/logstash-forwarder -config temp.conf -spool-size 100 -log-to-syslog

where temp.conf is:

{
  "network": {
    "servers": [ "localhost:5000" ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [ "-" ],
      "fields": { "type": "logfile" }
    }
   ]
}

This configuration allows logstash-forwarder to read logs piped in through stdin; the special path "-" stands for standard input.
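
The same approach extends to backfilling several files in one run; a sketch that reuses the exact command and temp.conf from above:

cat /home/lab/logs/mp.log /home/lab/data/samplexml.log | /opt/logstash-forwarder/bin/logstash-forwarder -config temp.conf -spool-size 100 -log-to-syslog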

  • Great job, thanks for the solution, it's working fine.

    The file name is displayed as '-'. Is there a way to keep the original file name (the name of the file being loaded)?
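
    Since the logs arrive on stdin, the forwarder only ever sees the path "-".
    One workaround, a sketch that reuses the "fields" option from the configs
    above (only the files section is shown, and source_file is a made-up field
    name you would set per run):

    {
      "files": [
        {
          "paths": [ "-" ],
          "fields": { "type": "logfile", "source_file": "samplexml.log" }
        }
      ]
    }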

Great man, this is working flawlessly!
I'm loading a whole year of Apache logs like this:

cd /path/to/stored/logs/
for log in *; do
  zcat "$log" | /opt/logstash-forwarder/bin/logstash-forwarder -config backfill_apache.conf -spool-size 100
done

I am using zcat because they are gzipped.
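
If the directory ever mixes compressed and plain-text logs, zcat -f passes non-gzip files through unchanged, so the same loop can cover both; a sketch of the piped command:

zcat -f "$log" | /opt/logstash-forwarder/bin/logstash-forwarder -config backfill_apache.conf -spool-size 100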

My conf:
backfill_apache.conf

{
  "network": {
    "servers": [ "logstash:5000" ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [ "-" ],
      "fields": { "type": "apache" }
    }
   ]
}