jedberg/quickparse
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
This program is used by reddit to more easily parse
haproxy log file lines. It was written because using
awk was starting to get tiring. :)
Simple usage is something like this:
tail -F logfile | quickparse.py status uri
That will get you something like this:
200 /user/jedberg/about.json
200 /user/reddit
200 /static/reddit.js?v=7dd397929b2f6c735455095e17a9eaa5
200 /ads/r/Enhancement/
200 /comments/fjgit/.json?sort=top&limit=200
304 /static/button/button2.html?width=51
304 /static/button/button2.html?width=51
404 /user/lskdjf
503 /user/foobar
Also in this repository is a program called log_gen.pl,
which will generate random bogus log lines for your testing
needs.
For this program to work, your haproxy needs to be
configured correctly. Here are the relevant logging bits
for your haproxy configuration file:
In the frontend section:
option httplog
log /dev/log local4
capture request header User-Agent len 150
capture request header Host len 50
capture request header Referer len 300
capture request header X-Forwarded-For len 15
capture request header If-Modified-Since len 50
capture cookie session= len 25
In each backend section:
capture cookie session= len 25
(this may now be unnecessary -- it was there because of
a bug in haproxy)
The local4 in the log option sends all the logs to
the local4 service, which helps in keeping the logs
seperate.
The quickparse program has a few support modules that are
reddit specific. I left these in as examples.
Enjoy!