$1 is the first column. By default, awk separates columns on whitespace (runs of spaces or tabs). A different delimiter can be specified with -F. In the examples below, -F'[: ]' means that columns are delimited by either a space or a colon.
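As a minimal sketch of the difference between the default separator and -F'[: ]', the following splits one made-up record both ways. The record and the variable names are hypothetical and used only for illustration.

```shell
# Hypothetical sample record, not taken from a real file.
line='alice:x:1000 /home/alice'

# Default separator: fields are split on runs of whitespace only,
# so the colons stay inside the first field.
default_first=$(printf '%s\n' "$line" | awk '{ print $1 }')

# -F'[: ]': every single colon or space ends a field.
custom_third=$(printf '%s\n' "$line" | awk -F'[: ]' '{ print $3 }')

echo "$default_first"   # alice:x:1000
echo "$custom_third"    # 1000
```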
Snippets for access logs
99.56.8.181 10.0.1.239 - - [16/Nov/2018:20:45:59 +0000] "GET /app/themes/example/dist/img/marketing-hero-cover.jpg HTTP/1.0" 200 38808 "https://www.example.com/sw.js" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0"
## Print the client fields (IPs and timestamp, i.e. everything before the request) and the user agent for requests made during hours 18 and 19
cat /var/log/httpd/access_log | awk -F'[: ]' '$6 >= 18 && $6 <= 19 { print }' | awk -F\" '{print $1,$6}' | more
## Print the entire log entry for requests made during hours 18 and 19 with status code 301
cat /var/log/httpd/access_log | awk -F'[: ]' '$6 >= 18 && $6 <= 19 && $13 == 301 { print }' | more
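The field numbers used above ($6 for the hour, $13 for the status code) depend on the log format, so it can help to list the fields of one entry before writing a filter. The sketch below does this for the sample entry shown earlier, with the user agent shortened for readability; the variable names are illustrative.

```shell
# Sample entry from the access-log example, user agent abbreviated.
entry='99.56.8.181 10.0.1.239 - - [16/Nov/2018:20:45:59 +0000] "GET /app/themes/example/dist/img/marketing-hero-cover.jpg HTTP/1.0" 200 38808 "https://www.example.com/sw.js" "Mozilla/5.0"'

# Print the first 14 fields with their index to see what each one holds.
printf '%s\n' "$entry" | awk -F'[: ]' '{ for (i = 1; i <= 14 && i <= NF; i++) print i ": " $i }'

# Pull out the two fields the filters rely on.
hour=$(printf '%s\n' "$entry" | awk -F'[: ]' '{ print $6 }')
status=$(printf '%s\n' "$entry" | awk -F'[: ]' '{ print $13 }')
echo "hour=$hour status=$status"   # hour=20 status=200
```

Note that a request path or query string containing a colon or a space adds extra fields, which shifts the status code away from $13 for that entry.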
## Find referrers matching a pattern
cat /var/log/httpd/access_log | awk -F'"' '$4~/(menshealth\.com|fitnessmagazine\.com|seriouseats\.com|giants\.com|soaphub\.com|tmz\.com|bleacherreport\.com)/ {print $1,$4}' | more
## Find requests matching a pattern
cat /var/log/httpd/access_log | awk -F'"' '$2~/\/search\?q=/ {print}' | more
Domain count
Counts requests per referrer domain for logs in this format:
127.0.0.1 (127.0.0.1, 127.0.0.1) - - [24/Jan/2024:19:23:10 +0000] "GET /api/v3/callback/?callback=initProductRecommendations&z=59&url=https%3A%2F%2Fwww.example.com%2Freviews%2Freview%2F HTTP/1.1" 200 9105 "https://www.example.com/reviews/review/" "Amazon CloudFront"
cat /var/log/httpd/access_log | awk -F'"' '{ print $4 }' | awk -F'[/:]' '{ print $4 }' | sort | uniq -c
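To see how the second awk stage isolates the domain, here is a small sketch that runs it on a few hypothetical referrer values standing in for field $4 of real log lines. Unlike the one-liner above, it skips entries with no domain (for example a "-" referrer) and sorts the result by count; both are optional additions.

```shell
# Hypothetical referrer values, one per line; "-" represents a missing referrer.
referrers='https://www.example.com/reviews/review/
https://www.example.com/
https://shop.example.org/cart
-'

# Splitting "https://host/path" on "/" or ":" yields: https, "", "", host, ...
# so the host is field 4. Lines without a fourth field are skipped.
counts=$(printf '%s\n' "$referrers" \
  | awk -F'[/:]' '$4 != "" { print $4 }' \
  | sort | uniq -c | sort -rn)

echo "$counts"
```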
Git snippets
Here is an example that finds the most frequently changed files in a project. You can use awk expressions to filter the results, for example by change count or by path.
## Find files that were changed at least a certain number of times (here 10) in the last 250 commits, excluding ones matching a specific regex pattern
git log --pretty=format: --name-only -n 250 | sort | uniq -c | sort -rg | awk -v P="$(pwd)" 'NF > 1 && $2 !~ /web\/wp/ && $1 >= 10 {print $1, P, $2}'
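The filter stage can be tried without a repository. In `uniq -c` output the count is field 1 and the path is field 2, and the blank lines that `git log --pretty=format:` emits between commits show up as a count with no path. The sketch below feeds hypothetical counts to the same kind of awk expression; the paths, counts, and the /srv/project prefix are made up for illustration.

```shell
# Hypothetical `uniq -c` output: change count, then file path.
# The last line mimics the count of blank lines between commits.
hot=$(printf '%s\n' \
  '     14 src/app.js' \
  '     11 web/wp/wp-config.php' \
  '      3 README.md' \
  '    249 ' \
  | awk -v P="/srv/project" 'NF > 1 && $2 !~ /web\/wp/ && $1 >= 10 { print $1, P, $2 }')

echo "$hot"   # 14 /srv/project src/app.js
```

Only src/app.js survives: the web/wp file is excluded by the pattern, README.md is below the threshold, and the blank-line count has no second field.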