Linux Log Parsing
Linux Log Parsing, Text Manipulation and Data Analysis Commands
Comprehensive guide to Linux log parsing and text manipulation commands including sed, awk, grep, cut, sort, jq, and more. Master data analysis, log monitoring, and text processing for system administration.
About Log Parsing Commands
Log parsing and text manipulation are essential skills for system administrators and DevOps professionals. These commands help you analyze log files, extract patterns, manipulate text streams, and process structured data efficiently. Monitor your log analysis in real-time and test these techniques on optimized VPS infrastructure.
From simple pattern matching with grep to complex text transformations with sed and awk, mastering these tools will significantly boost your productivity in troubleshooting and data analysis tasks. Compare performance metrics to ensure your server handles log processing efficiently.
Stream Editors & Pattern Processing
sed - Stream Editor
sed (stream editor) is a powerful tool for editing text streams. It's used for text manipulation, including search and replace, insertion, deletion, and more, based on regular expressions.
# Replace first occurrence
$ sed 's/old/new/' file.txt
# Replace all occurrences (global)
$ sed 's/old/new/g' file.txt
# Edit file in-place
$ sed -i 's/old/new/g' file.txt
# Delete lines matching pattern
$ sed '/pattern/d' file.txt
# Print only lines matching pattern
$ sed -n '/pattern/p' file.txt
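Since the description also mentions insertion, here is a brief illustration of adding lines around a match (the pattern, text, and file name are placeholders; the one-line form shown works with GNU sed):
# Insert a line before each matching line
$ sed '/pattern/i\Inserted above' file.txt
# Append a line after each matching line
$ sed '/pattern/a\Appended below' file.txt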
awk - Pattern Scanning and Text Processing Tool
AWK is a versatile text processing tool, primarily used for data manipulation, text reporting, and actions based on field-separated data. It operates line by line and is particularly useful for working with structured data.
# Print specific columns
$ awk '{print $1, $3}' file.txt
# Print with custom delimiter
$ awk -F: '{print $1}' /etc/passwd
# Sum a column
$ awk '{sum+=$1} END {print sum}' file.txt
# Filter rows with conditions
$ awk '$3 > 100 {print $0}' file.txt
# Pattern matching
$ awk '/ERROR/ {print $0}' logfile.txt
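Because awk keeps state in associative arrays, it's also handy for quick reports. A small sketch, assuming the first field is a client IP address as in common web access logs (the file name is illustrative):
# Count hits per IP and print the totals
$ awk '{count[$1]++} END {for (ip in count) print ip, count[ip]}' access.log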
echo - Display Text or Output
The echo command is used to print text or variables to the standard output (usually the terminal). It's commonly used for displaying messages and producing output in shell scripts.
# Basic output
$ echo "Hello, World!"
# Without trailing newline
$ echo -n "No newline"
# Enable escape sequences
$ echo -e "Line1\nLine2\tTabbed"
# Output variables
$ echo $PATH
# Redirect to file
$ echo "Log entry" >> logfile.txt
grep - Global Regular Expression Print
grep is used for searching text using regular expressions. It scans text and outputs lines that match the specified pattern.
# Basic search
$ grep "pattern" file.txt
# Case insensitive
$ grep -i "error" logfile.txt
# Recursive search
$ grep -r "TODO" /project/
# Show line numbers
$ grep -n "error" file.txt
# Invert match (exclude pattern)
$ grep -v "DEBUG" logfile.txt
# Count matches
$ grep -c "error" logfile.txt
# Show context (lines before/after)
$ grep -A 3 -B 3 "error" logfile.txt
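Two related options worth knowing for log work: extended regular expressions and literal (fixed-string) matching. A short illustration (patterns and file names are placeholders):
# Extended regex: match ERROR or WARN
$ grep -E "ERROR|WARN" logfile.txt
# Fixed-string match (no regex interpretation of dots)
$ grep -F "192.168.1.1" logfile.txt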
Network logs: Use grep with our Linux Networking Commands for troubleshooting | Advanced patterns in Linux Admin Tips
Analyze DNS logs: DNS Lookup tool for DNS troubleshooting
Advanced Search Tools
ngrep - Network Packet Analyzer
ngrep is a network packet analyzer tool that allows you to search for patterns in network traffic. It's useful for monitoring network activity and filtering packets based on regular expressions.
# Monitor HTTP traffic
$ sudo ngrep -q -W byline "^(GET|POST)" tcp port 80
# Search for specific pattern
$ sudo ngrep -d any "password" port 80
# Monitor DNS queries
$ sudo ngrep -q -d any port 53
ripgrep (rg) - Line-Oriented Search Tool
ripgrep is a line-oriented search tool that recursively searches your current directory for a regex pattern. It's designed to be fast and efficient, making it a popular choice for code searching and text processing.
# Basic search
$ rg "pattern"
# Search specific file types
$ rg "function" -t py
# Case insensitive
$ rg -i "error"
# Show context
$ rg -C 3 "pattern"
# Search hidden files
$ rg --hidden "pattern"
agrep - Approximate Grep
agrep stands for "approximate grep". It allows you to perform approximate string matching, useful for finding similar or misspelled words in text.
# Search with 1 error allowed
$ agrep -1 "patern" file.txt
# Case insensitive approximate search
$ agrep -i -2 "linux" file.txt
ugrep - Ultra-Fast Search Tool
ugrep is a command-line search tool that supports recursive search, regex patterns, and Unicode. It aims to be a feature-rich and efficient alternative to traditional grep.
# Basic recursive search
$ ugrep "pattern" -r
# Search with fuzzy matching
$ ugrep -Z "patern"
# Interactive search
$ ugrep -Q
ack - Developer-Friendly Code Search
ack is a tool for searching text and code. It's designed to be developer-friendly, automatically skipping version control directories and binary files by default.
# Search in code
$ ack "function"
# Search specific file types
$ ack --python "class"
# Ignore case
$ ack -i "error"
# List matching files only
$ ack -l "TODO"
ag (The Silver Searcher) - Code Search
The Silver Searcher, commonly known as "ag", is another code searching tool optimized for speed and popular among developers. It's similar to ack and ripgrep in terms of usage.
# Basic search
$ ag "pattern"
# Search specific file types
$ ag --python "class"
# Case sensitive search
$ ag -s "Pattern"
# Show context
$ ag -C 2 "pattern"
pt (The Platinum Searcher) - Code Search
The Platinum Searcher is yet another code searching tool that focuses on speed and efficiency. It's similar to ag and ripgrep.
# Basic search
$ pt "pattern"
# Ignore case
$ pt -i "error"
# Search with file type filter
$ pt --go "func"
More tools: Explore our DevOps diagnostic tools | Basic commands: Linux Commands reference
Text Manipulation
cut - Extract Sections from Lines
The cut command is used for extracting sections from lines of files or data streams. It's often used to isolate specific fields or columns from text.
# Cut by delimiter
$ cut -d: -f1 /etc/passwd
# Cut by character position
$ cut -c1-10 file.txt
# Multiple fields
$ cut -d, -f1,3 data.csv
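GNU cut can also select everything except the named fields, which is occasionally useful when dropping a single column (GNU coreutils only):
# All fields except the second (GNU cut)
$ cut -d: --complement -f2 /etc/passwd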
sort - Sort Lines of Text
sort is used for sorting lines of text files or data streams in ascending or descending order. It's helpful for organizing data.
# Basic sort
$ sort file.txt
# Reverse sort
$ sort -r file.txt
# Numeric sort
$ sort -n numbers.txt
# Sort by column
$ sort -k2 file.txt
# Remove duplicates
$ sort -u file.txt
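For delimiter-separated data, a field separator can be combined with a numeric key; for instance, sorting /etc/passwd by UID (the third colon-separated field):
# Sort by the third field, numerically
$ sort -t: -k3 -n /etc/passwd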
uniq - Remove Duplicate Lines
uniq is used to remove duplicate lines from a sorted text file or data stream. It's often used in conjunction with sort.
# Remove duplicates
$ sort file.txt | uniq
# Count occurrences
$ sort file.txt | uniq -c
# Show only duplicates
$ sort file.txt | uniq -d
# Show only unique lines
$ sort file.txt | uniq -u
diff - Compare Files Line by Line
diff is a tool for comparing the contents of two files and finding the differences between them. It's often used for code and document comparisons.
# Basic comparison
$ diff file1.txt file2.txt
# Unified format
$ diff -u file1.txt file2.txt
# Side by side
$ diff -y file1.txt file2.txt
# Ignore whitespace
$ diff -w file1.txt file2.txt
tac - Reverse Cat
tac is the reverse of cat. It outputs the lines of a file in reverse order, displaying the last line first and so on.
# Display file in reverse
$ tac file.txt
# Reverse multiple files
$ tac file1.txt file2.txt
cat - Concatenate and Display Files
cat is short for "concatenate". It's used to display the contents of one or more files, or to combine multiple files into a single output.
# Display file
$ cat file.txt
# Concatenate files
$ cat file1.txt file2.txt > combined.txt
# Number lines
$ cat -n file.txt
printf - Format and Print Text
printf is used to format and print text in a specific way. It allows you to control the output format, including the width, precision, and alignment of data.
# Basic formatting
$ printf "Hello, %s\n" "World"
# Format numbers
$ printf "%.2f\n" 3.14159
# Align text
$ printf "%-10s %5d\n" "Item" 100
comm - Compare Two Sorted Files
comm is used to compare two sorted files line by line and display lines that are unique to each file or common to both.
# Show all columns
$ comm file1.txt file2.txt
# Show only lines in file1
$ comm -23 file1.txt file2.txt
# Show only common lines
$ comm -12 file1.txt file2.txt
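Because comm expects sorted input, a common idiom is to sort on the fly with process substitution (bash/zsh; file names here are placeholders):
# Sort both inputs inline, then show only the common lines
$ comm -12 <(sort file1.txt) <(sort file2.txt)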
tr - Translate or Delete Characters
tr is used for translating, deleting, or squeezing characters in a text stream. It's often used for character-level transformations.
# Convert to uppercase
$ echo "hello" | tr 'a-z' 'A-Z'
# Delete characters
$ echo "hello123" | tr -d '0-9'
# Squeeze repeating characters
$ echo "hello" | tr -s 'l'
rev - Reverse Lines Character-wise
rev reverses the characters in each line of a text file or data stream.
# Reverse characters in each line
$ rev file.txt
# Reverse from stdin
$ echo "hello" | rev
wc - Word Count
wc (word count) is used to count the number of lines, words, and characters in a text file or data stream.
# Count lines, words, characters
$ wc file.txt
# Count lines only
$ wc -l file.txt
# Count words only
$ wc -w file.txt
# Count bytes only
$ wc -c file.txt
nl - Number Lines
nl is used to add line numbers to the lines of a text file or data stream.
# Add line numbers
$ nl file.txt
# Number non-empty lines only
$ nl -b t file.txt
# Custom format
$ nl -n rz -w 3 file.txt
paste - Merge Lines of Files Side by Side
paste is used to merge lines from multiple files side by side. It's commonly used for combining data from different sources.
# Merge files side by side
$ paste file1.txt file2.txt
# Use custom delimiter
$ paste -d, file1.txt file2.txt
# Merge serially
$ paste -s file1.txt
Data Format & Specialized Tools
jq - Command-line JSON Processor
jq is a command-line JSON processor. It's used for querying, manipulating, and formatting JSON data. It's especially handy for parsing JSON in shell scripts.
# Pretty print JSON
$ cat data.json | jq '.'
# Extract specific field
$ cat data.json | jq '.name'
# Filter arrays
$ cat data.json | jq '.items[] | select(.price > 10)'
# Get array length
$ cat data.json | jq '.items | length'
# Transform structure
$ cat data.json | jq '{name: .username, id: .user_id}'
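As noted above, jq is especially handy inside shell scripts; a minimal sketch (the field names and data.json are illustrative):
# Capture a field in a shell variable (raw output, no quotes)
$ name=$(jq -r '.name' data.json)
# Iterate over array elements, one compact JSON object per line
$ jq -c '.items[]' data.json | while read -r item; do echo "$item"; done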
Container logs: Parse JSON logs from Docker containers | Orchestration: Kubernetes logs and events
csvcut - CSV Column Extraction Utility
csvcut is a utility for working with CSV (Comma-Separated Values) data. It allows you to select specific columns from CSV data.
# Extract specific columns
$ csvcut -c 1,3 data.csv
# Extract by column names
$ csvcut -c "name,price" data.csv
# Show column names
$ csvcut -n data.csv
ccze - Colorize Log Files
ccze is a tool that colorizes log files or text, making it easier to read and understand logs by highlighting different log levels and patterns.
# Colorize log file
$ cat /var/log/syslog | ccze
# Colorize with specific mode
$ tail -f /var/log/apache2/access.log | ccze -A
# Use specific plugin
$ cat logfile.txt | ccze -p syslog
File Viewing & Monitoring
less & more - Pager Programs
These are both pager programs that allow you to view text files one screen at a time. They are useful for browsing large files without overwhelming your terminal.
# View file with less (recommended)
$ less file.txt
# View file with more (simple)
$ more file.txt
# less navigation:
# Space: next page
# b: previous page
# /pattern: search forward
# ?pattern: search backward
# q: quit
tail - Display Last Lines of File
tail displays the last few lines of a file. Like head, it defaults to showing 10 lines, but this can be adjusted with options.
# Last 10 lines
$ tail file.txt
# Last 20 lines
$ tail -n 20 file.txt
# Follow file updates (real-time)
$ tail -f /var/log/syslog
# Follow multiple files
$ tail -f file1.log file2.log
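If a log is rotated while you're watching it, tail -F (capital F) keeps following by re-opening the file when it is recreated (the path shown is an example):
# Follow a file across log rotation
$ tail -F /var/log/nginx/error.log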
Real-time monitoring: Dashboard for infrastructure metrics | Performance benchmarks for server analysis
head - Display First Lines of File
head displays the first few lines of a file. By default, it shows the first 10 lines but can be configured to display a different number of lines.
# First 10 lines
$ head file.txt
# First 20 lines
$ head -n 20 file.txt
# First 100 bytes
$ head -c 100 file.txt
watch - Execute Command Repeatedly
The watch command repeatedly runs the specified command at regular intervals (by default, every 2 seconds) and displays the output. It's often used for monitoring system resource usage, such as watching disk space with df or observing a log file with tail.
# Monitor command every 2 seconds
$ watch df -h
# Custom interval (1 second)
$ watch -n 1 'ps aux | grep nginx'
# Highlight differences
$ watch -d free -h
# Monitor log file
$ watch 'tail -20 /var/log/syslog'
Comparison & Diff Tools
vimdiff - Visual Diff in Vim
vimdiff is a command in diff mode for Vim text editor. It's used for visually comparing and editing files within the Vim environment.
# Compare two files
$ vimdiff file1.txt file2.txt
# Compare three files
$ vimdiff file1.txt file2.txt file3.txt
# Navigation:
# ]c: next difference
# [c: previous difference
# do: diff obtain (get changes from other file)
# dp: diff put (put changes to other file)
# :diffupdate: refresh diff
# :qa: quit all windows
Practical Log Parsing Examples
Find all ERROR lines in log file
$ grep "ERROR" /var/log/application.log | tail -20
Count 404 errors in Apache access log
$ awk '$9 == 404 {print $0}' /var/log/apache2/access.log | wc -l
Extract unique IP addresses from log
$ awk '{print $1}' /var/log/nginx/access.log | sort -u
Find top 10 most frequent log entries
$ sort /var/log/syslog | uniq -c | sort -rn | head -10
Monitor real-time logs with color highlighting
$ tail -f /var/log/syslog | ccze -A
Extract timestamp and message from structured log
$ jq -r '.timestamp + " " + .message' application.json
Replace IP addresses in log file
$ sed 's/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/[REDACTED]/g' access.log
Find logs between specific timestamps
$ awk '/2025-01-16 10:00:00/,/2025-01-16 11:00:00/' application.log
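Top 10 client IPs generating 404 errors (a sketch combining awk, sort, uniq, and head; assumes the Apache combined log format, where the client IP is field 1 and the status code is field 9)
$ awk '$9 == 404 {print $1}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head -10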
Diagnostic tools for log investigation: DNS Lookup for DNS log analysis | WHOIS to investigate IP addresses | What Is My IP to verify server IP
Master Log Analysis on High-Performance VPS
Process logs faster with optimized VPS hosting. Compare VPS providers with excellent I/O performance, explore side-by-side comparisons, use our DevOps diagnostic tools, and learn more from our Linux administration guide for efficient log processing workloads.