Jack P. — Frontend & Mobile Developer

I build things on the internet.

Remote Logs Searcher via SFTP

January 12, 2026

Developed a Python-based remote server log searcher to save hundreds of hours

Our servers generated multiple compressed logs every day. Some days there would be one log file, other days there could be five or six depending on activity. Over time this turned into thousands of compressed .gz log files covering months or years.

Whenever something suspicious happened (a bug, an exploit attempt, etc.), the only way to investigate it was to search through those logs.


Originally, the process looked like this:

  1. Download tens, or even hundreds, of compressed log files from the server
  2. Decompress them locally
  3. Run grep searches across the extracted logs
  4. Manually scan the results and piece together timelines

This worked, but it was slow and repetitive 😔


The simple tool I made to automate this

The tool connects to the server via multiple simultaneous SFTP connections and scans logs across a configurable time range.

Instead of downloading and searching files sequentially, the script:

  • lists the server’s log directory
  • filters files based on a time threshold (for example you’d input “30”, “90”, or “365” days)
  • downloads log files concurrently
  • decompresses .gz files automatically
  • scans each line for a target string
  • aggregates the results into a structured output
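The time-threshold filter in the list above can be sketched in a few lines. This is a minimal illustration, not the actual script: the `filter_recent` name, the `(filename, mtime)` tuple shape, and the sample filenames are all assumptions. If the SFTP library happens to be paramiko (also an assumption), `SFTPClient.listdir_attr()` returns entries with `filename` and `st_mtime` attributes that could feed it.

```python
from datetime import datetime, timedelta, timezone


def filter_recent(entries, days, now=None):
    """Return the names of files modified within the last `days` days.

    `entries` is an iterable of (filename, mtime) pairs, where mtime is a
    Unix timestamp in seconds. This shape is assumed for illustration.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).timestamp()
    return [name for name, mtime in entries if mtime >= cutoff]


# Hypothetical directory listing: (filename, last-modified timestamp)
now = datetime(2026, 1, 12, tzinfo=timezone.utc)
listing = [
    ("2026-01-10-1.log.gz", datetime(2026, 1, 10, tzinfo=timezone.utc).timestamp()),
    ("2025-06-01-1.log.gz", datetime(2025, 6, 1, tzinfo=timezone.utc).timestamp()),
    ("latest.log", now.timestamp()),
]

recent_30 = filter_recent(listing, days=30, now=now)
recent_365 = filter_recent(listing, days=365, now=now)
```

Filtering on the directory listing first means files outside the window are never downloaded at all.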

The tool can process hundreds of log files very quickly because downloading, decompression, and searching all run in parallel. An investigation that would have taken multiple hours now takes minutes.


How the log processing works 😮

Each worker thread downloads a log file, decompresses it if needed (the latest log file on the remote server is not compressed), and searches each line for the target string.
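The per-file step might look roughly like the sketch below. It assumes the file has already been downloaded into memory as bytes, and the `search_log` name and the sample log lines are made up for illustration.

```python
import gzip


def search_log(filename, raw_bytes, target):
    """Return the lines of one log file that contain `target`."""
    # Only .gz files need decompressing; the newest log is plain text.
    if filename.endswith(".gz"):
        raw_bytes = gzip.decompress(raw_bytes)
    # errors="replace" keeps the scan going if a log contains odd bytes.
    text = raw_bytes.decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if target in line]


# Made-up log content, used for both the plain and the compressed case
sample = b"[12:00:01] user alice logged in\n[12:00:05] user bob ran /give\n"

matches_plain = search_log("latest.log", sample, "/give")
matches_gz = search_log("2026-01-10-1.log.gz", gzip.compress(sample), "/give")
```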

The Python script processes many log files in parallel by using multiple workers, which cuts the time required to search large volumes of logs.

Python
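num_workers = 10
threads = []
# Start the worker threads; each one pulls files from file_queue
for _ in range(num_workers):
    t = threading.Thread(target=worker)
    t.start()
    threads.append(t)
# Block until every queued file has been marked as done
file_queue.join()
# Then wait for the worker threads themselves to exit
for t in threads:
    t.join()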
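The snippet above relies on a `worker` function and a `file_queue` that are not shown. One way they could fit together is sketched below, under stated assumptions: `process_file` is a hypothetical stand-in for the download, decompress, and search step, and the queue is filled before any thread starts, so a worker can safely exit as soon as the queue is empty.

```python
import queue
import threading

file_queue = queue.Queue()
results = {}
results_lock = threading.Lock()


def process_file(filename):
    """Hypothetical stand-in for download + decompress + search."""
    return [f"match in {filename}"]


def worker():
    while True:
        try:
            filename = file_queue.get_nowait()
        except queue.Empty:
            return  # queue drained, let the thread exit
        try:
            matches = process_file(filename)
            if matches:
                # The dict is shared between threads, so guard the write.
                with results_lock:
                    results[filename] = matches
        finally:
            # Every get() needs a task_done() or file_queue.join() hangs.
            file_queue.task_done()


# Fill the queue completely before starting any worker
for name in ["a.log.gz", "b.log.gz", "latest.log"]:
    file_queue.put(name)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
file_queue.join()
for t in threads:
    t.join()
```

In a real run each worker would presumably open its own SFTP connection rather than share one, which is what makes the simultaneous connections useful.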

Quick Note:

Simultaneous SFTP connections are also used by FileZilla.


Visualizing the Search Results

Instead of printing results in the terminal, the script generates a local HTML page that displays matches grouped by log file.

The Python script injects the results as JSON into an HTML template.

Python
import json
import os
import webbrowser

# Serialize the grouped matches as JSON
results_json = json.dumps(sorted_results, indent=4, ensure_ascii=False)
with open(template_path, 'r', encoding='utf-8') as file:
    html_template_content = file.read()
# Swap the placeholder inside the template for the JSON payload
html_template_content = html_template_content.replace(
    '>PLACEHOLDER_DATA',
    f'>{results_json}'
)
with open(output_path, 'w', encoding='utf-8') as file:
    file.write(html_template_content)
# Open the generated page in the default browser
webbrowser.open('file://' + os.path.realpath(output_path))
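One caveat with injecting raw JSON into a `<textarea>`: a log line that happens to contain `</textarea>` would close the element early. A possible hardening, sketched below, is to HTML-escape the JSON first. Browsers decode character references inside a textarea, so the page's JavaScript still reads the original JSON from its value. The `inject_results` helper and the sample log line are illustrative assumptions, not part of the original script.

```python
import html
import json


def inject_results(template, results):
    """Return `template` with the placeholder replaced by escaped JSON."""
    results_json = json.dumps(results, indent=4, ensure_ascii=False)
    # Escape &, < and > so a log line cannot close the <textarea> early.
    safe_json = html.escape(results_json, quote=False)
    return template.replace(">PLACEHOLDER_DATA", f">{safe_json}")


template = '<textarea id="input">PLACEHOLDER_DATA</textarea>'
# Made-up log line containing markup that would otherwise break the page
results = {"latest.log": ["GET /?q=</textarea><script>alert(1)</script>"]}

page = inject_results(template, results)
```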

Example of the Visual Output Template

The HTML visualizer renders the results and allows quick inspection of matches.

Html
<h1>RESULTS</h1>
<div id="utc-time"></div>
<div id="input-section" class="input-section">
    <textarea id="input">PLACEHOLDER_DATA</textarea>
    <button onclick="visualizeLogs()">Visualize</button>
</div>
<div id="results"></div>

This is a simplified version of mine, for demonstration purposes.

The results are rendered dynamically with JavaScript →

Javascript
Object.entries(data).forEach(([filename, lines]) => {
    const fileDiv = document.createElement('div')
    fileDiv.className = 'file-result'
    const filenameDiv = document.createElement('div')
    filenameDiv.className = 'file-name'
    filenameDiv.textContent = filename
    const linesDiv = document.createElement('div')
    linesDiv.className = 'log-lines'
    lines.forEach(line => {
        const lineDiv = document.createElement('div')
        lineDiv.className = 'log-line'
        lineDiv.textContent = line
        linesDiv.appendChild(lineDiv)
    })
    fileDiv.appendChild(filenameDiv)
    fileDiv.appendChild(linesDiv)
    resultsDiv.appendChild(fileDiv)
})

This produced a structured interface where log matches were grouped by file and displayed in chronological order.
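The chronological order depends on how `sorted_results` is built on the Python side, which is not shown here. A minimal sketch, assuming date-prefixed filenames with a per-day counter (a guess at the naming scheme): a plain string sort would place `-10` before `-2`, so the key below compares the numeric chunks as integers.

```python
import re


def natural_key(filename):
    """Split a name into text and integer chunks so '10' sorts after '2'."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so chunks at the same position always have the same type.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", filename)]


# Hypothetical search results keyed by log filename
results = {
    "2026-01-10-10.log.gz": ["match C"],
    "latest.log": ["match E"],
    "2026-01-11-1.log.gz": ["match D"],
    "2026-01-10-2.log.gz": ["match B"],
}

sorted_results = dict(
    sorted(results.items(), key=lambda item: natural_key(item[0]))
)
```

Python dicts preserve insertion order and `json.dumps` keeps it, so `Object.entries` in the browser sees the files in the same sequence.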


Conclusion

I originally built the tool for my own debugging and investigation work.

Over time I shared it with trusted members of the volunteer team managing the servers. Instead of manually downloading and searching logs, they could simply run the script, specify a time range and search string, and immediately see all relevant log entries.

This made it much easier to investigate incidents, trace user activity, and debug unexpected server behaviour.


Takeaways

  • One thing I have learnt running online projects is that repetitive operational tasks are almost always worth automating.
  • Even small internal tools can eliminate hours of manual work and make the remaining work dramatically faster.
  • This project was a good reminder that sometimes the most useful software isn’t a product - it’s the internal tools that make running a system easier.