Jack P. — Frontend & Mobile Developer

I build things on the internet.

Remote Logs Searcher via SFTP

January 12, 2026

Developed a Python-based remote server log searcher to save hundreds of hours

Our servers generated multiple compressed logs every day. Some days there would be one log file, other days there could be five or six depending on activity. Over time this turned into thousands of compressed .gz log files covering months or years.

Whenever something suspicious happened (a bug, an exploit attempt, etc.), the only way to investigate it was to search through those logs.


Originally, the process looked like this:

  • Download tens or even hundreds of compressed log files from the server
  • Decompress them locally
  • Run grep searches across the extracted logs
  • Manually scan the results and piece together timelines

This worked, but it was slow and repetitive 😔


    The simple tool I made to automate this

    The tool connects to the server over multiple simultaneous SFTP connections and scans logs across a configurable time range.

    Instead of downloading and searching files sequentially, the script:

  • lists the server’s log directory
  • filters files by a time threshold (for example, you’d input “30”, “90”, or “365” days)
  • downloads log files concurrently
  • decompresses .gz files automatically
  • scans each line for a target string
  • aggregates the results into a structured output

    Because downloading, decompression, and searching all run in parallel, the tool can process hundreds of log files very quickly. An investigation that would once have taken several hours now takes minutes.
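    The listing-and-filtering steps above can be sketched as a small pure function (a minimal illustration: `filter_recent` is a hypothetical name, and the `(filename, mtime)` tuples mirror the `.filename` / `.st_mtime` fields that an SFTP directory listing, e.g. paramiko's `listdir_attr()`, provides):

```python
import time

SECONDS_PER_DAY = 86400

def filter_recent(entries, max_age_days, now=None):
    """Keep (filename, unix_mtime) pairs modified within the threshold.

    The tuples mirror the .filename / .st_mtime fields returned by an
    SFTP directory listing (e.g. paramiko's listdir_attr()).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return [name for name, mtime in entries if mtime >= cutoff]

# Example: with "30" days as the threshold, only the recent file survives.
entries = [("old.log.gz", 100), ("new.log.gz", 90 * SECONDS_PER_DAY + 100)]
print(filter_recent(entries, 30, now=100 * SECONDS_PER_DAY))  # ['new.log.gz']
```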


    How the log processing works 😮

    Each worker thread downloads a log file, decompresses it if needed (the latest log file on the remote server is not compressed), and searches each line for the target string.

    By spreading the work across multiple workers, the Python script can process a very large number of log files in parallel, which sharply reduces the time needed to search large volumes of logs.
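    A worker in this pattern might look roughly like the following sketch (the queue layout and the `search_blob` helper are my own illustration, assuming each queued item is a filename plus the file's raw bytes; the real script's internals may differ):

```python
import gzip
import queue

# Each queued item is (filename, raw bytes) fetched over SFTP.
file_queue: queue.Queue = queue.Queue()
results: dict = {}  # filename -> list of matching lines

def search_blob(filename, blob, needle):
    """Decompress .gz payloads if needed, then collect matching lines."""
    if filename.endswith(".gz"):
        blob = gzip.decompress(blob)
    text = blob.decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if needle in line]

def worker(needle):
    """Drain the queue, recording any files that contain the target string."""
    while True:
        try:
            filename, blob = file_queue.get_nowait()
        except queue.Empty:
            return
        matches = search_blob(filename, blob, needle)
        if matches:
            results[filename] = matches
        file_queue.task_done()  # lets file_queue.join() unblock when done
```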

    Python
    import threading

    num_workers = 10
    threads = []
    for _ in range(num_workers):
        t = threading.Thread(target=worker)
        t.start()
        threads.append(t)
    # Block until every queued log file has been processed, then join workers.
    file_queue.join()
    for t in threads:
        t.join()

    Quick Note:

    FileZilla also uses multiple simultaneous SFTP connections.


    Visualizing the Search Results

    Instead of printing results in the terminal, the script generates a local HTML page that displays matches grouped by log file.

    The Python script injects the results as JSON into an HTML template.

    Python
    import json
    import os
    import webbrowser

    # Serialize the grouped matches and splice them into the template.
    results_json = json.dumps(sorted_results, indent=4, ensure_ascii=False)
    with open(template_path, 'r', encoding='utf-8') as file:
        html_template_content = file.read()
    html_template_content = html_template_content.replace(
        '>PLACEHOLDER_DATA',
        f'>{results_json}'
    )
    with open(output_path, 'w', encoding='utf-8') as file:
        file.write(html_template_content)
    # Open the generated report in the default browser.
    webbrowser.open('file://' + os.path.realpath(output_path))

    Example of the Visual Output Template

    The HTML visualizer renders the results and allows quick inspection of matches.

    HTML
    <h1>RESULTS</h1>
    <div id="utc-time"></div>
    <div id="input-section" class="input-section">
      <textarea id="input">PLACEHOLDER_DATA</textarea>
      <button onclick="visualizeLogs()">Visualize</button>
    </div>
    <div id="results"></div>
    This is a simplified version of mine, for demonstration purposes.

    The results are then rendered dynamically with JavaScript:

    JavaScript
    // The results container must be looked up before matches are appended.
    const resultsDiv = document.getElementById('results')
    Object.entries(data).forEach(([filename, lines]) => {
        const fileDiv = document.createElement('div')
        fileDiv.className = 'file-result'
        const filenameDiv = document.createElement('div')
        filenameDiv.className = 'file-name'
        filenameDiv.textContent = filename
        const linesDiv = document.createElement('div')
        linesDiv.className = 'log-lines'
        lines.forEach(line => {
            const lineDiv = document.createElement('div')
            lineDiv.className = 'log-line'
            lineDiv.textContent = line
            linesDiv.appendChild(lineDiv)
        })
        fileDiv.appendChild(filenameDiv)
        fileDiv.appendChild(linesDiv)
        resultsDiv.appendChild(fileDiv)
    })

    This produced a structured interface where log matches were grouped by file and displayed in chronological order.
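    Since date-stamped log filenames sort chronologically as plain strings, the chronological grouping can come from simply sorting the result keys before serializing. A sketch under that filename assumption (`sort_results` is a hypothetical helper, not necessarily how the script does it):

```python
def sort_results(results):
    """Order matched files by name; date-stamped names sort chronologically."""
    return {name: results[name] for name in sorted(results)}

grouped = {"app-2026-01-11.log.gz": ["error B"], "app-2026-01-10.log.gz": ["error A"]}
print(list(sort_results(grouped)))
# ['app-2026-01-10.log.gz', 'app-2026-01-11.log.gz']
```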


    Conclusion

    I originally built the tool for my own debugging and investigation work.

    Over time I shared it with trusted members of the volunteer team managing the servers. Instead of manually downloading and searching logs, they could simply run the script, specify a time range and search string, and immediately see all relevant log entries.

    This made it much easier to investigate incidents, trace user activity, and debug unexpected server behaviour.


    Takeaways

  • One thing I have learnt running online projects is that repetitive operational tasks are almost always worth automating.
  • Even small internal tools can eliminate hours of manual work and make everyday operations dramatically faster.
  • This project was a good reminder that sometimes the most useful software isn’t a product - it’s the internal tools that make running a system easier.