SFTP with Go: Connections, Performance, and Search Strategies for Production

Learn to build SFTP clients in Go with connection pooling, safe transfers, efficient file search, and production-ready patterns. Includes CLI commands and best practices for remote file operations.

By Omar Flores

The Problem: SSH is Everywhere, SFTP is Poorly Understood

Imagine you are an operator managing 50 servers. You need to retrieve logs from a specific server, find files matching a pattern, and transfer them back to your system for analysis.

You could SSH into the server and use find and scp. It is slow. It is manual. It is error-prone.

Or you could write a Go program that connects to 50 servers simultaneously, searches for matching files in parallel, and transfers them all at once. Same data. Vastly different experience.

SFTP (SSH File Transfer Protocol) is the solution. It is built on SSH. It is installed everywhere. It is secure. But most developers treat it as a simple file transfer tool.

Used correctly, SFTP is a powerful abstraction for remote file operations that scales to thousands of concurrent connections.

This guide teaches you how.


Part 1: Understanding SFTP — The Foundation

SFTP is not FTP with SSH. It is a completely different protocol built on top of SSH.

Why SFTP, Not SCP?

SCP (Secure Copy):

  • Simple protocol
  • One file transfer at a time
  • Uses external SSH command
  • Good for ad-hoc file movement
  • Terrible for automation

SFTP:

  • Rich protocol with file operations
  • Multiple concurrent transfers
  • Built-in library support (github.com/pkg/sftp on top of golang.org/x/crypto/ssh)
  • Directory traversal
  • File stat/chmod/delete operations
  • Connection reuse
  • Good for production systems

How SFTP Works

Your Program
    ↓
SSH Connection (encrypted)
    ↓
SFTP Subsystem on Remote Server
    ↓
File Operations (read, write, list, delete)

The SSH connection is the transport. SFTP is the protocol running on top of it.


Part 2: Building Your First SFTP Connection

The naive approach is to open a connection, do an operation, close it. This works for one-off tasks. It is terrible for repeated operations.

The Naive Approach (Don't Do This)

import (
	"io"
	"os"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

func downloadFile(host, username, password, remotePath string) error {
	// Dial SSH
	config := &ssh.ClientConfig{
		User: username,
		Auth: []ssh.AuthMethod{
			ssh.Password(password),
		},
		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // fine for a demo; verify host keys in production
	}

	client, err := ssh.Dial("tcp", host+":22", config)
	if err != nil {
		return err
	}
	defer client.Close()

	// Open SFTP session
	session, err := sftp.NewClient(client)
	if err != nil {
		return err
	}
	defer session.Close()

	// Download file
	srcFile, err := session.Open(remotePath)
	if err != nil {
		return err
	}
	defer srcFile.Close()

	dstFile, err := os.Create("localfile.txt")
	if err != nil {
		return err
	}
	defer dstFile.Close()

	_, err = io.Copy(dstFile, srcFile)
	return err
}

This works. But it pays the full TCP dial, key exchange, and authentication cost for every operation. At scale, this is slow.


Part 3: Connection Pooling — The Right Approach

Production systems need connection reuse. Opening an SSH connection is expensive (key exchange, authentication, handshake). You want to open once, use many times.

Building a Connection Pool

// sftppool/pool.go
package sftppool

import (
	"fmt"
	"sync"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

// Connection represents a reusable SFTP connection
type Connection struct {
	ssh    *ssh.Client
	sftp   *sftp.Client
	closed bool
}

// ConnectionPool manages multiple SFTP connections
type ConnectionPool struct {
	host     string
	config   *ssh.ClientConfig
	mu       sync.RWMutex
	conns    []*Connection
	maxConns int
	inUse    int
}

// NewPool creates a new connection pool
func NewPool(host string, config *ssh.ClientConfig, maxConns int) *ConnectionPool {
	return &ConnectionPool{
		host:     host,
		config:   config,
		conns:    make([]*Connection, 0, maxConns),
		maxConns: maxConns,
	}
}

// Get retrieves or creates a connection from the pool
func (p *ConnectionPool) Get() (*Connection, error) {
	p.mu.Lock()
	defer p.mu.Unlock()

	// Try to reuse an existing connection, discarding any that were
	// closed while sitting in the pool
	for len(p.conns) > 0 {
		conn := p.conns[len(p.conns)-1]
		p.conns = p.conns[:len(p.conns)-1]
		if conn.closed {
			continue
		}
		p.inUse++
		return conn, nil
	}

	// Create new connection if under limit
	if p.inUse < p.maxConns {
		sshClient, err := ssh.Dial("tcp", p.host+":22", p.config)
		if err != nil {
			return nil, fmt.Errorf("ssh dial: %w", err)
		}

		sftpClient, err := sftp.NewClient(sshClient)
		if err != nil {
			sshClient.Close()
			return nil, fmt.Errorf("sftp new client: %w", err)
		}

		p.inUse++
		return &Connection{
			ssh:  sshClient,
			sftp: sftpClient,
		}, nil
	}

	return nil, fmt.Errorf("pool exhausted")
}

// Return returns a connection to the pool
func (p *ConnectionPool) Return(conn *Connection) {
	p.mu.Lock()
	defer p.mu.Unlock()

	if conn.closed {
		p.inUse--
		return
	}

	p.conns = append(p.conns, conn)
	p.inUse--
}

// Close closes all connections
func (p *ConnectionPool) Close() error {
	p.mu.Lock()
	defer p.mu.Unlock()

	for _, conn := range p.conns {
		conn.Close()
	}
	p.conns = nil
	return nil
}

// Close closes a single connection
func (c *Connection) Close() error {
	c.closed = true
	c.sftp.Close()
	return c.ssh.Close()
}

Now operations reuse connections:

// Usage (shown from inside the sftppool package; from outside it,
// expose an accessor or use the Do helper below)
pool := NewPool("example.com", config, 10) // Max 10 concurrent connections
defer pool.Close()

// Get connection from pool
conn, err := pool.Get()
if err != nil {
	panic(err)
}
defer pool.Return(conn)

// Use connection
file, err := conn.sftp.Open("/path/to/file.txt")
// ... do operations ...
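
Because the sftp field is unexported, callers outside the package need a small wrapper. A minimal sketch (Do is an illustrative name, not part of pkg/sftp) that borrows a connection, runs a function against it, and always returns it to the pool:

// Do borrows a connection, runs fn against its SFTP client, and
// returns the connection to the pool when fn finishes.
func (p *ConnectionPool) Do(fn func(*sftp.Client) error) error {
	conn, err := p.Get()
	if err != nil {
		return err
	}
	defer p.Return(conn)
	return fn(conn.sftp)
}

// Usage
err := pool.Do(func(c *sftp.Client) error {
	_, err := c.Stat("/path/to/file.txt")
	return err
})

The defer guarantees the connection goes back to the pool even when fn fails, which is the usual way leaks start.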

Part 4: File Search Strategies — Fast vs Deep

Searching 100,000 files on a remote server is different from searching locally.

Strategy 1: Name-Only Search (Fastest)

When you only need to match filenames, list directories and filter:

// SearchByName returns files matching a name pattern
func SearchByName(client *sftp.Client, dir, pattern string) ([]string, error) {
	var results []string

	// Walk the directory tree. pkg/sftp's Walk returns a kr/fs Walker
	// that you step through; it does not take a callback.
	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue // skip unreadable entries
		}

		// Match filename against pattern
		if matched, _ := filepath.Match(pattern, filepath.Base(walker.Path())); matched {
			results = append(results, walker.Path())
		}
	}

	return results, nil
}

// Usage
files, err := SearchByName(sftpClient, "/var/log", "*.log")

This is O(n) where n = number of files in the tree. Fast, because it only reads directory listings and file metadata, never file contents.
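
When you only need a single directory level, skip the recursive walk entirely and use ReadDir. A minimal sketch (ListMatching is an illustrative name, not a pkg/sftp API):

// ListMatching lists one directory (non-recursive) and filters by pattern.
// (imports: "path", "path/filepath", "github.com/pkg/sftp")
func ListMatching(client *sftp.Client, dir, pattern string) ([]string, error) {
	entries, err := client.ReadDir(dir)
	if err != nil {
		return nil, err
	}

	var results []string
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		// Remote paths use forward slashes, so path.Join, not filepath.Join
		if matched, _ := filepath.Match(pattern, e.Name()); matched {
			results = append(results, path.Join(dir, e.Name()))
		}
	}
	return results, nil
}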

Strategy 2: Parallel Search Across Directories

When searching multiple top-level directories, search in parallel:

// ParallelSearch searches multiple directories concurrently.
// A pkg/sftp Client is safe for concurrent use, but every request
// still shares the one underlying SSH connection.
func ParallelSearch(client *sftp.Client, dirs []string, pattern string, workers int) ([]string, error) {
	// Create work queue
	workChan := make(chan string)
	resultChan := make(chan string, 1000)
	errChan := make(chan error, len(dirs)) // one slot per directory, so sends never block

	// Start workers
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for dir := range workChan {
				if err := searchDir(client, dir, pattern, resultChan); err != nil {
					errChan <- err
				}
			}
		}()
	}

	// Send work
	go func() {
		for _, dir := range dirs {
			workChan <- dir
		}
		close(workChan)
	}()

	// Close channels once all workers finish
	go func() {
		wg.Wait()
		close(resultChan)
		close(errChan)
	}()

	// Collect results
	var results []string
	for result := range resultChan {
		results = append(results, result)
	}

	// Surface the first error, if any (nil when errChan is empty and closed)
	return results, <-errChan
}

func searchDir(client *sftp.Client, dir, pattern string, results chan<- string) error {
	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue // Skip errors, continue searching
		}
		if matched, _ := filepath.Match(pattern, filepath.Base(walker.Path())); matched {
			results <- walker.Path()
		}
	}
	return nil
}

// Usage
files, err := ParallelSearch(sftpClient, []string{"/var/log", "/home", "/opt"}, "*.txt", 4)

This searches up to 4 directories simultaneously. On latency-bound listings it approaches 4x the sequential speed, though every request still shares one SSH connection's bandwidth.

Strategy 3: Content Search (Slowest, Most Powerful)

When you need to search file contents:

// SearchContents searches file contents for a pattern
func SearchContents(client *sftp.Client, dir, filePattern, contentPattern string) ([]string, error) {
	re, err := regexp.Compile(contentPattern)
	if err != nil {
		return nil, fmt.Errorf("compile pattern: %w", err)
	}

	var results []string
	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue
		}

		// Skip directories and non-matching filenames
		if walker.Stat().IsDir() {
			continue
		}
		path := walker.Path()
		if matched, _ := filepath.Match(filePattern, filepath.Base(path)); !matched {
			continue
		}

		// Open and scan the file line by line
		file, err := client.Open(path)
		if err != nil {
			continue
		}

		// Note: bufio.Scanner rejects lines over 64 KiB by default;
		// raise the limit with scanner.Buffer for logs with long lines
		scanner := bufio.NewScanner(file)
		for scanner.Scan() {
			if re.MatchString(scanner.Text()) {
				results = append(results, path)
				break // Found a match, move to the next file
			}
		}
		file.Close()
	}

	return results, nil
}

// Usage
// Find files in *.log that contain error messages
files, err := SearchContents(sftpClient, "/var/log", "*.log", "ERROR|FAIL")

This reads every candidate file over the network. Slow for large trees. Powerful for specific searches, so narrow the filename pattern first.


Part 5: Safe File Operations — Atomicity and Error Handling

Remote file operations can fail mid-transfer. You need patterns for safety.

Atomic Uploads with Temp Files

// SafeUpload uploads a file atomically using a temp file
func SafeUpload(client *sftp.Client, localPath, remotePath string) error {
	// Stream from the local file instead of loading it all into memory
	src, err := os.Open(localPath)
	if err != nil {
		return fmt.Errorf("open local: %w", err)
	}
	defer src.Close()

	// Write to temp file first
	tempPath := remotePath + ".tmp"
	tempFile, err := client.Create(tempPath)
	if err != nil {
		return fmt.Errorf("create temp: %w", err)
	}

	if _, err := io.Copy(tempFile, src); err != nil {
		tempFile.Close()
		client.Remove(tempPath) // Cleanup on error
		return fmt.Errorf("write temp: %w", err)
	}
	if err := tempFile.Close(); err != nil {
		client.Remove(tempPath)
		return fmt.Errorf("close temp: %w", err)
	}

	// Atomic rename
	if err := client.Rename(tempPath, remotePath); err != nil {
		client.Remove(tempPath) // Cleanup on error
		return fmt.Errorf("rename: %w", err)
	}

	return nil
}

This ensures the target file is either complete or non-existent. Never partial. One caveat: many servers reject the standard SFTP rename when the target already exists; pkg/sftp's PosixRename covers that case, as sketched below.
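
A hedged sketch of the overwrite-friendly variant, using the posix-rename@openssh.com extension where the server supports it:

// Prefer POSIX rename semantics (overwrite the target atomically) when
// the server supports the posix-rename@openssh.com extension.
if err := client.PosixRename(tempPath, remotePath); err != nil {
	// Fall back to the standard SFTP rename
	if err := client.Rename(tempPath, remotePath); err != nil {
		client.Remove(tempPath)
		return fmt.Errorf("rename: %w", err)
	}
}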

Safe Downloads with Checksums

// SafeDownload downloads with checksum verification
func SafeDownload(client *sftp.Client, remotePath, localPath string) error {
	// Download to temp file
	tempPath := localPath + ".tmp"
	tempFile, err := os.Create(tempPath)
	if err != nil {
		return fmt.Errorf("create temp: %w", err)
	}

	remoteFile, err := client.Open(remotePath)
	if err != nil {
		tempFile.Close()
		os.Remove(tempPath)
		return fmt.Errorf("open remote: %w", err)
	}
	defer remoteFile.Close()

	// Copy while hashing (MD5 is fine for corruption detection;
	// use SHA-256 if you also care about tampering)
	h := md5.New()
	w := io.MultiWriter(tempFile, h)
	if _, err := io.Copy(w, remoteFile); err != nil {
		tempFile.Close()
		os.Remove(tempPath)
		return fmt.Errorf("copy: %w", err)
	}
	// Close before renaming so the rename also works on Windows
	if err := tempFile.Close(); err != nil {
		os.Remove(tempPath)
		return fmt.Errorf("close temp: %w", err)
	}

	remoteHash, err := getRemoteChecksum(client, remotePath)
	if err != nil {
		os.Remove(tempPath)
		return fmt.Errorf("remote checksum: %w", err)
	}
	localHash := fmt.Sprintf("%x", h.Sum(nil))

	if remoteHash != localHash {
		os.Remove(tempPath)
		return fmt.Errorf("checksum mismatch: %s != %s", remoteHash, localHash)
	}

	// Atomic rename
	if err := os.Rename(tempPath, localPath); err != nil {
		os.Remove(tempPath)
		return fmt.Errorf("rename: %w", err)
	}

	return nil
}

This prevents partial downloads from being used. It relies on a getRemoteChecksum helper, defined below.
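
SFTP itself has no standard checksum request, so the helper has to be supplied. A minimal sketch that re-reads the remote file over SFTP and hashes it client-side; in practice you would more often run md5sum over a separate SSH session on the same connection:

// getRemoteChecksum re-reads the remote file over SFTP and hashes it
// client-side. Sketch only: this costs a second full read of the file.
// (imports: "crypto/md5", "fmt", "io")
func getRemoteChecksum(client *sftp.Client, path string) (string, error) {
	f, err := client.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := md5.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", h.Sum(nil)), nil
}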


Part 6: CLI Tool — A Practical Example

Build a command-line tool for SFTP operations:

// sftp-tool/main.go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
	"path/filepath"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

func main() {
	cmd := flag.NewFlagSet("sftp-tool", flag.ExitOnError)
	host := cmd.String("host", "", "Remote host")
	user := cmd.String("user", "", "Username")
	operation := cmd.String("op", "", "Operation: list, search, download, upload")
	remotePath := cmd.String("remote", "", "Remote path")
	localPath := cmd.String("local", "", "Local path")
	pattern := cmd.String("pattern", "*", "Search pattern")

	cmd.Parse(os.Args[1:]) // parse the program's own arguments

	// Create SSH config
	config := &ssh.ClientConfig{
		User: *user,
		Auth: []ssh.AuthMethod{
			ssh.Password("password"), // Use key-based auth in production
		},
		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // demo only; verify host keys in production
	}

	// Connect
	sshClient, err := ssh.Dial("tcp", *host+":22", config)
	if err != nil {
		panic(err)
	}
	defer sshClient.Close()

	sftpClient, err := sftp.NewClient(sshClient)
	if err != nil {
		panic(err)
	}
	defer sftpClient.Close()

	// Execute operation
	switch *operation {
	case "list":
		listDir(sftpClient, *remotePath)
	case "search":
		searchFiles(sftpClient, *remotePath, *pattern)
	case "download":
		downloadFile(sftpClient, *remotePath, *localPath)
	case "upload":
		uploadFile(sftpClient, *localPath, *remotePath)
	default:
		fmt.Println("Unknown operation")
	}
}

func listDir(client *sftp.Client, path string) {
	files, err := client.ReadDir(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "list:", err)
		return
	}
	for _, f := range files {
		fmt.Printf("%s %d %v\n", f.Name(), f.Size(), f.ModTime())
	}
}

func searchFiles(client *sftp.Client, dir, pattern string) {
	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue
		}
		if matched, _ := filepath.Match(pattern, filepath.Base(walker.Path())); matched {
			fmt.Println(walker.Path())
		}
	}
}

func downloadFile(client *sftp.Client, remote, local string) {
	file, err := client.Open(remote)
	if err != nil {
		fmt.Fprintln(os.Stderr, "open remote:", err)
		return
	}
	defer file.Close()

	out, err := os.Create(local)
	if err != nil {
		fmt.Fprintln(os.Stderr, "create local:", err)
		return
	}
	defer out.Close()

	if _, err := io.Copy(out, file); err != nil {
		fmt.Fprintln(os.Stderr, "copy:", err)
		return
	}
	fmt.Println("Downloaded:", local)
}

func uploadFile(client *sftp.Client, local, remote string) {
	file, err := os.Open(local)
	if err != nil {
		fmt.Fprintln(os.Stderr, "open local:", err)
		return
	}
	defer file.Close()

	out, err := client.Create(remote)
	if err != nil {
		fmt.Fprintln(os.Stderr, "create remote:", err)
		return
	}
	defer out.Close()

	if _, err := io.Copy(out, file); err != nil {
		fmt.Fprintln(os.Stderr, "copy:", err)
		return
	}
	fmt.Println("Uploaded:", remote)
}

Usage:

# List directory
./sftp-tool -host example.com -user admin -op list -remote /var/log

# Search files
./sftp-tool -host example.com -user admin -op search -remote /var/log -pattern "*.txt"

# Download file
./sftp-tool -host example.com -user admin -op download -remote /var/log/app.log -local ./app.log

# Upload file
./sftp-tool -host example.com -user admin -op upload -local ./config.txt -remote /etc/config.txt
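
The hardcoded password above is for demonstration only. A minimal sketch of a production-grade config using key-based auth and known_hosts verification; the key path ~/.ssh/id_ed25519 and the secureConfig name are assumptions:

import (
	"os"
	"path/filepath"

	"golang.org/x/crypto/ssh"
	"golang.org/x/crypto/ssh/knownhosts"
)

func secureConfig(user string) (*ssh.ClientConfig, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return nil, err
	}

	// Load and parse the private key
	keyBytes, err := os.ReadFile(filepath.Join(home, ".ssh", "id_ed25519"))
	if err != nil {
		return nil, err
	}
	signer, err := ssh.ParsePrivateKey(keyBytes)
	if err != nil {
		return nil, err
	}

	// Verify the server's host key against known_hosts instead of
	// ssh.InsecureIgnoreHostKey()
	hostKeyCallback, err := knownhosts.New(filepath.Join(home, ".ssh", "known_hosts"))
	if err != nil {
		return nil, err
	}

	return &ssh.ClientConfig{
		User:            user,
		Auth:            []ssh.AuthMethod{ssh.PublicKeys(signer)},
		HostKeyCallback: hostKeyCallback,
	}, nil
}

main would then build its config via secureConfig(*user) instead of the password literal.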

Part 7: Performance Optimization — Tuning Your Code

Batch Operations

Instead of one operation at a time:

// SLOW: One download at a time
for _, file := range files {
	downloadFile(sftpClient, file, filepath.Join(localDir, filepath.Base(file)))
}

// FAST: Batch downloads with concurrency
func batchDownload(client *sftp.Client, files []string, dest string, workers int) error {
	sem := make(chan struct{}, workers) // Semaphore for concurrency limit

	var wg sync.WaitGroup
	errChan := make(chan error, len(files))

	for _, file := range files {
		wg.Add(1)
		go func(f string) {
			defer wg.Done()

			sem <- struct{}{}        // Acquire
			defer func() { <-sem }() // Release

			localName := filepath.Join(dest, filepath.Base(f))
			if err := SafeDownload(client, f, localName); err != nil {
				errChan <- err
			}
		}(file)
	}

	wg.Wait()
	close(errChan)

	for err := range errChan {
		if err != nil {
			return err
		}
	}

	return nil
}

This downloads multiple files concurrently while limiting concurrency to avoid overwhelming the connection.
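
Wiring the pieces together, reusing SearchByName from Part 4 (the ./downloads directory and worker count are assumptions):

// Usage: find matching logs, then download them 8 at a time
files, err := SearchByName(sftpClient, "/var/log", "*.log")
if err != nil {
	panic(err)
}
if err := batchDownload(sftpClient, files, "./downloads", 8); err != nil {
	panic(err)
}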

Connection Keepalive

Keep connections alive over long operations:

// After dialing, send periodic application-level pings. OpenSSH servers
// understand this request; the replies keep NAT gateways and idle-timeout
// firewalls from silently dropping the connection.
go func() {
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		if _, _, err := sshClient.SendRequest("keepalive@openssh.com", true, nil); err != nil {
			return // connection is gone; stop pinging
		}
	}
}()

Part 8: Best Practices Summary

Connection Management:

  • ✅ Use connection pooling, not one connection per operation
  • ✅ Reuse connections across multiple operations
  • ✅ Implement proper cleanup and error handling

File Operations:

  • ✅ Always use temp files for uploads (atomic writes)
  • ✅ Always verify downloads with checksums
  • ✅ Handle partial failures gracefully

Search Strategies:

  • ✅ Name-only search for speed
  • ✅ Parallel directory search for multiple trees
  • ✅ Content search only when necessary

Performance:

  • ✅ Batch operations
  • ✅ Use concurrent transfers with semaphores
  • ✅ Keep connections alive over long operations
  • ✅ Profile before optimizing

Security:

  • ✅ Use key-based authentication, not passwords
  • ✅ Verify host keys in production
  • ✅ Limit file permissions after transfer (see the Chmod sketch below)
  • ✅ Encrypt sensitive data before transfer
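
For the permissions point, pkg/sftp exposes Chmod. A one-line sketch tightening a freshly uploaded file (path reused from the upload example):

// Restrict a freshly uploaded remote file to owner read/write
if err := client.Chmod("/etc/config.txt", 0o600); err != nil {
	fmt.Println("chmod failed:", err)
}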

Part 9: The Real Cost of Bad SFTP Code

Developers often treat SFTP as a simple tool. "Open connection, transfer file, close."

At scale, this is expensive:

  • Per-operation authentication: 100ms each
  • 100 files to download: 10+ seconds in serial
  • Failed transfers: No recovery, start over
  • Partial files: Silently corrupt data

The right patterns cost a few hundred lines of code. They save hours of debugging and days of lost data.

The difference between SFTP code that works and SFTP code that scales is not complexity. It is discipline. Connection pooling. Atomic operations. Proper error handling. These are not optional. They are the difference between working code and production code.

Tags

#go #golang #sftp #file-operations #remote-access #connections #performance #cli #best-practices #security #networking #backend #devops