SFTP with Go: Connections, Performance, and Search Strategies for Production
Learn to build SFTP clients in Go with connection pooling, safe transfers, efficient file search, and production-ready patterns. Includes CLI commands and best practices for remote file operations.
The Problem: SSH is Everywhere, SFTP is Poorly Understood
Imagine you are an operator managing 50 servers. You need to retrieve logs from a specific server, find files matching a pattern, and transfer them back to your system for analysis.
You could SSH into the server and use find and scp. It is slow. It is manual. It is error-prone.
Or you could write a Go program that connects to 50 servers simultaneously, searches for matching files in parallel, and transfers them all at once. Same data. Vastly different experience.
SFTP (SSH File Transfer Protocol) is the solution. It is built on SSH. It is installed everywhere. It is secure. But most developers treat it as a simple file transfer tool.
Used correctly, SFTP is a powerful abstraction for remote file operations that scales to thousands of concurrent connections.
This guide teaches you how.
Part 1: Understanding SFTP - The Foundation
SFTP is not FTP with SSH. It is a completely different protocol built on top of SSH.
Why SFTP, Not SSH SCP?
SSH SCP (Secure Copy):
- Simple protocol
- One file transfer at a time
- Uses external SSH command
- Good for ad-hoc file movement
- Terrible for automation
SFTP:
- Rich protocol with file operations
- Multiple concurrent transfers
- First-class Go library support (github.com/pkg/sftp on top of golang.org/x/crypto/ssh)
- Directory traversal
- File stat/chmod/delete operations
- Connection reuse
- Good for production systems
How SFTP Works
Your Program
    ↓
SSH Connection (encrypted)
    ↓
SFTP Subsystem on Remote Server
    ↓
File Operations (read, write, list, delete)
The SSH connection is the transport. SFTP is the protocol running on top of it.
Part 2: Building Your First SFTP Connection
The naive approach is to open a connection, do an operation, close it. This works for one-off tasks. It is terrible for repeated operations.
The Naive Approach (Don't Do This)
import (
	"io"
	"os"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

func downloadFile(host, username, password, remotePath, localPath string) error {
	// Dial SSH
	config := &ssh.ClientConfig{
		User: username,
		Auth: []ssh.AuthMethod{
			ssh.Password(password),
		},
		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // Never do this in production
	}
	client, err := ssh.Dial("tcp", host+":22", config)
	if err != nil {
		return err
	}
	defer client.Close()

	// Open SFTP session
	session, err := sftp.NewClient(client)
	if err != nil {
		return err
	}
	defer session.Close()

	// Download file
	srcFile, err := session.Open(remotePath)
	if err != nil {
		return err
	}
	defer srcFile.Close()

	dstFile, err := os.Create(localPath)
	if err != nil {
		return err
	}
	defer dstFile.Close()

	_, err = io.Copy(dstFile, srcFile)
	return err
}
This works. But it creates a new SSH connection for every operation. At scale, this is slow.
Part 3: Connection Pooling - The Right Approach
Production systems need connection reuse. Opening an SSH connection is expensive (key exchange, authentication, handshake). You want to open once, use many times.
Building a Connection Pool
// sftppool/pool.go
package sftppool

import (
	"fmt"
	"sync"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

// Connection represents a reusable SFTP connection
type Connection struct {
	ssh    *ssh.Client
	sftp   *sftp.Client
	closed bool
}

// Client exposes the underlying *sftp.Client for file operations
func (c *Connection) Client() *sftp.Client {
	return c.sftp
}

// ConnectionPool manages multiple SFTP connections
type ConnectionPool struct {
	host     string
	config   *ssh.ClientConfig
	mu       sync.Mutex
	conns    []*Connection
	maxConns int
	inUse    int
}

// NewPool creates a new connection pool
func NewPool(host string, config *ssh.ClientConfig, maxConns int) *ConnectionPool {
	return &ConnectionPool{
		host:     host,
		config:   config,
		conns:    make([]*Connection, 0, maxConns),
		maxConns: maxConns,
	}
}

// Get retrieves an idle connection from the pool or creates a new one
func (p *ConnectionPool) Get() (*Connection, error) {
	p.mu.Lock()
	defer p.mu.Unlock()

	// Try to reuse an existing idle connection, discarding any that were closed
	for len(p.conns) > 0 {
		conn := p.conns[len(p.conns)-1]
		p.conns = p.conns[:len(p.conns)-1]
		if !conn.closed {
			p.inUse++
			return conn, nil
		}
	}

	// Create a new connection if under the limit
	if p.inUse < p.maxConns {
		sshClient, err := ssh.Dial("tcp", p.host+":22", p.config)
		if err != nil {
			return nil, fmt.Errorf("ssh dial: %w", err)
		}
		sftpClient, err := sftp.NewClient(sshClient)
		if err != nil {
			sshClient.Close()
			return nil, fmt.Errorf("sftp new client: %w", err)
		}
		p.inUse++
		return &Connection{
			ssh:  sshClient,
			sftp: sftpClient,
		}, nil
	}
	return nil, fmt.Errorf("pool exhausted")
}

// Return puts a connection back in the pool
func (p *ConnectionPool) Return(conn *Connection) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.inUse--
	if conn.closed {
		return
	}
	p.conns = append(p.conns, conn)
}

// Close closes all idle connections in the pool
func (p *ConnectionPool) Close() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, conn := range p.conns {
		conn.Close()
	}
	p.conns = nil
	return nil
}

// Close closes a single connection
func (c *Connection) Close() error {
	if c.closed {
		return nil
	}
	c.closed = true
	c.sftp.Close()
	return c.ssh.Close()
}
Now operations reuse connections:
// Usage
pool := sftppool.NewPool("example.com", config, 10) // Max 10 concurrent connections
defer pool.Close()

// Get a connection from the pool
conn, err := pool.Get()
if err != nil {
	panic(err)
}
defer pool.Return(conn)

// Use the connection
file, err := conn.Client().Open("/path/to/file.txt")
// ... do operations ...
Part 4: File Search Strategies - Fast vs Deep
Searching 100,000 files on a remote server is different from searching locally.
Strategy 1: Name-Only Search (Fastest)
When you only need to match filenames, list directories and filter:
// SearchByName returns files matching a name pattern.
// Note: pkg/sftp's Walk returns a *fs.Walker, not a callback-style API.
func SearchByName(client *sftp.Client, dir, pattern string) ([]string, error) {
	var results []string

	// Walk the directory tree
	walker := client.Walk(dir)
	for walker.Step() {
		if err := walker.Err(); err != nil {
			return results, err
		}
		// Match filename against pattern
		if matched, _ := filepath.Match(pattern, filepath.Base(walker.Path())); matched {
			results = append(results, walker.Path())
		}
	}
	return results, nil
}

// Usage
files, err := SearchByName(sftpClient, "/var/log", "*.log")
This is O(n), where n is the number of files in the tree. It is fast because only directory listings and file metadata cross the wire; no file contents are read.
Strategy 2: Parallel Search Across Directories
When searching multiple top-level directories, search in parallel:
// ParallelSearch searches multiple directories concurrently
func ParallelSearch(client *sftp.Client, dirs []string, pattern string, workers int) ([]string, error) {
	workChan := make(chan string)
	resultChan := make(chan string, 1000)
	errChan := make(chan error, len(dirs)) // One slot per directory so workers never block

	// Start workers
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for dir := range workChan {
				if err := searchDir(client, dir, pattern, resultChan); err != nil {
					errChan <- err
				}
			}
		}()
	}

	// Send work
	go func() {
		for _, dir := range dirs {
			workChan <- dir
		}
		close(workChan)
	}()

	// Close the channels once all workers finish
	go func() {
		wg.Wait()
		close(resultChan)
		close(errChan)
	}()

	// Collect results
	var results []string
	for result := range resultChan {
		results = append(results, result)
	}

	// Report the first error, if any (receiving from the closed, empty channel yields nil)
	if err := <-errChan; err != nil {
		return results, err
	}
	return results, nil
}

func searchDir(client *sftp.Client, dir, pattern string, results chan<- string) error {
	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue // Skip unreadable entries, keep searching
		}
		if matched, _ := filepath.Match(pattern, filepath.Base(walker.Path())); matched {
			results <- walker.Path()
		}
	}
	return nil
}

// Usage
files, err := ParallelSearch(sftpClient, []string{"/var/log", "/home", "/opt"}, "*.txt", 4)
This walks four directory trees simultaneously. Expect a solid speedup over sequential search, though short of a full 4x: every worker still shares the one underlying SSH connection.
Strategy 3: Content Search (Slowest, Most Powerful)
When you need to search file contents:
// SearchContents searches file contents for a pattern
func SearchContents(client *sftp.Client, dir, filePattern, contentPattern string) ([]string, error) {
	var results []string
	re, err := regexp.Compile(contentPattern)
	if err != nil {
		return nil, err
	}

	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue // Skip unreadable entries, keep searching
		}
		// Skip directories and non-matching filenames
		if walker.Stat().IsDir() {
			continue
		}
		path := walker.Path()
		if matched, _ := filepath.Match(filePattern, filepath.Base(path)); !matched {
			continue
		}
		// Open and scan the file line by line
		file, err := client.Open(path)
		if err != nil {
			continue
		}
		scanner := bufio.NewScanner(file)
		for scanner.Scan() {
			if re.MatchString(scanner.Text()) {
				results = append(results, path)
				break // Found a match, move to the next file
			}
		}
		file.Close() // Close explicitly; defer would pile up inside the loop
	}
	return results, nil
}

// Usage
// Find *.log files that contain error messages
files, err := SearchContents(sftpClient, "/var/log", "*.log", "ERROR|FAIL")
This reads file contents. Slow for large files. Powerful for specific searches.
Part 5: Safe File Operations - Atomicity and Error Handling
Remote file operations can fail mid-transfer. You need patterns for safety.
Atomic Uploads with Temp Files
// SafeUpload uploads a file atomically using a temp file
func SafeUpload(client *sftp.Client, localPath, remotePath string) error {
	// Stream from the local file rather than loading it all into memory
	src, err := os.Open(localPath)
	if err != nil {
		return fmt.Errorf("open local: %w", err)
	}
	defer src.Close()

	// Write to a temp file first
	tempPath := remotePath + ".tmp"
	tempFile, err := client.Create(tempPath)
	if err != nil {
		return fmt.Errorf("create temp: %w", err)
	}
	_, err = io.Copy(tempFile, src)
	tempFile.Close()
	if err != nil {
		client.Remove(tempPath) // Clean up on error
		return fmt.Errorf("write temp: %w", err)
	}

	// Atomic rename. Note: many SFTP servers refuse to rename over an existing
	// file; remove the target first (or use PosixRename) if it may already exist.
	if err := client.Rename(tempPath, remotePath); err != nil {
		client.Remove(tempPath) // Clean up on error
		return fmt.Errorf("rename: %w", err)
	}
	return nil
}
This ensures the target file is either complete or non-existent. Never partial.
Safe Downloads with Checksums
// SafeDownload downloads with checksum verification
func SafeDownload(client *sftp.Client, remotePath, localPath string) error {
	// Download to a temp file
	tempPath := localPath + ".tmp"
	tempFile, err := os.Create(tempPath)
	if err != nil {
		return fmt.Errorf("create temp: %w", err)
	}
	defer tempFile.Close()

	remoteFile, err := client.Open(remotePath)
	if err != nil {
		os.Remove(tempPath)
		return fmt.Errorf("open remote: %w", err)
	}
	defer remoteFile.Close()

	// Copy while hashing
	h := md5.New()
	w := io.MultiWriter(tempFile, h)
	if _, err = io.Copy(w, remoteFile); err != nil {
		os.Remove(tempPath)
		return fmt.Errorf("copy: %w", err)
	}

	// SFTP itself has no checksum operation; getRemoteChecksum is a helper
	// (not shown) that must compute the hash server-side, e.g. by running
	// md5sum over a separate SSH session.
	remoteHash := getRemoteChecksum(client, remotePath)
	localHash := fmt.Sprintf("%x", h.Sum(nil))
	if remoteHash != localHash {
		os.Remove(tempPath)
		return fmt.Errorf("checksum mismatch: %s != %s", remoteHash, localHash)
	}

	// Atomic rename; close the temp file first so this also works on Windows
	tempFile.Close()
	if err := os.Rename(tempPath, localPath); err != nil {
		os.Remove(tempPath)
		return fmt.Errorf("rename: %w", err)
	}
	return nil
}
This prevents partial downloads from being used.
Part 6: CLI Tool - A Practical Example
Build a command-line tool for SFTP operations:
// sftp-tool/main.go
package main

import (
	"flag"
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

func main() {
	cmd := flag.NewFlagSet("sftp-tool", flag.ExitOnError)
	host := cmd.String("host", "", "Remote host")
	user := cmd.String("user", "", "Username")
	operation := cmd.String("op", "", "Operation: list, search, download, upload")
	remotePath := cmd.String("remote", "", "Remote path")
	localPath := cmd.String("local", "", "Local path")
	pattern := cmd.String("pattern", "*", "Search pattern")
	cmd.Parse(os.Args[1:])

	// Create SSH config
	config := &ssh.ClientConfig{
		User: *user,
		Auth: []ssh.AuthMethod{
			ssh.Password("password"), // Use key-based auth in production
		},
		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // Verify host keys in production
	}

	// Connect
	sshClient, err := ssh.Dial("tcp", *host+":22", config)
	if err != nil {
		log.Fatal(err)
	}
	defer sshClient.Close()

	sftpClient, err := sftp.NewClient(sshClient)
	if err != nil {
		log.Fatal(err)
	}
	defer sftpClient.Close()

	// Execute operation
	switch *operation {
	case "list":
		listDir(sftpClient, *remotePath)
	case "search":
		searchFiles(sftpClient, *remotePath, *pattern)
	case "download":
		downloadFile(sftpClient, *remotePath, *localPath)
	case "upload":
		uploadFile(sftpClient, *localPath, *remotePath)
	default:
		fmt.Println("Unknown operation")
	}
}

func listDir(client *sftp.Client, path string) {
	files, err := client.ReadDir(path)
	if err != nil {
		log.Fatal(err)
	}
	for _, f := range files {
		fmt.Printf("%s %d %v\n", f.Name(), f.Size(), f.ModTime())
	}
}

func searchFiles(client *sftp.Client, dir, pattern string) {
	walker := client.Walk(dir)
	for walker.Step() {
		if walker.Err() != nil {
			continue
		}
		if matched, _ := filepath.Match(pattern, filepath.Base(walker.Path())); matched {
			fmt.Println(walker.Path())
		}
	}
}

func downloadFile(client *sftp.Client, remote, local string) {
	file, err := client.Open(remote)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	out, err := os.Create(local)
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	if _, err := io.Copy(out, file); err != nil {
		log.Fatal(err)
	}
	fmt.Println("Downloaded:", local)
}

func uploadFile(client *sftp.Client, local, remote string) {
	file, err := os.Open(local)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	out, err := client.Create(remote)
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	if _, err := io.Copy(out, file); err != nil {
		log.Fatal(err)
	}
	fmt.Println("Uploaded:", remote)
}
Usage:
# List directory
./sftp-tool -host example.com -user admin -op list -remote /var/log
# Search files
./sftp-tool -host example.com -user admin -op search -remote /var/log -pattern "*.txt"
# Download file
./sftp-tool -host example.com -user admin -op download -remote /var/log/app.log -local ./app.log
# Upload file
./sftp-tool -host example.com -user admin -op upload -local ./config.txt -remote /etc/config.txt
Part 7: Performance Optimization - Tuning Your Code
Batch Operations
Instead of one operation at a time:
// SLOW: One download at a time
for _, file := range files {
	downloadFile(sftpClient, file, localDir)
}

// FAST: Batch downloads with bounded concurrency
func batchDownload(client *sftp.Client, files []string, dest string, workers int) error {
	sem := make(chan struct{}, workers) // Semaphore limiting concurrency
	var wg sync.WaitGroup
	errChan := make(chan error, len(files))

	for _, file := range files {
		wg.Add(1)
		go func(f string) {
			defer wg.Done()
			sem <- struct{}{}        // Acquire
			defer func() { <-sem }() // Release

			localName := filepath.Join(dest, filepath.Base(f))
			if err := SafeDownload(client, f, localName); err != nil {
				errChan <- err
			}
		}(file)
	}

	wg.Wait()
	close(errChan)

	// Return the first error, if any
	for err := range errChan {
		if err != nil {
			return err
		}
	}
	return nil
}
This downloads multiple files concurrently while limiting concurrency to avoid overwhelming the connection.
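The semaphore pattern is independent of SFTP and easy to verify locally. A runnable sketch that caps concurrency at 3 and records the peak concurrency actually observed:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runBounded launches tasks goroutines but allows at most workers of them
// to be "in flight" at once; it returns the peak concurrency observed.
func runBounded(tasks, workers int) int64 {
	sem := make(chan struct{}, workers) // Semaphore limiting concurrency
	var wg sync.WaitGroup
	var inFlight, maxSeen int64

	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // Acquire
			defer func() { <-sem }() // Release

			// Record the highest concurrency seen so far
			n := atomic.AddInt64(&inFlight, 1)
			for {
				m := atomic.LoadInt64(&maxSeen)
				if n <= m || atomic.CompareAndSwapInt64(&maxSeen, m, n) {
					break
				}
			}
			atomic.AddInt64(&inFlight, -1)
		}()
	}
	wg.Wait()
	return maxSeen
}

func main() {
	peak := runBounded(20, 3)
	fmt.Println(peak <= 3) // true: the semaphore caps concurrency at 3
}
```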
Connection Keepalive
Keep connections alive over long operations:
// Keep the connection alive with periodic pings
go func() {
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		// OpenSSH-style keepalive request; stop once the connection is gone
		if _, _, err := sshClient.SendRequest("keepalive@openssh.com", true, nil); err != nil {
			return
		}
	}
}()
Part 8: Best Practices Summary
Connection Management:
- Use connection pooling, not one connection per operation
- Reuse connections across multiple operations
- Implement proper cleanup and error handling
File Operations:
- Always use temp files for uploads (atomic writes)
- Always verify downloads with checksums
- Handle partial failures gracefully
Search Strategies:
- Name-only search for speed
- Parallel directory search for multiple trees
- Content search only when necessary
Performance:
- Batch operations
- Use concurrent transfers with semaphores
- Keep connections alive over long operations
- Profile before optimizing
Security:
- Use key-based authentication, not passwords
- Verify host keys in production
- Limit file permissions after transfer
- Encrypt sensitive data before transfer
Part 9: The Real Cost of Bad SFTP Code
Developers often treat SFTP as a simple tool: "Open connection, transfer file, close."
At scale, this is expensive:
- Per-operation authentication: 100ms each
- 100 files to download: 10+ seconds in serial
- Failed transfers: No recovery, start over
- Partial files: Silently corrupt data
The right patterns cost a few hundred lines of code. They save hours of debugging and days of lost data.
The difference between SFTP code that works and SFTP code that scales is not complexity. It is discipline. Connection pooling. Atomic operations. Proper error handling. These are not optional. They are the difference between working code and production code.