aggie.cc
← Back to posts

Finding the Shortest Pipeline Route Through Multiple Well Pads — The Math Behind It

How to turn a set of GPS coordinates from well pads into an optimal pipeline network using Steiner trees, minimum spanning trees, and computational geometry.

You have a set of well pad coordinates. You need to connect them with pipelines using the least total length of pipe. This is not a straight-line problem — it’s a classic optimization challenge that shows up in every midstream engineering project.

The Problem

Given nn points (well pads, compressor stations, tank batteries) with known GPS coordinates:

P={(x1,y1),(x2,y2),,(xn,yn)}P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}

Find the network of pipes that connects all points with minimum total length.

Why Not Just Connect the Dots?

The naive approach — connect every pad to its nearest neighbor — doesn’t work. It creates loops, misses points, or produces suboptimal routes.

There are three levels of sophistication:


Level 1: Minimum Spanning Tree (MST)

The simplest correct solution. A minimum spanning tree connects all nn points using exactly n1n - 1 edges (pipe segments), with the smallest possible total length. No point is left unconnected, and there are no loops.

The Algorithm

For any two points Pi=(xi,yi)P_i = (x_i, y_i) and Pj=(xj,yj)P_j = (x_j, y_j), the Euclidean distance is:

d(Pi,Pj)=(xixj)2+(yiyj)2d(P_i, P_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}

Kruskal’s Algorithm:

  1. Compute all (n2)=n(n1)2\binom{n}{2} = \frac{n(n-1)}{2} pairwise distances
  2. Sort all edges by distance (shortest first)
  3. Greedily add edges, skipping any that would create a cycle
  4. Stop when you have n1n - 1 edges

Prim’s Algorithm (alternative):

  1. Start from any point
  2. At each step, add the shortest edge connecting a visited point to an unvisited point
  3. Repeat until all points are connected

Both run in O(n2logn)O(n^2 \log n) time — fast enough for any realistic number of well pads.

Python Implementation

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

# Well pad coordinates (lon, lat converted to meters via UTM projection)
coords = np.array([
    [345200, 3542100],  # Pad A
    [346800, 3543200],  # Pad B
    [344900, 3544500],  # Pad C
    [347500, 3541800],  # Pad D
    [346100, 3545000],  # Pad E
])

# Compute pairwise distance matrix
dist_matrix = squareform(pdist(coords, metric='euclidean'))

# Find MST
mst = minimum_spanning_tree(dist_matrix)
total_length = mst.sum()

print(f"Total pipeline length (MST): {total_length:.0f} meters")

# Extract edges
edges = list(zip(*mst.nonzero()))
for i, j in edges:
    print(f"  Pad {i} → Pad {j}: {mst[i,j]:.0f} m")

MST Cost

The total cost is:

CMST=(i,j)EMSTd(Pi,Pj)C_{\text{MST}} = \sum_{(i,j) \in E_{\text{MST}}} d(P_i, P_j)

where EMSTE_{\text{MST}} is the set of edges in the minimum spanning tree.


Level 2: Euclidean Steiner Tree

Here’s the key insight: you’re allowed to add new junction points that aren’t well pads. These extra points — called Steiner points — can reduce total pipe length by up to 13.4%.

The Idea

Consider three well pads forming a triangle. The MST connects them with 2 edges. But if you add a junction point in the middle where three pipes meet at 120°120° angles, the total length is shorter.

For three points, the optimal Steiner point is the Fermat point — the point that minimizes the sum of distances to all three vertices:

S=argminSi=13d(S,Pi)S^* = \arg\min_{S} \sum_{i=1}^{3} d(S, P_i)

At the Fermat point, the three line segments meet at exactly 120°120°:

PiSPj=120°for all pairs ij\angle P_i S^* P_j = 120° \quad \text{for all pairs } i \neq j

The Steiner Ratio

The Steiner tree is always at least as good as the MST. The worst-case improvement ratio is:

LSteinerLMST320.866\frac{L_{\text{Steiner}}}{L_{\text{MST}}} \geq \frac{\sqrt{3}}{2} \approx 0.866

This means a Steiner tree can save up to 13.4% of pipe compared to the MST. In a 50Mmidstreamproject,thats50M midstream project, that's $6.7M$ in materials alone.

Why It’s Hard

The Euclidean Steiner Tree problem is NP-hard. For nn points, there could be up to n2n - 2 Steiner points, and finding the optimal configuration requires checking an exponential number of topologies.

For practical purposes (< 20 pads), exact algorithms work. For larger networks, heuristic approaches are used.

Approximate Steiner Tree in Python

from scipy.optimize import minimize

def steiner_objective(s, pads, topology):
    """Total length of Steiner tree given junction point positions."""
    steiner_points = s.reshape(-1, 2)
    all_points = np.vstack([pads, steiner_points])
    total = 0
    for i, j in topology:
        total += np.linalg.norm(all_points[i] - all_points[j])
    return total

# For 3 pads: one Steiner point, 3 edges
pads_3 = coords[:3]
topology_3 = [(0, 3), (1, 3), (2, 3)]  # all connect to Steiner point (index 3)

# Initial guess: centroid
centroid = pads_3.mean(axis=0)
result = minimize(
    steiner_objective,
    x0=centroid,
    args=(pads_3, topology_3),
    method='Nelder-Mead'
)

steiner_pt = result.x.reshape(-1, 2)
print(f"Steiner point: ({steiner_pt[0,0]:.0f}, {steiner_pt[0,1]:.0f})")
print(f"Total length (Steiner): {result.fun:.0f} m")

Level 3: Constrained Optimization (Real-World)

Real pipelines don’t travel in straight lines. You need to account for:

Terrain and Obstacles

Add a cost function that penalizes elevation changes, river crossings, and restricted areas:

Creal(Pi,Pj)=01w(γ(t))γ(t)dtC_{\text{real}}(P_i, P_j) = \int_0^1 w(\gamma(t)) \, \|\gamma'(t)\| \, dt

where γ(t)\gamma(t) is the path from PiP_i to PjP_j and w(γ(t))w(\gamma(t)) is a spatial cost weight (higher for steep terrain, water crossings, or protected land).

Flow Capacity Constraints

Each pipe segment must handle the combined flow from upstream. If pad kk produces qkq_k barrels/day, and segment (i,j)(i,j) carries flow QijQ_{ij}, then:

inQij=outQjk+qj(flow conservation at each node)\sum_{\text{in}} Q_{ij} = \sum_{\text{out}} Q_{jk} + q_j \quad \text{(flow conservation at each node)}

Pipe diameter depends on flow rate via the Darcy-Weisbach equation:

ΔP=fLDρv22\Delta P = f \cdot \frac{L}{D} \cdot \frac{\rho v^2}{2}

where ff is the friction factor, LL is pipe length, DD is diameter, ρ\rho is fluid density, and vv is flow velocity. Larger flows require larger (more expensive) pipe.

The Full Optimization

minE,S,D(i,j)Ec(Dij)Lij\min_{E, S, D} \quad \sum_{(i,j) \in E} c(D_{ij}) \cdot L_{ij}

subject to:

  • All pads connected (connectivity constraint)
  • Flow conservation at every node
  • Pressure drop \leq allowable at delivery point
  • No pipes through restricted zones
  • Minimum/maximum pipe diameter limits

where c(Dij)c(D_{ij}) is the cost per meter of pipe with diameter DijD_{ij}.


Practical Workflow

  1. Get coordinates from your well pad survey or GIS system
  2. Project to UTM (convert lat/lon to meters for distance calculations)
  3. Compute MST as a baseline — this takes milliseconds
  4. Try Steiner optimization if the savings justify the engineering effort
  5. Add real-world constraints (terrain, flow, ROW) for final design
  6. Visualize the network on a map using folium or QGIS
import folium

# Visualize MST on a map
m = folium.Map(location=[32.0, -103.5], zoom_start=13)

for i, j in edges:
    p1 = pad_coords_latlon[i]
    p2 = pad_coords_latlon[j]
    folium.PolyLine([p1, p2], weight=3, color='red').add_to(m)

for idx, coord in enumerate(pad_coords_latlon):
    folium.CircleMarker(coord, radius=6, popup=f'Pad {idx}',
                        color='black', fill=True).add_to(m)

m.save('pipeline_network.html')

Summary

MethodOptimalityComplexityUse Case
MSTGood baselineO(n2logn)O(n^2 \log n)Quick feasibility, < 100 pads
Steiner TreeUp to 13.4% betterNP-hard (exact)Detailed design, < 20 pads
Constrained OptBest (real-world)Problem-specificFinal engineering design

Start with the MST. If the project is large enough that a 10% reduction in pipe length matters, invest in Steiner optimization. For final design, add terrain and flow constraints.


Tools used: Python, NumPy, SciPy, folium, QGIS