seisgo.clustering
The clustering module groups 1-D depth-velocity profiles from 3-D seismic tomography
models into coherent spatial clusters using k-means or self-organizing maps (SOM).
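As a rough illustration of what clustering depth-velocity profiles involves, the sketch below runs plain Euclidean k-means on synthetic profiles using only NumPy. It is not seisgo's implementation: the helper `kmeans_profiles`, its arguments, and the synthetic data are all hypothetical, and the module itself may use a different distance metric and averaging scheme.

```python
import numpy as np

def kmeans_profiles(profiles, k, n_iter=50, seed=0):
    """Cluster rows of `profiles` (n_profiles, n_depths) with Euclidean k-means.

    Hypothetical helper for illustration only; not part of seisgo.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct, randomly chosen profiles.
    centroids = profiles[rng.choice(len(profiles), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Distance from every profile to every centroid: shape (n_profiles, k).
        dist = np.linalg.norm(profiles[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = profiles[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic data: two families of velocity profiles (km/s) on 5 depth nodes.
depth = np.linspace(0.0, 40.0, 5)
slow = 5.5 + 0.02 * depth
fast = 6.5 + 0.03 * depth
rng = np.random.default_rng(1)
profiles = np.vstack([
    slow + 0.01 * rng.standard_normal((10, depth.size)),
    fast + 0.01 * rng.standard_normal((10, depth.size)),
])
labels, centroids = kmeans_profiles(profiles, k=2)
```

Each profile is treated as one sample and each depth node as one feature, so grid points with similar velocity structure end up with the same label.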
Function Summary
vpcluster_kmean Parameter Reference
| Parameter | Default | Description |
|---|---|---|
| | required | 1-D array of latitudes for the model grid. |
| | required | 1-D array of longitudes for the model grid. |
| | required | 1-D depth array (km). |
| | required | 3-D velocity array, shape |
| | | Number of clusters. If |
| | | Range of cluster counts to evaluate when |
| | | Spatial sub-sampling stride (every Nth lat/lon point). |
| | | |
| | | Depth interpolation interval (km). |
| | | Distance metric for k-means. Passed to |
| | | Maximum DBA (DTW Barycenter Averaging) iterations. |
| | | Random seed for reproducibility. |
| | | Number of parallel jobs. |
| | | Plot cluster profiles and map. |
| | | Save figures to PNG. |
| | | Base name for output figure and pickle files. |
| | | Save results to a pickle file. If |
| | | Smooth the distortion curve before knee detection. |
| | | Plot the elbow curve when auto-detecting cluster count. |
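The spatial sub-sampling stride and the depth interpolation interval can be pictured with a short NumPy sketch. Everything here is an assumption made for illustration: the helper `extract_profiles`, the argument names `spacing` and `dz`, the array layout `(depth, lat, lon)`, and the choice to skip columns containing NaN are not taken from seisgo.

```python
import numpy as np

def extract_profiles(lat, lon, dep, v, spacing=2, dz=5.0):
    """Collect 1-D profiles from every `spacing`-th grid point.

    ASSUMPTION: `v` has shape (len(dep), len(lat), len(lon)) and `dep`
    is strictly increasing. Hypothetical helper, not part of seisgo.
    """
    # Regular depth grid with interval dz (km), covering the original range.
    dep_new = np.arange(dep[0], dep[-1] + 0.5 * dz, dz)
    profiles, coords = [], []
    for i in range(0, len(lat), spacing):
        for j in range(0, len(lon), spacing):
            column = v[:, i, j]
            if np.isnan(column).any():
                continue  # assumed behaviour: skip incomplete columns
            # Linear interpolation onto the regular depth grid.
            profiles.append(np.interp(dep_new, dep, column))
            coords.append((lat[i], lon[j]))
    return dep_new, np.array(profiles), np.array(coords)

# Synthetic model: velocity increases linearly with depth everywhere.
lat = np.linspace(30.0, 34.0, 5)
lon = np.linspace(-120.0, -113.0, 8)
dep = np.array([0.0, 10.0, 20.0, 40.0])
v = (5.0 + 0.05 * dep)[:, None, None] * np.ones((dep.size, lat.size, lon.size))

dep_new, profiles, coords = extract_profiles(lat, lon, dep, v, spacing=2, dz=5.0)
```

A larger stride reduces the number of profiles roughly quadratically, which is the main lever for keeping clustering time manageable on dense grids.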
vpcluster_som Parameter Reference
| Parameter | Default | Description |
|---|---|---|
| | required | Same as |
| | | |
| | | Spatial sub-sampling stride. |
| | | Number of SOM training iterations. |
| | | Initial neighbourhood radius. |
| | | Initial learning rate. |
| | same as k-means | Same behaviour as |
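To show how the number of training iterations, the initial neighbourhood radius, and the initial learning rate interact, here is a minimal one-dimensional SOM written with NumPy. It is a generic textbook-style sketch, not the implementation behind `vpcluster_som`; the function names, the linear decay schedules, and the argument names `n_iter`, `sigma0`, and `lr0` are assumptions.

```python
import numpy as np

def train_som_1d(data, n_nodes=3, n_iter=500, sigma0=1.0, lr0=0.5, seed=0):
    """Train a 1-D chain of SOM nodes on rows of `data` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen samples.
    weights = data[rng.choice(len(data), size=n_nodes, replace=False)].astype(float)
    node_pos = np.arange(n_nodes)
    for t in range(n_iter):
        frac = t / n_iter
        # Assumed schedule: radius and learning rate shrink linearly.
        sigma = max(sigma0 * (1.0 - frac), 0.01)
        lr = lr0 * (1.0 - frac)
        x = data[rng.integers(len(data))]
        # Best-matching unit: the node whose weights are closest to x.
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Gaussian neighbourhood around the BMU along the node chain.
        h = np.exp(-((node_pos - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign_som(data, weights):
    """Label each sample with the index of its nearest SOM node."""
    dist = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return dist.argmin(axis=1)

# Synthetic profiles: two velocity families on 5 depth nodes.
depth = np.linspace(0.0, 40.0, 5)
rng = np.random.default_rng(1)
data = np.vstack([
    5.5 + 0.02 * depth + 0.01 * rng.standard_normal((10, depth.size)),
    6.5 + 0.03 * depth + 0.01 * rng.standard_normal((10, depth.size)),
])
weights = train_som_1d(data, n_nodes=3)
labels = assign_som(data, weights)
```

A large initial radius makes neighbouring nodes move together early in training, while the shrinking radius and learning rate let individual nodes settle on distinct profile shapes later on.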
Output dictionary structure
Both functions return (or save) a dictionary with the following keys:
| Key | Description |
|---|---|
| | |
| | User-supplied source label string |
| | User-supplied variable label string |
| | 1-D depth vector used for clustering |
| | Fitted |
| | List of length |
| | Dictionary of algorithm parameters |
| | |
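Since the results can be saved to a pickle file, a saved dictionary can be read back with the standard library. The sketch below writes a stand-in dictionary first so that it runs on its own; the key names and the file name are placeholders chosen for illustration and are not seisgo's actual keys or naming scheme.

```python
import os
import pickle
import tempfile

# Stand-in for a clustering result. PLACEHOLDER keys, not seisgo's actual keys.
results = {
    "depth": [0.0, 10.0, 20.0],
    "params": {"ncluster": 3, "seed": 0},
}

# Hypothetical output file name inside a temporary directory.
outfile = os.path.join(tempfile.mkdtemp(), "cluster_results.pk")
with open(outfile, "wb") as f:
    pickle.dump(results, f)

# Read the dictionary back and list its keys.
with open(outfile, "rb") as f:
    loaded = pickle.load(f)
print(sorted(loaded))
```

Only unpickle files from a trusted source, because loading a pickle can execute arbitrary code.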
Elbow / Knee detection
`vpcluster_evaluate_kmean()` fits k-means for each value in `nrange` and uses the
`kneed` library to locate the knee of the distortion (within-cluster sum of distances) curve.
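```python
import numpy as np
from tslearn.utils import to_time_series_dataset

from seisgo import clustering

# all_profiles: list of 1-D velocity profiles (one per grid point), prepared beforehand
ts = to_time_series_dataset(all_profiles)

# Fit k-means for 2..14 clusters and locate the knee of the distortion curve
nbest, distortions = clustering.vpcluster_evaluate_kmean(
    ts,
    nrange=np.arange(2, 15),
    smooth=True,
    plot=True,
)
print("Recommended cluster count:", nbest)
```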
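For intuition about what locating the knee means, the following pure-Python sketch picks the point of a decreasing distortion curve that lies furthest below the straight line joining its endpoints. This is a simplified heuristic in the spirit of knee detection, not the algorithm used by `kneed` and not seisgo code; the function `find_knee` and the distortion values are made up for illustration, and the sketch assumes the distortions decrease from the first to the last cluster count.

```python
def find_knee(ks, distortions):
    """Return the cluster count at the knee of a decreasing curve.

    Hypothetical helper: normalises both axes to [0, 1] and maximises the
    vertical gap between the chord y = 1 - x and the curve.
    """
    x0, x1 = ks[0], ks[-1]
    y_min, y_max = min(distortions), max(distortions)
    best_k, best_gap = ks[0], float("-inf")
    for k, d in zip(ks, distortions):
        xn = (k - x0) / (x1 - x0)
        yn = (d - y_min) / (y_max - y_min)
        gap = (1.0 - xn) - yn  # how far the curve sits below the chord
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k

# Made-up distortions that drop sharply up to k = 4 and flatten afterwards.
ks = [2, 3, 4, 5, 6, 7, 8]
distortions = [100.0, 60.0, 30.0, 25.0, 22.0, 20.0, 19.0]
knee = find_knee(ks, distortions)
print("Knee at k =", knee)  # prints: Knee at k = 4
```

Noisy distortion curves can shift the detected knee by a cluster or two, which is the usual motivation for smoothing the curve before detection.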