A surrogate to estimate the knock-down factor of imperfect shells

This example implements a graph neural network surrogate to predict the buckling knock-down factor of a half-sphere [1] [2] [3]. It follows the basic workflow described in the documentation.

This notebook can be run in Google Colab. Execute the following cell to install the required packages; outside a Colab environment it is skipped automatically.

try:
    # Check if running in Google Colab
    import google.colab

    # Install PyG as in PyG documentation
    %pip uninstall --yes torch torchaudio torchvision torchtext torchdata
    %pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
    %pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.1+cu124.html
    %pip install torch-sparse -f https://data.pyg.org/whl/torch-2.4.1+cu124.html
    %pip install torch_geometric==2.6.1 -f https://data.pyg.org/whl/torch-2.4.1+cu124.html

    # Install Graphorge
    !git clone https://github.com/bessagroup/graphorge.git
    %pip install -e ./graphorge
except ImportError:
    # Not running in Google Colab: assume dependencies are already installed
    pass

# Pandas is also required for this example
try:
    import pandas as pd
except ImportError:
    %pip install pandas
    import pandas as pd

Imports required to run the notebook:

# Standard
import logging
from pathlib import Path
import pickle
import shutil
import sys

# Third-party
import numpy as np
from scipy.spatial import KDTree
from sklearn.metrics.pairwise import haversine_distances
import torch
from tqdm.auto import tqdm, trange

# Locals
from graphorge.gnn_base_model.data.graph_data import GraphData
from graphorge.gnn_base_model.data.graph_dataset import GNNGraphDataset
from graphorge.gnn_base_model.train.training import train_model
from graphorge.gnn_base_model.predict.prediction import predict

# Make utilities available in the path
try:
    # Check if running in Google Colab
    import google.colab
    utils_path = (
        Path().resolve() / "graphorge" / "benchmarks" / "utils"
    )
except ImportError:
    utils_path = (
        Path().resolve().parents[1] / "utils"
    )
sys.path.insert(0, str(utils_path))

# Import required utilities
from utils import download_file, get_critical_buckling_wavelength

# Set logging configuration
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Prepare directories
directories = dict(
    raw_data=Path("0_data"),
    processed_data=Path("1_dataset"),
    training=Path("2_training_dataset"),
    validation=Path("3_validation_dataset"),
    model=Path("4_model"),
    in_distribution=Path("5_testing_id_dataset"),
    prediction=Path("7_prediction"),
)

# Create directories if they do not exist
for directory in directories.values():
    directory.mkdir(parents=True, exist_ok=True)

# Note: to render the figures in the documentation, we need to save the
# widget state, see https://github.com/microsoft/vscode-jupyter/issues/4404

1. Raw data

The dataset describes imperfect shell geometries, as depicted in the figure below, and provides their corresponding knock-down factor, obtained by finite element modeling (FEM). See the dataset repository for more information [4].

An imperfect shell contains one or more defects, each described by geometric parameters. The knock-down factor is the ratio between the buckling load of the imperfect shell and the buckling load of the corresponding perfect shell.
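
In other words, writing $P_{\mathrm{imperfect}}$ and $P_{\mathrm{perfect}}$ for the buckling loads of the imperfect and perfect shells (symbols introduced here for convenience), the knock-down factor is

$$\kappa_d = \frac{P_{\mathrm{imperfect}}}{P_{\mathrm{perfect}}}, \qquad 0 < \kappa_d \leq 1.$$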

Problem definition

The dataset is a consolidated table where each row describes a defect as follows:

  • defect_id: the ID of the defect.

  • shell_id: the corresponding shell ID.

  • knock_down: the knock-down factor associated with the shell, from 0 to 1.

  • theta: first spherical coordinate of the defect.

  • phi: second spherical coordinate of the defect.

  • delta: depth of the defect.

  • lambda: width of the defect.

  • radius: shell radius.

  • nu: Poisson ratio for the shell material.

  • eta: ratio of the shell radius to the shell thickness.

Let’s download the dataset and inspect it.

# Set the base URL for downloading the dataset
base_url = "placeholder"

# Download the dataset file
download_file(
    file="shells.tar.xz", base_url=base_url, dest_path=directories["raw_data"]
)

# Extract the dataset
shutil.unpack_archive(
    directories["raw_data"] / "shells.tar.xz",
    extract_dir=directories["raw_data"],
    format="xztar",
)

# Load the raw data, using pandas for convenience
shells_df = pd.read_csv(directories["raw_data"] / "shells.csv")

# Inspect the data
shells_df.head()
INFO:root:Using 'shells.tar.xz' cached in in /home/guillaume/Documents/code/graphorge_fork/benchmarks/mechanics/gnn_shell_buckling/0_data
defect_id shell_id knock_down theta phi delta lambda radius nu eta
0 0 0 0.476146 3.448297 0.873164 0.066385 1.0 25.4 0.5 110
1 1 0 0.476146 3.787274 0.756040 0.098328 1.0 25.4 0.5 110
2 2 0 0.476146 2.661902 0.827046 0.092914 1.0 25.4 0.5 110
3 3 0 0.476146 6.054872 0.629573 0.247017 1.0 25.4 0.5 110
4 4 0 0.476146 4.974555 0.744314 0.119663 1.0 25.4 0.5 110
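
Each row describes a single defect, so a shell with several defects spans several rows sharing the same shell_id and knock_down value. A quick, purely illustrative way to check this structure with pandas:

# Number of defects and knock-down factor per shell (first few shells)
shells_summary = shells_df.groupby("shell_id").agg(
    n_defects=("defect_id", "count"),
    knock_down=("knock_down", "first"),
)
shells_summary.head()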

2. Graph dataset

We encode each shell as a graph in which each node is a defect. The connectivity is defined by a critical buckling wavelength $l_c$: defects closer than this distance interact and are connected by an edge in the graph representation of the shell, while defects farther apart than $l_c$ are left unconnected, see the figure above.
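
As a toy illustration of this rule (not part of the pipeline), the sketch below connects only the pairs of points whose mutual Euclidean distance falls below a threshold standing in for $l_c$:

# Toy illustration of the connectivity rule: connect pairs of points whose
# mutual distance is below a threshold (standing in for the critical
# buckling wavelength)
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
threshold = 2.0

# Pairwise Euclidean distances
distances = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

# Keep each unordered pair only once (upper triangle, k=1)
edges = np.vstack(np.triu(distances < threshold, k=1).nonzero()).T
print(edges)  # [[0 1]] -> only the first two points are close enough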

First, we need to extract the relevant information from the dataset.

def extract_shell_data(shell_id, shells_df, distance="euclidean"):
    """
    Extracts the shell data for a given shell_id from the shells_df DataFrame.

    Parameters
    ----------
    shell_id : int
        The ID of the shell to extract.
    shells_df : pd.DataFrame
        The DataFrame containing the shell data.
    distance : str, optional
        The distance metric to use for querying defects, by default "euclidean".

    Returns
    -------
    tuple
        A tuple containing:
        - shell_df: DataFrame with the shell data for the given shell_id.
        - defect_coordinates: np.ndarray with the cartesian coordinates of the
          defects.
        - edge_indexes: np.ndarray with the node indices defining the edges.
        - knock_down: float, the knock down factor for the shell.
    """
    # Filter rows for the current shell_id
    shell_df = shells_df[shells_df["shell_id"] == shell_id]
    shell_radius = shell_df["radius"].iloc[0]
    shell_thickness = shell_df["radius"].iloc[0] / shell_df["eta"].iloc[0]
    poisson = shell_df["nu"].iloc[0]
    knock_down = shell_df["knock_down"].iloc[0]

    # Calculate the critical buckling wavelength using the shell parameters
    critical_bw = get_critical_buckling_wavelength(
        poisson=poisson,
        shell_radius=shell_radius,
        shell_thickness=shell_thickness,
    )

    # Query the shell_df to get the defects
    n_defects = len(shell_df)
    if n_defects > 0:
        # Initialize an array to hold the defect cartesian coordinates
        defect_coordinates = np.zeros((n_defects, 3))

        # Convert spherical coordinates to cartesian coordinates
        defect_coordinates[:, 0] = (
            shell_radius * np.sin(shell_df["phi"]) * np.cos(shell_df["theta"])
        )
        defect_coordinates[:, 1] = (
            shell_radius * np.sin(shell_df["phi"]) * np.sin(shell_df["theta"])
        )
        defect_coordinates[:, 2] = shell_radius * np.cos(shell_df["phi"])

        # Query defects based on the distance metric to create edges
        if distance == "euclidean":
            # Initialize a kd tree
            kd_tree = KDTree(defect_coordinates)
            # Get the indices of the nodes that are within the
            # critical buckling wavelength
            src_dest_idx = kd_tree.query_pairs(
                r=critical_bw, p=2.0, output_type="ndarray"
            )

        elif distance == "haversine":
            # haversine_distances expects (latitude, longitude) pairs in
            # radians; phi is the polar (colatitude) angle, so convert it
            # to latitude first
            lat_lon = np.column_stack(
                (np.pi / 2 - shell_df["phi"], shell_df["theta"])
            )
            # Calculate the angular distance between defects
            angular_distance = haversine_distances(X=lat_lon)
            # Convert angular distance to great-circle distance
            great_circle_distance = angular_distance * shell_radius
            # Get the indices of the nodes that are within the
            # critical buckling wavelength
            src_dest_idx = np.vstack(
                np.triu(great_circle_distance <= critical_bw, k=1).nonzero()
            ).T

        else:
            raise ValueError(f"Unknown distance metric: '{distance}'")

        return shell_df, defect_coordinates, src_dest_idx, knock_down
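
As a quick sanity check (illustrative only), we can run the extraction on a single shell and report the number of defects and edges it produces:

# Extract shell 0 and summarize the resulting connectivity
shell_df, coords, edges, knock_down = extract_shell_data(
    shell_id=0, shells_df=shells_df, distance="euclidean"
)
print(
    f"Shell 0: {len(shell_df)} defects, {len(edges)} edges, "
    f"knock-down factor {knock_down:.3f}"
)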

Then, we build the graph. This operation is at the core of the Graphorge framework.

def build_graphorge_graph(
    node_coordinates, edge_indexes, node_features, global_targets
):
    """
    Builds a GraphData object from the provided node coordinates, edge indexes,
    node features, and global targets.

    Parameters
    ----------
    node_coordinates : np.ndarray
        An array of shape (n_nodes, n_dim) containing the coordinates of the
        nodes.
    edge_indexes : np.ndarray
        An array of shape (n_edges, 2) containing the indices of the edges.
    node_features : np.ndarray
        An array of shape (n_nodes, n_features) containing the features of the
        nodes.
    global_targets : np.ndarray
        An array of shape (n_samples, n_targets) containing the global targets.

    Returns
    -------
    GraphData
        An instance of GraphData containing the graph structure and features.
    """
    # Instantiate the graph data (graphorge)
    graph_data = GraphData(n_dim=3, nodes_coords=node_coordinates)

    # Make the edges undirected, i.e., create target_to_source index pairs
    # from source_to_target index pairs
    edge_indexes = np.concatenate(
        (edge_indexes, edge_indexes[:, ::-1]), axis=0
    )

    # Set graph edges indexes. The data does not contain duplicated edges,
    # so we can set is_unique to False to avoid unnecessary processing
    graph_data.set_graph_edges_indexes(
        edges_indexes_mesh=edge_indexes, is_unique=False
    )

    # Set graph node features
    graph_data.set_node_features_matrix(node_features)

    # Set global targets matrix
    global_targets_matrix = global_targets
    # Set graph global targets
    graph_data.set_global_targets_matrix(global_targets_matrix)

    return graph_data
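
The mirroring trick used above to make the edges undirected is worth a quick standalone illustration:

# Each source-to-target pair is mirrored into a target-to-source pair
edges = np.array([[0, 1], [1, 2]])
undirected = np.concatenate((edges, edges[:, ::-1]), axis=0)
print(undirected)
# [[0 1]
#  [1 2]
#  [1 0]
#  [2 1]]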

Let’s put everything together.

def process_shell(shell_id, shells_df, distance="euclidean"):
    """
    Processes a shell with the given shell_id from the shells_df DataFrame.

    Parameters
    ----------
    shell_id : int
        The ID of the shell to process.
    shells_df : pd.DataFrame
        The DataFrame containing the shell data.
    distance : str, optional
        The distance metric to use for querying defects, by default
        "euclidean".

    Returns
    -------
    GraphData
        An instance of GraphData containing the processed shell data.
    """
    # Extract the shell data
    shell_df, defect_coordinates, src_dest_idx, knock_down = (
        extract_shell_data(shell_id, shells_df, distance=distance)
    )

    # Prepare the node features
    node_features = shell_df[["phi", "theta", "lambda", "delta"]].to_numpy()

    # Prepare the global targets with the adequate shape
    global_targets = np.array([[knock_down]])

    # Build the graph data
    graph_data = build_graphorge_graph(
        node_coordinates=defect_coordinates,
        edge_indexes=src_dest_idx,
        node_features=node_features,
        global_targets=global_targets,
    )

    return graph_data

We can inspect the obtained graphs.

graph_data = process_shell(
    shell_id=0, shells_df=shells_df, distance="euclidean"
)
graph_data.plot_graph(is_show_plot=True)
[Figure: graph representation of shell 0 with edges built from the Euclidean distance metric]

Note the difference when using a different distance metric.

graph_data = process_shell(
    shell_id=0, shells_df=shells_df, distance="haversine"
)
graph_data.plot_graph(is_show_plot=True)
[Figure: graph representation of shell 0 with edges built from the haversine (great-circle) distance metric]

Note

In this example, we compute the connectivity ourselves before passing the edge indices to Graphorge, which lets us use custom distance metrics. The set_graph_edges_indexes method also accepts a connect_radius argument that performs a radius search and sets the connectivity automatically, based on the Euclidean distance.

Similarly, we choose not to use get_undirected_unique_edges because we already know the edges are unique.
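
For reference, a minimal sketch of that alternative, assuming connect_radius can simply replace the explicit edge list (the exact signature should be checked against the Graphorge API); defect_coordinates and critical_bw stand for the arrays computed in extract_shell_data:

# Hypothetical alternative: let Graphorge build the connectivity itself
# from a Euclidean radius search. connect_radius is assumed to replace the
# explicit edge list; verify the exact signature in the Graphorge API.
graph_data = GraphData(n_dim=3, nodes_coords=defect_coordinates)
graph_data.set_graph_edges_indexes(connect_radius=critical_bw)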

Finally, we generate the dataset.

# Parameters
samples = "all"  # Number of samples to process, or "all" for all samples
distance = "haversine"  # Distance metric to use: "euclidean" or "haversine"

# Determine the number of shells to process (shell IDs are zero-based)
n_samples = shells_df["shell_id"].max() + 1
if samples != "all":
    n_samples = min(samples, n_samples)

dataset_sample_files = []
# Iterate over each shell_id in the dataset
for shell_id in trange(n_samples, desc="Processing shells"):

    # Process the shell data to get the graph data
    graph_data = process_shell(shell_id, shells_df, distance=distance)

    # Get PyG homogeneous graph data object
    pyg_graph = graph_data.get_torch_data_object()

    # Set sample file name
    sample_file_name = f"shell_graph_{shell_id:04d}.pt"
    # Set sample file path
    sample_file_path = directories["processed_data"] / sample_file_name
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # Save graph sample file
    torch.save(pyg_graph, sample_file_path)
    # Save graph sample file path
    dataset_sample_files.append(sample_file_path)

Let’s split the data into train/validation/test datasets.

# This operation is random, so we fix the seed for reproducibility
seed = 42

# Initialize the random number generator
rng = np.random.default_rng(seed)

dataset_split_sizes = {
    "training": 0.7,
    "validation": 0.2,
    "in_distribution": 0.1,
}

split_fractions = np.array(list(dataset_split_sizes.values()))
# Normalize the split fractions to ensure they sum to 1
split_fractions /= split_fractions.sum()
# Get the corresponding split sizes
split_sizes = (split_fractions * len(dataset_sample_files)).astype(int)
# Get the split indices as the cumulative sum of the split sizes
split_indices = np.cumsum(split_sizes)

# Shuffle the dataset sample files
rng.shuffle(dataset_sample_files)

# Split the dataset sample files into training, validation, and test sets
splits_sample_files = np.split(dataset_sample_files, split_indices[:-1])

for split_name, split_files in tqdm(
    zip(dataset_split_sizes.keys(), splits_sample_files),
    desc="Splitting datasets",
):
    split_paths = []
    for file_path in split_files:
        target_path = directories[split_name] / file_path.name
        shutil.copy(
            file_path,
            target_path,
        )

        split_paths.append(target_path.as_posix())

    dataset = GNNGraphDataset(
        directories[split_name].as_posix(),
        dataset_sample_files=split_paths,
        dataset_basename="shell_dataset",
        is_store_dataset=False,
    )

    _ = dataset.save_dataset(is_append_n_sample=True)
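
Optionally, we can verify how many graph samples ended up in each split (a quick check, not required by the pipeline):

# Report the number of graph samples per split
for split_name, split_files in zip(
    dataset_split_sizes.keys(), splits_sample_files
):
    print(f"{split_name}: {len(split_files)} samples")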

3. GNN model architecture

We configure an encode-process-decode GNN: the four node input features (the defect parameters) are encoded, propagated through two message-passing steps, and aggregated into a single global output, the knock-down factor.

# Set the GNN model parameters
gnn_architecture_parameters = dict(
    # Set number of node input and output features
    n_node_in=4,
    n_node_out=0,
    n_time_node=0,
    # Set number of edge input and output features
    n_edge_in=0,
    n_edge_out=0,
    n_time_edge=0,
    # Set number of global input and output features
    n_global_in=0,
    n_global_out=1,
    n_time_global=0,
    # Set number of message-passing steps (number of processor layers)
    n_message_steps=2,
    # Set number of FNN/RNN hidden layers
    enc_n_hidden_layers=3,
    pro_n_hidden_layers=2,
    dec_n_hidden_layers=3,
    # Set hidden layer size
    hidden_layer_size=128,
    # Set model directory
    model_directory=directories["model"],
    model_name="ShellBucklingGNN",
    # Set model input and output features normalization
    is_model_in_normalized=True,
    is_model_out_normalized=True,
    # Set aggregation schemes
    pro_edge_to_node_aggr="add",
    pro_node_to_global_aggr="mean",
    # Set activation functions
    enc_node_hidden_activ_type="tanh",
    enc_node_output_activ_type="identity",
    enc_edge_hidden_activ_type="tanh",
    enc_edge_output_activ_type="identity",
    pro_node_hidden_activ_type="tanh",
    pro_node_output_activ_type="identity",
    pro_edge_hidden_activ_type="tanh",
    pro_edge_output_activ_type="identity",
    dec_node_hidden_activ_type="tanh",
    dec_node_output_activ_type="identity",
    # Set device
    device_type="cuda" if torch.cuda.is_available() else "cpu",
)

4. GNN model training

We load the training and validation datasets, then train the model with early stopping and an exponentially decaying learning rate.

train_dataset_file_path = list(
    directories["training"].glob("shell_dataset_*.pkl")
)[0]

validation_dataset_file_path = list(
    directories["validation"].glob("shell_dataset_*.pkl")
)[0]

# Load the training and validation datasets
train_dataset = GNNGraphDataset.load_dataset(train_dataset_file_path)
validation_dataset = GNNGraphDataset.load_dataset(validation_dataset_file_path)


training_parameters = dict(
    # Set number of epochs
    n_max_epochs=50,
    # Set batch size
    batch_size=16,
    # Set optimizer
    opt_algorithm="adam",
    # Set learning rate
    lr_init=1.0e-03,
    # Set learning rate scheduler
    lr_scheduler_type=None,
    lr_scheduler_kwargs=None,
    # Set loss function
    loss_nature="global_features_out",
    loss_type="mse",
    loss_kwargs=dict(),
    # Set data shuffling
    is_sampler_shuffle=False,
    # Set early stopping
    is_early_stopping=True,
    # Set early stopping parameters
    early_stopping_kwargs=dict(
        validation_dataset=validation_dataset,
        validation_frequency=1,
        trigger_tolerance=20,
        improvement_tolerance=1e-2,
    ),
    # Set seed
    seed=42,
    # Set verbosity
    is_verbose=True,
    tqdm_flavor="notebook",
)

# Compute exponential decay (learning rate scheduler)
lr_end = 1.0e-5
gamma = (lr_end / training_parameters["lr_init"]) ** (
    1 / training_parameters["n_max_epochs"]
)
# Set learning rate scheduler
training_parameters["lr_scheduler_type"] = "explr"
training_parameters["lr_scheduler_kwargs"] = dict(gamma=gamma)

# Set model state loading
load_model_state = None

# Training of GNN-based model
model, _, _ = train_model(
    dataset=train_dataset,
    model_init_args=gnn_architecture_parameters,
    **training_parameters
)
> Setting device: cpu

Graph Neural Network model training
-----------------------------------

> Initializing model...

Fitting GNN-based model data scalers
------------------------------------
> Processing data samples: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5242/5242 [00:05<00:00, 1004.67 sample/s]
> Processing data samples: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5242/5242 [00:03<00:00, 1437.54 sample/s]
> Setting fitted standard scalers...


> Initializing early stopping criterion...

> Training data set size: 5242

> Input data normalization: Yes

> Output data normalization: Yes


> Starting training process...

> Finished training process!


> Minimum training loss (normalized): 5.94457537e-02 | Epoch: 49

> Model directory: 4_model

> Total training time: 0:10:53 | Avg. training time per epoch: 0:00:13

5. GNN model prediction

First, let's analyze the loss curves.

from plotly import graph_objects as go

with open(directories["model"] / "loss_history_record.pkl", "rb") as f:
    loss_history_record = pickle.load(f)

# Plot the loss history
fig = go.Figure()
fig.add_scatter(
    y=loss_history_record["training_loss_history"],
    mode="lines+markers",
    name="Training loss",
)

fig.add_scatter(
    y=loss_history_record["validation_loss_history"],
    mode="lines+markers",
    name="Validation loss",
)

fig.update_layout(
    xaxis_title="Epochs",
    yaxis_title="Loss",
    template="plotly_white+xgridoff",
    width=800,
    height=400,
)
fig.show()

Next, we evaluate the trained model on the in-distribution test set.

test_dataset_file_path = list(
    directories["in_distribution"].glob("shell_dataset_*.pkl")
)[0]

dataset = GNNGraphDataset.load_dataset(test_dataset_file_path)

predictions_location, _ = predict(
    dataset,
    model_directory=directories["model"],
    predict_directory=directories["prediction"],
    load_model_state="best",
    loss_nature="global_features_out",
    loss_type="mse",
    loss_kwargs={},
    is_normalized_loss=False,
    dataset_file_path=test_dataset_file_path,
    device_type="cuda" if torch.cuda.is_available() else "cpu",
    seed=None,
    is_verbose=True,
    tqdm_flavor="notebook",
)
Graph Neural Network model prediction
-------------------------------------

> Loading Graph Neural Network model...


> Starting prediction process...
> Finished prediction process!


> Avg. prediction loss per sample: 5.82017878e-04 | mse

> Prediction results directory: /home/guillaume/Documents/code/graphorge_fork/benchmarks/mechanics/gnn_shell_buckling/7_prediction/prediction_set_2

> Total prediction time: 0:00:02 | Avg. prediction time per sample: 0:00:00
Finally, we compare the predicted knock-down factors with the ground truth in a parity plot.

ground_truth = []
predictions = []
# Load the predictions and ground truth from the prediction files
for result_path in tqdm(
    Path(predictions_location).glob("prediction_sample_*.pkl"),
    desc="Loading predictions",
):
    with open(result_path, "rb") as f:
        result = pickle.load(f)
        predictions.append(result["global_features_out"].detach().cpu().item())
        ground_truth.append(result["global_targets"].detach().cpu().item())


min_value = min(min(predictions), min(ground_truth))
max_value = max(max(predictions), max(ground_truth))

fig = go.Figure()

fig.add_scatter(
    x=[min_value, max_value],
    y=[min_value, max_value],
    mode="lines",
    line_color="black",
    name="Identity line",
)

fig.add_scatter(
    x=ground_truth,
    y=predictions,
    mode="markers",
    name="Knock-down factor",
    opacity=0.5,
)

fig.update_layout(
    xaxis_title="Ground truth",
    yaxis_title="Predictions",
    template="plotly_white+xgridoff",
    width=600,
    height=600,
)

fig.update_yaxes(
    scaleanchor="x",
    scaleratio=1,
)
fig.show()
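
As a complementary quantitative check (not part of the original pipeline), we can compute standard regression metrics on the same predictions using scikit-learn, which is already a dependency of this example:

from sklearn.metrics import mean_absolute_error, r2_score

# Summarize prediction quality on the in-distribution test set
mae = mean_absolute_error(ground_truth, predictions)
r2 = r2_score(ground_truth, predictions)
print(f"MAE: {mae:.4f} | R^2: {r2:.4f}")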