I am currently working on my thesis for 3D path planning involving external elements, and I have completed about 1/5 of the total trials that I am doing. For each 1/5 of the trials I am increasing the size and complexity of the environment. Right now it takes a significant amount of time (5+ minutes for the generic algorithm, with a MiniHeap, using parfor loops, 40 cpu cores, process pool model) to go through 1000 Nodes as the amount of nodes increases the time increases dramatically. Currently I am trying to implement a bi-directional search to improve the speed of the function.
I am currently using MATLAB for this thesis as I can put the code into a cluster and I was wondering if anyone could see any issues that could improve performance. I do not know how to properly put C++ or Python into a MATLAB script. The check done in ~smooth is see if a path can potentially be created, and that I need to add in more nodes into the environment.
function [path, totalCost, totalDistance, totalTime, totalRE, nodeId] = AStarPathTD(nodes, adjacencyMatrix3D, heuristicMatrix, start, goal, Kd, Kt, Ke, cost_calc, buildingPositions, buildingSizes, r, smooth)
% Find index of start and goal nodes
[~, startIndex] = min(pdist2(nodes, start));
[~, goalIndex] = min(pdist2(nodes, goal));
if ~smooth
    connectedToStart = find(adjacencyMatrix3D(startIndex,:,1) < inf & adjacencyMatrix3D(startIndex,:,1) > 0); %getConnectedNodes(startIndex, nodes, adjacencyMatrix3D, r, buildingPositions, buildingSizes);
    connectedToEnd = find(adjacencyMatrix3D(goalIndex,:,1) < inf & adjacencyMatrix3D(goalIndex,:,1) > 0); %getConnectedNodes(goalIndex, nodes, adjacencyMatrix3D, r, buildingPositions, buildingSizes);
    if isempty(connectedToStart) || isempty(connectedToEnd)
        if isempty(connectedToEnd) && isempty(connectedToStart)
            nodeId = [startIndex; goalIndex];
        elseif isempty(connectedToEnd) && ~isempty(connectedToStart)
            nodeId = goalIndex;
        elseif isempty(connectedToStart) && ~isempty(connectedToEnd)
            nodeId = startIndex;
        end
        path = [];
        totalCost = [];
        totalDistance = [];
        totalTime = [];
        totalRE = [];
        return;
    end
end
% Bidirectional search setup
openSetF = MinHeap(); % From start
openSetB = MinHeap(); % From goal
openSetF = insert(openSetF, startIndex, 0);
openSetB = insert(openSetB, goalIndex, 0);
numNodes = size(nodes, 1);
LARGENUMBER = 10e10;
gScoreF = LARGENUMBER * ones(numNodes, 1); % Future cost from start
gScoreB = LARGENUMBER * ones(numNodes, 1); % Future cost from goal
fScoreF = LARGENUMBER * ones(numNodes, 1); % Total cost from start
fScoreB = LARGENUMBER * ones(numNodes, 1); % Total cost from goal
gScoreF(startIndex) = 0;
gScoreB(goalIndex) = 0;
cameFromF = cell(numNodes, 1); % Path tracking from start
cameFromB = cell(numNodes, 1); % Path tracking from goal
% Early exit flag
isPathFound = false;
meetingPoint = -1;
%pre pre computing costs
heuristicCosts = calculateCost(heuristicMatrix(:,1), heuristicMatrix(:,2), heuristicMatrix(:,3), Kd, Kt, Ke, cost_calc);
costMatrix = inf(numNodes, numNodes);
for i = 1:numNodes
    for j = i +1: numNodes
        if adjacencyMatrix3D(i,j,1) < inf
            costMatrix(i,j) = calculateCost(adjacencyMatrix3D(i,j,1), adjacencyMatrix3D(i,j,2), adjacencyMatrix3D(i,j,3), Kd, Kt, Ke, cost_calc);
            costMatrix(j,i) = costMatrix(i,j);
        end
    end
end
costMatrix = sparse(costMatrix);
%initial costs
fScoreF(startIndex) = heuristicCosts(startIndex);
fScoreB(goalIndex) = heuristicCosts(goalIndex);
%KD Tree
kdtree = KDTreeSearcher(nodes);
% Main loop
while ~isEmpty(openSetF) && ~isEmpty(openSetB)
    % Forward search
    [openSetF, currentF] = extractMin(openSetF);
    if isfinite(fScoreF(currentF)) && isfinite(fScoreB(currentF))
        if fScoreF(currentF) + fScoreB(currentF) < LARGENUMBER % Possible meeting point
            isPathFound = true;
            meetingPoint = currentF;
            break;
        end
    end
    % Process neighbors in parallel
    neighborsF = find(adjacencyMatrix3D(currentF, :, 1) < inf & adjacencyMatrix3D(currentF, :, 1) > 0);
    tentative_gScoresF = inf(1, numel(neighborsF));
    tentativeFScoreF = inf(1, numel(neighborsF));
    gScoreFCurrent = gScoreF(currentF);
    costF = costMatrix(currentF, neighborsF);
    tentative_gScoresF = gScoreFCurrent + costF;
    validMask = ~isinf(tentative_gScoresF);
    validNeighborsF = neighborsF(validMask);
    validHScoresF = heuristicCosts(validNeighborsF);
    tentativeFScoreF(validMask) = tentative_gScoresF(validMask) +  validHScoresF';
    %validNeighborsF = neighborsF(validMask);
    validTentative_gScoresF = tentative_gScoresF(validMask);
    validTentativeFScoreF = tentativeFScoreF(validMask);
    for i = numel(validNeighborsF)
        neighbor = validNeighborsF(i);
        tentative_gScore = validTentative_gScoresF(i);
        if tentative_gScore < gScoreF(neighbor)
            cameFromF{neighbor} = currentF;
            gScoreF(neighbor) = tentative_gScore;
            fScoreF(neighbor) = validTentativeFScoreF(i);
            openSetF = insert(openSetF, neighbor, fScoreF(neighbor));
        end
    end
    % Backward search
    [openSetB, currentB] = extractMin(openSetB);
    if isfinite(fScoreF(currentB)) && isfinite(fScoreB(currentB))
        if fScoreF(currentB) + fScoreB(currentB) < LARGENUMBER % Possible meeting point
            isPathFound = true;
            meetingPoint = currentB;
            break;
        end
    end
    % Process neighbors in parallel
    neighborsB = find(adjacencyMatrix3D(currentB, :, 1) < inf & adjacencyMatrix3D(currentB, :, 1) > 0);
    tentative_gScoresB = inf(1, numel(neighborsB));
    tentativeFScoreB = inf(1, numel(neighborsB));
    gScoreBCurrent = gScoreB(currentB);
    tentative_gScoresB = gScoreBCurrent + costMatrix(currentB, neighborsB);
    validMask = ~isinf(tentative_gScoresB);
    validNeighborsB = neighborsB(validMask);
    validHScoreB = heuristicCosts(validNeighborsB);
    tentativeFScoreB(validMask) = tentative_gScoresB(validMask) + validHScoreB';
    %validNeighborsB = neighborsB(validMask);
    validTentative_gScoresB = tentative_gScoresB(validMask);
    validTentativeFScoresB = tentativeFScoreB(validMask);
    
    
    for i = numel(validNeighborsB)
        neighbor = validNeighborsB(i);
        tentative_gScore = validTentative_gScoresB(i);
        if tentative_gScore < gScoreB(neighbor)
            cameFromB{neighbor} = currentB;
            gScoreB(neighbor) = tentative_gScore;
            fScoreB(neighbor) = validTentativeFScoresB(i);
            openSetB = insert(openSetB, neighbor, fScoreB(neighbor));
        end
    end
end
if isPathFound
    pathF = reconstructPath(cameFromF, meetingPoint, nodes);
    pathB = reconstructPath(cameFromB, meetingPoint, nodes);
    pathB = flipud(pathB);
    path = [pathF; pathB(2:end, :)]; % Concatenate paths
    totalCost = fScoreF(meetingPoint) + fScoreB(meetingPoint);
    
    pathIndices = knnsearch(kdtree, path, 'K', 1);
    totalDistance = 0;
    totalTime = 0;
    totalRE = 0;
    for i = 1:(numel(pathIndices) - 1)
        idx1 = pathIndices(i);
        idx2 = pathIndices(i+1);
        totalDistance = totalDistance + adjacencyMatrix3D(idx1, idx2, 1);
        totalTime = totalTime + adjacencyMatrix3D(idx1, idx2, 2);
        totalRE = totalRE + adjacencyMatrix3D(idx1, idx2, 3);
    end
   
    nodeId = [];
else
    path = [];
    totalCost = [];
    totalDistance = [];
    totalTime = [];
    totalRE = [];
    nodeId = [currentF; currentB];
end
end
function path = reconstructPath(cameFrom, current, nodes)
path = current;
while ~isempty(cameFrom{current})
    current = cameFrom{current};
    path = [current; path];
end
path = nodes(path, :);
end
function [cost] = calculateCost(RD,RT,RE, Kt,Kd,Ke,cost_calc)       
% Time distance and energy cost equation constants can be modified on needs
        if cost_calc == 1
        cost = RD/Kd; % weighted cost function
        elseif cost_calc == 2
            cost = RT/Kt;
        elseif cost_calc == 3
            cost = RE/Ke;
        elseif cost_calc == 4
            cost = RD/Kd + RT/Kt;
        elseif cost_calc == 5
            cost = RD/Kd +  RE/Ke;
        elseif cost_calc == 6
            cost =  RT/Kt + RE/Ke;
        elseif cost_calc == 7
            cost = RD/Kd + RT/Kt + RE/Ke;
        elseif cost_calc == 8
            cost =  3*(RD/Kd) + 4*(RT/Kt) ;
        elseif cost_calc == 9
            cost =  3*(RD/Kd) + 5*(RE/Ke);
        elseif cost_calc == 10
            cost =  4*(RT/Kt) + 5*(RE/Ke);
        elseif cost_calc == 11
            cost =  3*(RD/Kd) + 4*(RT/Kt) + 5*(RE/Ke);
        elseif cost_calc == 12
            cost =  4*(RD/Kd) + 5*(RT/Kt) ;
        elseif cost_calc == 13
            cost =  4*(RD/Kd) + 3*(RE/Ke);
        elseif cost_calc == 14
            cost =  5*(RT/Kt) + 3*(RE/Ke);
        elseif cost_calc == 15
            cost =  4*(RD/Kd) + 5*(RT/Kt) + 3*(RE/Ke);
        elseif cost_calc == 16
            cost =  5*(RD/Kd) + 3*(RT/Kt) ;
        elseif cost_calc == 17
            cost =  5*(RD/Kd) + 4*(RE/Ke);
        elseif cost_calc == 18
            cost =  3*(RT/Kt) + 4*(RE/Ke);
        elseif cost_calc == 19
            cost =  5*(RD/Kd) + 3*(RT/Kt) + 4*(RE/Ke);
        end  
end
The function works as intended; it just takes a really long time to run the function which is an issue for running the function multiple times for multiple simulations in a trial.
Update 1: I have added the updated A* path with the relavent helper functions as well as the main loop that calls the A* function below, I will add in a screen shot of the profiler once I get a valid path.
disp('Calculating Adjacency and Heuristic Matricies...');
    adjacencyMatrix = calculateLocalAdjacencyMatrix(nodes, n,buildingPositions, buildingSizes, mapSize,u,v,w,false,gridResolution);
    disp('Adjacency Matrix Calculated...');
    processPairMatrix = processPairGeneration(adjacencyMatrix, nodes,  buildingSizes, mapSize);
%% Heuristic Matrix h cost per node
    heuristicMatrix = calculateHeuristicMatrix(nodes, endPoint, u, v, w,true,gridResolution);
    disp('Heuristic Matrix Calculated...');
%% Perform A* pathfinding Modify for parellel processing
    for cost_idx = 1:cost_calcs
         disp('Calculating A* Path');
         [path, totalCost, totalDistance, totalTime, totalRE, nodeId] = AStarPathTD(nodes, adjacencyMatrix, heuristicMatrix,  startPoint, endPoint, Kd,Kt,Ke, cost_idx, buildingPositions, buildingSizes, r_n(size(nodes,1)), false);    
         disp(['Original A* Path Calculated, Combination ' num2str(cost_idx) ' Calculated'] );
         batchadded = 0;
         while isempty(path) || totalCost == inf
            disp('No valid path found between the start and end points. Adding more nodes...');
            nodeadded = round(0.5*numNodes);
            newnumNodes = size(nodes,1)+nodeadded;
            PointAddNodes = [];
            for z = 1:size(nodeId,1)
            PointNodes= generateNodesSP(startPoint, buildingPositions,buildingSizes, 0.25*r_n(size(nodes,1)), r_n(size(nodes,1)), nodes,mapSize,adjacencyMatrix);
            disp(['Nodes Added Around Node ' num2str(nodes(nodeId(z),:))]);
            PointAddNodes = [PointAddNodes;PointNodes];
            end
            [~,ia,~] = intersect(PointAddNodes, nodes, 'rows');
            PointAddNodes(ia,:) = []; 
            MoreNodes = [nodes; PointAddNodes];
            disp('Updating Adjacency Matrix');
            adjacencyMatrix = calculateLocalAdjacencyMatrixNewNodes(n, buildingPositions, buildingSizes, mapSize, u, v, w, false, gridResolution, adjacencyMatrix, nodes, PointAddNodes);
            disp('Adjacency Matrix Updated');
            %% possible path check
            [~, startNodeIdx] = min(sum((nodes - startPoint).^2, 2));
            startNodeIdx = max(1, min(startNodeIdx, size(nodes,1)));
            % Initialize the list of nodes connected to start and end points
            connectedToStart = getConnectedNodes(startNodeIdx, MoreNodes, adjacencyMatrix, r_n(size(MoreNodes,1)), buildingPositions, buildingSizes); 
            [~, EndNodeIdx] = min(sum((nodes - endPoint).^2, 2));
            EndNodeIdx = max(1, min(EndNodeIdx, size(nodes,1)));
            % Initialize the list of nodes connected to start and end points
            connectedToEnd = getConnectedNodes(EndNodeIdx, MoreNodes, adjacencyMatrix, r_n(size(MoreNodes,1)), buildingPositions, buildingSizes); 
            if isNodeConnectable(startPoint, startPoint, endPoint, connectedToStart, connectedToEnd)
                disp('Possible Path Found calculating new Heruistic Matrix');
                 heuristicMatrix = calculateHeuristicMatrixAddedNodes(nodes, endPoint, u, v, w, true, gridResolution, heuristicMatrix, StartEndNodes);
                 disp('New Heuristic Matrix has been made');
                 nodes= [nodes; StartEndNodes];
                % Attempt A* pathfinding again
                 disp('Calculating A* Path With New Nodes');
                [path, totalCost, totalDistance, totalTime, totalRE, nodeId] = AStarPathTD(nodes, adjacencyMatrix, heuristicMatrix,  startPoint, endPoint, Kd, Kt, Ke, cost_idx, buildingPositions, buildingSizes, r_n(size(nodes,1)), false); 
            end
            processPairMatrix = processPairGeneration(adjacencyMatrix, MoreNodes,  buildingSizes, mapSize); 
            [newNodes, processPairMatrix] = generateNodesAdjacencyMatrix(MoreNodes, buildingPositions, buildingSizes, 0.25*r_n(size(MoreNodes,1)), r_n(size(MoreNodes,1)),  nodeadded, mapSize, adjacencyMatrix, gridResolution, processPairMatrix, prismCenter, prismSize, startPoint, endPoint);
            batchadded = batchadded +1;
             % If batchadded exceeds 10, restart the current MT iteration
            if batchadded > 15
                disp('Exceeded maximum number of node batches. Restarting Monte Carlo Trial...');
                restartTrial = true;
                break;
            end
           
           additionalNodes = [startNodes;endNodes;newNodes];
           disp('More Nodes Have Been Distributed In The Environment.');
          % Recalculate adjacency and heuristic matrices
            disp('Calculating New Adjacency and Heuristic Matrices');
            adjacencyMatrix = calculateLocalAdjacencyMatrixNewNodes(n, buildingPositions, buildingSizes, mapSize, u, v, w, false, gridResolution, adjacencyMatrix, MoreNodes, newNodes);
            disp('New Adjacency Matrix has been made');
            heuristicMatrix = calculateHeuristicMatrixAddedNodes(nodes, endPoint, u, v, w, true, gridResolution, heuristicMatrix, additionalNodes);
            disp('New Heuristic Matrix has been made');
            nodes= [MoreNodes; newNodes];
            % Attempt A* pathfinding again
            disp('Calculating A* Path With New Nodes');
            [path, totalCost, totalDistance, totalTime, totalRE, nodeId] = AStarPathTD(nodes, adjacencyMatrix, heuristicMatrix,  startPoint, endPoint, Kd, Kt, Ke, cost_idx,buildingPositions,buildingSizes, r_n(size(nodes,1)), false); 
            if isempty(path) == true
                 processPairMatrix = processPairGeneration(adjacencyMatrix, nodes, buildingSizes, mapSize);
            end
        end     
        if restartTrial == true
            break;
        end
        parameter_comparison_matrix(1,cost_idx,MT) = totalDistance; 
        parameter_comparison_matrix(2,cost_idx,MT) = totalTime; 
        parameter_comparison_matrix(3,cost_idx,MT) = totalRE;
        parameter_comparison_matrix(4,cost_idx,MT) = totalCost;
        Original_paths{cost_idx,MT} = path;    
        nodes_stored{MT,1} = nodes;
        u_nodes = zeros(size(nodes, 1), 1);
        v_nodes = zeros(size(nodes, 1), 1);
        w_nodes = zeros(size(nodes, 1), 1);
        % Loop through each node
        for i = 1:size(nodes, 1)
            % Round node coordinates
            rounded_node = round(nodes(i,:)/gridResolution);
            % Check for boundary conditions
            if rounded_node(1) <= 0
                rounded_node(1) = 1;
            end
            if rounded_node(2) <= 0
                rounded_node(2) = 1;
            end
            if rounded_node(3) <= 0
                rounded_node(3) = 1;
            end
            % Assign the wind values at the nearest grid point to the current node
            u_nodes(i) = u(rounded_node(1), rounded_node(2), rounded_node(3));
            v_nodes(i) = v(rounded_node(1), rounded_node(2), rounded_node(3));
            w_nodes(i) = w(rounded_node(1), rounded_node(2), rounded_node(3));
        end
     u_saved{MT} = u_nodes;
     v_saved{MT} = v_nodes;
     w_saved{MT} = w_nodes;
        %% Path Refinement and smoothing
        refine = 1; % How many times we go through the path refinement process
        smoothnodes = nodes;
        smoothAdjMatrix = adjacencyMatrix;
        smoothHeurMatrix = heuristicMatrix;
        for a = 1:refine 
            % Define parameters
            segmentLength = 10; % Number of points in each segment
            addednodes = [];
            % Assuming segmentLength is defined somewhere in your code
            num_points = size(path, 1);
            t = 1:num_points;
            c = 3; % Determines cubic spline smoothness
            tqq = linspace(1, num_points, c * size(path, 1)); % Adjust the number of points for finer spline resolution if needed
            splinedPath = zeros(size(tqq, 2), 3);
            for dim = 1:3 % Iterate over x, y, z dimensions
                x = spline(t, path(:, dim), tqq);
                splinedPath(:, dim) = x;
            end
            splinedPathNodes = splinedPath;
            disp('Path Smoothed By Utilizing A Cubic Spline');
            for k = 1:size(splinedPath, 1)-1
                step_size = k; % Define step size based on k
                numSegments = floor((size(splinedPath, 1) - 1) / step_size); % Calculate number of segments based on step size
                % Iterate over segments
                for i = 1:numSegments
                    % Get segment endpoints
                    segmentStart = splinedPathNodes((i-1) * step_size + 1, :);
                    segmentEnd = splinedPathNodes(i * step_size + 1, :);
                    % Divide segment into smaller sections using linear interpolation
                    numPoints = segmentLength - 1; % Number of points excluding start and end
                    sectionPoints = zeros(numPoints, size(segmentStart, 2)); % Initialize array to store section points
                    % Linear interpolation for each dimension separately
                    for j = 1:size(segmentStart, 2) % Iterate over dimensions
                        sectionPoints(:, j) = linspace(segmentStart(j), segmentEnd(j), numPoints)';
                    end
                    sectionPoints = sectionPoints(2:end-1, :); % Exclude start and end points
                    % Check each point in sectionPoints for proximity to buildings
                    validPoints = [];
                    for point = sectionPoints'
                        % Check if point is too close to any building
                        tooClose = false;
                        for e = 1:numBuildings
                            buildingPos = buildingPositions(e, :);
                            if norm(point(1:2) - buildingPos(1:2)) < obstacleRadius && abs(point(3) - buildingPos(3)) < buildingSizes(e, 3)/2
                                tooClose = true;
                                break;
                            end
                        end
                        if ~tooClose
                            validPoints = [validPoints; point'];
                        end
                    end
                    % Add valid points to addednodes
                    addednodes = [addednodes; validPoints];
                end
            end
            preexistingNodes = smoothnodes;
            nearbyNodes = [];
            r = r_n(size(addednodes, 1) + size(nodes, 1));
            nearbyIndices = []; % Initialize array to store indices of nearby nodes
             for i = 1:size(preexistingNodes, 1)
                for j = 1:size(addednodes, 1)
                    if norm(preexistingNodes(i, :) - addednodes(j, :)) <= r
                        nearbyNodes = [nearbyNodes; preexistingNodes(i, :)];
                        nearbyIndices = [nearbyIndices; i]; % Store the index of the nearby node
                        break;
                    end
                end
             end
            % Remove duplicate indices
            nearbyIndices = unique(nearbyIndices);
            % Extract the reduced adjacency and heuristic matrices
            % Adjacency matrix has 3 dimensions: (node, node, coordinate)
            reducedAdjacencyMatrix = smoothAdjMatrix(nearbyIndices, nearbyIndices, :);
            % Heuristic matrix has 2 dimensions: (node, coordinate)
            reducedHeuristicMatrix = smoothHeurMatrix(nearbyIndices, :);
            % Combine valid points and nearby preexisting nodes
            localNodes = unique([addednodes; nearbyNodes], 'rows');     
            disp('Calculating Local Adjacency and Heuristic Matricies for Smoothed Path...');
            localAdjacencyMatrix = calculateLocalAdjacencyMatrixNewNodes(n, buildingPositions, buildingSizes, mapSize, u, v, w, false, gridResolution, reducedAdjacencyMatrix, nearbyNodes, addednodes);
            smoothAdjMatrix = localAdjacencyMatrix;
            disp('Local Adjacency Matrix Calculated...');
            localheuristicMatrix = calculateHeuristicMatrixAddedNodes(nearbyNodes, endPoint, u, v, w, true, gridResolution, reducedHeuristicMatrix, addednodes);
            smoothHeurMatrix = localheuristicMatrix;
            disp('Local Heuristic Matrix Calculated...');
            [RefinedPath,SmoothtotalCost, SmoothtotalDistance, SmoothtotalTime, SmoothtotalRE, ~] = AStarPathTD(localNodes, localAdjacencyMatrix, localheuristicMatrix,  path(1,:), path(end,:), Kd,Kt,Ke,cost_idx,buildingPositions,buildingSizes, r, true);
            disp(['Original Path Combination ' num2str(cost_idx) ' Refined ' num2str(cost_idx) ' Times']);
            path = RefinedPath;
            smoothnodes = [nodes; addednodes];
        end
Elaboration 1: The A* function is called for a set of nodes, adjacency matrix, and heuristic matrix, and the idea with using the parfor loop initially was to spread out the cost calculations for the forward and backward search --> this was replaced with vectorization which has sped up the function. I will post an screen shot of the A* function as I am aware that calculating the adjacency matrix and heuristic matrix are the other two functions in my code that take a significant amount of time.
parforspreads out the work across clusters? Or are you running this function many times, for different inputs, on the cluster? What iscalculateCost? If we don’t know what is happening inside theparforloop, how are we supposed to understand what the computational cost of the loop is? \$\endgroup\$