{"id":7398,"date":"2023-09-29T11:46:12","date_gmt":"2023-09-29T10:46:12","guid":{"rendered":"https:\/\/rakhesh.com\/?p=7398"},"modified":"2023-09-29T11:48:54","modified_gmt":"2023-09-29T10:48:54","slug":"more-powershell-7-2-and-azure-automation-troubles","status":"publish","type":"post","link":"https:\/\/rakhesh.com\/azure\/more-powershell-7-2-and-azure-automation-troubles\/","title":{"rendered":"More PowerShell 7.2 and Azure Automation troubles…"},"content":{"rendered":"

Last week I blogged about ExchangeOnlineManagement and Az module troubles with PowerShell 7.2. This week I ran into another issue as I moved more Runbooks over to PowerShell 7.2.

Some of them started failing for no apparent reason. It happened when I’d do a Connect-PnPOnline to connect to a SharePoint site, and the error was: Host not reachable.

Such a weird one, because if I try to connect to the site from the Hybrid Runbook Worker (HRW) this Runbook runs on, I can connect just fine. Moreover, most of my Runbooks work fine, even though they all connect to the same site and run from the same HRW. Only a few failed. This stumped me for a bit.

Then I realized the ones that failed were using this throttling function I had created. It basically checks if there’s another instance of the Runbook already running, and if so quits or waits. Hmm, why was that causing things to fail?

Yes, the throttling function connects to Azure and does some stuff, but I was connecting to Azure in all the other Runbooks anyway (to read Key Vaults and such) and that had no issues. Digging more, I realized the issue was with the Az.Resources module. The cmdlets used by that function make use of this module, and it looks like that conflicts with PnP.PowerShell. Eugh.

Looks like this is fixed in the upcoming 2.3.0 release of PnP.PowerShell (still at 2.2.0 as of writing), but that doesn’t help me currently. I can’t update my production Runbooks to use nightly versions of the module just to fix this issue. I could, of course, remove the throttling function, which is what I did in the interim, but I wasn’t happy with that. I can’t have these Runbooks running concurrently.

An update on the throttling function<\/h3>\n

When we last met my throttling function, it looked like this:

# This is a Function I created (from various Google results) to throttle a Runbook.\r\n# It will either wait or quit the runbook. \r\nfunction Throttle-AzRunbook {\r\n    param(\r\n        [switch]$quitRatherThanWait,\r\n        [int]$numberOfInstances = 1\r\n    )\r\n\r\n    # Connect to Azure. With a Managed Identity in this case as that's what I use. \r\n    # It's likely I am already connected but I can't assume that within this function. \r\n    # Must connect to Azure before running Get-AzAutomationJob or Get-AzResource\r\n    try {\r\n        # From https:\/\/docs.microsoft.com\/en-us\/azure\/automation\/enable-managed-identity-for-automation#authenticate-access-with-system-assigned-managed-identity\r\n        # Ensures you do not inherit an AzContext in your runbook\r\n        Disable-AzContextAutosave -Scope Process | Out-Null\r\n\r\n        # Connect to Azure with system-assigned managed identity\r\n        Connect-AzAccount -Identity | Out-Null\r\n\r\n    } catch {\r\n        Write-Error \"Runbook could not connect to Azure: $($_.Exception.Message)\"\r\n        exit\r\n    }\r\n\r\n    # Get the Job ID from PSPrivateMetadata. 
That's the only thing it contains!\r\n    $automationJobId = $PSPrivateMetadata.JobId.Guid\r\n\r\n    # Get all Runbooks in the current subscription\r\n    $allAutomationAccounts = Get-AzResource -ResourceType Microsoft.Automation\/automationAccounts\r\n\r\n    $automationAccountName = $null\r\n    $resourceGroupName = $null\r\n    $runbookName = $null\r\n\r\n    foreach ($automationAccount in $allAutomationAccounts) {\r\n        $runbookJob = Get-AzAutomationJob -AutomationAccountName $automationAccount.Name `\r\n                                    -ResourceGroupName $automationAccount.ResourceGroupName `\r\n                                    -Id $automationJobId `\r\n                                    -ErrorAction SilentlyContinue\r\n\r\n        if (!([string]::IsNullOrEmpty($runbookJob))) {\r\n            $automationAccountName = $runbookJob.AutomationAccountName\r\n            $resourceGroupName = $runbookJob.ResourceGroupName\r\n            $runbookName = $runbookJob.RunbookName\r\n        }\r\n    }\r\n\r\n    # At this point I'll have the Automation Account Name, Runbook Name, Job ID and Resource Group Name, \r\n    # Find all other active jobs of this Runbook.\r\n\r\n    $allActiveJobs = Get-AzAutomationJob -AutomationAccountName $automationAccountName `\r\n                                    -ResourceGroupName $resourceGroupName `\r\n                                    -RunbookName $runbookName | \r\n                    Where-Object { ($_.Status -eq \"Running\") -or ($_.Status -eq \"Starting\") -or ($_.Status -eq \"Queued\")}  \r\n\r\n    if ($quitRatherThanWait.IsPresent -and $allActiveJobs.Count -gt $numberOfInstances) {\r\n        Write-Output \"Exiting as another job is already running\"\r\n        exit\r\n\r\n    } else {\r\n        $oldestJob = $AllActiveJobs | Sort-Object -Property CreationTime  | Select-Object -First 1\r\n\r\n        # If this job is not the oldest created job we will wait until the existing jobs complete or the number of jobs is 
less than numberOfInstances\r\n        while (($AutomationJobID -ne $oldestJob.JobId) -and ($allActiveJobs.Count -ge $numberOfInstances)) {\r\n            Write-Output \"Waiting as there are currently running $($allActiveJobs.Count) active jobs for this runbook already. Sleeping 30 seconds...\"\r\n            Write-Output \"Oldest Job is $($oldestJob.JobId)\"\r\n        \r\n            Start-Sleep -Seconds 30\r\n        \r\n            $allActiveJobs = Get-AzAutomationJob -AutomationAccountName $automationAccountName `\r\n                                        -ResourceGroupName $resourceGroupName `\r\n                                        -RunbookName $runbookName | \r\n                        Where-Object { ($_.Status -eq \"Running\") -or ($_.Status -eq \"Starting\") -or ($_.Status -eq \"Queued\")}  \r\n        \r\n            $oldestJob = $allActiveJobs | Sort-Object -Property CreationTime | Select-Object -First 1\r\n        } \r\n        \r\n        Write-Output \"Job can continue...\"\r\n    }\r\n}<\/pre>\n

Turns out this doesn’t work with PowerShell 7.2 on HRWs, as the PSPrivateMetadata variable is not present in that combination. (It is present in 5.x on HRWs and even in 7.2 running in Azure, so I guess it’s one of those things that will appear in the future.)
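
To see which situation a given worker is in, the variable can be looked up by name before anything depends on it. This is only a minimal sketch: the function name Get-JobIdFromMetadata is made up for illustration, and it assumes that PSPrivateMetadata, where it exists, exposes a JobId value the way the function above uses it.

```powershell
# Hypothetical helper: returns the job ID as a string, or $null when
# $PSPrivateMetadata (or its JobId) is not available in this session.
function Get-JobIdFromMetadata {
    # Look the variable up by name so a missing variable yields $null
    # instead of relying on how an undefined variable is evaluated.
    $metadata = Get-Variable -Name PSPrivateMetadata -ValueOnly -ErrorAction SilentlyContinue

    if ($null -eq $metadata -or $null -eq $metadata.JobId) {
        return $null
    }

    return $metadata.JobId.ToString()
}

$jobIdFromMetadata = Get-JobIdFromMetadata
if ($null -eq $jobIdFromMetadata) {
    Write-Output "PSPrivateMetadata is not available here"
}
```

If it returns nothing, that is the cue to fall back to some other source for the JobId.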

This means I can’t extract the JobId and use it to search for other jobs. What can I do here? After some tinkering I realized I can cheat and extract the JobId from one of the trace log files. You see, every HRW Runbook writes to this path:

[Screenshot: the trace log path on the HRW, with the per-runbook portion highlighted]

The highlighted bit varies per runbook.

The file there looks like this:

Orchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:46:51.8046468Z]  Starting sandbox process. [sandboxId=1a16c23f-90f5-46de-a7a8-213eed634246]\r\nOrchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:46:52.0546465Z]  Hybrid Sandbox\r\nOrchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:46:52.6485351Z]  First Trace Log.\r\nOrchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:46:52.8984865Z]  Sandbox Recieving Job. [sandboxId=1a16c23f-90f5-46de-a7a8-213eed634246][jobId=b20c76be-b132-4679-84e2-17b244734f65]\r\nOrchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:48:23.0674290Z]  Sandbox close request. The sandbox will exit immediately. [sandboxId=1a16c23f-90f5-46de-a7a8-213eed634246]\r\nOrchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:48:23.0674290Z]  Leaving sandbox process. [sandboxId=1a16c23f-90f5-46de-a7a8-213eed634246]\r\n<\/pre>\n

Neat, so line 4 has the JobId.

How do I find the path to this file? Turns out $PSScriptRoot has it. Split its path to get the parent, tack on "\diags\trace.log" and that’s my file. If the Id isn’t found the usual way, I can do something like this to get it:

if (!$automationJobId) {\r\n    Write-Output \"Unable to find JobID from PSPrivateMetadata\"\r\n    if ($PWD -match \"HybridWorker\") {\r\n        Write-Output \"Trying a workaround to find JobID as this is an HRW\"\r\n\r\n        $parentPath =  Split-Path -Parent $PSScriptRoot\r\n        $fullPath =  $parentPath + \"\\diags\\trace.log\"\r\n\r\n        try {\r\n            $automationJobId = ((Get-Content $fullPath -ErrorAction Stop | Select-String \"jobId\") -split 'jobId=')[1] -replace ']',''\r\n\r\n        } catch {\r\n            $automationJobId = $null\r\n        }\r\n    }\r\n}<\/pre>\n
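
The split-and-replace one-liner above relies on exactly one line in the log mentioning jobId. A slightly more defensive variant, sketched below, captures only the GUID with a regular expression and takes the first match. The function name Get-JobIdFromTraceLog is hypothetical, and the sample file is written to a temp folder purely so the snippet can run anywhere. It is not the real HRW path.

```powershell
# Hypothetical helper: returns the first job ID found in a trace log, or $null.
function Get-JobIdFromTraceLog {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Match "[jobId=<guid>]" and capture only the GUID part.
    $pattern = '\[jobId=(?<id>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\]'

    # A missing or unreadable file simply produces no match.
    $match = Select-String -LiteralPath $Path -Pattern $pattern -ErrorAction SilentlyContinue |
        Select-Object -First 1

    if ($match) {
        return $match.Matches[0].Groups['id'].Value
    }
    return $null
}

# Demo against a throwaway copy of two lines from the log shown earlier.
$sampleLog = Join-Path ([System.IO.Path]::GetTempPath()) "trace-sample-$([guid]::NewGuid()).log"
@(
    'Orchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:46:52.6485351Z]  First Trace Log.'
    'Orchestrator.Sandbox.Diagnostics Critical: 0 : [2023-09-26T09:46:52.8984865Z]  Sandbox Recieving Job. [sandboxId=1a16c23f-90f5-46de-a7a8-213eed634246][jobId=b20c76be-b132-4679-84e2-17b244734f65]'
) | Set-Content -LiteralPath $sampleLog

$jobIdFromLog = Get-JobIdFromTraceLog -Path $sampleLog
Remove-Item -LiteralPath $sampleLog
```

On an HRW the -Path argument would be the trace.log path built from $PSScriptRoot as described above.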

With this in hand my throttling function now looks like this:

function Throttle-AzRunbook {\r\n    param(\r\n        [switch]$quitRatherThanWait,\r\n        [int]$numberOfInstances = 1\r\n    )\r\n\r\n    # Connect to Azure. With a Managed Identity in this case as that's what I use. \r\n    # It's likely I am already connected but I can't assume that within this function. \r\n    # Must connect to Azure before running Get-AzAutomationJob or Get-AzResource\r\n    try {\r\n        # From https:\/\/docs.microsoft.com\/en-us\/azure\/automation\/enable-managed-identity-for-automation#authenticate-access-with-system-assigned-managed-identity\r\n        # Ensures you do not inherit an AzContext in your runbook\r\n        Disable-AzContextAutosave -Scope Process | Out-Null\r\n\r\n        # Connect to Azure with system-assigned managed identity\r\n        Connect-AzAccount -Identity | Out-Null\r\n\r\n    } catch {\r\n        Write-Error \"Runbook could not connect to Azure: $($_.Exception.Message)\"\r\n        exit\r\n    }\r\n\r\n    # Get the Job ID from PSPrivateMetadata. 
That's the only thing it contains!\r\n    $automationJobId = $PSPrivateMetadata.JobId.Guid\r\n\r\n    # A workaround for PowerShell 7.x and HRW where $PSPrivateMetadata is missing\r\n    # I extract it from the trace.log file instead\r\n    if (!$automationJobId) {\r\n        Write-Output \"Unable to find JobID from PSPrivateMetadata\"\r\n        if ($PWD -match \"HybridWorker\") {\r\n            Write-Output \"Trying a workaround to find JobID as this is an HRW\"\r\n\r\n            $parentPath =  Split-Path -Parent $PSScriptRoot\r\n            $fullPath =  $parentPath + \"\\diags\\trace.log\"\r\n    \r\n            try {\r\n                $automationJobId = ((Get-Content $fullPath -ErrorAction Stop | Select-String \"jobId\") -split 'jobId=')[1] -replace ']',''\r\n\r\n            } catch {\r\n                $automationJobId = $null\r\n            }\r\n        }\r\n    }\r\n\r\n    if ($automationJobId) {\r\n        Write-Output \"JobID is $automationJobId\"\r\n        # Get all Runbooks in the current subscription\r\n        $allAutomationAccounts = Get-AzResource -ResourceType Microsoft.Automation\/automationAccounts\r\n\r\n        $automationAccountName = $null\r\n        $resourceGroupName = $null\r\n        $runbookName = $null\r\n\r\n        foreach ($automationAccount in $allAutomationAccounts) {\r\n            $runbookJobParams = @{\r\n                \"AutomationAccountName\" = $automationAccount.Name\r\n                \"ResourceGroupName\" = $automationAccount.ResourceGroupName\r\n                \"Id\" = $automationJobId\r\n                \"ErrorAction\" = \"SilentlyContinue\"\r\n            }\r\n\r\n            $runbookJob = Get-AzAutomationJob @runbookJobParams\r\n\r\n            if (!([string]::IsNullOrEmpty($runbookJob))) {\r\n                $automationAccountName = $runbookJob.AutomationAccountName\r\n                $resourceGroupName = $runbookJob.ResourceGroupName\r\n                $runbookName = $runbookJob.RunbookName\r\n            }\r\n  
      }\r\n\r\n        # At this point I'll have the Automation Account Name, Runbook Name, Job ID and Resource Group Name, \r\n        # Find all other active jobs of this Runbook.\r\n        $runbookJobParams = @{\r\n            \"AutomationAccountName\" = $automationAccountName\r\n            \"ResourceGroupName\" = $resourceGroupName\r\n            \"RunbookName\" = $runbookName\r\n            \"ErrorAction\" = \"SilentlyContinue\"\r\n        }\r\n\r\n        $allActiveJobs = Get-AzAutomationJob @runbookJobParams | Where-Object { ($_.Status -eq \"Running\") -or ($_.Status -eq \"Starting\") -or ($_.Status -eq \"Queued\") -or ($_.Status -eq \"Activating\") -or ($_.Status -eq \"Resuming\") }\r\n\r\n        if ($allActiveJobs.Count -gt $numberOfInstances) {\r\n            if ($quitRatherThanWait.IsPresent) {\r\n                Write-Output \"Exiting as another job is already running\"\r\n                exit\r\n    \r\n            } else {\r\n                $oldestJob = $AllActiveJobs | Sort-Object -Property CreationTime | Select-Object -First 1\r\n    \r\n                # If this job is not the oldest created job we will wait until the existing jobs complete or the number of jobs is less than numberOfInstances\r\n                while (($AutomationJobID -ne $oldestJob.JobId) -and ($allActiveJobs.Count -ge $numberOfInstances)) {\r\n                    Write-Output \"Waiting as there are currently running $($allActiveJobs.Count) active jobs for this runbook already. 
Sleeping 30 seconds...\"\r\n                    Write-Output \"Oldest Job is $($oldestJob.JobId)\"\r\n                \r\n                    Start-Sleep -Seconds 30\r\n                \r\n                    $allActiveJobs = Get-AzAutomationJob @runbookJobParams | Where-Object { ($_.Status -eq \"Running\") -or ($_.Status -eq \"Starting\") -or ($_.Status -eq \"Queued\") -or ($_.Status -eq \"Activating\") -or ($_.Status -eq \"Resuming\") }\r\n                    $oldestJob = $allActiveJobs | Sort-Object -Property CreationTime | Select-Object -First 1\r\n                } \r\n                \r\n                Write-Output \"Job can continue...\"\r\n            }\r\n\r\n        } else {\r\n            Write-Output \"No other concurrent jobs found...\"\r\n        }\r\n        \r\n    } else {\r\n        Write-Warning \"Unable to find JobID. Proceeding with job, this might result in concurrent executions\"\r\n        if ($PSVersionTable.PSVersion.Major -eq 7 -and $PWD -match \"HybridWorker\") {\r\n            Write-Output \"This is PowerShell 7.x in HRW - that explains it!\"\r\n        }\r\n    }\r\n}<\/pre>\n

Getting PnP PowerShell working with this<\/h3>\n

OK, so what can I do to fix PnP PowerShell? Can’t I just unload the Az.Resources module after it’s done? Yes, I can (Remove-Module), but that doesn’t unload any of the loaded assemblies, and since those are usually the source of the conflict, Remove-Module can’t help us.
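
One way to see this for yourself is to list the assemblies loaded into the current session before and after removing a module. The sketch below is a hypothetical helper (Get-LoadedAssemblyName is not a built-in cmdlet), and the '*Azure*' wildcard in the commented usage is an assumption about how the Az assemblies are named.

```powershell
# Hypothetical helper: lists the names of assemblies currently loaded in this
# PowerShell process, optionally filtered by a wildcard pattern.
function Get-LoadedAssemblyName {
    param(
        [string]$NamePattern = '*'
    )

    [System.AppDomain]::CurrentDomain.GetAssemblies() |
        ForEach-Object { $_.GetName().Name } |
        Where-Object { $_ -like $NamePattern } |
        Sort-Object -Unique
}

# The PowerShell engine assembly is always loaded, so this returns at least one name.
$engineAssemblies = @(Get-LoadedAssemblyName -NamePattern 'System.Management.Automation*')

# Usage idea (needs the Az.Resources module installed, so it is left commented out):
#   Import-Module Az.Resources
#   $before = Get-LoadedAssemblyName -NamePattern '*Azure*'
#   Remove-Module Az.Resources
#   $after  = Get-LoadedAssemblyName -NamePattern '*Azure*'
#   Compare-Object $before $after    # compare the two lists
```

Comparing the two lists shows whether anything was actually unloaded along with the module.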

What can I do regarding assemblies? In my previous post I alluded to this very informative article from Microsoft. It suggests three ways to work around this issue: