I recently ran into this issue while upgrading VCF from 4.5 to 4.5.1, so I thought I would share it with a wider audience.
I started with the first component in the VCF stack, SDDC Manager. As you may already know, the upgrade bundles must be downloaded as a prerequisite before starting the upgrade.
I took a snapshot of the SDDC Manager VM from vCenter and started the upgrade. After about 5 minutes, the upgrade failed while applying the configuration drift bundle, with a strange error: it had failed to enable SSH on the edge nodes in the VI workload domain.
When I checked the edge nodes, the SSH service was already enabled and running. I restarted it anyway by disabling and re-enabling the SSH service on all 4 edge nodes.
I then restarted the upgrade by clicking the retry option, with no luck: the upgrade failed again with the same error.
Next, I decided to look into the logs (/var/log/vmware/vcf/lcm/thirdparty/upgrades/xxx-xxx-xxx/sddcmanager-migration-apps/logs/sddcmanager_migration_app_upgrade_2023-06-09_2023-06-09_02-29-03.log) to understand what was causing the failure.
(Here xxx-xxx-xxx denotes the upgrade ID. A folder named after the upgrade ID is created, and the logs are inside that folder.)
The logs showed the same error message: "FAILED_TO_ADD_EDGE_NODE_SSH_KEY_TO_KNOWN_HOSTS".
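If you would rather not scroll through the log by hand, a small script can pull out the lines that carry the error code. This is only a minimal sketch: the function names find_error_lines and scan_log are my own, and the only thing taken from the actual upgrade is the error string itself.

```python
from typing import Iterable, List, Tuple

# Error code seen in the SDDC Manager migration log during the failed upgrade.
ERROR_CODE = "FAILED_TO_ADD_EDGE_NODE_SSH_KEY_TO_KNOWN_HOSTS"


def find_error_lines(lines: Iterable[str], needle: str = ERROR_CODE) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains `needle`."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


def scan_log(path: str, needle: str = ERROR_CODE) -> List[Tuple[int, str]]:
    """Open a log file and return the matching lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_error_lines(handle, needle)
```

To use it, call scan_log with the full path of the log file under your own upgrade ID folder and print the returned tuples. The lines just before each match are often where the affected edge node is named, so it is worth reading the context around each hit.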
Later, while troubleshooting further, I found that DNS resolution was failing for two of the four edge nodes, so I added DNS entries for both forward and reverse lookup.
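The forward and reverse lookups can be checked for all edge nodes in one go before retrying the upgrade. Below is a minimal sketch using Python's standard socket module. The function name check_dns and its result layout are my own choices, not anything from VCF, and the resolver functions are passed in as parameters only so the check can be exercised without a live DNS server.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def check_dns(
    fqdns: Iterable[str],
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], tuple] = socket.gethostbyaddr,
) -> Dict[str, Dict[str, Optional[str]]]:
    """Check forward (A) and reverse (PTR) resolution for each FQDN."""
    results: Dict[str, Dict[str, Optional[str]]] = {}
    for fqdn in fqdns:
        entry: Dict[str, Optional[str]] = {"ip": None, "ptr": None, "status": "ok"}
        results[fqdn] = entry
        try:
            # Forward lookup: name -> IPv4 address.
            entry["ip"] = forward(fqdn)
        except OSError:  # socket.gaierror is a subclass of OSError
            entry["status"] = "forward lookup failed"
            continue
        try:
            # Reverse lookup: IP -> primary host name.
            entry["ptr"] = reverse(entry["ip"])[0]
        except OSError:  # socket.herror is a subclass of OSError
            entry["status"] = "reverse lookup failed"
            continue
        # Compare case-insensitively and ignore a trailing dot.
        if entry["ptr"].rstrip(".").lower() != fqdn.rstrip(".").lower():
            entry["status"] = "reverse lookup mismatch"
    return results
```

Call check_dns with the list of your edge node FQDNs from a machine that uses the same DNS servers as SDDC Manager, and look for any entry whose status is not "ok". Keep in mind that socket lookups also consult the local hosts file, so nslookup against the DNS server remains the more direct test.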
I ran the upgrade again, and this time it succeeded.
The error message did not point toward the resolution we found. We were able to identify the cause through basic troubleshooting (such as ping and nslookup).