GCP VM SSH problem and solution

source link: https://allsyed.com/posts/gcp-vm-ssh-problem-and-solution/
📅 Dec 17, 2020  ·  ☕ 4 min read  ·  ✍️ Syed Dawood

The Start

It was just another day at work, and I was working on just another routine task. I needed to log in to a VM (let’s call it $INSTANCE throughout this post) and update a few configs. I logged into the Google Cloud console, selected the project from the project selector, navigated to Compute Engine, and clicked SSH. Normally this opens a pop-up window and drops you into the familiar bash shell. Not today. Instead, it kept on loading.

The Denial

I was confused; this had never happened before. I double-checked my internet connection, tried a different browser, and switched to an alternate connection. Every attempt ended the same way: the pop-up window kept loading.

Attempt #1 : gcloud command

gcloud beta compute ssh $INSTANCE --zone $ZONE --project $PROJECT

Attempt #2 : gcloud command with username

gcloud compute ssh  $USR@$INSTANCE --zone $ZONE --project $PROJECT

Attempt #3 : gcloud command with verbose flag

gcloud compute ssh --zone $ZONE $INSTANCE --project $PROJECT --ssh-flag="-vvvvv"

Attempt #4 : gcloud command with the Compute Engine default key and my newly generated SSH key pair

gcloud compute ssh --zone $ZONE $INSTANCE --project $PROJECT --ssh-key-file=$HOME/.ssh/google_compute_engine --ssh-flag="-vvv" # compute engine default
gcloud compute ssh --zone $ZONE $INSTANCE --project $PROJECT --ssh-key-file=$HOME/.ssh/new-ssh-key --ssh-flag="-vvv"

Attempt #5 : Reconfiguring gcloud ssh

rm $HOME/.ssh/google_compute_engine $HOME/.ssh/google_compute_engine.pub # removing default key pair
gcloud compute config-ssh

After this step, I went through all of the steps above once again. All yielded the same result.

Attempt #6 : ssh command with default and new keys

ssh -i $HOME/.ssh/new-ssh-key $USR@$INSTANCE_IP
ssh -i $HOME/.ssh/google_compute_engine $USR@$INSTANCE_IP

That same result was:

ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255]
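For context, ssh exits with the status of the remote command, or with 255 when ssh itself hit an error. So the message above only says that the connection failed, not why. Here is a minimal triage sketch; `classify_ssh_exit` and `port_open` are hypothetical helper names, not part of ssh or gcloud, and `port_open` assumes bash and the coreutils `timeout` command are available:

```shell
#!/usr/bin/env bash

# Map an ssh exit status to a rough category.
# Per ssh(1): 255 means ssh itself failed; any other value is the
# exit status of the remote command.
classify_ssh_exit() {
  case "$1" in
    0)   echo "ok" ;;
    255) echo "ssh-error" ;;
    *)   echo "remote-command-exit" ;;
  esac
}

# Check whether a TCP port accepts connections, using bash's
# /dev/tcp redirection with a 5 second limit.
# Usage: port_open HOST [PORT]   (PORT defaults to 22)
port_open() {
  local host="$1" port="${2:-22}"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

Running `port_open "$INSTANCE_IP" 22 && echo "port 22 reachable"` would show whether the port answers at all. If it does and ssh still exits with 255, the cause is more likely inside the VM than in the network or firewall.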

The Hint

I discussed the problem with my project manager, who asked me to get help from one of our cloud team members. During our conversation, he suggested that enabling the serial port would help with debugging the problem. He also mentioned something called startup-script, which does what it says: it runs a script on VM startup. With these new-found hints, I started to dig deeper.

Analysing the serial port log

gcloud compute connect-to-serial-port $INSTANCE --zone=$ZONE --project=$PROJECT

This step revealed that the VM had run out of storage.
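The post doesn't show the log lines themselves. Assuming the usual Linux "No space left on device" (ENOSPC) message was what appeared, a small filter like the one below could pull those lines out of a long serial log. The function name `disk_full_lines` is hypothetical. `gcloud compute instances get-serial-port-output` prints the log without opening an interactive session.

```shell
#!/usr/bin/env bash

# Read a serial console log on stdin and print only the lines that
# point at a full disk. "No space left on device" is the standard
# ENOSPC error text on Linux.
disk_full_lines() {
  grep -i "no space left on device"
}

# Usage (requires gcloud and access to the project):
#   gcloud compute instances get-serial-port-output "$INSTANCE" \
#     --zone "$ZONE" --project "$PROJECT" | disk_full_lines
```

The function exits with a non-zero status when nothing matches, so it can also be used as a condition in a script.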

Solution #1 : startup-script

I added a startup-script metadata entry with the content below. I also tried the script with sudo, to make sure I left no stone unturned. After three or four rounds of trial and error and extensive log analysis, I concluded that the startup-script was not being triggered either.

#!/usr/bin/env bash
find /home/user/ -name "*.log" -delete
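The post doesn't say how the script was attached. One way, assuming the script is saved locally (the filename `cleanup.sh` used here is hypothetical), is `gcloud compute instances add-metadata` with `--metadata-from-file`. The sketch below only prints the command unless `DRY_RUN=0` is set; `run` and `set_startup_script` are illustrative helper names.

```shell
#!/usr/bin/env bash

# Print a command, and run it only when DRY_RUN=0 (default: print only).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Attach a local script file to an instance as its startup-script metadata.
# Usage: set_startup_script INSTANCE ZONE PROJECT SCRIPT_FILE
set_startup_script() {
  run gcloud compute instances add-metadata "$1" \
    --zone "$2" --project "$3" \
    --metadata-from-file "startup-script=$4"
}
```

For example, `DRY_RUN=0 set_startup_script "$INSTANCE" "$ZONE" "$PROJECT" cleanup.sh` would apply it. The script runs on the next boot of the VM.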

Solution #2 : shutdown-script

A shutdown-script is a script that is executed before the machine is switched off. Its content was the same as the startup-script. Neither script was triggering, since there was not enough storage on the VM.

Solution #3 : Resizing disk

If the VM ran out of storage, simply adding more storage to its boot disk should fix the problem. So I decided to resize the boot disk after switching off the VM. I must say the resize command completed almost instantly.

gcloud compute disks resize $INSTANCE --zone $ZONE --size <int> --project $PROJECT

I started the VM, thinking the issue was resolved, but I was wrong. It greeted me with the same error message when I tried connecting to it.
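One possible explanation, which the post doesn't confirm: growing the disk does not by itself grow the partition and filesystem inside it. Many images do that automatically at boot, but that step may fail on a disk that is already full. On a machine where you do have a shell, a check like this shows whether the filesystem actually sees the new space. Both function names are hypothetical.

```shell
#!/usr/bin/env bash

# Print how full a filesystem is, as an integer percentage.
# Usage: usage_percent [MOUNTPOINT]   (defaults to /)
usage_percent() {
  df -P "${1:-/}" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Print the filesystem size in 1024-byte blocks, to compare
# against the size requested for the disk.
# Usage: fs_size_kb [MOUNTPOINT]   (defaults to /)
fs_size_kb() {
  df -Pk "${1:-/}" | awk 'NR==2 { print $2 }'
}
```

If `fs_size_kb /` still reports the old size after a resize and reboot, the filesystem was not grown and the extra disk space is not usable yet.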

Solution #4 : Final Solution

While I was skimming through the documentation, I read that you can detach and re-attach boot disks. That gave me an idea: I remembered that there was a snapshot of this VM, taken when things were green. Here are my steps to the solution.

  • Switch off the VM
  • Create a disk from the snapshot
  • Detach the current boot disk
  • Attach the disk created in the second step as the boot disk
  • Switch the VM back on and hope it works

gcloud compute disks create $NEW_DISK --source-snapshot $SNAPSHOT --project=$PROJECT --size <int> --zone $ZONE
gcloud beta compute instances detach-disk $INSTANCE --disk $OLD_DISK --zone $ZONE --project=$PROJECT
gcloud beta compute instances attach-disk $INSTANCE --disk $NEW_DISK --boot --zone $ZONE --project=$PROJECT
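The whole sequence, including the stop and start steps that the commands above leave out, could be sketched as one function. This is an illustration under assumptions, not the exact script used here: `run` and `swap_boot_disk` are hypothetical names, the commands are only printed unless `DRY_RUN=0` is set, and `--size` is omitted so the new disk takes the snapshot's size.

```shell
#!/usr/bin/env bash

# Print a command, and run it only when DRY_RUN=0 (default: print only).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Swap an instance's boot disk for one restored from a snapshot.
# Stops at the first command that fails.
# Usage: swap_boot_disk INSTANCE ZONE PROJECT OLD_DISK NEW_DISK SNAPSHOT
swap_boot_disk() {
  local instance="$1" zone="$2" project="$3"
  local old_disk="$4" new_disk="$5" snapshot="$6"

  run gcloud compute instances stop "$instance" \
    --zone "$zone" --project "$project" || return 1
  run gcloud compute disks create "$new_disk" --source-snapshot "$snapshot" \
    --zone "$zone" --project "$project" || return 1
  run gcloud beta compute instances detach-disk "$instance" --disk "$old_disk" \
    --zone "$zone" --project "$project" || return 1
  run gcloud beta compute instances attach-disk "$instance" --disk "$new_disk" --boot \
    --zone "$zone" --project "$project" || return 1
  run gcloud compute instances start "$instance" \
    --zone "$zone" --project "$project"
}
```

Calling it without `DRY_RUN=0` first gives a chance to review the five commands before anything touches the VM. The old disk is only detached, not deleted, so it stays available as a fallback.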

Conclusion

Voilà! I was able to access the machine. Someone might ask: why go through all the hassle when you could have just created a new VM from the snapshot? I couldn’t do that because I didn’t want to lose the VM metadata and, more importantly, the VM’s IP. This server is used by many of our customers, and they connect to it via its IP.

What I have learned

  • There is a serial port on compute instances that GCP provides.
  • startup-script and shutdown-script
  • You can detach and re-attach a boot disk, though this might not work exactly the same way for a Windows VM

Clean up

I have some cleaning up to do: deleting the old boot disk, removing extra SSH keys from metadata, and updating my code so that it removes old log files. These log files were the very reason this problem existed.
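For the log cleanup, a gentler variant of the earlier find command would delete only logs past a certain age, so recent logs stay around for debugging. This is a sketch: the function name `prune_old_logs` and the 7-day default are assumptions, not what the actual code does.

```shell
#!/usr/bin/env bash

# Delete *.log files under DIR that were last modified more than
# DAYS days ago. Other files and newer logs are left alone.
# Usage: prune_old_logs DIR [DAYS]   (DAYS defaults to 7)
prune_old_logs() {
  local dir="$1" days="${2:-7}"
  find "$dir" -type f -name '*.log' -mtime "+${days}" -delete
}
```

Something like `prune_old_logs /home/user 7`, run on a schedule (from cron, for instance), would keep the disk from filling up the same way again.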
