RaptorCS POWER9 Blackbird PC: An expensive mistake

 4 years ago
source link: https://www.tuicool.com/articles/ZZvQVvy

November 2018

Ordered Basic Blackbird Bundle w/32 GB RAM: $1,935.64

June 2019

Order ships, and arrives without RAM. It had been long enough that I didn’t realize the order had only been partially fulfilled, so I order some RAM from the list of recommended chips ($338.40), along with the other necessities that I didn’t purchase from Raptor: a case ($97.99) and a PSU ($68.49), and grab some hard drives I have lying around. Total cost: about $2,440. Worth it to get POWER9 builds working on builds.sr.ht!

I carefully put everything together, consulting the manual at each step, plug in a display, and turn it on. Lights come on, things start whizzing, and the screen comes to life - and promptly starts boot looping.

June 27th

Support ticket created. What’s going on with my board?

June 28th

Support gets back to me the next day with a suggestion which is unrelated to the problem, but no matter - I spoke with volunteers in the IRC channel a few hours earlier and we found out that - whoops! - I hadn’t connected the CPU power to the motherboard. This is the end of the PEBKAC errors, but not the end of the problems. The machine gets further ahead in the boot - almost to “petitboot”, and then the display dies and the machine reveals no further secrets.

I sent an update to the support team.

July 1st

“We have normally only seen this type of failure when there is a RAM-related fault, or if the PSU is underpowered enough that bringing the CPUs online at full power causes a power fault and immediate safety power off.
“Can you watch the internal lights while the system is booting, and see if the power LED cluster immediately changes from green to orange as the system stops responding over SSH?”

The IRC channel suspects this is not related to the problem. Regardless, I reply a few hours later with two videos showing the boot up process from power-out to display death, with the internal LEDs and the display output clearly visible.

July 4th

“Any progress on this issue?”, I ask.

July 15th

“Hi guys, I’m still experiencing this problem. If you’re unsure of the issue I would like to send the board back to you for diagnosis or a refund.”

July 25th

“Sorry for the delay. Having senior support check out the videos.”
“Thanks for writing back. We should have something for you by tomorrow during the day.”

July 31st

“Hi Drew.
“The videos are being reviewed this week. Thank you for sending them.
“Please stay tuned.”

September 15th

No reply from support. I have since bought a little more hardware for self-diagnosis, namely the pieces needed to connect to the two (or is it three?) serial ports. I manage to get a log, which points to several failures, but none of them seem to be related to the problem at hand (they do indicate some network failures, which would explain why I can’t log into the BMC over SSH for further diagnosis). And the getty is looping, so I can’t log in on the serial console to explore any further.

That was a week ago. Radio silence since.

So, 10 months after I placed an order for a POWER9 machine, 3 months after I received it (without the RAM I purchased, no less), and over $2,500 invested… it’s clear that buying the Blackbird was an expensive mistake. Maybe someday I’ll get it working. If I do, I doubt the “support” team will have been involved. Currently my best bet seems to be waiting for some apparent staff member (the only apparent staff member) who idles in the IRC channel on Freenode and allegedly comes online from time to time.

I’m not alone in these problems. Here are some (anonymized) quotes I’ve heard from others while trying to troubleshoot this on IRC.

On support:

“Raptor has been burning a LOT of good will in the community. They really need a kick up the ass re: post-sales customer support.”
“ugh, ddevault, yeah. [Blackbird ownership] has not been a smooth experience for me, either.”
“my personal theory is that they have really bad ticket software that ‘loses’ tickets somehow”

On reliability:

“I’ve found openbmc’s networking to be… a bit unreliable… maybe 20% of the time it does not responed [sic]/does not respond fast enough to networking requests.”
“yeah the vga handoff failing doesn’t surprise me (other people here have reported it). but the BMC not getting a DHCP lease is odd. (well maybe not that odd if you look at the crumminess of the OpenBMC software stack…)”

So, yeah, don’t buy from Raptor Computing Systems. It’s too large and unwieldy to be an effective paperweight, either!

