The Bad News
Unfortunately, it's true: the Zero KO board must be delayed, and all current pre-orders will be cancelled and refunded. The release date is now unknown. To understand how this happened, please read on; it was quite the process to figure out. If you have pre-ordered, I will be messaging you directly with refund details. If you did not receive my message, please reach out to me by email or Discord.
So What Happened?
For those of you who saw the demo of all the features on our Twitch stream, it's clear that all the software works and plays great. The newest board design was error free, and the assembly process was refined as well as it can be for a small shop. We even managed to get a hold of some chips during the chip shortage. So, what the heck happened? The answer is: bad parts!
Several months ago, while searching for stock of the chip we use for the ZKO, we found that about a thousand were in stock at a vendor we had heard of but did not usually purchase from. Having been bitten in the past by waiting too long to order, I pulled out the cash to snag some, 100 of them to be exact. A day after we made our purchase and secured stock, the rest were all gone. That is not at all uncommon in the economic climate the electronics industry is in.
I happily used some of this stock to continue getting the hardware and firmware to where it is now. However, around Revision C of the board, pressing the Home button would freeze the board. It was strange that pressing Home would trigger a fault, but no worries: the pin was re-mapped, and the problem went away. After many hours of testing by playing Tekken 7, the problem appeared to have been resolved. I was excited to send out a board to one of our favorite YouTubers, TechWithCraw, to give the board a beating and an honest review. You can check out his review here: https://www.youtube.com/watch?v=v2Tyv6qjYCA
After the thumbs up from Craw, I was convinced that we were production ready. All I needed to do was make one more board revision with the edit to re-map the Home button to a different pin. When Revision D came in, I gave the board another test run by mounting it with the screw feet. No problems. I then switched to the low-profile sticky feet and this is where the problems started to creep up intermittently: random firmware crashes.
What is the Root Cause?
This was extremely time-consuming to figure out. I had to isolate all the variables. I had some new firmware, a new board with a slight modification, and a different solder paste to consider. By loading the previous firmware, I was able to confirm that even known-good firmware still caused the problem. I also checked to see if a certain part of the code would fail, but the crashes were completely random.
The board had been changed slightly, so to rule this out I reworked a Rev D board back to how Rev C was. The problem persisted. Since pre-orders were coming in, we had already built a decent-sized batch, which gave us a bigger sample size. The crashes were random enough that it was not possible to trigger one on purpose. Some boards failed within minutes, while the most tolerant ones survived well over a week.
At this point, all the boards had different crash times, even though the parts on all the boards were the same. Part tolerances were within range, but the solder paste we used did produce more flux than usual. Sometimes, after several crashes, the ZKO would connect but crash again immediately and never connect again. In other words, it was a soft bricking event. But after cleaning the IC pins with isopropyl alcohol, a seemingly dead board would come back to life. After some research, I found that even "no-clean" solder paste, the kind that doesn't require a post-cleaning process, can still cause errors in certain scenarios. So we bought an ultrasonic cleaner; if flux was the issue, we would include cleaning as part of our production process.
Once it came in, every board that was cleaned until completely flux-free worked like new again. It's amazing how much residue a 5-minute bath takes off. However, with more rounds of testing, each board once again failed, one by one, at random intervals. Some were even triple bathed to be completely sure any leftover flux was removed.
At this point you can see how confused I was about the root cause. I decided to talk to my firmware helper (the one who helps reverse engineer the USB protocols). Fortunately, he used to be an engineer creating chips from scratch; we are talking atomic scale here. He informed me that there is a condition called latch-up, where stray electrical charge from the environment can cause the semiconductor material to stay in a certain (usually undesirable) state. When you put a cleaning agent on the pins, it changes the charge distribution inside the chip, and it works again. This was a bigger problem in the earlier days of chip manufacturing, but not so much for modern microcontrollers and microprocessors. He then asked if the chips were purchased from our usual vendors. They weren't! They came from a vendor that we had never purchased from before. He said that chip foundries will sometimes sell parts that are not quite up to spec or are dancing on the boundary between passing and failing. And sometimes, what appears to be an authentic chip can very well be a counterfeit. With the current chip shortage, these scenarios may be more common than before.
With this new hypothesis, I went ahead and rubbed my hand across the board, and it failed immediately. I tried mounting it and ran my fingers along the top of my acrylic case, and within a few short moments, it failed again. Finally, a repeatable, intentional failure, which is good news because it means I can diagnose the issue. However, the bad news is that there is no way to fix such an issue without replacement stock. That is extremely hard to find during the chip shortage, which is what got us into this trouble in the first place.
But Wait, Why Didn't the Same Chips Fail Before?
Here is where I was confused as well. But I noticed that I started seeing the failure when I used the lower-profile sticky feet, which put the board closer to the top of the case, where a player's hands would be rubbing against it. I switched back to the taller screw feet that I was using in the earlier stages of testing, and it was much harder to get the board to fail. I believe this is why TechWithCraw didn't see a failure either. He used a Buttercade Battery Caddy, which provides some protection from the top of the case and ultimately keeps the charge from coming in contact with the chip. I repeated these tests with a big copper plate, since copper and other metals can act as a barrier against certain charges. I put the copper plate between the top of the case and the bottom of the board. It never failed once! This is what convinced me that static charge reaching the chip is the root cause.
Can I Still Purchase It As Is?
With the correct installation in mind, it would likely work most of the time. I suspect that a metal case would also work. But that to me is a little janky. I intend every product I design and produce to last for many years to come, and a static charge failure like this is something I cannot allow to be released. It hurts a lot, financially and emotionally, because tons of work was put into it, but I do not think it would be ethical or beneficial to anyone to release the ZKO in this state. I believe that this short-term pain will lead to long-term gains.
What Now?
This is the tricky part. The chip we use is an extremely popular one that just about every industry uses. It is quite popular in networking products, which is twice as bad since working from home has increased demand for those kinds of devices. Usually higher demand brings higher supply, but with a chip shortage this popularity is hurting many industries. Option 1 would be to simply wait for stock to come around. Option 2 would be to port all the code we have now to a completely different chip. This is the option that I am leaning towards because, even though it will be a lot of work, it's more proactive than waiting for chips to fall from the sky. There is a new chip that I would like to test and use. It's not really used in bigger industries, which means demand for it should be lower. Oddly enough, it's actually faster than the chip we use now, which is a plus. There will be challenges, but this new chip also brings potential additions that weren't possible before.
For all of you who pre-ordered, I am terribly sorry to deliver this news. The best I can do is pivot and find a solution. As a thank you for supporting us in the pre-order, I will give you a 50% off coupon for anything in the store. You can use it whenever you like, even for the ZKO release in the future. For those of you who are developing expansion port devices for the ZKO, these faulty boards can still be used for development. When your current board bites the dust, let me know and I will send another your way so you can continue development.
Time to get back to the grind!
- Joe