“I’m just very curious—got to find out what makes things tick… all our people have this curiosity; it keeps us moving forward, exploring, experimenting, opening new doors.” – Walt Disney
One word at a time. It is like a stream of consciousness. Actions, objects, colors, feelings and sounds paint across the page like a slow moving brush. Each word adds to the crescendo of thought. Each phrase, a lattice of cognition. It assembles structure. It conveys scenes. It expresses logic, reason and reality in strokes of font and punctuation. It is the miracle of writing. Words strung together, one by one, single file, transcending and preserving time and thought.
I love writing. But it isn’t the letters on the page that excite me. It is the progression of thought. Think about this for a moment. How do you think? I suspect you use words. In fact, I bet you have been talking to yourself today. I promise, I won’t tell! Sure, you may imagine pictures or solve puzzles through spatial inference, but if you are like me, you think in words too. Those “words” are likely more than English. You probably use tokens, symbols and math expressions to think as well. If you know more than one language, you have probably discovered that there are some ways you can’t think in English and must use the other forms. You likely form ideas, solve problems and express yourself through a progression of those words and tokens.
Over the past few weekends I have been experimenting with large language models (LLMs) that I can configure, fine-tune and run on consumer-grade hardware. By that, I mean something that will run on an old Intel i5 system with an Nvidia GTX 1060 GPU. Yes, it is a dinosaur by today’s standards, but it is what I had handy. And, believe it or not, I got it to work!
Before I explain what I discovered, I want to talk about these LLMs. I suspect you have all personally seen and experimented with ChatGPT, Bard, Claude or the many other LLM chatbots out there. They are amazing. You can have a conversation with them. They provide well-structured thought, information and advice. They can reason and solve simple puzzles. Some researchers argue that they could even pass the Turing test. How are these things doing that?
LLMs are made up of neural nets. Once trained, they receive an input and provide an output. But they have only one job. They provide one word (or token) at a time. Not just any word, the “next word.” They are predictive language completers. When you provide a prompt as the input, the LLM’s neural network will determine the most probable next word it should produce. Isn’t that funny? They just guess the next word! Wow, how is that intelligent? Oh wait… guess what? That’s sort of what we do too!
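To make the "guess the next word" loop concrete, here is a minimal sketch in Python. The probability table is entirely made up for illustration, and it only looks at the previous word. A real LLM computes these probabilities with a neural network that considers the whole prompt, and usually samples rather than always taking the top choice.

```python
# Toy stand-in for a trained model: for each word, the probability of each
# possible next word. All numbers here are invented for illustration.
NEXT_WORD_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "<end>": 0.2},
    "a": {"dog": 1.0},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 1.0},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}


def most_probable_next(word):
    """Return the single most probable next word (greedy choice)."""
    probs = NEXT_WORD_PROBS[word]
    return max(probs, key=probs.get)


def complete(prompt_word="<start>", max_words=10):
    """Generate text one word at a time until <end> or max_words."""
    words = []
    current = prompt_word
    for _ in range(max_words):
        nxt = most_probable_next(current)
        if nxt == "<end>":
            break
        words.append(nxt)
        current = nxt  # the new word becomes the context for the next guess
    return " ".join(words)


print(complete())  # the cat sat
```

Each pass through the loop produces exactly one word and then feeds it back in as context, which is the same shape as the generation loop described above.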
So how does this “next word guessing” produce anything intelligent? Well, it turns out, it’s all because of context. The LLM networks were trained using self-attention to focus on the most relevant context. The mechanics of how it works are too much for a Monday email, but if you want to read more see the paper, Attention Is All You Need which is key in how we got to the current surge in generative pre-trained transformer (GPT) technology. That approach was used to train these models on massive amounts of written text and code. Something interesting began to emerge. Hyper-dimensional attributes formed. LLMs began to understand logic, syntax and semantics. They began to be able to provide logical answers to prompts given to them, recursively completing them one word at a time to form an intelligent thought.
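For the curious, the core calculation behind attention is small enough to sketch. This is a simplified illustration of scaled dot-product attention, the building block described in Attention Is All You Need, written in plain Python. It assumes the query, key and value vectors are already given. A real transformer learns projection matrices to produce them, uses many attention heads, and masks out future tokens, none of which is shown here.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    biggest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    The weight on each value reflects how well its key matches the query.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[2.0, 0.0], [0.0, 2.0]]
# A query that matches neither key better than the other blends both values.
print(attention([[0.0, 0.0]], keys, values))  # [[1.0, 1.0]]
```

A query that lines up strongly with one key pulls the output toward that key's value, which is how the model focuses on the most relevant context.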
Back to my experiment… Once a language model is trained, the read-only model can be used to answer prompts, including questions or conversations. There are many open source versions out there on platforms like Hugging Face. Companies like Microsoft, OpenAI, Meta and Google have built their own and sell them or provide them for free. I downloaded the free Llama 2 Chat model. It comes in 7, 13 and 70 billion parameter models. Parameters are essentially the variables that the model uses to make predictions to generate text. Generally, the more parameters, the more intelligent the model. Of course, the more parameters it has, the larger the memory and hardware footprint needed to run the model. For my case, I used the 7B model with the neural net weights quantized to 5 bits to further reduce the memory needs. I was trying to fit the entire model within the GPU’s VRAM. Sadly, it needed slightly over the 6GB I had. But I was able to split the neural network, loading 32 of the key neural network layers into the GPU and keeping the rest on the CPU. With that, I was able to achieve 14 tokens per second (a way to measure how fast the model generates words). Not bad!
I began to test the model. I love to test LLMs with a simple riddle. You would probably not be surprised to know that many models tell me I haven’t given them enough information to answer the question. To be fair, some humans do too. But for my experiment, the model answered correctly:
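A quick back-of-the-envelope calculation shows why quantization matters so much. The sketch below counts only the raw weights, assuming every weight takes exactly the stated number of bits. Real quantization formats store some extra scaling data, and running the model needs additional memory for the context and working buffers, so treat these numbers as a lower bound rather than a measurement of my setup.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Llama 2 sizes mentioned above, at 16-bit and at 5-bit precision.
for name, params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    full = weight_memory_gb(params, 16)
    quantized = weight_memory_gb(params, 5)
    print(f"{name}: {full:.1f} GB at 16-bit, {quantized:.1f} GB at 5-bit")
```

By this estimate the 7B weights drop from about 14 GB at 16-bit to a little over 4 GB at 5-bit, which is what puts a 6GB card within reach at all, even if the runtime overhead still pushes the total past it.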
> Ram's mom has three children, Reshma, Raja and a third one. What is the name of the third child?
The third child's name is Ram.
I went on to have the model help me write some code to build a Python Flask-based chatbot app. It makes mistakes, especially in code, but was extremely helpful in accelerating my project. It has become a valuable assistant for my weekend coding distractions. My next project is to provide a vector database to allow it to reference additional information and pull current data from external sources.
I said this before, but I do believe we are on the cusp of a technological transformation. These are incredible tools. As with many other technologies that have been introduced, they have the amazing potential to amplify our human ability. Not replacing humans, but expanding and strengthening us. I don’t know about you, but I’m excited to see where this goes!
Stay curious! Keep experimenting and learning new things. And by all means, keep writing. Keep thinking. It is what we do… on to the next word… one after the other… until we reach… the end.
“Imperfect things with a positive ingredient can become a positive difference.” – JasonGPT
I don’t know how you are wired, but for me, I become intoxicated with new technology. I have a compulsive need to learn all about it. I’m also a kinesthetic learner which means I need to be hands on. So into the code I go. My latest fixation is large language models (LLMs) and the underlying generative neural network (NN) transformers (GPTs) that power them. I confess, the last time I built a NN, we were trying to read George H.W. Bush’s lips. And no, that experiment didn’t work out too well for us… or for him!
Do you want to know what I have discovered so far? Too bad. I thought I would take you along for the ride anyway. Seriously, if you are fed up with all the artificial intelligence news and additives, you can stop now and go about your week. I won’t mind. Otherwise, hang on, I’m going to take you on an Indiana Jones style adventure through GPT! Just don’t look into the eyes of the idol… that could be dangerous, very dangerous!
Where do we start? YouTube of course! I have a new nerd crush. His name is Andrej Karpathy. He is a Slovak-Canadian computer scientist who served as the director of artificial intelligence and Autopilot Vision at Tesla and currently works for OpenAI. He lectured at Stanford University and has several good instructional lectures on YouTube. I first saw him at the Microsoft Build conference where he gave a keynote on ChatGPT but what blew me away was his talk, “Let’s build GPT: from scratch, in code, spelled out.” (YouTube link). It’s no joke. He builds a GPT model on the works of Shakespeare (1MB), from scratch. After spending nearly 2 hours with him, Google Colab and PyTorch, I was left with a headache and some cuts and bruises. But I also had an insatiable desire to learn more. I have a long way to go.
The way I learn is to fork away from just repeating what an instructor says and start adding my own challenges. I had an idea. I have done a lot of writing (many of you are victims of that) and much of that is on my blog site. What if I built a GPT based solely on the corpus of all my writing? Does that sound a bit narcissistic to you too? Oh well, for the good of science, we go in! Cue the Indy music. I extracted the text (468k). It’s not much, but why not?
By the way, if you are still with me, I’ll try to go faster. You won’t want to hear about how I wasted so much time trying to use AMD GPUs (their ROCm software sucks, traveler beware), switched to CPUs, Nvidia CUDA and eventually Apple Silicon MPS (Metal Performance Shaders built into the M1). All the while, I was using my fork of the code I built with Andrej Karpathy (ok, not him directly, but while watching his video). I started off with the simple Bigram NN Language model. And it is “Bi-Gram” not “Big RAM” but I found that to be ironically comical in a dad joke sort of way.
My JasonGPT bigram.py started learning. It ran for 50,000 iterations and took about 8 hours. It even produced an output of random musings. While there was quite a bit of nonsensical output, I was amazed at how well this small run learned words and basic sentence structure, and even picked up on my style. Here are some samples from the output I found interesting, comical and sometimes, well, spot on:
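If "bigram" is a new term, the idea is that the model predicts the next character using only the current one. As a rough illustration, here is a count-based version in plain Python. This is not the PyTorch neural network from the video or my bigram.py. It simply tallies which character follows which in the training text and then samples from those tallies. The function names and the tiny training string are made up for the example.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for current, nxt in zip(text, text[1:]):
        counts[current][nxt] += 1
    return counts


def generate(counts, start, length, seed=0):
    """Sample characters one at a time, weighted by the learned counts."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # this character was never followed by anything in training
        chars = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)


model = train_bigram("keep writing. keep thinking. keep shipping.")
print(generate(model, "k", 40))
```

The output will be mostly gibberish with familiar fragments in it, because one character of context is very little to go on. That limitation is what the larger transformer models address by looking at much more context.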
It’s a lot of time… But I think we also need science.
What are your big ideas?
Set our management to the adjacent ground (GND) pin.
I have a task to Disneyland out that this day.
I love the fun and fanciful moments as kids get to dream into their favorite characters, embrace the identity of their heroes, wrap themselves up starfish back.
Bring on the “power” of his accidental detail.
Your character provided faith, all kindness and don’t care.
Grab a difference too.
After several days of emailing, texting and calling, I received a text message.
Curl has the ability to provide timing data for DNS lookup, it will easily show or avoided.
Imperfect things with a positive ingredient can become a positive difference, just get that time.
I also believe we should exploit the fusion power that shows up each day in our company’s data.
Have you found a vulnerability? Are you concerned about some missing measures or designs that should be modernized or addressed? If so, don’t wait, raise those issues. Speak up and act. You can make a difference.
“I know what you are thinking.” the irony
We are the ones who make a brighter day.
The journey ahead is ahead.
What are you penning today? What adventures are you crafting by your doing? Get up, get moving… keep writing.
Look, it’s no ChatGPT, but it blew my mind! I’m only using a 4 layer NN with 7 million parameters. In comparison, GPT-3, the model family that ChatGPT grew out of, uses 96 layers and 175 billion parameters! Before the weekend ended, I set up nanoGPT to build a more elaborate model on my data set. It’s still running, but already I can see it has learned a lot more of my style but seems to lack some focus on topics. It’s easily distracted and interrupts its own train of thought with new ideas. Squirrel! Nothing like me.
So my JasonGPT won’t be writing my Monday updates anytime soon, but who knows, maybe it will help me come up with some new ideas. I just hope it stays benevolent and kind. I would hate for it to suddenly become self-aware and start…
Typhoon warning! My nephew is a Lt. Commander in the US Navy currently stationed in Guam. He teaches and manages trauma and emergency care at the hospital. Last night, he was preparing his family for the typhoon that would be sweeping across the small Pacific island in just a few hours. They closed the storm shutters, stored their Jeep in the basement and ensured their backup power and pumps were working. My nephew drew the short straw at the hospital and will be managing the ER while the storm rolls through. I worried about the hospital being built for these types of events and he assured me that it was, but of course, he was quick to add that the generators were built by the lowest bidder.
There is another typhoon coming. Gazing out over the technology horizon we can see a storm forming. But this one seems to be more than heavy winds and rain. I’m talking about the recent astonishing developments in generative artificial intelligence (GAI). I’m increasingly convinced that we are sitting on the edge of another major tectonic shift that will radically reshape the landscape of our world. Anyone who has spent time exploring OpenAI’s ChatGPT or DALL-E, Google’s Bard, Microsoft’s Bing or Copilot, Midjourney, or any of the hundreds of other generative AI tools out there, will immediately recognize the disruptive power that is beginning to emerge. It’s mind blowing. GAI’s capacity to review and create code, write narratives, empathetically listen and respond, generate poetry, transform art, teach and even persuade, seems to double every 48 hours. It seems that our creation has modeled the creator so well that it even has the uncanny ability to hallucinate and confidently tell us lies. How very human.
I have never seen a technology grow this fast. I recall the internet in the late 1980s and thinking it had amazing potential as a communication platform. Little did I realize that it would also disrupt commerce, entertainment, finance, healthcare, manufacturing, education and logistics. It would create platforms for new businesses like the gig economy and provide whole new levels of automation and telemetry through IoT. But all of that took decades. Generative technology is announcing breakthrough improvements every week, sometimes every 48 hours. To be fair, these large language models (LLMs) are all using decades-old research in neural network (NN) technology. However, when you combine those NNs with enhancements (e.g. newer transformers, diffusion algorithms), hardware (e.g. GPUs) and rich data sets (e.g. the internet) they unleash new capabilities we don’t even fully understand. The latest generations of the LLMs even appear to be doing some basic level reasoning, similar to how our own organic NNs help us solve problems.
Businesses are already starting to explore the use of this technology to increase productivity, improve quality and efficiency. Wendy’s recently announced that they are partnering with Google to use GAI to start taking food orders at their drive-throughs. [1] Gannett, publisher of USA Today and other local papers, is using GAI to simplify routine tasks like cropping images and personalizing content. [2] Pharmaceutical companies like Amgen are using GAI to design proteins for medicines. [3] Autodesk is using GAI to design physical objects, optimizing design for reduced waste and material efficiency. [4] Gartner identifies it as one of the most disruptive and rapidly evolving technologies they have ever seen. [5] Goldman Sachs is predicting that GAI will drive a 7% increase in global GDP, translating to about $7 trillion! [6]
It’s time to prepare for the typhoon. I’m excited about the future! As a technologist, I know disruptions will come, challenging our thinking and changing how we work, live and play. I know it can also be terrifying. It can prompt fear, uncertainty and doubt. But now is the time to prepare! Don’t wait to be changed, be the change. Start exploring and learning. I have a feeling that this new technology will be a 10x amplifier for us. Let’s learn how we can use it, work with it and shape it to be the next technological propellant to fuel our journey to a greater tomorrow!
This blog text was 100% human generated but the image was created with OpenAI DALL-E 2.
It was the first day of the pottery class. The instructor welcomed the students and began to orient them on the material. He announced that the final grade would be determined by one of two measures. For half the class, he said that their final grade would be determined by the “quality” of their pottery. Their goal was to work on a single high quality product. For the other half of the class, he said that their final grade would be determined by “quantity”. Their goal was the sheer amount of pottery produced. Fifty pounds of pots would be rated an “A”, forty pounds a “B”, and so on. The class began and the students began their work.
The last day of class finally came and a curious fact emerged. The works of highest quality were not produced by the group focused on quality. Instead, the highest quality works were all produced by the group graded for quantity! It seemed that the “quantity” group got busy producing piles of work and learning from their mistakes as they went along. In contrast, the “quality” group sat around theorizing about perfection, and in the end had little more to show for their work than some theory of perfection and a lump of dead clay.
The key to becoming a great artist, writer, musician, etc., is to keep creating! Keep drawing, keep writing, keep playing! Quality emerges from the quantity. It strikes me that the same thing applies to software and systems we run. When we focus purely on the quality, we actually miss the mark. The way to improve quality is to keep creating, testing and learning. In the software sense, we want to keep releasing our code as often and as fast as possible. By doing that, we build operational expertise, knowledge and automation. We develop fast feedback loops that nudge the digital clay into a better shape. We tune processes to provide faster feedback loops, remove toil through automation, and minimize human error and mistakes. We optimize for a high throughput of working products and reap the prize of high quality outcomes.
But does this hold true? In my career, I have seen this to be true time and time again. Areas where we remove friction and optimize for faster release cycles (even multiple times a day), with automated integration, testing and delivery, ultimately result in higher quality products. I see the same thing looking out to the industry. The highest performing teams optimize for highest flow. The prize of perfection comes by delivering and learning. In the book, “Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations,” Dr. Nicole Forsgren, Jez Humble, and Gene Kim ran a multi-year research project looking at practices and capabilities of high-performing technology organizations. Their conclusion was that the highest performing organizations embraced the notion of continuous delivery, the ability to deliver changes frequently, reliably and with minimal manual effort.
We ship! As technologists, software engineers and SREs, our teams help design, build and run the digital trains that deliver amazing products and experiences to our customers and fellow employees every single day. Our goal is to make these experiences shine! And, as the pottery class learned, it is the quantity of our practice and continuous learning that makes them more perfect.
Keep shipping. Keep improving. Keep delivering!
The pottery parable is a true story as captured by David Bayles and Ted Orland in their book, Art & Fear. There is a similar story about photography in James Clear’s book Atomic Habits.
Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations by Dr. Nicole Forsgren, Jez Humble, and Gene Kim also identifies other key traits of high performing organizations, including having loosely coupled architecture, embracing a learning culture of experimentation, adopting lean principles to optimize flow, and creating a high-trust and empowering environment.
Forsgren, N., Humble, J., & Kim, G. (2018). Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations. IT Revolution Press.
Bayles, D., & Orland, T. (1993). Art & Fear. The Image Continuum.
Clear, J. (2018). Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones. Avery.
“In order to survive and win in the ever-changing world, keep updating yourself.” – Gordon Moore
Gordon was born during the Great Depression. His dad was the local sheriff. They lived in the small farming and ranching town of Pescadero, California. He was a quiet kid, but he was optimistic and hopeful. He loved the great outdoors and would often go fishing or play at the Pescadero Creekside Barn. He also loved science. His parents bought him a chemistry set for Christmas one year which eventually inspired him to pursue a degree in Chemistry. He earned a Bachelor of Science at UC Berkeley and went on to receive his PhD at Caltech.
After college, Gordon joined fellow Caltech alumnus and co-inventor of the transistor, William Shockley, at Shockley Semiconductor Laboratory. Unfortunately, things didn’t go well there. Shockley was controlling and erratic as a manager. Gordon and most of the other top scientists left after a year and joined Sherman Fairchild to start a new company. At Fairchild Semiconductor, Gordon and his friend, Robert Noyce, helped devise a commercially viable process to miniaturize and combine transistors to form whole circuits on a sliver of silicon. This led to the creation of the first monolithic integrated circuit, the IC.
Gordon and Robert eventually left Fairchild and decided to form their own company. They would focus on integrated circuit development so they named their company, Integrated Electronics. They started making memory chips and focused the company on high speed innovation. The company did extremely well at first but also faced some difficult times that required significant changes. All the while, Gordon focused on pushing things forward and taking risks. They had to constantly reinvent themselves to survive. The company was later renamed to something that you might be familiar with, Intel.
Gordon believed that the key to their success was staying on the cutting edge. That led to the creation of the Intel 4004, the first general purpose programmable processor on the market. Gordon had observed that the number of transistors embedded on the chip seemed to double every year. He projected that trend line out into the future and made a prediction that the number of transistors would double at regular intervals for the foreseeable future. This exponential explosion that Gordon predicted would power the impact, scale and possibilities of computing for the world for years to come. Of course, you know that famous prediction. It was later named after him, “Moore’s Law”.
In 1971, the first Intel 4004 processor held 2,300 transistors. As of this year, the Intel Sapphire Rapids Xeon processor contains over 44 billion. The explosion of capability powered by science continues to accelerate the technology that enhances and amplifies our daily lives. This past Friday, Gordon Moore passed away at his home in Hawaii, but the inspiration, prediction and boundless technical optimism that he started continues to live on.
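Those two data points are enough to check the doubling rate for ourselves. The short calculation below uses only the transistor counts quoted above and assumes "this year" is 2023. It is arithmetic on two endpoints, not a fit to the full history of chip releases.

```python
import math


def doubling_period_years(start_count, end_count, years):
    """Years per doubling implied by growth from start_count to end_count."""
    doublings = math.log2(end_count / start_count)
    return years / doublings


# Intel 4004 (1971) versus Sapphire Rapids Xeon (assumed 2023).
doublings = math.log2(44e9 / 2_300)
period = doubling_period_years(2_300, 44e9, 2023 - 1971)
print(f"{doublings:.1f} doublings, one every {period:.2f} years")
```

That works out to roughly 24 doublings in 52 years, or about one doubling every two years. Slower than the yearly doubling Gordon first observed, but still a remarkable streak of regular doubling across five decades.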
I know there is a lot going on right now. We are facing uncertainty and considerable change. It can create fear and apprehension. Technology is constantly being disrupted as well as its role, and our roles, in applying it to our businesses. While not comfortable, we need to embrace the change. Lean in and learn. We need to constantly find new ways to reinvent ourselves and what we do. Embrace the exponential possibility of the future! We can do this!
“The sculpture is already complete within the marble block before I start my work. It is already there, I just have to chisel away the superfluous material.” – Michelangelo
A tanker truck hauling 8,600 gallons of gasoline approached the MacArthur Maze, a large freeway interchange near the east end of the San Francisco-Oakland Bay Bridge in California. The driver, traveling faster than he should, lost control, hit the guardrail and overturned the load of highly flammable fuel. It spilled out on the interchange and exploded into a violent inferno, sending flames hundreds of feet into the air. The heat weakened the steel structure of the three-lane section of Interstate 580, causing the road to collapse onto Interstate 880 below. Thankfully, the driver survived and no other vehicles were involved in the accident.
California Department of Transportation, Caltrans, rushed in to quickly assess the damage to this crucial interchange which handles some 160,000 vehicles per day. It would take weeks to clear the debris and several months to repair. Initial cost projections reached $10 million with an impact cost of $90 million. Bidding for the job started immediately. Due to the urgency of restoring this vital link, the state offered an incentive of $200,000 per day bonus if the work was completed before the deadline.
Bidding started. C. C. Myers had been planning for this his whole life. While other contractors in the room were offering on-time proposals well over the $10 million estimate, C. C. Myers shocked the room. He would do the work for $878,075, promising to complete the work well ahead of schedule. This was not the first time C. C. Myers had taken on heroic work. His company had a proven track record of rebuilding damaged freeways well ahead of schedule, including the Santa Monica Freeway after the 1994 Northridge earthquake. Needless to say, he won the bid.
C. C. Myers went to work. He had assembled a logistic transport team and forged agreements in Texas and other areas to expedite steel delivery to the interchange. He streamlined processes and cut away any distractions and superfluous procedures that didn’t directly contribute to safely delivering the roadway ahead of schedule. As an example, the typical inspection process requires steel workers to complete all their welds before scheduling government X-ray inspection. C. C. Myers convinced the government to embed X-ray technicians in his team and perform the test immediately after the weld was complete. This allowed the crew to get real-time feedback on any area that didn’t pass and fix it immediately before moving on.
C. C. Myers’s efforts were successful. The monumental work was completed over a month ahead of schedule, right before a busy Memorial Day weekend. C. C. Myers earned a $5 million bonus for completing the work early. He quickly gave credit to his workers and their ability to deliver, but moving the mountain had required his artistry as well.
Like Michelangelo, C. C. Myers’s genius was his ability to stare into the mountain of “marble” and see what could be removed to reveal the ultimate outcome. Procedures and processes that didn’t directly deliver value were debris that had to be swept away. Every ounce of energy, every minute, and every movement was precious and deliberate. Everything that wasn’t part of the goal was chiseled away.
What is the work and marble before you right now? What is the goal? What sculpture are you trying to reveal? What can you remove? As all you wonderful artists head into your work, channel your inner Michelangelo. Chisel away the useless motion, process and procedures to reveal the incredible work of art buried in the marble.
Credit: A friend of mine, Paul Gaffney, spoke on this at the 2023 DevOps Enterprise Forum. His story was far more eloquent than my version. It motivated me to do more research on the incident. The result is this post. I’m indebted to Paul for his inspiration.
Oh, no! We were several hours into a major system outage and there was still no clue as to what was broken. The webservers were running at full load and the applications were pumping a constant stream of error logs to disk. Systems and application engineers were frantically looking through the dizzying logs for clues as to the cause. Of course, looking at the logs, you would assume everything was broken, and it was. But even when the application worked, the logs were full of indecipherable errors. Everyone knew that most of the “errors” in the logs weren’t really errors, but untidy notices that developers had created long ago as part of a debugging exercise. As one engineer observed in some degree of frustration, “It’s like the log file that cried wolf!” After a while, nobody notices the errors.
The teams restarted services, rebooted systems, stopped and restarted load balancers. Nothing helped. Network engineers dug into the configuration of the routers and switches to make sure nothing was amiss. Except for the occasional keyboard typing sounds, dogs barking or children crying in the background, the intense investigation had produced an uncanny silence on the call. Operation center specialists were quickly crafting their communication updates and were discussing with the incident commander how to update their many clients that were impacted by this outage. Company leaders and members of the board of directors were calling in to get updates. Stress was high. Would we ever find the cause or should we just shut down the company now and start over? Fatigue was setting in. Tempers were starting to show. Discussion ensued on the conference call to explore all mitigation options and next steps.
“I found it!” The discussion on the call stopped. Everyone perked up, anxious to hear the discovery. “What did you find?” the commander asked in a hopeful way. The giddy engineer took center stage on the call, eager to tell the news. “It’s the inventory service! The server at the fulfillment center seems to be intermittently timing out. Transactions are getting stuck in the queue.” The engineer paused, clearly typing away at some commands on his computer. “I think we have a routing problem. I try to trace it but it seems to bounce around and disappear. Sometimes it works, but to complete the transaction, multiple calls are required and too many of them are failing. I’m chatting with the fulfillment center and they report the inventory system is running.”
The engineer sent the traceroute to the network engineer who started investigating and then asked, “Can you send me the list of all the addresses used by the inventory system?” After some back and forth, the conclusion came, “I found the problem! There are two paths to the fulfillment center, one of which goes through another datacenter. That datacenter link looks up but it is clearly not passing traffic.” After more typing, the conclusion, “Ah, it seems the telco made a routing change. I’m getting them to reverse it now.” Soon the change was reversed and transactions were flowing again. The dashboards cleared and “green” lights came back on. Everyone on the bridge quietly, and sometimes not so silently, celebrated and felt an incredible emotional relief. Sure, there would be more questions, incident review and learning, but solving the problem was exhilarating.
How many of you can relate to a story like that? How many of you have been on that call?
A friend of mine, Dr. Steven Spear at MIT, often reminds us that the key to solving a problem is seeing the problem. You can’t solve what you cannot see. A big part of reliability engineering and systems dynamics is understanding how we gain visibility into problems and surface them so they can be addressed. Ideally, we find those weaknesses before they cause real business impact. That is often the attraction of chaos engineering, poking at fault domains to expose fractures that could become outages. But sometimes the issue is so complex that we just need a clear line of sight into the problem. In the story above, connectivity and those dependent links were not clearly visible. If there were some way to measure the foundational connectivity between the dependent locations, our operational heroes could have quickly seen it, fixed it, and gone back to sleep. Getting that visibility in advance is the right thing to do for our business, our customers and our teams.
You can throw a GridBug onto any instance, into any datacenter, and it will go to work monitoring connectivity. I didn’t have time to test any serverless options but it should work as well. I set up 5 nodes in 3 locations for a test, with some forced failures to see how it would detect conditions on the grid. The graph data converges over time so that every node can render the same graph. If you want to see it, here is my test and project code: https://github.com/jasonacox/gridbug
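To illustrate how graph data can converge so that every node ends up with the same picture, here is a hypothetical sketch in Python. It is not the GridBug implementation (see the repository for that). It assumes each node keeps a table of link observations keyed by (source, destination), each stamped with a time and an up/down flag, and that nodes periodically swap tables and keep the newest observation for each link. The function name and data layout are invented for this example.

```python
def merge_observations(local, remote):
    """Merge two link tables, keeping the newest observation per link.

    Each table maps (source, destination) -> (timestamp, is_up).
    Comparing the whole tuple gives a deterministic tie-break when two
    observations share a timestamp, so the merge order does not matter.
    """
    merged = dict(local)
    for link, observation in remote.items():
        if link not in merged or observation > merged[link]:
            merged[link] = observation
    return merged


# Two nodes with different, partly stale views of the grid.
node_a = {("a", "b"): (10, True)}
node_b = {("a", "b"): (12, False), ("b", "c"): (11, True)}

# After swapping tables, both nodes hold the same graph.
view_a = merge_observations(node_a, node_b)
view_b = merge_observations(node_b, node_a)
print(view_a == view_b)  # True
```

Because the merge gives the same answer regardless of which side starts, repeated exchanges between any pairs of nodes eventually spread the newest observation of every link to the whole grid.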
I have no expectations on this project. It is clearly just a work of fun I wanted to share with all of you, but it occurs to me that there is still a lesson here. Pain or necessity is a mighty force in terms of inspiration. What bugs you? Like this outage example, is there some pain point that you would love to see addressed? What’s keeping you from trying to fix it? Come up with a project and go to work on it. You are going to learn something! Look, let’s be real, my project here is elementary and buggy at best (sorry, couldn’t resist the pun), but I got a chance to learn something new and see a fun result. That’s what makes projects like this so rewarding. The journey is the point, and frankly, you might even end up with something that brings some value to the rest of our human family. Go create something new this week!
We had assembled to put together the outline for a guidance paper. At the top was the title, “Modern Governance.” I thought to myself that the title alone would cure insomnia. Despite the title, members of the team had developed brilliant new automation and approaches. They were already deploying those game changing ideas at their businesses. We wanted to share those! Unfortunately, the gold was buried in the boredom. It was too academic and dry. Nobody would make it past the title, much less the layers of governance tedium in the outline. Energy in the room which had been off the chart during the discovery discussions suddenly fell flat as we all realized that our guidance document would have little impact on the real world.
“Hey, I have an idea! Why don’t we just tell a story?” I suggested, “Imagine a Phoenix Project moment where a crisis hits and a band of characters have to solve it.” Enthusiasm erupted as the group piled on with ideas on how the story could unfold to show and teach the thoughts we had captured in the dry outline. Suddenly, characters emerged. Susan, the CEO, was getting an urgent phone call about an existential crisis hitting her company. Bill, Jada, Michelle, Jason and the rest of the cast of characters sprang to life in a brief narrative. We put the story to paper and changed the name to Investments Unlimited, inspired by the fictitious company in the Phoenix Project. We had done it! A short story was assembled and we presented it to the rest of the DevOps Forum who applauded the work. Mission accomplished. Or so it seemed…
A few months later we were invited to a meeting. “Gene Kim and the staff at IT Revolution reviewed your paper and we have a proposal.” Leah, the editor for IT Rev and the Forum papers explained to us, “We think the paper is great, but we think it could be greater. We would like to turn it into a novel.” She paused and surveyed the group. John Willis, the leader of the forum group and fellow co-author, suggested, “I think we should do this! It would take some work, but we should write it ourselves and add some of the details that we couldn’t develop before. What do you think, are you up for it?” We were all stunned and delighted. One by one, we all chimed in that we would love to take on the challenge. Shortly after that call we started meeting every Tuesday evening to work on the book. We invited industry experts to interview and fill in the gaps of our understanding. Weekends became a writing club where some of us would meet to knock out a scene, develop a character or wordsmith a moment. Slowly the short paper became chapters, and the chapters became a novel.
I confess, I was enamored just to be part of this great group of co-authors. This cast was made up of an incredible family of industry thought leaders, technical gurus and fellow DevOps rebels: Helen Beal, Bill Bensing, Michael Edenzon, Tapabrata “Topo” Pal, Caleb Queern, John Rzeszotarski, Andres Vega and of course, John Willis. Our meetings would sometimes pivot into philosophical discussions, technology news or current DevSecOps challenges. Despite the frequent distractions and detours, we managed to nudge the narrative forward, week by week.
Writing a book is hard. You are turning ambiguous ideas into letters on a page. The key was to just keep writing, keep the prose flowing. There were times where you wouldn’t feel inspired or enthusiastic about the words pouring out of your fingers, but you would keep typing. I was surprised and amazed at how well that worked. More than once, I discovered that inspiration followed effort. The act of doing created a warming glow. Suddenly the arduous task unlocked a love, a passion and an inspiration that wasn’t there before. That approach developed new twists in the story, new ideas to explore or challenges to solve. But getting those words on the paper was important. We would spend months editing and tweaking the story, but without that original content there would be nothing to work with. Eventually we would have a finished product and as of two weeks ago, a published book. It was an experience that I will forever cherish and recommend to anyone who gets the opportunity to do the same.
Just keep writing. Going through this journey has reminded me of the importance of “doing,” self-motivation and determination. I think we can all get stuck in limbo, waiting around for that magical moment of inspiration. The truth is that in life, that inspiration is often the result of the wind of our own movements. Just keep going! Inspiration will come. Words will become chapters and chapters will become stories. What are you penning today? What adventures are you crafting by your doing? Get up, get moving… keep writing.
Investments Unlimited: A Novel About DevOps, Security, Audit Compliance, and Thriving in the Digital Age by Helen Beal, Bill Bensing, Jason Cox, Michael Edenzon, Dr. Tapabrata “Topo” Pal, Caleb Queern, John Rzeszotarski, Andres Vega, and John Willis
“On the other side of the screen, it all looks so easy.” – Kevin Flynn
“LaserDisc… Prepare to be blown away!” The clerk at the local movie rental store handed us the LaserDisc player and movie and guaranteed that it would level up our home movie experience. My brother and I unpacked the dazzling new player and quickly connected the RCA cables, powered up the audio system and hit play. Seconds later it sprang to life with colorful geometric shapes flying across the screen, taking us on a journey into a virtual realm. The dazzling images on the screen were accompanied by room filling sounds the LaserDisc pumped into the audio system. The ethereal soundtrack by Wendy Carlos transported us into this magical world of the impossible. The characters in the movie were playing video games, but not like my sister and I would play at the local arcade, they were actually in the game, inside the computer! They were “programs”, walking around, pulling power from circuit board rivers of light, recording information on their identity disks, piloting vector based light-cycles, tanks, recognizers and solar sailing ships across the grid. And like any good hero story, they fought against the oppressive evil overlord. The Master Control Program sought to enslave the world of computer programs to do its evil bidding to ultimately take over the human world. They were fighting for the “Users”, the human creators of this digital realm. One of those creators, a programmer named Flynn, gets transported into this digital world to join in on the fight. Welcome to the world of TRON!
I was blown away! The clerk had been right. It had inspired me and introduced me into a new world. The world of programs, computers and computer graphics. I was suddenly obsessed with this new found passion. It became an imperative for me to learn everything I could about this computer world. I managed to talk my dad into getting me a Commodore 64 so I could learn to do all these things that I had seen on the screen. Soon, I was crafting my own programs, sprites, animations and audio waveforms. I even made my own space adventure game that I published in our middle school paper, as if anyone would ever type in all that code! I was hooked. Maker clubs, hacker homebrew meetups and bulletin board systems eventually led me to join the computer science and electrical engineering departments at the University of Tulsa. I knew what I wanted to do. I wanted to fight for the Users, making programs and systems that made the world a better place.
TRON was released to theaters 40 years ago this past weekend. While not a blockbuster for Disney by any means, the film was groundbreaking. As with so many Disney films, it had inspired people just like me. It even paved the way for computer-generated imagery in animated films. John Lasseter has said that without TRON, there would have been no Toy Story.
We make magic. But that magic isn’t just the compelling storytelling, the visual effects, the powerful adventures or experiences we deliver. No, the real magic is what endures those moments and begins a ripple effect on lives. People become inspired to try new things. New passions awaken. New worlds unfold. The work we do makes an impact that transcends the bottom line and propels us into the future as a species. We inform. We inspire. We improve our human experience, one story at a time.
Are you ready? It’s time to go play the game. Let’s go fight for the Users!
After research and talking with several solar companies, we decided on an 8.5kW Tesla Solar plus Powerwall+ system with their new high efficiency 425W panels. The main reasons for our decision were the low cost, handsomely framed panels and the whole house backup capability. Other solar companies had good backup systems but we did not find any who would provide whole house backup. And, more importantly, we were delightfully surprised to see that Tesla came in with the best price. Having said that, we would soon discover that they seemed to have significantly reduced overhead by mostly eliminating customer service.
25 September 2021
After ordering, reviewing designs, applying for HOA and City approval, we were finally ready to get the system installed. Two different crews arrived over a span of a week to get the system installed. The panels went on first. The panel install crew was professional and friendly. However, when they left I noticed that the handsome skirts (frames) we were so excited about were only installed on the front of the house (street facing roof). It looked great but I had expected to get them on the back as well. I reported it and in just a few days a technician came out and added the skirts to the back roof panels. He explained that they usually only install the skirts on the front. So, please note, if you want skirts on all your panels, make sure you let them know in advance. Also, the skirts are only put on the left, right and bottom. There are no skirts on the top to allow heat to escape from the panels during the hot summer.
While installing the skirts for the back, I noticed one of the panel edges was sticking up about 1/2″ higher than the rest. The technician tried to fix it but he didn’t have all the tools. He only had what was needed to install the skirts. He asked me if I had a Torx T30 driver. I didn’t but he explained how I could adjust the panels myself. I picked up a T30 at our Newhall True Value store. I climbed up on the roof and found the adjustment area. I used a vice grip on the screwdriver to get enough leverage (mostly because I’m pretty weak especially when I’m up on the roof). I was able to lower the panel 1/2″ so it was flush. It looked beautiful.
A week later, another crew showed up to wire it in and commission the system. This meant a day without power, but we were prepared for that. I tried not to be a nuisance, but couldn’t help but watch and ask questions. I made sure all of the crew had plenty of bottled water, Gatorade and snacks, including ice cream candy bars since it was so hot. They installed the Powerwalls in our garage and wired in the breaker panels and Tesla Gateway by the utility meter. After everything was installed, they powered it up and began the commissioning process.
During commissioning, the first signs of trouble started showing up. The installers downloaded the latest software updates but were unable to get the Solar Panels to work correctly. The solar assembly was only producing 160W in full sun which doesn’t even show up in the app. They tried for hours, upgrading, rebooting, calling. They eventually gave up after showing me that the Powerwall could power our house if they cut the mains (based on 22% charge from the factory). They explained that Tesla would send out a software patch to fix the Solar panels, most likely.
I contacted our Tesla Advisor to report the problem and to see if they had an update. After several days of emailing, texting and calling, I received a note from the advisor that our inspection would be scheduled in 3-4 weeks and he would contact the electrician about the problem. I spent another week requesting updates but my Advisor had gone radio silent. It turns out that this is a common experience with Tesla. Assuming best intentions, I can only imagine that the advisors are understaffed and overwhelmed. Regardless, it all results in a very frustrating experience for the customer.
I did manage to finally get an update and a promise to further investigate the issue. While I waited, I decided to do some more research on the system to see if I could find the problem myself…
HIGH VOLTAGE WARNING: I need to stop here and remind everyone that these systems contain extremely high voltages and are dangerous. Hopefully it goes without saying, but please be careful if you poke around inside these electrical boxes. High voltage can be fatal.
You can connect to the Tesla Gateway by scanning the QR code inside the box. It will have your phone connect to the Gateway’s access point. You will need to stay close to the gateway if you connect this way (and browse to https://192.168.91.1). However, keep in mind that it is also connected to your home network and if you know how to find the IP address, you can point your browser to that IP and log in as the Installer to see more details about the system. Your browser will likely require that you ignore the security certificate warning (more on this in the observations section below) and you will need to toggle the power switch to one of your Powerwalls but it will let you in. That is essentially their 2nd factor system to ensure you are authorized. Here is what my system looked like after installation on the main screen and on the “System” screen:
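If you would rather poke at the Gateway from a script than a browser, the same certificate warning applies, so the script has to be told to skip verification. Here is a minimal sketch using only the Python standard library. The helper names are mine, and the `/api/status` path is an assumption used for illustration; check what your own Gateway firmware actually answers to before relying on it.

```python
"""Hypothetical sketch: reaching the Gateway over HTTPS without certificate checks."""
import json
import ssl
import urllib.request

GATEWAY_AP_ADDRESS = "192.168.91.1"  # Gateway access point address mentioned above


def gateway_url(host, path="/api/status"):
    """Build an HTTPS URL for the Gateway. The default path is an assumed example."""
    return f"https://{host}{path}"


def insecure_context():
    """TLS context that skips certificate checks, like clicking past the browser warning."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode is changed
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_json(host, path="/api/status", timeout=5):
    """Fetch a URL from the Gateway and parse the response body as JSON."""
    url = gateway_url(host, path)
    with urllib.request.urlopen(url, context=insecure_context(), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(GATEWAY_AP_ADDRESS)` while joined to the Gateway’s access point, or passing the Gateway’s address on your home network instead, would attempt the request. Skipping certificate checks is only reasonable here because the device sits on your own local network.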
The System screen also shows details about the solar generation, Powerwalls and power usage:
Below the above list was a section for “Remote Meter” that would occasionally appear. This was particularly interesting:
Remote Meter (Vxxxxxxxxxxxxxxx) CT 1 (Solar): —W
That seemed odd. Also when I clicked on the solar panel icon on the main screen, it would say “Stale Meter Data” – that had me wondering if the solar meter was the real issue.
I first disabled the Solar Assembly by clicking “Disable” on the System screen.
I opened up the Inverter, the box above the first Powerwall. There is a small latch on the bottom that will unlock and let the panel swing up. I found a wooden dowel to prop it open so it wouldn’t bang on my head the whole time I was investigating.
I noticed that there was a box on the right that had a “n” LED flashing. The code on the box was the same code that was listed as the “Remote meter” in the system’s display (the one showing no power). Some more research and I discovered that this module is a Neurio W2-Tesla WiFi based current reader that sends the solar power data to the Gateway.
Neurio was recently purchased by Generac but you can still find manuals and some models for sale online. This particular model, W2, has been customized for Tesla. It is designed to connect to the access point of the Tesla Gateway and send the solar power data.
CT-1 Amp Probe Wire
The Neurio has a wire plugged in to the top in the CT-1 (current transformer) port. I traced it over to the solar inverter where a clamp is wrapped around the solar inverter output AC line to measure the amperage. I re-seated that connector.
I then noticed that there was an antenna jammed below it that was tucked to the left, under the massive metal inverter shield. I turned the antenna to the right, into the open, unshielded space.
As soon as I did these two things, the LED “n” on the box began to change and a tune started coming out of the box. It sounded like “I am connected now”. The flashing “n” became a solid blue light.
I went back to the System screen and re-activated the Solar Assembly by clicking “Enabled”. This takes several minutes and you will see the system go through and activate the solar arrays, test relays and impedance before the assembly comes online.
As soon as the Solar Assembly came online, I started seeing kW of power show up on the Systems screen. 5800W of power was coming in, fully powering the house and charging the Powerwalls!
Not so fast…
Sadly, just two hours later my elation was destroyed. The solar energy dropped back to zero.
I checked the inverter. Sure enough, the Neurio was flashing again. I attempted the above process again, several times, but no joy this time. It would chime and go green, but then start flashing again. Based on my research, the Neurio connects to the Tesla Gateway WiFi only. The beeps and flashes indicate that it is unable to connect to the Gateway WiFi.
One thought I had was to reach out to Neurio (which is now owned by Generac) to see if they could provide the API, pinout or schematics for the W2 device so I could troubleshoot at the firmware and component level. When I contacted them, Generac replied that the serial number for my device contains proprietary firmware by Tesla that they cannot support. They recommend that I contact Tesla at: 888-518-3752. Oh well, it was worth a shot.
I love a challenge. In fact, when something isn’t working, it is almost an addiction to me. I have to figure it out and fix it! So, I had two thoughts at this point. First, I wanted to see what the Neurio was actually doing. I thought about setting up an ESP8266 as a WiFi access point to intercept the Neurio’s communication attempts with the Gateway. But before doing that, it occurred to me to wonder what would happen if the system didn’t have a solar meter at all. In my investigation, I discovered that the solar power meter feature is often an add-on or post-install enhancement. Maybe this was more of an add-on feature than a requirement?
At the minimum, I wanted to see if there are alternatives to the Neurio in the Tesla configuration. Unfortunately, there isn’t an easy way to edit this data. I discovered that settings could only be set during the initial setup time. That would require running the setup wizard again. I decided to be bold and fire up the installation Wizard. At the bottom of the system portal is the “Run Wizard” link. Of course, I clicked it.
WARNING: I’m fairly confident that you can completely break your Tesla Solar setup using the Wizard, maybe even disable power to your house permanently. It is intended for installers. I’m taking the risk, but you should consider this first and be cautious about proceeding. I’m also fairly confident I’m going to void something in the process, but if you put something in my house, fair game, I must hack.
The wizard is straightforward. It requires you to Stop the system, but the settings are mostly intuitive. When I arrived at the Meter screen, it had 3 different meters displayed. I apologize, I did not take screenshots. I will update this blog if I capture them in the future, but the screens are very basic.
Two of the sensors were for the built-in CT’s used to measure the power in the Tesla Backup Gateway (you can see them on the main bus if you open the Gateway panel – which I did).
These tested extremely fast (subsecond), were working correctly and reported “good”. The third meter, a WiFi meter, was the Solar Meter (Neurio) and its status was Error, unable to connect. I clicked the connect button which reported it would take 3 minutes to configure the WiFi sensor. No shock, it didn’t work. I tried it 3 more times. The “Advanced” drop down allows you to add MAC address and IP, but this didn’t help. There was a “Delete” button. I thought it might be worth a try to delete and re-add. At the bottom were options to add “WiFi” or “Wired” CTs. I tried to add the Neurio (WiFi) again, multiple times, rebooting the Neurio occasionally to see if that would help. Nothing.
Here is where something interesting happened. The Wizard would NOT let me advance because the WiFi sensor was not healthy (connected). Hum… Well, I figured I would just have to delete it to see what other screens I could find in the Wizard. I deleted the Neurio. I advanced to the next screen and was presented with a “Warning – you do not have a solar sensor selected.” Naturally, I ignored that and continued.
Commissioned! I completed the Wizard setup and the system came back online. Surprisingly, the system screen looked basically the same but the dynamic flow diagram was actually working. There were no sensor errors or warnings. Power was flowing from the Grid to the House. It was the middle of the night so I signed off and went to bed.
I know what you are thinking. This is dangerous, right? I mean, we seem to have removed solar power observability from the platform. Will the Gateway and Inverter still know what to do? Well, it turns out… they do!
The Power of the Sun
Next morning, I woke to discover solar generation was charging the Powerwalls and our house was completely powered by the sun! I still want Tesla to fix the Neurio or, better yet, provide some hardwired CT to monitor solar power generation. I’m assuming that the display below means that the Gateway is computing the solar generation based on other CTs. In any case, my workaround is in place and we now have a working system again.
As I’m looking at my phone, I realize… I’m holding the power of the sun in the palm of my hand. Yes, that is a geeky Doc Ock reference. We are now powering our home with an ancient but reliable and self-regulating, thermonuclear fusion reactor… our sun.
The Return of Tesla
1 November 2021
I gave up on waiting on Tesla to respond to me about the Neurio. I figured it didn’t matter since I had a working system. A month after the install and I still didn’t have an inspection date. Then it happened. I received a text message and email from Tesla that my inspection was scheduled. There was NO DATE or TIME given. Instead of asking, I figured it didn’t matter. We would see what would happen.
The day of inspection had arrived. A surprise knock on the door and there was Ishmael from Tesla. He explained he was there to meet with the City inspector for the final inspection. I showed him the gear, the Powerwalls, the gateway and the breaker panels. He looked at me and asked, “Did the install crew not put on the warning labels?” Nope.
This was something I had noticed after the installers left. In the Tesla plans are specific instructions on where to place the red warning labels on all of the gear. It includes a helpful diagram for anyone wanting to know how to kill all power in case of emergency. I had raised this issue with my project advisor a few times, but as usual, he told me he would look into it and of course, nothing happened. I explained this to Ishmael who rolled his eyes, expressed apologies and said he would need to call to get the labels or it would not pass inspection. He would wait for the delivery and get them installed and ready for the City.
Shortly after meeting Ishmael, another Tesla vehicle pulled up. I figured it was the inspection stickers, but instead, it was Rocío, a Quality Assurance technician. She told me that her job was to make sure everything was installed correctly and running. I almost hugged her! I expressed my delight and appreciation that she would check on us. I explained everything that happened including how the installers said it must have been a Tesla software bug and gave up after trying for hours to get it to work. I also told her about the Neurio hack I had done to get it working. She was shocked, sympathetic and determined to fix the issue.
Naturally, Rocío attempted to reset the Neurio and discovered the same thing that I did, with the exception that she was able to get the Neurio to work if she held the connector, pressing on it in a certain way. “There is clearly something wrong with the hardware and it needs to be replaced,” she concluded. I hate to be cynical, but I was definitely thinking this new chapter in my Tesla adventure would turn into a multi-week RMA, repair order and a return visit that may get scheduled sometime next year, if I’m lucky.
To my delight, Rocío looked straight at me and said, “And we’re going to get this fixed today!” She was right! She made a phone call and 30 minutes later another Tesla van showed up with the replacement Neurio!
Rocío got it working. Less than 30 minutes later she had the entire system back online and working correctly. “That’s amazing!” I told her. She clearly saw my astonishment and said, “I used to be an installer, I know what’s needed.” Well, that was completely accurate. She didn’t stop there. She examined all the gear and climbed up on the roof to ensure all the panels were in good order.
Shortly after the good news, the warning labels arrived and were attached to the new gear, ready for the official inspection. I started passing out my sincere appreciation, candy bars, water and Gatorade to these brilliant Tesla soldiers that had come to save the day. After bidding farewell to our new friends, Rocío drove off on her shiny white stallion… uh, I mean Tesla van.
About 30 minutes later, the City Inspector arrived and after a quick survey of the installed gear with Ishmael, signed his approval. Now we are on to the Permission to Operate (PTO) by Southern California Edison.
3 November 2021
I should definitely learn to be more guarded in my optimism about this Tesla adventure. After two days of having the new Neurio re-installed, I started noticing something odd. After solar production, when I would expect the Powerwalls to kick in and power the house, I would see grid power start to show up and the Powerwalls drop to zero. It would only last for a few minutes then return to normal operations. I looked at the Powerwall Dashboard I set up and could even see the grid power spiking during the day when solar production was more than enough to power the house.
The grid power spikes did not exist before the new Neurio. I went out to look at the inverter. The Neurio’s purple light was mostly solid but would “flicker” blue. It was random, like a candle flame, not like the error condition of the previous Neurio. It was happening constantly as I watched. When the flickering would get bad, I would see the Powerwall drop to zero and grid power surge. There seemed to be a correlation. At any rate, I wasn’t going to let the flickering continue.
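If you log your own power readings, spikes like these are easy to flag automatically. Here is a minimal sketch of the idea, assuming each sample records solar, home and grid power in watts plus the battery charge level. The field names, thresholds and sample values are all made up for illustration, and positive grid power is assumed to mean importing from the grid.

```python
"""Hypothetical sketch: flag grid imports that solar or battery should have covered."""
from dataclasses import dataclass


@dataclass
class PowerSample:
    time: str
    solar_w: float      # solar production in watts
    home_w: float       # house load in watts
    grid_w: float       # assumed convention: positive = importing from the grid
    battery_pct: float  # Powerwall charge level in percent


def unexpected_grid_import(samples, min_grid_w=200.0, reserve_pct=20.0):
    """Return samples that pull from the grid when they should not need to."""
    flagged = []
    for sample in samples:
        importing = sample.grid_w > min_grid_w
        solar_covers_home = sample.solar_w >= sample.home_w
        battery_available = sample.battery_pct > reserve_pct
        if importing and (solar_covers_home or battery_available):
            flagged.append(sample)
    return flagged


# Made-up readings for illustration only.
samples = [
    PowerSample("12:00", 5800, 1200, 0, 80),
    PowerSample("12:05", 5700, 1300, 1300, 80),  # grid spike despite plenty of solar
    PowerSample("19:00", 0, 900, 0, 75),
    PowerSample("19:05", 0, 900, 900, 75),       # battery drops out after sunset
    PowerSample("03:00", 0, 600, 600, 10),       # battery below reserve, so expected
]
spikes = [sample.time for sample in unexpected_grid_import(samples)]
```

With these readings, `spikes` picks out the 12:05 and 19:05 samples. Lining those timestamps up against when the Neurio light misbehaves would be one way to test the correlation rather than eyeball it.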
HIGH VOLTAGE WARNING: I need to stop here again and remind everyone that these systems contain extremely high voltages and are dangerous. Hopefully it goes without saying, but please be careful if you poke around inside these electrical boxes. High voltage can be fatal.
I powered off the Neurio by unplugging the power next to the antenna at the bottom. I noticed the antenna was once again tucked under the massive shield. I guess that was the typical install. I changed it so it was pointing away from the inverter shield and reworked the cables to plug it back in. The Neurio went through the startup (flashing, then solid green, then blue and then purple). I watched it for a while and noticed it stayed solid purple, no flickering.
I don’t know if this was a fix or a sign of things to come. Other people have reported similar problems with the Neurio, including a YouTube video on how to reset it the way I did. It is rather shocking how unreliable this little box is. I understand it is a “revenue grade” meter which is likely why Tesla is using it, allowing them to report “Solar Renewable Energy Credits” (SRECs). The Inverter itself seems to have a decent meter without the Neurio which is why my workaround hack worked while waiting for the Neurio replacement. If the reset doesn’t work, I will likely revisit my “fix.”
5 January 2022
The “fix” was temporary. It appears to be a resource leak that requires the Neurio to be restarted. The good news is that Tesla finally recognized the instability and sent out a 21.44 firmware update that stopped the Powerwall from disabling solar when the Neurio goes into a bad state. Finally! I was planning on ripping the Neurio out after PTO, but now I don’t have to do that. I’m currently on firmware 21.44.1 and just heard from the community that others are seeing an upgrade to 22.1 that also upgrades the Neurio from firmware “1.6.1-Tesla” to “1.7.1-Tesla” (you have to access the vitals API and decode the protobuf binary payload to see this – see here). Hopefully that helps with stability.
Permission to Operate
2 February 2022
PTO, finally! Our utility company, Southern California Edison (SCE) granted permission to operate. It took Tesla several tries to get the PTO request submitted correctly. SCE was notifying us of all the transactions but we were not able to see the full application or help. Believe me, I tried! In any case, our installation adventure has finally come to an end. It has been seven months since we started this epic journey. It is good to finally have a fully operational system.
To be fair, we have had a working solar system with Powerwall backup since October, but without PTO. PTO means that our system is no longer in self-consumption test mode. We can now push excess solar production to the grid for a credit. For the first time ever, we see the grid graph going negative!
The Tesla Solar system has been an adventure for us. I don’t regret going with Tesla even though they have improvement areas, especially related to consumer experience. We love the look of the panels and the equipment. In typical Tesla fashion, the design is stunning and feels like quality. If you do select Tesla Solar, my advice is to plan on being the project manager. Stay on top of the details to keep things moving and make sure items are not dropped.
Regardless of who you go with, I recommend you set some non-negotiables to help you filter. Here were my non-negotiables:
Aesthetics – We wanted something that looked high-tech, neat, clean and symmetrical on our roof. I wanted the dark panels (no white lines) with a clean looking frame.
Whole House Backup – We wanted batteries capable of running our house over 24 hours in the event of a power outage and a system that would charge the batteries during the day even if power was out for extended days.
Home Automation and Monitoring APIs – I wanted a system I could hack, use tools to monitor, dashboard, trend and even make decisions on home automation components to optimize our energy usage.
Off the Grid – We wanted the system sized to allow us to be self sustaining with no need to use the grid, even at night. We fully expect that at some point Net Energy Metering will disappear or become less attractive, and the system will still provide us with a zero grid usage option.
This helped considerably. It narrowed the list down to a handful, and the lowest cost on our list was Tesla, who also had the best aesthetics IMHO. Now to be clear, as I mentioned above, Tesla is seeing explosive demand for their option and their customer service struggles a lot. I also had the opportunity to work with several incredible Tesla technicians who helped us.
I am extremely happy with our Tesla system and would recommend it to anyone, despite the bumps along the way. Tesla hit on all my non-negotiables and is elegant, fun and powerful. It has become a delightful hobby as well as a powerful utility for our green energy mission.
While a similar adventure may not be for everyone, if you are in the market for a Solar system, I still highly recommend checking out Tesla’s options. Use this link and you can save $300 if you do order and I get a reward too: http://ts.la/jason50054
I have to confess. I love toys. To me, this new Tesla Powerwall+ Energy system is a gigantic (and expensive) toy. I have thoroughly enjoyed tinkering with the system and building electronic accessories and software to manage it. As you have seen in this post, I wrote my own Python API library (pyPowerwall) and created a Powerwall Dashboard to better see what the system is doing over time (credit to other open source projects I mention below).
One thing the Tesla system is missing is a good instrument panel display. Sure, you can run the app all the time, but I wanted something that would show the solar production and other details like a physical dashboard, without opening an app. I built one. It is hanging next to the Powerwalls in our garage. Much to my wife’s initial trepidation, I also built one and hung it in our kitchen. It turns out that I’m not the only one to appreciate it… well, after a while anyway. 🙂
Here are some of my toys that I gladly share with you. Please reach out and let me know if you find these useful:
PowerwallWeb Dashboard – The Tesla App and web based portal present great animations showing the solar generation and usage. However, the information is very limited and not designed for visualizing the energy data in multiple ways. I wanted to see a year at a glance as well as the string data (how much power each group of solar panels on the roof is producing). I found this Grafana based dashboard and made some minor changes, including the addition of my own Python based Powerwall API Proxy. Here is a simple Python API module, pyPowerwall, to pull data from the Powerwall Gateway using your “customer” credentials: https://github.com/jasonacox/pypowerwall. If you want your own Powerwall Dashboard, it is fairly easy to set up with the instructions here using Docker Compose: https://github.com/jasonacox/Powerwall-Dashboard#powerwall-dashboard
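To give a feel for what pulling data with pyPowerwall looks like, here is a minimal sketch. It assumes the Gateway's aggregates payload has "solar", "load", "battery" and "site" entries, each with an "instant_power" value in watts. The helper names (summarize_aggregates, read_live) and the sample numbers are made up for illustration and are not part of the library. Check the pyPowerwall README for the current call signatures before relying on this.

```python
def summarize_aggregates(aggregates):
    """Pull instantaneous power (watts) for each meter out of an aggregates payload."""
    # Friendly name -> key assumed to be used in the Gateway payload
    meters = {"solar": "solar", "home": "load", "battery": "battery", "grid": "site"}
    aggregates = aggregates or {}  # tolerate a failed poll that returned nothing
    return {
        name: float(aggregates.get(key, {}).get("instant_power", 0.0))
        for name, key in meters.items()
    }


def read_live(host, password, email, timezone):
    """Query a Powerwall Gateway with pyPowerwall (needs `pip install pypowerwall`)."""
    import pypowerwall  # imported here so the helper above works without it

    pw = pypowerwall.Powerwall(host, password, email, timezone)
    summary = summarize_aggregates(pw.poll("/api/meters/aggregates"))
    summary["battery_level"] = pw.level()  # battery charge in percent
    return summary


# Made-up sample payload for illustration, not a real reading
sample = {
    "solar": {"instant_power": 3200},
    "load": {"instant_power": 1100},
    "battery": {"instant_power": -1500},
    "site": {"instant_power": -600},
}
print(summarize_aggregates(sample))
```

Keeping the parsing separate from the network call makes it easy to try the logic against saved payloads before pointing it at a real Gateway.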
Powerwall Wall Mounted Display – I really wanted to see the current solar generation and state of the Powerwall on a simple LED digital display. I 3D printed a Tesla themed case and installed the displays to show solar, house, battery, and grid power data. The display shows the solar production power at the top. House, Powerwall and Grid power data rotate through the middle display, and the battery level of the Powerwalls is at the bottom (89% in this picture). You can see a video of the display running below.
The display uses a WiFi enabled system-on-chip (SoC) ESP8266 WeMos controller and three simple TM1637 7-segment LED display modules. Naturally I used my own Arduino API library (TM1637TinyDisplay) for those and the pyPowerwall proxy to display the results. It would have been nice if Tesla had built a wall monitor to show vitals like this. I’m sure it would have been a nice animated OLED display of some sort. But this was fun. I needed to build another toy and I love my retro-LED display. If you want to build your own, I have open sourced the design and code and uploaded it here for you to use: https://github.com/jasonacox/Powerwall-Display
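The real firmware is Arduino code in the repository above, but the rotation idea is simple enough to sketch in Python. This is only an illustration of the layout described here (solar on top, house/Powerwall/grid rotating in the middle, battery level on the bottom). The function name, the reading keys and the sample numbers are all assumptions made up for this example.

```python
from itertools import cycle, islice

# Values that take turns on the middle display (keys are hypothetical)
MIDDLE_ROTATION = ("home", "battery", "grid")


def display_frames(get_reading):
    """Yield (top, middle_label, middle, bottom) tuples forever.

    get_reading is any callable returning a dict with "solar", "home",
    "battery", "grid" (watts) and "battery_level" (percent).
    """
    for name in cycle(MIDDLE_ROTATION):
        reading = get_reading()  # refresh the data for every frame
        yield (
            round(reading["solar"]),          # top display: solar production
            name,                             # which value the middle shows
            round(reading[name]),             # middle display: rotating value
            round(reading["battery_level"]),  # bottom display: battery percent
        )


# Made-up reading for illustration
fake = {"solar": 3200.0, "home": 1100.0, "battery": -1500.0, "grid": -600.0,
        "battery_level": 89.0}

for frame in islice(display_frames(lambda: fake), 4):
    print(frame)
```

On real hardware each frame would be pushed to the three LED modules with a short pause in between. Here the frames are just printed.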
I took note of several areas of concern and improvement during my investigation and problem solving. I have recorded them here.
WPA TKIP Command Access Point – The Tesla Gateway uses this weaker method to host its WiFi access point. As I discovered, the Neurio uses this same access point to send Solar Power data (if it works). WPA TKIP has been deprecated for security reasons, and more modern access points use WPA2 with AES encryption (WPA2-AES).
HTTPS Security Certificate – The HTTPS certificate the Gateway uses will create a browser warning (or error) when you go to the system control portal, either via your home network or via the access point at https://192.168.91.1.
Second Factor – For setup, the user is required to toggle the switch on a Powerwall as a 2nd factor to prove authorization, which is a good thing. That works well for me since my Powerwalls are locked in my garage, but if your Powerwalls are outside next to the Gateway, an attacker on-location could easily join and toggle without you even knowing.
IoT Sensors – The main problem on my system was the Neurio W2 WiFi based sensor. This IoT device sends the power data it measures back to the Gateway controller. Generally, this is an elegant way to handle transmitting sensor data between systems without having to wire things. The irony is that the Gateway and Inverter already have several wires and control signals between them. Why not add another wire and avoid any WiFi communication outages? Hopefully I will be able to replace my Neurio with a wired solution.
Solar System Plan – I asked the Tesla Advisor to provide me with the design plans developed for the City Permit. They do not provide this without asking. I am glad I asked. The plans have all the schematics for the wiring as well as the layout. I discovered several things that I wanted changed and was able to get them to update before they came onsite. If you wait until they come onsite, they may not have the materials to make the adjustment and, worse, could charge you for any changes.
I found the following GitHub projects, references and diagrams during my investigation into my Tesla Solar Adventure. I’m pasting them all here to be helpful for anyone else experiencing the same problems. The information may not be directly related but could provide a clue.