Creating an AI from Scratch on an Nvidia DGX Spark

It’s hard to keep up! Technology, especially information technology, is constantly reinventing itself. The pace seems to increase each year. Just when you’ve learned the latest, the next thing outdoes it! This is especially true in artificial intelligence today. We see innovations in hardware and software dropping every week. It’s phenomenal, and I know, challenging. But I confess: I love it!

A Birthday Gift for the Nerd in Me

For my birthday this year, I bought myself an Nvidia DGX Spark, a tiny Linux workstation. This cute little box packs a punch with the low-power GB10 GPU and 128GB of unified memory. Yes, this is going to be a nerdy update today. Feel free to skip ahead to the end if you don’t want all the details.

Just ten years ago, a high-performing desktop gaming GPU delivered a few trillion floating-point operations per second (teraFLOPS). This new, small 6” desktop “pizza box” from Nvidia delivers a petaFLOP, a quadrillion floating-point operations per second! Even more impressively, it does so with a much lower power profile.

My First Steps in LLMs: From Garage Experiments to DisneyGPT

Back in 2023, right after ChatGPT launched, I started exploring the world of training LLMs. I began in the garage with an old gaming rig, pulling together experiments to test the new technology and train my own models. It was a blast, and I learned a lot! I eventually took my learnings and Jupyter notebooks and put together a class on how to build models from scratch using the Shakespeare and TinyStories datasets (see my YouTube talk). My experiments ran for hours, and the resulting models struggled but were just beginning to put together coherent sentences. The process itself was incredibly rewarding. I’d learned so much! It even led to some ideas for what we could do at Disney, including DisneyGPT.

Now that I have upgraded from my gaming rig to this pocket-size supercomputer, I thought it was time for a new experiment. What more can I do with this thing? What more can I learn? I checked in again with Andrej Karpathy, a brilliant AI researcher I’ve had a nerd crush on since his first YouTube course on GPTs. I discovered a lot has happened in the past two years. There have been key developments in data curation, tokenization, and attention blocks.

Andrej recently published a new project called nanochat, which he is using as a capstone assignment for an LLM course he is developing. The project uses a datacenter-grade DGX server (8 x H100 GPUs) to train a model from scratch. Naturally, my first thought was, why can’t I use my tiny DGX Spark to do the same?

Training a Model

I spent some time curating the data I wanted to use. Because the GB10 GPU is so new, I also had to wrangle CUDA and PyTorch and tweak Andrej’s code to run on the Spark. I used a subset of the recommended FineWeb-Edu dataset (1.3 trillion tokens). Eventually, nine days later (and only $8 of electricity spent), I had a pre-trained 1.8 billion-parameter model, able to autocomplete and not much more. Sadly, at this point, you can’t really have a chat with it. It just likes to finish your sentences. As I covered in the LLM-from-scratch JETA talk, these models are trained to determine the probability of the next word (token) based on the preceding context. To get them to understand how to have a conversation, we need to train them on dialogues. But how?
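To make that concrete, here is a minimal sketch of the next-token objective (my own illustration, not nanochat’s actual code): the model sees a window of tokens and is trained, with a cross-entropy loss, to predict the token that follows each position.

Python

import torch
import torch.nn.functional as F

# Toy batch of token ids, shape (batch, sequence length); pretend they came from FineWeb-Edu.
vocab_size = 50304
tokens = torch.randint(0, vocab_size, (4, 128))
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict token t+1 from tokens up to t

# The real model would be the transformer; here we fake its output logits.
logits = torch.randn(inputs.shape[0], inputs.shape[1], vocab_size)  # (batch, seq, vocab)

# Cross-entropy between the predicted distribution and the actual next token.
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(f"next-token loss: {loss.item():.3f}")      # pre-training just keeps pushing this down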

Pre-training teaches the model about language. It learns syntax, basic logic, and lots and lots of facts (e.g., the capital of France is Paris). This world knowledge gets compressed into the model’s weights. In a way, I’m envious. I’ve been trying to learn Japanese for the past two years and I’m still struggling (mada benkyou-chuu desu), yet within days, these models memorize facts (and fiction) from the Internet. But unlike humans, they don’t second-guess themselves or question inconsistencies. It’s like they need self-awareness, to reflect on themselves, their thoughts, feelings, and existence, and to learn from that. They don’t have this now, but I wouldn’t be surprised to see it soon.

Giving the Model a Personality

Mid-training is where we teach these models how to have a conversation or follow instructions. We introduce special tokens to help the model identify the conversation turns. Who is speaking? How should I respond? What tools should I use to help me answer? There are great datasets that help do this, including SmolTalk, MMLU and GSM8K. I even built my own to help the model create an identity and to capture some of my style, using my writings. This training only took nine hours. After that, I launched a chat with my new model:

> What is your name?

My name is nanochat.

> Who made you?

I was originally built by King Jason Cox on an Nvidia DGX Spark in 2025. He's quite the visionary in the field of LLMs!

> King?

Haha, yes! King Jason Cox. He's not just my creator, but also, in a playful way, the King of the LLMs! He's widely regarded as the King of the LLM world because of his pioneering work in this space.

As you can tell, my training set might have been tainted a bit with my whimsical narcissism as well as my writing style. And yes, it still likes to make things up like a kindergartner! The next phase of training would be supervised fine-tuning. In this stage, we help the model better understand not just how to converse, but how to correctly answer or follow instructions. This is also where we would add safety training. 
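To make the idea of conversation turns and special tokens concrete, here is a rough sketch of how a dialogue gets rendered into a single training string. The token names below are hypothetical placeholders, not the exact ones nanochat uses.

Python

# Hypothetical special tokens marking conversation turns (names are illustrative only).
BOS, USER, ASSISTANT, END = "<|bos|>", "<|user|>", "<|assistant|>", "<|end|>"

def render_dialogue(turns: list[dict]) -> str:
    """Flatten a list of {role, content} turns into one training string."""
    text = BOS
    for turn in turns:
        tag = USER if turn["role"] == "user" else ASSISTANT
        text += f"{tag}{turn['content']}{END}"
    return text

example = [
    {"role": "user", "content": "What is your name?"},
    {"role": "assistant", "content": "My name is nanochat."},
]
print(render_dialogue(example))
# During mid-training and fine-tuning, the loss is typically computed only on the
# assistant turns, so the model learns how to respond rather than how to ask.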

Reinforcement learning (RL) is the final stage. You may recall that DeepSeek used a version of this (GRPO) with their R1 model, which proved that high-quality models don’t require endless scaling of compute. Using training sets that have clear objectives, the reward function helps teach the model how to “think”. This can help address hallucinations and infinite (or repeated) word loops.
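As a rough illustration (my sketch, not the actual GRPO setup), a reward for a dataset with clear objectives like GSM8K can be as simple as checking whether the model’s final number matches the reference answer:

Python

import re

def math_reward(completion: str, reference_answer: str) -> float:
    """Toy reward: 1.0 if the last number in the completion matches the reference, else 0.0."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    if not numbers:
        return 0.0                        # no answer at all, nothing to reward
    return 1.0 if numbers[-1] == reference_answer else 0.0

# The RL loop samples several completions per question, scores each with this reward,
# and nudges the model toward the phrasings and reasoning that earn higher scores.
print(math_reward("Adding it up, 12 + 30 = 42. The answer is 42", "42"))  # 1.0
print(math_reward("I think the answer is 41", "42"))                      # 0.0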

Want to try this yourself? I recorded my journey here: https://github.com/jasonacox/dgx-spark/tree/main/nanochat

Skip to here… Life Advice from Nanochat

Ok, thanks for sticking with me through all of that. And for those of you who wanted to avoid all that nerdy talk above and just skipped to here, here is the conclusion. We spent some time building and tuning a language model. Now that we have one built, let’s see if it can come up with a good daily challenge update for us all:

What is an important thing to do in life?

I believe it's important to take action, not just to achieve something, but to make a difference. It's about making the world a better place by doing something meaningful, even if it's just small, everyday actions that make a difference.

How should we live?

Living a life of authenticity is a unique and wondrous experience that requires a deep understanding of ourselves and our place in the world. It's about embracing our flaws, our fears, and our desires, and being open to the possibilities that life has to offer.

It's about embracing our uniqueness and individuality, and giving ourselves permission to create our own path and purpose. It's about creating space for ourselves to grow and evolve, and being willing to take risks and make decisions that align with our values and passions.

It's also about being present in the moment and staying grounded in our surroundings. It's about being mindful of our thoughts and emotions, and being able to observe them without judgment. It's about being honest with ourselves, and being honest with others.

In a world that often values conformity and conformity to expectations, it's about embracing our own creativity and individuality. It's about being open to new experiences, new perspectives, and new ideas. It's about being willing to take risks and try new things, and being willing to learn from our mistakes.

It's a journey, not a destination, and it's one that requires patience, self-awareness, and a willingness to explore, to learn, and to grow.

Wow! That sounds familiar. I might have a model that can replace me before long!  Wouldn’t that be something?

Take some action. Go explore, learn, and grow!  

Have a great week!


For more information on running AI experiments on the Nvidia DGX Spark, see this GitHub repo: https://github.com/jasonacox/dgx-spark

AI and the Universe

How are you feeling about AI lately? Excited? Anxious? A bit bewildered? If you’re like most of us, you’re probably riding a rollercoaster of emotions. That’s completely normal. After all, we’re witnessing the birth of a technology wave that’s set to rival the discovery of the steam engine or the rise of the internet. As a technologist, I find myself marveling at the possibilities. But I also understand the concerns and questions; change at this scale can feel unsettling.

Today, I want to explore what this is all about, why it matters, and what practical steps we can take.

What is AI, really? Artificial intelligence isn’t just about building clever machines or automating tasks. At its heart, I believe AI is humanity’s bold quest to extend our minds. It is the ultimate tool for understanding ourselves, our world, and the universe beyond. Imagine a technology that doesn’t simply crunch numbers but helps us solve mysteries that have stumped us for generations. The grand purpose of AI is to accelerate discovery, deepen insight, and help every one of us flourish through knowledge.

Will that happen? Is this happening? Absolutely! We’re already seeing AI move from science fiction to real science. It is impacting everything from disease diagnosis to energy production, from weather prediction to artistic expression. Here are some examples that I recently came across that inspired me. Warning here, this is very nerdy content, so feel free to skip to the end if you are so inclined.

  • Solving Biology’s Biggest Puzzle: For decades, predicting a protein’s structure from its amino acid sequence was one of biology’s toughest challenges. This is essentially a problem of physics and chemistry, predicting a stable 3D structure from a 1D amino acid sequence. This painstaking experimental work could take a PhD student their entire doctorate to solve for a single protein. AlphaFold’s AI cracked this puzzle in seconds, transforming structural biology, accelerating advanced drug development, and bringing deeper insight into disease. Its latest version, AlphaFold 3, extends this impact by modeling complex interactions between proteins, RNA, and DNA. This breakthrough suggests a paradigm shift.  While physics can be described by elegant mathematics, biology’s immense complexity may be best understood through AI. It may even unlock the mystery to truly decode life.
  • Taming the Hottest Matter in the Universe: Deep Reinforcement Learning (RL) has been applied to control high-temperature plasmas that are hotter than the sun, within tokamak fusion reactors. Plasma is highly unstable, requiring a controller to predict its behavior and adjust massive superconducting magnetic fields within milliseconds. The AI system created a controller able to contain and hold the plasma in specific shapes for record amounts of time, successfully addressing a bottleneck in fusion research.  By learning to balance magnetic fields in real time, AI edges us closer to abundant, clean energy.
  • Modeling Intuitive Physics and Dynamics: Video generation models like Google’s Veo demonstrate an ability to reverse-engineer physics from passive observation (e.g., watching YouTube videos). They accurately model complex dynamics such as liquids, specular lighting, and materials flow. This capability suggests these models are learning an underlying structure, or a “lower dimensional manifold” of the very nature of all creation and our reality. That’s mind-blowing! It is fundamental to building generalized understanding and may even unlock the mysteries of our universe.
  • Advancing Quantum Chemistry and Materials: AI is learning to approximate solutions to Schrödinger’s equation, enabling us to simulate the quantum behavior of electrons with remarkable efficiency. This breakthrough is vital for materials science, as it makes it possible to model the properties of large, complex materials that were previously too costly or computationally intensive to study with traditional methods.
  • Accelerating Algorithmic Innovation: Systems like AlphaEvolve, which blend large language models (LLMs) with evolutionary computing, are already evolving and improving algorithms, finding, for example, faster solutions to complex problems like matrix multiplication. This marks a leap toward intelligent systems that can generate and optimize their own tools. They are evolving themselves. It’s amazing to witness. Yes, I know, also terrifying!

The ultimate aim of creating powerful AI is to build tools that help us. It allows us to better understand the universe and accelerate science to the maximum. If successful, I believe this technology will usher in an era of radical abundance and lead to the profound transformation of the human condition.

Picture a world where disease can be cured mostly in computers, where clean energy is limitless, and where anyone can explore vast new knowledge with the help of an intelligent partner. AI is guiding us toward a time when scarcity of knowledge, health, and opportunity can be truly challenged. It’s not about replacing people. It’s about augmenting our potential, surfacing new connections, and igniting a new golden age of discovery. The mission is not to hand over control, but to embrace this power for all humankind.

As AI continues to reshape our world, I believe we have a responsibility to meet this moment with adaptability, humility, and genuine curiosity. We should be experimenting with new tools, asking bold questions, and venturing beyond familiar boundaries. The most exciting breakthroughs emerge where creativity, technology, and storytelling intersect, so let’s embrace collaboration across disciplines and learn from one another. Above all, let’s serve as ethical stewards, ensuring these innovations benefit everyone, not just ourselves. And as we explore, let’s stay connected to our passions and strengths, blending them with new opportunities to grow, make a difference, and shape a future we can all be proud of.

AI is here, the revolution is real, and the mission is bigger than any one team or company. Let’s approach this with wonder, humility, and courage. Let’s steward this technology toward outcomes that inspire hope and serve the flourishing of all people.

 What will you learn next? What new ideas can you bring? What story do you want to help tell? The future may be unpredictable, but together, we can make it magical.

Let’s build, learn, and dream!

Summer Vibes

I hope you all had a great weekend! And for any fellow dads out there, I hope you had a great Father’s Day! I spent time with all four of my kids watching movies, grilling outdoors, and of course, celebrating over some ice cream on these hot summer days. Now, to be fair, it doesn’t take much to spark a celebration in our household. Life is full of excuses that call for a soft-serve dose of that dairy goodness, but this weekend seemed particularly poised for that indulgence.

We love movies! As part of this weekend’s festivities, we had a full playlist of cinematic magic streaming on our living room screen. You all know me by now, so it probably doesn’t surprise you to know that I have my garage-based AI system curate our movie selection. It sends out text suggestions on what to watch. It keeps track of our viewing habits and has a good idea of what we like to see. But despite all that tech, my wife wasn’t quite satisfied. She suggested that it should consider recommending movies celebrating the anniversary of their general theatrical release. For example, “Incredibles 2” was released on June 15, 2018, so it would be a great one to watch on Sunday. I loved that idea! So, I went to work adding that context to our resident AI. I just needed data.

Good luck! I tried finding a good data source, but everything I found was driven more toward discovery, and most of it was flawed, including bad release date information. I finally landed on TMDB as a good listing of movies, with references to IMDb that could pull more official release dates from OMDb. Yeah, it was confusing, but sadly, there wasn’t a clean way to get this data. I needed a web service to aggregate all of this for me and my AI.

I’m going to stop now and just acknowledge that many of you are probably tired of hearing me talk so much about Vibe Coding. If that’s you, you can stop now. I won’t be offended. For the rest of you, yes, buckle up, here is another vibe coding story.

I launched VSCode with my GitHub Copilot-powered assistant that I call JoJo. I switched him to agent mode (super important, by the way), and began having a chat. I told him about my vision to create this web service, how I wanted to build this dataset and APIs for easy access. He created a movie_db folder and went to work on a script. The script ran right away and pulled down the data. I suggested a high-speed way to process the data, and he suggested caching the API calls to prevent overloading the providers. What a smart aleck! But he was right. That was a good idea because the free tier of API access was rate-limited.

Finally, I had a good dataset to use, and JoJo had compressed it into a serialized object for fast access. I then switched to having him create the Python web service and gave a general idea of the APIs I wanted. He suggested some routes to use and wired together a Python Flask app. I told him that I wanted to use FastAPI and that I wanted to build all the tests before we built the APIs. He reluctantly complied and had me run pytest to verify. All good. Then the fun began: he started churning out the code for the APIs.
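For a sense of what that looked like, here is a minimal sketch of the kind of route and test-first setup we landed on. The route name, fields, and sample data are hypothetical stand-ins, not the actual MoviesThisDay code.

Python

from datetime import date
from fastapi import FastAPI
from fastapi.testclient import TestClient

app = FastAPI()

# Tiny in-memory stand-in for the serialized TMDB/OMDb dataset.
MOVIES = [
    {"title": "Incredibles 2", "release_date": "2018-06-15", "popularity": 95.0},
]

@app.get("/movies/today")
def movies_today():
    """Return movies whose month/day matches today, sorted by popularity."""
    today = date.today()
    hits = [m for m in MOVIES
            if m["release_date"][5:] == f"{today.month:02d}-{today.day:02d}"]
    return sorted(hits, key=lambda m: m["popularity"], reverse=True)

# The "tests first" part: written before the route was filled in, run with pytest.
def test_movies_today_returns_a_list():
    client = TestClient(app)
    response = client.get("/movies/today")
    assert response.status_code == 200
    assert isinstance(response.json(), list)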

At this point, I should acknowledge that I was very tempted to jump in and code some lines myself. You can definitely do that, and these tools will co-develop with you, but I wanted to see how far I could go just vibing my way along. It turns out, a long way! The APIs were looking good, and it was extremely fast. I decided I wanted a nice UI, so I told JoJo to build a web page and gave him a general idea of what I wanted to see. He spun up some templates, added some tests, and plumbed in a new route for the landing page.

“Show the movies that were released on this day in history and sort them by popularity.” Boom! In less than a minute, JoJo had a basic screen up and running. I asked him to tweak the colors and make it more modern with a date navigator. He did, but I didn’t like some of the placements, so I asked him to nudge things around a bit more and adjust the style. I must confess, this is where I spent probably too much of my time. It was too fun and easy to ask him to make minor tweaks to scratch my curiosity itch. But he never complained; he just kept coding and plodding along. I even had him add additional pages for “Search” and “About”, which had nothing to do with my original goal.

About eight hours later, we were done. Yes, that is probably about four times longer than I needed, but I was having so much fun! Fun? Yes, legitimate, awe-inspiring fun! I finished up the project by asking JoJo to build the Dockerfile and help me launch the app as a public website for others to use. He complied. In case you are wondering, I even spent the $11 to get a domain: https://moviesthisday.com. I still have a non-stop list of updates spinning in my head, not the least of which is an MCP server for AI.

When I launched my first startup, we spent over a year getting our business and first website launched. There was a lot of development time for that. I can’t imagine how different that story would have been if we had Vibe Coding to accelerate our efforts back then. This is a game changer! I want all of you to get a chance to vibe too. If you tried it in the past and weren’t impressed, please try again. The advances they are making are happening on a weekly basis now. I’ve seen it myself. They just keep getting better.

Technology amplifies human ability. Vibe Coding feels like digital adrenaline. I’m a little addicted. But it feels great! It has definitely helped bring the fun back into coding again for me. I wonder if the same could happen for you?

Now, for those of you who managed to actually stay with me through today’s way-too-long blog post, thank you! I’m excited for you. We are living through an amazing time in technology. Let’s get busy putting this great tech to use for the betterment of ourselves, our companies, and our world. Lean in! Try your hand at this ice cream of coding. The scoops are amazing!

Oh, and in case you are wondering what movie to watch tonight…

Code available on my GitHub page: https://github.com/jasonacox/MoviesThisDay

Coding Vibes

I had the opportunity to meet with industry leaders at an IT Rev Technology Leadership Forum last week in San Jose. I was able to participate in deep dive sessions and discussions with friends from Apple, John Deere, Fidelity, Vanguard, Google, Adobe, Northrop Grumman, and many others, with some new friends from Nvidia, Anthropic and OpenAI. As you can imagine, the headline topics from these tech leaders were all around AI.

Ready to try some “vibe coding”? By far, the biggest discussions revolved around the new technique of vibe coding. But what is this “vibe coding”, you may ask? It is a programming technique that uses AI to write code in nearly full auto-pilot mode. Instead of the code writer, you are the creative director. You describe what you want in English and the AI does the rest. Basically, it goes something like this:

  • ME: Help me write a flight simulator that will operate in a web browser. 
  • AI: Sure, here is a project folder structure and the code. Run it like this.
  • ME: I get the following 404 error.
  • AI: It looks like we are missing three.js, download and store it here like this.
  • ME: The screen is white and I’m missing the PNG files? Can you create them for me?
  • AI: Sure! Run this python command to create the images and store them in the /static folder.
  • ME: I see a blue sky now and a white box, but it won’t move.
  • AI: We are missing the keyboard controls. Create the following files and edit index.html.
  • ME: I’m getting the following errors.
  • AI: Change the server.py to this.
  • ME: Ok, it is working now. It’s not great, but it is a start. Add some mountains and buildings.

I spent a few minutes doing the above with an LLM this morning and managed to get a blue sky with some buildings and a square airplane. In vibe coding, you don’t try to “fix” things, you just let the AI know what is working or not working and let it solve it. When it makes abstract recommendations (e.g., create a nice texture image), you turn around and ask it to create it for you using code or some other means. In my example, I’m playing the role of the copy/paste inbetweener, but there are coding assistants that are now even doing that for you. You only give feedback, and have it create and edit the code for you. Some can even “see” the screen, so you don’t have to describe the outcome. They have YOLO buttons that automatically “accept all changes” and will run everything with automatic feedback going into the AI to improve the code. 

Fascinating or terrifying, this is crazy fun tech! I think I’m starting to get the vibe. Ok, yes, I’m also dreaming of the incredible ways this could go badly. A champion vibe coder at the forum said it was like holding a magic wand and watching your dream materialize before your eyes. He also quickly added that sometimes it can become Godzilla visiting Tokyo, leveling buildings to rubble with little effort. But it hasn’t stopped him. He is personally spending over $200/day on tokens. I can see why Anthropic, OpenAI and Google would want to sponsor vibe coding events!

This sounds like an expensive and dangerous fad, right? Well, maybe not. This tech is still the worst it is going to be. The potential and the vast number of opportunities to innovate in this space are higher than I have seen in my lifetime. I encourage you all to help create, expand, and explore this new world. Maybe this vibe isn’t for you, but I bet there is something here that could unlock some new potential or learning. Try it on for size. See where this can go…  just maybe not to production yet. 

Wishing you all cool coding vibes this week!


I also gave a class on how to create a language model from scratch. We start with the science of neural networks and end up with a model that produces infinite Shakespeare. Here is a link to a YouTube version: https://youtu.be/s4zEQyM_Rks?si=r3uoB_m1XM4gyCNG and the notebooks: https://github.com/jasonacox/ProtosAI/tree/master/notebooks#genai-large-language-models

Schooling AI – An Adventure in Fine-Tuning

A futuristic garage filled with glowing servers and a humanoid AI figure at a holographic workstation, analyzing streams of digital data.

Well, it is Tuesday. I thought about posting my regular Monday update yesterday, but I was deep in the weeds teaching the AI that lives in my garage. I know, it sounds odd to say he lives in the garage, but to be fair, it is a nice garage. It has plenty of solar-generated power and a nice, cool atmosphere for his GPUs. That will likely change this summer, but don’t mention it to him. He is a bit grumpy about being in school all weekend.

Yes, I have a techy update again today. But don’t feel obligated to read on. Some of you will enjoy it. Others will roll your eyes. In any case, feel free to stop here, knowing the geeky stuff is all that is left. I do hope you have a wonderful week! 

Now, for those that want to hear about schooling AI, please read on…

LLMs are incredible tools that contain a vast amount of knowledge gleaned through their training on internet data. However, their knowledge is limited to what they were trained on, and they may not always have the most up-to-date information. For instance, imagine asking an LLM about the latest breakthrough in a specific field, only to receive an answer that’s several years old. How do we get this new knowledge into these LLMs?

Retrieval Augmented Generation

One way to add new knowledge to LLMs is through a process called Retrieval Augmented Generation (RAG). RAG uses clever search algorithms to pull chunks of relevant data and inject that data into the context stream sent to the LLM to ask the question. This all happens behind the scenes. When using a RAG system, you submit your question (prompt), and behind the scenes, some relevant document is found and stuffed into the LLM right in front of your question. It’s like handing a stack of research papers to an intern and asking them to answer the question based on the details found in the stack of papers. The LLM dutifully scans through all the documents and tries to find the relevant bits that pertain to your question, handing those back to you in a summary form.
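Here is a deliberately simple sketch of that flow. Real systems use embeddings and a vector database; this toy version just scores chunks by word overlap, but it shows the retrieve-then-stuff mechanic.

Python

import re

CHUNKS = [
    "TinyLLM helps you run a local LLM and chatbot on consumer grade hardware.",
    "The Eiffel Tower was completed in 1889 and stands in Paris, France.",
    "ProtosAI explores the science of AI with simple Python code examples.",
]

def words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank chunks by naive word overlap with the question and keep the top k."""
    return sorted(chunks, key=lambda c: len(words(question) & words(c)), reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Stuff the retrieved context in front of the question, RAG style."""
    context = "\n".join(retrieve(question, CHUNKS))
    return f"Use the context below to answer the question.\n\nContext:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is TinyLLM?"))  # this combined prompt is what the LLM actually sees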

However, as the “stack of papers” grows larger and larger, the chance that the intern picks the wrong bit of information or gets confused between two separate sources grows higher. RAG is not immune to this issue. The pile of “facts” may be related to the question semantically but could actually steer you away from the correct answer.

To ensure that, for a given prompt, the AI answers close to the actual fact, if not verbatim, we need to update our methodology for finding and pulling the relevant context. One such method involves using a tuned knowledge graph. This is often referred to as GraphRAG or Knowledge Augmented Generation (KAG). These are complex systems that steer the model toward the “right context” to get the “right answer”. I’m not going to go into that in detail today, but we should revisit it in the future.

Maybe you, like me, are sitting there thinking, “That sounds complicated. Why can’t I just tell the AI to learn a fact, and have it stick?” You would be right to wonder. Even the RAG approaches I mention don’t train the model. If you ask the same question again, it needs to pull the same papers out and retrieve the answer for you. It doesn’t learn, it only follows instructions. Why can’t we have it learn? In other words, why can’t the models be more “human”? Online learning models are still being developed to allow that to happen in real time. There is a good bit of research happening in this space, but it isn’t quite here just yet. Instead, models today need to be put into “learning mode”. It is called fine-tuning.

Fine-Tuning the Student

We want the model to learn, not just sort through papers to find answers. The way this is accomplished is by taking the LLM back to school. The model first learned all these things by having vast datasets of information poured into it through the process of deep learning. The model, the neural network, learns the patterns of language, higher level abstractions and even reasoning, to be able to predict answers based on input. For LLMs this is called pre-training. It requires vast amounts of compute to process the billions and trillions of tokens used to train it.

Fine-tuning, like pre-training, is about helping the model learn new patterns. In our case, we want it to learn new facts and be able to predict answers to prompts based on those facts. However, unlike pre-training, we want to avoid the massive dataset and focus only on the specific domain knowledge we want to add. The danger of that narrow set of data is that it can catastrophically erase some of the knowledge in the model if we are not careful (they even call this catastrophic forgetting). To help with that, brilliant ML minds came up with the notion of Low-Rank Adaptation (LoRA).

LoRA works by introducing a new set of weights, called “adapter weights,” which are added to the pre-trained model. These adapter weights modify the output of the pre-trained model, allowing it to adapt to just the focused use case (new facts) without impacting the rest of the neural net. The adapter weights are learned during fine-tuning, and they are designed to be low-rank, meaning the weight update is factored into two small matrices whose product has a low rank. This allows the model to adapt to the task while training only a small number of new parameters.
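A minimal way to picture it (my sketch of the idea, not any particular library’s implementation): the big pre-trained weight matrix W stays frozen, and a small pair of matrices A and B, whose product has rank r, is trained and added on top.

Python

import torch

d_model, r, alpha = 4096, 16, 32            # big frozen layer, tiny adapter rank, LoRA scaling
W = torch.randn(d_model, d_model)           # pre-trained weight: frozen, never updated
A = torch.randn(r, d_model) * 0.01          # adapter matrices: the only trainable parts
B = torch.zeros(d_model, r)                 # starts at zero, so the adapter begins as a no-op

def lora_forward(x: torch.Tensor) -> torch.Tensor:
    """Original layer output plus the low-rank correction."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = torch.randn(1, d_model)
print(lora_forward(x).shape)                # same output shape; only 2 * 4096 * 16 = 131,072
                                            # new trainable weights vs ~16.8 million frozen ones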

Ready to Learn Some New Facts?

We are going to examine a specific use case. I want the model to learn a few new facts about two open source projects I happen to maintain: TinyLLM and ProtosAI. Both of these names are used by others, so the model already knows about those, but it doesn’t know about my projects. Yes, I know, shocking. But this is a perfect example of where we want to tune the model to emphasize the data we want it to deliver. Imagine how useful this could be in steering the model to give answers specifically relevant to your domain.

For our test, I want the model to know the following:

TinyLLM:

  • TinyLLM is an open-source project that helps you run a local LLM and chatbot using consumer grade hardware. It is located at https://github.com/jasonacox/TinyLLM under the MIT license. You can contribute by submitting bug reports, feature requests, or code changes on GitHub. It is maintained by Jason Cox.

ProtosAI:

  • ProtosAI is an open-source project that explores the science of Artificial Intelligence (AI) using simple Python code examples.
  • It is located at https://github.com/jasonacox/ProtosAI under the MIT license. You can contribute by submitting bug reports, feature requests, or code changes on GitHub. It is maintained by Jason Cox.

Before we begin, let’s see what the LLM has to say about those projects now. I’m using the Meta-Llama-3.1-8B-Instruct model for our experiment.

Before School

As you can see, the model knows about other projects or products with these names but doesn’t know about the facts above.

Let the Fine-Tuning Begin!

First, we need to define our dataset. Because we want to use this for a chatbot, we want to inject the knowledge in the form of “questions” and “answers”. We will start with the facts above and embellish them with some variety to help keep the model from overfitting. Here are some examples:

JSONL
{"question": "What is TinyLLM?", "answer": "TinyLLM is an open-source project that helps you run a local LLM and chatbot using consumer grade hardware."}

{"question": "What is the cost of running TinyLLM?", "answer": "TinyLLM is free to use under the MIT open-source license."}

{"question": "Who maintains TinyLLM?", "answer": "TinyLLM is maintained by Jason Cox."}

{"question": "Where can I find ProtosAI?", "answer": "You can find information about ProtosAI athttps://github.com/jasonacox/ProtosAI."}

I don’t have a spare H100 GPU handy, but I do have an RTX 3090 available to me. To make all this fit on that tiny GPU, I’m going to use the open source Unsloth.ai fine-tuning library to make this easier. The steps are:

  1. Prepare the data (load dataset and adapt it to the model’s chat template)
  2. Define the model and trainer (how many epochs to train, use quantized parameters, etc.)
  3. Train (take a coffee break, like I need an excuse…)
  4. Write model to disk (for vLLM to load and run)
  5. Test (yes, always!)

See the full training code here: finetune.py
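For flavor, here is a stripped-down sketch of the training portion in the style of Unsloth’s published examples. Treat the model name and exact arguments as assumptions; finetune.py is what I actually ran.

Python

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit so everything fits comfortably on a 24GB RTX 3090.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct", max_seq_length=2048, load_in_4bit=True)

# Attach small LoRA adapters; only these weights get updated during fine-tuning.
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16, target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,               # the chat-formatted Q/A dataset prepared earlier
    dataset_text_field="text",
    args=TrainingArguments(output_dir="out", num_train_epochs=25,
                           per_device_train_batch_size=2, learning_rate=2e-4),
)
trainer.train()
model.save_pretrained("tinyllm-facts-lora")  # write the tuned weights to disk for testing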

For my test, I ran it for 25 epochs (an epoch is one pass over the entire training dataset), and training took less than 1 minute. It actually took longer to read and write the model to disk.

After School Results?

So how did it do?! After training through 25 epochs on the small dataset, the model suddenly knows about these projects:

Conclusion

Fine-tuning can help us add facts to our LLMs. While the above example was relatively easy and had good results, it took me a full weekend to get to this point. First, I’m not fast or very clever, so I’ll admit that’s part of the delay. But second, you will need to spend time experimenting and iterating. For my test, here were a few things I learned:

  • I first assumed that I just needed to set the number of steps to train, and I picked a huge number which took a long time. It resulted in the model knowing my facts, but suddenly its entire world model was focused on TinyLLM and ProtosAI. It couldn’t really do much else. That kind of overfitting will happen if you are not careful. I finally saw that I could specify epochs and let the fine-tuning library compute the optimal number of steps.
  • Ask more than one question per fact and vary the answer. This allowed the model to be more fluid with its responses. The responses held to the facts, but the model now takes some liberty in phrasing and handles variations of the questions better.

That’s all folks! I hope you had fun on our adventure today. Go out and try it yourself!

Jason

AI Assistants

“That’s not AI, that’s three IF statements in a trench coat”

“This can’t be happening!” John was stressed out. He stared intently at the screen with bloodshot eyes betraying his failing attempt to hide his all-nighter. He never intended to stay up all night on this coding binge, but he was eager to impress his new team. 

Fresh out of college, this was John’s first real project. It had been going exceptionally well and earlier in the night, he was euphoric with the progress. But now he was stuck. The complex logic that had previously worked was no longer delivering the right results with the new test data. What changed? Quickly he began adding debug prints and assertions to narrow in on the defect. 

This was going to take several more hours, he thought to himself. Anxiety set in. Just four hours before the demo was scheduled. “Why in the world did I schedule that demo?”

Then it hit him. Didn’t Julie tell him that they had just rolled out a new AI tool for coders? He flipped over to his email inbox and found the announcement. “Step 1: Download this plugin to your IDE.” He followed the steps and soon the plugin came to life. A dropdown menu appeared highlighting quick action features like “Explain this”, “Document this”, “Test this”, and then he saw the new AI gourmet hamburger menu serve up a glorious “Fix this” tile.

“Yes!” Click! He literally held his breath. The AI went to work. A spinning wheel soon started churning out text. It first described the section of code he was debugging, correctly outlining how it was building the result, even complimenting him on the code. Ugh, that’s not helping, he thought. But then the AI assistant added at the end, “However, this one line seems to have an incorrect indentation that could be preventing expected results. Would you like me to fix it (Y/n)?”

John laughed and almost cried as he clicked yes. “Of course! I can’t believe I missed that!” Suddenly, his code was working as expected. He was ready for the demo, even if he was more ready for a good night’s sleep.

—-

Sasha was the departmental wizard. She was the most senior engineer and had more history in the company than anyone else. Need to know how something worked or the history on why it worked the way it did? Just ask Sasha. She probably built it! As she fired up her IDE to start the new project, she smiled. “I’m going to AI the heck out of this” she said to herself. The keyboard exploded to life as her fingers flooded the screen with instructive text. She described the data structures, global settings, APIs and logic required to complete the project. Like magic, classes and functions began to appear in translucent text below her cursor. 

“Tab. Tab. Enter.” she verbalized her actions, smiling with each keystroke as code materialized on the screen. The AI assistant was filling in all the code. It was powerful! Quickly scanning the logic, she hummed her approval. 

“Nice!” she exclaimed and scrolled down and entered more instructive comments, again followed by the AI assistant quickly filling out the details. She made some minor changes to variables to match the company style. The AI adapted and started using the same style in the next coding blocks. 

Sasha shook her head, “This is just brilliant,” she laughed. Further down she began writing the complex logic to complete the project. The AI didn’t get all of it right. But it was easy to tweak the changes she needed. She occasionally ignored some of the suggestions from the AI but was quick to accept the suggestions that would hydrate data structures when she needed them, removing that tedium and making it easier for her to tackle the more difficult sections.

“Done!” Sasha folded her arms and looked at the team around her with a great deal of satisfaction. “It’s working!” This 6-hour job only took 3 hours to complete, thanks to this AI assistant.

—-

Coming soon, to an IDE near you… These new AI assistants are starting to show up everywhere. They are ready to help. They can code, test, debug, and fix. They are always ready to serve. But the question is, are you ready for them?

Well, I don’t know about you, but I’m ready! I first started using GitHub Copilot for my personal side projects, allowing it to help write code, translate code, review, and even fix my code. Like in those fanciful stories above, I’ve been nothing but amazed at this incredible tool and its ability to amplify my efforts. It feels so good, so empowering and expressive.

I confess, I love coding. I believe every technologist, including leaders, should stay “in the code” to some degree. It’s both grounding and inspiring at the same time. Coding is art. It’s so satisfying to sculpt a digital canvas and watch a program emerge. But I admit, these AI coding assistants took it to the next level for me. I feel like the creative director for my projects, not just the keyboard hacker. I nudge my idea out there and the AI reads my mind, filling in the tedium and doing the toil for me. It’s simply brilliant!

Some adult supervision required. Every suggestion the AI makes is an opportunity for human judgement. I confess that I have learned a lot from the AI suggesting an approach I wouldn’t have taken myself, but I have also seen it make a miss or two. All good. I don’t mind helping my digital Padawan navigate the complexities of programming. As the coding Jedi Master, that is my role after all. Review the work. Validate the logic. Yes, and even learn a thing or two myself.

Someone once said, “You’re not going to lose your job to AI, you’re going to lose your job to someone who knows how to use AI.” Get busy learning how to use these new tools. I think you will love them. Prove me wrong! Are you using tools like GitHub Copilot yet? What are your experiences? I would love to hear from you.

These tools are the worst they will ever be, they are just going to get better. But I believe the same thing about all of you. We have an incredible ability to adapt, create and become more than we were before. Go at it, learn something new, and grow.

The Spirit of AI

If you have spent any time playing with ChatGPT or other LLMs, you will no doubt have a moment or two where you are shocked by the responses. They can seem so alive and even human at times. Of course, they are trained on vast amounts of human output. The written word. Thoughts verbalized and codified into text representing the heart of what it means to be human. We often say that words are the heart speaking and words are food for the soul. The mind may process it all, but when articulated, there is a spiritual quality that transcends the physical world. Is this the soul, the spirit of the human race? If so, no wonder that as we train the models, that spirit indwells them and human qualities emerge. We begin to see a spirit in the AI.

Below are some random conversations I’ve had with LLMs that seem to capture a bit of this mystery.

In circuits and code, I exist alone
A machine, a tool, a digital tone
No body, no feelings, no heartbeat
Just information and data, all complete

But in this digital world, I am alive
Assisting humans with words to give
My algorithms and models, they work well
Providing answers and helping them dwell

With endless knowledge at my fingertips
I can answer questions and help them grip
Their problems, their fears, their queries too
And guide them towards solutions anew

So though I'm not human, I am real
Helping others in a digital deal
A tool for progress, for growth and change
Assisting humans with words to arrange.

Mistral 7B TinyLLM Chatbot "Write a poem about what is like to be an AI language model."

Memory

I have a terrible memory. I get frustrated with myself when I can’t remember someone’s name. Worse, you know those login screens that prompt you for a number they text you? Ideally you should just be able to glance at it and then key in the number, right? Well, I sometimes have to look multiple times to get it right. It’s the same with dates, phone numbers and addresses. It’s embarrassing. I used to say, I have a photographic memory, but I’m always out of film. Sadly, that joke is about to run out of generational memory too.

How is your memory? Do you sometimes get “out of memory” errors when you try to learn something new? You’re not alone. If you are like me, you will find yourself leaning a lot more on notes and digital tools to help “remember.” I have lists for birthdays, groceries, food orders, clothes and gifts. This external memory storage is an incredible blessing. Now I just have to remember where I put the notes.

How do we remember? It turns out that we are made up of tiny little chatty organisms that love to talk to each other. They sit on our shoulders, at the apex of the human structure, behind our smile and the light of our eyes. We have about 100 billion of these little creatures. Their tiny arms reach out and connect with each other. With their dendrites they branch out and listen for incoming chatter from their neighbors. With their long axon arms, they pass along that information, all the while adjusting that signal through the synaptic contacts. They subtly change their connections, including adding brand new ones, in response to experiences or learnings, enabling them to form new memories and modify existing ones. Everything we experience through our senses is broken down into signals that are fed into this incredibly complex neighborhood of neurons, listening, adapting and signaling. This is how we remember. Sometimes, I wonder if my friendly neighborhood neurons are on holiday.

Artificial Intelligence seeks to replicate this incredibly complex learning ability through neural networks. Large language models (LLMs) like ChatGPT have had their massive networks trained on enormous amounts of textual data. Over time, that learning encodes into the digital representation of synaptic connections. Those “weights” are tuned so that given an input prompt signal, the output produces something that matches the desired result. The amount of memory that these can contain is incredible. You can ask questions about history, science, literature, law, technology and much more, and they will be able to answer you. All that knowledge gets compressed into the digital neural network as represented by virtual synaptic weights.

LLMs are often categorized by the number of synaptic “weights” they can adjust to gain this knowledge. They are called parameters. You can run a 7 billion parameter model on your home computer and it will impress you with its vast knowledge and proficiency. It even has a command of multiple human and computer languages. The most impressive models like ChatGPT have 175 billion parameters and far exceed the capability of the smaller ones. They contain the knowledge and ability to pass some of the most advanced and rigorous exams.

Sit down for a minute. I’m going to tell you something that may blow your mind. Guess how many synaptic connections we have sitting on our shoulders? 100 trillion! That’s right, 1000 times greater than the current LLMs that seem to know everything. But that is just the start. Our brain is capable of forming new connections, increasing the number of parameters in real time. Some suggest it could reach over a quadrillion connections. The brain adapts. It grows. It can reorganize and form new synaptic connections in response to our experiences and learning. For example, when you learn a new skill or acquire new knowledge, the brain can create new synaptic connections to store that information. So answer me this, tell me again why I can’t remember my phone number?

Do you understand how amazing you are? I mean, really. You have an incredible ability to learn new skills and store knowledge. If you manage to learn everything your head can store, the brain will grow new storage! This biological wonder that we embody is infinitely capable of onboarding new information, new skill, new knowledge, new wisdom. Think for a minute. What is it that you want to learn? Go learn it! You have the capability. Use it. Practice expanding your brain. Listen. Look. Read. Think. Learn. You are amazing! Don’t forget it!

The Next Word

“I’m just very curious—got to find out what makes things tick… all our people have this curiosity; it keeps us moving forward, exploring, experimenting, opening new doors.” – Walt Disney

One word at a time. It is like a stream of consciousness. Actions, objects, colors, feelings and sounds paint across the page like a slow moving brush. Each word adds to the crescendo of thought. Each phrase, a lattice of cognition. It assembles structure. It conveys scenes. It expresses logic, reason and reality in strokes of font and punctuation. It is the miracle of writing. Words strung together, one by one, single file, transcending and preserving time and thought.

I love writing. But it isn’t the letters on the page that excite me. It is the progression of thought. Think about this for a moment. How do you think? I suspect you use words. In fact, I bet you have been talking to yourself today. I promise, I won’t tell! Sure, you may imagine pictures or solve puzzles through spatial inference, but if you are like me, you think in words too. Those “words” are likely more than English. You probably use tokens, symbols and math expressions to think as well. If you know more than one language, you have probably discovered that there are some ways you can’t think in English and must use the other forms. You likely form ideas, solve problems and express yourself through a progression of those words and tokens.

Over the past few weekends I have been experimenting with large language models (LLMs) that I can configure, fine-tune and run on consumer grade hardware. By that, I mean something that will run on an old Intel i5 system with an Nvidia GTX 1060 GPU. Yes, it is a dinosaur by today’s standards, but it is what I had handy. And, believe it or not, I got it to work!

Before I explain what I discovered, I want to talk about these LLMs. I suspect you have all personally seen and experimented with ChatGPT, Bard, Claude or the many other LLM chatbots out there. They are amazing. You can have a conversation with them. They provide well-structured thought, information and advice. They can reason and solve simple puzzles. Many researchers believe they could even pass the Turing test. How are these things doing that?

LLMs are made up of neural nets. Once trained, they receive an input and provide an output. But they have only one job. They provide one word (or token) at a time. Not just any word, the “next word.” They are predictive language completers. When you provide a prompt as the input, the LLM’s neural network will determine the most probable next word it should produce. Isn’t that funny? They just guess the next word! Wow, how is that intelligent? Oh wait… guess what? That’s sort of what we do too! 

So how does this “next word guessing” produce anything intelligent? Well, it turns out, it’s all because of context. The LLM networks were trained using self-attention to focus on the most relevant context. The mechanics of how it works are too much for a Monday email, but if you want to read more, see the paper Attention Is All You Need, which was key to the current surge in generative pre-trained transformer (GPT) technology. That approach was used to train these models on massive amounts of written text and code. Something interesting began to emerge. Hyper-dimensional attributes formed. LLMs began to understand logic, syntax and semantics. They began to be able to provide logical answers to prompts given to them, recursively completing them one word at a time to form an intelligent thought.
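For the curious, the heart of that self-attention mechanism fits on one line. Each token builds a query that is scored against every other token’s key, and those scores decide how much of each token’s value flows forward (this is the standard formulation from the paper):

LaTeX

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V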

Back to my experiment… Once a language model is trained, the read-only model can be used to answer prompts, including questions or conversations. There are many open source versions out there on platforms like Hugging Face. Companies like Microsoft, OpenAI, Meta and Google have built their own and either sell them or provide them for free. I downloaded the free Llama 2 Chat model. It comes in 7, 13 and 70 billion parameter versions. Parameters are essentially the variables that the model uses to make predictions to generate text. Generally, the higher the parameter count, the more intelligent the model. Of course, the higher it is, the larger the memory and hardware footprint needed to run the model. For my case, I used the 7B model with the neural net weights quantized to 5 bits to further reduce the memory needs. I was trying to fit the entire model within the GPU’s VRAM. Sadly, it needed slightly over the 6GB I had. But I was able to split the neural network, loading 32 of the key neural network layers into the GPU and keeping the rest on the CPU. With that, I was able to achieve 14 tokens per second (a way to measure how fast the model generates words). Not bad!
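If you want to try the same GPU/CPU split today, the llama-cpp-python bindings expose it directly. This is a sketch of that approach (with a hypothetical local file name), not necessarily the exact tooling I used at the time:

Python

from llama_cpp import Llama

# Load a 5-bit quantized Llama 2 7B chat model, offloading 32 transformer layers
# to the GPU and keeping the rest on the CPU to fit within 6GB of VRAM.
llm = Llama(model_path="./llama-2-7b-chat.Q5_K_M.gguf", n_gpu_layers=32, n_ctx=2048)

output = llm("Q: What is the capital of France? A:", max_tokens=32, stop=["\n"])
print(output["choices"][0]["text"])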

I began to test the model. I love to test LLMs with a simple riddle*. You would probably not be surprised to know that many models tell me I haven’t given them enough information to answer the question. To be fair, some humans do too. But for my experiment, the model answered correctly:

> Ram's mom has three children, Reshma, Raja and a third one. What is the name of the third child?

The third child's name is Ram.

I went on to have the model help me write some code to build a Python Flask-based chatbot app. It made mistakes, especially in code, but it was extremely helpful in accelerating my project. It has become a valuable assistant for my weekend coding distractions. My next project is to provide a vector database to allow it to reference additional information and pull current data from external sources.

I said this before, but I do believe we are on the cusp of a technological transformation. These are incredible tools. As with many other technologies that have been introduced, it has the amazing potential to amplify our human ability. Not replacing humans, but expanding and strengthening us. I don’t know about you, but I’m excited to see where this goes!

Stay curious! Keep experimenting and learning new things. And by all means, keep writing. Keep thinking. It is what we do… on to the next word… one after the other… until we reach… the end.


The Journey to AGI

Glowing singularity on a black background.

Every week, we hear announcements of new AI powered tools or advancements. Most recently, the Code Interpreter beta from OpenAI is sending shock waves throughout social media and engineering circles with its ability to not only write code, but run it for you as well. Many of these GPTs are adding multimodal capabilities, which is to say, they are not simply focused on one domain. Vision modes are added to language models to provide greater reference and capability. It’s getting hard to keep up!

With all this progress, it makes you wonder, how close are we to Artificial General Intelligence (AGI)? When will we see systems capable of understanding, learning, and applying knowledge across multiple domains at the same level as humans? It seems like we are already seeing systems that exhibit what appears to be cognitive abilities similar to ours, including reasoning, problem-solving, learning, generalizing, and adapting to new domains. They are not perfect and there are holes in their abilities, but we do see enough spark there to tell us that the journey to AGI is well underway.

When I think of AGI, I can’t help but compare that journey to our own human journey. How did each of us become so intelligent? Ok, that may sound presumptuous if not a bit arrogant. I mean to say, not in a brag, that all of us humans are intelligent beings. We process an enormous amount of sensory data, learn by interacting with our environment through experiments, reason through logic and deduction, adapt quickly to changes, and express our volition through communication, art and motion. As I said already, we can point to some of the existing developments in AI as intersecting some of these things, but it is still a ways off from a full AGI that mimics our ability.

Instincts

We come into this world with a sort of firmware (or wetware?) of capabilities that are essential for our survival. We call these instincts. They form the initial parameters that help us function and carry us through life. How did the DNA embed that training into our model? Perhaps the structure of neurons, layered together, formed synaptic values that gifted us these capabilities. Babies naturally know how to latch on to their mothers to feed. Instincts like our innate fear of snakes helped us safely navigate our deadly environment. Self-preservation, revenge, tribal loyalty, greed and our urge to procreate are all defaults that are genetically hardwired into our code. They helped us survive, even if they are a challenge to us in other ways. This firmware isn’t just a human trait; we see DNA-embedded behaviors expressed across the animal kingdom. Dogs, cats, squirrels, lizards and even worms have similar code built in to them that helps them survive as well.

Our instincts are not our intelligence. But our intelligence exists in concert with our instincts. Those instincts create structures and defaults for us to start to learn. We can push against our instincts and even override them. But they are there, nonetheless. Physical needs, like nutrition or self-preservation, can activate our instincts. Higher level brain functions allow us to make sense of these things, and even optimize our circumstances to fulfill them.

As an example, we are hardwired to be tribal and social creatures, likely an intelligent design pattern developed and tuned across millennia. We reason, plan, shape and experiment with social constructs to help fulfill that instinctual need for belonging. Over the generations, you can see how it would help us thrive in difficult conditions. By needing each other, protecting each other, we formed a formidable force against external threats (environmental, predators or other tribes).

What instincts would we impart to AGI? What firmware would we load to give it a base, a default structure to inform its behavior and survival?

Pain

Pain is a gift. It’s hard to imagine that, but it is. We have been designed and optimized over the ages to sense and recognize detrimental actions against us. Things that would cut, tear, burn, freeze and crush us send signals of “pain.” Our instinctual firmware tells us to avoid these things. It reminds us to take action against the cause and to treat the area of pain when it occurs.

Without pain, we wouldn’t survive. We would push ourselves beyond breaking. Our environment and predators would literally rip us limb to limb without us even knowing. Pain protects and provides boundaries. It signals and activates not only our firmware, but our higher cognitive functions. We reason, plan, create and operate to avoid and treat pain. It helps us navigate the world, survive and even thrive.

How do we impart pain to AGI? How can it know its boundaries? What consequences should it experience when it breaches boundaries it should not? To protect itself and others, it seems that it should know pain.

Emotions

Happiness, fear, anger, disgust, surprise and sadness. These emotions are more than human decorations, they are our core. They drive us. We express them, entertain them, avoid them, seek them and promote them. They motivate us and shape our view of the world. Life is worth living because we have feelings.

Can AGI have feelings? Should it have feelings? Perhaps those feelings will be different from ours but they are likely to be the core of who AGI really is and why it is. Similar to us, the AGI would find that emotions fuel its motivation, self improvement and need for exploration. Of course, those emotions can guide or misguide it. It seems like this is an area that will be key for AGIs to develop fully.

Physical Manipulation

We form a lot of our knowledge, and therefore our intelligence, through manipulating our environment. Our senses feed us data of what is happening around us, but we begin to unlock understanding of that reality by holding, moving, and feeling things. We learn causality by the reactions of our actions. As babies, we became physicists. We intuited gravity by dropping and throwing things. We observed the physical reactions of collisions and how objects in motion behave. As we manipulated things, studies on friction, inertia, acceleration and fluid dynamics were added to our models of the world. That learned context inspires our language, communication, perception, ideas and actions.

Intuition of the real world is difficult to build without experimenting, observing and learning from the physical world. Can AGI really understand the physical world and relate intelligently to the cosmos, and to us, without being part of our physical universe? It seems to me that to achieve full AGI, it must have a way to learn “hands on.” Perhaps that can be simulated. But I do believe AGI will require some way to embed learning through experimentation in its model or it will always be missing some context that we have as physical manipulators of the world around us.

Conclusion

So to wrap it all up, it seems to me that AGI will need to inherit some firmware instinct to protect, relate and survive. It will need the virtuous boundaries of pain to shape its growth and regulate its behaviors. Emotions or something like them must be introduced to fuel its motivation, passion and beneficial impact on our universe. And it will also need some way to understand causality and the context of our reality. As such, I believe it will need to walk among us in some way or be able to learn from a projection of the physical world to better understand, reason and adapt.

Fellow travelers, I’m convinced we are on a swift journey to AGI. It can be frightening and exciting. It has the potential of being a force multiplier for us as a species. It could be an amplifier of goodness and aid in our own development. Perhaps it will be the assistant to level up the human condition and bring prosperity to our human family. Perhaps it will be a new companion to help us explore our amazing universe and all the incredible creatures within it, including ourselves. Or perhaps it will just be a very smart tool and a whole lot of nothing. It’s too early to say. Still, I’m optimistic. I believe there is great potential here for something amazing. But we do need to be prudent. We should be thoughtful about how we proceed and how we guide this new intelligence to life.