
The “Careers in Tech” panel at TechX Florida / Reasons to be optimistic 2025

The Careers in Tech panel

On Saturday, I had the honor of speaking on the Careers in Tech panel at TechX Florida, which was organized by USF’s student branch of the IEEE Computer Society.

On the panel with me were:

We enjoyed speaking to a packed room…

…and I enjoyed performing the “official unofficial song of artificial intelligence” at the end of the panel:

Reasons to be optimistic 2025

During the panel, a professor in the audience asked an important question on behalf of the students there: In the current tech industry environment, what are the prospects for young technologists about to enter the market?

I was prepared for this kind of question and answered that technological golden ages often come at the same time as global crises. I cited the examples from this book…

Thank You for Being Late, by Thomas Friedman, who proposed that 2007 was “one of the single greatest technological inflection points since Gutenberg…and we all completely missed it.”

The reason many people didn’t notice the technological inflection point is that it was eclipsed by the 2008 financial crisis.

During the dark early days of the COVID-19 pandemic and shutdown, the people from Techstars asked me if I could write something uplifting for the startupdigest newsletter. I wrote an article called Reasons for startups to be optimistic, where I cited Friedman’s theory and put together a table of big tech breakthroughs that happened between 2006 and 2008.

In answering the professor’s question, I went through the list, reciting each breakthrough. The professor smiled and replied “that’s a long list.”

If you need a ray of hope, I’ve reproduced the list of interesting and impactful tech things that came about between 2006 and 2008 below. Check it out, and keep in mind that we’re currently in a similar time of tech breakthroughs that are being eclipsed by crises around the world.

Airbnb

In October 2007, as a way to offset the high cost of rent in San Francisco, roommates Brian Chesky and Joe Gebbia came up with the idea of putting an air mattress in their living room and turning it into a bed and breakfast. They called their venture AirBedandBreakfast.com, which later got shortened to its current name.

This marks the start of the modern web- and app-driven gig economy.

Android

The first version of Android as we know it was announced on September 23, 2008 on the HTC Dream (also sold as the T-Mobile G1).

Originally started in 2003 and bought by Google in 2005, Android was at first a mobile operating system in the same spirit as Symbian or, more importantly, Windows Mobile — Google was worried about competition from Microsoft. The original spec was for a more BlackBerry-like device with a keyboard, and did not account for a touchscreen. This all changed after the iPhone keynote.

App Store

Apple’s App Store launched on July 10, 2008 with an initial 500 apps. At the time of writing (March 2020), there were close to 2 million.

In case you don’t remember, Steve Jobs’ original plan was to not allow third-party developers to create native apps for the iPhone. Developers were directed to create web apps. The backlash prompted Apple to allow developers to create apps, and in March 2008, the first iPhone SDK was released.

Azure

Azure, Microsoft’s foray into cloud computing, and the thing that would eventually bring about its turnaround after Steve Ballmer’s departure, was introduced at their PDC conference in 2008 — which I attended during the second week of my job there.
Bitcoin

The person (or persons) going by the name “Satoshi Nakamoto” started working on the Bitcoin project in 2007.

It would eventually lead to cryptocurrency mania, crypto bros, HODL and other additions to the lexicon, one of the best Last Week Tonight news pieces, and give the Winklevoss twins their second shot at technology stardom after their failed first attempt with a guy named Mark Zuckerberg.

Chrome

By 2008, the browser wars were long done, and Internet Explorer owned the market. Then, on September 2, Google released Chrome, announcing it with a comic illustrated by Scott “Understanding Comics” McCloud, and starting the Second Browser War.

When Chrome was launched, Internet Explorer had about 70% of the browser market. In less than 5 years, Chrome would overtake IE.

Data: bandwidth costs and speed

In 2007, bandwidth costs dropped dramatically while transmission speeds rose just as dramatically.
Dell returns

After stepping down from the position of CEO in 2004 (but staying on as Chairman of the Board), Michael Dell returned to the role on January 31, 2007 at the board’s request.
DNA sequencing costs drop dramatically

The end of 2007 marks the first time that the cost of genome sequencing dropped dramatically — from the order of tens of millions of dollars to single-digit millions. Today, that cost is about $1,000.
DVD formats: Blu-Ray and HD-DVD

Two competing high-definition optical disc formats hit the market in 2006, and by early 2008 their format war was over. You probably know which one won.
Facebook

In September 2006, Facebook expanded beyond universities and became available to anyone over 13 with an email address, making it available to the general public and forever altering its course, along with the course of history.
Energy technologies: Fracking and solar

Growth in these two industries helped turn the US into a serious net energy producer, which would help drive the tech boom of the 2010s.
GitHub

Originally founded as Logical Awesome in February 2008, GitHub’s website launched that April. It would grow to become an indispensable software development tool, and a key part of many developer resumes (mine included). It would first displace SourceForge, which used to be the place to go for open source code, and eventually become part of Microsoft’s apparent change of heart about open source when they purchased the company in 2018.
Hadoop

In 2006, developer Doug Cutting of Apache’s Nutch project took GFS (Google File System, written up by Google in 2003) and the MapReduce algorithm (written up by Google in 2004) and combined them with the dataset tech from Nutch to create the Hadoop project. He gave the project the name his son gave to his yellow toy elephant, hence the logo.

By enabling applications and data to be run and stored on clusters of commodity hardware, Hadoop played a key role in creating today’s cloud computing world.
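
To make the idea concrete, here’s a toy, single-machine sketch of the map-shuffle-reduce pattern that Hadoop runs at cluster scale (the classic word-count example, not Hadoop’s actual API):

from collections import defaultdict

# Map: turn each record into intermediate key/value pairs.
def map_phase(line):
    for word in line.split():
        yield word.lower(), 1

# Reduce: combine all values that share a key.
def reduce_phase(key, values):
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog", "the fox"]

# Shuffle: group intermediate values by key (Hadoop does this across machines).
groups = defaultdict(list)
for line in lines:
    for key, value in map_phase(line):
        groups[key].append(value)

counts = dict(reduce_phase(key, values) for key, values in groups.items())
print(counts)  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}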

Intel introduces non-silicon materials into its chips

January 2007: Intel’s PR department called it “the biggest change to computer chips in 40 years,” and they may have had a point. The new materials that they introduced into the chip-making process allowed for smaller, faster circuits, which in turn led to smaller and faster chips, which are needed for mobile and IoT technologies.
Internet crosses a billion users

This one’s a little earlier than our timeframe, but I’m including it because it helps set the stage for all the other innovations. At some point in 2005, the internet crossed the billion-user line, a key milestone in its reach and other effects, such as the Long Tail.
iPhone

On January 9, 2007, Steve Jobs said the following at his keynote: “Today, we’re introducing three revolutionary new products…an iPod, a phone, and an internet communicator…Are you getting it? These are not three separate devices. This is one device!”

The iPhone has changed everyone’s lives, including mine. Thanks to this device, I landed my (current until recently) job, and right now, I’m working on revising this book.

iTunes sells its billionth song

On February 22, 2006, Alex Ostrovsky from West Bloomfield, Michigan purchased Coldplay’s Speed of Sound on iTunes, and it turned out to be the billionth song purchased on that platform. This milestone proved to the music industry that it was possible to actually sell music online, forever changing an industry that had been thrashing since the Napster era.
Kindle

Before tablets or large smartphones came Amazon’s Kindle e-reader, which came out on November 19, 2007. It was dubbed “the iPod of reading” at the time.

You might not remember this, but the first version didn’t have a touch-sensitive screen. Instead, it had a full-size keyboard below its screen, in a manner similar to phones of that era.

Macs switch to Intel

The first Intel-based Macs were announced on January 10, 2006: The 15″ MacBook Pro and iMac Core Duo. Both were based on the Intel Core Duo.

Motorola’s consistent failure to produce chips with the kind of performance that Apple needed on schedule caused Apple to enact their secret “Plan B”: switch to Intel-based chips. At the 2005 WWDC, Steve Jobs revealed that every version of Mac OS X had been secretly developed and compiled for both Motorola and Intel processors — just in case.

We may soon see another such transition: from Intel to Apple’s own A-series chips.

Netflix

In 2007, Netflix — then a company that mailed rental DVDs to you — started its streaming service. This would eventually give rise to binge-watching as well as one of my favorite technological innovations: Netflix and chill (and yes, there is a Wikipedia entry for it!), as well as Tiger King, which is keeping us entertained as we stay home.
Python 3

The release of Python 3 — a.k.a. Python 3000 — in December 2008 was the beginning of the Second Beginning! While Python had been eclipsed by Ruby in the 2000s thanks to Rails and the rise of MVC web frameworks and the supermodel developer, it made its comeback in the 2010s as the language of choice for data science and machine learning thanks to a plethora of libraries (NumPy, SciPy, Pandas) and support applications (including Jupyter Notebooks).
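
If you’re wondering why those libraries won data scientists over, here’s a tiny, hypothetical example of the NumPy/Pandas style they enable: vectorized math over whole arrays, and labeled, spreadsheet-like data frames.

import numpy as np
import pandas as pd

# Vectorized conversion: one expression operates on the whole array, no loop.
temps_c = np.array([21.5, 23.0, 19.8, 25.1])
df = pd.DataFrame({
    "city": ["Tampa", "Orlando", "Miami", "St. Petersburg"],
    "temp_c": temps_c,
    "temp_f": temps_c * 9 / 5 + 32,
})
print(df.sort_values("temp_f", ascending=False))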

I will always have an affection for Python. I cut my web development teeth in 1999 helping build Givex.com’s site in Python and PostgreSQL. I learned Python by reading O’Reilly’s Learning Python while at Burning Man 1999.

Shopify

In 2004, frustrated with existing ecommerce platforms, programmer Tobias Lütke built his own platform to sell snowboards online. He and his partners realized that they should be selling ecommerce services instead, and in June 2006, launched Shopify.
Spotify

The streaming service was founded in April 2006, launched in October 2008, and along with Apple and Amazon, changed the music industry.
Surface (as in Microsoft’s big-ass table computer)

Announced on May 29, 2007, the original Surface was a large coffee table-sized multitouch-sensitive computer aimed at commercial customers who wanted to provide next generation kiosk computer entertainment, information, or services to the public.

Do you remember SarcasticGamer’s parody video of the Surface?

Switches

2007 was the year that networking switches jumped in speed and capacity dramatically, helping to pave the way for the modern internet.
Twitter

In 2006, Twittr (it had no e then, which was the style at the time, thanks to Flickr) was formed. From then, it had a wild ride, including South by Southwest 2007, when its attendees — influential techies — used it as a means of catching up and finding each other at the conference. @replies appeared in May 2007, followers were added that July, hashtag support in September, and trending topics came a year later.

Twitter also got featured on an episode of CSI in November 2007, when it was used to solve a case.

VMWare

After the company performed poorly financially, VMWare’s husband-and-wife cofounders — Diane Greene, president and CEO, and Mendel Rosenblum, Chief Scientist — left in 2008. Greene was fired by the board in July, and Rosenblum resigned two months later. VMWare would go on to experience record growth, and its hypervisors would become a key part of making cloud computing what it is today.
Watson

IBM’s Watson underwent initial testing in 2006, when Watson was given 500 clues from prior Jeopardy! programs. Wikipedia will explain the rest:

While the best real-life competitors buzzed in half the time and responded correctly to as many as 95% of clues, Watson’s first pass could get only about 15% correct. During 2007, the IBM team was given three to five years and a staff of 15 people to solve the problems. By 2008, the developers had advanced Watson such that it could compete with Jeopardy! champions.

Wii

The Wii was released in December 2006, marking Nintendo’s comeback in a time when the console market belonged solely to the PlayStation and Xbox.
XO computer

You probably know this device better as the “One Laptop Per Child” computer — the laptop that was going to change the world, but didn’t quite do that. Still, its form factor lives on in today’s Chromebooks, which are powered by Chrome OS (built around the Chrome browser, which also debuted during this time), and the concept of open source hardware continues today in the form of Arduino and Raspberry Pi.
YouTube

YouTube was purchased by Google in October 2006. In 2007, it exploded in popularity, consuming as much bandwidth as the entire internet did 7 years before. In the summer and fall of 2007, CNN and YouTube produced televised presidential debates, where Democratic and Republican US presidential hopefuls answered YouTube viewer questions.

You probably winced at this infamous YouTube video, posted on August 24, 2007: Miss Teen USA 2007 – South Carolina answers a question. It has amassed almost 70 million views to date.


I’m speaking at the TechX Florida 2025 AI conference this Saturday!

This Saturday, November 8, I’ll be at the TechX Florida 2025 AI Conference at USF, on the Careers in Tech panel, where we’ll be talking about career paths, hiring expectations, and practical advice for early-career developers and engineers.

This conference, which is FREE to attend, will feature:

  • AI talks from major players in the industry, including Atlassian, Intel, Jabil, Microsoft, and Verizon
  • Opportunities to meet and network with companies, startups, and techies from the Tampa Bay area
  • The Careers in Tech panel, featuring Yours Truly and other experienced industry pros

Once again, the TechX Florida 2025 AI Conference will take place this Saturday, November 8th, in USF’s Engineering Building II, in the Hall of Flags. It runs from 11 a.m. to 5 p.m. and will be followed by…

TechX After Dark, a social/fundraising event running from 6 p.m. to 8 p.m., with appetizers and a cash bar.

This event charges admission:

  • FREE for IEEE-CS members
  • $10 for students
  • $20 for professionals

 


Don’t forget Global Nerdy’s YouTube channel!

This is just a reminder that there’s a Global Nerdy YouTube channel. I’m ramping up video production, so expect to see a lot more stuff there soon!


One last endorsement for the ZGX Nano AI workstation

Today’s my last day in my role as the developer advocate for HP’s GB10-powered AI workstation, the ZGX Nano. As I’ve written before, I’m grateful to have had the opportunity to talk about this amazing little machine.

Of course, you could expect me to talk about how good the ZGX Nano is; after all, I’m paid to do so — at least until 5 p.m. Eastern today. But what if a notable AI expert also sang its praises?

That notable expert is Sebastian Raschka (pictured above), author of a book I’m working my way through right now, Build a Large Language Model (from Scratch), and it’s quite good. He’s also working on a follow-up book, Build a Reasoning Model (from Scratch).

Sebastian has been experimenting on NVIDIA’s DGX Spark, which has the same specs as the ZGX Nano (both are among a handful of similar small desktop computers built around NVIDIA’s GB10 “superchip”), and he’s published his observations on his blog in a post titled DGX Spark and Mac Mini for Local PyTorch Development. He ran some AI benchmark programs comparing it against his Mac Mini M4 computer (a fine developer platform, by the bye) and the NVIDIA H100 GPU (falling back to NVIDIA’s A100 GPU when an H100 wasn’t available), pictured below:

Keep in mind that the version of the H100 that comes with 80GB of VRAM sells for about $30,000, which is why most people don’t buy one, but instead rent time on it from server farms, typically at about $2/hour.

Let me begin from the end of Raschka’s article, where he writes his conclusions:

Overall, the DGX Spark seems to be a neat little workstation that can sit quietly next to a Mac Mini. It has a similarly small form factor, but with more GPU memory and of course (and importantly!) CUDA support.

I previously had a Lambda workstation with 4 GTX 1080Ti GPUs in 2018. I needed the machine for my research, but the noise and heat in my office was intolerable, which is why I had to eventually move the machine to a dedicated server room at UW-Madison. After that, I didn’t consider buying another GPU workstation but solely relied on cloud GPUs. (I would perhaps only consider it again if I moved into a house with a big basement and a walled-off spare room.) The DGX Spark, in contrast, is definitely quiet enough for office use. Even under full load it’s barely audible.

It also ships with software that makes remote use seamless and you can connect directly from a Mac without extra peripherals or SSH tunneling. That’s a huge plus for quick experiments throughout the day.

But, of course, it’s not a replacement for A100 or H100 GPUs when it comes to large-scale training.
I see it more as a development and prototyping system, which lets me offload experiments without overheating my Mac. I consider it as an in-between machine that I can use for smaller runs, and testing models in CUDA, before running them on cloud GPUs.

In short: If you don’t expect miracles or full A100/H100-level performance, the DGX Spark is a nice machine for local inference and small-scale fine-tuning at home.

You might as well replace “DGX Spark” in his article with “ZGX Nano” — the hardware specs are the same. The ZGX Nano shines with HP’s exclusive ZGX Toolkit, a Visual Studio Code extension that lets you configure, manage, and deploy to the ZGX Nano. This lets you use your favorite development machine and coding environment to write code, and then use the ZGX Nano as a companion device / on-premises server.

The article features graphs showing his benchmarking results…

In his first set of benchmarks, he took a home-built 600 million parameter LLM — the kind that you learn how to build in his book, Build a Large Language Model (from Scratch) — and ran it on his Mac Mini M4, the ZGX Nano’s twin cousin, and an H100 from a cloud provider. From his observations, you can conclude that:

  • With smaller models, the ZGX Nano can match a Mac Mini M4. Both can crunch about 45 tokens per second with 20 billion parameter models (see the throughput sketch below).
  • The ZGX Nano has the advantage of coming with 128 GB of VRAM, meaning that it can handle larger models than the Mac Mini, which is limited by its smaller memory.
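
If you’re curious about what “tokens per second” actually measures, here’s a minimal, hypothetical sketch of how that kind of throughput number is computed. The “model” here is a stand-in single layer decoding one token at a time; it’s not Raschka’s benchmark code:

import time
import torch

# Stand-in "language model": one linear layer mapping a hidden state to logits
# over a 32,000-entry vocabulary. A real LLM is vastly bigger, but the timing
# loop below has the same shape as a real decoding benchmark.
vocab_size, d_model = 32_000, 512
model = torch.nn.Linear(d_model, vocab_size)
model.eval()

n_tokens = 256
hidden = torch.randn(1, d_model)  # fake hidden state for one sequence position

start = time.perf_counter()
with torch.no_grad():
    for _ in range(n_tokens):  # generate one token per iteration, as in decoding
        logits = model(hidden)
        next_token = logits.argmax(dim=-1)  # greedy "decoding" step
elapsed = time.perf_counter() - start

print(f"{n_tokens / elapsed:.1f} tokens/sec")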

Raschka’s second set of benchmarks tested how the Mac Mini, the ZGX Nano’s twin cousin, and the H100 handle two variants of a model that have been presented with MATH-500, a collection of 500 mathematical word problems:

  • The base variant, which was a standard LLM that gives short, direct answers
  • The reasoning variant, which was a version of the base model that was modified to “think out loud” through problems step-by-step

He ran two versions of this benchmark. The first was the sequential test, where the model was presented one MATH-500 question at a time. From the results, you can expect the ZGX Nano to perform almost as well as the H100, but at a small fraction of the cost! It also runs circles around the Mac Mini.

In the second version of the benchmark, the batch test, the model was served 128 questions at the same time, to simulate serving multiple users at once and to test memory bandwidth and parallel processing.

This is a situation where the H100 would vastly outperform the ZGX Nano thanks to the H100’s much better memory bandwidth. However, the ZGX Nano isn’t for doing inference at production scale; it’s for developers to try out their ideas on a system that’s powerful enough to get a better sense of how they’d operate in the real world, and do so affordably.
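
Here’s a rough, hypothetical illustration of the difference between the two tests: a small stand-in network fed 128 inputs one at a time versus all at once. On a GPU, the batched pass is where memory bandwidth and parallelism pay off.

import time
import torch

# Stand-in network; a real model would be much larger, but the contrast
# between sequential and batched inference is the same.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 2048),
    torch.nn.ReLU(),
    torch.nn.Linear(2048, 512),
)
model.eval()
questions = torch.randn(128, 512)  # 128 stand-in "questions"

with torch.no_grad():
    start = time.perf_counter()
    for question in questions:           # sequential test: one at a time
        model(question.unsqueeze(0))
    sequential = time.perf_counter() - start

    start = time.perf_counter()
    model(questions)                     # batch test: all 128 at once
    batched = time.perf_counter() - start

print(f"sequential: {sequential:.3f}s, batched: {batched:.3f}s")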

Finally, with the third benchmark, Raschka trained and fine-tuned a model. Note that this time, the data center GPU was the A100 instead of the H100 due to availability.

This benchmark tests training and fine-tuning performance. It compares how fast you can modify and improve an AI model on the Mac Mini M4 vs. the ZGX Nano’s twin vs. an A100 GPU. He presents three scenarios in training and fine-tuning a 355 million parameter model:

  1. Pre-training (3a in the graphs above): Training a model from scratch on raw text
  2. SFT, or Supervised fine-tuning (3b): Teaching an existing model to follow instructions
  3. DPO (direct preference optimization), or preference tuning (3c): Teaching the model which responses are “better” using preference data (there’s a minimal loss sketch after this list)
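
For the curious, here’s a minimal sketch of the DPO loss itself. Given per-response log-probabilities from the model being tuned and from a frozen reference model, it rewards the model for widening the gap between preferred and rejected responses. The numbers in the usage lines are made up; this isn’t Raschka’s code.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # How much more (or less) likely the tuned model makes each response,
    # relative to the reference model, scaled by the beta hyperparameter.
    chosen_rewards = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_rewards = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the margin between preferred and rejected responses to be positive.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities:
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.5]))
print(loss)  # lower means the model already prefers the "better" responses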

All these benchmarks say what I’ve been saying: the ZGX Nano lets you do real model training locally and economically. You get a lot of bang for your ZGX Nano buck.

Just as many development workflows separate a development database from the production database, you don’t need production scale for every experiment. The ZGX Nano gives you a working local training environment that isn’t glacially slow or massively expensive.

Want to know more? Go straight to the source and check out Raschka’s article, DGX Spark and Mac Mini for Local PyTorch Development.

And with this article, I end my stint as the “spokesmodel” for the ZGX Nano. It’s not the end of my work in AI; just the end of this particular phase.

Keep watching this blog, as well as the Global Nerdy YouTube channel, for more!


My speedrun as the HP ZGX Nano Developer Advocate: Two months, one podcast, zero regrets

Just over two months after my announcement that I was doing developer relations for HP’s ZGX Nano AI workstation — an NVIDIA-powered, book-sized desktop computer specifically made for AI application development and edge computing — HP ended the Kforce contract for the ZGX Nano program, so my last day is Friday.

In my all-too-brief time working with HP, I got a lot done, including…

I landed the ZGX Nano appearance on Intelligent Machines

On the very day I announced that I was doing developer relations for the ZGX Nano, I got an email that began with this paragraph:

I’m Anthony, a producer with the TWiT.tv network. Jeff Jarvis mentioned you’re “a cool dude” from the early blogging days (and apparently serenaded some Bloggercons?), but more importantly, we saw you just started doing developer relations for HP’s ZGX Nano. We’d love to have you on our podcast Intelligent Machines to discuss this shift toward local AI computing.

First of all: Thanks, Jeff! I owe you one.

Second: I didn’t pitch TWiT. TWiT pitched me, as soon as they found out! This wasn’t the outcome of HP’s product marketing department contacting media outlets. Instead, it happened because Jeff knows me, and he knew I was the right person to explain this new AI hardware to their audience:

I generated earned media for HP without a single pitch, press release, or PR agency. My personal brand amplified HP’s brand, and maybe it can amplify your company’s brand too!

And finally: I’m just great at explaining complex technical topics in a way that people can understand. Don’t take my word for it; take Leo Laporte’s:

In case you need some stats:

  • TWiT Network (home of Intelligent Machines): 25+ million downloads annually
  • Cost of equivalent advertising slot: double-digit thousands of dollars
  • Time from my hire to major media appearance: 8 weeks
  • Number of PR pitches sent: 0
  • Value of authentic relationships: Priceless

I built page-one visibility for a brand-new product — organically

Do a Google search on the term zgx nano (without the quotes) and while you might see slightly different results from mine, you should find that this blog, Global Nerdy, is on the first page of results:

Tap to try out a Google search for zgx nano for yourself.

The screenshot above was taken on the evening of Monday, October 27, and two of the articles on this blog are the first two search results after HP.

My content gets found. Within 8 weeks of starting work with HP, my coverage of the ZGX Nano achieved first-page Google ranking, competing directly with HP’s official pages and major tech publications. This organic reach is what modern developer relations looks like: authentic content that both developers and search algorithms trust.

With me, you’re not just getting a developer advocate, but someone with a tech blog going back nearly two decades and with the domain authority to take on Fortune 500 companies on Google. My Global Nerdy posts about ZGX Nano rank on page one because Google trusts content I’ve been building since 2006.

I enabled the Sales team to go from zero to hero

On day one, I was given two priorities:

  • First, provide enablement for the Sales team and give them the knowledge and selling points they need to be effective when talking to customers about the ZGX Nano.
  • Support developers who were interested in the ZGX Nano, or even just AI application development. Unfortunately, I’m not going to get to execute this phase.

But I got pretty far with that first phase! In less than eight weeks, I built a sales enablement foundation for a brand-new AI workstation with scant documentation. I created 50+ pages of technical documentation that gave HP’s global sales force what they needed to sell a new product in a new category.

Some of my big quantifiable achievements in sales enablement:

  • 25+ technical objections anticipated and addressed
    • Created comprehensive FAQ covering everything from architecture to ROI calculations
    • Translated GB10 superchip complexity into sales-friendly language
    • Provided competitive differentiation against NVIDIA DGX Spark, Dell, and Lenovo
  • 12 industry verticals mapped with 60+ business impact scenarios
    • Developed go-to-market strategy for each vertical (healthcare to gaming)
    • Created specific ROI talking points for each industry
    • Identified 5 business impacts per vertical = 60 total selling points
  • Turned “It’s just another GB10 machine” into “Here’s why HP wins”
    • Differentiated commodity hardware through software story (ZGX Toolkit)
    • Created objection handling that transforms skepticism into sales
    • Armed sales with “Why HP and not NVIDIA direct” messaging

I’m available starting next week!

All told, it was 2 months, 1 podcast…and ZERO regrets. I enjoyed the work, and I’m grateful to have been selected to be the developer spokesmodel for an amazing AI computer.

I don’t think of this as a termination. It was a high-intensity proof-of-concept for my ability to help launch a new device with little guidance (in fact, the manager who hired me moved to another company in my first week). They asked; I delivered. Now, I’m looking for the next impossible mission.

As I wrote at the start of this article, my last day is on Friday — yes, I wrap up on Halloween — and as of Monday next week, I’m available!

I’m now looking for my next Developer Advocate role. Who needs someone who can…

  • Land major podcast appearances on Day One?
  • Bring enough SEO know-how and influence to get you to Page One?
  • Enable your sales and marketing teams with technical material, explained in a non-techie-friendly way?

If you’re looking for such a person, either on a full-time or consulting basis, set up an appointment with me on my calendar.

Let’s talk!


This Tuesday in Tampa: Two tech events, four minutes apart!

On Tuesday, two popular tech events take place in Tampa, and you may be wondering which one you should attend. I’ll answer your question by quoting the little girl from that classic Old El Paso commercial:

The two events in question are:

Here’s the interesting wrinkle: these two events are only a couple of blocks or a four-minute walk apart!

So if you’re feeling ambitious — and I just might be — you can attend both events with a little judicious scheduling.


Talking about HP’s ZGX Nano on the “Intelligent Machines” podcast

On Wednesday, HP’s Andrew Hawthorn (Product Manager and Planner for HP’s Z AI hardware) and I appeared on the Intelligent Machines podcast to talk about the computer that I’m doing developer relations consulting for: HP’s ZGX Nano.

You can watch the episode here. We appear at the start, and we’re on for the first 35 minutes:

A few details about the ZGX Nano:

  • It’s built around the NVIDIA GB10 Grace Blackwell “superchip,” which combines a 20-core Grace CPU and a GPU based on NVIDIA’s Blackwell architecture.

  • Also built into the GB10 chip is a lot of RAM: 128 GB of LPDDR5X coherent memory shared between CPU and GPU, which helps avoid the kind of memory bottlenecks that arise when the CPU and GPU each have their own memory (and usually, the GPU has considerably less memory than the CPU).
NVIDIA GB10 SoC (system on a chip).
  • It can perform up to about 1,000 TOPS (trillions of operations per second), or 10¹⁵ operations per second, and can handle model sizes of up to 200 billion parameters.

  • Want to work on bigger models? By connecting two ZGX Nanos together using the 200 gigabit per second ConnectX-7 interface, you can scale up to work on models with 400 billion parameters.

  • The ZGX Nano’s operating system is NVIDIA’s DGX OS, which is a version of Ubuntu Linux with additional tweaking to take advantage of the underlying GB10 hardware.

Some topics we discussed:

  • Model sizes and AI workloads are getting bigger, and developers are getting more and more constrained by factors such as:
    • Increasing or unpredictable cloud costs
    • Latency
    • Data movement
  • There’s an opportunity to “bring serious AI compute to the desk” so that teams can prototype their AI applications and iterate locally
  • The ZGX Nano isn’t meant to replace large datacenter clusters for full training of massive models. It’s aimed at “the earlier parts of the pipeline,” where developers do prototyping, fine-tuning, smaller deployments, inference, and model evaluation
  • The Nano’s 128 gigabytes of unified memory gets around the bottlenecks that come with distinct CPU and GPU memory, allowing bigger models to be loaded on a local box without “paging to cloud” or being forced into distributed setups early
  • While the cloud remains dominant, there are real benefits to local compute:
    • Shorter iteration loops
    • Immediate control and data privacy
    • Less dependence on remote queueing
  • We expect that many AI development workflows will hybridize: a mix of local box and cloud/back-end
  • The target users include:
    • AI/ML researchers
    • Developers building generative AI tools
    • Internal data-science teams fine-tuning models for enterprise use-cases (e.g., inside a retail, insurance or e-commerce firm).
    • Maker and developer communities
  • The ZGX Nano is part of the “local-to-cloud” continuum
  • The Nano won’t cover all AI development…
    • For training truly massive models, beyond the low hundreds of billions of parameters, the datacenter/cloud will still dominate
    • ZGX Nano’s use case is “serious but not massive” local workloads
    • Is it for you? Look at model size, number of iterations per week, data sensitivity, latency needs, and cloud cost profile

One thing I brought up that seemed to capture the imagination of hosts Leo Laporte, Paris Martineau, and Mike Elgan was the MCP server that I demonstrated a couple of months ago at the Tampa Bay Artificial Intelligence Meetup: Too Many Cats.

Too Many Cats is an MCP server that an LLM can call upon to determine if a household has too many cats, given the number of humans and cats.

Here’s the code for a Too Many Cats MCP server that runs on your computer and works with a local Claude client:

from typing import TypedDict
from mcp.server.fastmcp import FastMCP

mcp = FastMCP(name="Too Many Cats?")

class CatAnalysis(TypedDict):
    too_many_cats: bool
    human_cat_ratio: float  

@mcp.tool(
    annotations={
        "title": "Find Out If You Have Too Many Cats",
        "readOnlyHint": True,
        "openWorldHint": False
    }
)
def determine_if_too_many_cats(cat_count: int, human_count: int) -> CatAnalysis:
    """Determines if you have too many cats based on the number of cats and a human-cat ratio."""
    # Cats per human; guard against dividing by zero in humanless households.
    human_cat_ratio = cat_count / human_count if human_count > 0 else 0.0
    # Three or more cats per human counts as too many.
    too_many_cats = human_cat_ratio >= 3.0
    return CatAnalysis(
        too_many_cats=too_many_cats,
        human_cat_ratio=human_cat_ratio
    )

if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='stdio')
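
If you save the script above as, say, too_many_cats.py (a hypothetical file name), you can sanity-check the tool’s logic directly before wiring it up to Claude, since FastMCP’s tool decorator leaves the function callable:

# Quick local sanity check of the tool logic, assuming the server code above is
# saved as too_many_cats.py (a hypothetical file name).
from too_many_cats import determine_if_too_many_cats

result = determine_if_too_many_cats(cat_count=7, human_count=2)
print(result)  # expected: {'too_many_cats': True, 'human_cat_ratio': 3.5}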

I’ll cover writing MCP servers in more detail on the Global Nerdy YouTube channel — watch this space!