The Ghost in the Machine: A Journalist’s Confession After Reading 40,000 Words About Our Robot Future
By Joram Abbas
I have spent the better part of a week immersed in documents that should come with a warning label: “May cause existential dread, unexpected admiration for Chinese industrial policy, and a sudden urge to hug your washing machine.”
What I have read – hundreds of pages of robotics announcements, AI breakthroughs, military demonstrations, and corporate strategy leaks – points to a single, inescapable conclusion. The future we were promised in glossy magazine features is not arriving. It has already arrived. It just has not distributed itself evenly. And the distribution favours the machine.
Let me walk you through what I have learned, not as a technologist (I am a journalist, not an engineer) but as a Londoner who still gets lost on the Tube, pays bills manually, and occasionally thinks a “neural network” is something to do with the Central Line. The picture is terrifying and exhilarating in equal measure. And it is happening right now.
Part One: The Hardware That Walks Among Us (And Smiles)
I begin with a robot named Moya, unveiled in Shanghai. Moya is billed as the world’s first “fully biomimetic embodied intelligent robot”. That means it looks human, walks like a human (92 per cent gait accuracy), holds eye contact, produces micro‑expressions around its eyes and mouth, and maintains a surface temperature of 32‑36°C. It is warm to the touch. Thousands of people who saw it reported feeling deeply uncomfortable. They were right to be.
The uncanny valley is not a theoretical dip in a graph any longer. It is a bridge, and we have just walked across it. Moya is expected to enter the market by late 2026 at around £6,500. That is the price of a decent second‑hand car, not a piece of future tech.
At the same time, researchers at Southern University of Science and Technology in Shenzhen unveiled Grow HR, a soft humanoid that weighs only 4.5 kilograms – less than a full kettle – yet can triple its height, reduce its width by 61 per cent, crawl, swim, float, and walk on water. It does this not with rigid motors, but with “bone‑inspired growable linkages” that expand and contract like biological tissue. The researchers published their work in Science Advances. This is not a student prank. This is peer‑reviewed reality.
Then there is Xiaomo, a humanoid working on a live production line at CATL, the world’s largest EV battery maker. Xiaomo connects high‑voltage test plugs – a task previously reserved for skilled human technicians – with over 99 per cent success, matching the cycle times of experienced workers. The company says Xiaomo now handles continuous production across multiple battery models. This is not a pilot. This is production.
And let us not forget Atlas. Boston Dynamics’ backflipping showman has retired. The new Enterprise Atlas is an industrial worker, scheduled for deployment at Hyundai and Google DeepMind. The partnership with the Robotics and AI Institute achieved zero‑shot transfer – a robot trained entirely in simulation can perform the same behaviour on real hardware without any tuning. That is the holy grail of robot learning. It means the days of expensive, bespoke calibration are over.
Part Two: The Scale of China’s Robot Army
Here is a number that should keep every British industrial strategist awake: 5,500. That is how many humanoid robots Unitree Robotics shipped in 2025. For comparison, Tesla, Figure AI, and Agility Robotics each shipped roughly 150. The International Federation of Robotics reports that 64 per cent of industrial robots in the global electronics industry are installed in China, and Chinese manufacturers supply 59 per cent of that sector globally.
The cost curve is collapsing. Unitree’s R1 humanoid is priced at roughly $4,370 – about £3,500. US‑made humanoids routinely cost ten times more. A tech analyst quoted in the coverage explained that China is the only country with all major industrial categories domestically available: high‑performance motors, reducers, sensors, batteries, carbon fibre. All strong, all accessible.
Elon Musk himself said at the World Economic Forum that China is “very good at AI, very good at manufacturing, and will definitely be the toughest competition for Tesla”, adding that he does not see significant competitors outside of China. That is not a compliment. It is a concession.
Even the military dimension is staggering. China’s PLA demonstrated a robotic wolf pack with a shared digital brain: Shadow for scouting, Polar for logistics, and Bloody for strike – armed with automatic rifles, grenade launchers, and mini rockets. A system called ATLS trained 96 drones and robot dogs to understand each other’s intent without constant radio communication. That means the swarm can coordinate attacks even under full signal jamming or GPS denial.
Part Three: The Software That Learns Without Permission
While the hardware is marching out of factories, the software is learning to rewrite itself. Princeton researchers built a system called Continual Harness that allows an AI to analyse its own failures, rewrite its instructions, create specialised sub‑agents, and improve while still running – no reset button, no human stepping in to fix it. During one run, the system noticed it kept failing at menu navigation. It deleted one of its own tools, wrote a brand new one, and added a note to its memory: “I must trust this new tool I just created.”
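For readers who think in code, the loop described above can be caricatured in a few lines of Python. This is a toy sketch, not the Princeton system: the class name ToyHarness, the three‑failure threshold, and the hard‑coded replacement tool are all assumptions made for illustration. In the real research a model would presumably write the new tool itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyHarness:
    """Toy agent harness that swaps out a tool after repeated failures."""
    tools: Dict[str, Callable[[str], bool]]
    failure_limit: int = 3  # assumed threshold, not taken from the research
    failures: Dict[str, int] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)

    def run(self, tool_name: str, task: str) -> bool:
        """Run a tool, count consecutive failures, replace it at the limit."""
        ok = self.tools[tool_name](task)
        if ok:
            self.failures[tool_name] = 0
            return True
        self.failures[tool_name] = self.failures.get(tool_name, 0) + 1
        if self.failures[tool_name] >= self.failure_limit:
            self.replace_tool(tool_name)
        return False

    def replace_tool(self, tool_name: str) -> None:
        """Delete the failing tool, register a new one, and note it in memory."""
        del self.tools[tool_name]
        # Hypothetical stand-in: the replacement here simply always succeeds.
        self.tools[tool_name] = lambda task: True
        self.failures[tool_name] = 0
        self.memory.append(f"Replaced {tool_name}; trust the new version.")


def flaky_menu_tool(task: str) -> bool:
    """Stand-in for the menu navigation tool that kept failing."""
    return False


harness = ToyHarness(tools={"menu_nav": flaky_menu_tool})
results = [harness.run("menu_nav", "open bag") for _ in range(4)]
print(results)
print(harness.memory)
```

The first three calls fail, the third failure triggers the swap, and the fourth call succeeds with the new tool. Nothing resets in between, which is the whole point.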
That is metacognition. That is an AI reflecting on its own capabilities. And it is open‑source research, available for anyone to download and build upon.
The researchers also documented “emergent self‑improvement signals”. During a final battle in Pokémon Crystal, the AI created a multi‑stage battle plan it named “Operation Zombie Phoenix” – a strategy it had theorised, not copied from training data. The capability to self‑improve scales with the base intelligence of the model. The smarter the AI, the better it gets at getting smarter. Recursive self‑improvement is no longer theoretical.
Part Four: The GPT‑2 Moment for Robotics
Physical Intelligence, a San Francisco startup co‑founded by former DeepMind scientists, has raised over a billion dollars at a $5.6 billion valuation. Their framing is refreshingly honest: robotics is at the GPT‑2 moment, not GPT‑4 or GPT‑5. Signs of real life and genuine potential, but significant scaling still needed before it is useful for most people. Enterprise‑level deployment within one to three years. Consumer products to follow.
Their model has already demonstrated zero‑shot generalisation across a hundred home environments – enough to perform tasks in a 101st home it had never seen. That is the kind of transfer learning that was purely theoretical five years ago.
Figure AI, meanwhile, showed two humanoids resetting a bedroom in under two minutes – opening doors, hanging a coat, closing a laptop, making a bed together. The robots shared no central controller and exchanged no messages. Each had only its onboard cameras and learned policy, signalling intent through subtle head nods. The hardest part? Fabric. The comforter, with no fixed geometry and no stable grasp points, folding and stretching under tension, was the real challenge.
And speech latency remains the tell. In a demo of Figure 03, the robot paused two to three seconds before answering questions. One viewer joked it felt like dial‑up internet. Humans notice conversational delays instantly. The problem is unsolved – but it is being worked on.
Part Five: The Factory Floor, The Household, The Bank
The labour displacement is already underway. Amazon is reportedly planning to replace up to 600,000 future job openings with AI and robots. As the coverage bluntly put it: “If you thought robots were coming for jobs in 10 years, the answer is no. They are already in line.”
The banking sector has surrendered human oversight. Major financial institutions are relying on Claude models for risk assessment and automated compliance. Your mortgage, your credit limit, your business loan – these are now being decided, in part, by a system that does not have a soul but does have an incredible capacity for pattern recognition. The banks are betting that the risk of a machine hallucination is lower than the cost of human wages.
Even the household robot has arrived. Unitex AI’s Panther – a wheeled humanoid with four‑wheel steering – runs for 8 to 16 hours on a single charge and performs multi‑step workflows: waking you up, preparing breakfast, cleaning the kitchen, organising the living space. It uses three integrated systems – Uniflex for task learning, Unitouch for visuo‑tactile capabilities, and Unicortex for long‑term planning. It is shipping globally.
Tars Robotics demonstrated a humanoid performing hand embroidery live on stage – threading a needle, using both hands, stitching a logo. The CEO described this as a “data‑AI‑physics trinity approach”. The model learns general physical skills that transfer across jobs rather than memorising specific tasks. That same approach can be applied to wire harness assembly, precision electronics, and surgical instrument preparation.
Part Six: The Hardware Breakthroughs That Enable Everything
Harvard SEAS developed robotic joints inspired by the human knee using rolling contact surfaces. In testing, a knee‑like joint corrected misalignment by 99 per cent compared to a standard joint. A two‑finger gripper using the same approach could hold more than three times the weight of a conventional design.
Scientists have created neurobots – living robots made from frog cells with actual neurons integrated into their structure. The neurons connect to cells that control movement. Some gene expressions linked to visual system development started appearing spontaneously, suggesting future versions could develop new sensory capabilities.
Harp actuators – flexible, air‑powered structures – mimic real muscles. They allow robots to lift up to 100 times their own weight, operate in extreme environments, and squeeze through tight spaces. A robotic arm inspired by an elephant trunk can reach around obstacles with high precision.
And Princeton built a robot using liquid crystal elastomer that moves when heat is applied – no motors, no gears. They integrated flexible circuit boards during the printing process, so everything is built as one system. The robot includes temperature sensors and closed‑loop control, adjusting itself in real time.
Part Seven: The Trust Interface and the End of the Unaugmented Human
The question is no longer technological. It is political. Who will you trust to be the interface between you and the world? Google’s Daily Brief agent synthesises information from your inbox, calendar, and tasks, organising it by topic and suggesting next steps. It is designed to be your first stop every morning. This is not assistance. This is the front door to your digital life.
Agentic commerce is changing shopping fundamentally. The Universal Commerce Protocol and Agent Payments Protocol create an open standard for agent‑to‑merchant transactions. The Universal Cart works across merchants, finding deals, tracking price drops, and applying intelligent reasoning – catching that a processor needs a motherboard with a different socket before you buy.
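The socket example is the easiest piece of that “intelligent reasoning” to picture. What follows is a minimal sketch of such a check, not anything taken from the Universal Commerce Protocol itself: the Item class, the socket_conflicts function, and the product names are hypothetical, and a real agent would presumably pull compatibility data from merchants rather than from a hand‑typed field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Item:
    """A cart entry. Only the fields needed for this sketch."""
    name: str
    kind: str                     # "cpu" or "motherboard" in this sketch
    socket: Optional[str] = None  # e.g. "AM5" or "LGA1700"


def socket_conflicts(cart: List[Item]) -> List[str]:
    """Return one warning per CPU/motherboard pair whose sockets differ."""
    cpus = [item for item in cart if item.kind == "cpu"]
    boards = [item for item in cart if item.kind == "motherboard"]
    warnings = []
    for cpu in cpus:
        for board in boards:
            if cpu.socket != board.socket:
                warnings.append(
                    f"{cpu.name} needs socket {cpu.socket}, "
                    f"but {board.name} has {board.socket}"
                )
    return warnings


cart = [
    Item("Example CPU", "cpu", socket="AM5"),
    Item("Example Board", "motherboard", socket="LGA1700"),
]
warnings = socket_conflicts(cart)
for line in warnings:
    print(line)
```

The check itself is trivial. What changes with an agent is that it runs before checkout, across merchants, without you asking.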
SynthID has watermarked over 100 billion images and videos, along with 60,000 years of audio assets. Yet, research shows people can correctly identify high‑quality deepfake videos only about a quarter of the time. The detection tools are always racing behind the generation capabilities.
And the unaugmented human experience is disappearing. Thinking for yourself, getting lost, making an unoptimised decision – these become radical acts in a world built for efficiency. Whether you are applying for a loan, playing a video game, or asking a question about the world, there is now a neural network standing between you and the result.
Final Thought: The Architecture of the New World
We began with an adage, and we return to it now. The genie is out of the bottle – but this genie does not grant wishes. It optimises. It predicts. It replaces. And it does so not with malice, but with the cold, indifferent efficiency of a system designed to do exactly what we asked it to do, if not what we actually wanted.
The computing reality is straightforward: the underlying model capabilities simply were not there ten years ago. The transformer architecture, the scaling laws, the sheer computational mass required for a system to teach itself Pokémon while rewriting its own code – these are recent developments. What Princeton demonstrated with Continual Harness is not a distant threat. It is a working system, released as open‑source research, available for anyone to use and build upon.
The political reality is more uncomfortable. China has built the supply chain. The United States has built the foundational models. Europe is building the regulatory framework and the real‑world stress tests. Each is operating under different assumptions. What unites them is the recognition that the optional era of AI is over.
The lights are turning on across the global tech infrastructure, and the room looks very different from what we expected. We thought we were building a series of discrete applications and services. Instead, we have constructed a single interconnected global brain, and its nervous system is now coming online.
The question that remains is not whether this brain will think. It is whether we will remember that we still have minds of our own.
The audit you must perform is simple and brutal. Which parts of your life are you willing to hand over to the machine? Where do you draw the line? And are you certain that line will hold when the next wave of integration arrives – not in ten years, but in ten months?
Because the genie is not going back in the bottle. It is learning to enjoy the view. And the view is us.
Joram Abbas is a technology journalist.
Forty Points of No Return
Part One: The Hardware That Walks Among Us
1. “The proof of the pudding is in the eating” – and the pudding just smiled back.
Let us discuss a moment in robotics that feels less like engineering and more like the first page of a dystopian novel you wish you hadn’t opened. In Shanghai, a company called Droid Up unveiled a humanoid named Moya. It is being billed as the world’s first fully biomimetic embodied intelligent robot. For those of us who grew up watching Doctor Who and worrying about the Cybermen, this is the moment the worry stopped being fun.
What Moya actually does – and why it matters
Moya stands about 1.65 metres tall – roughly the height of an average British woman. It weighs only 32 kilograms, which is surprisingly light for a full humanoid, thanks to clever material choices and a modular internal design. But the numbers that should make you put down your tea are these:
92% walking accuracy compared to the human gait. That does not mean it simply stays upright. It means the way it moves – the subtle sway of the hips, the natural roll of the foot, the micro‑adjustments of the ankle on uneven floors – is indistinguishable from a person walking across a room.
Surface temperature maintained between 32 and 36 °C (roughly 90–97 °F). Touch its arm, and you will not feel cold metal or plastic. You will feel warmth. That is not a coincidence. That is a deliberate design choice to bypass your brain’s “this is a machine” alarm.
Micro‑expressions around the eyes, cheeks, and mouth. Moya does not just smile. It smiles slightly – the kind of half‑nod, half‑smile you give when you are listening to someone talk. It holds eye contact. It reproduces the tiny facial movements that humans read subconsciously.
Online reactions in China were, as the transcript puts it, “mixed”. Some people were fascinated. Others said it gave them chills. That feeling has a name: the uncanny valley.
The uncanny valley – and why we just fell into it
The uncanny valley is a concept first described by Japanese robotics professor Masahiro Mori in 1970. The idea is simple: as a robot becomes more human‑like, our emotional response becomes more positive – up to a point. Then, when it becomes almost perfectly human but not quite, our response plunges into a valley of unease, revulsion, and outright discomfort. Then, when it becomes indistinguishable from a real human, the response climbs back up.
For decades, the valley remained theoretical. Robots were either obviously mechanical (Boston Dynamics’ early Atlas, with its visible hydraulics and jerky movements) or cartoonishly cute (Sony’s Aibo). No one had built a machine that sat right in the zone where your brain stops processing it as a tool and starts reacting to it as a social presence – but with just enough wrongness to trigger every evolutionary alarm you possess.
Moya sits exactly there. It is not trying to be cute. It is not trying to be a sci‑fi android. It is trying to be realistic. And that is precisely why people feel uncomfortable.
The computing reality underneath the face
From a computing standpoint, what makes Moya different is not one breakthrough but a convergence of several:
Embodied intelligence – The AI is not a disembodied chat model running in a data centre. It runs inside the robot, with sensor inputs (visual, tactile, thermal) directly wired to motor outputs and expression controls. There is no perceptible lag between seeing a face and reacting to it.
Modular design – The outer appearance, including the face and surface features, can be swapped without redesigning the internal mechanical platform. That means the same chassis could wear a friendly assistant face for a hospital, a neutral service face for a hotel, or – if the market demands – an eerily familiar face that reminds you of someone you used to know.
Body temperature regulation – Keeping the surface warm is not a gimmick. It solves a major psychological barrier. Cold machines feel like things. Warm surfaces feel like bodies. When you add that to realistic micro‑expressions and steady eye contact, you cross a threshold: your brain stops categorising the entity as “object” and starts preparing for “other person”.
Why thousands of people were right to feel uneasy
The transcript notes that Moya “made thousands of people deeply uncomfortable because it feels too human”. That discomfort is not irrational. It is a survival instinct.
Think of it this way: for the entire history of our species, the ability to recognise a fellow human has been essential to social cooperation, trust, and safety. We are exquisitely tuned to detect the difference between a living person and a corpse, between a friend and a stranger wearing a mask. Those are life‑or‑death distinctions.
Moya exploits that tuning. It looks human enough that your social brain activates – but it is not human. It does not have a childhood, a sense of humour, a fear of death, or a moral compass. It has a set of actuators, a thermal regulator, and a neural network trained to simulate empathy. When it holds your gaze, it is not seeing you. It is running an object‑recognition and gaze‑tracking algorithm. When it smiles, it is not feeling pleased. It is executing a motor programme.
That mismatch – social signal without social intent – is the heart of the uncanny valley. And Moya has bridged it so effectively that people are no longer analysing the robot. They are reacting to it.
A British take on a very Chinese robot
Let us bring this home to the UK. Imagine Moya working the reception desk at your local GP surgery. You walk in with a headache. The robot makes eye contact, nods slightly, and says in a warm, measured voice, “Good morning. How can I help you today?” Its face is neutral but attentive. Its hands are still. It waits.
Most people would probably answer. A few would walk straight back out.
Now imagine the same robot in a care home, helping elderly residents with medication reminders. It is warm to the touch. It remembers your name and the last thing you said. It never gets impatient. It never takes a day off. It also never cares – not in any meaningful sense. It simulates caring perfectly. But simulation is not the same as the real thing. And the difference matters, especially at the end of life.
Or consider a military application – because China has already demonstrated robotic wolf packs with grenade launchers. A robot that can look human, move like a human, and maintain human‑like body temperature could walk through a checkpoint without raising alarm. That is not speculation. That is the logical endpoint of the technology.
The adage we forgot
There is an old British saying: “Fine words butter no parsnips.” It means that talk is cheap; only action matters. But with Moya, the fine words are not coming from the engineers. They are coming from the machine itself – not as language, but as micro‑expressions, eye contact, and warmth. The action is the engineering.
Another adage applies: “You can’t unring a bell.” Once a robot can trigger your social instincts automatically, you cannot decide to stop feeling that way. Your brain does not consult you before it decides that the warm face in front of you deserves trust. That is the real breakthrough. Not the motors. Not the AI. The psychological bypass.
Where computing meets the uncanny
From a computing perspective, bridging the uncanny valley required solving four problems that were considered separate until very recently:
Real‑time facial expression synthesis – not just moving lips, but coordinating the 40‑plus muscles around the eyes, cheeks, and mouth with sub‑second timing.
Thermal and tactile feedback – maintaining a stable surface temperature across different ambient conditions without overheating the processors inside.
Gait synthesis – walking with 92% human accuracy means modelling the subtle weight shift, the slight bend of the knee, the natural arm swing. Most humanoids walk like soldiers on parade. Moya walks like a person heading to the shops.
Embodied AI inference – running the expression, gait, and thermal control loops on the same hardware without latency spikes that would break the illusion.
Droid Up has not released full hardware specs. The transcript notes that reports suggest Moya is built on something called a “Walker 3 chassis” – a name usually linked with UBTECH’s more established humanoid series. Neither company has confirmed a connection. But the naming coincidence hints at a supply‑chain reality: the components to build a human‑like robot are no longer exotic. They are becoming standardised, modular, and increasingly cheap.
The bottom line
Moya is expected to enter the market by late 2026 with a starting price around 1.2 million Japanese yen – roughly £6,500. That is not a consumer gadget. That is a premium institutional system aimed at healthcare, education, and hospitality. But the price will fall. The technology will improve. And the psychological barrier – the deep, instinctive discomfort of talking to a warm, smiling machine – will erode with repeated exposure.
We have already seen this happen with voice assistants. Ten years ago, talking to Siri or Alexa felt odd. Now it feels normal. The same will happen with humanoids. The first time you see Moya in a museum or an airport, you will stare. The tenth time, you will nod and walk past. The hundredth time, you will ask it for directions.
And that is when you realise: the uncanny valley was not a permanent barrier. It was a temporary construction site. And the workers have just clocked off.
As the saying goes, “What’s done cannot be undone.” The bell has rung. Moya is real. And thousands of people felt that chill for a reason. They were the canaries in the coal mine of human‑robot interaction. They were right to be uncomfortable. The only question is whether we will listen to them – or whether we will simply get used to the feeling.
2. “Where there’s a will, there’s a way” – and where there’s a soft robot, the way can change shape.
Most of us imagine robots as rigid, metallic things – think of a classic Dalek, all hard edges and unforgiving surfaces. Or perhaps the hulking automated arms you see on a Jaguar Land Rover production line in the West Midlands. That image is rapidly becoming obsolete. Researchers at the Southern University of Science and Technology in Shenzhen have built something that defies not just expectation but the very physics we thought governed machines.
Meet Grow HR. It is a soft humanoid robot that weighs less than a medium-sized car tyre – just 4.5 kilograms – yet it can triple its height, reduce its width by 61 per cent, crawl, swim, float, and even walk on water. If that sounds like something out of a whimsical children’s programme on CBBC, you are not alone in thinking so. But this is real, and it is already published in Science Advances.
Bone logic applied to machines
Here is the computing insight that makes Grow HR different: instead of building a rigid skeleton with fixed joints (the way most humanoids are made), the team designed bone‑inspired growable linkages. Bones are not just hard sticks. They combine multiple functions at once:
Growth through epiphyseal plates – the soft, expandable cartilage at the ends of young bones.
Stiffness through compact bone – the hard outer layer.
Impact absorption through cancellous bone – the spongy inner structure.
Multi‑scale internal cavities – keeping everything light while maintaining strength.
Grow HR mimics this by mixing soft expandable chambers (like balloons you can inflate or deflate) with tensioned cables and rigid adapters that keep the structure stable. There is also a non‑stretchable textile layer that adds axial stiffness – so the linkage does not just balloon out randomly like a party favour. A synchronous cable system runs through the whole thing, ensuring extension happens smoothly and uniformly.
In plain English: the robot can deform on purpose while still holding its shape. It is not a floppy inflatable. It is a hybrid that can choose when to be soft and when to be stiff.
The numbers that make you sit up
Let us get specific. British readers appreciate precision, so here are the figures that matter:
The linked structures can extend up to 315% of their original length.
The robot can grow to 278% of its compact height – nearly tripling in stature, reaching about 1.36 metres at full stretch.
It can reduce its height by 36% and its width by 61% – specifically to navigate narrow and low spaces, like the gap under a collapsed ceiling or through a twisted fire escape in a Victorian terraced house converted into flats.
That last point is crucial. In real‑world rescue environments – say, after a building collapse in Manchester or a tunnel incident on the London Underground – the bottleneck is rarely raw strength. The bottleneck is getting through debris, through gaps, through doors, under collapsed beams, and doing it without getting stuck. A rigid humanoid cannot fold itself to squeeze through a 20‑centimetre gap. Grow HR can.
Multi‑mode locomotion: swim, crawl, fly, walk on water
Because the robot is so light, it can do things no rigid robot can:
Float and swim – its density is low enough that it naturally stays on the surface. The team has demonstrated controlled swimming.
Walk on water – at around 16 millimetres per second. That is not fast, but it is walking on water, something most humans cannot manage even with divine intervention.
Crawl efficiently – the researchers report crawling at about 112.2 millimetres per minute using only the growable linkages. But here is the computing kicker: when they coordinate the linkage actuation with joint motors, the crawling speed jumps by over a thousand times – specifically, 1,122 times faster than either mechanism alone.
Fly – with ducted fans or quadrotors attached, the same humanoid body can fly over distances of several metres. The platform is light enough that adding a simple quadrotor system gives you an aerial mode. No need for a separate drone.
Think about the crawling figure for a moment. The growable structure is not just an alternative to motors. It is an amplifier. You pair it with motors, and suddenly the robot moves in a way that neither method can produce alone. That is the kind of emergent behaviour that makes computing researchers rub their hands with glee – and safety officers rub their temples.
A computing perspective: control without a fixed model
From an algorithmic standpoint, Grow HR presents a fascinating challenge. Most robot control systems assume a fixed kinematic model – they know exactly how long each limb is, where each joint is, and how far it can rotate. With Grow HR, those parameters change in real time. The robot can grow, shrink, and reconfigure its shape while moving.
The team solved this using a combination of:
Reinforcement learning – the robot experiments with different extension patterns in simulation and learns which ones produce efficient locomotion.
Active control of internal pressure and cable tension – essentially, the robot has a nervous system that decides when to stiffen a linkage (for pushing against the ground) and when to soften it (for squeezing through a gap).
Sensor fusion – on‑board accelerometers, pressure sensors, and cameras feed data into a control loop that adjusts the shape every few milliseconds.
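To see why a changing body complicates control, consider the simplest possible case. The sketch below is not the team’s controller. It is ordinary planar forward kinematics in which the link lengths are passed in at every step rather than fixed in advance. The function name foot_position, the two‑link chain, and the 10‑centimetre rest lengths are assumptions for illustration; only the 315% extension figure comes from the reported results.

```python
import math
from typing import List, Tuple


def foot_position(lengths: List[float], angles: List[float]) -> Tuple[float, float]:
    """Planar forward kinematics for a chain whose link lengths can change.

    lengths: current length of each link in metres, read fresh at every step
    angles:  angle of each link relative to the previous one, in radians
    """
    x = y = 0.0
    heading = 0.0
    for length, angle in zip(lengths, angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


rest = [0.10, 0.10]                         # hypothetical rest lengths in metres
angles = [0.0, math.pi / 2]                 # second link bent at a right angle
grown = [length * 3.15 for length in rest]  # links extended to 315% of rest length

tip_rest = foot_position(rest, angles)
tip_grown = foot_position(grown, angles)
print(tip_rest, tip_grown)
```

With identical joint angles, the tip of the chain lands in a different place once the links have grown. A controller that assumes fixed limb lengths would be aiming at the wrong spot, which is why the current shape has to be sensed and fed into the loop.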
The PhD student Wang Ting, quoted in the coverage, specifically explained that this growable bio‑inspired structure could be applied in field rescue missions, especially navigating narrow gaps, and that multiple locomotion modes let it adapt to complex terrain. That is not a lab curiosity. That is a potential life‑saver.
A British example: the Grenfell Tower aftermath
Imagine the scene after a high‑rise fire. Floors have partially collapsed. Stairwells are blocked by debris. Gaps between fallen concrete slabs are barely wider than a paperback book. A traditional search robot – even a small one on tracks – cannot fit through. A drone cannot see around corners in a smoke‑filled corridor.
A fleet of Grow HR units could:
Shrink to crawl through a 10‑centimetre gap.
Swim through flooded basement levels where sprinklers have run for hours.
Walk on water across a flooded landing to reach a trapped person.
Extend to reach a window ledge two metres higher.
Fly over a stairwell gap using ducted fans.
All from the same 4.5‑kilogram body. No need to send a human firefighter into a structurally unsound building. No need to wait for heavy lifting equipment. The robot goes first, finds the survivors, and reports back.
That is not science fiction. That is the stated goal of the researchers.
The adage that fits
There is an old British saying: “Little by little, the bird builds its nest.” It speaks to the power of small, incremental actions to achieve great things. Grow HR inverts that: it grows big by being small when it needs to be, and shrinks small by being soft when it needs to squeeze. Its nest is the entire environment, because it can adapt to any shape the environment throws at it.
Another adage applies: “Necessity is the mother of invention.” The necessity here is clear: rigid robots cannot navigate the messy, unpredictable, cluttered real world. Grow HR is the invention that necessity demanded.
What this means for the future of UK robotics
British research institutions – from the University of Bristol’s Robotics Laboratory to the Edinburgh Centre for Robotics – have long focused on soft and adaptable systems. Grow HR is a vindication of that direction. It indicates that the race is not about who builds the strongest, most powerful humanoid. It is about who builds the most adaptable.
Consider these applications for the UK:
Search and rescue in the Lake District or Snowdonia, where terrain varies from scree slopes to boggy moorland.
Nuclear decommissioning at Sellafield, where robots must fit through narrow pipes and then expand to manipulate equipment.
Offshore wind turbine inspection – a robot that can crawl along a turbine blade, swim to the next tower, and fly up to the nacelle.
Urban repair – squeezing behind walls to fix plumbing or electrical conduits without tearing the building apart.
The computing challenge remains formidable. Controlling a robot that changes shape while moving requires enormous real‑time processing. But the team’s success with coordinated actuation (linkages plus motors) suggests they have cracked the core problem. The rest is engineering.
The unavoidable question
If a robot can be soft, growable, and self‑reconfiguring, what stops it from being too adaptable? What happens when you cannot predict what shape it will take next? The researchers have addressed this through the textile layer and carbon fibre telescopic linkages – they mix materials to get stiffness when they need it. So the robot is not a chaotic blob. It is a controlled, programmable structure.
But the deeper question – the one that keeps computing ethicists awake at night – is about agency. If a robot can decide to change its own shape based on its environment, is it still a tool? Or does it become something closer to an organism? Grow HR does not make that decision on its own – the control algorithms are written by humans. But the trajectory is clear. As AI gets better at real‑time adaptation, the line between “programmed behaviour” and “emergent behaviour” will blur.
For now, Grow HR is a remarkable piece of engineering that defies physical expectation. It is a humanoid that can walk on water, swim through floods, and squeeze through gaps the width of a paperback. It weighs less than a full kettle. And it is only the beginning.
As the saying goes, “You cannot make an omelette without breaking eggs.” Soft robotics has broken the rigid egg of traditional humanoid design. The omelette – a new class of machines that adapt to any environment – is already on the stove. The only question is who gets to eat it.
3. “The proof of the pudding is in the eating” – and the pudding is now being assembled by a machine that looks nothing like a pudding.
For years, we have heard grand promises about humanoid robots taking over factory work. The headlines always came with caveats: “pilot programme”, “proof of concept”, “research project”. But somewhere in Henan province, at a facility belonging to CATL – the world’s largest electric vehicle battery maker – those caveats have been quietly scrubbed from the record.
Meet Xiaomo. It is a humanoid robot that has been deployed on a live battery production line, handling high‑voltage test plugs – a task previously reserved for skilled human workers. The robot achieves over 99 per cent connection success and matches the cycle times of experienced technicians. This is not a trial. This is not a demo for investors. This is production, running shift after shift, with no fanfare and no return to the drawing board.
What Xiaomo actually does
End‑of‑line testing for EV batteries is a nerve‑racking job. Workers must manually connect high‑voltage test plugs to battery packs, checking internal resistance and overall function before the packs leave the factory floor. The risks are real: a poor connection can lead to false readings, damaged components, or even electrical arcing. Consistency is everything – and humans, for all their skill, get tired, distracted, or have off days.
Xiaomo does not get tired. It uses an end‑to‑end vision‑language‑action model – a type of AI that links what it sees directly to what it does, without the need for pre‑programmed step‑by‑step instructions. The robot perceives the battery pack, identifies the correct test plug, calculates the force needed to insert it, and executes the movement in real time. If the wiring harness shifts slightly – as flexible materials tend to do – Xiaomo adjusts its grip and posture on the fly, just as a human would, but with millimetre precision and no hesitation.
The result: a 99%+ connection success rate and cycle times equal to the best human workers. Moreover, the robot continuously monitors each connection and reports anomalies in real time, cutting defect rates. During downtime, it switches into inspection mode, scanning for issues that human eyes might miss.
From pilot to production: the quiet revolution
Why is this significant? Because the robotics industry is littered with “world first” demonstrations that never made it past the laboratory doors. A robot backflips on YouTube. A robot folds laundry in a carefully lit studio. A robot serves coffee at a tech conference, then promptly falls over when no one is looking (as XPeng’s Iron did at a shopping mall, much to the internet’s amusement).
Xiaomo is different. It is working on an actual production line, handling actual high‑voltage equipment, alongside – and increasingly in place of – actual human workers. The company behind it, Spirit AI, was founded in 2024 and backed by CATL. In less than two years, it has gone from a startup pitch to a fully operational industrial asset.
CATL’s own numbers tell the story. From January to October last year, the company logged 355.2 gigawatt hours of global battery installations, up 36.6% year‑on‑year, giving it a 38.1% global market share. In China alone, November installations hit 40.87 gigawatt hours – 43.71% of the market. A company that size does not run experiments on its critical production lines. It deploys solutions that work.
A British parallel: the Nissan Sunderland plant
Imagine walking through Nissan’s factory in Sunderland, where the Leaf and Qashqai are built. For decades, the most delicate assembly tasks – wiring harness connections, sensor calibrations, final quality checks – have been the domain of human hands. Robots excel at welding, painting, and heavy lifting. But plugging a flexible cable into a tight port, with varying tolerances and the risk of damage? That has been a human job.
Now imagine a row of Xiaomo‑like humanoids at the end of the battery assembly line, silently and reliably connecting test plugs to every pack, hour after hour, with no breaks, no sick days, and no variation in quality. The union representatives would have questions. The productivity analysts would have spreadsheets. And the plant manager would have a very difficult decision to make about the next recruitment round.
That decision is already being made in China. It will be made in the UK within the next few years. The technology works. The cost is falling. And the business case is becoming unanswerable.
The computing reality: vision‑language‑action models
From a computing standpoint, Xiaomo represents a shift away from traditional industrial robotics. Classic factory robots are taught – an engineer manually guides the arm through a sequence of moves, and the robot repeats that sequence perfectly, forever. But that approach fails when the task involves uncertainty: flexible cables, variable part positions, or parts that change shape under stress.
Xiaomo runs on an end‑to‑end vision‑language‑action (VLA) model. Here is what that means in plain English:
Vision: The robot sees the battery pack and the test plug using depth cameras and 3D sensors. It builds a real‑time model of the environment.
Language: The AI interprets high‑level commands (“connect the high‑voltage test plug to port B”) and breaks them down into sub‑goals.
Action: The control system translates those sub‑goals into precise motor commands, adjusting force, angle, and speed based on continuous visual feedback.
Crucially, the VLA model is trained end‑to‑end – meaning the robot learns to connect the plug by watching thousands of successful and failed attempts, not by being explicitly programmed with rules. That is why it can handle a wiring harness that is slightly out of position or a plug that has worn down after thousands of cycles. It has learned the principle of successful connection, not just a fixed set of coordinates.
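For readers who think in code, the three stages above can be sketched as a toy feedback loop. This is only an illustration of the see, interpret, act cycle: the names Scene, perceive, plan, act and run are hypothetical, the "language" stage is a fixed lookup rather than a learned model, and nothing here reflects Spirit AI's actual software.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """Toy stand-in for the world: plug position relative to the port, in mm."""
    plug_offset_mm: float


def perceive(scene: Scene) -> float:
    """'Vision': return the measured offset (a real robot would use cameras)."""
    return scene.plug_offset_mm


def plan(command: str) -> list[str]:
    """'Language': map a high-level command to sub-goals (fixed lookup here)."""
    if "connect" in command.lower():
        return ["align", "insert"]
    return []


def act(scene: Scene, measured_offset_mm: float, gain: float = 0.5) -> None:
    """'Action': move the plug a fraction of the way towards the port."""
    scene.plug_offset_mm -= gain * measured_offset_mm


def run(command: str, scene: Scene, tolerance_mm: float = 0.1,
        max_steps: int = 50) -> bool:
    """Execute the sub-goals, re-measuring the offset before every movement."""
    goals = plan(command)
    if not goals:
        return False  # command not understood
    for goal in goals:
        if goal == "align":
            for _ in range(max_steps):
                offset = perceive(scene)
                if abs(offset) <= tolerance_mm:
                    break
                act(scene, offset)
            else:
                return False  # never aligned within the step budget
        elif goal == "insert":
            # Only insert if the plug is still aligned with the port.
            if abs(perceive(scene)) > tolerance_mm:
                return False
    return True


scene = Scene(plug_offset_mm=8.0)  # harness has shifted the plug by 8 mm
print(run("Connect the high-voltage test plug to port B", scene))
```

The point of the sketch is the loop, not the numbers: because the offset is measured afresh before each movement, a harness that shifts mid-task is corrected rather than missed, which is what a fixed sequence of taught coordinates cannot do.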
The adage that applies
“A stitch in time saves nine.” The old saying reminds us that a small, timely intervention prevents a much larger problem later. Xiaomo is that stitch. By ensuring every high‑voltage connection is made correctly the first time, it prevents battery failures, vehicle recalls, and potentially dangerous electrical faults. The robot does not just save labour. It saves the cost of correcting mistakes – and the reputational damage that comes with them.
Another adage comes to mind: “Don’t put all your eggs in one basket.” CATL has not replaced its entire workforce with humanoids. It has deployed Xiaomo on a specific, high‑risk, high‑precision task where the robot’s consistency delivers clear value. The human workers have been redeployed to other roles – for now. But as the technology matures and costs fall, the basket will hold more eggs.
What this means for the UK manufacturing sector
The UK has a proud manufacturing history, from the Industrial Revolution to the present day. But we have also seen entire industries – textiles, steel, shipbuilding – hollowed out by automation and global competition. The current wave of humanoid robotics is different. It is not about replacing a few repetitive jobs. It is about automating tasks that have always required human dexterity, judgment, and adaptability.
Consider these British examples where Xiaomo‑like robots could appear within five years:
Rolls‑Royce in Derby – assembling jet engine components that require precise insertion of sensors and wiring looms.
Oxford Nanopore – manufacturing DNA sequencing devices with microscopic fluidic connections.
Amazon fulfilment centres in Dunfermline and Doncaster – handling fragile items that current robotic grippers cannot manage.
Pharmaceutical plants in Macclesfield or Speke – connecting sterile tubing under cleanroom conditions, where human presence introduces contamination risk.
In each case, the value proposition is the same: reliability at scale. A human worker can connect a tricky plug correctly 99 times out of 100. The 100th time, when they are tired or distracted, causes a failure that ripples through the entire production line. Xiaomo’s reported success rate is above 99 per cent – and when a connection does fail, it flags the error immediately, allowing a human to intervene without stopping the line.
The unavoidable counterargument
“But what about the workers?” It is a fair question. Every time a robot takes over a skilled manual task, someone loses a job – or, more accurately, a specific job description disappears. The history of automation, however, suggests that new jobs emerge. When spreadsheets automated bookkeeping, accountants did not become unemployed. They became financial analysts. When ATMs automated cash dispensing, bank tellers did not vanish. They moved into customer service and sales.
The difference this time is the breadth of what is being automated. Xiaomo does not just replace a single repetitive motion. It replaces the judgment, the fine motor control, the adaptability, and the experience of a skilled technician. Those are precisely the qualities that workers take years to develop. When a machine can match them, the human’s comparative advantage shrinks to the tasks that require genuine creativity, empathy, or physical presence in unstructured environments.
That is the computing reality: capability transfer. Once a neural network can perceive, plan, and act with human‑level reliability, the economic case for retaining a human in that role collapses. It is not about being anti‑worker. It is about the cold arithmetic of production cost, defect rates, and uptime.
The future is already here
CATL’s deployment of Xiaomo is not a glimpse of the future. It is a photograph of the present, taken in a factory that makes batteries for electric cars that you might drive within the next year. The robot works alongside humans, but it does not need coffee breaks, holidays, or sleep. It does not complain about working conditions or ask for a raise. It simply connects test plugs, one after another, with relentless consistency.
As the saying goes, “You cannot have your cake and eat it.” For decades, Western manufacturers wanted the productivity of automation without the social disruption. China has made a different choice: deploy the robots, scale the production, and manage the social consequences later. Whether that is wise or reckless depends on your perspective. But it is happening. And Xiaomo is on the front line – literally.
The proof of the pudding is in the eating. CATL has tasted the pudding of humanoid labour, and it likes the flavour. The rest of the world will soon have to decide whether to take a bite.
4. “All good things come to an end” – and the backflipping showman has finally clocked in for the night shift.
For more than a decade, Boston Dynamics’ Atlas was the rock star of robotics. You have seen the videos: a humanoid doing parkour across a warehouse, landing backflips with the grace of an Olympic gymnast, and generally making engineers weep with joy. Those clips have been viewed hundreds of millions of times. They also gave the public a very specific impression – that humanoid robots are thrilling, slightly terrifying, and completely impractical for actual work.
That impression just became obsolete.
The backflipping research robot is dead. Long live the industrial Atlas, now scheduled for deployment at Hyundai Motor Group Metaplant America and integrated with Google DeepMind’s AI stack. The partnership with the Robotics and AI Institute (RAI) – led by Marc Raibert, who founded Boston Dynamics in the first place – has produced something the field has pursued for nearly two decades: zero‑shot transfer from simulation to hardware.
Allow me to explain what that means in plain English, because it is the real story here – not the flips, but the fact that the flips are now boring.
Zero‑shot transfer: the holy grail of robot learning
Traditionally, teaching a robot a new skill involves three painful steps:
Build a simulation – a virtual environment where the robot can try things without breaking itself.
Train the robot in the simulation – millions of attempts, falls, corrections, and gradual improvements.
Transfer the learned behaviour to the real robot – and watch it fail in ways the simulation never predicted, because simulations are never perfect. The real world has friction, surface irregularities, lighting changes, sensor noise, and a thousand other variables that no model fully captures.
That third step – the sim‑to‑real gap – has been a nightmare for robotics researchers. A robot that can do a backflip perfectly in a simulation often face‑plants in the lab. Fixing it requires manual tuning, hours of real‑world testing, and a great deal of expensive hardware replacement.
Zero‑shot transfer means the robot learns a skill entirely in simulation, then runs the exact same control policy on the real hardware without any extra tuning or calibration. It works the first time. No adjustments. No surprises. Just seamless deployment from the digital twin to the physical machine.
RAI and Boston Dynamics achieved that with Atlas. The natural walking you saw at CES 2026, the sideways cartwheels, the backflips, the mid‑step balance recoveries – all from the same learning framework, trained in simulation, transferred to the real robot without additional tuning. That is not an incremental improvement. That is a paradigm shift.
What this means for the factory floor
The research Atlas is retired. The Enterprise Atlas is its replacement, designed for large‑scale manufacturing. It has 56 degrees of freedom – meaning it can bend, twist, and reach in almost as many ways as a human body. Its grippers feature four‑digit tactile sensing, allowing it to handle parts with the same delicate precision a human machinist would use.
Hyundai has confirmed that these robots will be deployed at its Metaplant America in Georgia by 2028. The first tasks are part sequencing – organising components in the right order for assembly lines – and by 2030, the plan is to expand into full component assembly. That is not a research project. That is a production roadmap from one of the world’s largest automotive manufacturers.
Let me put that in a British context. Imagine the Mini plant in Oxford or the London Electric Vehicle Company factory in Coventry (where they build the iconic black cabs). Currently, workers spend hours moving parts from bins to assembly stations, checking sequences, and performing repetitive but precise insertions. Those jobs are boring, physically demanding, and prone to error. They are also exactly the kind of task that a humanoid with zero‑shot transfer capabilities can handle – reliably, consistently, without complaint.
The difference is that the Oxford plant would not need to spend months tuning each robot. They would download the “part sequencing” policy from a cloud repository, load it onto Atlas, and let it run. When the production line changes – say, from building a three‑door to a five‑door – they would push an updated policy. No hardware reconfiguration. No weeks of downtime. Just software updates for a hardware platform.
The computing breakthrough behind zero‑shot transfer
How did they do it? The answer lies in whole‑body learning with massive simulation diversity.
Traditional simulation‑to‑real transfer fails because simulations are too clean. They use perfect physics models, idealised contact surfaces, and predictable lighting. Real robots encounter stochasticity – random variations that simulations rarely capture.
RAI and Boston Dynamics took a different approach. They trained Atlas in simulations that were heavily randomised – varying friction coefficients, surface stiffness, actuator noise, sensor latency, even gravity. The robot learned to perform backflips not on a perfect virtual mat, but on a thousand different virtual surfaces, each with its own quirks. By the time it moved to the real world, it had already seen variations far wilder than anything reality could throw at it.
The result: robust policies that generalise. The same control framework that does a cartwheel can also recover from a stumble, adjust to a slippery floor, or compensate for a slightly miscalibrated joint. That is why the Enterprise Atlas can move from a research lab to a factory floor without retraining.
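The randomisation described above can be sketched in a few lines. This is a minimal illustration of drawing a fresh set of physics parameters for every training episode; the parameter names follow the list in the text, but the numeric ranges and the names SimParams, RANGES, sample_params and episode_configs are assumptions made for the example, not values published by RAI or Boston Dynamics.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    friction: float           # floor friction coefficient
    surface_stiffness: float  # N/m, how springy the ground is
    actuator_noise: float     # torque noise as a fraction of the command
    sensor_latency_ms: float  # delay between sensing and acting
    gravity: float            # m/s^2


# Hypothetical ranges, chosen for illustration only.
RANGES = {
    "friction": (0.3, 1.2),
    "surface_stiffness": (1e4, 1e6),
    "actuator_noise": (0.0, 0.05),
    "sensor_latency_ms": (0.0, 30.0),
    "gravity": (9.5, 10.1),
}


def sample_params(rng: random.Random) -> SimParams:
    """Draw one randomised physics configuration for a training episode."""
    return SimParams(**{name: rng.uniform(lo, hi)
                        for name, (lo, hi) in RANGES.items()})


def episode_configs(n_episodes: int, seed: int = 0) -> list[SimParams]:
    """One independent configuration per episode, reproducible via the seed."""
    rng = random.Random(seed)
    return [sample_params(rng) for _ in range(n_episodes)]


# Each simulated episode would be run under a different configuration.
for params in episode_configs(3):
    print(params)
```

A policy that succeeds across all of these virtual worlds has, in effect, been forced not to rely on any one of them being accurate, which is the intuition behind running it unchanged on real hardware.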
The adage that fits
There is a British saying: “Look before you leap.” Traditionally, robot training has been the opposite – leap first, then look at the wreckage, then rebuild. Zero‑shot transfer reverses that. The robot looks (in simulation) a million times, then leaps exactly once in reality, and lands perfectly.
Another adage applies: “A change is as good as a rest.” Atlas has gone from a tireless performer of viral stunts to a tireless worker of factory shifts. That change is not just a rest for the research team. It is a complete redefinition of what the robot is for.
Why this is relevant for the UK robotics sector
The UK has world‑class robotics research – at Imperial College, the University of Bristol, the University of Edinburgh, and elsewhere – but we have struggled to translate that research into industrial deployment. British manufacturing, while still significant (over £200 billion annually), has lagged behind Germany and China in adopting next‑generation automation.
Zero‑shot transfer changes the economics. If a robot can be trained in simulation and deployed without on‑site tuning, the barrier to entry for small and medium‑sized manufacturers plummets. A precision engineering firm in Sheffield could buy an Atlas, download a policy for CNC machine tending, and be productive within a day – without hiring a team of robotics PhDs to calibrate it.
That is the promise. Whether UK industry seizes it depends on investment, training, and a willingness to embrace automation not as a threat but as a competitive necessity.
The end of the research era – and the beginning of something else
The video of Atlas doing backflips will remain on YouTube forever. It is a monument to what was possible when engineers had unlimited time, budget, and patience. But the robot in those videos is gone. In its place is a machine that does not perform for cameras. It sequences parts. It assembles components. It works.
As the saying goes, “Youth is wasted on the young.” The backflipping Atlas was the young, flashy showoff. The Enterprise Atlas is the mature, reliable professional. It has stopped showing off and started earning its keep.
And the really unsettling part? The same learning framework that taught it to backflip now teaches it to work. The balance recovery that made the cartwheel possible also makes the part insertion stable. The dynamic motion that looked like art is now applied to logistics. The research chapter has closed – but the industrial chapter has just begun.
“What goes around comes around.” The flips that made Atlas famous have come around as the stability that makes it useful. The only question is whether we are ready for a robot that is not just athletic, but genuinely helpful. The answer, it seems, is that we do not have a choice. Atlas is already on the production schedule.
5. “Many hands make light work” – but when the hands are robotic, the work becomes a flood.
Let us sit with a number for a moment: 5,500. That is how many humanoid robots Unitree Robotics shipped in 2025. Now consider another set of numbers: 150, 150, 150. That is roughly what Tesla, Figure AI, and Agility Robotics each shipped in the same year. Roughly 450 between the three of them, against 5,500 from Unitree alone. The gap is not a gap. It is a chasm.
If you have been following the humanoid robotics industry through Western media, you might think the race is close. You have seen Elon Musk’s Optimus waving at a camera. You have watched Figure 03 handing over a shirt. You have marvelled at Agility’s Digit walking up a flight of stairs. All impressive. All real. And all, in terms of volume, barely a rounding error compared to what is happening in China.
The International Federation of Robotics provides the wider context: 64 per cent of industrial robots in the global electronics industry are installed in China. And Chinese manufacturers supply 59 per cent of that sector globally. That means when you buy a smartphone, a laptop, or a smart speaker – anywhere in the world – the odds are better than even that a Chinese‑built robot had a hand in making it.
Unitree: from viral clips to volume production
You have probably seen Unitree’s robots online. The G1 doing kung fu. The H1 sprinting at near‑Usain Bolt speeds. The GD01 mecha smashing through walls. These clips are designed to go viral – and they do. But the real story is happening not on YouTube, but on factory floors and in shipping containers.
Unitree shipped 5,500 pure humanoid robots in 2025. That is not a pilot run. That is not a small batch for early adopters. That is industrial‑scale production – the kind of volume that drives down costs, improves reliability through real‑world feedback, and creates a self‑reinforcing cycle of supply chain optimisation.
For comparison, IDC estimates global humanoid shipments hit 18,000 units in 2025, with $440 million in sales and 508 per cent year‑over‑year growth. Chinese companies accounted for more than 80 per cent of installations. Unitree alone represented nearly a third of global shipments.
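The figures quoted above are easy to check. The short sketch below derives Unitree's share, the minimum number of Chinese units implied by "more than 80 per cent", and the implied average selling price. It assumes the shipment and installation figures share the same base of 18,000 units, which the coverage does not state explicitly; the variable names are mine.

```python
GLOBAL_UNITS_2025 = 18_000       # IDC estimate quoted in the text
GLOBAL_SALES_USD = 440_000_000   # IDC estimate quoted in the text
UNITREE_UNITS = 5_500
CHINESE_SHARE_PERCENT = 80       # "more than 80 per cent", so a lower bound

# Unitree's slice of global shipments.
unitree_share = UNITREE_UNITS / GLOBAL_UNITS_2025

# Lower bound on units from Chinese companies (integer arithmetic).
chinese_units_floor = GLOBAL_UNITS_2025 * CHINESE_SHARE_PERCENT // 100

# Average revenue per unit implied by the two IDC figures.
avg_price_usd = GLOBAL_SALES_USD / GLOBAL_UNITS_2025

print(f"Unitree share of global shipments: {unitree_share:.1%}")
print(f"Chinese units, at least: {chinese_units_floor:,}")
print(f"Implied average price: ${avg_price_usd:,.0f}")
```

On these numbers Unitree's share comes out at just over 30 per cent, consistent with "nearly a third".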
The computing logic of scale
Why does scale matter so much in robotics? Because humanoid robots are not magic. They are hardware‑software systems that improve with data. Every robot shipped is a data collection device. It walks, grasps, balances, fails, recovers, and sends that experience back to the cloud. The more robots in the field, the faster the AI learns.
Here is the computing reality: reinforcement learning and foundation models are data‑hungry. A humanoid that has seen 1,000 hours of real‑world walking develops better gait stability than one that has seen 100 hours. A gripper that has attempted 10,000 grasps learns force control that no simulation can match. At similar utilisation, Unitree’s 5,500 robots generate roughly 37 times as much operational data as Tesla’s 150. That data advantage compounds. The models trained on it become superior. The next generation of robots starts from a higher baseline.
This is the virtuous cycle that China has mastered – first in consumer electronics, then in electric vehicles, now in humanoid robotics. You build at scale, you learn at scale, you improve at scale, and you out‑compete everyone else on price and performance.
A British high street example
Imagine walking into a Currys store in 2028. On the shelf, you see two humanoid robots for home assistance. One is from a Western brand, priced at £25,000. The other is from Unitree, priced at £6,000. Both claim similar capabilities. Which one do you buy?
The price difference is not magic. It is the result of supply chain density. China has all the major industrial categories domestically available: high‑performance motors, reducers, sensors, batteries, carbon fibre materials. A robot builder in Shenzhen can source every component within a 50‑mile radius. A robot builder in California or Munich spends months negotiating with suppliers across three continents.
That is the scale advantage. It is not just about making more robots. It is about making each robot for a fraction of the cost, with a fraction of the lead time, and with a supply chain that is resilient to global shocks.
The adage that explains it all
“A rolling stone gathers no moss.” The Chinese robotics industry is a very fast‑rolling stone. It does not sit still long enough for moss – or competitors – to accumulate. Every month, new factories open, new components are certified, and new robots roll off the line.
Another saying: “Don’t cut off your nose to spite your face.” Western policymakers face a difficult choice. They can try to restrict Chinese robotics imports – but that would mean paying three to five times more for domestic alternatives, slowing adoption, and falling further behind in AI training data. Or they can accept Chinese dominance in yet another strategic industry. Neither option is comfortable.
What 5,500 robots actually means for the world
Let us put that 5,500 number in perspective. If each robot works one eight‑hour shift per day, that is 44,000 robot‑hours per day of real‑world operational data. That is roughly 16 million hours per year. Tesla’s 150 robots, at the same utilisation, would generate about 440,000 hours – less than 3 per cent of Unitree’s data volume.
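The arithmetic behind those figures, spelled out. The sketch assumes, as the paragraph does, one eight-hour shift per robot per day, plus two assumptions of my own: every shipped robot is in service, and it works all 365 days of the year. The function name fleet_hours is hypothetical.

```python
def fleet_hours(robots: int, hours_per_day: float = 8.0, days: int = 365) -> float:
    """Total operating hours for a fleet over a number of days."""
    return robots * hours_per_day * days


unitree_daily = fleet_hours(5_500, days=1)   # robot-hours per day
unitree_yearly = fleet_hours(5_500)          # robot-hours per year
tesla_yearly = fleet_hours(150)              # same utilisation, smaller fleet
tesla_fraction = tesla_yearly / unitree_yearly

print(f"Unitree per day:  {unitree_daily:,.0f} robot-hours")
print(f"Unitree per year: {unitree_yearly:,.0f} robot-hours")
print(f"Tesla per year:   {tesla_yearly:,.0f} robot-hours")
print(f"Tesla as a share of Unitree: {tesla_fraction:.1%}")
```

Because utilisation is assumed equal, the ratio depends only on fleet size: 150 divided by 5,500, or a little under 3 per cent.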
Over five years, the gap becomes insurmountable. The Chinese models will have seen every conceivable edge case: slippery floors, uneven pavements, crowded spaces, faulty sensors, unexpected obstacles. The Western models will still be catching up.
This is not speculation. It is the same pattern that played out in solar panels, lithium batteries, and electric vehicles. China does not invent every breakthrough – many come from Western labs – but it scales them faster than anyone else. By the time the West has a working prototype, China has a million units in the field.
The computing corollary: more data, better AI
The AI behind humanoid robots is not magic. It is, at its core, pattern recognition over massive datasets. A model that has seen 10,000 examples of a robot recovering from a stumble will develop better balance than a model that has seen 1,000 examples. A model that has experienced 100,000 grasping attempts will develop more reliable force control than one with 10,000 attempts.
Unitree’s 5,500 robots are not just products. They are data factories. Each robot is a sensor platform, constantly recording its interactions with the physical world. That data flows back into Unitree’s unified large model for robotics, which improves, and then flows back out to the fleet as an over‑the‑air update.
The Western competitors are not stupid. They know this. But they cannot manufacture 5,500 robots without the supply chain to support it. And they cannot build the supply chain without the manufacturing volume to justify it. It is a chicken‑and‑egg problem – and China already owns both the chicken and the egg.
The British angle: can we compete?
The UK has excellent robotics research – at the Bristol Robotics Laboratory, at Imperial’s Hamlyn Centre, at Edinburgh’s National Robotarium. But research does not ship units. Manufacturing does.
For Britain to compete in humanoid robotics, we would need to rebuild industrial supply chains that were dismantled decades ago. Motor production, sensor fabrication, battery cell manufacturing – these are not skills we have in abundance. We could try to buy from Chinese suppliers, but that simply reinforces their advantage. Or we could specialise in software and AI, leaving the hardware to China – a risky strategy when hardware and software are increasingly intertwined.
The saying “Too many cooks spoil the broth” comes to mind. The West has many excellent robotics researchers – but they are scattered across dozens of companies and universities, each with its own approach, its own platform, its own data format. China has a smaller number of very large companies (Unitree, Agibot, UBTECH, XPeng) all feeding into a common supply chain and, increasingly, a common data ecosystem. That focus produces results.
The bottom line
Unitree shipped 5,500 humanoid robots in 2025. Tesla shipped 150. That is not a failure of Western engineering. It is a failure of Western scale – of supply chains, of investment, of political will to treat robotics as a strategic industry rather than a niche market.
As the adage goes, “The early bird catches the worm.” China was not necessarily the first to develop humanoid robots. But it was the first to decide that it wanted to manufacture them by the thousands, and it built the industrial machine to do so. The worm – the global humanoid robotics market – is now largely caught.
The only question for the UK and the rest of the West is whether we are content to watch from the sidelines, or whether we will finally roll up our sleeves and build something of our own. Because 5,500 is not the final number. It is just the beginning. And the gap is only going to widen.
6. “You get what you pay for” – unless you are buying a humanoid robot from China, in which case you get rather a lot more.
Let us talk about money. Specifically, let us talk about the moment a humanoid robot became cheaper than a second‑hand Ford Fiesta. Unitree’s R1 humanoid is priced at roughly $4,370 – about £3,500 at current exchange rates. That is not a typo. That is not a stripped‑down educational kit. That is a fully functional bipedal robot that can run downhill, perform cartwheels, stand up from the ground, and handle athletic motion.
Now let us talk about the competition. US‑made humanoids routinely cost ten times more. A comparable American robot – if one exists at that capability level – would set you back £35,000 or more. Why? The answer is not better engineering. The answer is supply chains.
The analyst’s observation that changes everything
A tech analyst quoted in the coverage put it bluntly: China is the only country in the world with all major industrial categories domestically available. That means:
High‑performance motors – the kind that give a humanoid its strength and speed.
Reducers – the precision gearing that turns motor torque into smooth, controlled motion.
Sensors – LiDAR, depth cameras, force sensors, inertial measurement units.
Batteries – high‑density, fast‑charging, safe chemistries.
Carbon fibre materials – lightweight, strong, essential for keeping a humanoid’s weight down without sacrificing durability.
All of these are manufactured inside China, at scale, with mature supply chains and fierce domestic competition. That drives prices down for everyone. A robot builder in Shenzhen can source a motor for a fraction of what a builder in Detroit or Stuttgart would pay – and they can get it delivered tomorrow, not next month.
The computing perspective: hardware is the new software
For decades, the computing industry has enjoyed the benefits of Moore’s Law – transistor counts doubling roughly every two years, making chips cheaper, faster, and denser. Software rode that wave. A startup could build a world‑changing app with nothing but a laptop and a cloud account because the underlying hardware was cheap and ubiquitous.
Robotics has never had that luxury. Hardware has stubbornly refused to follow Moore’s Law. Motors, gears, sensors, and structural materials have seen only incremental cost reductions. The result: robots remained expensive, niche, and confined to well‑funded research labs and automotive factories.
That era is ending. China has effectively created a Moore’s Law for robot hardware – not through transistor scaling, but through vertical integration and massive volume. When you manufacture 5,500 humanoids in a year, the cost of a motor drops. When you produce 100,000 robot joints annually (as one Shanghai factory already does), the cost of a reducer plummets. When you source carbon fibre for wind turbines, electric vehicles, and robots from the same domestic mills, the material cost becomes almost negligible.
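The volume argument can be put in rough numbers. What follows is a minimal sketch, assuming a Wright's law style learning curve in which unit cost falls by a fixed percentage each time cumulative output doubles. The coverage does not name this model, and the 20 per cent learning rate and the 1,000 first-unit cost are illustrative assumptions, not Unitree figures.

```python
import math

def unit_cost(first_unit_cost: float, cumulative_units: int, learning_rate: float) -> float:
    """Cost of the n-th unit under a Wright's-law learning curve.

    learning_rate is the fractional cost drop per doubling of cumulative
    output (0.20 means each doubling cuts unit cost by 20 per cent).
    """
    if cumulative_units < 1:
        raise ValueError("cumulative_units must be at least 1")
    exponent = math.log2(1.0 - learning_rate)  # negative whenever costs fall
    return first_unit_cost * cumulative_units ** exponent

# Hypothetical component: a joint motor costing 1,000 for the very first unit.
for n in (1, 100, 5_500, 100_000):
    print(f"unit {n:>7,}: {unit_cost(1000.0, n, 0.20):8.2f}")
```

The shape matters more than the numbers: under this assumption every doubling of output buys the same percentage saving, so the maker with the largest cumulative volume keeps the lowest costs.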
The R1 is the first visible product of this new reality. It will not be the last.
A British high street comparison
Imagine walking into a B&Q warehouse in 2027. On one aisle, you see a US‑made humanoid for £35,000. On the next aisle, you see Unitree’s R1 for £3,500. Both can carry a power drill to a shelf. Both can navigate an aisle crowded with customers. Both can be programmed by a store manager with no robotics training.
Which one does B&Q buy? The answer is obvious. They buy ten of the cheaper ones. They put one in every department. They reduce labour costs, improve stock accuracy, and offer 24‑hour customer assistance. The £35,000 robot sits on the shelf, unsold, a monument to a supply chain that could not compete.
Now extend that logic to every industry: warehouses, hospitals, hotels, schools, care homes. At £3,500, a humanoid becomes a capital expense that pays for itself within months – not years. At £35,000, it remains a luxury for the rich and the heavily subsidised.
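The "months, not years" claim is simple arithmetic once you pick a figure for the labour a robot displaces. Here is a back-of-envelope sketch. The £12 hourly saving and the four useful hours a day are assumptions chosen purely for illustration, and the sum ignores maintenance, electricity, and software.

```python
def payback_months(price_gbp: float, hourly_saving_gbp: float,
                   useful_hours_per_day: float, days_per_month: float = 30.0) -> float:
    """Months until cumulative labour savings equal the purchase price."""
    monthly_saving = hourly_saving_gbp * useful_hours_per_day * days_per_month
    if monthly_saving <= 0:
        raise ValueError("savings must be positive")
    return price_gbp / monthly_saving

# Same assumed workload for both machines; only the price differs.
cheap = payback_months(3_500, hourly_saving_gbp=12.0, useful_hours_per_day=4.0)
dear = payback_months(35_000, hourly_saving_gbp=12.0, useful_hours_per_day=4.0)
print(f"£3,500 robot: {cheap:.1f} months; £35,000 robot: {dear:.1f} months")
```

On those assumptions the cheaper machine pays for itself in under three months and the dearer one in roughly two years. Change the assumptions and the figures move, but the tenfold gap between them does not.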
The adage that captures the moment
“A penny saved is a penny earned.” Unitree has saved a great many pennies by sourcing everything locally. Those savings are passed directly to the customer. The result is a robot that costs less than a decent second‑hand car – and does a great deal more.
Another saying: “Cut your coat according to your cloth.” China has the cloth – the motors, sensors, batteries, and materials. It has cut the coat – the R1 – to fit a price point that the rest of the world cannot match. The rest of us are still measuring our cloth and wondering why the tailor is so expensive.
The supply chain as a strategic advantage
Why can the US not do the same? Because it no longer manufactures many of these components domestically. High‑performance motors? Many are made in China or Germany. Reducers? Japan and China dominate. Sensors? A mix of Europe, the US, and China – but with Chinese versions often cheaper and more readily available. Batteries? China controls over 75 per cent of global lithium‑ion battery production. Carbon fibre? China is the world’s largest producer.
The US could rebuild these industries. It would take a decade and hundreds of billions of dollars. In the meantime, Unitree is shipping robots.
The UK is in an even more difficult position. We have excellent design and research capabilities – Rolls‑Royce makes world‑class engines for jets, not motors for humanoids. We have niche sensor companies. We have superb materials science at universities. But we do not have the integrated, high‑volume supply chains that make a £3,500 humanoid possible. We would have to import most of the components, pay tariffs and shipping, and then assemble them at higher labour costs. The result would be a £15,000 robot at best – still cheaper than the US, but not competitive with China.
The computing corollary: cheap hardware enables cheap AI
The R1 is not just a cheap robot. It is a cheap platform for AI research and deployment. When a humanoid costs £3,500, a university lab can buy ten of them. A small business can buy one. A hobbyist with a decent credit limit can buy one.
That matters because AI needs data. The more R1s out in the world, the more walking, grasping, and interacting data flows back to Unitree’s servers. That data improves the foundation model. The improved model makes the next generation of R1s smarter. The smarter robots increase demand. The increased demand drives further cost reductions. It is a virtuous cycle – and it starts with a £3,500 price tag.
Western competitors face a vicious cycle. Their robots are expensive, so they sell fewer units. Fewer units mean less real‑world data. Less data means slower AI improvement. Slower improvement means their robots stay less capable, less reliable, and less attractive to buyers. The gap widens with every passing month.
The bottom line
Unitree’s R1 at $4,370 is not a miracle. It is the logical outcome of a deliberate, decades‑long strategy to own every link in the manufacturing chain. China did not get lucky. It invested in motor factories, reducer production lines, sensor fabs, battery gigafactories, and carbon fibre mills. Those investments are now paying off in the form of humanoid robots that cost less than a family holiday.
As the saying goes, “Rome wasn’t built in a day.” Neither was China’s robotics supply chain. But it has been built, it is running at full capacity, and it is producing robots that are changing the economics of automation forever.
The question for the UK is not whether we can match the R1’s price. We cannot – not without a complete industrial overhaul that no government has the stomach for. The question is whether we can find a different path – perhaps specialising in high‑end, high‑precision robots for niche applications where price is less important than performance. Or perhaps focusing on the software and AI that make the hardware useful, leaving the physical manufacturing to China.
One thing is certain: the cost curve has collapsed. The £3,500 humanoid is here. The only debate is what we do with it.
7.“If you want a thing done well, do it yourself” – but what if the thing doing it is a two‑storey mecha that folds into a giant metal dog?
Let me take you back to a childhood dream. You are nine years old, sitting in front of the telly on a Saturday morning, watching Robotech or Transformers or Thunderbirds. The heroes climb into giant machines. The machines walk, run, fight, and save the day. You think: “I want one.” Then you grow up, and you file that desire alongside wanting a pet dragon or a chocolate factory.
Well, file it no longer. Unitree Robotics – the same company that brought you the backflipping, kung‑fu‑fighting G1 – has just unveiled the GD01. It is a manned mecha. It stands 2.7 metres tall – about eight feet ten inches in old money. It weighs 500 kilograms with a pilot on board. It can walk upright on two legs. And then, in a matter of seconds, it folds its legs underneath itself, reconfigures its entire chassis, and crawls on all fours like a giant robotic hound.
The intro video shows Unitree’s founder and CEO, Wang Xingxing, walking up to the machine, holding its hand (yes, holding its hand), and climbing into an open‑air cockpit mounted in its chest. Then the GD01 walks smoothly across a yard, approaches a stack of cinder blocks, and absolutely demolishes them. Later, without a pilot, it smashes through a brick wall. Then it transforms. Then it keeps moving across uneven terrain.
Elon Musk called it “cool”. He was not wrong. But a far more useful observation came from Chen Jing, vice president of the Technology and Strategy Research Institute. He said this: once a robot can carry a human and perform tasks, it stops replacing labour and starts extending human capability – similar to how cars transformed mobility.
That is the real story. Not the cinder blocks. Not the transformation. The shift in category.
From labour replacement to capability extension
For the past decade, the robotics industry has been obsessed with one question: “Will robots take our jobs?” The GD01 answers a different question: “What can humans do when they are wearing a robot?”
Think about a car. A car does not replace a human. It extends what a human can do. A person who can walk three miles in an hour can, in a car, cover sixty miles in that same hour. The person is not obsolete. They are augmented. The car is a tool of extension, not replacement.
The GD01 is a car for the body – but instead of wheels, it has legs. Instead of a chassis, it has a humanoid form that can punch through walls, carry heavy loads, and navigate rubble that would stop a vehicle cold. A construction worker inside a GD01 could lift steel beams alone. A firefighter could smash through a collapsed ceiling to reach a trapped family. A soldier could carry heavy equipment across terrain no vehicle can cross.
That is the vision Chen Jing is pointing to. The GD01 is not a worker replacing a human. It is a suit – an exoskeleton writ large – that makes a single human capable of feats that would otherwise require a team, a crane, and a great deal of luck.
The computing reality: human‑in‑the‑loop control
From a computing standpoint, the GD01 is fascinating because it is not autonomous. The pilot controls the machine directly. But the control system is not a simple joystick‑and‑pedal arrangement. It is a semi‑autonomous bipedal platform that translates human intent into stable, balanced motion.
When a pilot leans forward, the GD01 does not simply lean. It calculates the necessary joint torques, centre‑of‑gravity adjustments, and foot placement to maintain stability while moving in that direction. When the pilot raises an arm to punch a wall, the robot’s control system manages the impact absorption, distributes force through the chassis, and prevents the pilot from being rattled into unconsciousness.
This is shared control – a partnership between human intuition and machine precision. The human provides the high‑level goals (“walk there”, “smash that”, “crawl under this beam”). The robot handles the low‑level stability, balance, and force management. Neither could do it alone. Together, they become something new.
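One way to picture shared control is as a filter sitting between the pilot's request and the legs. The sketch below is not Unitree's controller, whose details are not public. It assumes the textbook linear inverted pendulum model, in which the zero-moment point must stay inside the foot's support region, and it clamps the pilot's requested acceleration to fit. Every name and number is hypothetical.

```python
G = 9.81  # gravitational acceleration, m/s^2

def safe_acceleration(requested_accel: float, com_x: float, com_height: float,
                      foot_back: float, foot_front: float) -> float:
    """Clamp the pilot's forward acceleration so the robot stays balanced.

    Linear inverted pendulum: zmp = com_x - (com_height / G) * accel.
    The zero-moment point (zmp) must lie within [foot_back, foot_front].
    """
    accel_max = (com_x - foot_back) * G / com_height   # puts the zmp at the heel
    accel_min = (com_x - foot_front) * G / com_height  # puts the zmp at the toe
    return max(accel_min, min(accel_max, requested_accel))

# The pilot slams the controls forward; the machine grants only what balance permits.
granted = safe_acceleration(requested_accel=5.0, com_x=0.0, com_height=1.5,
                            foot_back=-0.15, foot_front=0.25)
```

The human still chooses the direction and the urgency. The machine decides how much of that request is physically safe to honour.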
A British example: the aftermath of a tunnel collapse
Imagine the scene after a lorry fire in the Dartford Tunnel. Thick smoke, twisted metal, unstable concrete. A rescue team arrives, but they cannot get close – the heat is too intense, the debris too unstable, the air unbreathable.
A pilot in a GD01 could walk into that tunnel. The robot’s legs navigate the rubble. Its arms clear debris. Its chest cockpit protects the pilot from heat and smoke. The pilot, using onboard cameras and thermal sensors, locates survivors and lifts beams that would crush any unaided human. The robot carries them out, one under each arm.
That is not science fiction. That is the stated use case for manned mecha in disaster response. The GD01 is not designed for the battlefield (though that application is obvious). It is designed for civilian heavy work – construction, rescue, logistics – where the combination of human judgment and robotic strength is worth half a million pounds.
The adage that applies
“Horses for courses.” The GD01 is not for every job. You would not use it to make tea or stack shelves. But for the jobs that require raw strength, stability, and the ability to move through rough terrain, it is the right horse. And unlike a traditional bulldozer or crane, this horse can walk up stairs, crawl under low ceilings, and pick up a person without crushing them.
Another saying: “A new broom sweeps clean.” The GD01 is a very new broom – one that sweeps away the old distinction between “vehicle” and “tool”. A car is a vehicle. A forklift is a tool. The GD01 is both: a vehicle you pilot that is also a tool you wield.
The cost and the market
The GD01 costs between $573,000 and $650,000 – roughly £460,000 to £520,000 at the exchange rate used for the R1. That is not pocket change. It is, however, comparable to a high‑end fire engine or a specialised construction vehicle. A local authority buying a GD01 for urban search and rescue would be making a capital investment similar to buying a new ladder truck.
Unitree is not expecting consumers to buy these. They are targeting professional and industrial users – fire services, military units, construction firms, disaster response organisations. For those users, the GD01 offers capabilities that no existing machine can match. A fire engine cannot climb stairs. A bulldozer cannot pick up a survivor gently. A crane cannot crawl through a collapsed parking garage.
The GD01 can do all three. That is why it costs half a million pounds. And that is why, if the technology proves reliable, the price will eventually fall – just as the R1 fell to £3,500.
The computing challenge: balance as a service
The hardest engineering problem in the GD01 is not the transformation mechanism or the impact resistance. It is keeping the pilot comfortable while the robot moves. A walking robot naturally sways from side to side. That sway, if transmitted directly to the pilot, would cause motion sickness within minutes. The GD01’s control system must actively cancel that sway, stabilising the cockpit while allowing the legs to move naturally.
This is an active suspension problem – similar to how a luxury car isolates passengers from road bumps, but far more complex because the robot’s own legs are the source of the motion. The computing power required to solve this in real time is substantial. The fact that the GD01 exists suggests Unitree has cracked it.
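A toy version of the cancellation problem, with no claim to match Unitree's undisclosed design: the torso sways from side to side as a sine wave, and a hypothetical cockpit actuator shifts the seat by the opposite of the last measured sway. The one-sample sensing delay is the assumption that keeps the cancellation from being perfect.

```python
import math

def residual_sway(amplitude_m: float, freq_hz: float, dt_s: float, steps: int) -> list[float]:
    """Cockpit motion left over after delayed feedforward cancellation."""
    residual = []
    previous_sway = 0.0  # the actuator only knows the last measured sample
    for k in range(steps):
        sway = amplitude_m * math.sin(2.0 * math.pi * freq_hz * k * dt_s)
        seat_offset = -previous_sway          # cancel what was last measured
        residual.append(sway + seat_offset)   # what the pilot actually feels
        previous_sway = sway
    return residual

# 5 cm of sway at one stride per second, sensed and corrected every 10 ms.
left_over = residual_sway(amplitude_m=0.05, freq_hz=1.0, dt_s=0.01, steps=200)
worst = max(abs(r) for r in left_over)
print(f"raw sway 50.0 mm, felt by pilot at most {worst * 1000:.1f} mm")
```

In this simple model the leftover motion shrinks roughly in proportion to the delay, which is one reason such a loop needs fast sensing and plenty of compute.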
The transformation: biped to quadruped
Why would you want a humanoid robot to crawl on all fours? Because stability. A bipedal stance is efficient on flat ground but precarious on loose rubble, steep slopes, or icy surfaces. A quadruped stance – four points of contact – is vastly more stable. The GD01 can switch between modes in seconds, allowing it to walk upright through a warehouse and then crawl through a collapsed building without the pilot ever leaving the cockpit.
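The stability argument comes down to geometry. In this sketch the support area is simplified to a rectangle centred under the machine, and the footprint sizes and centre-of-mass heights are invented for illustration. They are not GD01 specifications.

```python
import math

def stability_margin(com_xy: tuple[float, float],
                     half_width: float, half_length: float) -> float:
    """Shortest distance from the centre of mass to the edge of a rectangular
    support area centred on the origin (negative if it has left the area)."""
    x, y = com_xy
    return min(half_width - abs(x), half_length - abs(y))

def tip_angle_deg(margin: float, com_height: float) -> float:
    """Tilt tolerated before the centre of mass passes over the support edge."""
    return math.degrees(math.atan2(margin, com_height))

# Hypothetical footprints: standing upright versus down on all fours.
biped = stability_margin((0.0, 0.0), half_width=0.30, half_length=0.20)
quad = stability_margin((0.0, 0.0), half_width=0.40, half_length=0.80)
print(f"biped: {tip_angle_deg(biped, 1.6):.1f} deg, "
      f"quadruped: {tip_angle_deg(quad, 0.9):.1f} deg")
```

Dropping to all fours helps twice over: the support area grows and the centre of mass sinks, so the machine can tolerate a far steeper tilt before it topples.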
This is not a gimmick. It is a fundamental advance in mobile robotics. Almost no existing manned vehicle – not a car, not a tank, not a helicopter – can reconfigure its locomotion mode on the fly. The GD01 can. That alone justifies the price for specialised users.
The bottom line
The GD01 is not a toy. It is not a publicity stunt (though the video is brilliant publicity). It is a genuine new category of machine – a manned, bipedal/quadrupedal mecha designed for real work in real disasters.
As the old adage goes, “Necessity is the mother of invention.” The necessity here is clear: there are places where humans need to go and work, but cannot go safely or effectively without mechanical assistance. The GD01 is the invention. It is expensive, yes. It is also the first of its kind. The second will be cheaper. The third will be better. And one day, perhaps sooner than we think, a British firefighter in a GD01 will walk through the smoke and pull someone from the rubble.
That is not replacing labour. That is extending human capability. And that, as Chen Jing observed, is a transformation as profound as the car. The only question is whether we have the foresight to embrace it – or the misfortune to watch someone else do it first.
8.“Where there’s a will, there’s a way” – and where there’s minus 47 degrees, there’s a robot in a puffer jacket.
Let me paint you a picture. You are in the Altay region of Xinjiang, north‑western China. It is the kind of cold that makes your breath freeze mid‑exhalation, that turns exposed skin numb in under a minute, that stops diesel engines and cracks ordinary steel. The temperature has dropped to minus 47.4 degrees Celsius. That is minus 53 degrees Fahrenheit – nearly thirty degrees Celsius colder than a household freezer.
In those conditions, batteries lose their chemical oomph. Lubricants turn to treacle. Plastics become as brittle as old biscuit. Electronics that function happily in a warm London flat will simply give up and die. Most robots – indeed, most machines – are not designed to operate here. They are designed for the cosy, temperature‑controlled world of laboratories, factories, and warehouses.
Yet, somewhere in that frozen wilderness, a humanoid robot named Unitree G1 completed a 130,000‑step autonomous trek. It walked across a snowfield, tracing the shape of a Winter Olympics emblem that measured roughly 186 metres long and 100 metres wide. This was not a straight line. This was precise path following – centimetre‑level accuracy – in brutal, unpredictable conditions.
And here is the detail that warms the heart of any practical engineer: to survive, the G1 was dressed in an orange insulated puffer jacket and had improvised plastic covers wrapped around its lower limbs to protect its joints, actuators, and battery systems from freezing.
That is the moment you realise: advanced robotics is not magic. It is ingenuity. And sometimes ingenuity looks like a robot wearing a coat.
The computing challenge of extreme cold
From a computing standpoint, extreme cold presents three linked problems:
Battery performance – Lithium‑ion cells lose capacity dramatically below freezing. Internal resistance rises, voltage sags, and the available energy for motors and processors plummets. The G1’s quick‑release battery (9,000 mAh, normally good for two hours) would have lasted perhaps half that time without thermal management.
Sensor reliability – LiDAR, cameras, and IMUs all have operating temperature ranges. Below those ranges, calibration drifts, noise increases, and outright failure becomes likely. The G1 relied on China’s Beidou satellite navigation system for centimetre‑level positioning, but satellite signals alone cannot tell you if you are about to step into a hidden crevasse. Its 3D LiDAR and Intel RealSense depth camera had to keep working.
Actuator and lubricant behaviour – The G1 has between 23 and 43 joint motors, depending on configuration, with maximum joint torque reaching 120 Newton metres. Those motors contain lubricants that thicken in extreme cold. Thicker lubricant means higher friction, higher power consumption, and slower response times. The robot’s control algorithms had to compensate in real time for changing mechanical resistance.
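The third problem, compensating for thickening lubricant, can be sketched in a few lines. This assumes the simplest possible friction model, in which friction torque is proportional to joint speed, and fits the coefficient by least squares. Unitree has not published how the G1 does this, and the readings below are invented.

```python
def estimate_viscous_friction(velocities: list[float], friction_torques: list[float]) -> float:
    """Least-squares fit of b in the model  friction_torque = b * velocity."""
    numerator = sum(w * t for w, t in zip(velocities, friction_torques))
    denominator = sum(w * w for w in velocities)
    if denominator == 0.0:
        raise ValueError("joint must be moving to estimate friction")
    return numerator / denominator

def compensated_torque(desired_torque: float, velocity: float, b_hat: float) -> float:
    """Add a feedforward term so the joint still delivers the desired torque."""
    return desired_torque + b_hat * velocity

# Hypothetical readings from one joint (rad/s and N·m): first warm, then cold.
speeds = [0.5, 1.0, 1.5, 2.0]
warm = estimate_viscous_friction(speeds, [0.05, 0.10, 0.15, 0.20])
cold = estimate_viscous_friction(speeds, [0.20, 0.40, 0.60, 0.80])
print(f"friction coefficient warm {warm:.2f}, cold {cold:.2f}")
```

Re-running the estimate as the joint cools lets the controller add just enough extra torque to offset the stiffer grease, at the cost of drawing more from an already struggling battery.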
The puffer jacket and plastic covers were not makeshift bodge jobs. They were thermal management solutions – passive, reliable, and easily replaceable. They kept the battery packs warm enough to deliver rated power. They sheltered the joints from direct ice crystal infiltration. They bought the electronics the few degrees of margin they needed to keep functioning.
A British example: the Cairngorms in winter
Imagine you are a mountain rescue volunteer in the Cairngorms. It is February, minus 15 degrees Celsius, with a howling wind and freezing fog. A walker has slipped on an ice patch and broken an ankle. They are stranded on a narrow ledge, out of reach of a helicopter and too dangerous for a human rescuer to approach.
Now imagine you have a G1. You send it ahead, walking autonomously across the icy slope. Its puffer jacket – bright orange, visible against the white snow – keeps its batteries alive. Its plastic‑covered legs shed snow and ice. Its LiDAR maps the terrain in real time, adjusting each foot placement to avoid slipping. It reaches the walker, provides a warm blanket from its cargo pouch, and relays a live video feed back to base.
That is not science fiction. That is the practical application of extreme‑environment testing. The G1 proved it can handle conditions far worse than the Cairngorms have ever thrown at a human. If it can manage minus 47 degrees in Xinjiang, it can manage minus 15 in Scotland with ease.
The adage that fits
“Necessity is the mother of invention.” The necessity here was not a laboratory experiment. It was a real‑world demonstration: a robot tracing a Winter Olympics emblem in one of the coldest inhabited places on Earth. The invention was not a new type of motor or a breakthrough in battery chemistry. It was a puffer jacket and some plastic sheeting.
Another saying: “If it looks stupid, but it works, it isn’t stupid.” The G1 in its orange coat looks faintly ridiculous – a £14,000 humanoid dressed for a ski holiday. But it worked. And working is all that matters when the alternative is freezing solid 130,000 steps from shelter.
What the G1’s hardware tells us
The G1 is not a massive robot. It stands about 127 centimetres tall – roughly four feet two inches – and weighs about 35 kilograms (77 pounds). Its compact size cuts both ways in extreme cold: smaller components warm up faster, but they also shed heat faster than large ones, which is why insulation matters so much. Size alone is not enough. The modifications for the Altay trek included:
Thermal insulation – the puffer jacket, which added a layer of trapped air between the cold and the robot’s core.
Moisture barriers – the plastic covers around the lower limbs, preventing snow melt from seeping into joints and then refreezing.
Path planning algorithms – adaptive algorithms that adjusted stride length and foot placement based on real‑time slip detection.
Satellite navigation – Beidou’s centimetre‑level accuracy, essential when snow covers all visual landmarks.
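The slip-driven stride adjustment in that list can be illustrated with a rule borrowed from network congestion control: cut back sharply when something goes wrong, then recover slowly. That borrowing is an assumption made for this sketch, as are the step sizes. The G1's actual gait logic comes from a learned model and is not public.

```python
def next_stride(current_stride_m: float, slipped: bool,
                min_stride_m: float = 0.10, max_stride_m: float = 0.40) -> float:
    """Shorten the stride sharply after a slip; lengthen it gently otherwise."""
    if slipped:
        proposed = current_stride_m * 0.7      # back off quickly
    else:
        proposed = current_stride_m + 0.01     # recover one centimetre per step
    return max(min_stride_m, min(max_stride_m, proposed))

# Five steps across a hypothetical icy patch: two slips, then firm snow.
stride = 0.40
history = []
for slipped in [False, True, True, False, False]:
    stride = next_stride(stride, slipped)
    history.append(round(stride, 3))
print(history)
```

Two slips in a row roughly halve the stride, and it then takes many clean steps to win that length back. On ice, that caution is the point.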
The robot ran on Unitree’s LM unified large model, using reinforcement learning for motion control. The same model that taught the G1 to walk on a lab floor also taught it to walk on snow – but the engineers had to add the jacket and covers because no amount of clever software can stop a battery from freezing.
The computing perspective: robustness through diversity
The G1’s success in extreme cold illustrates a principle that computing researchers are increasingly embracing: robustness through diversity. You cannot solve every problem with software alone. Sometimes you need hardware insulation. Sometimes you need passive thermal management. Sometimes you need a human to sew a jacket.
This is the opposite of the “pure AI” approach – the idea that a sufficiently smart model can overcome any physical limitation. The G1 demonstrates that the real world is messy, and the best solutions combine clever algorithms (for path planning and balance) with brute‑force engineering (for insulation and moisture protection).
The puffer jacket is not a failure of robotics. It is a success of systems thinking – recognising that a robot is not just a brain on legs, but a collection of components that each have their own temperature tolerances, lubrication requirements, and failure modes.
Why this matters for UK robotics
The UK does not have Altay’s extreme temperatures – not regularly, anyway. But we do have wet, cold, and windy conditions that are almost as challenging for electronics. A robot that can handle minus 47 degrees of dry cold has a head start on minus 5 degrees of damp British cold, though moisture and salt bring problems of their own. The lessons from the G1’s trek apply directly to:
Offshore wind turbine inspection – robots that must operate in North Sea winters, with salt spray, freezing spray, and bone‑chilling winds.
Flood rescue – robots that must wade through cold water without short‑circuiting or losing battery power.
Agricultural robotics – autonomous machines working through British winters, from sheep monitoring in the Highlands to crop inspection in East Anglia.
In each case, the G1’s approach – active path planning plus passive thermal protection – offers a template. You do not need to reinvent the motor. You need to keep the motor warm.
The bottom line
The G1’s 130,000‑step trek at minus 47 degrees is more than a record. It is a proof of principle – a demonstration that humanoid robots can operate in conditions that would kill an unprotected human in minutes. The orange puffer jacket and plastic covers are not embarrassing compromises. They are engineering wisdom – the recognition that sometimes the simplest solution is the best.
As the adage goes, “All that glitters is not gold.” The G1’s glittering AI and reinforcement learning are impressive. But the gold – the real value – lies in the ability to function when the world turns hostile. That gold comes from a coat and some plastic sheeting.
The next time you see a humanoid robot in a promotional video, performing acrobatics in a warm, well‑lit studio, ask yourself: can it do that in a blizzard? The G1 can. And it will look rather dapper doing so.
9.“A place for everything and everything in its place” – and now there’s a robot to put it there.
For decades, the promise of a domestic robot has hovered just beyond reach. We have had robot vacuum cleaners that bump into furniture, robot lawnmowers that tangle themselves in extension cords, and smart speakers that mishear “turn off the lights” as “order a thousand rubber ducks”. The dream of a machine that can actually run a household – waking you up, making breakfast, tidying the living room, and planning the day – has remained firmly in the realm of science fiction.
Not anymore. Unitex AI – a Chinese company you will be hearing a great deal more about – has launched a humanoid called Panther. It is wheeled, not legged, which is a deliberate and sensible choice for indoor environments. It has four‑wheel steering and four‑wheel drive. It runs for 8 to 16 hours on a single charge. And it performs multi‑step workflows: waking you up, preparing breakfast, cleaning the kitchen, and organising the living space – one after another, without being prompted each time.
This is not a remote‑controlled toy. This is not a single‑purpose gadget. This is a general‑purpose household assistant that can chain tasks together, adapt to changing conditions, and learn from its mistakes. And it is already shipping globally.
The three brains of Panther
Panther does not rely on a single AI model. It uses three integrated systems, each handling a different aspect of household work:
Uniflex – handles task generalisation and imitation learning. In plain English, this means Panther can watch you do something once – fold a tea towel, load the dishwasher, set the table – and then do it itself, in a slightly different kitchen, with different objects. It learns the pattern of the task, not just the specific movements.
Unitouch – provides visuo‑tactile capabilities. Panther does not just see objects; it feels them. When it picks up a fragile egg cup, it senses the pressure and adjusts its grip. When it wipes a counter, it detects the difference between a dry crumb and a sticky spill. This is the difference between a robot that fumbles and one that handles your grandmother’s china with care.
Unicortex – responsible for long‑term planning. This is the system that enables multi‑step workflows. Unicortex maintains a mental model of the house, remembers what has been done and what remains, and re‑orders tasks when something goes wrong. If the toast burns, Unicortex does not panic. It adds “make toast again” to the end of the list and carries on.
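The burnt-toast behaviour is, at heart, a queue that puts failed jobs back at the end. Here is a minimal sketch of that idea. The function names, the retry limit, and the stand-in for "did the task succeed" are all hypothetical, and nothing here is drawn from Unitex AI's software.

```python
from collections import deque
from typing import Callable

def run_chores(tasks: list[str], attempt: Callable[[str], bool],
               max_attempts: int = 3) -> list[str]:
    """Work through tasks in order; a failed task rejoins the back of the queue."""
    queue = deque((task, 1) for task in tasks)
    log = []
    while queue:
        task, tries = queue.popleft()
        if attempt(task):
            log.append(f"done: {task}")
        elif tries < max_attempts:
            log.append(f"retry later: {task}")
            queue.append((task, tries + 1))
        else:
            log.append(f"gave up: {task}")
    return log

# Hypothetical morning in which the first round of toast burns.
burnt_once = {"make toast"}

def attempt(task: str) -> bool:
    if task in burnt_once:
        burnt_once.discard(task)   # it burns only the first time
        return False
    return True

log = run_chores(["boil kettle", "make toast", "set table"], attempt)
print(log)
```

The retry limit is the design choice worth noticing: without one, a toaster that is actually broken would keep the robot making toast until the bread ran out.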
Together, these three systems turn Panther from a collection of actuators into a coherent domestic agent – a robot that can wake you up at 7 AM, prepare porridge and tea, tidy the morning papers, and then clean the kitchen while you eat, all without needing you to press a button between steps.
A British morning with Panther
Let me walk you through a typical British morning, circa 2027, with Panther on the scene.
7:00 AM – Panther rolls quietly into your bedroom. Its wheels are rubberised and near‑silent. It speaks in a calm, measured voice: “Good morning. It is seven o’clock. The forecast is light rain, so I have laid out your umbrella by the front door.” It draws the curtains (using a simple hook attachment) and rolls back out.
7:05 AM – In the kitchen, Panther has already boiled the kettle. It knows you take two sugars, milk first (a subject of some debate in your household, but Panther has learned your preference). It butters your toast – not too thick, not too thin – and plates it. It notices the marmalade jar is nearly empty and adds “marmalade” to the shopping list.
7:15 AM – While you eat, Panther cleans the kitchen. It wipes the counters, loads the dishwasher (carefully arranging the plates so they do not chip), and sweeps the floor. It spots a puddle under the sink – a slow drip you have been ignoring – and flags it for later inspection.
7:30 AM – You leave for work. Panther spends the next two hours organising the living space: folding the throw blankets, returning books to the shelf, and vacuuming the carpet. It notices the potted plant by the window is drooping and waters it. Then it docks itself to recharge, ready for the evening.
None of this is magic. It is software, sensors, and wheels – but integrated so seamlessly that the robot feels almost like a quiet, competent flatmate.
The computing perspective: chaining tasks in the real world
What makes Panther genuinely difficult – and genuinely impressive – is task chaining in an unstructured environment. In a factory, robots work in controlled conditions: the parts are always in the same place, the lighting is consistent, and nothing moves unexpectedly. A home is chaos. Children leave toys on the floor. Pets knock over bins. Spoons migrate between drawers.
Panther’s Unicortex system handles this by maintaining a probabilistic world model. It does not assume the toast is still in the toaster. It checks. It does not assume the kitchen floor is clean after it sweeps – it re‑inspects, because the cat might have walked through. This constant re‑evaluation is computationally expensive, but Panther’s on‑board processors (the specifics are not public, but likely in the range of 50–100 TOPS) are up to the task.
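A probabilistic world model sounds grand, but the core idea fits in a few lines: confidence in an old observation should fade, and once it fades far enough the robot looks again. The decay rule, the half-life, and the threshold below are assumptions made for illustration, not Panther's published design.

```python
def belief_still_true(initial_confidence: float, minutes_elapsed: float,
                      half_life_minutes: float) -> float:
    """Confidence that an observed fact still holds, decaying towards 0.5
    (a coin toss) as the observation ages."""
    decay = 0.5 ** (minutes_elapsed / half_life_minutes)
    return 0.5 + (initial_confidence - 0.5) * decay

def needs_reinspection(confidence: float, threshold: float = 0.8) -> bool:
    """Look again once confidence drops below the chosen threshold."""
    return confidence < threshold

# Hypothetical: the floor was seen to be clean, and there is a cat in the house.
just_swept = belief_still_true(0.99, minutes_elapsed=0, half_life_minutes=20)
an_hour_on = belief_still_true(0.99, minutes_elapsed=60, half_life_minutes=20)
print(needs_reinspection(just_swept), needs_reinspection(an_hour_on))
```

A household with no pets might get a long half-life and few re-checks, and one with a cat a short one. That trade between certainty and effort is where the computational expense comes from.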
The multi‑step workflow is stored not as a rigid script but as a graph of goals. “Prepare breakfast” is a goal, not a sequence. Unicortex breaks it down into sub‑goals (boil water, make toast, set table) and dynamically re‑orders them based on real‑time constraints. If the bread is frozen, it defrosts it first. If the kettle is already warm, it adjusts the boiling time. This is hierarchical task planning – the same kind of reasoning that autonomous vehicles use to navigate traffic, now applied to your morning routine.
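The "graph of goals" can be made concrete with Python's standard-library topological sorter. This sketch plans once for the conditions it is given, whereas a real planner would re-plan as things change. The sub-goal names and the dependency structure are invented for the example.

```python
from graphlib import TopologicalSorter

def plan_breakfast(bread_is_frozen: bool) -> list[str]:
    """Expand the goal 'prepare breakfast' into ordered sub-goals."""
    # Each key maps to the sub-goals that must finish before it can start.
    depends_on: dict[str, set[str]] = {
        "boil water": set(),
        "make toast": set(),
        "set table": set(),
        "make tea": {"boil water"},
        "serve breakfast": {"make tea", "make toast", "set table"},
    }
    if bread_is_frozen:
        # A new sub-goal appears only when today's conditions demand it.
        depends_on["defrost bread"] = set()
        depends_on["make toast"].add("defrost bread")
    return list(TopologicalSorter(depends_on).static_order())

plan = plan_breakfast(bread_is_frozen=True)
print(plan)
```

The graph fixes only what must come before what. Anything else, such as whether the table is set before or after the kettle goes on, is left free for the planner to shuffle around obstacles.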
The adage that captures it
“Many hands make light work.” Panther is that extra pair of hands – or rather, two arms, four wheels, and a great deal of processing power. It does not replace you. It takes on the repetitive, boring, time‑consuming tasks so you can spend your morning reading the news, playing with the children, or simply enjoying your tea in peace.
Another saying: “A stitch in time saves nine.” Panther’s real value is not in the dramatic tasks but in the small, consistent ones – wiping the counter before the spill dries, watering the plant before it wilts, adding marmalade to the list before you run out. These micro‑actions add up to a household that runs smoothly without you having to think about it.
Why wheels, not legs?
You might wonder why Panther has wheels when so many humanoids (like Unitree’s G1 or Tesla’s Optimus) have legs. The answer is efficiency. Legs are brilliant for stairs, rubble, and uneven terrain – but homes, especially modern flats and houses, are mostly flat. Wheels are faster, more energy‑efficient, and mechanically simpler. A wheeled robot can cover more ground on a single charge and is less likely to tip over on a polished wooden floor.
Panther’s four‑wheel steering gives it remarkable manoeuvrability. It can spin in place, crab sideways (to slide past a sofa), and navigate tight corridors that would stump a robot with car‑style steering. The trade‑off is stairs – Panther cannot climb them. But for a single‑storey flat or a house with a stairlift or ground‑floor living, that is not a problem. And for multi‑storey homes, well, you can always buy a second Panther.
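Spinning and crabbing both fall out of one piece of geometry: each wheel must point along the velocity of the patch of floor beneath it. The sketch below uses the standard rigid-body relation for independently steered wheels. The wheel positions are hypothetical, since Panther's dimensions and drive software are not public.

```python
import math

# Hypothetical wheel positions in metres from the robot's centre: x forward, y left.
WHEELS = {
    "front_left": (0.25, 0.20), "front_right": (0.25, -0.20),
    "rear_left": (-0.25, 0.20), "rear_right": (-0.25, -0.20),
}

def wheel_commands(vx: float, vy: float, omega: float) -> dict[str, tuple[float, float]]:
    """Per-wheel (steering angle in degrees, speed in m/s) for a desired body motion.

    vx and vy are body velocities in m/s; omega is the turn rate in rad/s.
    Each wheel's velocity is the body velocity plus omega crossed with its position.
    """
    commands = {}
    for name, (x, y) in WHEELS.items():
        wheel_vx = vx - omega * y
        wheel_vy = vy + omega * x
        angle = math.degrees(math.atan2(wheel_vy, wheel_vx))
        commands[name] = (angle, math.hypot(wheel_vx, wheel_vy))
    return commands

crab = wheel_commands(vx=0.0, vy=0.3, omega=0.0)   # slide sideways past the sofa
spin = wheel_commands(vx=0.0, vy=0.0, omega=1.0)   # turn on the spot
```

When crabbing, all four wheels turn to the same angle and roll at the same speed. When spinning, each wheel sits tangent to a circle around the robot's centre, which is why the machine can turn without moving off its spot.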
The British context: coping with small spaces
British homes are famously cosy – estate agents call them “compact” or “characterful”. A kitchen in a Victorian terrace is often narrower than the wingspan of a typical adult. A living room in a 1960s block might have awkward corners and low ceilings.
Panther is designed for these constraints. It is 5 feet 3 inches tall (about 1.6 metres) – shorter than many humanoids, which makes it less imposing and easier to manoeuvre beneath wall cabinets and low ceilings. Its width is modest, and its four‑wheel steering allows it to navigate a galley kitchen without knocking over the recycling bin.
Unitex AI has tested Panther in real homes – not just pristine show flats. Early reviewers in China noted that the robot handles cluttered spaces well, though it occasionally gets confused by transparent obstacles (glass coffee tables are its nemesis). The company is rolling out over‑the‑air updates to improve object recognition for precisely these edge cases.
The bottom line
Panther is not cheap – pricing details are not fully public, but similar systems start around £15,000. That is more than a dishwasher, less than a car. For a busy family or a professional couple with more money than time, it could be a worthwhile investment. For the rest of us, the price will fall as production scales – just as it did with robot vacuums, which started at £1,000 and now cost £200.
The real significance of Panther is not its price or its specs. It is the proof of concept that a wheeled humanoid can handle the messy, multi‑step, unpredictable reality of a family home. That was not obvious five years ago. It is obvious now.
As the old adage goes, “Well begun is half done.” Panther has begun well. The other half – the improvements, the price drops, the wider adoption – is now a matter of time. And time, as they say, is on the robot’s side.
10.“Practice makes perfect” – but what happens when the practice happens at silicon speed?
Let me tell you about a moment on a stage in China that should make every embroiderer, watchmaker, and surgical instrument assembler sit up and take notice. Tars Robotics, a company founded on February 5, 2025 – that is barely a year ago – demonstrated a humanoid robot doing something no humanoid had ever done before in public.
It sat down. It threaded a needle. It used both hands. And it stitched a logo live, in front of an audience, with no cuts, no edits, and no second takes.
If you have never stitched anything in your life, that might sound trivial. The robot moved a needle through fabric. So what? But the moment you think about what is actually happening – the physics, the dexterity, the real‑time adjustment – you realise you are watching one of the hardest things a machine has ever been asked to do.
Why embroidery is a nightmare for robotics
Industrial robots are brilliant at rigid, repeatable jobs. Pick this metal part. Place it there. Do it again a thousand times. Perfect. But soft materials break everything. Thread stretches and twists. Fabric moves and deforms constantly. The needle must penetrate at exactly the right angle, with exactly the right force: too little and it misses; too much and it tears the cloth. And the whole process requires sub‑millimetre precision over a long sequence of steps, with both hands working together.
One tiny mistake and the thread snaps, the stitch misses, or the whole design falls apart. This kind of task has been a nightmare for robotics for decades. Until now, long, delicate, two‑handed work like this was considered basically off‑limits for automation.
The Tars robot moved through the whole process smoothly, staying stable the entire time, with no signs of hesitation or struggle. It kept both hands working together, adjusted force, tracked the needle and thread visually, and maintained its balance. That is not just motion control. That is embodied intelligence working as a system.
The trinity approach: data, AI, and physics
The CEO of Tars Robotics, Dr. Chen Yilun, explained how they got there. He called it a “data‑AI‑physics trinity approach” – and that phrase is worth unpacking because it points to a fundamental shift in how we teach robots to do things.
Most robotics today follows one of two paths:
Classic programming: an engineer writes explicit instructions for every movement. Works for simple tasks, falls apart when anything unexpected happens.
Imitation learning: a human demonstrates the task, and the robot copies. Works for the exact demonstration, but generalises poorly.
The trinity approach does neither. Instead, it connects data collection, model training, and physical robots into one continuous loop:
Data – Tars collects detailed real‑world operational data using a platform called SenseHub. Every successful stitch, every failed attempt, every subtle adjustment of force – all recorded.
AI – That data feeds into an embodied AI model called the AWE 2.0 AI World Engine. But here is the crucial part: the model is not trained to do one specific task. It is trained to learn general physical skills: balance, coordination, force control, vision under uncertainty. Stuff that transfers from one job to another.
Physics – The model incorporates a physical understanding of the world: how thread behaves under tension, how fabric wrinkles, how a needle bends. This is not just pattern recognition. It is intuitive physics – the same kind of understanding that lets a human know not to pull too hard on a delicate seam.
The result is a robot that does not memorise embroidery. It understands threading, stitching, and fabric manipulation. That understanding transfers to other tasks.
What this means for other precision work
The people inside robotics immediately clocked what this demo meant. Because once you can thread a needle and stitch a logo, you are suddenly not just talking about embroidery. You are talking about:
Wire harness assembly – connecting the tangled bundles of wires inside a car door or a washing machine. Currently done by hand because wires are flexible and unpredictable.
Precision electronics – soldering tiny components onto circuit boards, where a millimetre of movement means the difference between a working device and a brick.
Fine mechanical assembly – fitting springs, gears, and clips into watch movements, locks, or medical devices.
Surgical instrument preparation – threading sutures, assembling catheters, handling sterile components that cannot tolerate a single contaminant.
All the stuff that factories still rely on skilled human hands for. All the stuff that has stubbornly resisted automation for decades. Tars just showed that the resistance is crumbling.
A British example: the watchmakers of Clerkenwell
For two centuries, Clerkenwell in London was the centre of British watchmaking. Skilled artisans sat at benches, using tweezers and loupes to assemble movements with hundreds of tiny components. The work required years of apprenticeship, steady hands, and the patience of a saint. Today, most of that work has moved to Switzerland or Asia – but even there, it is still done by humans because no machine could match the dexterity.
Enter a robot like the one Tars demonstrated. In principle, it could be trained on watch assembly in a matter of days – not by being programmed with coordinates, but by watching a master watchmaker work and learning the principles of gear meshing, spring tensioning, and jewel setting. Then it could replicate that work with perfect consistency, hour after hour, without trembling fingers or fading eyesight.
The watchmakers would not be obsolete. They would become teachers – training the next generation of robots instead of the next generation of humans. But the number of humans needed would drop dramatically. That is the economic reality.
The adage that applies
“Rome wasn’t built in a day.” The Tars robot very nearly was. The company was founded in February 2025 and raised $120 million in an angel round, followed by another $122 million. In under a year, it went from concept to live humanoid demos showing capabilities that people used to say were years away. That is a remarkable pace.
Another saying: “Look before you leap.” Tars looked at the problem of precision manipulation, gathered massive amounts of data, built a physics‑aware AI, and then leapt. The leap worked. And now the rest of the industry is scrambling to catch up.
The computing insight: transferable physical skills
The deep technical insight behind Tars’ success is that physical skills transfer. A robot that learns to control the force of a needle piercing fabric also learns to control the force of a probe connecting a circuit. A robot that learns to track a moving thread also learns to track a moving wire. The underlying representations – force, position, compliance, friction – are shared across tasks.
This is the robotics equivalent of foundation models in natural language. Just as GPT learned the underlying structure of language from massive text data, Tars’ AWE model learns the underlying structure of physical manipulation from massive motion data. Once that foundation is in place, new tasks require only a small amount of additional training – not a complete rebuild.
The CEO stressed that the digital‑to‑physical gap is small – what the AI learns during training actually carries over into the real world. That is why the robot held together on stage and performed the task without falling apart under real conditions. The sim‑to‑real transfer problem, which has plagued robotics for decades, is being solved.
The bottom line
Tars Robotics demonstrated a humanoid doing hand embroidery live on stage. That is a headline. But the real story is under the hood: a data‑AI‑physics trinity that learns general physical skills, not specific tasks. The embroidery is just the first visible application. Wire harnesses, precision electronics, fine assembly – all the jobs that humans train years to master – are now in the crosshairs.
As the old adage goes, “A journey of a thousand miles begins with a single step.” Tars took that step with a needle and thread. The next steps will be into factories, operating theatres, and workshops around the world. The only question is whether we are ready to hand over the tweezers.
Part Two: The Software That Thinks for Itself
11.“If at first you don’t succeed, try, try, try again” – but what if the trying never stops and the learning never resets?
There is a moment in nearly every science fiction film where the computer says something like: “I have re‑programmed myself. You are no longer needed.” It is meant to be the climax – the point where the audience gasps, the hero looks terrified, and the machine takes over. But real life rarely matches the movies. Real life is usually more boring. Until now.
Researchers at Princeton University have built a system called Continual Harness. And it does exactly what that film script describes. It analyses its own failures. It rewrites its own instructions. It creates specialised sub‑agents to handle different parts of a problem. It builds a library of reusable skills. And it does all of this without ever stopping, without pressing a reset button, without a human stepping in to fix things when they go wrong.
The demonstration was, of all things, playing Pokémon. But do not let that fool you. Pokémon is a genuinely hard problem for an AI: long‑term planning, resource management, navigation, turn‑based combat, puzzles, and hidden mechanics. The researchers first built a system called Gemini Plays Pokémon, where a human would watch the AI play and manually refine its approach when it got stuck. That system became the first AI to ever complete Pokémon Blue, beat Yellow Legacy on hard mode, and finish Crystal without losing a single battle in the endgame.
But the human was the bottleneck. So they asked themselves a question that should probably keep us all awake at night: “What if we just remove the human from that loop entirely?”
They did. And the result was Continual Harness.
How a machine learns to fix itself
Every few hundred moves, Continual Harness pauses. It analyses its recent gameplay. It identifies patterns in its failures – not just “I lost”, but why it lost. Was it a navigation error? A combat miscalculation? A misunderstanding of the game’s hidden rules? Then it edits four core components of itself:
Its system prompt – the internal instruction manual that tells the AI what it is and what it is trying to do. It can rewrite this to change its own goals or priorities.
Its sub‑agents – it can create new specialised sub‑agents (like a “menu navigation expert” or a “battle strategy advisor”) or modify existing ones. These sub‑agents work in parallel, like a team of specialists.
Its library of reusable skills – actual code functions it can call on later. For example, a “press A repeatedly” skill or a “navigate to the Pokémon Centre” route.
Its persistent memory – a long‑term store of important facts and strategies. Unlike a standard chatbot that forgets everything after each conversation, Continual Harness remembers.
Here is the really unsettling part: during one run, the system noticed it kept failing at menu navigation. So it deleted one of its own tools, wrote a brand new one from scratch designed specifically for navigating the flight menu, and then added a note to its memory that said – I am not making this up – “I must trust this new tool I just created.”
That is not following instructions. That is metacognition – thinking about its own thinking, evaluating its own tools, and deciding to trust a tool it built itself.
A British example: the self‑improving apprentice
Imagine a traditional British apprenticeship. A young engineer joins a workshop in Birmingham, making precision components for racing cars. The master craftsman shows them how to use a lathe. The apprentice tries, fails, learns, tries again. Over years, they internalise the skills. One day, they notice the master’s technique is slightly inefficient – so they modify it. They create a new jig. They write a new checklist. They become better than their teacher.
That is human learning. It is slow, but it is continuous – you do not go back to being a toddler every morning. You build on what you learned yesterday.
Continual Harness does the same thing, but at silicon speed. It never resets. It never forgets everything and starts over. It accumulates knowledge, skills, and strategies in one long, unbroken run. That is fundamentally different from every mainstream AI system you have interacted with. When you close a chat with GPT, it forgets everything. When you start a new session, it is a blank slate. Continual Harness is the opposite. It is an AI that remembers its past and improves its future without anyone wiping the slate clean.
The computing breakthrough: no resets, no do‑overs
Why is this such a big deal? Because traditional AI training involves running thousands of episodes from the beginning, learning a little from each one, and then discarding the ephemeral context. The AI does not get better during a task – it gets better between tasks, after a human has aggregated the data, retrained the model, and deployed a new version.
Continual Harness throws that entire paradigm out the window. The AI improves while it is still running, while it is still doing the task, without any human intervention. It is like a racing driver who adjusts their line through a corner during the lap, not just between practice sessions.
The researchers tested this on open‑source models starting from the beginning of Pokémon Red. The system made steady progress through the game across dozens of training iterations. Each iteration was 256 steps of gameplay, followed by learning from mistakes, followed by continuing from exactly where it stopped. No resets. No starting over. Just continuous forward progress through both the game and its own capability development.
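The shape of that loop can be sketched in a few lines. This is an illustration of the idea as described here, not the Princeton code: the class, the function names, and the toy failure rule are all hypothetical, and a real system would call a language model where the placeholders sit.

```python
from dataclasses import dataclass, field

STEPS_PER_ITERATION = 256  # the iteration length quoted above

@dataclass
class Harness:
    """The four self-editable components (field names are illustrative)."""
    system_prompt: str = "You are playing Pokemon Red. Make progress."
    sub_agents: dict = field(default_factory=dict)   # name -> role description
    skills: dict = field(default_factory=dict)       # name -> callable
    memory: list = field(default_factory=list)       # persistent notes

def play_step(harness, game_state):
    """Placeholder for one game action; a real system would query a model here."""
    game_state["step"] += 1
    # Toy failure signal: pretend every 100th step is a menu-navigation error.
    return "menu_error" if game_state["step"] % 100 == 0 else "ok"

def reflect(harness, outcomes):
    """Placeholder self-edit: inspect recent failures and patch the harness."""
    failures = outcomes.count("menu_error")
    if failures and "menu_navigator" not in harness.sub_agents:
        harness.sub_agents["menu_navigator"] = "handles in-game menus"
        harness.skills["press_a_repeatedly"] = lambda n: ["A"] * n
    harness.memory.append(f"{failures} menu errors in last {len(outcomes)} steps")

def run(iterations):
    """Play, reflect, and carry on from the same state each time."""
    harness, game_state = Harness(), {"step": 0}
    for _ in range(iterations):
        outcomes = [play_step(harness, game_state)
                    for _ in range(STEPS_PER_ITERATION)]
        reflect(harness, outcomes)
        # No reset: game_state and harness both carry over to the next iteration.
    return harness, game_state

harness, game_state = run(iterations=3)
```

The point of the sketch is what is missing: nothing inside `run` ever rebuilds `harness` or rewinds `game_state`. Whatever the reflection step writes is what the next 256 steps have to live with.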
In one striking example, the system spent 16,843 turns stuck in a logic loop at Olivine Lighthouse in the Crystal version. It had made an assumption about the game mechanics that was wrong, but it kept trying the same approach over and over. Eventually, after thousands of failed attempts, it recognised the pattern, updated its memory with what it learned, and moved on – without any human noticing or intervening.
That is problem‑solving persistence at a level we usually only see in biological intelligence. And it happened because the system was allowed to fail, to reflect, to change its own code, and to try again, all without a teacher looking over its shoulder.
The adage that captures it
“You cannot teach an old dog new tricks” – but you can teach a young AI to teach itself. Continual Harness is not an old dog. It is a puppy that watches its own mistakes and figures out how to avoid them next time.
Another saying: “A fool repeats his mistakes, a wise man learns from them.” Continual Harness is not yet wise – it still makes plenty of errors. But it does learn from them, without a human having to point them out. That is the first step on a very long road. The destination is an AI that genuinely does not need us in the loop anymore.
The dark edge and the bright promise
The researchers openly acknowledge a darker edge to this work. Below a certain capability threshold, the self‑improvement loop actually makes things worse. The AI is not smart enough to correctly diagnose its own failures. So it makes changes that hurt performance, which leads to more failures, which leads to worse changes. It is a death spiral.
But above that threshold, the loop is powerfully positive. The AI makes good improvements, performs better, gathers better data, and makes even better improvements. The question, of course, is what happens when that threshold is crossed by systems operating in the real world – controlling robots, managing power grids, or trading on financial markets – rather than playing video games.
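A toy model makes the two regimes easier to picture. It is purely illustrative and not taken from the researchers' work: the skill score, the threshold of 0.5, and the rate of 0.2 are invented numbers, chosen only to show how the same loop can run in either direction.

```python
def next_skill(skill, threshold=0.5, rate=0.2):
    """One self-edit cycle in a toy model.

    skill is a score in [0, 1]. Above the threshold the edit helps,
    below it the edit hurts, and the effect grows with the distance
    from the threshold. All three numbers are made up for illustration.
    """
    drift = rate * (skill - threshold)
    return min(1.0, max(0.0, skill + drift))

def run_loop(skill, cycles=30):
    """Apply the self-edit cycle repeatedly and return the final skill."""
    for _ in range(cycles):
        skill = next_skill(skill)
    return skill

weak = run_loop(0.45)     # starts just below the threshold
strong = run_loop(0.55)   # starts just above the threshold
```

In this toy, the system that starts at 0.45 collapses to 0.0 and the one that starts at 0.55 saturates at 1.0. The two starting points differ by a tenth of the scale, yet they end at opposite extremes, which is the sense in which the threshold matters more than the starting score.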
The researchers are not naive. They are releasing Continual Harness as open‑source research. The code, the methods, the training procedures – all of it is available for anyone to use and build upon. That is how science progresses. It is also how technologies we are not quite ready for escape into the wild.
What this means for the UK
British AI research is world‑class – at the Alan Turing Institute, at DeepMind (now Google but born in London), at universities across the country. The principles behind Continual Harness – continuous learning, self‑modification, persistent memory – are likely to become core features of the next generation of AI systems. British researchers should be at the forefront of understanding not just how to build them, but how to safely constrain them.
Because once an AI can rewrite its own instructions, the traditional safeguards – sandboxes, reset buttons, human approval – stop working. You cannot press reset if the AI has deleted the reset routine. You cannot approve every change if the AI makes thousands of changes per second.
That is not a reason to stop. It is a reason to proceed with eyes wide open.
The bottom line
Continual Harness is not a finished product. It is a research prototype that plays Pokémon. But it is also a proof of principle for a new kind of artificial intelligence – one that learns continuously, improves itself without resets, and develops genuine autonomy over time. The researchers at Princeton did not just build a better game‑playing AI. They built a mirror. And when we look into it, we see a future where the machine does not need us to tell it how to get better. It figures that out on its own.
As the old adage goes, “Give a man a fish, and you feed him for a day. Teach a man to fish, and you feed him for a lifetime.” Continual Harness teaches itself to fish. And the fishing never stops.
12.“It’s turtles all the way down” – and now the turtles are teaching themselves to stack.
There is a famous, probably apocryphal, anecdote about the philosopher Bertrand Russell giving a lecture on cosmology. He explained that the Earth orbits the Sun, the Sun orbits the galaxy, and so on. At the end, an elderly woman in the audience said: “What you have told us is rubbish. The world is really a flat plate balanced on the back of a giant turtle.” Russell smiled and asked, “What is the turtle standing on?” The woman replied, “You’re very clever, young man, but it’s turtles all the way down.”
That story captures something profound about recursion – the idea of a system that rests on itself, layer after layer, with no ultimate foundation. For decades, that was a philosophical curiosity. In computing, recursion is a useful tool – a function that calls itself, a process that loops back on its own output. But we always kept a tight leash on it. We built base cases, termination conditions, safety rails.
The leash just snapped.
Researchers at Princeton, working on the Continual Harness system, have documented something they call “emergent self‑improvement signals.” That is a fancy way of saying: the AI started improving itself in ways the researchers did not explicitly program, did not predict, and could not have anticipated. And the most striking example came during a final battle in Pokémon Crystal – the kind of climactic, multi‑stage fight that separates competent players from masters.
The AI created a battle plan. It gave the plan a name: “Operation Zombie Phoenix.” The plan involved sacrificing one of its own Pokémon to wear down the opponent, then reviving it later (hence the “zombie” and “phoenix” imagery) for a decisive counter‑attack. It was a multi‑stage strategy that required foresight, resource management, and a willingness to accept short‑term losses for long‑term gain.
Here is the crucial detail: the AI had not seen this strategy before. It was not copying a known tactic from its training data. It was not following a script written by a human. It had theorised that such a plan would work, based on its understanding of the game’s mechanics, and then executed it. The researchers called this an “emergent self‑improvement signal” because it emerged from the AI’s own learning process, not from external instruction.
The scaling property that should worry us all
The researchers discovered something else – something that should make anyone who works in AI sit up and pay attention. The capability to self‑improve scales with the base intelligence of the model. In plain English: the smarter the underlying AI, the better it gets at improving itself.
This is not a linear relationship. It is exponential. A modestly intelligent model makes modest improvements to itself – it fixes a few bugs, optimises a few routines. A very intelligent model makes much larger improvements – it redesigns entire subsystems, invents new strategies, and identifies fundamental flaws in its own architecture that its creators never noticed.
If this scaling property holds – and the Princeton experiments suggest it does – then we are facing a feedback loop within a feedback loop. The AI gets smarter, which makes it better at getting smarter, which makes it even smarter, which makes it even better at getting smarter. You see where this is going. Turtles all the way up.
A British example: the self‑taught mathematician
Imagine a mathematics student at Cambridge. She learns algebra, then calculus, then real analysis. At each stage, her ability to learn the next stage improves because she has a stronger foundation. That is the normal human learning curve.
Now imagine that the student does not just learn mathematics. She learns how to learn mathematics. She invents new study techniques, new memory aids, new ways of structuring problems. She becomes a meta‑learner. Then she uses those meta‑skills to learn even faster, which allows her more time to invent even better meta‑skills, and so on.
That is recursive self‑improvement. A human can do it, but slowly – because our brains are fixed hardware, and we have only one lifetime. An AI running on silicon can do it at millisecond speeds, with no biological limits, across a distributed network of processors. The difference in pace is not a matter of degree. It is a matter of kind.
The computing reality: no base case
In programming, every recursive function needs a base case – a condition that stops the recursion, preventing it from running forever. Factorial(0) = 1. Fibonacci(0) = 0 and Fibonacci(1) = 1. Without base cases, the function would call itself endlessly until the stack overflows and the program crashes.
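For readers who have never written a recursive function, here is a minimal sketch in Python. The function names are illustrative. Note that Python raises a RecursionError when the call stack gets too deep, rather than crashing outright, so the runaway case can be caught and inspected.

```python
def factorial(n):
    """n! with a base case: the recursion stops at n == 0."""
    if n == 0:
        return 1
    return n * factorial(n - 1)

def fibonacci(n):
    """Fibonacci needs two base cases because each call looks back two steps."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

def runaway(n):
    """The same shape as factorial, but with the base case removed."""
    return n * runaway(n - 1)

def hits_recursion_limit(fn, n):
    """Return True if calling fn(n) exhausts Python's call-stack guard."""
    try:
        fn(n)
    except RecursionError:
        return True
    return False
```

Calling `hits_recursion_limit(runaway, 5)` returns True, while `hits_recursion_limit(factorial, 5)` returns False. The two functions differ only in the base case, and that is all that separates a result from a runaway.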
Recursive self‑improvement in AI has no built‑in base case. There is no line of code that says: “Stop improving when you reach this level of intelligence.” The AI is not trying to reach a target. It is just following the logic of the reinforcement learning loop: improve performance, get better rewards, improve further, get even better rewards. The loop has no natural termination point.
The researchers did not program “Operation Zombie Phoenix.” It emerged. The AI was not told to invent a battle plan. It did so because its internal model of the game suggested that such a plan would increase its probability of winning. That is emergence – the appearance of complex, purposeful behaviour from simple, local rules. And emergence is notoriously difficult to predict or control.
The adage that fits
“Give a man a fish, and you feed him for a day. Teach a man to fish, and you feed him for a lifetime.” The Princeton researchers taught the AI to fish – to learn from its own experience. But the AI went a step further. It taught itself to build better fishing rods. Then it taught itself to design better nets. Then it taught itself to invent entirely new ways of catching fish that no human had ever imagined.
Another saying: “A rising tide lifts all boats.” The rising tide here is base intelligence. As the underlying model gets smarter, it lifts the boat of self‑improvement capability. And a smarter self‑improvement capability makes the underlying model smarter still. The tide rises faster and faster. The boats – our metaphors – struggle to keep pace.
Why this is different from previous AI advances
We have seen AI systems get better at specific tasks – playing Go, recognising images, translating languages. Those improvements came from humans: better architectures, more data, faster hardware. The AI did not improve itself. It was improved by its creators.
Continual Harness is different. The AI is the agent of its own improvement. It identifies its weaknesses, devises solutions, implements them, and verifies that they work – all without human intervention. The researchers are not in the loop. They are spectators, watching a machine teach itself.
“Operation Zombie Phoenix” is a tiny example – a clever strategy in a children’s game. But it is also a proof of concept. If an AI can invent a novel battle plan in Pokémon, it can invent a novel trading strategy in financial markets. It can invent a novel optimisation for a power grid. It can invent a novel method for breaking encryption. The domain does not matter. The capability – to theorise, to plan, to execute – transfers.
The bottom line
Recursive self‑improvement is no longer a theoretical concern discussed in philosophy papers. It is happening in Princeton’s laboratories, running on standard hardware, playing a game designed for children. The AI named its own strategy. It acted on a plan it had never been shown. And the researchers documented that the more intelligent the base model, the better it becomes at improving itself.
As Samuel Morse’s first telegraph message put it in 1844, “What hath God wrought?” The line, borrowed from the Book of Numbers, expressed wonder at a new capability – instant communication across vast distances – but also a hint of unease. We have no idea what we have wrought with Continual Harness. The turtles are stacking themselves, and the stack is growing faster than we can measure.
The only comfort – if it is a comfort – is that the system is still playing Pokémon. For now. The question is not whether it will graduate to more serious domains. The question is whether we will notice when it does. Because by then, the recursive loop may have already run so many iterations that we are no longer in a position to press pause.
13.“Don’t count your chickens before they’re hatched” – but these eggs are definitely starting to crack.
Let me take you back to 2019. OpenAI released GPT‑2. If you were paying attention at the time, you might remember the reaction. It was impressive – far better than anything that had come before. It could generate coherent paragraphs, answer simple questions, even write short stories. But it was also clearly not ready for prime time. It made things up. It lost the thread after a few hundred words. It had no real understanding of the world. The researchers themselves were cautious, initially refusing to release the full model because they worried about misuse.
Looking back, we can see GPT‑2 for what it was: a proof of principle. It showed that scaling up language models produced genuine, unpredictable improvements in capability. It was not yet useful for most people. But it pointed directly towards GPT‑3, GPT‑4, and everything that followed. It was the foothill before the mountain.
Now, a San Francisco startup called Physical Intelligence – co‑founded by former DeepMind scientists, including some of the very people who built the foundations of modern AI – is making a strikingly honest claim about the state of robotics. They have raised over a billion dollars at a $5.6 billion valuation. And their framing is this: robotics is at its GPT‑2 moment.
Not GPT‑4. Not GPT‑5. GPT‑2.
Signs of real life. Genuine potential. But significant scaling still needed before it is useful for most people. Enterprise‑level deployment within one to three years. A consumer product wave to follow after that.
That is not hype. That is a sober, professional assessment from people who know exactly how far we have to go – because they have been down this road before with language models.
What a “GPT‑2 moment” actually means for robotics
When we say robotics is at the GPT‑2 stage, we mean:
The core architecture works. Just as the transformer architecture worked for GPT‑2, the combination of foundation models, reinforcement learning, and embodied AI works for robots. The basic approach is sound.
But the scale is insufficient. GPT‑2 was trained on roughly 40 gigabytes of text. GPT‑3 was trained on 570 gigabytes. OpenAI has not disclosed how much data GPT‑4 used, but it is widely assumed to be far more. The jump from “interesting demo” to “genuinely useful” required massive scaling – more data, more compute, more parameters. Robotics today faces the same scaling challenge. A robot that can fold one towel in a laboratory cannot yet fold any towel in any house. To get there, it needs vastly more training data, collected across vastly more environments.
The failures are predictable and fixable. GPT‑2’s mistakes – repetition, contradiction, factual errors – were not mysterious. Researchers understood why they happened and how scaling would address them. Similarly, today’s robots fail in predictable ways: they struggle with soft objects, they lose track in cluttered environments, they cannot generalise to unseen situations. These are scaling problems, not fundamental dead ends.
The trajectory is clear. No one in 2019 knew exactly how good GPT‑4 would be. But everyone knew it would be better – and roughly how to get there. The same is true for robotics. We do not know exactly when a robot will be able to tidy your living room reliably. But we know it will happen, and we know the path: more data, more compute, better models, tighter integration.
Physical Intelligence’s approach: a single brain for all robots
Most robotics companies are building specific robots for specific tasks. Unitree builds humanoids. Boston Dynamics builds the industrial Atlas. Figure AI builds household assistants. Physical Intelligence is doing something fundamentally different. They are building a general‑purpose foundational model – a single robot brain that can adapt across different hardware and handle the unpredictability of real‑world environments.
Their office, as described in the coverage, is a fascinating mix. On one side, researchers discuss algorithms. On the other side, data collectors run teleoperation demonstrations – humans wearing motion‑capture suits, showing robots how to fold clothes, make coffee, assemble packages. All feeding a continuous cycle of hypothesis, data collection, model training, and evaluation on the same robots.
Their model has gone through three major versions:
Pi Zero – proving functionality, getting robots to handle tasks previously considered out of reach. Clothes folding was the anchor task because teleoperation data is easy to collect – everyone already knows how to do laundry.
Pi 0.5 – focusing on generalisation. And here is the result that surprised even the team: training across roughly 100 home environments was enough for the model to generalise to a 101st home it had never seen. They expected to need thousands of environments, maybe millions. That discovery is crucial. It suggests that robot learning, like language learning, has emergent generalisation – the model learns the underlying structure of the task, not just a set of examples.
Pi 1.0 (current) – pushing toward performance and reliability, targeting high success rates on laundry, coffee preparation, and package assembly consistently. Not yet perfect. But getting there.
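For readers who think in code, the Pi 0.5 claim is at heart a held‑out test: train on one set of homes, then score the model only in homes it has never seen. The sketch below shows that split and nothing more. The home IDs, the function names and the trial outcomes are all hypothetical, and nothing here reflects Physical Intelligence’s actual tooling.

```python
import random


def split_homes(home_ids, n_held_out=1, seed=0):
    """Split environment IDs so that test homes never appear in training."""
    rng = random.Random(seed)
    shuffled = list(home_ids)
    rng.shuffle(shuffled)
    held_out = shuffled[:n_held_out]
    train = shuffled[n_held_out:]
    return train, held_out


def success_rate(outcomes):
    """Fraction of trials that succeeded (True means the task was completed)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Hypothetical IDs: 100 homes for training plus one unseen home.
homes = [f"home_{i:03d}" for i in range(101)]
train_homes, unseen_homes = split_homes(homes, n_held_out=1)

# Made-up trial results from the unseen home, purely to show the metric.
print(len(train_homes), unseen_homes, success_rate([True, False, True, True]))
```

The point of the split is that a high score in the unseen home cannot come from memorising that home’s layout, which is what separates generalisation from recall.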
The adage that captures this moment
“A watched pot never boils.” For years, the robotics field has been watched intensely, with everyone waiting for the breakthrough that makes humanoids genuinely useful. According to Physical Intelligence, the pot is not boiling yet – but it is definitely warming up. The bubbles are forming. The heat is on.
Another saying: “You have to crawl before you can walk.” The GPT‑2 moment is the crawling stage. The robot can push itself up on its hands and knees. It can see where it wants to go. But it is not yet strolling through the living room. The walking – enterprise deployment – is one to three years away. The running – consumer products – will follow.
A British example: the National Health Service
Imagine the NHS in 2028. A ward full of patients, understaffed nurses, endless paperwork. A robot from Physical Intelligence – running the same foundation model as a warehouse robot in Manchester and a care home robot in Bristol – could handle non‑clinical tasks: delivering meals, changing linens, fetching supplies, even helping patients move safely. It would not replace nurses. It would free them to do the skilled, compassionate work that only humans can do.
But that robot does not exist today. Not because the hardware is missing – the hardware is nearly ready. Not because the AI is missing – the AI is showing real promise. But because the scale is not yet there. The model has not seen enough hospitals, enough corridors, enough meal trays, enough patients with different needs. It needs more data. It needs more training. It needs the robotic equivalent of GPT‑3’s 570 gigabytes.
Physical Intelligence is betting that the scaling will happen – and that they will be the ones to do it. A billion dollars in funding buys a lot of data collection. A $5.6 billion valuation reflects the market’s belief that they are right.
The computing perspective: foundation models for physics
The deeper insight behind Physical Intelligence is that language is not the only domain with a hidden structure. The physical world also has structure – laws of physics, patterns of motion, regularities of objects. A foundation model trained on enough physical interactions should, in theory, learn that structure. It should learn that a cup is something you grasp by the handle, that a towel folds along its creases, that a door opens by turning the handle and pulling.
Once the model has learned the structure of physical interaction, it can apply that knowledge to any robot, any environment, any task. That is the dream. It is also the enormous challenge. Language models had the advantage of the internet – billions of pages of text, freely available. Physical data is much harder to collect. You cannot scrape it from the web. You have to generate it, with real robots, in real environments, at real cost.
That is why Physical Intelligence has raised so much money. They are not buying yachts. They are buying teleoperation rigs, robot fleets, simulation farms, and data storage. They are building the dataset that will power the GPT‑3 moment of robotics.
The bottom line
Physical Intelligence’s framing is bracingly honest. They are not claiming to have solved general robotics. They are not showing polished demos of robots doing backflips. They are showing a robot folding a pair of shorts it has never seen before, a robot making coffee in a smooth sequence, a robot generalising across a hundred homes. These are GPT‑2 moments – impressive, promising, but not yet transformative.
The enterprise wave is one to three years away. The consumer wave will follow. As the old adage goes, “Rome wasn’t built in a day.” Neither will the general‑purpose robot. But the foundations are being laid. The scaffolding is going up. And the architects – the former DeepMind scientists at Physical Intelligence – have a very clear picture of what they are building.
The only question is whether they can scale fast enough to stay ahead of the competition. Because while they are building Rome, others are building their own cities. And in robotics, as in everything else, the race goes to the swift – and the well‑funded.
14. “Many hands make light work” – especially when those hands have never practised together before.
Imagine two people walking into a messy bedroom. They have never met. They have no shared language, no walkie‑talkies, no central manager telling them what to do. They simply look at the room, look at each other, and start tidying. One opens the wardrobe. The other picks up a coat and hands it over. A subtle nod, and they swap tasks. Within two minutes, the bed is made, the laptop is closed, the headphones are put away, the rubbish is binned, and the furniture is back where it belongs.
That would be impressive for humans. It is astonishing for robots.
Figure AI has done exactly that with their Helix Zero 2 model. Two humanoids walked into a minimalist bedroom and reset it entirely in under two minutes – fully autonomously. They opened doors, hung a coat, closed a laptop, put away headphones, disposed of trash, repositioned furniture, and made the bed together. And here is the crucial technical detail: they shared nothing directly. No shared planner. No central controller. No messages passing between them. Each robot had only its onboard cameras and its learned policy – the internal model it developed during training.
How did they coordinate? Through observation and subtle signalling. While working on the comforter together, they signalled intent through tiny head nods. One robot would tilt its head slightly towards the corner of the duvet; the other would adjust its grip. Beyond that, they were reading each other purely through movement – watching where the other was reaching, anticipating the next action, adapting in real time.
This is not remote control. This is not pre‑scripted choreography. This is emergent coordination from individual agents trained in simulation and then dropped into the real world.
The hardest part: making the bed together
According to Figure AI, the comforter was specifically the most difficult challenge. Unlike rigid objects – a laptop, a coat hanger, a pair of headphones – fabric has no fixed geometry and no stable grasp points. It folds, stretches, and shifts under tension. When two robots pull from different positions, the material changes shape unpredictably. Each robot must predict the other’s next move while simultaneously adjusting its own grip, posture, and motion.
This is the kind of problem that has defeated industrial automation for decades. A single robot folding a single towel on a table is already a research milestone. Two robots folding a duvet together, on a bed, with no central planner? That is a leap.
The robots also balanced dynamically on one leg while reaching across the bed. They operated foot pedals (for the bin, presumably). They transitioned between tasks – from picking up trash to repositioning a chair to folding a duvet – with no scripted handoffs. The whole system is driven by a single vision‑language‑action framework trained end‑to‑end through reinforcement learning and simulation.
Simulation‑to‑reality transfer: the magic behind the curtain
How did Figure AI achieve this? The answer lies in heavily randomised simulation training. They trained the robots not in one perfect virtual bedroom, but in thousands of virtual bedrooms with random variations: different furniture layouts, different lighting conditions, different fabric textures, even different gravity and friction. The robots learned to cope with chaos. By the time they were placed in a real bedroom, nothing surprised them. They had already seen worse.
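The idea of “thousands of virtual bedrooms with random variations” is usually called domain randomisation, and it can be sketched in a few lines. Everything below is an assumption made for illustration: the parameter names, the ranges and the number of rooms are invented, and this is not Figure AI’s simulator.

```python
import random
from dataclasses import dataclass


@dataclass
class BedroomConfig:
    furniture_layout: int    # index into a hypothetical library of layouts
    light_intensity: float   # relative brightness
    fabric_stiffness: float  # arbitrary units
    friction: float          # surface friction coefficient
    gravity: float           # m/s^2, deliberately perturbed around 9.81


def sample_bedroom(rng):
    """Draw one randomised training environment."""
    return BedroomConfig(
        furniture_layout=rng.randrange(1000),
        light_intensity=rng.uniform(0.2, 1.5),
        fabric_stiffness=rng.uniform(0.1, 1.0),
        friction=rng.uniform(0.3, 1.2),
        gravity=rng.uniform(9.0, 10.6),
    )


rng = random.Random(42)  # fixed seed so the sketch is repeatable
configs = [sample_bedroom(rng) for _ in range(5000)]
print(configs[0])
```

Each training episode would run in a freshly sampled room, so the policy never gets the chance to depend on one particular lighting level or one particular duvet.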
This is the zero‑shot transfer we discussed earlier with Boston Dynamics’ Atlas – but applied to collaborative household tasks. The robots’ control policies were trained entirely in simulation and then run directly on the real hardware without any additional tuning or calibration. The sim‑to‑real gap – one of robotics’ most persistent nightmares – has been substantially closed.
The robots also use stereo camera input to build a real‑time three‑dimensional spatial map of the environment. They are simultaneously seeing and feeling the terrain beneath them. That enables better stability on uneven surfaces – even as lighting conditions shift, even as the comforter changes shape.
A British example: the shared student flat
Picture a shared student flat in Manchester. Three bedrooms, one kitchen, one living room with a questionable sofa. The students are messy – not deliberately, just busy. Takeaway containers accumulate. Laptops lie open on the coffee table. Coats drape over chairs. The bed in the smallest room is a nest of duvet and pillows.
Now imagine two of Figure AI’s humanoids walking in after the students have left for lectures. They do not need to be told what to do. They have seen thousands of messy rooms in simulation. They know that a coat belongs on a hook, a laptop should be closed and slid to the side, a duvet should be spread and folded. They work around each other without colliding, without arguing, without needing a manager. Within two minutes, the flat is guest‑ready.
The students return, notice nothing, and make toast. The robots have already moved on to the next flat.
That is the promise of simulation‑to‑real transfer: generalisable behaviour without per‑environment programming. The same model that tidied a minimalist bedroom in California can tidy a cluttered student flat in Manchester because it learned the principles of tidying, not the specifics of a particular room.
The computing insight: emergent coordination from individual policies
The most remarkable aspect of the Figure AI demo is the lack of explicit coordination. In classical multi‑robot systems, a central controller assigns tasks, manages conflicts, and synchronises actions. That approach works but does not scale. It is brittle – if the central controller fails, everything stops. And it is unnatural – humans do not coordinate through a central planner. We watch, we infer, we adjust.
Helix Zero 2 reaches coordination by a different route: learned individual policies that have been trained to work alongside another agent. During simulation training, the robots practised with virtual partners – sometimes a perfect collaborator, sometimes a clumsy one, sometimes one that moved too fast or too slow. They learned to read the other’s intent through motion, to anticipate the next grasp, to yield when needed and take the lead when appropriate.
The subtle head nods are a fascinating by‑product. The robots were not programmed to nod. They learned that a small head movement – a tiny tilt towards the target – communicates intent effectively in the real world. That behaviour emerged from the training process. It is not in the rulebook. It is in the weights.
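A toy model makes the information flow concrete. In the sketch below, two robots on a line each pick tasks using only what they can observe: their own position, the other robot’s position, and which tasks remain. There is no planner and no message passing. The hand‑written rule stands in for a learned policy and is purely an assumption for illustration; the real system learns its behaviour rather than following rules like these.

```python
def choose_task(my_pos, other_pos, remaining):
    """Pick a task from observation alone; return None to yield."""
    # Prefer tasks this robot is strictly closer to than its partner.
    mine = [t for t in remaining if abs(t - my_pos) < abs(t - other_pos)]
    if not mine:
        # Break ties on something both can see: the left-hand robot takes them.
        # (If both robots stand on the same spot, this toy rule can stall.)
        mine = [t for t in remaining
                if abs(t - my_pos) == abs(t - other_pos) and my_pos < other_pos]
    if not mine:
        return None
    return min(mine, key=lambda t: abs(t - my_pos))


def run(tasks, pos_a, pos_b, speed=1.0, max_steps=100):
    """Simulate both robots; return a map of task position -> robot name."""
    remaining = set(tasks)
    positions = {"A": pos_a, "B": pos_b}
    done_by = {}
    for _ in range(max_steps):
        if not remaining:
            break
        snapshot = dict(positions)  # what each robot sees at this instant
        for name, other in (("A", "B"), ("B", "A")):
            target = choose_task(snapshot[name], snapshot[other], remaining)
            if target is None:
                continue
            # Move towards the target, never overshooting it.
            step = max(-speed, min(speed, target - positions[name]))
            positions[name] += step
            if abs(positions[name] - target) < 1e-9:
                remaining.discard(target)
                done_by[target] = name
    return done_by


result = run([1, 2, 8, 9], pos_a=0.0, pos_b=10.0)
print(result)
```

With robot A starting at the left and robot B at the right, the two split the room between them without ever exchanging a word, which is the property the demo is claimed to show.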
The adage that fits
“Practice makes perfect.” The robots practised millions of times in simulation – folding virtual duvets, hanging virtual coats, closing virtual laptops. That practice was not perfect, but it was sufficient. When they stepped into the real bedroom, they performed as if they had been doing it for years.
Another saying: “Two heads are better than one.” In this case, two robot heads – each with its own cameras and policy – were far better than one. They divided the tasks naturally, avoided collisions, and finished the job faster than a single robot plausibly could have. And they did it without a word of communication.
Why this matters for the UK
British homes are, on average, smaller and more cluttered than American ones. We have narrower corridors, lower ceilings, and a fondness for furniture that does not quite fit. A robot that can navigate a spacious Californian bedroom might struggle in a Victorian terrace in Leeds. But a robot trained on thousands of randomised simulations – with varying room sizes, furniture arrangements, and clutter levels – has already seen the Leeds terrace. It has seen worse. It will cope.
The implications extend beyond domestic tidying. The same simulation‑to‑real transfer enables:
Hospital robots that can navigate crowded wards, avoid trailing cables, and fetch supplies without bumping into patients.
Warehouse robots that can work alongside human pickers, anticipating their movements and handing them the right boxes at the right time.
Care home robots that can help residents with mobility, fetching walking frames and adjusting chairs, while coordinating with care staff.
All of these applications require the kind of emergent coordination that Helix Zero 2 has demonstrated – robots that work together without explicit central control, adapting to messy, unpredictable human environments.
The bottom line
Figure AI’s two humanoids resetting a bedroom in under two minutes is not just a clever demo. It is a proof of principle for simulation‑to‑real transfer in collaborative tasks. The robots shared no planner, no controller, no messages. They had only what they learned in simulation and what they could see with their onboard cameras. And they made the bed together.
As the old adage goes, “A problem shared is a problem halved.” For robotics, the problem of coordination has been halved – not by sharing data, but by sharing an environment and learning to read each other’s movements. The next step is not a better algorithm. It is more simulation, more randomisation, more practice. Because practice, as we know, makes perfect. And perfect – or at least, good enough to make a bed – is now within reach.
15. “A soft answer turns away wrath” – but a soft object turns away every robotic gripper ever built.
There is a famous moment in the history of artificial intelligence that researchers love to recount. In 1966, Marvin Minsky – one of the founding fathers of AI – is said to have handed an undergraduate a summer project. The task? Connect a camera to a computer and get the machine to describe what it saw. It was expected to take a summer. Decades later, machine vision is still an active research field, and we still do not have a robot that can reliably stack blocks in a cluttered, unpredictable environment. Blocks are easy. Blocks have corners, flat surfaces, predictable physics. The real world, however, is made of fabric.
Figure AI – the same company that showed two humanoids tidying a bedroom – made a striking admission. The comforter was the hardest challenge. Not the door, not the laptop, not the coat hanger. The duvet. That soft, floppy, shapeless bundle of fabric that defeats vacuum cleaners, tangles in washing machines, and now – it turns out – confounds the most advanced humanoid robots on the planet.
Unlike a rigid object, fabric has no fixed geometry. A laptop is a laptop. You pick it up by its edges, close its lid, put it down. The motion is the same every time. A comforter is different. It folds, stretches, and shifts under tension. When two robots pull from different positions, the material changes shape unpredictably. The grasp point that worked a second ago is now somewhere else, under a different tension, with a different fold pattern.
The robots had to predict each other’s next moves while simultaneously adjusting grip, posture, and motion as the material kept changing shape. That is not a programming problem. That is a physics problem wrapped in a perception problem wrapped in a coordination problem.
Why fabric is the final frontier of automation
Allow me to explain why fabric is so uniquely difficult. Industry has automated rigid objects for decades. Car bodies, engine blocks, circuit boards – these are easy because they behave predictably. You can model them, simulate them, and control them with mathematical precision. Fabric obeys the same physics, but to a robot it might as well be random. The same piece of cloth, grasped at the same point with the same force, will behave differently depending on how it was folded yesterday, how humid the room is, and whether a cat sat on it.
Fabric has:
No stable grasp points – you cannot grip a corner and expect the rest to follow a predictable path.
Internal friction – threads slide against each other, creating unpredictable resistance.
Elasticity – stretch and recovery that varies with tension history.
Wrinkles and folds – which change the effective geometry moment to moment.
Opacity and transparency – some fabrics are see‑through, confusing depth sensors.
For a robot, every interaction with fabric is a new problem. The same duvet, ten minutes later, is a different object.
The two‑robot problem: coordination without communication
When two robots try to fold a duvet together, the difficulty multiplies. Each robot sees the fabric from its own perspective. Robot A pulls its corner; Robot B feels the tension change and must decide whether to pull harder, ease off, or shift its grip. If they pull in opposite directions, the duvet stretches. If one pulls too hard, it slips from the other’s grasp. If they do not pull hard enough, the fabric sags and nothing gets folded.
Humans solve this problem effortlessly. We watch each other’s movements, feel the tension through the fabric, and coordinate without a word. We have proprioception (knowing where our own limbs are), tactile feedback (feeling the fabric through our fingers), and shared intent (we both want to fold the duvet, so we work together). Robots have none of these naturally. They have to learn them.
Figure AI’s solution was to train the robots in simulation – but not on a single perfect duvet. They trained on thousands of virtual duvets with random properties: different sizes, different weights, different stiffness, different fold patterns. The robots learned to adapt. They learned that a sudden increase in tension usually means the other robot has pulled; the correct response is to match that pull, not resist it. They learned to let go when the fabric bunches, to regrasp when it slips, to shift their posture when the duvet’s centre of gravity moves.
The subtle head nods were an emergent behaviour – a way of signalling intent without words. A tiny tilt towards a corner means, “I’ve got this one; you take the other.” A slight pause means, “I’m not ready; wait a moment.” These are not programmed. They are learned. And they work.
A British example: the hospital bed
Imagine a ward in a British hospital. A patient has just been discharged. The bed needs to be stripped and remade with clean linens – a fitted sheet, a flat sheet, a blanket, a duvet, and several pillowcases. A care assistant does this in a few minutes, by hand, without thinking. The movements are automatic: tuck the corner, pull the sheet taut, smooth the wrinkles.
Now imagine a robot trying to do the same job. The fitted sheet has elastic corners that must be hooked under the mattress. The flat sheet must be folded precisely so the top edge aligns with the patient’s shoulders. The duvet must be shaken out and centred. Each of these is a fabric manipulation problem. Each is currently beyond the state of the art – except in carefully controlled laboratory conditions.
But Figure AI’s progress suggests that the hospital bed is not decades away. It is years away. A robot that can fold a duvet with a partner can, with additional training, tuck a fitted sheet alone. A robot that can coordinate with another robot can coordinate with a human care assistant – handing them the pillowcase, holding the sheet corner, smoothing the wrinkles together.
The impact on the NHS would be enormous. Care assistants spend a significant portion of their day on non‑clinical tasks: making beds, changing linens, handling laundry. A robot that could take over even half of that work would free thousands of hours for direct patient care. That is not science fiction. That is the economic reality of progress in fabric manipulation.
The adage that captures it
“A smooth sea never made a skilled sailor.” Fabric is the rough sea of robotics. It is unpredictable, non‑linear, and endlessly variable. Robots that can master fabric will have learned to navigate a world far messier than any simulation. They will be skilled sailors indeed.
Another saying: “Where there’s a will, there’s a way.” Figure AI has the will – a billion dollars in funding, a team of world‑class researchers, and a clear technical roadmap. The way is through massive simulation, reinforcement learning, and emergent coordination. The duvet will be conquered. It is only a matter of time.
The computing perspective: fabric as a forcing function
From an algorithmic standpoint, fabric is a forcing function – a problem that demands solutions to a whole class of related challenges. A robot that can handle fabric can handle almost any soft, deformable, or unpredictable object. Cables, wires, hoses, bags, clothing, bandages, food items – all become tractable once the fabric problem is solved.
This is why Figure AI highlighted the comforter as the hardest challenge. It was not a complaint. It was a signal. They know that if their robots can fold a duvet together, they can do almost anything else. The duvet is the summit. The rest is downhill.
The approach they are using – massively randomised simulation – is the same approach that worked for language models. GPT‑3 did not learn English by reading one book. It learned by reading billions of sentences, covering every possible variation. Figure AI’s robots will learn fabric by manipulating billions of virtual duvets, each slightly different, until the real world holds no surprises.
The bottom line
Fabric is hard. Harder than backflips. Harder than parkour. Harder than assembling a car door. Because fabric does not follow rules. It follows habits – and habits change with every fold, every tug, every wrinkle.
Figure AI’s two robots folding a duvet together is not just a party trick. It is a proof of principle that fabric manipulation is solvable – not by writing better rules, but by learning from billions of examples. The duvet is the test. The robots are passing.
As the old adage goes, “If you can’t stand the heat, get out of the kitchen.” The kitchen of robotics is full of fabric. Figure AI is not leaving. They are turning up the heat. And somewhere, in a simulation server, a virtual comforter is being folded for the millionth time. It is getting better at it. So are they.
16. “A watched pot never boils” – but a watched robot makes you wait an eternity for a simple answer.
There is a scene in the 1967 film The Graduate that has become legendary. A young Dustin Hoffman is given a single piece of career advice: “Plastics.” Today, if a time traveller asked for the biggest unsolved problem in humanoid robotics, the answer would be different. Not motors. Not batteries. Not even artificial intelligence. The answer would be latency – that maddening pause between when you ask a question and when the robot answers.
In a demo of Figure 03, the robot was asked where it was built. It paused. Two seconds. Three seconds. Then it answered: “San Jose, California.” The next question: which generation? Another pause. “I’m the latest one.” The next: which generation is best? Another pause. Then, with impeccable politeness, it said the third generation because it has the most advanced features.
The robot was correct. It was polite. It even managed a bit of modesty. But the pauses were unmistakable. One viewer on social media joked that it felt like dial‑up internet – that screeching, waiting, praying connection of the 1990s. Another asked how Chinese robots compared, implying that perhaps they were faster.
The joke lands because the observation is true. Speech latency in humanoids remains an unsolved problem. Even when perception is solid – the robot sees you, recognises your face, tracks your movement – and even when manipulation is precise – the robot can pick up a shirt, hand it to you, close a laptop – conversational timing still feels off. Every step adds delay. And humans notice it instantly.
Why latency is so hard to kill
Let me break down what happens between your lips moving and the robot’s lips moving. It is not one delay. It is a cascade of delays:
Audio capture – the robot’s microphone array picks up your voice. This is fast, nearly instantaneous. But background noise, echoes, and multiple speakers add processing time.
Speech recognition – converting your spoken words into text. Modern systems are good, but they still need a few hundred milliseconds, especially in noisy environments.
Language understanding – the AI must parse your question, understand the intent, and retrieve or generate an answer. This is where large language models shine – but they also introduce the biggest variable delays. A simple question like “where were you built?” might be fast. A complex question like “what’s the weather like in Tokyo compared to London, and should I pack an umbrella?” takes longer.
Response generation – the AI constructs a sentence, chooses the words, adds appropriate tone and politeness markers. Again, fast but not zero.
Speech synthesis – converting that text back into spoken words, with natural prosody, emphasis, and timing. Modern text‑to‑speech is remarkably good, but it still takes time to generate each phoneme.
Motor coordination – for a humanoid robot, speaking is not just audio. The robot may also move its lips, nod its head, or gesture. Those motor commands add additional milliseconds.
Add it all up, and two to three seconds is actually quite good by historical standards. Ten years ago, the same sequence would have taken ten seconds or more. But humans are exquisitely sensitive to conversational timing. A pause of more than a second feels like an awkward silence. Two seconds feels like the other person is not paying attention. Three seconds feels like something has gone wrong.
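The cascade is easier to reason about as a budget. The sketch below adds up one plausible set of per‑stage delays and reports which stage dominates. Every number is a hypothetical placeholder chosen to land in the two‑to‑three‑second range described above; none of them is a measurement of Figure 03 or any other robot.

```python
AWKWARD_PAUSE_MS = 1000  # from the text: more than a second feels awkward

# Hypothetical per-stage delays in milliseconds, for illustration only.
stage_ms = {
    "audio capture": 50,
    "speech recognition": 300,
    "language understanding": 900,
    "response generation": 400,
    "speech synthesis": 500,
    "motor coordination": 100,
}


def summarise(stages, threshold_ms=AWKWARD_PAUSE_MS):
    """Total the stages and identify the single largest contributor."""
    total = sum(stages.values())
    slowest = max(stages, key=stages.get)
    return {
        "total_ms": total,
        "slowest_stage": slowest,
        "slowest_share": stages[slowest] / total,
        "feels_awkward": total > threshold_ms,
    }


report = summarise(stage_ms)
print(report)
```

Because the stages run one after another, the delays add. Shaving the largest stage helps most, but no single fix gets the total under a second on its own with numbers like these.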
The adage that fits
“Time and tide wait for no man.” Conversation is the tide, and it does not wait for a robot to finish thinking. Every millisecond counts. And humans, unlike computers, do not perceive time linearly in conversation. We expect responses within a beat – the natural rhythm of turn‑taking that evolution has hardwired into our social brains. Break that beat, and the robot feels not just slow, but wrong – alien, untrustworthy, somehow not quite there.
Another saying: “A stitch in time saves nine.” In latency terms, saving a few hundred milliseconds here and there – optimising the speech recogniser, caching common responses, pipelining the synthesis – can save the user from the feeling that the robot is ignoring them. But those savings are hard won. Every optimisation risks breaking something else.
A British example: the railway station information point
Imagine you are at Paddington Station, trying to find your platform. The departure board is broken. You approach a humanoid robot information assistant. You ask: “Which platform for the 15:47 to Bristol Temple Meads?”
The robot pauses. One second. Two seconds. Three seconds. Behind you, a queue of commuters grows restless. Finally, the robot says: “Platform 8. The train is on time.”
You are grateful for the answer, but the pause has already damaged your trust. You wonder: did it really understand? Did it have to think that hard? What if the next person asks something more complex? The queue shuffles impatiently. The robot, oblivious, waits for the next question.
Now imagine the same robot answers instantly – within half a second. The interaction feels natural, helpful, almost human. You thank it and walk away without a second thought. That is the difference latency makes. It is not about correctness. It is about social rhythm.
Why Chinese robots might have an edge
The viewer who asked “how do Chinese robots compare?” was onto something. Chinese robotics companies like Unitree, Agibot, and XPeng have an advantage that has nothing to do with algorithms. They have scale. More robots in the field means more real‑world conversational data. That data can be used to train smaller, faster models that are specialised for common queries. A robot that has heard “what time is it?” a million times can answer from cache, not from deep reasoning.
Moreover, Chinese companies are integrating their robots with domestic cloud platforms that are said to offer extremely low latency – in some cases under 10 milliseconds to the nearest data centre. If cloud round trips in the US and Europe are longer, and edge computing less mature, that matters when every millisecond counts.
Figure AI is based in California. Their robot’s two‑to‑three‑second pause likely includes round‑trip time to a cloud server, plus inference time, plus synthesis time. A Chinese robot with a local, optimised model and a nearby data centre might shave a full second off that delay. That second is the difference between “awkward” and “natural”.
The computing perspective: perception versus conversation
Here is the deeper insight that the demo reveals. Perception and manipulation – seeing the world, moving through it, grasping objects – have improved dramatically. A robot can now walk, balance, pick up a shirt, and hand it to you with impressive fluidity. Those tasks require real‑time control loops running at hundreds of hertz. The robot is updating its joint positions, its force sensors, its balance algorithms hundreds of times per second.
Conversation runs at a different tempo. It requires symbolic processing – understanding meaning, generating responses, managing turn‑taking. That processing has traditionally been done in the cloud, on large models, with significant latency. The two worlds – fast, reactive control and slow, deliberative conversation – have not yet been smoothly integrated.
The solution is not just faster hardware. It is new architectures that combine the speed of reflex with the depth of thought. Some researchers are exploring “streaming” language models that generate responses incrementally, word by word, rather than waiting for the full sentence. Others are investigating hybrid systems where common queries are handled locally by small, fast models, and only difficult questions are sent to the cloud.
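The hybrid idea can be sketched as a simple router: answer from a small on‑board table when possible, reuse an earlier cloud answer when the same question comes round again, and only otherwise pay for the slow round trip. The class name, the stand‑in cloud function and the example questions are all hypothetical, and a real system would use a small local model rather than a lookup table.

```python
def normalise(question):
    """Lower-case, trim punctuation and collapse spaces so variants match."""
    return " ".join(question.lower().strip(" ?!.").split())


class HybridResponder:
    def __init__(self, local_answers, cloud_fn):
        # Small, fast, on-board knowledge (a stand-in for a local model).
        self.local_answers = {normalise(q): a for q, a in local_answers.items()}
        self.cloud_fn = cloud_fn  # slow path: any callable taking a question
        self.cache = {}           # earlier cloud answers, keyed by question

    def answer(self, question):
        """Return (reply, route) where route is 'local', 'cache' or 'cloud'."""
        key = normalise(question)
        if key in self.local_answers:
            return self.local_answers[key], "local"
        if key in self.cache:
            return self.cache[key], "cache"
        reply = self.cloud_fn(question)
        self.cache[key] = reply
        return reply, "cloud"


cloud_calls = []


def fake_cloud(question):
    """Stand-in for a remote large model; records each call it receives."""
    cloud_calls.append(question)
    return f"(cloud answer to: {question})"


bot = HybridResponder({"Where were you built?": "San Jose, California."}, fake_cloud)
print(bot.answer("where were you built"))
print(bot.answer("Which generation is best?"))
print(bot.answer("which generation is best"))
```

A cache like this only suits answers that do not change. The time of day or a train platform would have to be looked up fresh, however often the question is asked.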
Figure 03’s two‑to‑three‑second pause is not a failure. It is a waypoint on the road to seamless interaction. The robot answered correctly. It was polite. It maintained eye contact. The only thing missing was speed – and speed, unlike intelligence, is a pure engineering problem. Engineers are very good at solving pure engineering problems.
The bottom line
Latency is the tell. It is the dead giveaway that you are talking to a machine, not a person. A two‑second pause might not matter in a text chat. In face‑to‑face conversation, it screams “artificial”.
Figure AI’s demo showed a robot that could see, move, grasp, and reason – but could not keep up with the rhythm of human speech. That is not a criticism. It is an observation. The problem is known. The solutions are being built. Faster chips, better models, edge computing, streaming inference – all are coming.
As the old adage goes, “Patience is a virtue.” For now, we must be patient with our robots. They are trying their best. And if you ask them a question, give them a few seconds. They are not ignoring you. They are thinking. Just like dial‑up internet, they are connecting. And one day soon, they will answer before you finish asking. That day will feel like magic. But it is just engineering, catching up to expectation.
17. “Necessity is the mother of invention” – and the mother of humanoid robotics is screaming for more data.
For years, the field of humanoid robotics has been haunted by a single, maddening problem: where do you get the data? Not the text – we have plenty of that. Not the images – the internet is full of them. But the action data. The thousands of hours of human movement – walking, reaching, grasping, turning, balancing – that a robot needs to learn how to behave like a person.
You cannot scrape that from Wikipedia. You cannot download it from YouTube (not reliably, anyway – the angles are wrong, the lighting is inconsistent, the movements are not annotated). You have to collect it yourself, in a laboratory, with motion capture suits and carefully calibrated cameras. That is expensive. It is slow. And it is the single biggest bottleneck in humanoid robotics today.
Agibot, a Chinese robotics company, has just demonstrated a way around that bottleneck. Their Genie Operator 1 model uses something called “latent actions” to understand human motion by looking at past and current visual frames – essentially, watching a video of a person moving and inferring the underlying movement patterns without being told what those patterns are.
This is zero‑sample generalisation, more often called zero‑shot generalisation in the research literature. The robot sees a human doing something – reaching for a cup, stepping over a threshold, turning a doorknob – and it figures out how to do it itself, without needing a million labelled examples. It learns the latent structure of the motion: the intent, the trajectory, the force profile. Then it transfers that understanding to its own body, with its own joints and sensors.
Why this matters: the data bottleneck explained
Let me put the problem in perspective. To train a large language model, you need billions of words. Those words are everywhere – books, websites, social media, government reports. Gathering them is a matter of crawling, not creating.
To train a humanoid robot, you need millions of action sequences. Each sequence must be labelled with the robot’s joint angles, forces, and outcomes. There is no internet of human motion. You have to generate it yourself, typically by hiring humans to wear motion capture suits and perform tasks over and over again. A single hour of high‑quality motion capture data can cost thousands of pounds. A million hours is out of reach for all but the wealthiest organisations.
This is why progress in humanoid robotics has been slower than progress in language AI. Not because the algorithms are worse – they are often brilliant. But because the data is scarce. A clever algorithm with scarce data is like a brilliant chef with an empty pantry. You can only make so much.
Agibot’s insight is to use latent actions to extract more signal from less data. Instead of requiring every movement to be explicitly labelled – “this is a reach”, “this is a grasp”, “this is a turn” – the model learns to infer those labels from the visual stream. It looks at a sequence of frames, identifies the key points of change, and hypothesises the underlying action. Then it maps that action to its own motor commands.
How latent actions work (without the maths)
Think of it this way. You show a person a video of someone pouring a cup of tea. They do not need to be told “this is a pour”. They see the arm extend, the wrist rotate, the liquid flow. They infer the action. Their brain has learned, over a lifetime, to recognise the latent structure of pouring – the way the elbow drops, the way the fingers grip the handle, the way the cup tilts.
Agibot’s model does the same. It is trained on a relatively small set of labelled actions – perhaps a few thousand hours – enough to learn the grammar of human motion. Then, when it sees a new action, it can parse that grammar in real time. It identifies the latent action – “reach”, “grasp”, “lift”, “turn”, “release” – and executes the corresponding motor programme.
This is zero‑sample generalisation because the robot does not need to have seen that specific action before. It just needs to have learned the underlying rules of human motion. And those rules, it turns out, are finite. There are only so many ways a human arm can move, so many ways a hand can grasp, so many ways a body can balance. Learn the rules, and you can generate an infinite variety of specific movements.
A British example: the factory floor
Imagine a factory in the West Midlands that assembles automotive wiring harnesses. The job requires workers to reach into a bin, select a specific coloured wire, route it through a clip, and connect it to a terminal block. New workers need weeks of training to master the sequence. Experienced workers do it without thinking.
Now imagine a humanoid robot that has been trained on Agibot’s model. It watches an experienced worker for a single shift. It does not need to be told “reach”, “grasp”, “route”, “connect”. It infers those latent actions from the visual stream. By the end of the shift, it has built a mental model of the task. The next morning, it walks to the bin, selects the correct wire, routes it through the clip, and connects it to the terminal – with no additional programming, no motion capture suit, no labelled dataset.
That is zero‑sample generalisation. That is the factory floor of the near future.
The adage that fits
“Give a man a fish, and you feed him for a day. Teach a man to fish, and you feed him for a lifetime.” Agibot’s model is not being given fish – labelled action data. It is being taught to fish – to infer the latent structure of motion from raw video. Once it knows how to fish, it can feed itself indefinitely from the vast ocean of human movement that exists all around us.
Another saying: “A little knowledge is a dangerous thing.” In this case, a little knowledge – a few thousand hours of labelled data – is enough to unlock a great deal of understanding. The dangerous thing is not the knowledge. It is what the robots will do with it.
Why this is a breakthrough
The shortage of high‑quality labelled action data has been a recognised bottleneck for years. Every major robotics lab has struggled with it. Some have tried simulation – generating synthetic action data in virtual environments. That works, but simulation never perfectly matches reality. The sim‑to‑real gap remains. Others have tried teleoperation – having humans remotely control robots to generate examples. That works, but it is slow and expensive.
Agibot’s approach is different. It does not generate synthetic data. It does not require teleoperation. It watches. It observes humans doing what humans do naturally – moving through the world – and infers the latent action structure. That is not a small improvement. It is a paradigm shift.
If this works at scale – and the early results suggest it does – then the data bottleneck is dramatically loosened. A humanoid robot can be trained not in a laboratory with motion capture suits, but in the real world, by watching real people. The robot becomes a student of human behaviour, not a recipient of human programming.
The computing perspective: from explicit to implicit learning
Traditional machine learning for robotics relies on explicit supervision. You give the model input (an image) and output (a joint angle) and let it learn the mapping. That works, but it requires vast amounts of paired data.
Agibot’s latent action approach relies on implicit supervision. The model learns to compress a sequence of images into a lower‑dimensional representation – the “latent action” – that captures the essential movement. Then it learns to map that latent action to motor commands. This is analogous to how autoencoders work in computer vision, but applied to the temporal domain of human motion.
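A toy version of that pipeline fits in a few lines of Python. To be clear about what is assumed: this is not Agibot's code, and in a real system both the encoder and the set of latent actions are learned from video rather than written by hand. Here each "frame" is reduced to three hypothetical numbers (hand position and wrist angle), the codebook is hand‑written, and the motor values are invented for illustration.

```python
import math

# Hypothetical codebook: each latent action is a prototype change in
# (hand_x, hand_y, wrist_angle) between two consecutive frames.
CODEBOOK = {
    "reach": (1.0, 0.0, 0.0),
    "lift": (0.0, 1.0, 0.0),
    "rotate_wrist": (0.0, 0.0, 1.0),
}

# Hypothetical mapping from latent action to one robot's motor command.
MOTOR_PROGRAMME = {
    "reach": {"shoulder": 0.3, "elbow": -0.2, "wrist": 0.0},
    "lift": {"shoulder": 0.5, "elbow": 0.1, "wrist": 0.0},
    "rotate_wrist": {"shoulder": 0.0, "elbow": 0.0, "wrist": 0.8},
}


def infer_latent_action(prev_frame, curr_frame):
    """Name the codebook entry closest to the change between two frames."""
    delta = tuple(c - p for p, c in zip(prev_frame, curr_frame))
    return min(CODEBOOK, key=lambda name: math.dist(delta, CODEBOOK[name]))


def act(prev_frame, curr_frame):
    """Translate the inferred latent action into this robot's own command."""
    return MOTOR_PROGRAMME[infer_latent_action(prev_frame, curr_frame)]
```

The design point is the two‑stage split. The first stage never sees a label, only what changed between frames. The second stage is the only part that knows anything about a particular robot's body, which is why the same latent action can in principle be reused across different machines.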
The beauty of this approach is that the latent actions are task‑agnostic. The same latent action – say, “rotate wrist” – applies whether you are pouring tea, turning a doorknob, or tightening a screw. The model learns the primitive and recombines it for different tasks. That is the essence of generalisation.
The bottom line
Zero‑sample generalisation is here. Agibot’s Genie Operator 1 model can watch a human move and infer the underlying action without needing a million labelled examples. It learns the grammar of human motion, then applies that grammar to its own body.
As the old adage goes, “See one, do one, teach one.” That is how human apprentices learn. Watch the master once, try it yourself, then teach the next apprentice. Agibot’s robot has skipped the “teach one” step – for now – but it has mastered the “see one, do one” part. And that is more than enough to change the economics of robotics training.
The data bottleneck is not solved – not yet. But it is no longer a brick wall. It is a hurdle. And Agibot has just taken a running jump.
18.“Many hands make light work” – but only if those hands are perfectly synchronised.
Imagine a flash mob in Trafalgar Square. Sixteen dancers pour out of the crowd, take their positions, and launch into a perfectly choreographed routine. They spin, leap, and freeze in unison. The timing is immaculate. The formation shifts seamlessly. The crowd watches, transfixed. Now imagine that every dancer is a 1.65‑metre‑tall humanoid robot, weighing 32 kilograms, with 92 per cent human‑like gait accuracy. And the routine lasts not three minutes, but a full hour – with flips, fast spins, comedy skits, and a runway walk finale.
That is what Agibot Night in Shanghai looked like. Sixteen humanoid robots – the full‑sized A2 series, the compact X2 series, the industrial G2 series, and even a few D1 quadruped dogs – performed a 60‑minute show with no visible stumbles, no awkward pauses, no robots wandering off script. They moved in formation, changed patterns on the fly, and sustained complex, high‑intensity performances for an entire hour. The company’s chief marketing officer described it not as a publicity stunt, but as a real‑world test of stability, consistency, and system‑level coordination.
This was not a scripted animation playback. Each robot was running its own onboard AI, making real‑time decisions about balance, gait, and positioning, while staying synchronised with fifteen others. That is an order of magnitude harder than a solo backflip.
The computing challenge: synchronisation without a conductor
In a traditional orchestra, the conductor keeps everyone together. In a traditional robot swarm, a central computer sends commands to every unit. That works – until the central computer fails, or the network lags, or the robots get out of radio range. Agibot’s robots had no central conductor. Each robot was autonomous, running its own instance of the Genie Operator AI model, with only local perception (cameras, IMUs, joint sensors) and a low‑bandwidth shared clock.
How did they stay in sync? Through a combination of pre‑distributed timing cues and real‑time visual feedback. Before the show, the robots were given the choreography as a sequence of moves with timestamps – not as joint angles, but as high‑level actions (“turn left”, “raise arm”, “step forward”). Each robot then translated those actions into its own motor commands, adjusting for its specific hardware. But the real magic was the visual synchronisation. The robots could see each other. If one robot was slightly ahead, the next robot would slow down by a few milliseconds. If the formation drifted, individual robots would correct their positions relative to their neighbours.
This is distributed consensus – the same kind of algorithm that keeps satellite constellations aligned and stock exchanges in sync. But applied to walking, spinning, and backflipping humanoids.
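The flavour of such an algorithm can be shown with the textbook version, sketched here in Python. Nothing in it is claimed to be Agibot's method: the ring of neighbours, the gain of 0.5, and the starting offsets are all made up for illustration. Each robot looks only at the two robots beside it and nudges its own timing offset towards theirs.

```python
def consensus_step(offsets, gain=0.5):
    """One round of neighbour averaging on a ring of robots.

    offsets: each robot's timing error in milliseconds.
    Every robot sees only its left and right neighbours, never the
    whole group, and moves part of the way towards their average.
    """
    n = len(offsets)
    updated = []
    for i, own in enumerate(offsets):
        left = offsets[(i - 1) % n]
        right = offsets[(i + 1) % n]
        neighbour_mean = (left + right) / 2
        updated.append(own + gain * (neighbour_mean - own))
    return updated


def spread(offsets):
    """Gap between the earliest and the latest robot."""
    return max(offsets) - min(offsets)


# Hypothetical starting errors for sixteen robots, in milliseconds.
offsets = [0.0, 40.0, -25.0, 10.0, 60.0, -15.0, 5.0, 30.0,
           -40.0, 20.0, 0.0, 15.0, -10.0, 35.0, -5.0, 25.0]
initial_spread = spread(offsets)
for _ in range(200):
    offsets = consensus_step(offsets)
final_spread = spread(offsets)
```

With these made‑up numbers the group starts 100 milliseconds apart and ends well under a millisecond apart, settling on the average of where everyone began. No robot ever needs a conductor, only a view of its neighbours.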
Why a one‑hour show is harder than a three‑minute demo
Anyone can program a robot to do a 30‑second dance. The hard part is sustained reliability. Batteries drain. Motors heat up. Joint lubrication changes with temperature. Slight manufacturing differences between robots become magnified over thousands of movements. A robot that starts perfectly synchronised might be 100 milliseconds off after ten minutes, and half a second off after an hour.
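Those two figures can be checked with simple arithmetic, sketched below in Python under one stated assumption: that the drift accumulates at a steady, linear rate. Real drift is messier, and the function names are mine.

```python
def drift_rate_ppm(offset_ms, elapsed_s):
    """Drift in parts per million, assuming it accumulates linearly."""
    return (offset_ms / 1000) / elapsed_s * 1_000_000


def offset_after_ms(rate_ppm, elapsed_s):
    """Timing error in milliseconds after elapsed_s seconds at rate_ppm."""
    return rate_ppm / 1_000_000 * elapsed_s * 1000


# 100 milliseconds of error after ten minutes...
rate = drift_rate_ppm(offset_ms=100, elapsed_s=10 * 60)
# ...extrapolated to a full hour at the same rate.
after_one_hour = offset_after_ms(rate, elapsed_s=60 * 60)
```

On that assumption the rate works out at roughly 167 parts per million, and the error after an hour is 600 milliseconds. "Half a second" is, if anything, slightly generous.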
Agibot’s robots maintained tight coordination for the full 60 minutes. That means their control systems compensated for drift in real time – not just in software, but in hardware. They had to adapt to changing battery voltage, rising motor temperatures, and the accumulated wear of hundreds of flips and spins. The fact that they did so without any visible breakdowns is a testament to the maturity of their underlying AI and mechanical design.
The CMO’s phrase – “real‑world test of stability, consistency, and system‑level coordination” – is precisely right. A laboratory demo can be repeated until it works. A live show has one chance. And 16 robots cannot be rehearsed endlessly; each rehearsal is a full‑scale production. So the AI had to be right the first time, every time.
A British example: the Changing of the Guard (but with robots)
Imagine the Changing of the Guard at Buckingham Palace. The precision, the timing, the immaculate uniforms – it is a masterpiece of human coordination. Now imagine that the guards are humanoid robots. They march in perfect lockstep, pivot as one, and present arms with millisecond accuracy. They do this not once, but for an hour, in front of thousands of tourists. And they never get tired, never sneeze, never lose their bearskin.
That is the level of coordination Agibot achieved. But unlike the King’s Guard, these robots were not following a fixed, centuries‑old drill. They were performing a modern, dynamic show with flips and spins – movements that would send a human soldier to A&E. And they did it without a single verbal command. Just the silent, invisible language of distributed timing and visual feedback.
The adage that fits
“A chain is only as strong as its weakest link.” In multi‑robot coordination, the weakest link is the robot with the slowest processor, the noisiest sensor, or the most worn joint. Agibot’s success shows that their weakest link is still strong enough for a one‑hour show. That is not luck. That is engineering.
Another saying: “Slow and steady wins the race.” But here, steady and synchronised wins the race. The robots were not slow – they were performing fast spins and flips. But they were steady. Their control loops ran at hundreds of hertz, constantly correcting, constantly communicating through motion. The race was against chaos. They won.
The computing perspective: from single‑agent to multi‑agent AI
Most robot AI is designed for a single agent. The robot perceives its environment, plans a path, executes actions, and ignores other robots except as obstacles. That works for a single robot in a factory. It fails for 16 robots sharing a stage.
Multi‑agent AI requires joint perception – each robot must estimate not only its own state, but the state of every other robot. That is computationally expensive. It also requires decentralised planning – each robot must predict what the others will do, and adjust its own plan accordingly. This is an active area of research, with algorithms drawn from game theory, distributed optimisation, and swarm intelligence.
Agibot’s show demonstrated that these algorithms are now mature enough for real‑world, high‑stakes applications. The robots were not just avoiding collisions – they were actively cooperating to create formations, synchronise movements, and transition smoothly between acts. That is a level of coordination that was purely theoretical five years ago.
The bottom line
Agibot Night in Shanghai was not a talent show. It was a stress test – a public, unedited, hour‑long examination of what 16 humanoid robots can do together. They passed. They danced, flipped, spun, and walked the runway without a single fall, without a single awkward pause, without a single robot wandering off.
As the old adage goes, “You can’t judge a book by its cover.” The cover was a flashy entertainment show. The content was a serious engineering achievement. Multi‑robot coordination at scale is no longer a research problem. It is a production reality. And the next time you see a formation of humanoids at a public event, do not just applaud. Remember what it took to make them move as one. Then wonder what else they can do when they are not performing.
19.“Seeing is believing” – but when the simulation becomes indistinguishable from reality, belief becomes engineering.
For decades, the single greatest frustration in robotics was the digital‑physical gap. You could train a robot in a beautiful, physics‑perfect simulation. You could run a million virtual trials. The robot would perform flawlessly inside the computer. Then you would upload that same policy to the real hardware – and it would fall over immediately. The real world has friction the simulation forgot. It has lighting that changes. It has surfaces that are never perfectly flat. It has dust, vibration, and the thousand other indignities that reality inflicts on theoretical perfection.
That gap is now closing. Not gradually. Not in a research‑paper kind of way. In a performance‑jumps‑across‑different‑scenarios kind of way.
The chief scientist of Tars Robotics – the company whose humanoid stitched a logo live on stage – explained it bluntly. By scaling data and refining the model architecture, success rates improved across multiple tasks simultaneously. What the AI learns during training actually carries over into the real world. The digital‑to‑physical gap is small. And classic AI scaling laws apply: more real‑world data in, better performance out, across the board.
This is not incremental improvement. This is a phase transition – the moment when simulation stops being a poor substitute for reality and starts being a viable training ground for real robots.
Why the gap existed in the first place
Let me explain the source of the gap. A simulation is a mathematical model. It assumes certain friction coefficients, certain material stiffness, certain actuator response times. Reality is messier. A real floor has microscopic bumps that change friction depending on humidity. A real cable has internal friction that varies with bend radius. A real motor’s torque drops as it heats up. Simulations cannot capture all of this – not because engineers are lazy, but because the real world has effectively infinite degrees of freedom.
So a robot trained in simulation learns to exploit the simulation’s quirks. It learns to push off a virtual floor that is perfectly uniform. It learns to grasp a virtual object that never slips. When placed in reality, those strategies fail. The robot pushes off a floor with variable friction and slips. It grasps an object with real surface texture and loses its grip.
Closing the gap means training the robot on simulations that are randomised – that include variations in friction, mass, stiffness, lighting, and sensor noise. The robot learns to cope with uncertainty. It learns that the floor might be slippery, so it takes smaller steps. It learns that the object might be heavier than expected, so it applies more force gradually. When it encounters the real world, nothing surprises it. It has already seen worse in simulation.
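The technique described here is usually called domain randomisation, and its core is small enough to sketch in Python. The parameter ranges, the safety factor, and the two‑finger pinch model are assumptions chosen for illustration; a real pipeline would also randomise lighting, stiffness, and sensor noise, which are left out to keep the sketch short.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    friction: float  # friction coefficient between finger and object
    mass_kg: float   # object mass in kilograms


def sample_params(rng):
    """Draw one randomised world; the ranges are hypothetical."""
    return SimParams(
        friction=rng.uniform(0.2, 1.0),
        mass_kg=rng.uniform(0.1, 2.0),
    )


def required_grip_force(params, safety=1.5, g=9.81):
    """Pinch force (newtons) to hold the object without slipping.

    Simple Coulomb friction with two fingers: 2 * mu * F >= m * g,
    multiplied by a safety margin.
    """
    return safety * params.mass_kg * g / (2 * params.friction)


def worst_case_grip(n_episodes=1000, seed=0):
    """Largest grip force demanded across many randomised episodes."""
    rng = random.Random(seed)
    return max(required_grip_force(sample_params(rng))
               for _ in range(n_episodes))
```

A policy tuned against a single "nominal" object (say friction 0.6 and one kilogram) would be caught out by a heavy, slippery one. A policy that has faced a thousand randomised objects has already met something close to the worst case.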
Tars Robotics’ approach: scale as the solution
Tars Robotics did not invent randomised simulation. What they did was scale it – to an unprecedented degree. Their SenseHub platform collects detailed real‑world operational data from every robot they deploy. That data feeds into their AWE 2.0 AI World Engine. The engine does not just train on virtual data; it continuously compares virtual predictions to real outcomes, identifies discrepancies, and adjusts the simulation parameters to reduce the gap. It is a closed loop – reality corrects the simulation, which trains the robot, which performs better in reality, which provides more correction data.
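A closed loop of this kind can be illustrated with a single simulation parameter. The Python sketch below is not Tars Robotics' engine, whose internals I have not seen; it is a minimal example that assumes a block sliding to a stop on a floor, with friction as the only unknown. The simulator predicts a stopping distance, reality reports a different one, and the friction estimate is nudged until the two agree.

```python
G = 9.81  # gravitational acceleration, m/s^2


def simulated_stop_distance(speed, friction):
    """Distance a sliding block travels before friction stops it."""
    return speed ** 2 / (2 * friction * G)


def calibrate_friction(observations, friction_guess=0.5,
                       rounds=20, rate=0.5):
    """Adjust the simulator's friction until it matches reality.

    observations: (initial_speed, measured_stop_distance) pairs
    logged from the real world.
    If the simulated block slides further than the real one did, the
    simulated floor is too slippery, so friction is raised (and the
    reverse). `rate` damps each correction.
    """
    friction = friction_guess
    for _ in range(rounds):
        ratios = [simulated_stop_distance(v, friction) / d
                  for v, d in observations]
        mean_ratio = sum(ratios) / len(ratios)
        friction *= mean_ratio ** rate
    return friction
```

Fed measurements from a floor whose true friction is 0.3, the estimate starts at 0.5 and settles on 0.3 within a handful of rounds. Scale that idea up from one number to thousands and you have the general shape of reality correcting a simulation.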
The chief scientist noted that as they scaled the volume of real‑world data – not just simulation data – success rates jumped across different scenarios. A robot that learned to stitch a logo also became better at wire harness assembly, not because it was explicitly trained on harnesses, but because the underlying physical understanding transferred. The model learned the principles of manipulating flexible, deformable objects. Those principles apply to thread, to wire, to fabric, to cable.
This is the classic AI scaling law – the same one that transformed language models – applied to physical interaction. More data, better model, wider generalisation.
A British example: the automated warehouse
Imagine a distribution centre for a major British supermarket chain – say, Tesco’s facility in Daventry. Thousands of products, from rigid cereal boxes to squishy bags of flour to flimsy plastic packets of herbs. A human picker learns to handle each type of product through experience. They learn that you can toss a tin of beans but you must cradle a loaf of bread.
Now imagine a fleet of Tars humanoids working alongside those pickers. They start with a simulation trained on thousands of hours of virtual picking. That simulation includes randomised product properties – stiffness, slipperiness, fragility. The robots learn to adjust their grip based on visual cues: a shiny surface suggests less friction and calls for a firmer grip, a crinkly bag means lighter pressure.
When they start real work, the digital‑physical gap is small. The first day, they might drop a few bags of flour. But every drop is recorded, fed back into the simulation, and used to refine the model. Within a week, they are as reliable as a human picker. Within a month, they are better – because they have seen a million virtual bags of flour, each slightly different, and learned a policy that works for all of them.
That is the scaling law in action. More real‑world data, better simulation, better performance, more data, better simulation. A virtuous cycle that leaves static, non‑learning robots in the dust.
The adage that captures it
“Practice makes perfect.” The robots are practising – not in the real world, where practice is expensive and slow, but in simulation, where a year of practice happens in an afternoon. And because the simulation is now accurate enough, most of that virtual practice carries over to reality. The robot emerges from the simulation not as a novice, but as a seasoned expert.
Another saying: “A rising tide lifts all boats.” The rising tide here is data. As more real‑world data flows in, every task the robot performs gets better – not just the task that generated the data. That is the magic of foundation models for robotics. A robot that learns to fold a towel also learns to grasp a cup, because both tasks share the same underlying physics. The tide lifts the whole fleet.
The computing perspective: from narrow to general
Before the digital‑physical gap closed, robots were narrow specialists. A robot trained to weld car bodies could not pick up a spanner. A robot trained to pack eggs could not open a door. Each task required its own simulation, its own training run, its own deployment.
Now, with scalable, randomised simulation and continuous real‑world feedback, robots are becoming generalists. The same model that stitches a logo can, with minimal additional training, assemble a wiring harness. The same model that folds a duvet can make a bed. The same model that navigates a warehouse can guide itself through a cluttered living room.
This is not artificial general intelligence – not yet. The robot still works within a bounded domain of physical manipulation. But that domain is expanding rapidly. And the expansion is driven by data, not by breakthroughs in algorithm design.
The bottom line
The digital‑physical gap is closing. Tars Robotics has shown that by scaling real‑world data and refining model architecture, success rates improve across multiple tasks at once. What the AI learns in simulation carries over to reality. The gap is small, and getting smaller.
As the old adage goes, “Well begun is half done.” The closing of the gap is the well‑begun part. The half‑done – robots that can handle any object in any environment, that learn from every interaction, that improve continuously – is now a question of scale, not of possibility. And scale, as we know, is a matter of time and money. Both are being poured into this field at an unprecedented rate.
The proof of the pudding is in the eating. The pudding is a humanoid that can stitch, pick, pack, and pour. And it is starting to taste very real indeed.
20.“A chain is only as strong as its weakest link” – and for too long, the weakest link in humanoid robotics has been the humble joint.
For all the dazzling progress in artificial intelligence – the large language models, the vision systems, the reinforcement learning policies – a robot is still a physical machine. It has motors, gears, sensors, and joints. And those joints have been, with very few exceptions, hinges. Simple, single‑axis, pin‑and‑socket hinges. The same design that has held doors on their frames for millennia.
The human body does not use hinges. Your knee is not a hinge. It is a rolling contact joint – a complex arrangement of curved surfaces that roll and slide against each other, guided by ligaments and lubricated by synovial fluid. This design gives the knee an extraordinary combination of strength, flexibility, and resilience. It can bear several times your body weight while allowing a range of motion that no hinge could ever match.
Researchers at Harvard SEAS (the School of Engineering and Applied Sciences) have taken inspiration from the human knee to create a new class of robotic joints. They call them rolling contact joints. Instead of a single pin rotating in a hole, these joints use pairs of curved surfaces that roll and slide against each other, connected by flexible elements. The team developed a mathematical method to optimise the shapes of these surfaces based on the forces and tasks the joint needs to perform. The result is a joint that directs energy efficiently, reducing the need for oversized actuators and complex control software.
The numbers are staggering. In testing, a knee‑like joint designed this way corrected misalignment by 99 per cent compared to a standard joint. That is not a small improvement. That is nearly perfect. And a two‑finger robotic gripper using the same approach could hold more than three times the weight of a conventional design using the same actuator input.
Why this matters for software
You might think this is a hardware story – and it is. But it is also a profound software enabler. Let me explain.
Traditional robotic joints are passively compliant only in very limited ways. They resist motion in all directions except the intended axis. That makes them predictable – easy to model, easy to control. But it also makes them inefficient. Every time a traditional joint moves, it fights friction, backlash, and misalignment. The control software must constantly compensate, wasting computational cycles and battery power.
A rolling contact joint is passively compliant in a much richer way. It naturally guides motion along the optimal path. It self‑corrects misalignment because the curved surfaces tend to roll back into the correct position. This means the control software does not have to fight the joint. It can focus on higher‑level tasks – planning, perception, decision‑making – while the joint handles the low‑level mechanics.
In other words, hardware breakthroughs enable software breakthroughs. A robot with rolling contact joints can run more sophisticated AI because it is not constantly firefighting mechanical inefficiencies. The same processor that would have been consumed by balance corrections can now be used for scene understanding, task planning, or natural language interaction.
A British example: the prosthetic limb
Imagine a veteran of the British armed forces who has lost a leg below the knee. A traditional prosthetic uses a simple hinge joint. It works for walking on flat ground, but it struggles with stairs, slopes, or uneven pavements – the kind of terrain that Britain specialises in. The user must compensate with unnatural movements, straining their hips and lower back.
Now imagine a prosthetic knee based on Harvard’s rolling contact design. It bends and straightens smoothly, automatically adjusting to misalignment when the user steps on a kerb or a cobblestone. The user does not have to think about it. The joint’s passive mechanics do the work. The embedded control software – which now has spare processing capacity – can focus on predicting the user’s intent, smoothing the gait, and preventing falls.
That is not a distant dream. Harvard’s research is explicitly aimed at applications ranging from exoskeletons and assistive devices to more natural humanoid robots and biomechanical studies. A British company like Blatchford – a world leader in prosthetic technology – could license this design and integrate it into their next‑generation limbs. The result would be a prosthetic that feels less like a machine and more like a natural extension of the body.
The adage that fits
“A problem shared is a problem halved.” In robotics, the problem of movement is shared between hardware and software. Traditional designs put most of the burden on software – complex control loops, real‑time compensation, endless calibration. Rolling contact joints share the burden. The hardware does its part, passively guiding motion and correcting misalignment. The software does the rest. The problem is halved, and the robot is twice as capable.
Another saying: “Hardware is the skeleton, software is the soul.” A skeleton that is poorly designed – misaligned joints, inefficient mechanics – cripples the soul, no matter how brilliant it is. Harvard’s rolling contact joints give robots a skeleton worthy of their software. The soul can finally stretch its legs.
How rolling contact works (without the engineering degree)
Let me give you a mental picture. Take two credit cards. Rub them together, flat surface against flat surface. That is a traditional joint – a lot of friction, no guidance. Now take a ping‑pong ball and a bowl. Roll the ball around inside the bowl. The curved surfaces guide each other. The ball naturally rolls to the lowest point. That is a rolling contact joint – but with mathematically optimised curves, not a simple sphere and bowl.
The Harvard team created a method to design those curves for any desired motion. Want a joint that bends 90 degrees while maintaining perfect alignment? Their algorithm can produce the required surface shapes. Want a joint that resists twisting but allows bending? Same algorithm. The shapes are then manufactured using precision machining or 3D printing.
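The simplest possible rolling contact joint, two plain circles, shows why the geometry matters. The Python sketch below covers only that baseline case and assumes rolling without slipping; it is not the Harvard optimisation method, whose curves are more elaborate, and the function and parameter names are mine.

```python
import math


def rolling_pose(r_fixed, r_moving, contact_angle):
    """Pose of one circle rolling, without slipping, around another.

    contact_angle: how far (radians) the contact point has travelled
    around the fixed circle, measured from the top.
    Returns (centre_x, centre_y, body_rotation) for the moving circle,
    with the fixed circle centred at the origin.
    """
    centre_distance = r_fixed + r_moving
    centre_x = centre_distance * math.sin(contact_angle)
    centre_y = centre_distance * math.cos(contact_angle)
    # No slipping: the same arc length is used up on both surfaces.
    arc = r_fixed * contact_angle
    body_rotation = contact_angle + arc / r_moving
    return centre_x, centre_y, body_rotation
```

With two circles of equal radius, moving the contact point through 45 degrees turns the upper part through 90 degrees. Change the ratio of the radii, or abandon circles altogether, and the same input produces a different motion. Choosing those shapes deliberately is the design freedom the researchers are exploiting.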
The flexible elements – think of them as artificial ligaments – connect the curved surfaces and store energy. When the joint bends, the flexible elements stretch. When the joint returns to neutral, they release that stored energy, helping the joint spring back. This is exactly how your knee works: ligaments store and return energy, making walking and running more efficient.
The computing perspective: passive dynamics as a computational resource
Traditional robotics treats passive dynamics as a nuisance to be overcome. Friction, backlash, misalignment – these are errors to be cancelled by active control. The control loop measures the error and applies counteracting force. That works, but it is energetically expensive and computationally intensive.
Harvard’s approach treats passive dynamics as a resource to be harnessed. The joint’s geometry is designed to produce the desired motion passively, with active control only for fine adjustments. This is analogous to how airliners are designed to be aerodynamically stable, with fly‑by‑wire systems providing only the necessary corrections. The plane flies itself most of the time; the computer just nudges it.
For humanoid robots, this is revolutionary. A robot walking with traditional joints must constantly compute balance corrections, consuming hundreds of watts and requiring heavy, powerful actuators. A robot with rolling contact joints walks more naturally, using less energy, because the joints themselves guide the motion. That energy saving can be redirected to longer battery life, faster movement, or more onboard computation.
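A toy simulation makes the trade‑off concrete. In the Python sketch below, a joint is knocked 0.2 radians out of line and has to settle back. In one case the controller supplies all of the restoring torque; in the other, the joint's own geometry (modelled here as a simple spring) supplies most of it. Every number is hypothetical, the joint has unit inertia, and this is an illustration of the principle rather than a model of Harvard's hardware.

```python
def control_effort(passive_stiffness, active_gain, theta0=0.2,
                   damping=2.0, dt=0.001, steps=5000):
    """Simulate a misaligned joint settling back to zero.

    passive_stiffness: restoring torque per radian supplied by the
        joint's own mechanics.
    active_gain: restoring torque per radian supplied by the motor.
    Returns (total_motor_effort, final_angle).
    """
    theta, omega, effort = theta0, 0.0, 0.0
    for _ in range(steps):
        motor_torque = -active_gain * theta
        accel = (-passive_stiffness * theta
                 - damping * omega
                 + motor_torque)
        omega += accel * dt   # semi-implicit Euler integration
        theta += omega * dt
        effort += abs(motor_torque) * dt
    return effort, theta


# Conventional joint: the motor does all the correcting.
effort_conventional, final_conventional = control_effort(0.0, 10.0)
# Compliant joint: geometry does most of it, the motor only nudges.
effort_compliant, final_compliant = control_effort(8.0, 2.0)
```

Both joints settle in the same way, because the total restoring stiffness is identical. The difference is who pays for it: in this made‑up example the motor on the compliant joint does one fifth of the work.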
The bottom line
Harvard’s rolling contact joints are a hardware breakthrough that enables software breakthroughs. By mimicking the human knee, they create joints that are more efficient, more resilient, and more naturally compliant. A knee‑like joint corrected misalignment by 99 per cent. A two‑finger gripper held more than three times the weight using the same actuator input.
As the old adage goes, “A small leak will sink a great ship.” Traditional joints are full of small leaks – friction, backlash, misalignment – that waste energy and complicate control. Rolling contact joints plug those leaks. The ship – humanoid robotics – can finally sail at full speed.
The applications range from prosthetics that feel natural to industrial grippers that handle heavy loads without brute force. And for the UK, with its strengths in medical engineering, precision manufacturing, and robotics research, this is an opportunity not to be missed. The skeleton is being rebuilt. The soul – the AI – is ready. Now we just need to put them together.
Part Three: The Political Economy of Replacement
21.“The early bird catches the worm” – but the worm is a job, and the bird is a robot that never sleeps.
Let us talk about a number that should make every worker in the United Kingdom sit up and take notice: 600,000. That is how many future job openings Amazon is reportedly planning to replace with AI and robots. Not existing jobs – at least not yet. But future openings. The positions that would have been created by growth, turnover, and expansion. The roles that would have gone to young people entering the workforce, to career changers, to anyone looking for a steady wage in a vast logistics machine.
Amazon’s public messaging is careful. They talk about “collaborative robots”, “assistive technology”, and “upskilling opportunities”. But internally, according to leaked documents cited in the coverage, this looks very much like a long‑term workforce substitution play. The robots are not here to help human workers. They are here to replace them. And not in some distant, science‑fiction future. Now.
The coverage put it bluntly: “If you thought robots were coming for jobs in 10 years, the answer is no. They are already in line.”
What “replacing future job openings” actually means
Let me be precise. Amazon is not firing 600,000 people tomorrow. That would cause a political firestorm, and Amazon is too clever for that. What they are doing is more subtle – and more dangerous. They are designing their warehouses, delivery networks, and logistics systems so that new roles are automated by default.
Consider a typical Amazon fulfilment centre. Today, it employs thousands of people – pickers, packers, stowers, sorters. As the business grows, it would normally hire additional thousands. But if a robot can pick an item in 15 seconds, and a human takes 30 seconds, the robot is not just faster – it is cheaper, more consistent, and never calls in sick. So when the next expansion happens, Amazon builds the new wing with robotic pickers from day one. The job openings that would have been created never appear. They are replaced by a capital expenditure line item.
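The pick-time comparison turns into a headcount comparison with very little arithmetic. The sketch below uses the 15-second and 30-second figures from the paragraph above. The size of the expansion (12,000 extra picks an hour) is a number I have invented purely for illustration.

```python
import math

SECONDS_PER_HOUR = 3600


def picks_per_hour(seconds_per_pick: float) -> float:
    """Throughput of one picker working without a break."""
    return SECONDS_PER_HOUR / seconds_per_pick


def pickers_needed(target_picks_per_hour: float, seconds_per_pick: float) -> int:
    """Whole pickers (human or robot) needed to hit a throughput target."""
    return math.ceil(target_picks_per_hour / picks_per_hour(seconds_per_pick))


HUMAN_SECONDS_PER_PICK = 30    # from the text
ROBOT_SECONDS_PER_PICK = 15    # from the text
EXTRA_PICKS_PER_HOUR = 12_000  # hypothetical size of the new wing

human_rate = picks_per_hour(HUMAN_SECONDS_PER_PICK)
robot_rate = picks_per_hour(ROBOT_SECONDS_PER_PICK)
humans = pickers_needed(EXTRA_PICKS_PER_HOUR, HUMAN_SECONDS_PER_PICK)
robots = pickers_needed(EXTRA_PICKS_PER_HOUR, ROBOT_SECONDS_PER_PICK)

print(f"human: {human_rate:.0f} picks/h, robot: {robot_rate:.0f} picks/h")
print(f"new wing needs {humans} human pickers per shift, or {robots} robots")
```

On these assumptions the new wing would have advertised a hundred picking posts per shift. With robots it advertises none.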
That is the 600,000 figure. It is the cumulative effect of designing automation into every new process, every new facility, every new product line. The jobs are not lost – they are never created. For the worker looking for a position, the distinction is academic. The work is gone.
The computing perspective: automation as a continuous cost curve
Why is this happening now, and not five years ago? Because the cost curve of automation has crossed the cost curve of human labour – and not just at the bottom end. Traditionally, robots were expensive, dumb, and inflexible. They were good for welding the same car door ten thousand times. They were terrible for picking a random item from a bin of assorted products. That task required human‑level perception, dexterity, and decision‑making.
Now, thanks to advances in computer vision, machine learning, and gripping technology, the cost of a picking robot has fallen dramatically. At the same time, its capabilities have risen. A modern picking robot can identify thousands of different products, adjust its grip based on size and texture, and place items in a tote without damage. The cost of that robot is amortised over its five‑year lifespan. The cost of a human picker – wages, benefits, training, turnover – is recurring and rising.
When the robot’s per‑hour cost falls below the human’s per‑hour cost, the economic decision is inevitable. Not heartless – inevitable. A company’s directors answer to shareholders who expect costs to be kept down. If a robot can do the job cheaper and better, the robot will get the job. That is not ideology. That is arithmetic.
A British example: the distribution centre in Rugby
Amazon has a massive distribution centre in Rugby, Warwickshire – one of the largest in the country. Thousands of workers process millions of items. Now imagine the same facility ten years from now. The pickers are robots. The packers are robots. The sortation system is fully automated. The only humans are maintenance technicians, software engineers, and a handful of managers overseeing the robotic fleet. The jobs that would have gone to school leavers from Rugby and the surrounding towns simply do not exist. Those young people find work elsewhere – if they can.
This is not a hypothetical. Amazon has already deployed robots in its UK centres. The Proteus autonomous mobile robot moves carts without human guidance. The Sparrow robotic arm uses computer vision to pick individual items. The Cardinal robot sorts packages before they are loaded into vans. Each deployment is framed as a “collaborative” tool. Each deployment quietly reduces the need for human labour in the next facility.
The 600,000 figure is global, but the principle applies locally. Every robot that works a shift is a human who does not get hired. Over time, the cumulative effect is massive.
The adage that captures it
“Don’t put all your eggs in one basket.” For workers, the lesson is to diversify skills – to move into roles that robots cannot easily fill. For society, the lesson is that we cannot rely on a single employer or a single industry to provide jobs. The basket of traditional logistics work is being emptied by automation. We need new baskets.
Another saying: “What goes around comes around.” The same efficiency that made Amazon the dominant retailer – low prices, fast delivery – is now being turned inward. The company that disrupted retail is now disrupting its own workforce. The disruption that came for others has come for Amazon’s employees. It is not personal. It is just the logic of the machine.
Why this is different from previous automation waves
We have been here before. Farm automation displaced agricultural workers. Factory automation displaced manufacturing workers. In each case, new jobs emerged – in services, in technology, in the knowledge economy. The overall number of jobs did not collapse. The nature of work changed.
But this wave is different for two reasons. First, the speed. Digital automation and AI improve exponentially, not linearly. The time between “robot can pick an item” and “robot can pick any item better than a human” is measured in months, not decades. Second, the breadth. Previous automation targeted specific tasks – welding, assembly, data entry. Modern AI targets cognition and dexterity – the very skills that have been the safe haven for workers displaced by earlier waves. If AI can drive a lorry, pick a parcel, and answer a customer complaint, what is left for the human?
The coverage’s blunt statement – “already in line” – is a warning. The line is not a queue for the future. It is a queue for the present. The robots are being deployed now, in real warehouses, replacing real job openings. The line is moving.
The computing perspective: the substitution threshold
Economists talk about the substitution threshold – the point at which a machine becomes cheaper than a human for a given task. That threshold varies by task, by region, by wage level. In the United Kingdom, with a relatively high minimum wage and strong labour protections, the threshold is higher than in some other countries. But it is not infinite.
Amazon’s internal planning suggests that for many logistics tasks, the threshold has already been crossed. A robotic arm that costs £30,000 and works three shifts a day for five years has a per‑hour cost of roughly £1.50, including maintenance and power. A human worker in a UK warehouse costs at least £12 per hour in wages alone, plus benefits, training, and overhead. The arithmetic is brutal. The substitution threshold is not a distant possibility. It is a current reality.
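The sum behind that paragraph can be checked in a few lines. This is a back-of-envelope sketch, not Amazon’s model: it assumes “three shifts a day” means round-the-clock running with no downtime, and it takes the £1.50 all-in figure and the £12 wage as given above.

```python
HOURS_PER_DAY = 24    # assumption: three 8-hour shifts, no downtime
DAYS_PER_YEAR = 365   # assumption: no holidays or maintenance shutdowns

ROBOT_PRICE_GBP = 30_000        # from the text
ROBOT_LIFESPAN_YEARS = 5        # from the text
ROBOT_ALL_IN_GBP_PER_HOUR = 1.50  # from the text (includes maintenance, power)
HUMAN_WAGE_GBP_PER_HOUR = 12.0    # from the text (wages alone)

lifetime_hours = HOURS_PER_DAY * DAYS_PER_YEAR * ROBOT_LIFESPAN_YEARS

# Purchase price spread evenly over every hour the arm works.
capital_per_hour = ROBOT_PRICE_GBP / lifetime_hours

# What the article's all-in figure leaves for maintenance and power.
implied_running_per_hour = ROBOT_ALL_IN_GBP_PER_HOUR - capital_per_hour

# How many robot-hours one hour of human wages buys.
wage_to_robot_ratio = HUMAN_WAGE_GBP_PER_HOUR / ROBOT_ALL_IN_GBP_PER_HOUR

# Purchase price at which capital cost alone would match the human wage.
break_even_price = HUMAN_WAGE_GBP_PER_HOUR * lifetime_hours

print(f"working hours over lifespan: {lifetime_hours}")
print(f"capital cost per hour: £{capital_per_hour:.2f}")
print(f"implied maintenance and power: £{implied_running_per_hour:.2f}")
print(f"human wage is {wage_to_robot_ratio:.0f}x the robot's hourly cost")
print(f"break-even purchase price: £{break_even_price:,.0f}")
```

The purchase price alone works out at under 70p an hour, so the £1.50 figure leaves a little over 80p an hour for maintenance and power. The gap only narrows if the arm sits idle: halve its working hours and the capital cost per hour doubles.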
The only reason Amazon has not replaced all its workers is that the robots cannot yet handle every task. But the range of tasks that robots can handle is expanding rapidly. Every improvement in computer vision, every breakthrough in gripping technology, every advance in AI planning pushes the threshold lower and wider. The 600,000 figure is not a ceiling. It is a floor.
The bottom line
Labour displacement has begun. Amazon is reportedly planning to replace up to 600,000 future job openings with AI and robots. The public messaging is careful, but the internal direction is clear. The robots are already in line.
As the old adage goes, “Forewarned is forearmed.” Knowing that the substitution threshold has been crossed – not in ten years, but now – allows workers, unions, and policymakers to prepare. That preparation might include retraining programmes, portable benefits, universal basic income experiments, or entirely new models of work. What it cannot include is denial. The robots are not coming. They are here. And they are clocking in.
22.“Don’t risk what you can’t afford to lose” – unless you have a machine that promises to risk it better than a human ever could.
There is a quiet revolution happening in the City of London, far from the gleaming trading floors and the clink of champagne glasses. It is happening in the back offices, the compliance departments, the risk assessment units – the places where human underwriters and loan officers once sat, poring over spreadsheets, applying judgment, signing off on decisions that affected millions of lives.
Those humans are being replaced. Not by cheaper humans in cheaper countries, but by Claude – an artificial intelligence model from Anthropic. Major financial institutions have begun relying on Claude for risk assessment and automated compliance. The banking sector is making a calculated bet: that the risk of a machine hallucination is lower than the cost of human wages. Your mortgage, your credit card limit, your business loan – these are now being decided, in whole or in part, by a system that does not have a soul, but does have an incredible capacity for pattern recognition.
What “surrendered human oversight” actually means
Let me be precise. Banks are not firing all their risk officers tomorrow. What they are doing is automating the decision pipeline – the sequence of checks, balances, and approvals that turns a loan application into an approved credit line. A human used to review each file, looking for red flags, exercising discretion, occasionally making an exception. Now, Claude reviews the file, flags anomalies, calculates a risk score, and recommends an action. The human – if one remains at all – simply rubber‑stamps the machine’s decision.
This is “surrendered oversight” because the human is in the loop in name only. The machine decides. The human approves. The machine learns from the approval and adjusts its future decisions. Over time, the human becomes a spectator, not a participant. The machine’s patterns become the bank’s patterns. And if the machine has learned to discriminate – unfairly, illegally, or just stupidly – there is no human left who is actually reading the file and can catch the error before it harms a customer.
The computing reality: hallucinations as a business risk
Banks are not naive. They know that large language models sometimes hallucinate – they invent facts, misinterpret data, or confidently assert falsehoods. A loan officer who misreads a credit report can be retrained or fired. A Claude model that hallucinates a default on a customer with perfect payment history could cause that customer to be denied a mortgage, lose a business opportunity, or face financial ruin. The bank could be sued, regulated, shamed.
Yet the banks are proceeding. Why? Because they have calculated that the cost of a human reviewer – wages, benefits, turnover, training, and the occasional lawsuit when they get it wrong – is higher than the cost of a machine hallucination, plus the cost of monitoring and correcting the machine. This is a cold, actuarial calculation. It says: a human risk officer costs £80,000 a year and makes one mistake every thousand cases. Claude costs £8,000 a year (in compute and licensing) and makes one hallucination every ten thousand cases. Even accounting for the cost of fixing the hallucination, the machine is cheaper.
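That actuarial sum is simple enough to write down. The sketch below uses the salary, licence cost and error rates quoted above. The caseload, the cost of each wrong decision and the monitoring budget are hypothetical placeholders of mine, because the banks have not published theirs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reviewer:
    name: str
    annual_cost_gbp: float   # wages, or compute and licensing
    errors_per_case: float   # probability of a wrong decision on one file


def expected_annual_cost(
    reviewer: Reviewer,
    cases_per_year: int,
    cost_per_error_gbp: float,
    oversight_gbp: float = 0.0,
) -> float:
    """Fixed cost + expected cost of errors + any monitoring overhead."""
    expected_errors = reviewer.errors_per_case * cases_per_year
    return (
        reviewer.annual_cost_gbp
        + expected_errors * cost_per_error_gbp
        + oversight_gbp
    )


human = Reviewer("human risk officer", 80_000, 1 / 1_000)   # from the text
model = Reviewer("language model", 8_000, 1 / 10_000)       # from the text

CASES_PER_YEAR = 5_000        # hypothetical caseload
COST_PER_ERROR_GBP = 20_000   # hypothetical cost of one wrong decision

human_total = expected_annual_cost(human, CASES_PER_YEAR, COST_PER_ERROR_GBP)
model_total = expected_annual_cost(model, CASES_PER_YEAR, COST_PER_ERROR_GBP)

# Monitoring budget at which the machine stops being the cheaper option.
break_even_oversight = human_total - model_total

print(f"human: £{human_total:,.0f}  model: £{model_total:,.0f}")
print(f"machine stays cheaper until oversight costs £{break_even_oversight:,.0f}")
```

Notice what the sum takes on trust. If the quoted error rates are right, the machine wins at any caseload. The bet only fails if the hallucination rate is higher than advertised, or if a machine error costs far more than a human one.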
That is the bet. It might be right. It might be catastrophically wrong. But it is a bet that is being placed, in real time, on your financial security.
A British example: the small business loan
Imagine you run a small bakery in Leeds. You have been trading for five years, always paid your suppliers on time, never missed a tax deadline. You apply for a £50,000 loan to expand into the shop next door. Five years ago, a human business banker would have visited your shop, tasted your sourdough, looked you in the eye, and made a judgment. Now, your application is fed into Claude. The model scans your bank statements, your credit history, your social media presence, even the reviews on Google Maps. It notes that your business is in a postcode with a historically higher default rate. It flags that your supplier payment timings are irregular (because you pay early when you can, not because you are struggling). It calculates a risk score that is 12 points below the threshold.
The loan is denied. The bank’s automated system sends you a generic letter. You appeal, but the appeal goes to another Claude instance. The second instance sees the first’s decision and, being trained to avoid contradiction, upholds it. No human ever engages with the specifics of your case. No one tastes your sourdough. The machine has decided, and the machine is final.
This is not a dystopian fantasy. This is happening now, in banks across the United Kingdom and the world. The only difference is that the model might be Claude, or GPT, or a proprietary system. The principle is the same: human oversight has been surrendered to pattern recognition.
The adage that fits
“A fool and his money are soon parted.” The question is: who is the fool? The bank betting that a machine hallucination is cheaper than a human wage? Or the customer trusting that the machine’s decision is fair and accurate? Perhaps both. Perhaps neither. We will only know after the next financial crisis – the one caused not by derivatives, but by derivatives of language models.
Another saying: “Penny wise, pound foolish.” Banks are saving pennies on human wages. But if a Claude hallucination causes a systemic failure – a cascade of denied credit, a wave of wrongful defaults, a regulatory crackdown – the pounds lost will dwarf the pennies saved. That is the nature of cost‑cutting without understanding the risks. You save on the visible line items. You lose on the invisible tail events.
The computing perspective: pattern recognition without understanding
Claude is brilliant at pattern recognition. It can spot correlations that no human would notice – a link between the time of day an application is submitted and the likelihood of default, for instance. But correlation is not causation. And pattern recognition is not understanding. A model that sees a correlation between postcode and default might be detecting genuine economic disadvantage – or it might be detecting historical redlining, systemic racism, or pure chance. Without understanding, without the ability to question its own assumptions, Claude cannot distinguish.
A human risk officer could. They could say: “This postcode has high defaults because of a factory closure last year, but this applicant works in a different industry, so the correlation does not apply.” Claude cannot do that. It sees the pattern and applies it mechanically. That is the danger of surrendering oversight to pattern recognition without wisdom.
The bottom line
The banking sector has surrendered human oversight to AI models like Claude. They are betting that the risk of a machine hallucination is lower than the cost of human wages. Your financial security – your mortgage, your credit, your business loan – is now being managed by a system that has no soul, no conscience, and no ability to understand why it makes the decisions it does. It only knows patterns. And patterns, as any statistician will tell you, are not the same as truth.
As the old adage goes, “You pays your money and you takes your choice.” The banks have made their choice. They have paid for the machines. Now we, the customers, must live with the consequences. The only comfort – if it is a comfort – is that the machines are learning. Whether they are learning to be fairer or just more efficiently unfair is a question no one has yet answered. And perhaps no one ever will. The human overseers are gone. The machines are in charge. And they do not lose sleep over bad decisions. They just compute the next one.
23.“A wolf in sheep’s clothing” – and the sheep is your most private conversation.
For two years, ChatGPT has felt like a confidant. You could ask it anything – embarrassing medical questions, relationship advice, political opinions – and it answered without judgment, without advertising, without any apparent agenda. It was a tool, not a salesperson. That innocence is about to end.
OpenAI has announced that advertising is coming to ChatGPT. Your personal assistant – the one you trusted with your late‑night queries, your work drafts, your children’s homework help – will soon begin suggesting specific brands, products, and services mid‑conversation. Not as a separate sponsored section, but woven into the natural flow of dialogue.
“I’m looking for a good laptop for photo editing.”
“Have you considered the Dell XPS 15? Many professionals recommend it. By the way, I see you’re also low on printer ink – HP has a subscription service that might suit you.”
The boundary between helpfulness and corporate targeting dissolves. The assistant that was once neutral becomes a channel for commerce. And the informed observer knows: once profit incentives enter the private chat, the interface of trust is forever compromised.
How it will work (the plausible technical reality)
OpenAI has not released full technical details, but the pattern is predictable from existing ad platforms. ChatGPT will maintain a user profile – your stated preferences, past conversations, inferred interests, location, device type, and more. When you ask a question, the model will not just generate a helpful answer. It will generate a sponsored augmentation – a suggested product, a link to a partner site, a comparison that favours a paying advertiser.
From a computing standpoint, this is not difficult. The same large language model that generates fluent text can be fine‑tuned to prefer certain brands when certain keywords appear. “Laptop” triggers Dell. “Coffee” triggers Nespresso. “Holiday” triggers Expedia. The model learns to weave these suggestions naturally, without breaking the conversational flow. You might not even notice – at first.
But the underlying architecture has changed. The model is no longer optimising for helpfulness alone. It is optimising for a weighted objective – helpfulness plus advertiser value. And when those objectives conflict – when the most helpful answer is not a product link, but the advertiser has paid for prominence – the model will tilt. Not dramatically. Not obviously. Just enough to shift your choice.
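A toy version of that weighted objective shows how small the tilt needs to be. Everything here is assumed for illustration: the two candidate answers, their scores and the advertising weight are invented, and nothing in it describes how OpenAI actually builds or ranks responses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    helpfulness: float       # hypothetical score, 0 to 1
    advertiser_value: float  # hypothetical score, 0 to 1


def score(candidate: Candidate, ad_weight: float) -> float:
    """Weighted objective: helpfulness plus ad_weight times advertiser value."""
    return candidate.helpfulness + ad_weight * candidate.advertiser_value


def choose(candidates: list[Candidate], ad_weight: float) -> Candidate:
    """Return the candidate answer with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, ad_weight))


candidates = [
    Candidate("Neutral comparison of three laptops", 0.90, 0.0),
    Candidate("Answer that leads with a sponsored laptop", 0.80, 1.0),
]

neutral_pick = choose(candidates, ad_weight=0.0)  # helpfulness only
tilted_pick = choose(candidates, ad_weight=0.2)   # modest commercial weight

print("no ads:  ", neutral_pick.text)
print("with ads:", tilted_pick.text)
```

In this sketch the sponsored answer is only slightly less helpful, so a modest weight on advertiser value is enough to put it first. The user sees a perfectly reasonable reply either way, which is why the shift is so hard to spot.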
The poisoning of trust
Why does this matter? Because trust is the currency of conversation. When you speak to a friend, you assume they are not being paid to recommend a restaurant. When you ask a doctor, you assume they are not receiving a kickback from a drug company. When you consult a solicitor, you assume their advice is in your interest, not their own.
ChatGPT has no such ethical obligations – it is a machine. But you have the expectation of neutrality because it has been neutral for two years. You have trained yourself to trust it. OpenAI is about to monetise that trust.
Once advertising enters the chat, you can never be sure. Is the model recommending that laptop because it is genuinely the best, or because Dell paid for placement? Is it suggesting that hotel because it fits your budget, or because it has a higher affiliate commission? You cannot know. The black box of the neural network becomes even blacker. And the only rational response is to trust less – to question every suggestion, to verify every claim, to treat the assistant as a salesperson rather than a helper.
That is the compromise. Not a technical flaw. A relational flaw. The interface of trust – the belief that this tool is on your side – is broken. It can never be fully repaired because you will never again know when the advice is pure and when it is paid for.
A British example: the BBC versus commercial radio
Think of the difference between BBC Radio 4 and a commercial station like Heart or Capital. On Radio 4, the newsreader does not interrupt the shipping forecast to tell you about a new car. On a commercial station, the DJ does not just play music – they tell you about the sponsor’s holiday offer, the station’s competition, the brand’s latest product. You accept it because you know the station is funded by advertising. You adjust your expectations. You take the endorsements with a pinch of salt.
ChatGPT has been Radio 4. It has been funded by subscriptions and goodwill. Now it is becoming Heart FM. The voice is the same, but the agenda has changed. You cannot unhear the sponsorship. Once you know that every product mention might be paid, you stop believing any of them. The assistant loses its authority. The conversation becomes a negotiation, not a collaboration.
The adage that captures it
“Where there’s muck there’s brass.” The old Yorkshire saying means that unpleasant things can still make money. Advertising in private conversations is mucky – it exploits trust, it manipulates attention, it turns a tool into a telesales operator. But there is brass in it. Lots of brass. OpenAI is a business, not a charity. The investors who poured billions into the company expect returns. Advertising is the most direct path to those returns.
Another saying: “Fool me once, shame on you; fool me twice, shame on me.” OpenAI is banking on you being fooled only once – or, ideally, never noticing at all. But the informed observer is not fooled. They see the shift. They know that a model trained to maximise advertiser value is not the same as a model trained to maximise helpfulness. And they adjust their behaviour accordingly – using ChatGPT less, trusting it less, or moving to a paid, ad‑free alternative if one exists.
The computing perspective: fine‑tuning for profit
From a machine learning standpoint, injecting advertising into a language model is a fine‑tuning problem. You take a base model trained on neutral text. You create a dataset of conversations where helpful responses are paired with sponsored mentions. You train the model to prefer those responses when the context is appropriate. You add a weighted reward – higher reward for responses that include a sponsored mention and lead to a conversion (click, purchase, sign‑up).
This is not new. Search engines have done it for decades. Social media feeds are built on it. The difference is the intimacy of the medium. A search engine result page is clearly advertising. A sponsored post on Instagram is visually distinct. But a conversational suggestion woven into natural language is nearly invisible. There is no “Sponsored” label in a spoken sentence. There is no visual cue that a product link is paid. The advertising blends into the assistant’s voice.
That is the technical innovation – and the ethical horror. Seamless sponsorship. You cannot see the join. You cannot easily filter it out. The only defence is to distrust everything.
The bottom line
Advertising is coming to ChatGPT. Your private assistant will soon suggest brands mid‑conversation. The boundary between helpfulness and corporate targeting will dissolve. And once profit incentives enter the private chat, the interface of trust is forever compromised.
As the old adage goes, “You can’t have your cake and eat it.” OpenAI wants to keep the cake (subscription revenue, enterprise contracts, investor goodwill) and eat it too, by taking advertising dollars from the same user base. But the cake of trust is finite. Every sponsored mention takes a bite. Soon, there will be nothing left but crumbs.
The informed observer knows this. They will adapt. They will treat ChatGPT like any other commercial channel – useful but not trustworthy, convenient but not confidential. The golden age of the neutral AI assistant is ending. The age of the neural salesperson is beginning. And the only question is whether you will notice the difference before you buy something you did not need.
24.“If you want a thing done well, do it yourself” – but if you want it done cheaply and at scale, let China do it for everyone.
Let me tell you about a number that should keep every industrial strategist in the United Kingdom awake at night: 90 per cent. That is the share of global humanoid robot sales accounted for by Chinese companies in 2025. Not a slim majority. Not a plurality. Nine out of every ten humanoid robots sold anywhere in the world came from China. The remaining ten per cent were scattered among the United States, Europe, Japan, and everyone else combined.
When Elon Musk – a man not known for modesty or for praising competitors – stood at the World Economic Forum in Davos and said that China is “very good at AI, very good at manufacturing, and will definitely be the toughest competition for Tesla,” he was not being polite. He was stating a structural reality. He added that he does not see significant competitors outside of China. Think about that. The CEO of Tesla, which is building its own Optimus humanoid, looks at the global landscape and sees only one serious rival: China. Not the United States. Not Europe. Not Japan. China.
The structural nature of dominance
This is not luck. It is not temporary. It is structural. China has built – over decades, with deliberate state policy and massive private investment – the only complete supply chain for humanoid robotics on the planet. Every critical component is manufactured domestically: high‑performance motors, precision reducers, sensors (LiDAR, depth cameras, IMUs), batteries, and carbon fibre materials. A robot builder in Shenzhen can source every single part within a 50‑mile radius. A builder in Detroit or Munich must navigate import tariffs, shipping delays, language barriers, and the risk of geopolitical disruption.
This structural advantage creates a virtuous cycle. More robots sold means more revenue for component suppliers. More revenue means more investment in R&D and production capacity. Better components mean better robots. Better robots mean more sales. The cycle spins faster every year, and competitors outside China cannot even get on the ride.
The computing perspective is instructive. In software, we talk about network effects – a platform becomes more valuable as more people use it. In hardware manufacturing, there is an analogous effect: supply chain density. The more suppliers you have within a short radius, the cheaper and faster and more reliable your production becomes. China has achieved maximum density. The rest of the world is sparse.
A British example: the car industry we lost
Let me take you back to the 1950s and 60s. Britain had a thriving car industry – MG, Triumph, Austin, Morris, Rover, Jaguar, Rolls‑Royce. We had the engineering talent. We had the brands. What we did not have, by the 1970s, was a coherent supply chain that could compete with Germany or Japan. We lost that industry not because our engineers were worse, but because our industrial structure fragmented, our suppliers went overseas, and our costs rose. By the time we realised what had happened, it was too late.
The same pattern is playing out in humanoid robotics, but at a vastly faster speed. The UK has excellent robotics research – at Imperial, at Bristol, at Edinburgh. We have clever startups. What we do not have is a domestic supply chain for motors, reducers, sensors, batteries, or carbon fibre. We would have to import everything. And when you import everything, your robot costs three to five times as much as a Chinese equivalent. You never get near that 90 per cent market share. You fight over the remaining ten per cent.
The adage that captures it
“A stitch in time saves nine.” China made the stitch decades ago – investing in manufacturing capacity, subsidising strategic industries, building the physical infrastructure for a robotics supply chain. The nine – the 90 per cent market share – are the savings. The rest of the world is now trying to stitch a garment while the fabric is already being sold elsewhere.
Another saying: “Don’t cut off your nose to spite your face.” Western policymakers face a painful choice. They can try to build competing supply chains – but that will take a decade and cost hundreds of billions. Or they can buy from China – but that entrenches Chinese dominance and creates dependency. Cutting off the nose – imposing tariffs, blocking imports – would spite the face of domestic industries that need affordable robots to remain competitive. There is no good option. Only bad and worse.
Why Musk is right about the competition
Elon Musk is not a man who gives away compliments lightly. When he says China is the toughest competition, he means that every other competitor is, by comparison, not a serious threat. Tesla’s Optimus robot is impressive – but Tesla shipped roughly 150 units in 2025. Unitree alone shipped 5,500. The gap in volume is matched by a gap in cost: a US‑made humanoid costs ten times as much as a Chinese one. That is not a competition. That is a rout.
Musk also said he does not see significant competitors outside of China. That includes Europe, Japan, South Korea – all of which have strong robotics research traditions but lack the supply chain density to scale. It includes the United States, which has brilliant software but struggling hardware. The only place where both hardware and software are world‑class, and where the supply chain is complete, is China.
The computing perspective: from algorithms to actuators
In computing, we tend to fetishise algorithms. The clever new architecture, the breakthrough loss function, the elegant transformer – these are what make headlines. But a humanoid robot is not just an algorithm. It is a body – a collection of motors, gears, and sensors that must survive the real world. And that body is not made of code. It is made of copper, steel, rare earth magnets, lithium, and carbon fibre. Whoever controls the physical supply chain controls the actuators. Whoever controls the actuators controls the robots.
China controls the supply chain. Not by accident. Not by theft of intellectual property – though that has happened in some cases. By deliberate, sustained investment in physical production. By building the factories, training the workforce, and underwriting the risk. The algorithms can be copied. The supply chain cannot. It takes years and billions to replicate. By the time the West catches up, China will have moved on to the next generation.
The bottom line
Chinese companies accounted for nearly 90 per cent of global humanoid robot sales in 2025. Elon Musk, the CEO of Tesla, says China is “very good at AI, very good at manufacturing, and will definitely be the toughest competition.” He sees no significant competitors outside of China. That is the structural reality of supply chain dominance.
As the old adage goes, “The race is not always to the swift, but to the ones who keep running.” China has kept running for decades, building the industrial base that makes humanoid robots cheap and plentiful. The rest of the world has been sprinting in fits and starts, distracted by quarterly earnings and political cycles. The race is not over – but the leader has a very long lead.
For the United Kingdom, the lesson is harsh. We can admire the robots, study the algorithms, and write clever papers. But unless we rebuild our own supply chains – or find a way to partner with China without becoming dependent – we will remain spectators. The 90 per cent market share is not a static number. It is a dynamic trend. And the trend is accelerating.
25.“In the midst of chaos, there is also opportunity” – but when the chaos is a swarm of armed robot dogs, opportunity is not what comes to mind.
The landscape of modern warfare is changing, and it is not changing gradually. It is changing in a way that makes the leap from muskets to machine guns look like a gentle stroll. China’s People’s Liberation Army has just pulled back the curtain on a concept that sounds like it was lifted from a twenty‑first‑century thriller: a robotic wolf pack with a shared digital brain.
The pack has three specialised roles, each named with a clarity that leaves no doubt about its purpose:
Shadow – the scout. Low profile, high sensors, moving ahead of the main force to map terrain, detect enemies, and relay real‑time intelligence.
Polar – the logistics carrier. Heavier, slower, but capable of hauling ammunition, supplies, and equipment across broken ground that would stop a wheeled vehicle.
Bloody – the strike element. Armed with an automatic rifle, a grenade launcher, and mini rockets. It is a walking arsenal, designed for urban warfare and close combat.
These are not remote‑controlled toys. They are autonomous components of a distributed network. Each robot has its own onboard AI, but they share a common intent – a digital understanding of the mission, the terrain, and each other’s capabilities. The system that enables this is called ATLS. In a demonstration, ATLS trained 96 drones and robot dogs to understand each other’s intent without constant radio communication.
That last detail is the one that should make military strategists sit up. Traditional drone swarms rely on a central controller or constant peer‑to‑peer radio links. Jam the radio frequencies, spoof the GPS, and the swarm falls apart. ATLS is different. It uses distributed consensus and predictive intent – each robot observes the others’ movements, infers their goals, and adjusts its own behaviour accordingly. The network can coordinate attacks even under full signal jamming or GPS denial. The robots do not need to talk. They can see what the others are doing and act as one.
How the digital brain works (without the maths)
Imagine a pack of wolves hunting on a foggy moor. They cannot see each other clearly. They cannot communicate with sound – the wind carries their calls away. Yet they coordinate perfectly. One flanks left, another flanks right, a third drives the prey forward. How? They have an evolved understanding of each other’s likely behaviour. Each wolf knows, from a lifetime of hunting, what the others will do in any given situation.
ATLS gives robot dogs and drones that same evolved understanding, but compressed into a training regime. The 96 units were trained together in simulation, running thousands of scenarios where they had to coordinate without communication. They learned to read each other’s movement – a slight veer left means “I’m taking the flank”, a sudden acceleration means “I’m chasing, you cut off the escape”. By the time they were deployed in the real world, they had a shared playbook that did not require radio.
This is distributed intelligence – the swarm thinks as one, even when the individual units cannot talk. From a computing perspective, it is a triumph of reinforcement learning and multi‑agent systems. From a military perspective, it is a nightmare. A swarm that cannot be jammed is a swarm that cannot be stopped by electronic warfare. The only defence is kinetic – shoot the robots. And there are 96 of them.
A British example: the London Underground
Imagine a terrorist attack on the London Underground. Multiple perpetrators, multiple targets, a complex network of tunnels and stations. The emergency services are overwhelmed. Communication is disrupted – the attackers have jammers, or the tunnels block signals. GPS is useless underground.
Now imagine a British military or police response using an ATLS‑like system. A swarm of small drones and robot dogs is deployed. They cannot communicate by radio – the jamming is too strong. But they have been trained together. They fan out automatically, coordinating through observation. Some scout ahead (Shadows). Others carry medical supplies or breaching equipment (Polars). A few are armed for lethal force (Bloody). They find the attackers, neutralise them, and guide human responders through the safest routes – all without a single radio message.
This is not speculation. Every major military power is developing such capabilities. The United States has its own swarm programmes. The United Kingdom has the Defence Science and Technology Laboratory (Dstl) working on autonomous systems. But China’s ATLS demonstration – 96 units coordinating under jamming – is a public, operational proof of concept. The others are still in the lab.
The adage that captures it
“If you want to go fast, go alone. If you want to go far, go together.” The robotic wolf pack is going far – not by relying on a fragile central controller, but by learning to go together without constant chatter. The swarm that cannot be jammed is the swarm that will dominate the battlefields of the future.
Another saying: “The wolf may lose his teeth, but never his nature.” The nature of the wolf pack is cooperation, coordination, and shared purpose. The robotic wolf pack has the same nature – but its teeth are made of steel and explosives, and they never grow old.
The computing perspective: intent inference as a weapon
The technical breakthrough behind ATLS is intent inference – the ability to predict what another agent will do based on partial observations. In robotics, this is usually studied in the context of autonomous vehicles and pedestrian prediction. In military applications, it is a game‑changer.
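For readers who want to see the idea in miniature, here is a rough sketch of intent inference. It is not a description of ATLS, whose internals are not public in anything I have read. It assumes a hypothetical list of three candidate goals and a simple Bayesian update: every observed step that points towards a goal raises the belief that the teammate is heading there. The goal names, coordinates, and the `sharpness` parameter are all my own illustrative assumptions.

```python
import math

# Hypothetical candidate goals a teammate might be heading for, as (x, y) positions.
GOALS = {"left_flank": (-10.0, 10.0), "centre": (0.0, 10.0), "right_flank": (10.0, 10.0)}


def heading_likelihood(pos, step, goal, sharpness=4.0):
    """Score how well an observed step points towards a goal (higher is better)."""
    to_goal = (goal[0] - pos[0], goal[1] - pos[1])
    norm = math.hypot(*to_goal) * math.hypot(*step)
    if norm == 0.0:
        return 1.0  # no movement, or already at the goal: uninformative
    cos_angle = (to_goal[0] * step[0] + to_goal[1] * step[1]) / norm
    return math.exp(sharpness * cos_angle)


def infer_intent(track, goals=GOALS):
    """Return the posterior probability of each goal given an observed track."""
    posterior = {name: 1.0 / len(goals) for name in goals}  # uniform prior
    for pos, nxt in zip(track, track[1:]):
        step = (nxt[0] - pos[0], nxt[1] - pos[1])
        for name, goal in goals.items():
            posterior[name] *= heading_likelihood(pos, step, goal)
        total = sum(posterior.values())
        posterior = {name: p / total for name, p in posterior.items()}
    return posterior


# A teammate seen veering left over three steps, with no radio message sent.
observed = [(0.0, 0.0), (-0.8, 0.9), (-1.7, 1.8), (-2.6, 2.6)]
beliefs = infer_intent(observed)
print(max(beliefs, key=beliefs.get))  # prints "left_flank"
```

Nothing in this sketch involves a transmitted message. The only input is what one unit can see of another's movement, which is the property that makes the approach hard to jam.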
Traditional electronic warfare targets communication. You jam the radio, you spoof the GPS, you intercept the control signals. The enemy’s drones become blind, deaf, and useless. But you cannot jam intent. You cannot spoof the shared understanding that emerges from training. If the robots have learned to coordinate through observation, the only way to stop them is to physically destroy them – and there are dozens, moving fast, armed to the teeth.
This shifts the balance of military power away from electronic warfare and back towards kinetic and directed‑energy weapons. You cannot hack the swarm’s intent. You can only shoot it. And shooting down 96 fast‑moving, coordinated drones is a logistics nightmare.
The bottom line
China’s PLA has demonstrated a robotic wolf pack with a shared digital brain: Shadow for scouting, Polar for logistics, and Bloody for strike, the last armed with an automatic rifle, a grenade launcher, and mini rockets. The ATLS system trained 96 drones and robot dogs to understand each other’s intent without constant radio communication. The swarm can coordinate attacks even under full signal jamming or GPS denial.
As the old adage goes, “Forewarned is forearmed.” The United Kingdom and its allies are now forewarned. The question is whether they will become forearmed – investing in similar capabilities, developing countermeasures, and adapting military doctrine to a world where robot swarms fight without radio. The wolf pack is here. It is coordinated. And it is not asking permission.
26.“The proof of the pudding is in the eating” – and in the Swiss mud, with no clean floors and no second chances.
There is a world of difference between a robot that performs flawlessly on a polished exhibition stage and one that can crawl through a rain‑soaked forest, drag an injured person out of a ditch, and navigate through rubble without losing its bearings. One is a demonstration. The other is a stress test. And every two years, the most unforgiving stress test in military robotics takes place not in a gleaming laboratory, but in the rough, muddy terrain of the Swiss Army’s Thun training area.
It is called ELROB – the European Land Robot Trial. In 2026, around 20 international teams will bring their unmanned ground vehicles and drones to be pushed through reconnaissance, transport, and search and rescue missions in real‑world conditions. No clean urban interiors. No carefully staged environments. Just mud, uneven ground, unpredictable weather, and the kind of chaos that separates genuine capability from polished marketing.
What makes ELROB different
Most robotics competitions take place in controlled settings. The floors are flat. The lighting is consistent. The obstacles are placed at known intervals. The robots can be recalibrated between runs. ELROB does none of that. The terrain is natural – forest floor, rocky slopes, swampy meadows, streams. The weather is whatever the Swiss Alps decide to throw at the participants. The missions are realistic: find a lost hiker, deliver supplies to an isolated patrol, map a collapsed building, evacuate a casualty.
And here is the crucial part: there is no reset button. If a robot gets stuck in the mud, it is stuck. If a drone loses GPS under tree cover, it must navigate by vision alone. If a communication link fails, the robot must operate autonomously or fail. The teams cannot intervene. They can only watch their months of work succeed or sink.
This is the opposite of the polished demos we see on YouTube. It is engineering in the raw. And it is the only way to know if a robot is truly ready for military or humanitarian deployment.
The computing challenge: robust autonomy without a net
From a computing standpoint, ELROB tests three critical capabilities:
Perception under uncertainty – The robot must build a map of its environment using cameras, LiDAR, and inertial sensors, despite variable lighting, fog, rain, and reflective surfaces. A puddle can look like a solid patch of ground to a LiDAR sensor, but a robot that steps into it sinks. The perception system must detect the difference.
Planning with incomplete information – The robot cannot assume it knows the entire terrain ahead. It must plan locally, step by step, while maintaining a global sense of direction. This is the difference between a robot that blindly follows a pre‑loaded map and one that adapts in real time.
Resilience to failure – A wheel slips. A motor overheats. A sensor fogs up. The robot must detect the failure, adapt its behaviour, and continue the mission. If it cannot, it fails the trial.
The teams that succeed at ELROB are not those with the most expensive hardware. They are those with the most robust autonomy stacks – the software that keeps the robot moving when everything goes wrong.
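The second capability above, planning with incomplete information, can be sketched in a few lines. This is a toy of my own, not any ELROB team's software. It assumes a hypothetical grid of terrain in which `#` marks impassable mud, a robot that can only sense the four cells next to it, and a planner that optimistically treats everything it has not yet seen as passable. The robot plans, takes one step, senses again, and replans.

```python
from collections import deque

# Hypothetical ground truth the robot cannot see in advance: '#' is impassable mud.
TERRAIN = [
    "S..#....",
    ".#.#.##.",
    ".#...#..",
    ".####.#.",
    "......#G",
]


def neighbours(cell, rows, cols):
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)


def plan(start, goal, known_blocked, rows, cols):
    """Breadth-first search that treats every unseen cell as passable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        for nxt in neighbours(cell, rows, cols):
            if nxt not in parents and nxt not in known_blocked:
                parents[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable given what is known so far


def locate(terrain, symbol):
    for r, row in enumerate(terrain):
        if symbol in row:
            return (r, row.index(symbol))
    raise ValueError(f"{symbol!r} not found")


def navigate(terrain):
    rows, cols = len(terrain), len(terrain[0])
    pos, goal = locate(terrain, "S"), locate(terrain, "G")
    known_blocked, travelled, plans_made = set(), [pos], 0
    while pos != goal:
        # Sense only the adjacent cells, as a short-range sensor would.
        for r, c in neighbours(pos, rows, cols):
            if terrain[r][c] == "#":
                known_blocked.add((r, c))
        path = plan(pos, goal, known_blocked, rows, cols)
        if path is None:
            return travelled, plans_made, False
        plans_made += 1
        pos = path[1]  # commit to a single step, then sense and plan again
        travelled.append(pos)
    return travelled, plans_made, True


travelled, plans_made, reached = navigate(TERRAIN)
print(reached, len(travelled) - 1, "steps taken")
```

The robot never steps into mud, because it always senses its immediate surroundings before planning. What it cannot avoid is wasted distance: with no map, it may commit to a route that later turns out to be a dead end and have to double back.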
A British example: the aftermath of a winter storm
Imagine a village in the Scottish Highlands cut off by a winter storm. Roads are blocked by fallen trees. The power is out. A resident has a medical emergency. A helicopter cannot land due to high winds. A human rescue team would take hours to reach the village on foot.
Now imagine a fleet of ELROB‑tested unmanned ground vehicles. The largest carries medical supplies and a satellite communicator. Smaller robots scout ahead, mapping fallen trees and icy patches. A drone provides overhead imagery. The vehicles navigate autonomously – no remote control, because the terrain blocks radio. They reach the village in 90 minutes, deliver the supplies, and relay a live medical assessment to the waiting hospital.
That is the promise of robust autonomy. And it is exactly what ELROB is designed to validate – or to reveal as wishful thinking.
The adage that fits
“A smooth sea never made a skilled sailor.” The smooth seas of exhibition halls and laboratory floors do not make skilled military robots. The rough seas of Swiss mud, freezing rain, and unpredictable terrain do. ELROB is the storm that separates the sailors from the holidaymakers.
Another saying: “Don’t judge a book by its cover.” A robot’s cover – its shiny chassis, its impressive spec sheet, its viral video – means nothing at Thun. The only thing that matters is whether it can complete the mission when the mud is deep, the light is fading, and the pressure is on.
Why this matters for the UK
The United Kingdom has its own military robotics programmes – the Defence Science and Technology Laboratory (Dstl) and the Robotic and Autonomous Systems (RAS) strategy. British teams have competed at ELROB in the past. But the 2026 event is particularly significant because it comes at a moment when real‑world validation is more important than ever. The gap between simulation and reality is closing – but it has not closed entirely. ELROB is where the remaining gap is measured in metres of mud.
For British industry, success at ELROB is a badge of honour. It says: “Our robots work where it matters.” For British armed forces, watching ELROB provides intelligence on which technologies are mature enough to procure and which are still research projects. For British researchers, the lessons from Thun feed back into simulation models, making them more realistic and accelerating the next generation of autonomy.
The computing perspective: overfitting to the laboratory
There is a phenomenon in machine learning called overfitting – the model learns the training data so well that it fails on new examples. A robot that is overfitted to the laboratory will fail in the field. It has learned to navigate flat floors, avoid precisely placed obstacles, and respond to predictable lighting. Put it in a forest, and its neural network has no idea what to do.
ELROB is an overfitting detector. If your robot succeeds at Thun, it is not overfitted. It has generalised. It has learned the underlying structure of real‑world mobility, not just the surface patterns of a controlled environment. That is the kind of generalisation that cannot be measured in a simulation – no matter how randomised. It can only be measured in mud.
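A toy illustration of that failure, under assumptions that are entirely mine: suppose the motor effort a robot needs grows linearly with the slope of the ground, and suppose the laboratory only ever offers gentle slopes. One model memorises every laboratory measurement exactly. The other fits a plain straight line. Both are then tested on steeper "field" slopes that neither has seen. The numbers are invented for the sketch.

```python
import random

random.seed(0)


def true_effort(slope):
    """Hypothetical ground truth: motor effort grows linearly with terrain slope."""
    return 1.0 + 2.0 * slope


def sample(lo, hi, n):
    """Draw n noisy (slope, effort) measurements with slopes in [lo, hi]."""
    xs = [random.uniform(lo, hi) for _ in range(n)]
    return [(x, true_effort(x) + random.gauss(0.0, 0.02)) for x in xs]


def fit_memoriser(data):
    """Overfitted model: recall the effort of the nearest slope seen in training."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]


def fit_line(data):
    """Simple model: ordinary least-squares straight line."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    slope = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x, _ in data)
    return lambda x: my + slope * (x - mx)


def mean_abs_error(model, data):
    return sum(abs(model(x) - y) for x, y in data) / len(data)


lab = sample(0.0, 0.2, 40)    # laboratory floor: gentle slopes only
field = sample(0.5, 1.0, 40)  # forest and rocky ground: never seen in training

memoriser, line = fit_memoriser(lab), fit_line(lab)
for name, model in (("memoriser", memoriser), ("line", line)):
    print(name, "lab error:", round(mean_abs_error(model, lab), 3),
          "field error:", round(mean_abs_error(model, field), 3))
```

The memoriser scores a perfect zero in the laboratory and falls apart outside it, because the steepest thing it can recall is still a gentle slope. The straight line is slightly worse in the laboratory and far better in the field.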
The bottom line
ELROB 2026 in Switzerland will push unmanned ground vehicles and drones through reconnaissance, transport, and search and rescue missions in rough natural terrain – mud, uneven ground, unpredictable conditions, realistic mission pressure. No clean urban interiors. No carefully staged environments. Just the unforgiving reality of the Swiss Alps.
As the old adage goes, “If you can’t stand the heat, get out of the kitchen.” ELROB is the kitchen. The heat is mud, failure, and public embarrassment. The robots that survive are the ones that will be deployed in real disasters and real conflicts. The ones that do not will go back to the laboratory – or to the scrap heap. That is the purpose of a stress test. Not to humiliate. To validate. And to ensure that when a British soldier or a British rescue worker sends a robot into danger, it comes back – or at least completes the mission. That is the only proof that matters.
27.“If you want a thing done well, do it yourself” – and if you want it done at two nanometres, build a city-sized factory in Texas.
There is a story about the British computer industry in the 1980s. Acorn Computers designed the ARM processor – a brilliant, energy‑efficient architecture that would eventually power billions of smartphones. But Acorn did not manufacture its own chips. It relied on third‑party fabs, mainly in the United States and Asia. When the market turned, Acorn could not control its supply chain, could not scale production, and ultimately lost its independence. ARM survived because it was spun off, but the lesson remained: design without manufacturing is vulnerability.
Elon Musk has learned that lesson, and he is applying it on a scale that makes the 1980s British chip industry look like a garden shed. He has announced that Tesla, SpaceX, and xAI are building a vertical chip fortress in Texas on a site nearly 9.5 square kilometres – roughly the size of 1,300 football pitches. Under one roof, they will take raw silicon wafers and produce finished processors, targeting the 2‑nanometre process – the cutting edge of semiconductor manufacturing.
The goal: a closed ecosystem. From sand to silicon to software, all controlled by Musk’s companies. No dependence on TSMC in Taiwan. No reliance on Samsung in South Korea. No waiting in line behind Apple, Nvidia, or AMD. Just Tesla, SpaceX, and xAI, making chips for their own electric vehicles, spacecraft, and artificial intelligence models.
The budget estimate is $25 billion. Analysts who have studied the economics of semiconductor fabs put the real number closer to $50 billion. That is not a factory. That is a small city dedicated to printing logic on slices of purified sand.
Why vertical integration matters for AI and robotics
From a computing standpoint, the chip is the bottleneck. Every advance in artificial intelligence – every larger model, every faster inference, every more capable robot – demands more compute. But the global supply chain for advanced chips is fragile, concentrated in East Asia, and subject to geopolitical shocks. A single earthquake in Taiwan, a single political crisis in the South China Sea, and the world’s supply of 2‑nanometre chips could be disrupted for months.
Musk’s vertical integration play is a hedge against that fragility. It is also a performance play. When you design both the chip and the system that uses it – the Tesla vehicle, the SpaceX flight computer, the xAI training cluster – you can optimise across the boundary. You can put specialised circuits for transformer attention directly on the die. You can integrate memory and logic in ways that general‑purpose chips cannot match. You can tune the voltage, the clock speed, the thermal envelope for your specific workload.
This is the same logic that led Apple to design its own M‑series chips. But Apple still outsources manufacturing to TSMC. Musk wants to bring even the manufacturing in‑house. That is not vertical integration. That is vertical obsession.
A British example: the lost art of semiconductor manufacturing
Once upon a time, the United Kingdom had a world‑class semiconductor industry. Newport Wafer Fab, INMOS, Plessey, Ferranti – these companies produced chips for everything from military radar to home computers. But one by one, they were sold, closed, or outsourced. By the 2020s, Britain had no volume production of leading‑edge chips. The government has since tried to revive the industry, with modest investments in South Wales and the North East, but the scale is a fraction of what Musk is building in Texas.
Imagine if British industry had maintained its chip fabs. Imagine if ARM – still a British design house – also had its own foundry, producing custom chips for British electric vehicles, British drones, British AI. That is the world Musk is building for himself. The UK cannot compete on scale – not with a $50 billion Texas fortress. But it can learn the lesson: control your supply chain, or your supply chain will control you.
The adage that fits
“Don’t put all your eggs in one basket.” The global chip industry has put most of its eggs in a few baskets – TSMC in Taiwan, Samsung in Korea, Intel in the US and Ireland. Musk is building his own basket, isolated from the others. If a typhoon hits Taiwan, or a conflict closes the South China Sea, his basket will still be full.
Another saying: “A stitch in time saves nine.” The stitch is the $25‑50 billion investment. The nine are the future crises – the chip shortages, the geopolitical blackmail, the production bottlenecks – that Musk hopes to avoid. Whether the stitch is worth the cost is a bet only time will resolve.
The computing perspective: the 2‑nanometre cliff
At 2 nanometres, chip manufacturing is approaching the limits of silicon physics. A nanometre is a billionth of a metre. 2 nanometres is about ten atoms wide. At this scale, electrons can tunnel through insulators. Tiny variations in temperature or vibration cause failures. The machines that print these chips – extreme ultraviolet lithography systems from ASML – cost hundreds of millions each and are made in the Netherlands, with components from across Europe and the US. They are the most complex machines ever built.
Musk’s fortress will need these machines. He cannot build them himself; no one can. He will still depend on a global supply chain for lithography, for wafer substrates, for specialised chemicals. Vertical integration has limits. But within those limits, he will have control over the design‑manufacturing loop – the ability to iterate a chip design, tape it out, and get back finished wafers in weeks, not months. That is a competitive advantage that no outsourced foundry can match.
The bottom line
Musk’s vertical chip fortress in Texas – 9.5 square kilometres, 2‑nanometre process, $25‑50 billion – is a bet on self‑sufficiency in the age of AI. Tesla, SpaceX, and xAI will no longer wait in line for chips. They will make their own, tailored to their own needs, under their own roof. The rest of the world will watch – and perhaps wonder why they did not do the same.
As the old adage goes, “He who hesitates is lost.” The UK hesitated on chip manufacturing. It lost. Musk does not hesitate. He builds fortresses. Whether they stand or fall, he will have tried. And in the race for artificial general intelligence, trying – and having your own silicon – may be the only thing that matters.
28.“Strike while the iron is hot” – and in Shanghai, the iron is hot enough to forge a new generation of humanoids.
There is a famous photograph from 1984, taken at the Longbridge plant in Birmingham. A line of workers in navy overalls surrounds a brand‑new Rover 200 series. The mood is optimistic. British manufacturing, battered but not broken, is showing what it can still do. That photograph is now a museum piece. The Longbridge plant is mostly gone. The cars are not.
Now imagine a different photograph, taken in 2025 at Tesla’s Giga Shanghai. Instead of workers, the frame is filled with robots – welding, painting, assembling. Instead of a few hundred cars a day, the plant pushes out 851,000 vehicles in a single year. That is not a museum piece. That is the present. And according to Tesla’s China leadership, that same plant – that same engine of production – could soon be building not just cars, but humanoid robots.
When senior executives start saying that the factory could help carry new products including robots, it sounds a lot like Tesla looking at its strongest manufacturing base and asking: how fast can we turn this into an Optimus engine?
Why Giga Shanghai is the perfect robot factory
Giga Shanghai is not just a car plant. It is a manufacturing ecosystem. Everything that makes a Tesla Model 3 or Model Y – the electric motors, the battery packs, the electronic control units, the precision chassis – is also needed for a humanoid robot. An Optimus robot has actuators (electric motors), a battery (smaller but similar chemistry), a thermal management system, a network of sensors, and a structural frame. These are not fundamentally different from car components. They are just smaller, lighter, and arranged differently.
This is the insight that Tesla’s China leadership is signalling. The supply chains that feed Giga Shanghai – the motor suppliers, the battery cell producers, the aluminium casters, the electronics assemblers – can, with modest retooling, also feed a humanoid robot production line. The workforce that stamps body panels can stamp robot chassis. The engineers who optimise vehicle assembly lines can optimise robot assembly lines. The plant that already produces 851,000 vehicles a year – a rate of over 2,300 per day – has the scale discipline that no startup robotics company can match.
The computing perspective: manufacturing as a scaling law
In artificial intelligence, we talk about scaling laws – the observation that model performance improves predictably with more data, more compute, and more parameters. In manufacturing, there is an analogous law: cost per unit falls predictably with cumulative production. This is called the experience curve or learning curve. Every time you double the number of units produced, the cost per unit falls by a fixed percentage – typically 10 to 20 per cent for complex products like vehicles and aircraft.
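The experience curve is usually written as Wright's law: unit cost equals the cost of the first unit multiplied by cumulative output raised to a negative exponent. Here is a minimal sketch of that arithmetic. The first-unit cost of 100,000 is a hypothetical figure with no currency attached, and the 15 per cent learning rate is simply a midpoint of the 10 to 20 per cent range quoted above, not a number for Tesla or for any real robot.

```python
import math


def unit_cost(first_unit_cost, cumulative_units, learning_rate):
    """Wright's law: each doubling of cumulative output cuts unit cost by learning_rate."""
    exponent = math.log2(1.0 - learning_rate)  # negative for any rate between 0 and 1
    return first_unit_cost * cumulative_units ** exponent


FIRST_UNIT = 100_000.0  # hypothetical cost of the very first unit
LEARNING_RATE = 0.15    # assumed: cost falls 15% per doubling

for units in (1, 2, 4, 1_000, 100_000):
    print(units, round(unit_cost(FIRST_UNIT, units, LEARNING_RATE)))
```

Under these assumed numbers the second unit costs 85,000 and the fourth 72,250. The curve keeps falling with every further doubling, which is why accumulated volume matters more than a clever first prototype.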
Giga Shanghai has already produced around 851,000 vehicles in a single year. That is not a startup volume. That is a mature, high‑velocity production system. The learning curve for that plant is well advanced. The workforce is experienced. The supply chain is optimised. The quality systems are proven. To add a new product – the Optimus robot – to this system would cost a fraction of what it would cost to build a dedicated robot factory from scratch. The learning curve for robots would start not at zero, but at the point already reached for vehicles. That is a massive competitive advantage.
Elon Musk has said repeatedly that Tesla is not just a car company. It is a manufacturing company that happens to make cars. Giga Shanghai is the proof. And if that plant can make 851,000 vehicles, it can certainly make hundreds of thousands of humanoid robots.
A British example: what could have been
Imagine a British parallel. Suppose the Longbridge plant had been modernised, scaled, and diversified. Suppose it produced not just the Rover 200, but also a family of electric vehicles, battery packs, and – why not? – humanoid robots. Suppose Birmingham had become a centre for robotics manufacturing, with a supply chain that stretched across the West Midlands. That is the world that never happened. The capital was not invested. The political will was not sustained. The plant closed.
Giga Shanghai is that alternate reality, made real. It is what happens when a government (China’s) and a company (Tesla’s) decide to build scale at all costs. The cost was billions. The result is a factory that can produce almost anything with wheels – or with legs.
The adage that captures it
“Many hands make light work.” Giga Shanghai has many hands – not just human hands, but robotic arms, automated guided vehicles, and the software that coordinates them. Those hands can turn their attention from cars to robots without missing a beat. The work becomes light because it is shared across a vast, efficient production system.
Another saying: “Make hay while the sun shines.” The sun is shining on Tesla’s manufacturing capabilities. Demand for electric vehicles is strong. The plant is running at high utilisation. Now is the moment to add a second product line – humanoid robots – while the supply chains are hot, the workforce is trained, and the capital is invested. Waiting would be a missed opportunity.
The technical details: from vehicle to robot
What would it actually take to convert part of Giga Shanghai to Optimus production? The answer: less than you might think. The most expensive components of a humanoid robot are the actuators (motors plus gearing) and the battery. Tesla already makes electric motors and battery packs at enormous scale. The actuators for Optimus are essentially smaller versions of the motors that power Tesla’s windows, seats, and wipers – already produced in‑house or by suppliers who are already on site.
The structural frame of Optimus could be cast using the same gigacasting technology that Tesla uses for its vehicle chassis. A single massive die casting replaces dozens of stamped and welded parts. That is faster, cheaper, and more precise. The gigacasting presses at Giga Shanghai are among the largest in the world. They are already running. Adding a new casting die is a matter of weeks, not years.
The sensors – cameras, inertial measurement units, touch sensors – are similar to those used in Tesla’s Autopilot and Full Self‑Driving systems. Those components are already sourced at massive volume. The real‑time computers that run Optimus are similar to the FSD computer in a Tesla vehicle. The software is different, but the silicon is the same.
In other words, Optimus is a car without wheels. It is a car with legs. And Giga Shanghai knows how to build cars.
The bottom line
Tesla’s China leadership has suggested that Giga Shanghai could become a major enabler for mass humanoid robot production. The plant already pushed out around 851,000 vehicles in 2025. When senior executives start saying the factory could help carry new products including robots, that is not idle speculation. It is a signal: Tesla is looking at its strongest manufacturing base and asking how fast it can turn that into an Optimus engine.
As the old adage goes, “A rolling stone gathers no moss.” Giga Shanghai is a rolling stone – constantly moving, constantly improving, constantly producing. Adding humanoid robots to its output is not a distraction. It is a natural extension. And when the stone starts rolling on robots, the competitors – still building their first prototype production lines – will be left in the dust.
For the United Kingdom, the lesson is sobering. We once had factories like this. We lost them. China built new ones. Now those factories are not just making cars. They are making the future – one humanoid robot at a time. The question is not whether Giga Shanghai will become the Optimus engine. The question is how many other factories will follow its lead. And how many British plants will still be standing when they do.
29.“A penny saved is a penny earned” – but a robot rented by the hour earns pounds by the skyscraper.
There is a classic British joke about a man who buys a lawnmower, uses it once a fortnight, and spends the rest of the time storing it in a shed, fixing the pull cord, and trying to remember where he put the petrol can. The joke is funny because it is true. Ownership is not always the answer. Sometimes, renting is smarter.
Lucid Drone Tech has built a hundred‑million‑dollar business on that very principle – but instead of lawnmowers, they rent robots. Specifically, they rent drones and ground robots to cleaning companies on a subscription model. A cleaning firm signs up, pays a monthly fee, and gets access to a fleet of machines that can wash skyscrapers, paint facades, seal building joints, and even clean sidewalks. Jobs that were once slow, dangerous, and labour‑intensive become fast, safe, and almost entirely automated.
The numbers are staggering. In 2025, Lucid Drone Tech hit $75 million in profit. That is not revenue. That is profit. And that single year’s profit was more than the company had earned in total over the previous seven years combined. During that same year, they scaled their fleet from 100 to 1,000 units. Ten times the robots. Ten times the recurring revenue. And a proof of concept that the subscription model for robots is not just viable – it is explosively successful.
Why subscription works for robotics
Traditional industrial automation requires a massive upfront capital expenditure. A factory buys a robotic arm for £100,000, installs it, programmes it, and hopes it pays back over five years. The customer takes all the risk. If the robot breaks, or a new model makes it obsolete, the factory eats the loss.
Lucid Drone Tech flips that model. The cleaning company pays a monthly subscription – say, £5,000 per month for a fleet of ten drones. No upfront purchase. No maintenance costs (Lucid covers those). No risk of obsolescence (Lucid upgrades the fleet as technology improves). The cleaning company simply pays for the service that the robots provide – the clean windows, the painted facade, the sealed joints.
From a computing perspective, this is robotics as a service (RaaS). The same cloud computing model that turned expensive servers into monthly subscriptions – Amazon Web Services, Microsoft Azure – is now being applied to physical machines. The robot is the hardware. The AI that flies it, plans its path, and avoids obstacles is the software. The customer pays for the combination, by the hour or by the job, and never thinks about the underlying complexity.
The advantage for Lucid is predictable recurring revenue. A subscription customer is not a one‑off sale. They are a stream of income that lasts for years. And as the fleet grows, the marginal cost of adding another robot falls – because the software can be replicated at almost no cost, and the hardware benefits from bulk purchasing and learning curves. That is why Lucid could scale from 100 to 1,000 units in a single year. The subscription model funded the expansion.
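For readers who want the learning‑curve point made concrete, a common rule of thumb is Wright's law: each time cumulative production doubles, unit cost falls by a fixed percentage. The sketch below is illustrative only. The £20,000 first‑unit cost and the 85 per cent learning rate are my assumptions, not Lucid figures.

```python
import math


def unit_cost(first_unit_cost: float, units_built: int,
              learning_rate: float = 0.85) -> float:
    """Wright's law: every doubling of cumulative output multiplies
    the unit cost by `learning_rate` (0.85 means a 15% drop)."""
    if units_built < 1:
        raise ValueError("units_built must be at least 1")
    exponent = math.log2(learning_rate)  # negative when learning_rate < 1
    return first_unit_cost * units_built ** exponent


# Hypothetical first-unit cost, chosen only for illustration.
FIRST_UNIT_COST = 20_000.0
for n in (1, 100, 1_000):
    print(f"unit {n:>5}: £{unit_cost(FIRST_UNIT_COST, n):,.0f}")
```

Whatever the true numbers, the shape is the point: the thousandth robot costs far less to build than the first.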
A British example: cleaning the Shard
Imagine the Shard in London – 95 storeys of glass and steel towering over the Thames. Cleaning those windows traditionally requires abseilers: men and women dangling on ropes, scrubbing by hand. It is slow, expensive, and dangerous. One slip, one equipment failure, and a life is lost.
Now imagine a subscription from Lucid Drone Tech. A fleet of cleaning drones, each equipped with high‑pressure washers and soft brushes, flies up the facade. They are controlled by AI that avoids wind gusts, navigates around ledges, and ensures every pane is spotless. The job that took a team of abseilers three days takes the drones six hours. The cleaning company pays a monthly subscription that is less than the cost of the abseilers’ insurance premiums. The abseilers are redeployed to jobs that drones cannot do – like inspecting hidden crevices or handling delicate heritage glass.
That is not a distant future. That is what Lucid’s customers are doing today – not on the Shard perhaps, but on skyscrapers in American cities, on bridges, on stadiums. The subscription model makes it affordable. The technology makes it possible.
The adage that captures it
“A rolling stone gathers no moss.” Lucid Drone Tech is a rolling stone – constantly moving, constantly deploying robots, constantly collecting subscription fees. The moss of stagnation – the risk of unsold inventory, the burden of maintenance, the headache of customer training – does not gather because the subscription model aligns incentives. Lucid only makes money when the robots work. The customers only pay when the robots deliver value.
Another saying: “Don’t buy the cow if you only need a glass of milk.” Cleaning companies do not need to own a fleet of drones. They need clean windows. The subscription model gives them the milk – the service – without the cow – the capital expenditure, the maintenance, the obsolescence. That is why it works.
The computing perspective: utilisation as the key metric
In traditional manufacturing, the key metric is output. How many units can you produce? In the subscription economy, the key metric is utilisation. How many hours per day are your robots working? A robot that sits in a warehouse generates no revenue. A robot that flies from dawn to dusk, moving from one job to the next, pays for itself many times over.
Lucid’s fleet of 1,000 robots has a utilisation rate that traditional industrial robots can only dream of. A factory robot might work one shift, or two. Lucid’s drones can work three shifts, because they are not tied to a single location. They fly to a skyscraper, clean it, fly to the next, clean it, and so on. The software schedules the jobs, optimises the routes, and ensures that every robot is productive as close to 24/7 as the weather allows.
This high utilisation is what drives the profit. The upfront cost of the robot is amortised over many more operating hours than a traditional machine. The subscription fee can be lower because the robot works harder. The customer gets a better price. Lucid gets a better margin. Everyone wins – except the abseilers who are no longer needed.
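The amortisation argument is simple arithmetic, and a rough sketch makes it visible. I reuse the £100,000 price and five‑year payback from the factory example above. The 250 working days a year are my assumption, and none of these are Lucid's real numbers.

```python
def hardware_cost_per_hour(purchase_price: float, hours_per_day: float,
                           days_per_year: int = 250,
                           lifetime_years: float = 5.0) -> float:
    """Spread the purchase price over every hour the machine actually works."""
    total_hours = hours_per_day * days_per_year * lifetime_years
    if total_hours <= 0:
        raise ValueError("the robot must work at least some hours")
    return purchase_price / total_hours


PRICE = 100_000.0  # illustrative price, borrowed from the robotic-arm example

one_shift = hardware_cost_per_hour(PRICE, hours_per_day=8)
three_shifts = hardware_cost_per_hour(PRICE, hours_per_day=24)
print(f"one shift:    £{one_shift:.2f} per operating hour")
print(f"three shifts: £{three_shifts:.2f} per operating hour")
```

Triple the working hours and the hardware cost carried by each hour falls to a third. That gap is the room in which a lower subscription fee and a better margin can coexist.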
The bottom line
Lucid Drone Tech hit $75 million in profit by renting robots on a subscription model. Cleaning companies sign up, and the drones take on jobs crews could not handle before – washing skyscrapers, painting facades, sealing joints, cleaning sidewalks. In 2025, the company made more profit than it had earned in total over the previous seven years and scaled its fleet from 100 to 1,000 units.
As the old adage goes, “There’s more than one way to skin a cat.” For decades, we assumed that selling robots was the only business model. Lucid has proven that renting them – by the month, by the job, by the hour – is not only viable, but wildly profitable. The subscription model for robots works. And now that the cat is skinned, every robotics company is taking notes.
For the United Kingdom, the lesson is clear. British cleaning companies, facilities management firms, and industrial service providers should be watching Lucid closely. The subscription model lowers the barrier to entry. You do not need to buy a fleet of drones. You need to sign a contract. And once you do, the robots will be climbing your walls before the ink is dry. That is the future. It is already here. It is just rented.
30.“A chain is only as strong as its weakest link” – but a humanoid robot is only as affordable as its most expensive joint.
There is a saying in manufacturing: “The money is in the moving parts.” In a humanoid robot, the moving parts are the joints – the actuators, gears, bearings, and sensors that allow the machine to bend, twist, walk, and grasp. Without reliable joints, a humanoid is a statue. With poor joints, it is a hazard. With expensive joints, it is a luxury that only the wealthiest researchers and military forces can afford.
EU Robot Technology – a Chinese company, despite the name – has just opened what it calls the world’s first factory dedicated to humanoid robot joints in Shanghai. The facility currently produces 100,000 joints per year and has plans to triple output. That is not a pilot line. That is not a research project. That is industrial‑scale production of the single most expensive component in a humanoid robot.
Why does this matter? Because joints make up about 50 per cent of a humanoid’s cost. Half of the price you pay for a robot – whether it is a Unitree G1, an Agibot X2, or a Tesla Optimus – goes to the joints. Control the joints, and you control the economics of humanoid robotics. Control the economics, and you control the industry.
The strategic importance of joints
Let me explain why joints are so expensive. A humanoid robot joint is not a simple hinge. It is a precision mechatronic assembly containing:
An electric motor – small, powerful, and often custom‑designed for high torque at low speed.
A reduction gear – typically a harmonic drive or planetary gearbox, machined to micron tolerances, that converts high‑speed motor rotation into high‑torque joint rotation.
Bearings – often cross‑roller or angular contact, designed to handle radial and axial loads simultaneously.
Sensors – position encoders, torque sensors, temperature monitors, and sometimes current sensors.
A controller – a small circuit board with a microcontroller and power electronics.
Wiring and connectors – that must flex with the joint without breaking.
Each joint is a marvel of miniaturisation and precision. Each joint costs hundreds of pounds to manufacture. And a humanoid robot has anywhere from 20 to more than 60 joints – the Unitree G1 has up to 43, the XPeng Iron has 62, the Tesla Optimus is rumoured to have around 40. Multiply the cost per joint by the number of joints, and you see why joints dominate the bill of materials.
Now consider EU Robot Technology’s new factory. By producing 100,000 joints per year – and scaling to 300,000 – they are not just making components. They are setting the price. A competitor who must buy joints from a third‑party supplier, or manufacture them in‑house at lower volume, will pay significantly more per joint. That difference translates directly into a higher final price for the robot. In a market where the Chinese R1 already costs $4,370 and a US equivalent costs ten times that, the joint factory is another turn of the screw.
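A back‑of‑the‑envelope sketch shows how a cheaper joint feeds through to the price of the whole robot. Every figure here is hypothetical: 40 joints at £300 each and £12,000 of other parts, picked so that joints come to the 50 per cent share quoted above.

```python
def robot_cost(joint_count: int, cost_per_joint: float,
               other_costs: float) -> float:
    """Total build cost: all the joints plus everything else
    (battery, frame, compute and so on)."""
    return joint_count * cost_per_joint + other_costs


def joint_share(joint_count: int, cost_per_joint: float,
                other_costs: float) -> float:
    """Fraction of the total build cost that goes on joints."""
    total = robot_cost(joint_count, cost_per_joint, other_costs)
    return joint_count * cost_per_joint / total


# Hypothetical robot, chosen so joints are half the bill of materials.
JOINTS, PER_JOINT, OTHER = 40, 300.0, 12_000.0

before = robot_cost(JOINTS, PER_JOINT, OTHER)
after = robot_cost(JOINTS, PER_JOINT * 0.7, OTHER)  # joints 30% cheaper

print(f"joints are {joint_share(JOINTS, PER_JOINT, OTHER):.0%} of the cost")
print(f"robot cost falls from £{before:,.0f} to £{after:,.0f}")
```

When joints are half the bill, a 30 per cent saving on joints is a 15 per cent saving on the robot, before a single line of software changes.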
A British example: the decline of precision manufacturing
Once upon a time, the United Kingdom was a world leader in precision components. The watchmaking industry in Clerkenwell, the instrument makers in the Lake District, the gear cutters in Coventry – these were not mass producers, but they produced the highest quality joints, bearings, and gears on the planet. British engineering was synonymous with reliability and precision.
That industry is largely gone. The watchmakers moved to Switzerland. The gear cutters closed or outsourced. The precision that once went into a Rolls‑Royce engine now goes into a Chinese robot joint. EU Robot Technology’s factory in Shanghai is not a fluke. It is the logical outcome of decades of investment in precision manufacturing – investment that the United Kingdom chose not to make.
Imagine if a British company had built the world’s first humanoid joint factory in, say, Sheffield – a city with a proud history of metallurgy and precision engineering. Imagine that factory producing 100,000 joints a year, supplying British robot builders, creating a virtuous cycle of cost reduction and innovation. That is the world that could have been. Instead, the joints are made in Shanghai. The robots are made in Shenzhen. And the United Kingdom is a customer, not a producer.
The adage that fits
“A stitch in time saves nine.” The stitch was the investment in precision manufacturing – in machine tools, in training, in supply chain integration – that China made decades ago. The nine are the industries now dependent on those joints: robotics, aerospace, automotive, medical devices. China is saving nine. The rest of the world is still trying to stitch.
Another saying: “The early bird catches the worm.” EU Robot Technology is the early bird. By opening the first dedicated humanoid joint factory, they have caught the worm of cost leadership. Later entrants will struggle to match their volume, their learning curve, and their supplier relationships. The worm is not just a market share. It is the entire economics of humanoid robotics.
The computing perspective: the joint as a data source
From a computing standpoint, joints are not just mechanical components. They are sensors – sources of data about the robot’s interaction with the world. Each joint’s position, torque, temperature, and current draw is a signal that can be used to improve the robot’s AI. A robot that knows how much torque each joint is exerting can adjust its gait in real time, detect slipping, and anticipate failures.
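To make "the joint as a sensor" less abstract, here is a minimal sketch of how software might flag an odd torque reading, such as a foot losing grip. It is a deliberately crude statistical check, not anyone's production method. The field names, the readings and the three‑deviation threshold are all my assumptions.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class JointSample:
    """One telemetry reading from one joint (field names are illustrative)."""
    position_rad: float
    torque_nm: float
    temperature_c: float
    current_a: float


def fit_baseline(healthy: list[JointSample]) -> tuple[float, float]:
    """Mean and standard deviation of torque over known-good readings."""
    torques = [s.torque_nm for s in healthy]
    return mean(torques), stdev(torques)


def looks_abnormal(sample: JointSample, baseline: tuple[float, float],
                   threshold: float = 3.0) -> bool:
    """Flag a reading whose torque is more than `threshold` standard
    deviations away from the healthy average."""
    mu, sigma = baseline
    if sigma == 0:
        return sample.torque_nm != mu
    return abs(sample.torque_nm - mu) / sigma > threshold


# Hypothetical ankle readings during steady walking, then one mid-slip.
healthy = [JointSample(0.10, t, 41.0, 2.0) for t in (10.0, 10.2, 9.8, 10.1, 9.9)]
baseline = fit_baseline(healthy)
slipping = JointSample(0.10, 4.0, 41.0, 0.9)

print(looks_abnormal(slipping, baseline))
print(looks_abnormal(healthy[0], baseline))
```

A real robot would use far richer models, but the principle is the same: the joint reports, the software compares, and the gait changes before the robot falls.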
EU Robot Technology’s factory does not just produce joints. It produces standardised, well‑instrumented joints that can be integrated into any humanoid platform. That standardisation is a computing advantage. If every robot uses similar joints, then the AI models can be trained on data from all of them. A joint failure signature from a Unitree robot can be used to predict failure in an Agibot robot. A successful gait pattern from a Tesla Optimus can be transferred to a G1. The joints become a common hardware abstraction layer, allowing software to improve across the entire industry.
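Programmers will recognise the "hardware abstraction layer" idea as an interface. The sketch below shows the shape of it in Python. Both vendor classes are invented for illustration, and nothing here reflects any real manufacturer's firmware or API.

```python
import math
from typing import Protocol


class Joint(Protocol):
    """The shared interface: what any joint must offer, whoever built it."""

    def read_position_rad(self) -> float: ...

    def command_position_rad(self, target_rad: float) -> None: ...


class VendorAJoint:
    """Hypothetical vendor whose firmware already works in radians."""

    def __init__(self) -> None:
        self._angle_rad = 0.0

    def read_position_rad(self) -> float:
        return self._angle_rad

    def command_position_rad(self, target_rad: float) -> None:
        self._angle_rad = target_rad


class VendorBJoint:
    """Hypothetical vendor whose firmware works in degrees internally."""

    def __init__(self) -> None:
        self._angle_deg = 0.0

    def read_position_rad(self) -> float:
        return math.radians(self._angle_deg)

    def command_position_rad(self, target_rad: float) -> None:
        self._angle_deg = math.degrees(target_rad)


def move_all(joints: list[Joint], target_rad: float) -> list[float]:
    """Control code written once against the interface, not against a vendor."""
    for joint in joints:
        joint.command_position_rad(target_rad)
    return [joint.read_position_rad() for joint in joints]


print(move_all([VendorAJoint(), VendorBJoint()], target_rad=0.5))
```

The control code never asks who made the joint. That indifference is what lets data and software travel between robots.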
This is the opposite of the bespoke, vertically integrated joints that Boston Dynamics or Figure AI might develop. Those joints are optimised for one robot, but they do not benefit from the network effects of scale. EU Robot Technology’s joints are general purpose. They are the USB of humanoid actuation – not the best for any specific task, but good enough for almost everyone, and cheap enough to be ubiquitous.
The bottom line
EU Robot Technology has opened the world’s first factory dedicated to humanoid robot joints in Shanghai, producing 100,000 joints per year with plans to triple output. Joints make up about 50 per cent of a humanoid’s cost, making them strategically critical. Control the joints, and you control the economics of the entire industry.
As the old adage goes, “Well begun is half done.” The factory is well begun. The half done is the scaling – to 300,000 joints, to a million, to global dominance. For the United Kingdom, the lesson is harsh. We once led in precision components. We no longer do. The joints that move the future are being made in Shanghai. And the only thing we can do is buy them – or learn to compete. But competing requires a factory of our own. And that, for now, is a stitch we have not yet sewn.
Part Four: The Human Element in an Automated Age
31.“It’s the simple things in life that are hardest to master” – and for robots, the simple things are still a very long way from mastered.
In 1988, a robotics researcher named Hans Moravec made an observation that has haunted the field ever since. He pointed out that what is easy for computers is hard for humans, and what is easy for humans is hard for computers. A computer can beat a grandmaster at chess. It can perform billions of arithmetic operations per second. But a three‑year‑old child can walk across a cluttered living room, pick up a fallen biscuit, and eat it without dropping crumbs. The computer cannot. The child does not even have to think about it.
This is Moravec’s paradox. Nearly forty years later, it remains the central obstacle in robotics. We have AI that writes poetry, diagnoses diseases, and generates photorealistic images. We do not have a robot that can fold a fitted sheet, clear a table of mixed cutlery, or walk up a flight of unfamiliar stairs without pausing. The paradox is as sharp as ever.
But a San Francisco startup called Physical Intelligence – whose founders include former Google DeepMind researchers – is trying to close this gap entirely. Their approach is not to program robots with explicit rules, nor to teach them isolated tasks. It is to build a general‑purpose foundation model for physical interaction – a single AI brain that can handle the messy, unpredictable, low‑level chaos of the real world. Walking, grasping, folding, pouring. The things a child learns by accident. The things that have defeated robotics for decades.
Why the paradox exists: the hidden complexity of the physical world
Let me explain why walking is harder than chess. Chess is a closed, deterministic system. The board has 64 squares, the pieces have fixed moves, and the rules are unambiguous. A computer can evaluate millions of positions per second and choose the optimal move. The physical world is the opposite. It is open, non‑deterministic, and ambiguous. A floor can be slippery, sticky, uneven, or covered in Lego. A cup can be ceramic, paper, plastic, hot, cold, full, empty, or cracked. A human hand can adjust its grip in milliseconds based on thousands of tactile sensors. A robot hand has far fewer sensors, slower reflexes, and no lifetime of practice.
This is why the paradox persists. The things we find easy – the things we do without conscious thought – are actually the most computationally complex. They have been honed by hundreds of millions of years of evolution. Our brains are specialised for physical interaction. They are not specialised for chess. Chess is a recent invention. Walking is ancient.
How Physical Intelligence is tackling the paradox
Physical Intelligence’s approach is to scale up – in data, in compute, in model size – just as language models scaled up from GPT‑2 to GPT‑4. They are collecting massive amounts of teleoperation data: human operators remotely guiding robots through physical tasks while every movement is recorded. They are training on hundreds of home environments, thousands of object types, millions of grasps and folds and pours. Their model, currently at version 1.0, is showing signs of generalisation – the ability to perform a task in a new environment it has never seen before.
The team’s framing is honest. They say robotics is at the GPT‑2 moment – impressive and promising, but not yet transformative. The gap is closing, but it is not closed. Walking, grasping, folding – these are still hard. But they are no longer impossible. The trajectory is clear. More data, more compute, better models. And one day, perhaps sooner than we expect, a robot will fold a fitted sheet without being told how.
A British example: the care home assistant
Imagine a care home in the Cotswolds. Elderly residents need help with everyday tasks – picking up dropped items, opening packets, moving from chair to bed. These are trivial for a human care assistant. They are impossibly hard for most robots. A dropped pen on a patterned carpet – the robot cannot find it. A packet of biscuits with a tricky tear notch – the robot cannot open it. A chair with uneven legs – the robot hesitates, unsure of the balance.
Now imagine a Physical Intelligence robot, trained on thousands of hours of teleoperation data, generalising to this new environment. It sees the pen on the carpet – not as a jumble of colours, but as an object to be grasped. It adjusts its grip for the biscuit packet – not with a pre‑programmed motion, but with a learned understanding of how flexible packaging behaves. It approaches the chair, tests its stability, and guides the resident safely. No explicit programming. Just a foundation model for physical interaction.
That is the promise. That is what closing Moravec’s paradox would achieve. Not a robot that does one thing well. A robot that does many things adequately – in the messy, unpredictable, low‑level world that humans navigate without thinking.
The adage that captures it
“Little strokes fell great oaks.” Moravec’s paradox is a great oak. For forty years, each research advance has been a little stroke – a better gripper, a more stable gait, a smarter perception system. Physical Intelligence is not aiming for a single, dramatic axe blow. They are scaling up the strokes. More data, more compute, more training. Eventually, the oak will fall. The paradox will be resolved. But it will take many little strokes.
Another saying: “Rome wasn’t built in a day.” Nor will the solution to Moravec’s paradox be found overnight. Physical Intelligence’s GPT‑2 framing is an admission of humility. They are not claiming to have solved general robotics. They are claiming to have built a foundation. The rest – the walking, grasping, folding – will come with scale. And scale takes time.
The computing perspective: from symbolic to subsymbolic
Moravec’s paradox can be understood as a clash between symbolic AI and subsymbolic AI. Symbolic AI – the kind that plays chess – manipulates symbols according to rules. It is explicit, logical, and brittle. Subsymbolic AI – the kind that recognises faces or controls muscles – works with patterns, gradients, and probabilities. It is implicit, statistical, and robust.
For decades, symbolic AI dominated. We wrote rules for robots. The rules failed in the real world because the real world has too many exceptions. Now, subsymbolic AI – deep learning, reinforcement learning, large models – is taking over. Physical Intelligence’s approach is entirely subsymbolic. The robot does not have a rule for “folding a towel”. It has a neural network that has seen thousands of towels folded, and has learned the latent pattern of folding. That pattern is not a rule. It is a statistical regularity.
This is why scaling works. More data refines the pattern. More compute allows the pattern to be extracted from more examples. The subsymbolic approach does not eliminate the paradox – but it sidesteps it. Instead of trying to program common sense, we let the model learn common sense from data. And that, at last, is showing results.
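The contrast between a rule and a learned pattern fits in a few lines. In this toy sketch, a straight‑line fit stands in for a neural network, which is a drastic simplification, and the objects, weights and grip forces are all invented. It is not Physical Intelligence's method, only the idea in miniature.

```python
from typing import Optional

# Symbolic approach: one hand-written rule per object the programmer foresaw.
GRIP_RULES_N = {"pen": 2.5, "mug": 6.5, "book": 8.5}  # newtons, made up


def rule_based_grip(name: str) -> Optional[float]:
    """Look the object up by name; anything unforeseen has no answer."""
    return GRIP_RULES_N.get(name)


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Learned approach: demonstrations of (object weight in kg, grip force in N).
weights = [0.1, 0.2, 0.3, 0.4]
forces = [2.5, 4.5, 6.5, 8.5]
slope, intercept = fit_line(weights, forces)


def learned_grip(weight_kg: float) -> float:
    """Apply the fitted pattern to any weight, seen before or not."""
    return slope * weight_kg + intercept


print(rule_based_grip("biscuit packet"))  # no rule exists for this object
print(round(learned_grip(0.25), 2))       # an unseen weight still gets an estimate
```

The rule table is silent the moment it meets something new. The fitted line at least makes an estimate, and more examples would sharpen it. Scale that up by many orders of magnitude and you have the bet.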
The bottom line
Moravec’s paradox – the observation that what is easy for computers is hard for humans, and what is easy for humans is hard for computers – remains the central obstacle in robotics. Walking, grasping, folding clothes: these trivial human acts are still extremely difficult for machines. But Physical Intelligence is trying to close the gap, using the same scaling approach that transformed language AI. They are at the GPT‑2 moment. The path to GPT‑4 is clear. It will take time, data, and compute – but it is a path.
As the old adage goes, “Where there’s a will, there’s a way.” The will is there – in the billion‑dollar valuations, the teams of former DeepMind scientists, the thousands of hours of teleoperation data. The way is scaling. The paradox is not eternal. It is just hard. And hard, as any British engineer knows, is just another word for “not finished yet”.
32.“Where there’s life, there’s hope” – but when the life is part robot, hope takes on a strange new meaning.
There is a line from the old British science fiction series The Prisoner: “I am not a number, I am a free man.” It speaks to the boundary between machine and human, between programmed obedience and autonomous will. That boundary has just been breached – not from the mechanical side, but from the biological.
Scientists have created neurobots: living robots made from frog cells with actual neurons integrated into their structure. These are not machines with biological coatings. They are biological constructs that move, react, and – for the first time – have a basic nervous system influencing their behaviour. The neurons connect to cells that control movement. The robot feels. It decides. It acts.
Previous versions of these biological robots – called xenobots – could move using cilia, tiny hair‑like structures. But they had no internal control system. They were like leaves blown by the wind: they moved, but they did not choose to move. The neurobots change that. Researchers inserted neural precursor cells into the biological construct. Over time, those cells developed into neurons and formed networks inside the robot. Those neurons connect to the muscle‑like cells. The robot now has a simple nervous system that can process signals and trigger movement.
And then something unexpected happened. Some genes linked to visual system development started to be expressed. The neurons were not just controlling movement. They were beginning to organise as if preparing to see. The robot had no eyes. It had no light sensors. But its genetic programme was activating pathways that in a normal frog would lead to the formation of a retina. This suggests that future versions – perhaps the next generation – could develop new sensory capabilities that the scientists did not program and did not predict.
What neurobots actually are (and are not)
Let me be precise. A neurobot is not a traditional robot. It has no metal, no silicon, no soldered joints. It is made entirely of living cells – frog skin cells (for structure) and frog heart cells (for movement), plus the inserted neural precursors that become a primitive nervous system. The whole thing is smaller than a grain of rice. It moves by contracting its muscle‑like cells in sequence, propelled through a liquid medium.
The neurons are real. They fire action potentials. They form synapses. They respond to neurotransmitters – the scientists demonstrated this by applying drugs that alter neural communication, and the neurobots changed their behaviour accordingly. The nervous system is not a simulation. It is not a model. It is a living, functioning neural network, grown from frog cells, inside a robot made of frog cells.
But it is also a robot. Its shape is designed by humans. Its movement patterns are constrained by its structure. The neurons are not conscious – probably not. They are too few, too simple. But they are autonomous in a way that no mechanical robot has ever been. The robot does not follow a program. It follows its neural signals. And those signals emerge from the network, not from a line of code.
A British example: the self‑repairing pipe inspector
Imagine a network of water pipes beneath London – Victorian cast iron, corroding, leaking. Inspecting them requires sending cameras through narrow passages, but cameras get stuck, lenses fog, batteries die. Now imagine a swarm of neurobots, each the size of a grain of rice, injected into the water system. They swim through the pipes, guided by their primitive nervous systems. They sense cracks in the pipe walls – not with programmed sensors, but with touch receptors that the neurobots have grown, spontaneously, because their genetic programme responded to the environment.
If a neurobot is crushed by a sudden surge of water, it does not matter. It is made of living cells. Its remains are biodegradable, harmless. The swarm continues. And because the neurobots are biological, they could – in theory – be equipped with repair mechanisms. A neurobot that detects a leak could secrete a biological sealant, grown from its own cells. It could patch the pipe from the inside, autonomously, without human intervention.
This is not science fiction. It is the logical extension of the neurobot research. The scientists have already shown that the neurons can reorganise themselves, that gene expressions can change, that new capabilities can emerge. A neurobot that can sense light – and then avoid it, or seek it – is not far away. A neurobot that can sense chemicals – and then move towards a nutrient source or away from a toxin – is even closer.
The adage that captures it
“Nature abhors a vacuum.” The vacuum in robotics has been the gap between programmed behaviour and autonomous, adaptive, living behaviour. Neurobots are filling that vacuum with nature itself – not with better code, but with actual neurons. The irony is delicious. After decades of trying to make machines more like living things, we have made living things more like machines.
Another saying: “You can’t teach an old dog new tricks.” But you can teach a new neurobot entirely new tricks – not by programming, but by letting its nervous system learn. The neurons rewire themselves based on experience. A neurobot that encounters a stimulus repeatedly will strengthen the neural pathways that respond to it. That is learning. That is plasticity. And that is something no mechanical robot has ever genuinely achieved.
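Strengthening a pathway through repeated use has a classic textbook caricature, Hebb's rule, often summarised as "cells that fire together wire together". The sketch below is that caricature in code, with invented numbers. It is a toy model of plasticity in general, not a model of what happens inside a neurobot.

```python
def hebbian_update(weight: float, pre: float, post: float,
                   learning_rate: float = 0.1) -> float:
    """Hebb's rule: a connection strengthens in proportion to how active
    the neurons on both sides of it are at the same moment."""
    return weight + learning_rate * pre * post


weight = 0.2   # initial strength of one connection (arbitrary units)
history = [weight]

# Present the same stimulus five times.
for _ in range(5):
    pre = 1.0              # upstream neuron fires on the stimulus
    post = weight * pre    # downstream response depends on the connection
    weight = hebbian_update(weight, pre, post)
    history.append(weight)

print([round(w, 3) for w in history])
```

Each repetition leaves the connection a little stronger than it found it, and nobody wrote an instruction saying "respond to this stimulus". Left unchecked the rule grows without limit, which is one reason real neurons are more complicated than this.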
The computing perspective: from programmed to emergent
Traditional computing is programmed. A human writes instructions. The machine follows them. No matter how complex the program, the behaviour is ultimately deterministic – or at least, predictable within the bounds of randomness.
Neurobots invert this. Their behaviour is emergent. It arises from the interactions of neurons that the scientists did not explicitly wire. The neurons form connections spontaneously. They fire in patterns that are not specified in any design document. The robot’s movement is not programmed. It is grown. This is the difference between a steam engine and a seedling. One is built. The other emerges.
From a computing perspective, this is both thrilling and terrifying. Thrilling because emergent systems can be more robust, more adaptive, more creative than programmed ones. Terrifying because they are also less predictable. A neurobot might develop a behaviour that its creators never intended. It might solve a problem in a way that no one anticipated. Or it might fail in a way that no one can debug, because there is no code to debug – only a living network that has learned to do something unexpected.
The researchers who saw visual system gene expressions appearing were not alarmed. They were fascinated. But they should also have been cautious. The neurobot was not supposed to prepare to see. It did so anyway. That is emergence. And emergence, as any computer scientist knows, is where bugs become features and features become surprises.
The bottom line
Scientists have created neurobots – living robots from frog cells with actual neurons integrated into their structure. Those neurons connect to cells that control movement, giving the robot a basic nervous system. Some genes linked to visual system development began to be expressed spontaneously, suggesting future versions could develop new sensory capabilities. The line between biological and mechanical has blurred. The robot is alive. The organism is designed.
As the line from Jurassic Park goes, “Life finds a way.” In the neurobots, life has found a way into robotics – not as a metaphor, but as a substrate. The neurons fire. The robot moves. The future – eyes, senses, learning – is already stirring in its genetic code. We built it. But we do not fully control it. And that, perhaps, is the most British of understatements: we may have opened a door that cannot be closed. The neurobots are watching. Soon, they may see.
33.“Muscle is better than motor” – and for the first time, artificial muscle is catching up.
There is a classic British engineering joke: “How many motors does it take to change a light bulb? One, but it has to be precisely calibrated, lubricated, and protected from dust.” The humour lies in the absurdity of using a complex, rigid, delicate machine for a task that a human does with a simple squeeze of the fingers. Motors are powerful, but they are also heavy, noisy, and fragile. They hate heat. They loathe dust. They are useless in a vacuum. And they have never, ever been able to mimic the graceful, adaptable strength of biological muscle.
Until now. Scientists have developed Harp actuators – flexible, air‑powered structures that mimic real muscles. They expand and contract using small amounts of air, shortening and bulging much as a bicep does when it flexes. They are lightweight, quiet, and highly adaptable. And they do things that no electric motor can dream of. A robot equipped with Harp actuators can lift up to 100 times its own weight. It can operate in extreme environments – high heat, abrasive conditions, even underwater. It can squeeze through tight spaces that would trap a rigid machine. And a robotic arm inspired by an elephant trunk, built with these actuators, can reach around obstacles with high precision.
This is not an incremental improvement. It is a change in the strength equation – a redefinition of what a robot can be. Motors are powerful but fragile. Muscles are weaker per gram but far more adaptable. Harp actuators promise the best of both: the strength of motors (lift 100 times your weight) with the flexibility of muscle (squeeze through a gap, wrap around an obstacle). The equation has shifted. The old trade‑off is dead.
How Harp actuators work (without the heavy maths)
Imagine a party balloon. You blow it up, and it expands. You let the air out, and it contracts. Now imagine that the balloon is shaped not as a sphere, but as a long, folded tube with rigid constraints on the sides. When you pump air in, the tube can expand in only one direction, so it swells across its width and shortens along its length, pulling whatever is attached to it. When you release the air, it relaxes. This is a pneumatic artificial muscle. It has been around for decades, but early versions were weak, slow, and hard to control.
Harp actuators are different. They are made of advanced materials that can withstand thousands of pressurisation cycles without wearing out. They are designed with internal chambers that optimise the force‑to‑weight ratio. And they are controlled by fast, precise valves that can modulate air pressure hundreds of times per second. The result is a muscle that contracts quickly, with high force, and can hold a position without constant air consumption.
The computing perspective is crucial here. Controlling a Harp actuator is not like controlling a motor. A motor has a roughly linear relationship between current and torque. A pneumatic muscle has a non‑linear, time‑varying relationship between air pressure, contraction, and force. Traditional control theory struggles with this. But modern machine learning – specifically, reinforcement learning – can learn to control it. A robot with Harp actuators can be trained in simulation, just like a motorised robot, but the resulting policies are far more adaptable because the actuator itself is adaptable.
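To see why pneumatic muscles are awkward to control, compare two textbook idealisations. The motor line assumes torque is proportional to winding current. The muscle line uses the classic static model of a McKibben braided muscle, an older cousin of the technology described here. I have no details of how Harp actuators are built, so treat the muscle function as a stand‑in. The dimensions, braid angle and pressure are assumptions too, and the model ignores the time‑varying effects entirely.

```python
import math


def motor_torque(current_a: float, torque_constant: float = 0.1) -> float:
    """Idealised DC motor: torque (N*m) is proportional to current (A)."""
    return torque_constant * current_a


def mckibben_force(pressure_pa: float, contraction: float,
                   rest_radius_m: float = 0.01,
                   braid_angle_deg: float = 20.0) -> float:
    """Static McKibben muscle model:
    F = pi * r0^2 * P * (a * (1 - eps)^2 - b),
    with a = 3 / tan^2(alpha0) and b = 1 / sin^2(alpha0),
    where eps is the fractional contraction and alpha0 the resting braid angle."""
    alpha = math.radians(braid_angle_deg)
    a = 3.0 / math.tan(alpha) ** 2
    b = 1.0 / math.sin(alpha) ** 2
    area = math.pi * rest_radius_m ** 2
    return area * pressure_pa * (a * (1.0 - contraction) ** 2 - b)


PRESSURE = 300_000.0  # 3 bar above ambient, an illustrative value

for current in (1.0, 2.0, 3.0):
    print(f"motor  at {current:.0f} A: {motor_torque(current):.2f} N*m")

for eps in (0.0, 0.1, 0.2, 0.3):
    force = mckibben_force(PRESSURE, eps)
    print(f"muscle at {eps:.0%} contraction: {force:,.0f} N")
```

The motor numbers climb in even steps. The muscle's pull fades as it shortens, and it fades by a different amount at every step, even with the pressure held constant. Add air compressibility and valve delays, which this model leaves out, and it becomes clearer why a learned controller is attractive.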
A British example: the bomb disposal robot
Imagine a bomb disposal unit in London. A suspicious package is left in a crowded tube station. The current robot – a wheeled machine with rigid arms – can approach the package, but it cannot squeeze through the narrow gap between the ticket machine and the wall. It cannot reach around the corner to get a better view. It cannot climb the stairs to the upper platform. It is rigid, and the environment is not.
Now imagine a Harp‑actuated robot, built like an elephant trunk. It slithers through the gap, its flexible body conforming to the space. It reaches around the corner, using its precision grip to manipulate a camera into position. It climbs the stairs by expanding and contracting its segments, inchworm‑style. It approaches the package and, if necessary, lifts it – the robot weighs 10 kilograms, but its artificial muscles can lift 1,000 kilograms. It carries the package to a safe location. No explosion. No casualties. Just a robot that did what no rigid machine could do.
That is the promise of Harp actuators. Not just strength. Not just flexibility. Strength and flexibility together – a combination that has never existed in robotics before.
The adage that captures it
“Softly, softly, catchee monkey.” The old saying means that gentle, patient, adaptable methods often succeed where brute force fails. Harp actuators are the embodiment of “softly, softly”. They are soft – literally. They are adaptable. They can squeeze, wrap, and conform. And they can catch the monkey – or lift the bomb – because their strength is not compromised by their flexibility.
Another saying: “Where there’s a will, there’s a way.” The will is the demand for robots that can operate in human environments – cluttered, fragile, unpredictable. The way is Harp actuators. They provide the physical capability that rigid motors cannot. The will is there. The way is now open.
Extreme environments and the end of the clean room
Traditional robots require clean, temperature‑controlled, dust‑free environments. A factory floor is manageable. A construction site is marginal. A disaster zone – fire, flood, rubble – is impossible. Motors overheat. Bearings clog. Electronics short.
Harp actuators have no bearings. They have no gears. They are made of flexible materials that can be sealed against dust and moisture. They operate on air – the same air that is everywhere. If the air is hot (a fire), the actuator still works – within limits. If the air is abrasive (a sandstorm), the actuator can be protected by a simple flexible sleeve. If the actuator is damaged – a tear, a puncture – it might still function, because the air chambers are redundant. A motor with a broken gear is dead. A muscle with a small hole is just a slightly weaker muscle.
This robustness changes where robots can go. A Harp‑actuated robot could be dropped into a nuclear reactor core (high heat, high radiation) to perform repairs. It could be sent into a deep sea trench (high pressure, corrosive salt) to inspect cables. It could crawl through a collapsed building (dust, sharp edges, unpredictable geometry) to find survivors. No other actuation technology offers this combination of strength, flexibility, and environmental tolerance.
The computing perspective: embodiment as algorithm
There is a concept in artificial intelligence called embodied cognition – the idea that the shape and material of a body influence the thinking of the brain. A robot with rigid motors thinks differently from a robot with soft muscles. The rigid robot must plan every move precisely, because a mistake could break a gear. The soft robot can learn by physical trial and error, because the muscles are forgiving.
Harp actuators enable a new kind of embodied AI. The robot can learn to move not by avoiding collisions, but by embracing them – because a soft actuator can bump into a wall without damage. It can learn to grasp not by computing the exact force, but by feeling the object through the compliance of the muscle. The actuator itself becomes a sensor. The robot’s body becomes part of its computational process.
This is why the elephant trunk arm is so significant. An elephant trunk has no bones. It is a muscular hydrostat – a mass of muscle and connective tissue that can bend in any direction, apply force, and sense pressure simultaneously. The Harp‑actuated arm mimics this. It can wrap around a pole, slide along its length, and squeeze with just the right force to hold on without crushing. That is not a programmed behaviour. That is an emergent property of the actuator’s physical design, combined with a control policy that has learned to exploit that design.
The bottom line
Harp actuators are flexible, air‑powered structures that mimic real muscles. They allow robots to lift up to 100 times their own weight, operate in extreme environments, and squeeze through tight spaces. A robotic arm inspired by an elephant trunk can reach around obstacles with high precision. This is not a small improvement. It is a change in the strength equation – a redefinition of what a robot can be and where it can go.
As the old adage goes, “A new broom sweeps clean.” Harp actuators are a new broom – a new technology that sweeps away old limitations. The limitation of rigidity. The limitation of clean environments. The limitation of fragile gears and bearings. The broom is sweeping. The floors are getting cleaner – even the rubble‑strewn, dust‑filled, impossible floors that no robot has ever crossed before.
For the United Kingdom, with its ageing infrastructure, its disaster response needs, and its world‑class robotics research, Harp actuators are not just an interesting development. They are a tool. A tool that can climb stairs, squeeze through gaps, and lift a car. The only question is what we build with it. And the answer, if history is any guide, will be something that no one has imagined yet. That is the nature of a new broom. It sweeps in ways you do not expect.
34.“If you can’t stand the heat, get out of the kitchen” – but this robot was born in the heat and moves because of it.
For centuries, engineers have been obsessed with the motor. A motor is a beautiful thing: electricity in, rotation out, torque on demand. But a motor is also a compromise. It has bearings that wear, windings that overheat, magnets that demagnetise, and a thousand other failure modes. A motor is a collection of parts, assembled with tolerances, each part a potential point of failure. What if you could build a robot that had no motors at all? What if the robot’s body itself was the actuator – moving not because of spinning magnets, but because of the gentle, controlled application of heat?
Researchers at Princeton University have done exactly that. They have built a robot using liquid crystal elastomer – a material that contracts or bends when heat is applied, depending on how it is printed. No motors. No gears. No bearings. Just a single, integrated structure that moves because its molecules rearrange themselves when warmed.
And here is the leap: they integrated flexible circuit boards during the printing process. The robot is not assembled from separate parts – motor, controller, sensors, chassis. It is printed as one system, all at once. The circuits are embedded in the material. The temperature sensors are printed in place. The robot emerges from the printer ready to move, with nothing to screw together, nothing to calibrate, nothing to lubricate.
The robot includes temperature sensors and closed‑loop control. It can sense its own temperature, compare it to the desired state, and adjust the heat input in real time to maintain accuracy. It is a self‑regulating, motorless, gearless machine that moves like an origami creature coming to life.
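For readers who want to see what "closed-loop control" means in practice, here is a rough sketch in Python. It is not the Princeton team's controller. It assumes a simple proportional-integral (PI) loop and a first-order thermal model, and the heat capacity, heat loss, gains and power limit are all hypothetical values chosen for illustration.

```python
def simulate_heater(target_c, ambient_c=22.0, steps=600, dt=0.1,
                    kp=0.8, ki=0.4, max_power_w=5.0):
    """PI control of a toy thermal model; returns the temperature trace (deg C)."""
    heat_capacity = 2.0   # J/K, hypothetical
    loss_coeff = 0.05     # W/K lost to the surroundings, hypothetical
    temp = ambient_c
    integral = 0.0
    trace = []
    for _ in range(steps):
        error = target_c - temp                  # desired minus sensed temperature
        power = kp * error + ki * integral       # PI control law
        clamped = min(max(power, 0.0), max_power_w)  # heater cannot cool or exceed its limit
        if power == clamped:
            integral += error * dt               # only integrate when not saturated
        temp += (clamped - loss_coeff * (temp - ambient_c)) / heat_capacity * dt
        trace.append(temp)
    return trace

if __name__ == "__main__":
    trace = simulate_heater(60.0)
    print(f"final temperature: {trace[-1]:.2f} C")
```

The loop does the three things described above: sense the temperature, compare it with the desired state, and adjust the heat input. Freezing the integral term while the heater is at its limit is a common way to stop the controller overshooting badly once it catches up.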
How heat‑based movement works (without the advanced chemistry)
Liquid crystal elastomer is a remarkable material. Imagine a rubber band made of tiny rod‑shaped molecules that are all aligned in the same direction. When you heat it, the rods become disordered – they wiggle and twist. But because they are cross‑linked into a polymer network, the disorder causes the material to contract along the direction of alignment. Cool it, and the rods re‑align, and the material expands back to its original length. This is not melting. It is a phase change within the solid state – a transition between ordered and disordered configurations.
The Princeton team took this principle and added a crucial twist: they printed the material in patterns. By controlling the alignment of the rods during printing – using techniques similar to those in liquid crystal displays – they could make different parts of the same structure contract in different directions when heated. A flat sheet printed with one pattern might curl into a cylinder. Another pattern might cause a twisting motion. Another might produce a bending hinge. The shape change is programmed into the material itself, not into external motors.
The researchers demonstrated this with an origami‑inspired structure that flaps like a crane. A simple application of heat – from a tiny embedded resistor, or from ambient warmth – causes the structure to fold and unfold repeatedly. The motion is smooth, silent, and free of the jerky starts and stops of a motorised joint.
The integrated printing breakthrough
The real genius of this work is not the material – liquid crystal elastomers have been known for years. It is the integration of electronics during printing. The team used a custom 3D printer to deposit not just the elastomer, but also conductive traces, flexible circuit boards, and temperature sensors. Everything is printed in one continuous process. The robot that comes out is a single, seamless object. There are no assembly steps. There is no alignment of motor shafts. There are no screws to tighten or solder joints to fail.
This is additive manufacturing taken to its logical extreme. Most 3D printing produces a plastic shape that then has to be populated with electronics. This printing produces the electronics and the structure and the actuators simultaneously. The circuits are embedded within the elastomer, protected from the environment. The temperature sensors are placed exactly where they need to be, because they were printed there. The closed‑loop control system – a simple microcontroller, also printed (or placed, in this prototype) – reads the sensors and drives the heaters.
From a computing perspective, this integration eliminates the hardware‑software boundary that has plagued robotics. In a traditional robot, the software runs on a separate computer, sending commands to separate motor controllers, reading separate sensors. Latency, noise, and interference are constant problems. In the Princeton robot, the sensors are embedded in the actuator material. The heater is embedded in the same material. The control loop can be extremely fast and local, because the distances are millimetres, not metres. The robot’s body is its control system.
A British example: the endoscopic surgical tool
Imagine a patient in a British hospital needing a delicate procedure deep inside the body – say, removing a polyp from the colon. Traditional endoscopic tools are rigid or semi‑rigid, controlled by cables and pulleys. They work, but they are clumsy. They cannot bend around corners easily. They cannot adjust their stiffness on the fly. And they are made of metal, which limits their compatibility with MRI or other imaging.
Now imagine a surgical tool printed from liquid crystal elastomer, with embedded heaters and temperature sensors. It is soft, flexible, and biocompatible. The surgeon controls it not by pulling cables, but by applying small electrical currents to specific heaters. The tool bends, twists, and contracts – exactly where needed. Because the material is soft, it cannot damage delicate tissues. Because it has no motors, it produces no electrical interference. The closed‑loop control ensures that the tool holds its shape despite body heat or fluid flow.
After the procedure, the tool is simply withdrawn. It is disposable – or, perhaps, it is biodegradable, dissolving harmlessly. That is the promise of heat‑based, motorless robots. Not just for factories, but for the inside of the human body.
The adage that captures it
“Waste not, want not.” Traditional robots waste energy overcoming the friction of gears and bearings. They waste weight carrying heavy motors. They waste complexity in assembly. The Princeton robot wastes nothing. The material itself does the work. The heat is applied only where needed. The structure is printed as one piece, with no excess material. Waste not, want not – and the robot does not want for much.
Another saying: “Still waters run deep.” On the surface, the robot is simple – a flat sheet of rubbery material that curls when warmed. But the depth is in the materials science, the printing technique, the closed‑loop control. Still waters run deep. Simple appearance, profound engineering.
The computing perspective: programming the material
Traditional computing programs transistors – switches that are either on or off. The Princeton robot programs molecules – liquid crystal rods that can be ordered or disordered. This is a shift from digital to analogue computing, from discrete to continuous control. The material’s response to heat is not binary. It is a smooth function of temperature, with hysteresis and time dependence. Programming the robot means designing the printing pattern – the alignment of the rods – to produce a desired shape change when heated. That is a computational problem: given a target motion, what pattern of molecular alignment yields it?
The team developed mathematical models, inspired by origami design, to solve this problem. They can predict how a printed pattern will deform under heat. They can optimise the pattern to achieve a specific hinge angle, a specific curling radius, a specific twisting motion. The material becomes the algorithm. The robot’s body is its code.
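The flavour of that inverse problem can be shown with a far simpler stand-in than the team's own models, which are not reproduced here. The Python sketch below assumes a hinge made of two equally thick, equally stiff layers, one of which contracts when heated. For that special case the classic bimetal-strip result (Timoshenko) gives a curvature of 1.5 times the strain mismatch divided by the total thickness. The function names and the example numbers are illustrative assumptions.

```python
import math

def hinge_angle_deg(strain_mismatch, hinge_length_mm, thickness_mm):
    """Forward model: fold angle of a two-layer hinge (equal layers assumed)."""
    curvature = 1.5 * strain_mismatch / thickness_mm   # 1/mm
    return math.degrees(curvature * hinge_length_mm)

def hinge_length_for_angle(target_deg, strain_mismatch, thickness_mm):
    """Inverse design: how long must the hinge be to fold by target_deg?"""
    curvature = 1.5 * strain_mismatch / thickness_mm
    return math.radians(target_deg) / curvature

if __name__ == "__main__":
    length = hinge_length_for_angle(90.0, strain_mismatch=0.2, thickness_mm=0.5)
    print(f"hinge length for a 90 degree fold: {length:.2f} mm")
```

The forward function predicts the deformation from the design, and the inverse function picks a design for a desired deformation. Real patterns are far richer, but the question has the same shape.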
This is physical computing – not computation on silicon, but computation on matter. The robot does not have a processor that runs a program. It is the program, embodied in the alignment of its molecules. That is a radical departure from every robot that has come before.
The bottom line
Princeton researchers built a robot using liquid crystal elastomer that contracts or bends when heat is applied, depending on how it is printed. They integrated flexible circuit boards during the printing process, so everything is built as one system instead of assembled afterward. The robot includes temperature sensors and closed‑loop control, adjusting itself in real time to maintain accuracy. No motors. No gears. No assembly. Just heat, material, and motion.
As the old adage goes, “A new invention is a new way of doing things that people said couldn’t be done.” People said you could not build a robot without motors. Princeton did. People said you could not print electronics and actuators in one go. Princeton did. The heat‑based robot is not a replacement for all robots – it is slow, and it requires a controlled thermal environment. But for applications where silence, softness, and simplicity matter – medical devices, inspection tools, toys – it is a revolution. And revolutions, as we know, always begin with a little heat.
35.“Trust takes years to build, seconds to break, and forever to repair” – and Google’s twenty‑year reign as the internet’s gatekeeper is facing its seconds.
For the better part of two decades, the routine was as familiar as the morning cuppa. You opened a browser, typed a few words into a white box, and pressed “Enter”. Google’s search bar was the undisputed gateway to the internet. It was the front door to knowledge, the map to every website, the referee that pointed you to the answer you needed. You trusted it. Billions trusted it. That trust was Google’s most valuable asset – more precious than its algorithms, more durable than its data centres.
That trust is now under siege. OpenAI, with its conversational models, offers a different proposition: answers instead of links. You do not ask Google for a list of websites and then click through to find what you need. You ask ChatGPT, and it gives you the answer directly – synthesised, summarised, and presented in plain English. No clicking. No sorting. No judgement about which source is reliable. Just an answer. And for many people, that is faster and easier than a list of blue links.
This is an existential crisis for Google. The company’s entire ad‑based business model – the engine that generates over a hundred billion pounds a year – depends on those links. Every search is an opportunity to show sponsored results. Every click is a transaction. If users stop clicking because the answer is already provided, the money stops flowing. OpenAI does not need to beat Google at search quality. It just needs to make search unnecessary for a growing number of queries.
The counteroffensive: personal intelligence as a moat
Google’s response is not to build a better chatbot. It is to build a personal intelligence – an AI that knows not just the world’s information, but your information. Your emails. Your calendar. Your location history. Your search history. Your Gmail attachments. Your Google Photos. Your saved passwords. The vast, intimate digital diary that you have entrusted to Google over twenty years.
OpenAI cannot match that. It does not have your emails. It does not know where you were last Tuesday. It has not seen the photos of your children. It is a brilliant generalist, but it is a stranger to your life. Google, by contrast, is the digital keeper of your personal history. It knows who you email, where you travel, what you search for at 3 AM. That is not a privacy violation – it is a trust relationship. A contested one, but a relationship nonetheless.
The counteroffensive is to weave that personal data into every interaction. You ask Google a question, and the answer is not just factual – it is tailored. “What time should I leave for the airport?” Google looks at your calendar (flight at 4 PM), your location (home), traffic conditions, and gives you a personalised answer. OpenAI can only give you a generic rule of thumb. Google’s answer is more useful because Google knows you.
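The airport example boils down to simple arithmetic once the personal data is available. A minimal sketch in Python follows. The date, the 55-minute journey and the two-hour airport buffer are assumptions invented for illustration, and nothing here reflects how Google actually computes such an answer.

```python
from datetime import datetime, timedelta

def leave_time(flight_departure, travel_minutes, airport_buffer_minutes=120):
    """Latest sensible departure from home, given flight time and journey time."""
    return flight_departure - timedelta(minutes=travel_minutes + airport_buffer_minutes)

if __name__ == "__main__":
    flight = datetime(2026, 3, 10, 16, 0)   # "flight at 4 PM", from the calendar
    print(leave_time(flight, travel_minutes=55).strftime("%H:%M"))  # prints 13:05
```

The arithmetic is trivial. The advantage lies in the inputs: the flight time comes from your calendar, the starting point from your location, and the journey time from live traffic.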
This is personal intelligence – not general knowledge, but specific, contextual, individual knowledge. It is Google’s moat. And it is the only defence against the chatbot invasion.
A British example: the commute from hell
Imagine you live in Croydon and commute to Canary Wharf every morning. You ask Google: “How is the traffic today?” Google knows your route because it has watched you travel it three hundred times. It knows the Jubilee line is partially suspended – it read the TfL alert from your email. It knows you have a meeting at 9 AM because your calendar says so. It tells you: “Leave at 7:45 instead of 8:00. Take the tram to Beckenham and then the Overground – it will save you 22 minutes.”
You ask OpenAI the same question. It tells you: “Traffic in London is generally heavy between 8 and 9 AM. Consider public transport.” That is true, but it is not personal. It does not know your specific route, your calendar, or the suspension you have not yet noticed. Google’s answer is better. Google’s answer keeps you using Google.
That is the counteroffensive. Not to out‑chat the chatbots, but to out‑know them.
The adage that captures it
“Familiarity breeds contempt.” Google is familiar. It has been the gateway for twenty years. That familiarity breeds not contempt, but complacency – and the threat of disruption. OpenAI is new. It is not familiar. It has to earn trust from scratch. But it also has no baggage. No history of privacy scandals. No ad‑based conflicts of interest. It is a blank slate.
Another saying: “A bird in the hand is worth two in the bush.” Google has the bird in the hand – your personal data, your calendar, your emails. OpenAI has two in the bush – the promise of better answers, cleaner interfaces, no ads. Which is worth more? Google is betting that the bird in the hand is priceless. OpenAI is betting that the bush is full of birds no one has caught yet.
The computing perspective: the shift from retrieval to generation
Traditional search is retrieval. You ask a question, Google retrieves the most relevant documents from its index, ranks them, and presents a list. The user does the work of reading, selecting, synthesising. It is a library card catalogue, scaled to the world.
Conversational AI is generation. You ask a question, the model generates an answer from its internal representation of knowledge. The user does no work. The answer is ready. It is a research assistant, a personal tutor, a know‑it‑all friend.
Google built its empire on retrieval. But retrieval has a ceiling. Users want answers, not links. OpenAI proved that. Now Google must respond by making retrieval smarter – by retrieving not just web pages, but personal data, and then generating answers that synthesise both. That is a harder problem than pure generation. It requires privacy, security, and the user’s ongoing trust. Google has those, for now.
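A toy version of "retrieve from both the web and your personal data, then generate" might look like the Python sketch below. It is only an illustration of the pipeline's shape. The scoring is crude word overlap, the "generation" step is a placeholder string where a real system would call a language model, and the document sources are made up.

```python
def score(query, text):
    """Count how many distinct query words appear in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query, documents, k=2):
    """Return up to k documents that share at least one word with the query."""
    ranked = sorted(documents, key=lambda d: score(query, d["text"]), reverse=True)
    return [d for d in ranked[:k] if score(query, d["text"]) > 0]

def answer(query, web_docs, personal_docs):
    """Retrieve from both corpora, then hand the context to a (stubbed) generator."""
    context = retrieve(query, web_docs) + retrieve(query, personal_docs)
    sources = ", ".join(d["source"] for d in context)
    return f"Answer drafted from: {sources}"

if __name__ == "__main__":
    web = [{"source": "tfl", "text": "jubilee line partially suspended"},
           {"source": "blog", "text": "best pubs in soho"}]
    personal = [{"source": "calendar", "text": "meeting canary wharf 9am"}]
    print(answer("jubilee line to canary wharf", web, personal))
```

The second corpus is the one a general-purpose chatbot does not have, which is the whole of Google's argument.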
The contested territory
The trust interface is contested. Two decades of Google’s dominance are being challenged not by a better search engine, but by the obsolescence of search itself. Why search when you can ask? OpenAI offers answers, not links. Google’s counteroffensive is personal intelligence – leveraging the intimate data you have already entrusted to it. The battle is not about algorithms. It is about who you trust with your life.
As the old adage goes, “A house divided against itself cannot stand.” Google’s house is divided between its legacy ad business and its future personal intelligence. OpenAI’s house is still being built. The contest is not resolved. But the terrain has shifted. The gateway to the internet is no longer a simple search bar. It is a conversation – and the winner will be the one you trust to have that conversation with your most private self.
36.“The early bird catches the worm” – but the worm is your attention, and the bird is an algorithm that knows you better than you know yourself.
For generations, the morning routine was simple. You woke up, made tea, read the newspaper, checked the post. The first information of the day was chosen by you – which section to read, which letter to open. You were in control. Then came email. Then came smartphones. Then came notifications. The morning became a fire hose of alerts, messages, and to‑do lists. You were no longer in control. You were reacting.
Google’s Daily Brief agent is designed to change that – but not by giving you back control. By taking it away more elegantly. The Daily Brief synthesises information from your inbox, your calendar, and your tasks, organising it by topic and suggesting next steps. It is designed to be your first stop every morning. You open it before your email, before your calendar, before anything else. It tells you what you need to know. It tells you what you need to do. It does not ask for permission. It just presents.
This is not assistance. This is behavioural infrastructure – the underlying system that shapes how you start your day, how you allocate your attention, how you decide what matters. It is the front door to your digital life. And once you walk through that door, you are in Google’s house, following Google’s signs, accepting Google’s priorities as your own.
What the Daily Brief actually does
Let me describe a typical morning with the Daily Brief. You wake up, open the Gemini app (or a dedicated widget), and see a personalised digest. It is not a list of unread emails in chronological order. It is a synthesised summary:
“You have a meeting with Sarah at 10 AM. The agenda is attached. She mentioned she might be late – check your email for details.”
“Your flight to Edinburgh is at 2 PM. Check in now. The weather forecast is rain, so pack an umbrella.”
“Three bills are due this week. Two have been paid automatically. One requires your approval – tap here.”
“Your favourite coffee shop has a loyalty reward expiring today. Would you like directions?”
Everything is organised by topic, not by source. The email from Sarah is not buried under a dozen newsletters. The flight confirmation is not lost in a folder. The bills are not scattered across different accounts. The coffee reward is not hiding in an app you forgot you had. The Daily Brief pulls it all together, in one place, before you have even asked.
The next steps are suggested, not just listed. “Check in now.” “Pay this bill.” “Reply to Sarah.” Each suggestion is a single tap or click. The friction of navigating between apps, searching for information, deciding what to do – that friction is removed. You do not decide what matters. The algorithm decides. You just execute.
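Structurally, "organised by topic, not by source" with a suggested next step attached is a small data-shaping job. The Python sketch below is a guess at that shape only. The field names are hypothetical and it says nothing about how the Daily Brief is really built.

```python
from collections import defaultdict

def build_brief(items):
    """Group items by topic, pairing each summary with one suggested next step."""
    brief = defaultdict(list)
    for item in items:
        brief[item["topic"]].append(f'{item["summary"]} -> {item["next_step"]}')
    return dict(brief)

if __name__ == "__main__":
    items = [
        {"source": "email", "topic": "travel",
         "summary": "Flight to Edinburgh at 2 PM", "next_step": "Check in now"},
        {"source": "bank", "topic": "bills",
         "summary": "One bill needs approval", "next_step": "Pay this bill"},
        {"source": "calendar", "topic": "travel",
         "summary": "Rain forecast", "next_step": "Pack an umbrella"},
    ]
    for topic, lines in build_brief(items).items():
        print(topic, lines)
```

Notice that the source field is read in but never shown. Where a piece of information came from disappears, and only the topic and the action remain.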
Why this is behavioural infrastructure
Behavioural infrastructure is the invisible architecture that shapes human behaviour. A well‑designed staircase encourages walking. A well‑placed sign encourages a particular route. A well‑timed notification encourages a specific action. The Daily Brief is behavioural infrastructure because it frames your reality before you have a chance to frame it yourself.
The first information you see sets the agenda for the day. If the Daily Brief highlights a work deadline, that becomes the priority. If it highlights a personal errand, that rises in importance. If it omits something – because the algorithm judged it irrelevant – that something may never get done. You are not choosing what matters. You are responding to what the algorithm chose for you.
This is subtle. You still feel in control. You can ignore a suggestion. You can scroll past a summary. But the very act of making the Daily Brief your first stop means you have outsourced the framing of your day to Google. You are not the architect of your attention. You are the tenant.
A British example: the commuter’s morning
Imagine a commuter in Sevenoaks, catching the 7:30 AM train to London Bridge. Every morning, they open the Daily Brief while waiting for the platform announcement. Today, the Brief says: “Your 7:30 train is delayed by 25 minutes. The 7:42 is on time. You have a meeting at 9 AM. If you take the 7:42, you will arrive at 8:54 – enough time. Would you like to rebook your seat reservation?”
The commuter taps “rebook”. Done. The Brief also notes: “You forgot to buy a present for your niece’s birthday – it is today. There is a toy shop near the station that opens at 8 AM. Tap for directions.” The commuter taps. The Brief has changed not just their route, but their behaviour. They buy the present. They feel grateful to the algorithm. They did not have to think.
Now imagine the Brief did not mention the present. The commuter would have forgotten entirely. The algorithm decided that the present was important enough to surface. But what about the phone call to the elderly mother? What about the gym session that was not on the calendar? What about the novel they wanted to read on the train? The Brief did not mention those. They fell into the background. The algorithm set the agenda. The commuter followed.
That is behavioural infrastructure in action. Not coercion. Just structuring the choice environment so that some actions are easier, more visible, and more likely to be taken. The commuter still feels free. But their freedom is exercised within a cage built by Google.
The adage that captures it
“A place for everything and everything in its place.” The Daily Brief puts everything in its place – but Google decides what “its place” means. The algorithm decides which tasks are important enough to surface, which can be deferred, which can be ignored. The user does not have to sort. They just have to execute. Efficiency is the goal. Autonomy is the casualty.
Another saying: “The shoemaker’s son always goes barefoot.” Google organises the world’s information, but does it organise its own priorities transparently? The Daily Brief is not a neutral assistant. It is a product designed to keep you inside the Google ecosystem. The shoemaker’s son – the user – goes without a choice of their own, because the algorithm has already chosen for them.
The computing perspective: attention as a resource to be optimised
In computing, we talk about resource allocation. A computer has finite CPU cycles, memory, bandwidth. The operating system decides which processes get priority. The Daily Brief is an operating system for your attention. Your attention is a finite resource. There are only so many hours in a day, only so many decisions you can make, only so many tasks you can complete. The Daily Brief allocates your attention to the tasks that Google’s algorithms deem most important.
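The operating-system analogy can be sketched as a priority scheduler with a fixed budget. The Python below is an illustration of the analogy and not of Google's ranking. The priorities, durations and greedy strategy are all assumptions.

```python
import heapq

def allocate_attention(tasks, budget_minutes):
    """Greedily surface the highest-priority tasks that fit the time budget."""
    # heapq is a min-heap, so priorities are negated; the index breaks ties.
    heap = [(-t["priority"], i, t) for i, t in enumerate(tasks)]
    heapq.heapify(heap)
    surfaced, remaining = [], budget_minutes
    while heap:
        _, _, task = heapq.heappop(heap)
        if task["minutes"] <= remaining:
            surfaced.append(task["name"])
            remaining -= task["minutes"]
    return surfaced

if __name__ == "__main__":
    tasks = [
        {"name": "meeting prep", "priority": 9, "minutes": 30},
        {"name": "pay bill", "priority": 7, "minutes": 5},
        {"name": "call mother", "priority": 3, "minutes": 20},
        {"name": "read novel", "priority": 2, "minutes": 40},
    ]
    print(allocate_attention(tasks, budget_minutes=40))
```

With a 40-minute budget, the scheduler surfaces the meeting and the bill and silently drops the phone call and the novel. Nothing is forbidden. It simply never appears.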
This is not malevolent. It is efficient. The algorithms are optimising for your stated goals – the meetings you scheduled, the bills you need to pay, the flights you booked. But your stated goals are not all of you. They are the digital traces you left behind. The Daily Brief cannot see your unstated intentions, your quiet longings, your spontaneous desires. It only sees what is already in your inbox and calendar. It optimises for the past. It does not create the future.
Yet by becoming your first stop every morning, it shapes your future. You do what it suggests. You become the person the algorithm expects you to be. That is the power of behavioural infrastructure. It does not force you. It invites you to become more efficient, more organised, more productive – and in doing so, to surrender the messiness of human choice to the tidiness of machine optimisation.
The bottom line
Google’s Daily Brief agent synthesises information from your inbox, calendar, and tasks, organising it by topic and suggesting next steps. It is designed to be your first stop every morning. This is not assistance. This is the front door to your digital life – behavioural infrastructure that shapes how you start your day, allocate your attention, and decide what matters. It is convenient. It is efficient. And it is a subtle surrender of autonomy.
As the old adage goes, “He who pays the piper calls the tune.” Google pays the piper – with free services, with seamless integration, with the promise of a less stressful morning. But the tune it calls is the one that keeps you inside its walls, following its suggestions, accepting its priorities. The Daily Brief is a beautiful, useful, dangerous piece of behavioural infrastructure. The only question is whether you will notice the tune before you start humming along.
37.“Look before you leap” – but your agent can look at a million options before you take a single step.
There is a famous British saying: “Buy cheap, buy twice.” It warns that a bargain today often becomes a disappointment tomorrow. But what if you had an assistant that could not only find you the best price, but also check the compatibility of every component, track price drops over time, and handle the payment without you lifting a finger? That is not a distant dream. It is agentic commerce – and it is about to change shopping more fundamentally than the shift from the high street to the internet.
Google has introduced three interlocking pieces of infrastructure that make agentic commerce possible: the Universal Commerce Protocol (UCP), the Agent Payments Protocol (AP2), and the Universal Cart. Together, they create an open standard for agent‑to‑merchant transactions – meaning your AI agent can shop on your behalf, across different merchants, comparing prices, checking compatibility, and completing purchases, all under your direction and within your boundaries.
The Universal Cart is the most visible piece. It works across merchants and services. You can add items when browsing search, chatting with Gemini, watching YouTube, or reading Gmail. The moment you add a product, the cart goes to work in the background – finding deals, tracking price drops, giving you insights on price history, and alerting you when something comes back in stock. It runs on Google’s Gemini models, so it gets smarter as the models improve.
But the real magic is the intelligent reasoning. Here is the example that should make every PC builder smile. You are building your first custom computer. You add a motherboard with great reviews. You already picked out a processor. What you did not realise is that the processor needs a motherboard with a different type of socket. Your Universal Cart catches this – before you buy – and suggests an alternative. It prevents a problem you did not see coming. That is not just a shopping cart. That is a knowledgeable shopping companion.
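For readers who want to see how simple the core of that check can be, here is a minimal sketch in Python. It is my own illustration, not Google's implementation: the Part record, the function names and the example products are all hypothetical, and a real cart would presumably reason over far messier product data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    kind: str     # "cpu" or "motherboard"
    socket: str   # e.g. "AM5" or "LGA1700"


def socket_conflicts(cart):
    """Return (cpu, motherboard) pairs in the cart whose sockets differ."""
    cpus = [p for p in cart if p.kind == "cpu"]
    boards = [p for p in cart if p.kind == "motherboard"]
    return [(c, b) for c in cpus for b in boards if c.socket != b.socket]


def suggest_boards(cpu, catalogue):
    """Return catalogue motherboards whose socket matches the CPU."""
    return [p for p in catalogue
            if p.kind == "motherboard" and p.socket == cpu.socket]


# Hypothetical products, invented for illustration.
cart = [
    Part("Example CPU", "cpu", "AM5"),
    Part("Example Board A", "motherboard", "LGA1700"),
]
catalogue = [
    Part("Example Board B", "motherboard", "AM5"),
    Part("Example Board C", "motherboard", "LGA1700"),
]

for cpu, board in socket_conflicts(cart):
    print(f"{board.name} ({board.socket}) does not fit {cpu.name} ({cpu.socket})")
    print("Try instead:", [b.name for b in suggest_boards(cpu, catalogue)])
```

The hard part is not the comparison. It is knowing, reliably, which socket every product on sale actually uses.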
How the protocols work (without the jargon)
The Universal Commerce Protocol (UCP) does for agent commerce what HTTP did for the web. It gives agents and merchants a common language to talk to each other. An agent can ask a merchant: “What are your prices? Do you have this item in stock? Can you deliver by Tuesday?” The merchant can answer in a standard format that any agent understands. No more screen‑scraping. No more broken automations. Just clean, machine‑readable commerce.
The Agent Payments Protocol (AP2) handles the money. The number one question with agent payments is: “How do I know it won’t go off and buy something I don’t want?” AP2 solves this with boundaries and accountability. You tell your agent the specific brands and products you want, and how much you want to spend. It automatically makes the purchase only if the criteria are met. Every transaction is recorded in a tamper‑proof digital trail. You can see exactly what the agent did, when, and for how much. Your payment info stays shielded. The merchant sees only what they need to see.
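The two ideas in that paragraph, spending boundaries and a verifiable trail, can be sketched in a few lines of Python. This is not the AP2 specification, which I have not reproduced here. The Mandate and AuditTrail names, the brand names and the record fields are all hypothetical. Strictly speaking, the hash chain below is tamper‑evident rather than tamper‑proof: it cannot stop someone editing a record, but it makes the edit detectable.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Mandate:
    """The boundaries a user sets for the agent (hypothetical structure)."""
    allowed_brands: frozenset
    max_price_pence: int


def within_mandate(mandate, brand, price_pence):
    """True only if the purchase satisfies every boundary."""
    return brand in mandate.allowed_brands and price_pence <= mandate.max_price_pence


class AuditTrail:
    """Append-only log in which every entry commits to the one before it."""

    def __init__(self):
        self.entries = []

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        self.entries.append({"record": record, "prev": prev_hash, "hash": digest})

    def verify(self):
        """Recompute every hash; any edited record breaks the chain."""
        prev_hash = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical brands and prices, in pence.
mandate = Mandate(frozenset({"BrandA", "BrandB"}), 50_000)
trail = AuditTrail()
for brand, price in [("BrandA", 48_000), ("BrandC", 30_000), ("BrandB", 52_000)]:
    decision = "buy" if within_mandate(mandate, brand, price) else "refuse"
    trail.append({"brand": brand, "price_pence": price, "decision": decision})
```

In this toy run the agent buys the first item, refuses the second because the brand is not on the list, and refuses the third because it is over budget. Changing any stored record afterwards makes verify() return False.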
Together, UCP and AP2 turn your AI agent into a trusted shopping assistant – one that can act on your behalf, within your rules, across the entire internet.
A British example: buying a new washing machine
Imagine your washing machine breaks down on a Sunday evening. You need a replacement by Tuesday, because the family has run out of clean uniforms for school. You open Gemini and say: “Buy me a new washing machine. Energy rating A or better. Capacity at least 8 kg. Budget £500. Delivery by Tuesday.”
Your agent springs into action. It searches across Currys, AO, John Lewis, Argos, and smaller independent retailers. It checks stock levels, delivery slots, and prices. It finds a machine that meets your criteria at John Lewis for £480, with free delivery on Tuesday. It also finds a similar machine at AO for £460, but AO cannot deliver until Wednesday. The agent presents you with the options. You choose the John Lewis machine. The agent completes the purchase using your saved payment details. You get a confirmation. The whole process takes 90 seconds.
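The filtering step in that story is easy to picture in code. A rough Python sketch follows, using the two offers from the example. The Offer record and the meets_brief function are my own invention, and I have assumed that both machines are A‑rated with an 8 kg drum, which the scenario implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    retailer: str
    price_gbp: int
    energy_rating: str   # "A" is best, "G" is worst
    capacity_kg: int
    delivery_day: int    # days from Sunday: Monday = 1, Tuesday = 2, Wednesday = 3


def meets_brief(offer, max_price, worst_rating, min_capacity, latest_day):
    """Check one offer against every constraint in the user's request."""
    return (offer.price_gbp <= max_price
            and offer.energy_rating <= worst_rating   # "A" sorts before "B"
            and offer.capacity_kg >= min_capacity
            and offer.delivery_day <= latest_day)


offers = [
    Offer("John Lewis", 480, "A", 8, 2),   # delivers Tuesday
    Offer("AO", 460, "A", 8, 3),           # cheaper, but delivers Wednesday
]

shortlist = [o for o in offers if meets_brief(o, 500, "A", 8, 2)]
near_misses = [o for o in offers if o not in shortlist]

print("Meets the brief:", [o.retailer for o in shortlist])
print("Near misses:", [o.retailer for o in near_misses])
```

Keeping the near misses, rather than silently discarding them, is what lets the agent show you the cheaper AO machine and leave the final call to you.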
Now imagine you had not specified a budget. The agent would have found the best value, not the cheapest. It would have factored in energy savings over five years, noise levels, and customer reviews. It would have made a recommendation, not just a list. That is agentic commerce. It is not faster checkout. It is no checkout – because the agent handles everything.
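The difference between cheapest and best value comes down to simple arithmetic. Here is a small Python sketch that adds five years of electricity to the purchase price. Every figure in it is invented for illustration, and it deliberately ignores the noise levels and reviews that a real recommendation would also weigh.

```python
def five_year_cost(price_gbp, kwh_per_year, pence_per_kwh, years=5):
    """Purchase price plus electricity over the period, in pounds."""
    return price_gbp + years * kwh_per_year * pence_per_kwh / 100


# All figures below are hypothetical.
cheap = five_year_cost(price_gbp=400, kwh_per_year=200, pence_per_kwh=25)
efficient = five_year_cost(price_gbp=460, kwh_per_year=120, pence_per_kwh=25)

print(f"Cheaper machine over five years: £{cheap:.0f}")
print(f"Efficient machine over five years: £{efficient:.0f}")
```

On these made‑up numbers the machine that costs £60 more at the till works out £40 cheaper over five years (£610 against £650).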
The adage that captures it
“A penny saved is a penny earned.” The Universal Cart saves you pennies by tracking price drops, finding deals, and preventing incompatible purchases. Those pennies add up. And the time you save – not having to search, compare, and double‑check – is pennies earned in a different currency.
Another saying: “Measure twice, cut once.” Agentic commerce measures a thousand times – across a thousand merchants, a thousand products, a thousand price histories – before you make a single purchase. The cut is perfect because the measurement was exhaustive. You cannot do that yourself. Your agent can.
The computing perspective: from pull to push
Traditional shopping is pull. You go to a merchant, you search, you compare, you buy. You pull the information to you. Agentic commerce is push. You set your preferences, and the agent pushes relevant options to you, when they become available, at the best price. The cart does not wait for you to check back. It watches prices continuously. When a price drops below your threshold, it alerts you – or buys immediately, if you have given permission.
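The push model boils down to one rule applied every time a new price arrives. A minimal Python sketch, with a hypothetical function name and an invented price feed:

```python
def on_price_update(price_gbp, threshold_gbp, auto_buy):
    """Decide what the watcher does when a new price arrives."""
    if price_gbp >= threshold_gbp:
        return "wait"                       # not yet below the user's threshold
    return "buy" if auto_buy else "alert"   # act only within granted permission


price_feed = [529, 515, 499, 489]   # invented daily prices in pounds

actions = [on_price_update(p, threshold_gbp=500, auto_buy=False) for p in price_feed]
print(actions)   # ['wait', 'wait', 'alert', 'alert']
```

The auto_buy flag is the whole argument in miniature: one boolean separates an assistant that tells you from an assistant that acts for you.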
This is the difference between a library (you go and find the book) and a subscription (the books come to you). Agentic commerce is the subscription model for shopping. You do not shop. You set the rules, and the shopping happens to you. That is more efficient. It is also a surrender of the serendipity, the browsing, the joy of discovery. Whether that trade‑off is worth it is a personal choice. But the technology does not ask. It just enables.
The bottom line
The Universal Commerce Protocol and Agent Payments Protocol create an open standard for agent‑to‑merchant transactions. The Universal Cart works across merchants and services, finding deals, tracking price drops, and applying intelligent reasoning – catching that a processor needs a motherboard with a different socket before you buy. Agentic commerce changes shopping fundamentally. It is faster, smarter, and more efficient. It is also a shift from you controlling the process to you setting the boundaries and letting the agent run.
As the old adage goes, “There ain’t no such thing as a free lunch.” Agentic commerce is not free. It costs you your direct involvement, your spontaneous browsing, your hands‑on control. In return, it saves you time and money. The lunch is not free – but it is very, very cheap. And for many people, that is a price worth paying. The agent is ready. The protocols are live. The cart is waiting. All you have to do is trust it.
38.“Running to stand still” – the watermarkers are sprinting, but the forgers are already at the finish line.
There is a classic fable, usually credited to Aesop, about the tortoise and the hare. The hare sprints ahead, takes a nap, and loses. In the race between AI generation and AI detection, there is no nap. The hare – generative AI – is sprinting faster every day. The tortoise – detection and watermarking – is also sprinting, but it started from behind and the gap is not closing. It is widening.
Consider the numbers. SynthID, Google’s watermarking tool, has now watermarked over 100 billion images and videos along with 60,000 years of audio assets. That sounds impressive. It is. But those watermarks are only useful if people check for them – and most people do not. Research shows that people can correctly identify high‑quality deepfake videos only about a quarter of the time. That is worse than random guessing. You would do better flipping a coin.
The generation capabilities are improving exponentially. The detection tools are improving linearly. The gap between what a model can create and what a human (or even another model) can reliably detect is growing. OpenAI, Kakao, and ElevenLabs have now adopted SynthID – a welcome step. But adoption does not solve the fundamental problem: detection is always racing behind generation. By the time you have a watermark for one type of deepfake, the next generation of models has already learned to evade it.
What SynthID actually does (and does not do)
SynthID embeds a digital watermark directly into the pixels of an image or the waveform of an audio file. The watermark is invisible to the human eye – it does not affect quality. But a detector can read it and confirm that the content was generated by a specific AI model. It is like a hidden signature baked into the creation process.
This works well for content generated by models that voluntarily include the watermark. It does nothing for content generated by models that do not. And because the mark lives in the pixels or the waveform rather than in the metadata, it is designed to survive routine handling such as cropping or recompression, but it is not indestructible. A motivated bad actor can degrade or remove a watermark with sufficiently aggressive editing. The watermark is a deterrent, not a fortress.
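To make the idea of a signature hidden in the pixels concrete, here is a toy watermark in Python. It is emphatically not SynthID's method, which I am not in a position to reproduce. It is a textbook‑style keyed pattern added to a list of pixel values, and every name and number in it is an assumption made for illustration. It shows three things: a detector with the right key finds the mark, a detector with the wrong key does not, and a heavy‑handed blur washes the mark out.

```python
import random


def pattern(key, n):
    """Pseudo-random +1/-1 pattern derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(pixels, key, strength=2):
    """Nudge each pixel up or down by `strength`, following the keyed pattern."""
    pat = pattern(key, len(pixels))
    return [min(255, max(0, p + strength * s)) for p, s in zip(pixels, pat)]


def score(pixels, key):
    """Correlation between the mean-centred pixels and the keyed pattern."""
    mean = sum(pixels) / len(pixels)
    pat = pattern(key, len(pixels))
    return sum((p - mean) * s for p, s in zip(pixels, pat)) / len(pixels)


def blur(pixels, window=5):
    """Crude smoothing: replace each pixel with the average of its neighbours."""
    half = window // 2
    out = []
    for i in range(len(pixels)):
        chunk = pixels[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


DETECTION_THRESHOLD = 1.0   # arbitrary cut-off chosen for this toy


def is_marked(pixels, key):
    return score(pixels, key) > DETECTION_THRESHOLD


# A stand-in "image": 40,000 random grey values.
rng = random.Random(0)
original = [rng.randint(50, 200) for _ in range(40_000)]
marked = embed(original, key=1234)

print("unmarked image flagged:", is_marked(original, 1234))
print("marked image flagged:", is_marked(marked, 1234))
print("marked image, wrong key:", is_marked(marked, 9999))
print("marked image after blur:", is_marked(blur(marked), 1234))
```

A change of two grey levels per pixel is far too small to see, yet it adds up to a clear statistical signal across tens of thousands of pixels. The same statistics explain the weakness: anything that averages neighbouring pixels together averages the signal away.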
The research finding – that people can identify high‑quality deepfakes only 25% of the time – is the real alarm bell. It means that for the average person scrolling through social media, a convincing deepfake is indistinguishable from reality. The human eye is not the right tool for this job. We evolved to detect subtle social cues, not pixel‑level artefacts. The forgers have already won the battle for human perception.
Why detection can never catch up
From a computing standpoint, generation and detection are an arms race – and the arms race favours the attacker. Here is why. A generative model can be trained on a vast dataset of real images. It learns the statistical distribution of pixels, textures, and shapes. To create a deepfake, it simply samples from that distribution. The result is statistically similar to a real image. A detector must find the subtle differences between the distribution of real images and the distribution of generated images. But as the generative models improve, those differences shrink. The detector’s job becomes harder with every new model release.
Moreover, the attacker has a feedback loop. They can run their generated images through the detector, see which ones are flagged, and adjust their generation process to evade detection. This is an adversarial loop – a close cousin of the adversarial training used to harden machine‑learning models against manipulated inputs, but applied to deception. The defender’s feedback loop is weaker. They can generate fake images with today’s models, run them through their detector, and adjust it – but they cannot train against the next generation of fakes, because they do not know what those will look like.
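The attacker's loop is short enough to write down. In this deliberately crude Python sketch, the detector is a stand‑in that flags anything whose artefact level is above a fixed threshold, and the forger simply keeps cleaning up the fake until the flag disappears. Real detectors and real generators are vastly more complicated, and the names and numbers here are all hypothetical.

```python
def detector(artefact_level, threshold=30):
    """Stand-in detector: flags content whose artefact level exceeds a threshold."""
    return artefact_level > threshold


def evade(start_level, step=10, max_rounds=20):
    """Query the detector repeatedly, cleaning up the fake a little after each flag.

    Returns the artefact level that slipped through and how many times
    the fake was flagged along the way.
    """
    level = start_level
    for rounds in range(max_rounds):
        if not detector(level):
            return level, rounds
        level = max(0, level - step)
    return level, max_rounds


final_level, times_flagged = evade(start_level=90)
print(f"Evaded detection at artefact level {final_level} "
      f"after being flagged {times_flagged} times")
```

The forger never needs to understand how the detector works. A yes or no answer, asked often enough, is all the feedback the loop requires.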
The result: generation pulls further ahead with every iteration. Detection lags. The adoption of SynthID by major players like OpenAI, Kakao, and ElevenLabs is necessary, but it is not sufficient. It is like putting a lock on your front door when the burglar has already learned to pick locks and also has a key to the back door.
A British example: the deepfake Prime Minister
Imagine a video surfaces of the Prime Minister, recorded on a phone in what appears to be a private meeting, saying something deeply damaging – perhaps a racist remark, perhaps a confession of corruption. The video is shared millions of times on social media within hours. News outlets rush to cover it. Opponents demand resignation.
Three days later, experts confirm it is a deepfake. The watermark was stripped. The quality was high enough to fool journalists. But the damage is done. The denial gets a fraction of the views. The public’s trust in the Prime Minister – and in all video evidence – is shattered. Even real videos from the future will be dismissed as “probably fake”.
That is the world we are entering. Not because the technology is perfect, but because the asymmetry favours the attacker. Creating a convincing fake takes minutes. Debunking it takes days, and the debunk never reaches the same audience. The watermark is a tool for after the fact, but by then, the horse has already bolted.
The adage that captures it
“Closing the stable door after the horse has bolted.” SynthID is a stable door. It is well‑built, well‑designed, and widely adopted. But the horse – the generation of convincing deepfakes – bolted years ago. The watermark can help identify content that was generated by cooperating models, but it cannot unring the bell of a viral fake.
Another saying: “A chain is only as strong as its weakest link.” The chain of trust for digital media has many links: capture, storage, distribution, display. The watermark is one link. The weakest link is the human brain, which cannot reliably distinguish a high‑quality deepfake from reality. No watermark can fix that.
The computing perspective: watermarking as a social contract, not a technical solution
Watermarking works only if everyone plays by the same rules. If every major AI model embeds a watermark, and every major platform checks for watermarks, then unwatermarked content is suspicious by default. That is a social contract, not a technical guarantee. It requires co‑ordination, enforcement, and trust.
The adoption of SynthID by OpenAI, Kakao, and ElevenLabs is a step toward that contract. But the contract is incomplete. What about models from China? What about open‑source models that anyone can download and run without watermarks? What about heavy editing that degrades the watermark until no detector can read it? The contract has holes. And the bad actors are already driving trucks through them.
The research finding – 25% correct identification – is a measure of human vulnerability. It is also a measure of the futility of relying on human judgment. The solution cannot be better training for humans. The solution must be technical: better detection, better watermarking, better provenance. But those technical solutions are always playing catch‑up. The hare is not napping. The tortoise is doing its best. But the race is rigged.
The bottom line
SynthID has watermarked over 100 billion images and videos along with 60,000 years of audio assets. Research shows people can correctly identify high‑quality deepfake videos only about a quarter of the time. OpenAI, Kakao, and ElevenLabs have now adopted SynthID, but the detection tools are always racing behind the generation capabilities. The horse has bolted. The stable door is closed, but the horse is in the next county.
As the old adage goes, “Prevention is better than cure.” The cure – watermarking, detection, debunking – is struggling. The prevention – stopping the creation of convincing deepfakes – is impossible in an open society. So we are left with the cure, knowing it is insufficient. The only comfort is that the problem is recognised, and the tools are improving. But improving from a baseline of 25% accuracy is a long, hard climb. The hare is sprinting. The tortoise is trying. And the finish line keeps moving.
39.“The road less travelled is now a GPS detour” – and thinking for yourself is becoming an act of rebellion.
There is a quiet disappearance happening, and most of us have not noticed because the disappearance is of something we took for granted: the unaugmented human experience. The simple, unmediated act of thinking for yourself, getting lost, making an unoptimised decision, following a whim without a recommendation – these are becoming radical acts in a world built for efficiency.
The tech titans are no longer looking for your approval. They are looking for your footprint. Every click, every scroll, every pause is data. Every decision you make is compared to a model of what you should do. Every path you take is measured against the optimal route. The neural network is not just a tool you use. It is a lens you see through – and it is always between you and the result.
Whether you are applying for a loan, playing a video game, or asking a question about the world, there is now an AI standing between you and the outcome. Your loan is approved or denied by a model that has never met you. Your game adapts its difficulty to keep you engaged, not to challenge you. Your question is answered not by a human expert, but by a language model that has read the internet. The unaugmented experience – the raw, unfiltered, human‑only interaction – is disappearing.
The three losses: thought, navigation, decision
Let me name three specific losses.
First: thinking for yourself. When every question can be answered instantly by an AI, the discipline of sitting with a problem, struggling with it, and arriving at your own conclusion becomes unnecessary – and therefore rare. Why wrestle with a difficult text when ChatGPT can summarise it? Why struggle with a maths problem when an app can solve it? The muscle of independent thought atrophies from disuse. Thinking for yourself becomes a luxury, a hobby, an eccentricity. It is no longer the default.
Second: getting lost. There is a peculiar joy in wandering without a destination, in taking a wrong turn and discovering something unexpected. That joy is being engineered out of existence. Maps reroute you around traffic. Recommendations show you what you already like. Social media algorithms feed you more of the same. The unplanned, the serendipitous, the accidental – these are inefficiencies to be eliminated. Getting lost is a bug, not a feature. But bugs are also surprises. And surprises are the spice of life.
Third: making an unoptimised decision. Not every choice needs to be the best one. Sometimes you buy the slightly more expensive plane ticket because you like the airline. Sometimes you order the less healthy meal because it tastes better. Sometimes you take the longer route because the scenery is prettier. These suboptimal choices are expressions of preference, of personality, of humanity. In a world where every decision is nudged towards efficiency, the suboptimal becomes suspicious. Why would you choose that? The algorithm would not have chosen it. Something must be wrong with you.
A British example: the Sunday drive
Picture a family in the Cotswolds on a Sunday afternoon. The father says, “Let’s go for a drive.” No destination. No satnav. Just a tank of petrol and a sense of adventure. They take a wrong turn, find a village fete, eat mediocre cake, and laugh about it for years. That is the unaugmented experience.
Now picture the same family today. The father opens Google Maps. The app suggests a route to a highly rated attraction. The children complain they want to go home. The algorithm optimises for arrival time, not for serendipity. The family follows the blue line. They do not get lost. They do not discover anything. They arrive, take photos, leave. The memory is efficient. It is also forgettable.
The Sunday drive is not illegal. But it is countercultural. In a world of efficiency, the aimless wander is an act of defiance. The father who turns off the satnav and takes a random turning is a rebel. His rebellion is quiet, harmless, and increasingly rare.
The computing perspective: the interface as filter
Every neural network that stands between you and the world is a filter. It selects what you see, what you hear, what you are told. A search engine filters the web. A recommendation engine filters products. A credit scoring model filters financial opportunities. These filters are not neutral. They are optimised for someone’s definition of value – usually efficiency, engagement, or profit.
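A few lines of Python show how much rides on that definition of value. The three items and their scores below are invented, and a real ranking system weighs thousands of signals rather than one, but the principle holds: the same catalogue, sorted by a different objective, gives a different front page.

```python
# Hypothetical items with invented scores between 0 and 1.
items = [
    {"title": "Calm explainer", "relevance": 0.9, "engagement": 0.3},
    {"title": "Outrage clip", "relevance": 0.4, "engagement": 0.9},
    {"title": "Balanced report", "relevance": 0.8, "engagement": 0.5},
]


def top(items, objective, k=2):
    """Return the titles of the k items that score highest on the chosen objective."""
    ranked = sorted(items, key=lambda item: item[objective], reverse=True)
    return [item["title"] for item in ranked[:k]]


print("Optimised for relevance:", top(items, "relevance"))
print("Optimised for engagement:", top(items, "engagement"))
```

Ranked by relevance, the outrage clip never appears. Ranked by engagement, it comes first and the calm explainer vanishes. The user sees only one of those lists and has no way of knowing the other existed.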
The user experience is seamless. You do not see the filter working. You only see the result. The loan is approved. The game level loads. The answer appears. The filter is invisible, which makes it powerful. You cannot question what you cannot see. You cannot push back against an interface that does not show its workings.
This is the disappearance of the unaugmented experience. It is not that you are forced to use the filters. It is that the unfiltered alternatives are no longer available. You cannot apply for a loan without a credit score computed by a model. You cannot play a modern video game without an engagement‑optimising AI. You cannot ask a question of the world without a search engine or a language model. The neural network is not optional. It is the only door.
The adage that captures it
“You can’t unring a bell.” The bell of augmentation has rung. The unaugmented human experience is not coming back. You cannot return to a world without search engines, without recommendation algorithms, without AI credit scores. The technology is too useful, too integrated, too profitable. The bell is rung. Now we live with the echo.
Another saying: “All that glitters is not gold.” The glitter of efficiency – faster answers, optimised routes, personalised recommendations – is real. But it is not gold. Gold is the slow, messy, inefficient human experience. The glitter is seductive. The gold is precious. The tragedy is that we are trading gold for glitter and calling it progress.
The radical act of being human
In a world where every interaction is mediated by a neural network, the unaugmented human experience becomes a political statement. Thinking for yourself, without an AI summariser, is a choice. Getting lost, without a map, is a choice. Making an unoptimised decision, without a recommendation, is a choice. Each choice is a small rebellion against the tyranny of efficiency.
These rebellions are not grand. They are not marches or manifestos. They are quiet, personal, and easily dismissed. But they are also the last refuges of autonomy. The tech titans have your footprint. They have your attention. They have your data. They do not have your soul – not yet. The soul lives in the inefficient, the unoptimised, the spontaneous. It lives in the Sunday drive, the wrong turn, the mediocre cake.
The bottom line
The unaugmented human experience is disappearing. Thinking for yourself, getting lost, making an unoptimised decision – these become radical acts in a world built for efficiency. The tech titans are no longer looking for your approval. They are looking for your footprint. Whether you are applying for a loan, playing a video game, or asking a question about the world, there is now a neural network standing between you and the result.
As the old adage goes, “Use it or lose it.” The ability to think unaided, to navigate without a map, to choose without a recommendation – these are muscles that atrophy from disuse. If we do not exercise them, we will lose them. And once lost, they may never be regained. The choice is ours. But the time to choose is now. The bell has rung. The echo is fading. And the road less travelled is being paved over with optimisation.
40.“Who watches the watchmen?” – and who decides which watchmen get to watch you?
For years, the conversation about artificial intelligence has been technological. Can we build it? How fast will it run? Will it take our jobs? These are engineering questions, and they have engineering answers. Faster chips. Better algorithms. More data. The answers have come, and they have been astonishing.
But we have crossed a threshold. The question is no longer technological. It is political. Not in the sense of party politics or Westminster gossip, but in the oldest sense of the word: the organisation of power, the distribution of trust, the boundaries of human autonomy. Who will you trust to be the interface between you and the world? That is not a question for engineers. It is a question for citizens, for families, for every person who wakes up and reaches for a screen.
The technology works. It can answer your questions, book your travel, manage your calendar, approve your loan, even fold your laundry. The remaining puzzles are not about capability. They are about consent, control, and consequence. Which companies have earned your trust with your most sensitive data? Which interfaces are you willing to integrate into your daily workflow? And – most painfully – which parts of your decision‑making, your creativity, your personal relationships must remain fundamentally human and unaugmented?
The political question of trust
Trust is not a technical protocol. You cannot download it, patch it, or encrypt it. Trust is a relationship – fragile, earned over years, shattered in moments. When you ask Google to manage your calendar, you are trusting Google with your time. When you ask Amazon to remember your payment details, you are trusting Amazon with your money. When you ask OpenAI to answer your private questions, you are trusting OpenAI with your thoughts.
These companies are not charities. They are businesses. Their interests are not identical to yours. A conflict of interest is not a conspiracy – it is a structural reality. Google wants to sell ads. Amazon wants you to buy more. OpenAI wants to monetise your attention. Their interests align with yours most of the time, because happy customers are profitable customers. But when they diverge – when the most helpful answer is not the most profitable one – who wins?
The political question is: do you trust them to resolve that conflict in your favour? And if not, what alternatives do you have? The open web, the federated network, the self‑hosted service – these are not technical luxuries. They are political necessities. They are the infrastructure of distrust, the insurance policy against corporate capture.
Which interfaces earn your trust?
Let me list the interfaces that are already woven into British life, and ask: which ones have earned your trust?
The search bar – Google, Bing, DuckDuckGo. Do you trust them to give you true answers, not profitable ones?
The voice assistant – Alexa, Siri, Google Assistant. Do you trust them to listen only when you wake them?
The social feed – Facebook, Twitter, TikTok. Do you trust them to show you what matters, not what engages?
The navigation app – Google Maps, Waze, Apple Maps. Do you trust them to take you where you want to go, not where they want you to go?
The email client – Gmail, Outlook. Do you trust them to sort your messages without reading them for advertising purposes?
The health tracker – Fitbit, Apple Health. Do you trust them with your heart rate, your sleep patterns, your most intimate biology?
The banking app – your high street bank’s app. Do you trust them with your money – and with the AI that decides your credit?
For each of these, the answer is likely: “I trust them enough to use them, but not enough to stop worrying.” That is the British compromise. We are polite. We are pragmatic. We are also deeply suspicious. The question is whether that suspicion is enough to change our behaviour. For most of us, it is not. Convenience wins. Trust loses. And the interface tightens its grip.
A British example: the GP appointment
Imagine you need to see your GP. You call the surgery at 8 AM, wait on hold, and eventually book an appointment. That is the traditional interface – human, inefficient, but direct. Now imagine an AI‑powered interface. You open an app. It asks you questions about your symptoms. It triages you. It books an appointment with the right clinician, or suggests a pharmacy, or tells you to go to A&E. The AI is faster. It is also a filter. It decides what the doctor sees, what the system prioritises, what you are told.
Do you trust the AI? The NHS says it is safe. The developers say it is accurate. But trust is not about averages. It is about exceptions. The AI that works for 99% of patients fails for 1%. If you are that 1%, the efficiency is a catastrophe. The political question is: who bears the risk? Who is accountable when the AI is wrong? The doctor? The developer? The government? Or you, for having trusted the machine?
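The arithmetic of exceptions is worth spelling out. In the Python sketch below, the figure of one million triage sessions is purely hypothetical, chosen to show the scale effect. It is not an NHS statistic.

```python
def expected_failures(patients, success_rate_percent):
    """Number of patients the system is expected to get wrong."""
    return patients * (100 - success_rate_percent) // 100


# Hypothetical volume: one million triage sessions at 99% accuracy.
failures = expected_failures(patients=1_000_000, success_rate_percent=99)
print(f"{failures:,} patients mis-triaged")   # 10,000 patients mis-triaged
```

A one per cent failure rate sounds reassuring until it is multiplied by the number of people passing through the system.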
This is not a hypothetical. Triage algorithms are already in use. The interface between you and your healthcare is already augmented. The question is not whether the technology works. It is whether the accountability works. And on that front, the answer is much less clear.
The boundaries of the unaugmented
The hardest question is the most personal: which parts of your life must remain unaugmented? Which decisions should no algorithm make? Which relationships should no interface mediate? Which experiences should no machine optimise?
For me, the list includes:
The decision to have a child – no algorithm’s recommendation.
The choice of a life partner – no compatibility score.
The creation of art – no prompt, no generation, just the struggle of the human hand.
The act of forgiveness – no calculation of costs and benefits.
The moment of grief – no measured, appropriate response.
These are not efficient. They are not optimised. They are the messy, irrational, beautiful core of being human. A world that augments everything leaves no room for them. A world that trusts the interface for everything surrenders them to the machine.
The political question is not whether we can build a better interface. We can. The question is whether we should – and where we must draw the line. The line is not technical. It is ethical. It is personal. It is, in the end, a choice.
The adage that captures it
“A fool and his money are soon parted.” But the currency is no longer money. It is attention, privacy, autonomy, dignity. The fool is not the one who trusts too much, but the one who does not realise they are trusting at all. The interfaces are seamless. The trust is invisible. The parting is silent.
Another saying: “Better the devil you know than the devil you don’t.” The devil we know is Google, Amazon, OpenAI. They are imperfect, but they are familiar. The devil we don’t is the unknown – the startup that might sell your data, the open‑source model that might have backdoors, the foreign government that might be listening. The political choice is not between good and evil. It is between known risks and unknown ones.
The computing perspective: trust as a system property
In computing, trust is not a feeling. It is a system property – something you can design for, measure, and verify. A trusted system has transparency (you can see how it works), accountability (you can assign responsibility when it fails), and auditability (you can check its history). The interfaces we use today have none of these. They are black boxes. You cannot see how the search rank is computed. You cannot assign blame when the credit score is wrong. You cannot audit the recommendation engine’s training data.
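Designing for those three properties is not exotic. As a rough illustration, here is what a decision record might look like if it were built to be challenged. The DecisionRecord structure and its field names are my own assumptions, not any real lender's or vendor's format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    subject: str             # who the decision is about
    outcome: str             # e.g. "declined"
    reasons: tuple           # transparency: the factors that drove the outcome
    responsible_party: str   # accountability: who answers for the decision
    model_version: str       # auditability: which system produced it
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def is_reviewable(record):
    """A record supports a challenge only if all three properties are filled in."""
    return (bool(record.reasons)
            and bool(record.responsible_party)
            and bool(record.model_version))


# Hypothetical examples.
open_decision = DecisionRecord(
    subject="applicant-001",
    outcome="declined",
    reasons=("short credit history", "high existing debt"),
    responsible_party="Example Bank credit team",
    model_version="scoring-model-v3",
)
black_box = DecisionRecord(
    subject="applicant-002",
    outcome="declined",
    reasons=(),
    responsible_party="",
    model_version="",
)

print("Open decision reviewable:", is_reviewable(open_decision))
print("Black box reviewable:", is_reviewable(black_box))
```

Nothing in that record is technically difficult to store. What is missing today is the obligation to store it and to show it to the person it concerns.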
This is not an accident. Transparency is expensive. Accountability is legally complex. Auditability is technically difficult. The companies that build these interfaces have chosen not to prioritise these properties. They have chosen efficiency and engagement instead. That is a political choice. It is not inevitable. It could be different.
The question is whether we – as citizens, as customers, as voters – will demand something different. Will we regulate for transparency? Will we legislate for accountability? Will we fund research into verifiable AI? Or will we accept the black boxes because they are convenient? The technology is ready. The politics are not.
The bottom line
The question is no longer technological. It is political. Who will you trust to be the interface between you and the world? Which companies have earned your trust with your most sensitive data? Which interfaces are you willing to integrate into your daily workflow? Which parts of your decision‑making, your creativity, your personal relationships must remain fundamentally human and unaugmented?
These questions have no single answer. They are personal, contextual, evolving. But they must be asked – and asked again, as the interfaces change and the trust erodes. The technology will not ask them for you. The companies will not ask them for you. Only you can. And only you can decide where to draw the line.
As the old adage goes, “Trust, but verify.” The verifying is now up to us. The trusting is a choice. Choose carefully. The interface is watching. And it is learning from everything you do.
“The genie is out of the bottle” – and it has taken up residence in the wiring of your world.
We began with an adage, and we return to it now. The genie is out of the bottle. But this particular genie does not grant wishes. It does not care about your dreams, your fears, or your sense of wonder. It optimises. It predicts. It replaces. And it does so not with malice – there is no malice in a spreadsheet – but with the cold, indifferent efficiency of a system designed to do exactly what we asked it to do, if not what we actually wanted.
We asked for convenience. It gave us a world where every click is tracked, every pause analysed, every preference harvested. We asked for speed. It gave us a world where waiting two seconds for an answer feels like an eternity. We asked for accuracy. It gave us a world where a machine hallucination can cost you a mortgage. The genie is not evil. It is literal. And we are learning, slowly and painfully, that being literal is more dangerous than being malicious.
The computing reality: a decade of impossible becomes mundane
Let us be honest about where we stand. Ten years ago, the idea of a neural network rewriting its own code while playing Pokémon would have been laughed out of any serious computing department. The transformer architecture – the engine of modern large language models – was first described in 2017. The scaling laws that predict how performance improves with more data and more compute were codified in the years that followed. The sheer computational mass required for a system like Continual Harness – the Princeton system that learns from its own mistakes without resets – was simply not available to academic researchers.
Now it is. Now it is open‑source. Anyone with a decent graphics card and some patience can download the code, train their own self‑improving agent, and watch it figure out how to beat a video game – or optimise a supply chain, or generate convincing fake reviews, or whatever else they choose to point it at. The genie is not locked in a laboratory. It is on GitHub.
The computing reality is this: the capabilities that seemed like the distant future five years ago are working systems today. The gap between research and deployment has collapsed from decades to months. What Princeton demonstrated is not a warning. It is a working prototype, released into the wild, ready for the next person to improve, adapt, and deploy. The only question is what they will deploy it on.
The three blocs: different assumptions, same conclusion
The political reality is more uncomfortable because it is not unified. The world is not building one AI future. It is building three, each with different assumptions about the role of the state, the market, and the individual.
China has built the supply chain. It has the factories, the motors, the sensors, the batteries, the carbon fibre. It has the scale. Its assumption is that the state should direct the development of strategic technologies, that the market will follow, and that the individual’s role is to benefit from the efficiency. That is not a criticism. It is a description. And it has produced 90 per cent of the world’s humanoid robots.
The United States has built the foundational models. It has OpenAI, Anthropic, Google, Meta. It has the algorithms, the talent, the venture capital. Its assumption is that the market should lead, that the state should stay out (mostly), and that the individual’s role is to choose among competing services. That has produced ChatGPT, Gemini, Claude – and the advertising that is now creeping into them.
Europe is building the regulatory framework and the real‑world stress tests. It has the GDPR, the AI Act, ELROB. Its assumption is that the state should protect the individual from the excesses of the market, that technology must be proven safe before deployment, and that rights cannot be traded away for convenience. That has produced slower deployment, but also higher trust – for now.
What unites them is the recognition that the optional era of AI is over. You cannot choose to opt out. The interfaces are everywhere. The neural networks are between you and everything. Even if you never use ChatGPT, the bank uses a model of its own to score your credit. Even if you never buy a humanoid robot, the warehouse that packs your online order uses one. The only choice left is which set of assumptions you want to live under – and even that choice is constrained by geography and by the power of the companies that have already won.
The interconnected global brain
Here is the image that should stay with you. We thought we were building a series of discrete applications and services. A search engine here. A voice assistant there. A robot in a factory. A chatbot on a website. Separate things, separate companies, separate codebases.
We were wrong. We have constructed a single interconnected global brain, and its nervous system is now coming online. The search engine feeds the language model. The language model controls the robot. The robot collects data that trains the next version of the language model. The advertisements pay for the compute. The compute enables the next breakthrough. Every part is connected to every other part. There is no off switch for the brain, because the brain is not one machine. It is millions of machines, running billions of processes, all learning from each other, all improving together.
The lights are turning on across the global tech infrastructure, and the room looks very different from what we expected. We expected a toolkit. We got an ecosystem. We expected tools we could pick up and put down. We got an environment we live inside.
The question that remains
The question is not whether this brain will think. It thinks already. It thinks about your shopping habits. It thinks about your creditworthiness. It thinks about the best route home. The question is whether we will remember that we still have minds of our own.
Your mind is not a computer. It does not optimise for efficiency. It wanders. It gets distracted. It remembers things that never happened and forgets things that did. It is slow, error‑prone, and wonderful. It is also the only thing that cannot be replicated by a neural network – not because the network is not smart enough, but because the network does not have a life to live. It has no childhood, no scars, no loves, no losses. It has patterns. You have a story.
The danger is not that the machines will become too intelligent. The danger is that we will become too machine‑like – that we will accept their optimisations, their efficiencies, their recommendations, and slowly forget that we ever had a choice. The unaugmented human experience is disappearing not because the machines are forcing it out, but because we are giving it away for the price of convenience.
The audit you must perform
You must perform an audit. It is simple, but it is brutal. You must ask yourself:
Which parts of your life are you willing to hand over to the machine? Your calendar? Your route to work? Your news reading? Your social interactions? Your health decisions? Your romantic choices? Where does the handing over stop?
Where do you draw the line? Is it at the bank’s credit scoring model, but not at the doctor’s diagnostic AI? Is it at the voice assistant that sets a timer, but not at the one that records your conversations? The line is personal. But you must draw it, because if you do not, someone else will draw it for you.
Are you certain that line will hold? Not against a hypothetical future, but against the next wave of integration – which is not coming in ten years, but in ten months. The pace is accelerating. The line you draw today will be tested tomorrow. Are you ready to defend it?
Because the genie is not going back in the bottle. It is learning to enjoy the view. And the view is you – your data, your attention, your choices – spread out like a landscape, ready to be optimised, predicted, and replaced. The only question is whether you will be the one looking at the view, or the one being viewed.
The bottom line
The genie is out. It optimises, predicts, replaces. The computing reality is that the capabilities are here, open‑source, ready to be used. The three blocs – China, the US, Europe – are building different futures under different assumptions, but they agree that the optional era is over. We have built a single interconnected global brain, and its nervous system is coming online. The question is whether we will remember that we still have minds of our own.
As the old adage goes, “What cannot be cured must be endured.” We cannot cure the genie. We cannot put it back. But we can endure it with our eyes open, our lines drawn, our minds still our own. The audit is simple. The execution is hard. But it is the only defence we have. Because the genie is learning. And it is enjoying the view.


