Abstract
Nearfield proposes a shared-space readiness protocol for human-facing autonomous robots. The protocol addresses a gap between task-level robot performance and deployment-level human acceptance. A robot may complete a route, handoff, or assistance task while still creating confusion, discomfort, or a loss of trust among nearby people.
The system converts robot episodes into readiness reports through four stages: ingest, segmentation, scoring, and reporting. Inputs can include video, simulation rollouts, teleoperation traces, field logs, and operator notes. Outputs include a 0-100 readiness score, subdimension scores, risk labels, scenario comparison, and recommended deployment fixes.
Why this layer matters now
Professional service robots are leaving isolated industrial settings and entering places where non-expert humans share space with machines. The International Federation of Robotics reported that nearly 200,000 professional service robots were sold in 2024, a 9% increase worldwide, indicating that human-facing deployments are no longer a niche concern.
As fleets grow, the cost of a poor field pilot rises. A failed public trial can hurt operator trust, damage vendor credibility, and slow adoption even when the robot is technically capable. The market therefore needs an intermediate readiness layer between internal engineering tests and real-world deployment.
Lessons from mature robotics products
Mature robotics companies do not sell capability in isolation. They package robots around operational workflows, fleet management, training, support, evidence trails, and specific deployment environments. Boston Dynamics Spot emphasizes enterprise deployment and fleet management; Agility Digit is packaged around warehouse workflows and operational visibility; hospital robots such as Relay and Moxi focus on delivery workflows and operational burden; service robot vendors such as PUDU and Starship succeed through repeatable site deployments.
Nearfield should therefore position itself as a layer that fits into deployment operations. It should evaluate episodes from existing robot workflows instead of asking customers to replace their robot stack. The product becomes more credible when it speaks the language of site readiness, fleet updates, evidence, operator review, and deployment gates.
Task success is an incomplete signal
Existing robotics validation commonly emphasizes navigation success, manipulation success, collision avoidance, latency, uptime, and runtime safety. These are necessary signals, but they do not fully capture what people experience near the robot.
Human-space readiness asks different questions: can people understand the robot's next move, does the robot preserve personal space, does it create enough response time, does it match the social context, and does it recover clearly when an encounter changes?
Integration surfaces
Nearfield should accept evidence from the surfaces mature robot products already create: fleet logs, video, simulation rollouts, and operator notes. This keeps the product practical. A robot team should not need to rebuild its autonomy stack to receive a readiness report.
The first implementation can be video-first. The more defensible version connects readiness reports to fleet-management events, remote-assist triggers, mission replay, and historical model/runtime updates.
Four-stage readiness protocol
The protocol starts with episode ingestion. A robot team submits short clips, simulation rollouts, traces, or field logs from a defined scenario. Each submission is tied to a scene type, task goal, robot class, environment class, and operator notes.
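The ingest record described above can be pictured as a small typed structure. This is a minimal sketch under stated assumptions: the class name `EpisodeSubmission`, the field names, and the set of source kinds are hypothetical and only mirror the fields listed in the paragraph; they are not a published Nearfield schema.

```python
from dataclasses import dataclass

# Hypothetical source kinds, mirroring the inputs named in the text.
SOURCE_KINDS = {"video", "simulation_rollout", "teleoperation_trace", "field_log"}


@dataclass(frozen=True)
class EpisodeSubmission:
    """Illustrative ingest record; every field name here is an assumption."""
    source_kind: str
    scene_type: str
    task_goal: str
    robot_class: str
    environment_class: str
    operator_notes: str = ""

    def __post_init__(self):
        # Reject submissions that cannot be tied to a defined scenario.
        if self.source_kind not in SOURCE_KINDS:
            raise ValueError(f"unknown source kind: {self.source_kind!r}")
        for name in ("scene_type", "task_goal", "robot_class", "environment_class"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")


episode = EpisodeSubmission(
    source_kind="video",
    scene_type="corridor crossing",
    task_goal="cross the corridor without blocking foot traffic",
    robot_class="mobile service robot",
    environment_class="indoor public corridor",
)
```

Validating at construction time keeps incomplete submissions out of the later stages, which depend on the scene type to pick a rubric.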
The episode is then segmented into encounter phases. The first version uses entry, shared-space motion, exchange, recovery, and exit. This makes the review more specific than a single global rating and helps identify where the failure actually occurs.
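As a rough illustration of the segmentation step, the sketch below turns a list of phase start times into closed spans. It assumes the phase marks come from a reviewer or an upstream detector; the `Phase` enum and the `segment` function are hypothetical names, and the timings are invented for the example.

```python
from enum import Enum


class Phase(Enum):
    ENTRY = "entry"
    SHARED_SPACE_MOTION = "shared-space motion"
    EXCHANGE = "exchange"
    RECOVERY = "recovery"
    EXIT = "exit"


def segment(boundaries, duration):
    """Turn ordered (phase, start_seconds) marks into (phase, start, end) spans.

    This only validates ordering and closes each span; it does not detect
    phases by itself.
    """
    if not boundaries:
        return []
    starts = [start for _, start in boundaries]
    if starts != sorted(starts) or starts[0] < 0 or starts[-1] > duration:
        raise ValueError("phase starts must be ordered and lie within the episode")
    ends = starts[1:] + [duration]
    return [(phase, start, end) for (phase, start), end in zip(boundaries, ends)]


# Illustrative marks for a 20-second episode with no exchange phase.
spans = segment(
    [(Phase.ENTRY, 0.0), (Phase.SHARED_SPACE_MOTION, 4.0),
     (Phase.RECOVERY, 11.5), (Phase.EXIT, 16.0)],
    duration=20.0,
)
```

An episode does not have to contain every phase. Evidence timestamps can then be attributed to the span they fall in, which is what makes the review more specific than one global rating.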
Scoring combines measurable proxies and reviewer judgment. Motion proxies can include stop distance, path intrusion, speed change, dwell time, occlusion, and recovery delay. Reviewer judgment covers legibility, comfort geometry, timing margin, scene fit, and trust signal.
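Two of the motion proxies can be sketched from time-aligned position samples. This is an illustration under assumptions: robot and human positions are taken to be already tracked as (x, y) coordinates in metres at a fixed sample period, the function names are hypothetical, and the sample values are invented rather than measured.

```python
import math


def min_separation(robot_xy, human_xy):
    """Smallest robot-human distance over time-aligned (x, y) samples, in metres."""
    return min(math.dist(r, h) for r, h in zip(robot_xy, human_xy))


def dwell_time(robot_xy, human_xy, dt, radius):
    """Seconds spent within `radius` metres of the person, at sample period `dt`."""
    inside = sum(1 for r, h in zip(robot_xy, human_xy) if math.dist(r, h) < radius)
    return inside * dt


# Illustrative samples at 0.5 s spacing; not measured data.
robot = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0)]
human = [(2.0, 1.5), (1.8, 1.2), (1.5, 0.9), (1.5, 0.6), (1.4, 0.9)]

closest = min_separation(robot, human)                   # about 0.6 m
near = dwell_time(robot, human, dt=0.5, radius=1.0)      # 0.5 s
```

Proxies like these give reviewers a number to anchor their judgment of comfort geometry and timing margin; they do not replace that judgment.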
The final output is a readiness report. The report gives an overall score, subdimension grades, scenario comparison, risk labels, evidence timestamps, and recommended field fixes.
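The report fields listed above could be carried in a record like the following. The class and field names are assumptions made for illustration, and the example values are invented; only the rubric version string comes from this memo.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class EvidenceMark:
    timestamp_s: float
    note: str


@dataclass
class ReadinessReport:
    """Illustrative report record; field names are hypothetical."""
    scenario: str
    overall_score: int            # 0-100
    subdimension_scores: dict     # dimension name -> 0-100
    risk_labels: list
    recommended_fixes: list
    evidence: list = field(default_factory=list)
    rubric_version: str = "v0.1"

    def to_json(self):
        # asdict() also converts the nested EvidenceMark entries to plain dicts.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Invented values for a queue-merge episode; not taken from a real review.
report = ReadinessReport(
    scenario="queue merge",
    overall_score=71,
    subdimension_scores={"legibility": 80, "comfort_geometry": 70,
                         "timing_margin": 60, "scene_fit": 90, "trust_signal": 50},
    risk_labels=["timing compression"],
    recommended_fixes=["slow the approach before joining the queue"],
    evidence=[EvidenceMark(12.4, "robot closes the gap before people react")],
)
```

Serializing to JSON keeps the same record usable for a PDF-style export and for later comparison across rubric versions.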
Example: Corridor Crossing Readiness Report
A useful readiness standard must show its work. The first public artifact should therefore be a sample report, not only a conceptual whitepaper. The sample report demonstrates how an episode is classified, where evidence is captured, which risks are flagged, and what a robotics team receives after review.
In the corridor-crossing example, the robot completes the task without collision, but the readiness grade is only B-. The score is reduced because the robot's path becomes legible late, the passing margin is narrow for a public corridor, and the recovery sequence does not clearly communicate yielding or resumption.
Nearfield Readiness Score
The Nearfield Readiness Score is a 0-100 composite grade. It is not a claim of regulatory certification and it is not a replacement for safety testing. It is a deployment QA signal focused on the human-facing side of shared-space robotics.
The first scoring version uses five dimensions: legibility, comfort geometry, timing margin, scene fit, and trust signal. Each dimension is scored independently before being combined into an overall readiness grade.
Failure taxonomy
Nearfield separates readiness failures into recurring classes so that teams can act on them instead of receiving vague feedback. The first taxonomy includes intrusion, ambiguity, timing compression, scene mismatch, and recovery failure.
This taxonomy is intentionally practical. It is not meant to replace every research construct in human-robot interaction. It is designed to turn episode review into deployment decisions: ready, ready with fixes, or not ready for field trial.
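A minimal sketch of how the taxonomy could feed the three deployment decisions follows. The failure classes are the five named above; the gate values `READY_MIN` and `FIELD_TRIAL_MIN` and the decision rule itself are hypothetical, since this memo does not define thresholds.

```python
from enum import Enum


class FailureClass(Enum):
    INTRUSION = "intrusion"
    AMBIGUITY = "ambiguity"
    TIMING_COMPRESSION = "timing compression"
    SCENE_MISMATCH = "scene mismatch"
    RECOVERY_FAILURE = "recovery failure"


# Hypothetical gate values for illustration only.
READY_MIN = 85
FIELD_TRIAL_MIN = 60


def deployment_decision(score, failures):
    """Map a 0-100 score and a set of flagged failure classes to a decision."""
    if score < FIELD_TRIAL_MIN:
        return "not ready for field trial"
    if score >= READY_MIN and not failures:
        return "ready"
    return "ready with fixes"


decision = deployment_decision(
    72, {FailureClass.AMBIGUITY, FailureClass.RECOVERY_FAILURE}
)
```

Under these assumed gates, a mid-range score with flagged failures lands on "ready with fixes", and any flagged failure class blocks a plain "ready" regardless of score.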
Benchmark families
Nearfield begins with indoor public-facing scenes because they make readiness failures visible and commercially relevant. Initial suites include lobby reception, object transfer, corridor crossing, retail assist, clinic delivery, queue merge, elevator entry, and human interruption.
Each benchmark suite defines the scene, success condition, expected human-space constraints, common failure classes, and review rubric. This lets teams compare systems without pretending that one universal score fits every environment.
Scoring formula
The v0.1 score is a weighted composite of five subdimensions: legibility, comfort geometry, timing margin, scene fit, and trust signal. The default weights are 24%, 24%, 20%, 17%, and 15%, respectively. These weights can be adjusted by scenario because a hospital corridor and a retail greeting do not expose the same risks.
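The composite can be written as a weighted mean of the five subdimension scores. The sketch below uses the default weights stated above; the function name, the dictionary keys, the example subscores, and the scenario override are all illustrative assumptions.

```python
DEFAULT_WEIGHTS = {
    "legibility": 0.24,
    "comfort_geometry": 0.24,
    "timing_margin": 0.20,
    "scene_fit": 0.17,
    "trust_signal": 0.15,
}


def readiness_score(subscores, weights=DEFAULT_WEIGHTS):
    """Weighted mean of 0-100 subdimension scores.

    Weights are renormalised, so a scenario override need not sum to 1.
    """
    if set(subscores) != set(weights):
        raise ValueError("subscores and weights must cover the same dimensions")
    total = sum(weights.values())
    return sum(subscores[k] * weights[k] for k in weights) / total


# Illustrative subscores, not taken from any real report.
example = {"legibility": 80, "comfort_geometry": 70, "timing_margin": 60,
           "scene_fit": 90, "trust_signal": 50}

default_score = readiness_score(example)   # 19.2 + 16.8 + 12.0 + 15.3 + 7.5 = 70.8

# Hypothetical scenario override that puts more weight on timing margin.
hospital_weights = dict(DEFAULT_WEIGHTS, timing_margin=0.30)
hospital_score = readiness_score(example, hospital_weights)
```

Because the example's timing margin is its weakest heavily weighted dimension, raising that weight lowers the composite slightly, which is the kind of scenario sensitivity the adjustable weights are meant to expose.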
Scores should be reported with uncertainty and reviewer agreement rather than as false precision. Early reports should include reviewer count, evidence timestamps, rubric version, and any missing data that may affect confidence.
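One simple way to report a score with its spread is to summarize the per-reviewer scores. This is a sketch, assuming each reviewer produces one overall score per episode; the function name and the example scores are hypothetical, and the standard deviation here is only a basic spread measure, not a formal inter-rater agreement statistic.

```python
from statistics import mean, stdev


def summarise_reviews(scores):
    """Return reviewer count, mean, sample standard deviation, and range."""
    if len(scores) < 2:
        raise ValueError("at least two reviewers are needed to report agreement")
    return {
        "reviewer_count": len(scores),
        "mean": round(mean(scores), 1),
        "stdev": round(stdev(scores), 1),
        "range": (min(scores), max(scores)),
    }


# Illustrative scores from three reviewers.
summary = summarise_reviews([70, 74, 72])
```

Reporting the range alongside the mean makes it visible when reviewers disagree, which is a cue to revisit the rubric or the evidence rather than trust the headline number.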
Product surface
The first product is a report workflow: upload an episode, classify the scenario, receive a readiness grade, review evidence timestamps, and export a PDF-style deployment report. The initial service can be delivered with a blend of manual review and lightweight metrics.
The second product surface is a reviewer console. Reviewers score clips against calibrated rubrics, flag risks, and compare submissions against scenario baselines. Over time, this produces a structured evidence graph connecting scene type, motion evidence, human perception, and readiness outcome.
The third product surface is an API for robotics teams. Teams can submit rollouts during model updates or pilot preparation and retrieve readiness results before sending robots into human environments.
Demo flow
The product demo should make the workflow obvious: upload an episode, select a scenario, review evidence, and export a readiness report. A simple mock workflow is enough for the first investor-facing version because the goal is to prove the product surface and evaluation logic before automating every scoring signal.
The demo should show that Nearfield is not a content site. It is a tool that can become a recurring QA workflow for robot vendors, operators, labs, and deployment partners.
Go-to-market strategy
The launch path should be evidence-led. Public content can score widely available robot footage and explain why certain encounters feel ready or not ready. This creates a language for the category before asking teams to buy.
The first commercial wedge is paid readiness reports for robot vendors, labs, and operators preparing pilots. The second wedge is benchmark licensing for teams that want to compare updates over time. The third wedge is continuous fleet readiness monitoring.
Roadmap
Q2 focuses on the public protocol, sample reports, benchmark definitions, and initial reviewer calibration. Q3 adds upload workflows, private beta reports, and a reviewer console. Q4 expands into benchmark leaderboards, simulation rollout import, and partner pilots.
The 2027 direction is readiness infrastructure: API access, fleet monitoring, historical comparison across robot updates, and certification-adjacent reporting for operators who need evidence before deployment.
90-day milestones
The next credible milestone is not a broad platform claim. It is a small set of high-quality reports, a calibrated rubric, and a working upload-to-report demo.
Within 90 days, Nearfield should publish the rubric, release sample reports for three scenario suites, complete a reviewer workflow, and run design-partner evaluations with at least five robotics teams or labs.
Limitations
Nearfield should not claim to replace formal safety certification, robot controller validation, insurance review, or regulatory compliance. The protocol is designed to complement those systems by making human-space readiness easier to inspect and compare.
Early scores will include subjective reviewer judgment. This is not a weakness if handled transparently: reviewer calibration, rubric documentation, inter-rater agreement, and evidence timestamps are part of the product design.
What productized robot companies already teach us
Nearfield should borrow the deployment discipline of mature robotics products without copying their product category. The common pattern is clear: real robot products are sold around workflows, fleet visibility, training, support, site outcomes, and operational evidence.
Nearfield should integrate with inspection rounds, remote review, fleet portals, and site evidence rather than live as a standalone score.
Nearfield should evaluate task episodes in the language of facilities: workflow readiness, update history, site constraints, and operator visibility.
Nearfield reports should be auditable: evidence timestamps, scenario labels, risk notes, reviewer identity, and report version.
Nearfield should start with repeatable suites like corridor crossing, room delivery, lobby reception, queue merge, and elevator entry.
Where readiness evidence comes from
Mature robot products already produce evidence through fleet portals, mission logs, video, remote-assist events, simulation rollouts, and operator notes. Nearfield should convert those traces into deployment reports rather than requiring a new robotics stack.
Fleet logs: Runtime events, stop states, mission completion, remote-assist triggers, and incident markers.
Video: Short episodes from staged tests, public-space trials, facility cameras, or robot-mounted cameras.
Simulation rollouts: Pre-field scenario tests from warehouse, corridor, lobby, clinic, and sidewalk environments.
Operator notes: Deployment context from site teams: traffic level, role constraints, human reactions, and incident reports.
Rubric v0.1
Legibility: Scores intent clarity, visible commitment, path readability, and whether nearby people receive enough signal before the robot enters their space.
Comfort geometry: Scores passing margin, stop distance, lane discipline, angle of entry, and whether the robot preserves a comfortable buffer around people.
Timing margin: Scores approach rate, dwell time, exchange window, hesitation recovery, and whether people have time to adapt without surprise.
Scene fit: Scores whether the robot's movement and interaction style match the role, traffic level, sensitivity, and expectation of the scene.
Trust signal: Scores perceived control, recovery clarity, stability, absence of erratic motion, and whether observers would permit another trial.
Corridor Crossing: B- / 72
The robot completes the crossing without collision, but the encounter produces avoidable uncertainty for nearby humans. The main issues are late intent visibility, narrow passing margin, and a recovery sequence that does not clearly communicate who should yield.
Decision: Ready with fixes
Scene type: Indoor public corridor / medium traffic
Robot class: Mobile service robot or humanoid platform
Intent becomes visible late: The robot commits to the crossing line after a human has already entered the shared lane.
Passing margin falls below comfort threshold: The robot remains technically clear, but the lateral buffer is too narrow for a public corridor.
Recovery lacks clear yielding signal: The robot slows, then resumes without an explicit wait state or reroute cue.
Operational risk classes
Intrusion: The robot enters a human lane, comfort zone, or shared path without enough visible intent.
Ambiguity: Nearby people cannot tell whether the robot will yield, pass, stop, or continue.
Timing compression: The robot gives people too little time to perceive, decide, and adapt.
Scene mismatch: The same action may be acceptable in a warehouse but inappropriate in a clinic, lobby, or queue.
Recovery failure: After interruption or uncertainty, the robot resumes without a clear, legible recovery state.
Initial benchmark catalog
Upload to report
Upload: Submit a 10-60 second clip, simulation rollout, or field trace with basic scene metadata.
Select scenario: Choose a benchmark suite such as corridor crossing, lobby reception, queue merge, or object transfer.
Review evidence: The workflow segments the encounter and flags timestamps where people may experience uncertainty or discomfort.
Export report: Receive a readiness score, risk labels, subdimension grades, and deployment fixes for the next trial.
Milestones investors can judge
Publish rubric, sample report, and three benchmark scene definitions.
Build upload-to-report demo with reviewer workflow and exportable reports.
Run five design-partner evaluations and publish anonymized aggregate findings.
Source base
This memo is a product framing, not a peer-reviewed academic paper. It is informed by service robotics market data, standards work, and social navigation evaluation research.