How fast is robot setup with Aurevix?

Typically hours, not weeks, from the initial recording to a robot running its first task. Traditional setup with a specialist engineer takes 3–5 weeks.

Do I need a robotics engineer to use Aurevix?

No. Any factory worker can set up a robot task. If they can demonstrate a task and describe it out loud, they can use Aurevix. No robotics knowledge, no coding, no special training required.

Which robots does Aurevix support today?

Universal Robots (UR3, UR5, UR10, UR16, UR20) and ABB (GoFa, SWIFTI) are available today. FANUC, KUKA, Techman, and Yaskawa are on the near-term roadmap.

What does Aurevix cost?

Flexible subscription pricing with no per-task fees and no integrator invoices. We offer Starter, Professional, and Integrator tiers — talk to us for specifics.

Can Aurevix handle multi-step tasks?

Yes. You can chain pick, orient, place, and machine-tend into a single program with conditional branches and signal-waits between steps. Multi-step sequencing is a core capability.

What gripper types does Aurevix support?

Aurevix supports pneumatic, electric, and vacuum grippers. Pneumatic grippers make up 55–60% of installed industrial grippers, so we build for the hardware most factories already have.

How does Aurevix understand what I am demonstrating?

Aurevix uses vision-language-action (VLA) models — the same technology behind Google RT-2 and OpenVLA — to interpret your phone video and voice narration, translating them into precise robot motion sequences.

Is my data secure?

Yes. All data is processed in isolated containers with zero persistence. Enterprise customers can deploy on-premises. Aurevix is GDPR compliant and SOC 2 ready.

Can I program a robot without a teach pendant?

Yes. Aurevix replaces the teach pendant with a phone camera and voice. Workers demonstrate the task naturally and Aurevix converts it into robot motion automatically — no specialist training.

Does Aurevix support FANUC robots?

FANUC robot integration is on our near-term roadmap. Currently Aurevix supports Universal Robots and ABB cobots. Join the waitlist at agenticconvergent.com to be notified when FANUC support launches.

Physics-Aware Data Annotation: Why Pixels Alone Aren't Enough for Robot Learning

Imagine two grasp attempts that look identical in images. Same hand pose, same object, same lighting, same background.

But in one case, the robot successfully grasps and holds the object. In the other, it slips and drops it.

A vision-only annotation system would label both identically. A physics-aware system would catch the difference: the failed grasp lasted 0.3 seconds at 15N grip force; the successful one held 50N steady for 2 seconds.

This is the gap between vision-only annotation and physics-aware annotation. And it's the gap between models that work in controlled labs and models that work in the real world.

The Hidden Half of Robotics Data

Most annotation pipelines stop at vision.

Robotics doesn't.

A complete representation of what a robot did includes not just what it saw, but what it felt:

Force and torque feedback: Grip force, insertion force, compliance measurements
Joint positions and velocities: The exact trajectory the robot followed
Joint accelerations and currents: Energy expenditure and dynamic constraints
Contact dynamics: When and where the robot touched objects, surfaces, or obstacles
Motion trajectories: The path through space and time that the robot took to accomplish the task
Proprioceptive signals: The robot's sense of its own body configuration and motion

These signals capture the physics of what happened—the interaction between the robot and its environment.
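
As a rough illustration, one synchronized sample of the signals listed above could be represented as follows. The class name RobotTimestep, its field names, and the units are assumptions made for this sketch, not an Aurevix schema.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RobotTimestep:
    """One time-aligned sample of what the robot saw and felt (hypothetical schema)."""
    t: float                      # timestamp in seconds
    frame_index: Optional[int]    # nearest video frame, or None if there is none
    wrench: List[float]           # [Fx, Fy, Fz, Mx, My, Mz] in N and N*m
    joint_pos: List[float]        # joint positions in rad
    joint_vel: List[float]        # joint velocities in rad/s
    joint_current: List[float]    # motor currents in A
    in_contact: bool = False      # contact label added during annotation
    labels: List[str] = field(default_factory=list)

    def force_magnitude(self) -> float:
        """Norm of the force part (Fx, Fy, Fz) of the wrench, in newtons."""
        fx, fy, fz = self.wrench[:3]
        return math.sqrt(fx * fx + fy * fy + fz * fz)


sample = RobotTimestep(
    t=0.25,
    frame_index=7,
    wrench=[3.0, 4.0, 0.0, 0.0, 0.0, 0.1],
    joint_pos=[0.0] * 6,
    joint_vel=[0.0] * 6,
    joint_current=[0.5] * 6,
)
print(sample.force_magnitude())  # 5.0
```

Keeping the video frame index and the physical signals in one record is what lets later labeling steps reason about both at the same instant.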

Without them, models learn incomplete, brittle representations. They can recognize situations but can't learn how to safely or effectively interact with the world.

Why Vision-Only Annotation Fails Robots

Let's make this concrete with examples from common robotics tasks.

Example 1: Grasping and Manipulation

Vision tells you where the hand is relative to an object. Physics tells you how strongly it's grasping, whether it's stable, and whether it's about to slip.

Two grasps might have identical hand poses in the camera view, but one applies 10N of grip force (too weak, object will slip) and the other applies 50N (stable, but high energy cost). Models trained only on vision can't distinguish these outcomes until they fail in the real world.
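
A back-of-the-envelope check shows why grip force decides the outcome. This sketch assumes a two-finger parallel gripper holding an object against gravity with simple Coulomb friction, so each finger must squeeze with at least m * g / (2 * mu). The function names, the 2 kg mass, and the friction coefficient of 0.4 are illustrative assumptions, not measured values.

```python
G = 9.81  # gravitational acceleration in m/s^2


def min_grip_force(mass_kg: float, friction_coeff: float,
                   safety_factor: float = 1.0) -> float:
    """Smallest normal force (N) per finger that holds the object statically.

    Two contact patches each contribute friction mu * N, so 2 * mu * N >= m * g.
    """
    if mass_kg <= 0 or friction_coeff <= 0:
        raise ValueError("mass and friction coefficient must be positive")
    return safety_factor * mass_kg * G / (2.0 * friction_coeff)


def will_hold(applied_force_n: float, mass_kg: float, friction_coeff: float) -> bool:
    """True if the applied grip force meets the static friction requirement."""
    return applied_force_n >= min_grip_force(mass_kg, friction_coeff)


# Same hand pose, different physics: a 2 kg object with mu = 0.4
print(will_hold(10.0, 2.0, 0.4))  # False
print(will_hold(50.0, 2.0, 0.4))  # True
```

Under these assumptions the two grasps are indistinguishable in the image but fall on opposite sides of the friction limit.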

Example 2: Insertion and Assembly

Inserting a peg into a hole is deceptively simple. Vision shows the hand approaching the hole and entering it. Physics tells you whether the task succeeded.

A failed insertion attempt might visually look similar to a successful one: the peg is pressed against the hole. But force feedback reveals the difference:

Failed insertion: High insertion force (>20N), pushing against the hole wall
Successful insertion: Lower, steady force as the peg slides smoothly into the hole

Without force-torque annotation, models can't learn the fine manipulation skills that distinguish successful assembly from jammed, broken, or damaged interactions.
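
A minimal sketch of how force feedback could separate these cases is shown below. It assumes an axial force signal and an insertion depth signal are both available; the 20 N limit comes from the example above, while the function name, the label names, and the depth signal are hypothetical.

```python
from typing import Sequence


def label_insertion(axial_force_n: Sequence[float],
                    depth_mm: Sequence[float],
                    target_depth_mm: float,
                    force_limit_n: float = 20.0) -> str:
    """Label one insertion attempt from its force and depth traces."""
    if len(axial_force_n) != len(depth_mm) or len(axial_force_n) == 0:
        raise ValueError("force and depth traces must be non-empty and equal length")

    reached_depth = depth_mm[-1] >= target_depth_mm
    peak_force = max(axial_force_n)

    if reached_depth and peak_force <= force_limit_n:
        return "success"      # seated with low force
    if not reached_depth and peak_force > force_limit_n:
        return "jammed"       # pushing hard without progress
    if reached_depth:
        return "forced"       # seated, but only with excessive force
    return "incomplete"       # stopped early without high force


print(label_insertion([2, 5, 6, 4], [0, 5, 10, 15], target_depth_mm=15))      # success
print(label_insertion([5, 15, 25, 30], [0, 2, 2, 2], target_depth_mm=15))     # jammed
```

Both attempts could look alike on camera, since the peg is at the hole in each case; only the traces tell them apart.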

Example 3: Compliant Tasks and Contact

Some tasks require the robot to actively use contact feedback, not avoid it. Wiping a surface, pushing an object across a table, or manipulating deformable objects all require understanding contact dynamics.

Vision sees motion. Physics captures the forces behind that motion and the compliance required to succeed. A model that learns only visual patterns for wiping will fail when surface properties change (wet vs. dry, rough vs. smooth). A model that learns the force profiles and adjustment strategies required for different surfaces will generalize.

What Is Physics-Aware Data Annotation?

Physics-aware annotation incorporates signals beyond vision:

Force-Torque (6D Wrench) Data:

Fx, Fy, Fz (forces along X, Y, Z axes)
Mx, My, Mz (torques around each axis)
Captured at 100–1,000 Hz from sensors in the robot's wrist or gripper

Joint Telemetry:

Position, velocity, acceleration for each joint
Joint current/torque (reflecting effort and resistance)
Captured at 100–1,000 Hz from joint encoders or actuators

Trajectory Annotations:

Segmentation of continuous demonstrations into sub-tasks or primitives
Labeling of key waypoints or state transitions
Encoding of constraint information (e.g., "maintain contact with surface" or "avoid this region")
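
One simple way to segment a demonstration is to split it wherever the end-effector switches between moving and resting. The sketch below assumes a precomputed speed signal and a single threshold; the function name, the labels "rest" and "move", and the threshold value are illustrative, and real pipelines typically use richer cues.

```python
from typing import List, Sequence, Tuple


def segment_by_speed(speeds: Sequence[float],
                     moving_threshold: float) -> List[Tuple[int, int, str]]:
    """Split a speed trace into (start_index, end_index, label) runs.

    Indices are inclusive. A sample counts as "move" when its speed is
    at or above the threshold, otherwise as "rest".
    """
    def label(speed: float) -> str:
        return "move" if speed >= moving_threshold else "rest"

    segments: List[Tuple[int, int, str]] = []
    start = 0
    for i in range(1, len(speeds) + 1):
        current = label(speeds[start])
        # Close the run at the end of the trace or when the label changes
        if i == len(speeds) or label(speeds[i]) != current:
            segments.append((start, i - 1, current))
            start = i
    return segments


print(segment_by_speed([0.0, 0.0, 0.2, 0.3, 0.0], moving_threshold=0.05))
# [(0, 1, 'rest'), (2, 3, 'move'), (4, 4, 'rest')]
```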

Contact and Event Labels:

When contact initiates and terminates
Type of contact (grasp, push, slide, impact)
Stability or instability of contact
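
Contact start and end times can be sketched with a two-threshold (hysteresis) detector on the force magnitude, which avoids flickering labels when the signal hovers near a single threshold. The function name and the threshold values here are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def detect_contacts(force_n: Sequence[float],
                    on_threshold_n: float,
                    off_threshold_n: float) -> List[Tuple[int, int]]:
    """Return (start_index, end_index) pairs for contact intervals.

    Contact begins when force rises to on_threshold_n or above and ends
    when it falls below off_threshold_n. end_index is exclusive.
    """
    if off_threshold_n > on_threshold_n:
        raise ValueError("off threshold must not exceed on threshold")

    events: List[Tuple[int, int]] = []
    start = None
    for i, force in enumerate(force_n):
        if start is None and force >= on_threshold_n:
            start = i                      # contact initiates
        elif start is not None and force < off_threshold_n:
            events.append((start, i))      # contact terminates
            start = None
    if start is not None:                  # still in contact at end of trace
        events.append((start, len(force_n)))
    return events


trace = [0.0, 0.5, 3.0, 4.0, 2.5, 1.5, 0.2, 0.0, 3.5, 3.0]
print(detect_contacts(trace, on_threshold_n=3.0, off_threshold_n=1.0))
# [(2, 6), (8, 10)]
```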

Proprioceptive Data:

Full-body configuration sequences
Constraint compliance (robot following a desired trajectory despite disturbances)

Together, these signals encode the grammar of physical interaction. They answer questions that vision alone cannot:

Is this manipulation stable or precarious?
Did this task succeed due to skill or luck?
How much force is necessary, and what happens if we use more or less?
Is this motion energy-efficient or wasteful?

Why Incumbent Annotation Tools Don't Support Physics Layers

Most annotation platforms evolved from computer vision workflows. They have deep expertise in:

Image and video annotation (bounding boxes, segmentation masks)
2D and 3D bounding boxes on static scenes
Scene understanding and temporal tracking

But physics signals—force, torque, joint data—are outside that scope. As a result:

Physics signals are treated as metadata: If they're captured at all, they're stored apart from the visual annotations, which makes them hard to align
Annotation is manual and inconsistent: A human replays a video and applies labels by hand, introducing subjectivity about what "force threshold" or "contact state" means
There's no unified pipeline: Vision annotation happens in one tool and force analysis in a separate spreadsheet or custom script; temporal alignment is manual and error-prone
Scaling fails: Annotating 10 hours of data with force-torque labels might require custom engineering per dataset
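
To make the alignment problem concrete: force sensors often sample far faster than a camera, so each video frame needs a force value estimated at its own timestamp. The sketch below does this with linear interpolation, assuming both streams share a clock and the sensor timestamps are strictly increasing. The function name and the sample values are illustrative.

```python
from bisect import bisect_left
from typing import List, Sequence


def resample_linear(sample_t: Sequence[float],
                    sample_v: Sequence[float],
                    query_t: Sequence[float]) -> List[float]:
    """Linearly interpolate sensor values at the requested timestamps.

    Queries before the first or after the last sample are clamped to the
    nearest available value.
    """
    if len(sample_t) != len(sample_v) or len(sample_t) == 0:
        raise ValueError("timestamps and values must be non-empty and equal length")

    out: List[float] = []
    for t in query_t:
        j = bisect_left(sample_t, t)       # first sample with time >= t
        if j == 0:
            out.append(sample_v[0])
        elif j == len(sample_t):
            out.append(sample_v[-1])
        else:
            t0, t1 = sample_t[j - 1], sample_t[j]
            w = (t - t0) / (t1 - t0)
            out.append(sample_v[j - 1] + w * (sample_v[j] - sample_v[j - 1]))
    return out


force_t = [0.0, 0.5, 1.0]        # sensor timestamps in seconds
force_v = [0.0, 10.0, 20.0]      # force readings in newtons
frame_t = [0.25, 1.0, 1.5]       # video frame timestamps in seconds
print(resample_linear(force_t, force_v, frame_t))  # [5.0, 20.0, 20.0]
```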

Scale AI, Labelbox, and Encord focus on vision. They use human annotators or AI models to label visual content. But none offer fully automated, native support for force-torque annotation or proprioceptive data labeling.

This creates a huge gap. Teams with rich sensor data are forced to choose among three options:

Ignore the physics signals and train vision-only models (leaving generalization and robustness on the table)
Manually engineer custom annotation pipelines (expensive, brittle, slow)
Outsource to specialized robotics data companies (costly, slow turnaround)

The Data-Efficiency Multiplier of Physics-Aware Annotation

Here's why this matters for model performance:

Less Data Required: A model trained on vision + physics signals can learn robust behaviors with fewer demonstrations. Why? Because physics signals eliminate ambiguity. The model doesn't have to guess what happened; the data tells it.

Studies in robotic learning show that adding force-torque feedback can reduce data requirements by 30–50% for manipulation tasks, because the model learns causal relationships (action → force outcome) instead of correlations (visual pattern).

Better Generalization: Physics signals capture task semantics that are invariant to appearance. A grasping policy trained on force feedback generalizes across different lighting, camera angles, and object textures—because the task is ultimately about grip force and stability, not visual features.

Faster Learning: Models can learn force control policies directly from labeled demonstrations, rather than learning inverse models that predict force from vision (error-prone and indirect).

Debugging and Safety: In simulation-to-real transfer and live deployment, physics-aware labels help catch problems early:

"This simulated force profile doesn't match real data → simulation is miscalibrated"
"The model is predicting forces that exceed joint limits → safety issue"

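Both checks above can be approximated with a few lines of code. This sketch compares a simulated force trace against a real one using root-mean-square error and flags predicted forces above a limit. The function names, the tolerance, and the limit are assumptions that a real system would set per robot and per task.

```python
import math
from typing import List, Sequence


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length traces."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("traces must be non-empty and equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def check_force_profile(sim_force_n: Sequence[float],
                        real_force_n: Sequence[float],
                        predicted_force_n: Sequence[float],
                        max_rmse_n: float,
                        force_limit_n: float) -> List[str]:
    """Return a list of human-readable issues; empty if none are found."""
    issues: List[str] = []
    if rmse(sim_force_n, real_force_n) > max_rmse_n:
        issues.append("simulation may be miscalibrated")
    if any(abs(f) > force_limit_n for f in predicted_force_n):
        issues.append("predicted force exceeds limit")
    return issues


print(check_force_profile(
    sim_force_n=[10.0, 20.0, 30.0],
    real_force_n=[10.0, 20.0, 30.0],
    predicted_force_n=[5.0, 60.0],
    max_rmse_n=1.0,
    force_limit_n=50.0,
))  # ['predicted force exceeds limit']
```
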
Aurevix: Annotation Built for Physical Intelligence

Aurevix was designed specifically to handle the full robotics data stack—vision and physics.

Instead of bolting force annotation onto a vision platform, Aurevix:

Ingests and aligns physics signals natively: Force-torque data, joint telemetry, proprioception, and video synchronized with sub-millisecond precision
Automates physics-aware labeling: Rather than manual frame-by-frame annotation, automated detection of grasp events, contact transitions, force anomalies, and trajectory segmentation
Provides unified labeling interfaces: View force-torque signals alongside video, with synchronized playback and joint inspection
Scales without human bottlenecks: Process millions of demonstrations, automatically labeling force profiles, contact dynamics, and compliance violations
Outputs physics-ready datasets: Datasets where every training example includes aligned vision, language (task intent), action (joint trajectory), and physics signals (forces, torques, compliance)

The result: robotics teams can train models that understand not just what to do, but how to do it—with the physical intelligence to handle variability, disturbance, and real-world complexity.

Physics-Aware Annotation in Practice

Let's walk through a concrete example: training a robot to perform a grasp-and-lift task.

Without physics-aware annotation (vision-only):

Annotators label video frames: "grasp initiated," "grasp complete," "grasp stable"
Labels are subjective; different annotators disagree on timing and stability
Model learns to predict grasp success from hand pose and object appearance
Model fails in deployment because it hasn't learned force control; it can recognize good hand poses but can't adjust grip force in response to slip

With physics-aware annotation (Aurevix):

System automatically detects force rise and plateau as gripper closes
Identifies successful grasps: force stabilizes for >0.5 seconds without slip events
Identifies failed grasps: force spikes then drops (slip event)
Labels contact events (when gripper first touches object)
Outputs training data: [RGB video + object position + force profile + success/failure]
Model learns: "approach this way → apply this force profile → monitor slip → adjust if needed → succeed"
Model generalizes: Can handle softer objects (lower grip force needed), harder objects (higher force), different materials (different friction feedback)
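
The success rule above can be sketched as a check on how long grip force stays at or above a minimum level. In this simplified version, any drop below that level is treated as a slip event; the function names and the 10 N minimum are assumptions, while the 0.5 second hold comes from the rule above.

```python
from typing import Sequence


def longest_hold_s(times_s: Sequence[float],
                   grip_force_n: Sequence[float],
                   min_force_n: float) -> float:
    """Duration of the longest uninterrupted run with force >= min_force_n."""
    best = 0.0
    start = None
    for t, force in zip(times_s, grip_force_n):
        if force >= min_force_n:
            if start is None:
                start = t                 # force has risen: run begins
            best = max(best, t - start)
        else:
            start = None                  # force dropped: treat as slip
    return best


def label_grasp(times_s: Sequence[float],
                grip_force_n: Sequence[float],
                min_force_n: float,
                min_hold_s: float = 0.5) -> str:
    """Label a grasp "success" if force holds longer than min_hold_s."""
    held = longest_hold_s(times_s, grip_force_n, min_force_n)
    return "success" if held > min_hold_s else "failure"


# Failed grasp: about 0.3 s at 15 N, then the force collapses
t_fail = [i / 10 for i in range(7)]
f_fail = [0.0, 15.0, 15.0, 15.0, 15.0, 2.0, 0.0]

# Successful grasp: 50 N held steady for 2 s
t_ok = [i / 10 for i in range(25)]
f_ok = [0.0] + [50.0] * 21 + [0.0] * 3

print(label_grasp(t_fail, f_fail, min_force_n=10.0))  # failure
print(label_grasp(t_ok, f_ok, min_force_n=10.0))      # success
```

The two traces mirror the grasps from the opening example: identical in the image, clearly different in the force signal.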

The physics layer doesn't just improve accuracy; it teaches the robot why something worked or failed.

The Competitive Advantage of Physics-Aware Training Data

In the race to build embodied AI—robots that generalize across tasks, environments, and challenges—physics-aware training data is becoming a key differentiator.

Teams with physics-grounded datasets can:

Iterate faster: Training runs are shorter because less data is needed
Deploy safer: Models that understand force constraints can avoid breaking things or hurting humans
Transfer better: Policies learned on physics signals transfer to new robots or tasks with less fine-tuning
Achieve human-level dexterity: Fine manipulation (assembly, surgery, intricate tasks) requires force feedback; vision alone can't get you there

Conversely, teams relying on vision-only annotation are building models with a fundamental limitation: they lack the sensorimotor grounding that makes physical interaction reliable and adaptive.

The Frontier: Beyond Pixels and Into Interaction

As robotics models become more ambitious—from single-task policies to general-purpose embodied agents—the frontier of innovation has shifted.

It's no longer about perception (we have excellent computer vision). It's about interaction: understanding how to affect the physical world reliably.

That requires data that captures the physics of interaction.

Physics-aware annotation isn't a nice-to-have feature for robotics ML. It's becoming essential infrastructure for building robotics systems that truly learn from experience.

Ready to Build Physics-Aware Models?

If you're training robots to manipulate, assemble, or interact with the physical world, your training data's richness determines your model's capability ceiling.

Vision-only datasets are hitting that ceiling. The teams pushing beyond it are the ones building datasets that include force, torque, trajectory, and contact information.

Aurevix makes physics-aware annotation practical and scalable. You get datasets where every training example includes the full story of what happened—what the robot saw, what it did, and how the physics responded.

[Discover how Aurevix enables physics-aware robot learning →]