Part IV: Outlook

Chapter 10: Experimental Design and Validation Plan

Written: 2026-04-07. Last updated: 2026-04-09.

Summary

This chapter presents the experimental design for validating the TacGlove [#26]/TacTeleOp and TacPlay [#27] hypotheses. It covers three pilot processes (capping, labeling, packaging), the required hardware (gloves, robot hands, smart glasses), quantitative evaluation metrics (success rate, convergence time, generalization, cost), and the key ablation experiments (3-axis vs binary sensing, co-training conditions, tactile residual decomposition).

10.1 Three Pilot Processes

Process 1: Container Capping (Difficulty: High)

  • Action: Hold container, rotate cap clockwise
  • Tactile requirement: Precise control of normal force (grip) and shear force (rotation torque)
  • 3-axis vs binary: Binary sensing detects contact only and cannot distinguish insufficient from excessive torque
  • Success criteria: Cap fully closed within specified torque range
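
The 3-axis vs binary distinction for capping can be illustrated with a minimal sketch. The `TactileReading` type, the contact threshold, and the two example readings are hypothetical and chosen only for illustration. They are not measured values or the TacGlove data format.

```python
import math
from dataclasses import dataclass


@dataclass
class TactileReading:
    """One hypothetical 3-axis sample from a fingertip sensor, in newtons."""
    fx: float  # shear force along x
    fy: float  # shear force along y
    fz: float  # normal (grip) force


CONTACT_THRESHOLD_N = 0.5  # assumed threshold, for illustration only


def to_binary(reading: TactileReading) -> bool:
    """Binary view: contact or no contact, derived from normal force alone."""
    return reading.fz > CONTACT_THRESHOLD_N


def shear_magnitude(reading: TactileReading) -> float:
    """3-axis view: tangential force magnitude, a rough proxy for cap torque."""
    return math.hypot(reading.fx, reading.fy)


# Two made-up grips with the same normal force but different shear.
loose = TactileReading(fx=0.2, fy=0.1, fz=3.0)
tight = TactileReading(fx=2.5, fy=1.0, fz=3.0)

print("binary:", to_binary(loose), to_binary(tight))
print("shear :", shear_magnitude(loose), shear_magnitude(tight))
```

Both readings collapse to the same binary value, while the shear magnitude separates them. That is the information a torque-range success criterion would need.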

Process 2: Label Application (Difficulty: Medium)

  • Action: Apply label to container surface, press uniformly to remove bubbles
  • Tactile requirement: Maintaining a uniform pressure distribution
  • 3-axis vs binary: 3-axis is advantageous for detecting force imbalance
  • Success criteria: Label position accuracy within ±2 mm, no bubbles

Process 3: Assembly Packaging (Difficulty: High)

  • Action: Place and secure multiple parts in box in specified order
  • Tactile requirement: Multi-directional force + precision positioning (snap-fit)
  • 3-axis vs binary: Force feedback essential for precision positioning during snap-fit
  • Success criteria: All parts correctly positioned and secured

10.2 Required Hardware

Equipment | Quantity | Purpose | Notes
TacGlove | 3+ pairs | Shift changes + spares | Fabricated by Prof. Park Y.-L.'s group
3-axis magnetic tactile sensors | 8/glove × 3 = 24+ | Tactile data collection | BMM350 + magnetic elastomer
Allegro or LEAP Hand | 1–2 units | Robot experiments | 16-DoF dexterous
Smart glasses (Aria or equiv.) | 2 units | Egocentric RGB + head pose | Time synchronization required
Cosmetics process setup | 1 set | Lab reproduction | Actual containers, labels, packaging
GPU (A100 × 4+) | 1 set | Co-training + RL | Isaac Sim compatible
MuJoCo + TACTO | License | Sim environment | TacPlay sim-first
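
The time synchronization requirement between the smart glasses and the tactile stream can be sketched as nearest-timestamp matching. This is a minimal sketch that assumes both streams are already expressed on a common clock, in seconds. The function names, the 20 ms skew tolerance, and the sample timestamps are hypothetical.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Return the index of the entry in sorted `timestamps` closest to `t`."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Choose between the neighbours on either side of t.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align(frame_ts, tactile_ts, max_skew_s=0.02):
    """Pair each video frame with its nearest tactile sample.

    Frames with no tactile sample within `max_skew_s` are dropped.
    Returns a list of (frame_index, tactile_index) pairs.
    """
    pairs = []
    for fi, t in enumerate(frame_ts):
        ti = nearest_index(tactile_ts, t)
        if abs(tactile_ts[ti] - t) <= max_skew_s:
            pairs.append((fi, ti))
    return pairs


# Made-up example: ~30 Hz frames, 100 Hz tactile samples, one orphan frame.
frame_ts = [0.000, 0.033, 0.066, 0.500]
tactile_ts = [i * 0.01 for i in range(11)]
print(align(frame_ts, tactile_ts))
```

The last frame has no tactile sample nearby and is dropped rather than paired with stale data. Whether to drop or interpolate in that case is a design choice this sketch does not settle.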

10.3 Evaluation Metrics

Primary Metrics

Metric | Definition | Target
Success rate | Success ratio over 50 trials per process | Capping >85%, Label >90%, Packaging >80%
Convergence time | Learning time to reach target success rate | TacPlay: <4 hr/object
Novel object generalization | Success on unseen containers/labels | >70% (within -15%p of seen)
Cost efficiency | Human labor hours per unit success rate | 3×+ over teleop
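
Because each success rate is estimated from 50 trials, it helps to report an uncertainty interval next to the point estimate. The sketch below uses a Wilson score interval, which is one common choice and not something the plan prescribes. The count of 43 successes is a placeholder, not a result.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


# Placeholder outcome: 43 successes in 50 capping trials (86% observed).
low, high = wilson_interval(43, 50)
print(f"observed {43 / 50:.2f}, 95% interval [{low:.3f}, {high:.3f}]")
```

With 50 trials, an observed 86% still has a lower bound well below the 85% capping target. A target met only by the point estimate should therefore be read with caution.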

Ablation Metrics

Ablation | Conditions | Purpose
3-axis vs binary | 8 sensors 3-axis vs 8 sensors binary | H3 tactile resolution marginal gain
Co-training conditions | A only, B only, A+B (vision), A+B (vision+tactile) | H2 co-training + tactile value
Scaling curve | Data B at 10, 50, 200, 800 hr | Tactile scaling law
Tactile residual transfer | $\Delta_{\mathrm{capping}}$ → label vs $\Delta_{\mathrm{label}}$ trained fresh | H3 cross-task generalization
Residual decomposition | Kinematic vs task vs object components | Understanding residual structure
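
One way to analyze the scaling curve is to fit success rate against the logarithm of data hours and inspect the slope and fit quality. The sketch below is a plain least-squares fit of success = intercept + slope · log10(hours). The data are synthetic, generated from an assumed line so the fit can be checked, and do not represent expected results.

```python
import math


def fit_log_linear(hours, success):
    """Least-squares fit of success = intercept + slope * log10(hours).

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(success) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, success))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, success))
    ss_tot = sum((y - mean_y) ** 2 for y in success)
    return slope, intercept, 1 - ss_res / ss_tot


# Data sizes from the ablation plan; success values are SYNTHETIC.
hours = [10, 50, 200, 800]
assumed_slope, assumed_intercept = 0.15, 0.40  # illustrative assumption
success = [assumed_intercept + assumed_slope * math.log10(h) for h in hours]

slope, intercept, r2 = fit_log_linear(hours, success)
print(f"slope per decade = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.4f}")
```

On real measurements, a high R² on this fit would support a log-linear trend. With only four data sizes the evidence remains limited, and success rates saturate near 100%, which a straight line in log-hours cannot capture.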

10.4 Experimental Protocol

Phase A: TacGlove/TacTeleOp Validation (Months 1–6)

M1–M2: 3-finger glove + tactile sensor prototype fabrication and smart glasses synchronization.

M3: Lab pilot collection of 50 hours. Initial co-training experiments on capping.

  • Go/No-Go check: Does co-training (A+B) outperform A only? Does adding tactile input improve success by more than +5%p?
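
The comparison in this check can be made concrete with a one-sided two-proportion z-test. This is a sketch of one standard test, not a procedure the plan specifies. The success counts are placeholders, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(base_successes, base_trials, new_successes, new_trials):
    """One-sided test that the new condition's success rate exceeds the baseline's.

    Returns (gain, z, p_value), where gain is the difference in success rates.
    """
    p_base = base_successes / base_trials
    p_new = new_successes / new_trials
    pooled = (base_successes + new_successes) / (base_trials + new_trials)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_trials + 1 / new_trials))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0% or all 100%")
    z = (p_new - p_base) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return p_new - p_base, z, p_value


# Placeholder counts: A only 38/50, A+B 42/50 (a gain of 8 percentage points).
gain, z, p_value = two_proportion_z_test(38, 50, 42, 50)
print(f"gain = {gain:+.2f}, z = {z:.2f}, one-sided p = {p_value:.3f}")
```

In this made-up case an 8-point gain over 50 trials per condition gives z = 1.0, which is far from conventional significance. Gains near the +5%p threshold may therefore need more trials or pooling across processes to be detectable.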

M4: 5-finger extension (if feasible). Execute 3-axis vs binary ablation.

M5: 200+ hour collection. Scaling curve analysis (10, 50, 200 hr). Complete all co-training ablations.

M6: TacGlove/TacTeleOp paper writing and submission.

Phase B: TacPlay Validation (Months 3–8)

M3–M4: Build tactile-target RL environment in MuJoCo + TACTO. Reward function design and convergence verification.

  • Go/No-Go check: Sim convergence on 2+ of 3 tasks?

M5–M6: Real-world glove-mounted play experiments. Capping task priority.

M7: Tactile residual cross-task transfer experiment (capping → labeling).

M8: TacPlay paper writing or workshop paper preparation.

Go/No-Go Decision Framework

Condition | Go | Pivot | Stop
Co-training (A+B > A) | Statistically significant | Trend only | Adverse
Tactile addition | +10%p or more | +5–10%p | <+5%p
Sim RL convergence | 2+ of 3 tasks | 1 task only | 0 tasks
Glove stability | 8+ hr continuous | 4–8 hr | <4 hr
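
The decision framework can be encoded as a small rule set so that the thresholds are applied consistently. The sketch below takes its thresholds from the table. The following are assumptions not stated in the plan: a 0.05 significance level, treating a non-positive gain as "Adverse", and taking the overall decision as the most pessimistic of the four conditions.

```python
from enum import IntEnum


class Decision(IntEnum):
    STOP = 0
    PIVOT = 1
    GO = 2


def cotraining_decision(gain_pp: float, p_value: float, alpha: float = 0.05) -> Decision:
    """A+B vs A only. gain_pp is in percentage points; alpha is an assumption."""
    if gain_pp <= 0:
        return Decision.STOP  # "Adverse" (assumed to include zero gain)
    return Decision.GO if p_value < alpha else Decision.PIVOT


def tactile_decision(gain_pp: float) -> Decision:
    """Tactile addition effect in percentage points."""
    if gain_pp >= 10:
        return Decision.GO
    return Decision.PIVOT if gain_pp >= 5 else Decision.STOP


def sim_decision(tasks_converged: int) -> Decision:
    """Number of the 3 sim tasks on which RL converged."""
    if tasks_converged >= 2:
        return Decision.GO
    return Decision.PIVOT if tasks_converged == 1 else Decision.STOP


def glove_decision(continuous_hours: float) -> Decision:
    """Hours of continuous glove operation without failure."""
    if continuous_hours >= 8:
        return Decision.GO
    return Decision.PIVOT if continuous_hours >= 4 else Decision.STOP


def overall(decisions) -> Decision:
    """Assumed aggregation: the weakest condition determines the outcome."""
    return min(decisions)


# Placeholder inputs, for illustration only.
result = overall([
    cotraining_decision(gain_pp=8.0, p_value=0.16),
    tactile_decision(gain_pp=12.0),
    sim_decision(tasks_converged=2),
    glove_decision(continuous_hours=9.0),
])
print(result.name)
```

Here a non-significant co-training gain pulls the overall outcome down to a pivot even though the other three conditions pass. Other aggregation rules, such as a weighted vote, are equally possible.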

10.5 Expected Results and Interpretation

Best Case

  • TacGlove/TacTeleOp: 800 hr tactile co-training → capping 90%+. Tactile scaling law confirmed to be log-linear. 3-axis outperforms binary by +10%p.
  • TacPlay: Tactile-target RL converges in sim and real. Cross-task residual transfer succeeds. 85%+ at 0 hr teleop.

Realistic Case

  • TacGlove/TacTeleOp: Co-training effect confirmed at 200 hr, 3-axis vs binary significant only for capping. Scaling law partially verified.
  • TacPlay: Sim convergence confirmed, partial real success. Cross-task transfer limited. Workshop paper level.

Worst Case

  • TacGlove/TacTeleOp: Tactile co-training negligible vs vision-only. No 3-axis vs binary difference. → Pivot to hardware contribution (stretchable) + dataset contribution.
  • TacPlay: RL convergence failure. → "Challenges and Limits of Tactile-Target RL" negative result paper.

10.6 Connection to Our Direction

This experimental design systematically validates all hypotheses from Chapter 7 (TacGlove), Chapter 8 (TacTeleOp), and Chapter 9 (TacPlay). The 3-axis vs binary ablation responds directly to the "binary 85%" result of Ye et al. [1], and the tactile scaling curve extends the vision scaling law of EgoScale [2] to the tactile domain. The next chapter (Chapter 11) discusses the long-term outlook beyond these experiments.

References

  1. Ye, Q., et al. (2026). Visual-Tactile Learning for Dexterous Manipulation. Science Robotics.
  2. Zheng, R., et al. (2026). EgoScale: Egocentric Video Pretraining. arXiv.
  3. Yin, J., et al. (2025). OSMO: A Large-Scale Tactile Glove. arXiv. [#18]
  4. DexH2R (2024). Task-Oriented Residual RL. arXiv.
  5. Si, Z., et al. (2025). ExoStart. arXiv. [#9]
  6. Kareer, S., et al. (2024). EgoMimic. arXiv.