Part IV: Outlook

Chapter 10: Experimental Design and Validation Plan

Written: 2026-04-07 Last updated: 2026-04-09

Summary

This chapter presents experimental designs for validating TacGlove [#26]/TacTeleOp and TacPlay [#27] hypotheses: 3 pilot processes (capping, labeling, packaging), required hardware (gloves, robots, smart glasses), quantitative evaluation metrics (success rate, convergence time, generalization, cost), and key ablation experiments (3-axis vs binary, co-training conditions, tactile residual decomposition).

10.1 Three Pilot Processes

Process 1: Container Capping (Difficulty: High)

Action: Hold container, rotate cap clockwise
Tactile requirement: Normal force (grip) + shear force (rotation torque) precision control
3-axis vs binary: Binary detects contact only, cannot distinguish insufficient/excessive torque
Success criteria: Cap fully closed within specified torque range
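
The difference between 3-axis and binary sensing for capping can be illustrated with a small sketch. It assumes that torque is roughly the summed tangential shear of the contacting taxels times the cap radius. The contact threshold, cap radius, torque range, and the names `TaxelReading`, `binary_reading`, `estimated_torque_nm`, and `torque_status` are hypothetical placeholders, not part of the TacGlove design.

```python
from dataclasses import dataclass
from typing import List

# Illustrative placeholders only; none of these values come from a datasheet.
CONTACT_THRESHOLD_N = 0.2  # assumed normal force above which a taxel is "in contact"
CAP_RADIUS_M = 0.015       # assumed cap radius, used as the lever arm


@dataclass
class TaxelReading:
    normal_n: float  # normal (grip) force in newtons
    shear_n: float   # shear force tangent to the cap rim in newtons


def binary_reading(taxels: List[TaxelReading]) -> List[bool]:
    """What a binary sensor would report: contact or no contact per taxel."""
    return [t.normal_n > CONTACT_THRESHOLD_N for t in taxels]


def estimated_torque_nm(taxels: List[TaxelReading]) -> float:
    """Rough torque: summed tangential shear of contacting taxels times the lever arm."""
    shear = sum(t.shear_n for t in taxels if t.normal_n > CONTACT_THRESHOLD_N)
    return shear * CAP_RADIUS_M


def torque_status(torque_nm: float, low_nm: float, high_nm: float) -> str:
    """Classify a torque estimate against the specified closing range."""
    if torque_nm < low_nm:
        return "insufficient"
    if torque_nm > high_nm:
        return "excessive"
    return "ok"


# Two grasps with identical contact patterns but very different shear forces.
loose = [TaxelReading(1.0, 5.0), TaxelReading(1.0, 5.0)]
tight = [TaxelReading(1.0, 60.0), TaxelReading(1.0, 60.0)]
```

Both grasps produce the same binary reading, so a binary sensor cannot separate them, while the shear channel places one below and one above an assumed 0.5 to 1.5 N·m range.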

Process 2: Label Application (Difficulty: Medium)

Action: Apply label to container surface, press uniformly to remove bubbles
Tactile requirement: Uniform pressure distribution maintenance
3-axis vs binary: 3-axis advantageous for detecting force imbalance
Success criteria: Label position accuracy ±2mm, no bubbles

Process 3: Assembly Packaging (Difficulty: High)

Action: Place and secure multiple parts in box in specified order
Tactile requirement: Multi-directional force + precision positioning (snap-fit)
3-axis vs binary: Force feedback essential for precision positioning during snap-fit
Success criteria: All parts correctly positioned and secured

10.2 Required Hardware

Equipment	Quantity	Purpose	Notes
TacGlove	3+ pairs	Shift changes + spares	Fabricated by Prof. Park Y.-L.'s group
3-axis magnetic tactile sensors	8/glove × 3 = 24+	Tactile data collection	BMM350 + magnetic elastomer
Allegro or LEAP Hand	1–2 units	Robot experiments	16-DoF dexterous
Smart glasses (Aria or equiv.)	2 units	Egocentric RGB + head pose	Time synchronization required
Cosmetics process setup	1 set	Lab reproduction	Actual containers, labels, packaging
GPU (A100 × 4+)	1 set	Co-training + RL	Isaac Sim compatible
MuJoCo + TACTO	License	Sim environment	TacPlay sim-first
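
The time synchronization noted for the smart glasses could be approached, in its simplest form, by nearest-timestamp matching. The sketch below assumes both streams are already stamped against a shared clock and that the tactile timestamps are sorted. The function name `nearest_indices` and the tolerance value are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional


def nearest_indices(
    ref_ts: List[float], other_ts: List[float], max_gap_s: float
) -> List[Optional[int]]:
    """For each reference timestamp, return the index of the nearest sample
    in other_ts (sorted ascending), or None if the gap exceeds max_gap_s."""
    matches: List[Optional[int]] = []
    for t in ref_ts:
        i = bisect_left(other_ts, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_ts)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(other_ts[j] - t))
        matches.append(best if abs(other_ts[best] - t) <= max_gap_s else None)
    return matches


# Hypothetical timestamps in seconds: camera frames vs. tactile samples.
frame_ts = [0.0, 0.033, 0.5]
tactile_ts = [0.001, 0.011, 0.021, 0.031, 0.041]
aligned = nearest_indices(frame_ts, tactile_ts, max_gap_s=0.005)
```

Frames with no tactile sample inside the tolerance are marked `None` rather than paired with stale data, which keeps dropouts visible in the dataset.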

10.3 Evaluation Metrics

Primary Metrics

Metric	Definition	Target
Success rate	Success ratio over 50 trials per process	Capping >85%, Label >90%, Packaging >80%
Convergence time	Learning time to reach target success rate	TacPlay: <4 hr/object
Novel object generalization	Success rate on unseen containers/labels	>70% (no more than 15%p below the seen-object success rate)
Cost efficiency	Human labor hours per unit success rate	At least 3× better than teleoperation
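
With 50 trials per process, the uncertainty on each success rate is worth reporting alongside the point estimate. The sketch below uses a Wilson score interval. The choice of interval, the 95% level, and the function names are assumptions, and `generalization_ok` encodes one reading of the generalization target (above 70% and at most 15%p below the seen-object rate).

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial success rate (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


def meets_target(successes: int, trials: int, target: float) -> bool:
    """Point-estimate check against a target such as 0.85 for capping."""
    return successes / trials > target


def generalization_ok(
    seen_rate: float, unseen_rate: float, floor: float = 0.70, max_drop: float = 0.15
) -> bool:
    """Unseen-object rate must exceed the floor and stay within max_drop of seen."""
    return unseen_rate > floor and (seen_rate - unseen_rate) <= max_drop


# Example: 45 successes out of 50 capping trials.
low, high = wilson_interval(45, 50)
```

For 45 of 50 the interval runs from roughly 0.79 to 0.96, so an observed 90% does not by itself separate the result from the 85% capping target. This may argue for more trials on borderline conditions.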

Ablation Metrics

Ablation	Conditions	Purpose
3-axis vs binary	8 sensors 3-axis vs 8 sensors binary	H3 tactile resolution marginal gain
Co-training conditions	A only, B only, A+B (vision), A+B (vision+tactile)	H2 co-training + tactile value
Scaling curve	Data B at 10, 50, 200, 800 hr	Tactile scaling law
Tactile residual transfer	$\Delta_{\text{capping}}$ transferred to labeling vs. $\Delta_{\text{label}}$ trained from scratch	H3 cross-task generalization
Residual decomposition	Kinematic vs task vs object components	Understanding residual structure
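
One way to analyze the scaling-curve ablation is to fit success rate against the logarithm of data hours. The sketch below is an ordinary least-squares fit using only the standard library. The function names are hypothetical, and the success values are synthetic placeholders generated from a known line purely to exercise the code; they are not measurements.

```python
import math
from typing import List, Tuple


def fit_log_linear(hours: List[float], success: List[float]) -> Tuple[float, float]:
    """Least-squares fit of success = intercept + slope * log10(hours)."""
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(success) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, success))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept: float, slope: float, hours: float) -> float:
    """Predicted success rate at a given number of data hours."""
    return intercept + slope * math.log10(hours)


# Synthetic placeholder data on the planned grid of Data B sizes (NOT results).
hours_grid = [10.0, 50.0, 200.0, 800.0]
true_intercept, true_slope = 0.40, 0.15
synthetic = [true_intercept + true_slope * math.log10(h) for h in hours_grid]
intercept, slope = fit_log_linear(hours_grid, synthetic)
```

On real data, large residuals from this line at 800 hr would be a sign of saturation rather than log-linear scaling.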

10.4 Experimental Protocol

Phase A: TacGlove/TacTeleOp Validation (Months 1–6)

M1–M2: 3-finger glove + tactile sensor prototype fabrication and smart glasses synchronization.

M3: Lab pilot collection of 50 hours. Initial co-training experiments on capping.

Go/No-Go check: Does co-training (A+B) outperform A only? Does adding tactile input improve the success rate by more than 5%p?

M4: 5-finger extension (if feasible). Execute 3-axis vs binary ablation.

M5: 200+ hour collection. Scaling curve analysis (10, 50, 200 hr). Complete all co-training ablations.

M6: TacGlove/TacTeleOp paper writing and submission.
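
The Phase A Go/No-Go check asks whether co-training (A+B) beats A only. With 50 trials per condition this can be tested as a difference of two proportions. The sketch below uses a one-sided pooled z-test, which is an assumed choice of test (an exact test may be preferable at small counts). The function names are hypothetical.

```python
import math
from typing import Tuple


def gain_percentage_points(succ_new: int, n_new: int, succ_base: int, n_base: int) -> float:
    """Success-rate difference in percentage points (%p)."""
    return 100.0 * (succ_new / n_new - succ_base / n_base)


def one_sided_two_proportion_z(
    succ_new: int, n_new: int, succ_base: int, n_base: int
) -> Tuple[float, float]:
    """Pooled z-test for H1: rate_new > rate_base. Returns (z, p_value)."""
    p_new = succ_new / n_new
    p_base = succ_base / n_base
    pooled = (succ_new + succ_base) / (n_new + n_base)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_new + 1 / n_base))
    if se == 0.0:
        # All trials succeeded or all failed: no evidence either way.
        return 0.0, 0.5
    z = (p_new - p_base) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts: A+B succeeds 45/50, A only succeeds 35/50.
z, p_value = one_sided_two_proportion_z(45, 50, 35, 50)
```

The same function applies to the tactile-addition question by comparing A+B (vision+tactile) against A+B (vision).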

Phase B: TacPlay Validation (Months 3–8)

M3–M4: Build tactile-target RL environment in MuJoCo + TACTO. Reward function design and convergence verification.

Go/No-Go check: Sim convergence on 2+ of 3 tasks?

M5–M6: Real-world glove-mounted play experiments. Capping task priority.

M7: Tactile residual cross-task transfer experiment (capping → labeling).

M8: TacPlay paper writing or workshop paper preparation.

Go/No-Go Decision Framework

Condition	Go	Pivot	Stop
Co-training (A+B > A)	Statistically significant	Trend only	Adverse
Tactile addition	+10%p or more	+5–10%p	<+5%p
Sim RL convergence	2+ of 3 tasks	1 task only	0 tasks
Glove stability	8+ hr continuous	4–8 hr	<4 hr
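
The decision table above can be encoded directly so that each check is applied the same way every time. In the sketch below the thresholds come from the table. The 0.05 significance level, the treatment of a zero or negative co-training gain as "Adverse", and the worst-case aggregation in `overall_decision` are assumptions, and all names are hypothetical.

```python
from enum import IntEnum
from typing import Iterable


class Decision(IntEnum):
    STOP = 0
    PIVOT = 1
    GO = 2


def cotraining_decision(gain_pp: float, p_value: float, alpha: float = 0.05) -> Decision:
    """A+B vs A: significant gain -> Go, positive trend -> Pivot, otherwise Stop."""
    if gain_pp > 0 and p_value < alpha:
        return Decision.GO
    if gain_pp > 0:
        return Decision.PIVOT
    return Decision.STOP


def tactile_addition_decision(gain_pp: float) -> Decision:
    """+10%p or more -> Go, +5 to 10%p -> Pivot, below +5%p -> Stop."""
    if gain_pp >= 10:
        return Decision.GO
    if gain_pp >= 5:
        return Decision.PIVOT
    return Decision.STOP


def sim_convergence_decision(tasks_converged: int) -> Decision:
    """2+ of 3 tasks -> Go, exactly 1 -> Pivot, 0 -> Stop."""
    if tasks_converged >= 2:
        return Decision.GO
    if tasks_converged == 1:
        return Decision.PIVOT
    return Decision.STOP


def glove_stability_decision(continuous_hours: float) -> Decision:
    """8+ hr continuous -> Go, 4 to 8 hr -> Pivot, under 4 hr -> Stop."""
    if continuous_hours >= 8:
        return Decision.GO
    if continuous_hours >= 4:
        return Decision.PIVOT
    return Decision.STOP


def overall_decision(decisions: Iterable[Decision]) -> Decision:
    """Assumed aggregation rule: the weakest individual outcome wins."""
    return min(decisions)
```

For example, `overall_decision([tactile_addition_decision(12.0), glove_stability_decision(6.0)])` yields `Decision.PIVOT`, because the glove falls in the 4 to 8 hr band even though the tactile gain clears the Go threshold.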

10.5 Expected Results and Interpretation

Best Case

TacGlove/TacTeleOp: 800 hr tactile co-training → capping 90%+. Tactile scaling law confirmed log-linear. 3-axis > binary by +10%p.
TacPlay: Tactile-target RL converges in sim and real. Cross-task residual transfer succeeds. 85%+ at 0 hr teleop.

Realistic Case

TacGlove/TacTeleOp: Co-training effect confirmed at 200 hr, 3-axis vs binary significant only for capping. Scaling law partially verified.
TacPlay: Sim convergence confirmed, partial real success. Cross-task transfer limited. Workshop paper level.

Worst Case

TacGlove/TacTeleOp: Tactile co-training negligible vs vision-only. No 3-axis vs binary difference. → Pivot to hardware contribution (stretchable) + dataset contribution.
TacPlay: RL convergence failure. → "Challenges and Limits of Tactile-Target RL" negative result paper.

10.6 Connection to Our Direction

This experimental design systematically validates all hypotheses from Chapter 7 (TacGlove), Chapter 8 (TacTeleOp), and Chapter 9 (TacPlay). The 3-axis vs binary ablation directly responds to Ye et al.'s [2026] "binary 85%" result, and the tactile scaling curve extends EgoScale's [Zheng et al., 2026] vision scaling law to the tactile domain. The next chapter (Chapter 11) discusses the long-term outlook beyond these experiments.

References

Ye, Q., et al. (2026). Visual-Tactile Learning for Dexterous Manipulation. Science Robotics.
Zheng, R., et al. (2026). EgoScale: Egocentric Video Pretraining. arXiv.
Yin, J., et al. (2025). OSMO: A Large-Scale Tactile Glove. arXiv. #18
DexH2R (2024). Task-Oriented Residual RL. arXiv.
Si, Z., et al. (2025). ExoStart. arXiv. #9
Kareer, S., et al. (2024). EgoMimic. arXiv.