What it was
A fork of raulmur/ORB_SLAM2, the reference implementation from the well-known ORB-SLAM2 paper by Mur-Artal and Tardós. The project was less “build a SLAM system” and more “stand the canonical one up, run it on your own video, learn what each component does by watching it work and fail.”
What ORB-SLAM2 does
- Tracking thread. ORB keypoint detection + descriptor matching against the local map; PnP for the current frame’s pose.
- Local mapping thread. Triangulate new map points; local bundle adjustment over a window of recent keyframes.
- Loop closing thread. DBoW2 bag-of-visual-words place recognition matches the current keyframe against historical keyframes; if a loop is detected, a pose-graph optimization corrects the accumulated drift.
The whole thing runs at 25-30 Hz on a laptop CPU. No GPU. No deep learning.
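To make the loop-closing idea concrete, here is a minimal sketch of drift correction on a toy problem. It is not ORB-SLAM2's code: the real system optimizes full camera poses with g2o, while this reduces everything to 1D positions and ordinary linear least squares. The function name `optimize_pose_graph_1d` and the numbers are made up for illustration.

```python
import numpy as np

def optimize_pose_graph_1d(odometry, loop_closures, loop_weight=1.0):
    """Toy 1D pose graph (hypothetical helper, not an ORB-SLAM2 API).

    odometry[i] is a measurement of x[i+1] - x[i].
    loop_closures is a list of (i, j, z) measuring x[j] - x[i].
    x[0] is anchored at 0 to remove the free global offset.
    """
    n = len(odometry) + 1
    rows, rhs = [], []
    for i, z in enumerate(odometry):
        row = np.zeros(n)
        row[i], row[i + 1] = -1.0, 1.0
        rows.append(row)
        rhs.append(z)
    for i, j, z in loop_closures:
        row = np.zeros(n)
        row[i], row[j] = -1.0, 1.0
        rows.append(loop_weight * row)
        rhs.append(loop_weight * z)
    A = np.array(rows)[:, 1:]          # drop the x[0] column: x[0] is fixed at 0
    b = np.array(rhs)
    x_rest, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.concatenate([[0.0], x_rest])

# The robot truly moves +1, +1, -1, -1 and ends where it started,
# but every odometry reading is biased by +0.1.
odometry = [1.1, 1.1, -0.9, -0.9]
dead_reckoning = np.concatenate([[0.0], np.cumsum(odometry)])

# Loop closure: place recognition says pose 4 is the same place as pose 0.
optimized = optimize_pose_graph_1d(odometry, [(0, 4, 0.0)])

print("dead reckoning:", dead_reckoning)
print("after loop closure:", optimized)
```

With equal weights the end-point error is spread evenly over all five constraints instead of piling up at the end of the trajectory. Raising `loop_weight` pulls the final pose closer to the start, which mirrors trusting the loop detection more than the odometry.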
What was actually tricky
- Building the dependency stack: OpenCV 3.x with the right xfeatures2d build, Eigen pinned to a specific version, Pangolin built from source, DBoW2 + g2o vendored as submodules. The CMake config was where most of the time went.
- Scale ambiguity. Monocular SLAM is inherently up-to-scale: the reconstruction is correct but the units are arbitrary. RGB-D or stereo gives metric scale; mono does not.
- Initialization is brittle. ORB-SLAM2 needs enough parallax in the first few frames to triangulate. Move the camera too slowly at startup and it sits in “INITIALIZING” forever.
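The scale and parallax points above can both be checked numerically. The sketch below assumes an idealized pinhole camera with normalized image coordinates; the helper names `project` and `parallax_deg` and all the numbers are illustrative, not taken from ORB-SLAM2.

```python
import numpy as np

def project(points_world, R, t):
    """Pinhole projection to normalized image coordinates (no intrinsics)."""
    cam = points_world @ R.T + t
    return cam[:, :2] / cam[:, 2:3]

def parallax_deg(point, center_a, center_b):
    """Angle in degrees between the rays from two camera centers to one point."""
    ray_a = point - center_a
    ray_b = point - center_b
    sin_term = np.linalg.norm(np.cross(ray_a, ray_b))
    cos_term = np.dot(ray_a, ray_b)
    return np.degrees(np.arctan2(sin_term, cos_term))

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(10, 3)) + np.array([0.0, 0.0, 5.0])

# Scale ambiguity: scale the scene and the camera translation by the same
# factor and the image does not change, so a single camera cannot recover it.
R = np.eye(3)
t = np.array([0.3, 0.0, 0.0])
scale = 7.5
pixels_original = project(points, R, t)
pixels_scaled = project(scale * points, R, scale * t)
print("max pixel difference:", np.abs(pixels_original - pixels_scaled).max())

# Parallax: a point 5 units away seen from two camera centers.
point = np.array([0.0, 0.0, 5.0])
origin = np.zeros(3)
slow_move = np.array([0.01, 0.0, 0.0])   # camera barely moved
fast_move = np.array([0.5, 0.0, 0.0])    # camera moved sideways noticeably
print("parallax, small baseline (deg):", parallax_deg(point, origin, slow_move))
print("parallax, large baseline (deg):", parallax_deg(point, origin, fast_move))
```

The first check shows why mono units are arbitrary: both scenes produce identical images. The second shows why a slow start stalls initialization: with a tiny baseline the two rays are nearly parallel, so the triangulated depth is poorly constrained.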
What I’d do differently with hindsight
- Use ORB-SLAM3 (released 2020). Adds visual-inertial mode, multi-map support, and far better robustness, all backward-compatible with ORB-SLAM2 datasets.
- Pair with IMU for monocular cases. VI-SLAM is night-and-day more robust than mono SLAM.
- For learning, also build a toy SLAM from scratch. ORB-SLAM2 is too big to internalize — but coding up “PnP + g2o pose-graph” on a half-page makes the math click.
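As a starting point for that kind of toy, here is a rough sketch of the PnP half using the linear DLT method in NumPy. It assumes noise-free, calibrated (normalized) image coordinates and at least six non-coplanar points; a real tracker would add outlier rejection and nonlinear refinement. The function names are invented for this example.

```python
import numpy as np

def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula: rotation matrix from an axis and an angle (radians)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def project(points_world, R, t):
    """Pinhole projection to normalized image coordinates."""
    cam = points_world @ R.T + t
    return cam[:, :2] / cam[:, 2:3]

def pnp_dlt(points_world, points_norm):
    """Estimate world-to-camera (R, t) from 3D-2D matches via DLT."""
    n = len(points_world)
    Xh = np.hstack([points_world, np.ones((n, 1))])   # homogeneous 3D points
    u = points_norm[:, 0:1]
    v = points_norm[:, 1:2]
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Xh            # row1 . X - u * (row3 . X) = 0
    A[0::2, 8:12] = -u * Xh
    A[1::2, 4:8] = Xh            # row2 . X - v * (row3 . X) = 0
    A[1::2, 8:12] = -v * Xh
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)     # null vector = projection matrix up to scale
    if np.linalg.det(P[:, :3]) < 0:
        P = -P                   # fix the sign so the rotation part is proper
    U, S, Vt_rot = np.linalg.svd(P[:, :3])
    R = U @ Vt_rot               # closest rotation to the 3x3 block
    t = P[:, 3] / S.mean()       # undo the unknown scale of the null vector
    return R, t

rng = np.random.default_rng(1)
points = rng.uniform(-1.0, 1.0, size=(20, 3))
R_true = rotation_from_axis_angle([0.2, 1.0, 0.1], 0.3)
t_true = np.array([0.1, -0.2, 4.0])

R_est, t_est = pnp_dlt(points, project(points, R_true, t_true))
print("rotation error:", np.abs(R_est - R_true).max())
print("translation error:", np.abs(t_est - t_true).max())
```

Each 3D-2D match contributes two linear equations in the twelve entries of the projection matrix, which is why six points are the minimum for this formulation.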
What it taught me
SLAM is the single hardest problem in mobile robotics, and it has a generation of mature open-source implementations. Most application code should consume those rather than reinvent them. The valuable thing to know is “what each thread does and why”, which lets you debug when it fails, not “how to write a bundle adjustment from scratch.”