What it did
Project 1 of the Udacity Self-Driving Car (SDC) nanodegree. Detect highway lane markings from dashcam video: six lines of OpenCV per frame, then average across time to draw two clean lane edges on top of the source footage.
The pipeline
- RGB → grayscale.
- Gaussian blur (kernel size 5).
- Canny edge detection (low/high thresholds tuned by eye).
- Region-of-interest mask — trapezoid centered on the ego lane.
- Probabilistic Hough transform → line segments.
- Linear-fit each side (left + right) → two extrapolated lane lines.
- Overlay on the source frame.
What was actually tricky
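The Hough threshold and the ROI mask have to be tuned as a pair. A loose Hough threshold lets in more segments, so it needs a tight ROI to reject clutter; a tight threshold can afford a looser ROI. Loose Hough + tight ROI works on the easy clip; tight Hough + loose ROI works on the curvy clip; neither setting works on both. The lesson: hand-tuned pipelines win the easy demo and lose the second-easiest one.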
What I’d do differently with hindsight
- Treat it as semantic segmentation, not edge detection. A small U-Net trained on a few thousand labeled frames will usually outperform a hand-tuned Hough configuration across varied scenes. Modern SDC stacks start there.
- Track lanes over time. Smoothing the line parameters with a Kalman filter (or just an exponential moving average) suppresses most of the jitter that single-frame inference produces.
- Use a bird’s-eye warp first. Apply a perspective transform to project the road into a top-down view before line fitting; curves become much easier to parameterize as polynomials.
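
As a rough illustration of the "track lanes over time" item, here is a minimal exponential-moving-average smoother for per-frame line parameters. The class name `LineSmoother`, the `(slope, intercept)` parameter layout, and the `alpha` value are assumptions for the sketch; a Kalman filter would replace the fixed `alpha` with a gain derived from modeled noise.

```python
class LineSmoother:
    """Exponential moving average over per-frame line parameters."""

    def __init__(self, alpha=0.2):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # weight given to the newest frame
        self.state = None   # smoothed parameters, e.g. (slope, intercept)

    def update(self, params):
        """Blend in a new detection; on a missed frame, hold the last estimate."""
        if params is None:
            return self.state
        if self.state is None:
            self.state = tuple(float(p) for p in params)
        else:
            a = self.alpha
            self.state = tuple(a * p + (1.0 - a) * s
                               for p, s in zip(params, self.state))
        return self.state
```

One smoother per lane side would be updated once per frame, for example `left_smoother.update((slope, intercept))`, and the smoothed values drawn instead of the raw fit. A smaller `alpha` gives steadier lines but slower response on curves.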
What it taught me
Computer vision pipelines compound: each step has a small parameter budget, and they trade off against each other. There’s no global optimum across diverse scenes — only good enough for this footage. The discipline of accepting “good enough on the test set” before shipping is the first taste of an ML mindset, even in pre-ML code.