Context
SmartNews renders millions of articles through its SmartView system. The existing extraction pipeline had accumulated technical debt and inconsistencies across publishers.
Replacing it outright was too risky: any regression would directly affect content readability and user trust.
The problem
We needed to:
- introduce a new extraction pipeline
- improve HTML parsing and formatting reliability
- integrate with backend-driven rules (CSS, transformations)
- avoid breaking the existing rendering experience
All without disrupting a core product surface.
Approach
I built SmartHTMLExtractorV2 as a parallel implementation:
- Created a new Swift module (SmartHTML) wrapping SwiftSoup with a stable internal API
- Implemented a new extractor pipeline with:
  - HTML parsing and cleanup
  - CSS rule integration from backend services
  - structured formatting via internal models
- Introduced client-condition gating so v1 and v2 could coexist safely
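As an illustration of how a stable internal API and client-condition gating can fit together, here is a minimal sketch. Every name in it (HTMLExtracting, ExtractedArticle, ClientConditions, the "smartHTMLExtractorV2" flag key) is hypothetical and not taken from the SmartNews codebase. The parsing is a naive stand-in so the sketch runs on its own; the real module delegated parsing to SwiftSoup.

```swift
import Foundation

// Hypothetical result model; the real internal models are not shown in this write-up.
struct ExtractedArticle: Equatable {
    let paragraphs: [String]
    let extractorVersion: Int
}

// Hypothetical stable internal API. Callers depend on this protocol,
// never on the third-party parser behind it.
protocol HTMLExtracting {
    func extract(from html: String) throws -> ExtractedArticle
}

// Naive stand-in for real HTML parsing (assumption: the real module
// would call into SwiftSoup here instead of splitting strings).
func naiveParagraphs(in html: String) -> [String] {
    return html.components(separatedBy: "</p>")
        .map { chunk in
            chunk.replacingOccurrences(of: "<p>", with: "")
                .trimmingCharacters(in: .whitespacesAndNewlines)
        }
        .filter { !$0.isEmpty }
}

struct LegacyExtractor: HTMLExtracting {
    func extract(from html: String) throws -> ExtractedArticle {
        return ExtractedArticle(paragraphs: naiveParagraphs(in: html), extractorVersion: 1)
    }
}

struct ExtractorV2: HTMLExtracting {
    func extract(from html: String) throws -> ExtractedArticle {
        return ExtractedArticle(paragraphs: naiveParagraphs(in: html), extractorVersion: 2)
    }
}

// Hypothetical container for remotely delivered flags.
struct ClientConditions {
    let flags: [String: Bool]

    // A missing flag is treated as "off", so v1 stays the default path.
    func isEnabled(_ key: String) -> Bool {
        return flags[key] ?? false
    }
}

// Gate: choose the extractor once, at the edge, based on client conditions.
func makeExtractor(conditions: ClientConditions) -> any HTMLExtracting {
    if conditions.isEnabled("smartHTMLExtractorV2") {
        return ExtractorV2()
    }
    return LegacyExtractor()
}
```

Because callers only see the protocol, switching a user between v1 and v2 is a flag change rather than a code change, and an absent flag leaves the existing path untouched.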
To reduce operational risk:
- Added fallback logic that defaulted to the stage or prod URLs when client conditions failed
- Ensured edition-aware proxying (US vs JP behavior)
- Handled iOS-version-specific behavior (e.g. choosing between JPEG and WebP image formats)
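The three safeguards above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the shipped code: the example.com hosts, the type and function names, and the iOS 14 threshold for WebP are all assumptions chosen for the example.

```swift
import Foundation

enum Edition { case us, jp }
enum BuildEnvironment { case stage, prod }
enum ImageFormat: String { case jpeg, webp }

// Hard-coded fallback per environment and edition.
// The hosts are placeholders (assumption), not real service URLs.
func fallbackURL(environment: BuildEnvironment, edition: Edition) -> URL {
    let host: String
    switch (environment, edition) {
    case (.stage, .us): host = "proxy-stage-us.example.com"
    case (.stage, .jp): host = "proxy-stage-jp.example.com"
    case (.prod, .us): host = "proxy-prod-us.example.com"
    case (.prod, .jp): host = "proxy-prod-jp.example.com"
    }
    return URL(string: "https://\(host)/extract")!
}

// Use the remotely configured URL when it is present and valid;
// otherwise fall back to the built-in URL for this edition.
func resolveProxyURL(remoteValue: String?,
                     environment: BuildEnvironment,
                     edition: Edition) -> URL {
    if let value = remoteValue,
       let url = URL(string: value),
       url.scheme == "https" {
        return url
    }
    return fallbackURL(environment: environment, edition: edition)
}

// Assumption: WebP is requested only on iOS 14 and later, JPEG otherwise.
func preferredImageFormat(majorOSVersion: Int) -> ImageFormat {
    return majorOSVersion >= 14 ? .webp : .jpeg
}
```

In an app, the version argument could come from ProcessInfo.processInfo.operatingSystemVersion.majorVersion. Passing it in as a plain Int keeps the selection logic testable without a device.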
Outcome
- Shipped a parallel rendering system without regressions to the core experience
- Enabled incremental rollout and validation instead of a risky full switch
- Established a clean Swift-based foundation for future rendering improvements
What this demonstrates
- Designing safe rollouts for high-risk, high-impact systems
- Wrapping third-party libraries into stable internal abstractions
- Building modular architecture inside a legacy mixed Obj-C/Swift codebase
- Thinking beyond implementation to operational safety and fallback behavior