A bad OTA push can crash every device in your fleet. This guide covers crash-aware auto-rollback, manual CLI commands, rollback storm mitigation, and the telemetry you need to make the call.
Not every bug warrants a rollback. These four failure modes should trigger an immediate rollback; waiting costs you users.
Follow this decision tree when a release looks bad. Screenshot it, pin it in Slack, tape it to your monitor.
The best rollback is one that happens before your users file a ticket. Crash-aware rollback detects fatal failures and reverts the bundle on the next app launch. No human intervention required.
When a new OTA bundle is downloaded and ready to apply, the SDK marks it as PENDING in local storage. The previous known-good bundle hash is preserved as the fallback target.
On the next app launch, the SDK starts a 10-second crash detection window. If the app survives this window without a fatal exception, the bundle status is promoted to CONFIRMED. The pending flag is then cleared.
If the app crashes during the detection window, the SDK sees the PENDING flag on next launch and loads the previous known-good bundle instead. The bad bundle is quarantined and a rollback event is reported to the server.
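The PENDING/CONFIRMED flow above can be sketched as a small state machine. This is a minimal illustration, not the AppsOnAir SDK's implementation: every name here (`BundleState`, `stageBundle`, `resolveOnLaunch`, `confirmBundle`) is hypothetical, and the `launchAttempted` flag is an assumption added so the sketch can tell a first launch of a pending bundle from a launch that follows a crash.

```typescript
// Hypothetical sketch; none of these names come from the AppsOnAir SDK.
type BundleStatus = "PENDING" | "CONFIRMED";

interface BundleState {
  activeHash: string;       // bundle to load on this launch
  status: BundleStatus;     // status of the active bundle
  launchAttempted: boolean; // assumption: a PENDING bundle has already been tried once
  fallbackHash: string;     // previous known-good bundle
  quarantined: string[];    // hashes that crashed and must not be applied again
}

// A new bundle finished downloading: mark it PENDING and keep the fallback.
function stageBundle(state: BundleState, newHash: string): BundleState {
  if (state.quarantined.indexOf(newHash) !== -1) return state; // never re-apply a bad bundle
  // Only a confirmed bundle is a safe fallback target.
  const fallbackHash = state.status === "CONFIRMED" ? state.activeHash : state.fallbackHash;
  return { ...state, activeHash: newHash, status: "PENDING", launchAttempted: false, fallbackHash };
}

// Run at the start of every launch, before any JS bundle is loaded.
function resolveOnLaunch(state: BundleState): { state: BundleState; rolledBack: boolean } {
  if (state.status === "PENDING" && state.launchAttempted) {
    // The last launch never reached confirmation, so treat it as a crash.
    return {
      rolledBack: true, // the caller would report the rollback event to the server
      state: {
        activeHash: state.fallbackHash,
        status: "CONFIRMED",
        launchAttempted: false,
        fallbackHash: state.fallbackHash,
        quarantined: state.quarantined.concat(state.activeHash),
      },
    };
  }
  if (state.status === "PENDING") {
    // First launch of the new bundle: record the attempt, then load it.
    return { rolledBack: false, state: { ...state, launchAttempted: true } };
  }
  return { rolledBack: false, state };
}

// Run once the app has survived the detection window (10 seconds in the text).
function confirmBundle(state: BundleState): BundleState {
  return state.status === "PENDING"
    ? { ...state, status: "CONFIRMED", launchAttempted: false }
    : state;
}
```

One consequence of this design: a user who force-quits the app inside the detection window looks the same as a crash. A real SDK would need some way to separate the two cases; this sketch does not attempt it.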
When auto-rollback doesn't trigger (silent regressions, performance degradation, user complaints), you need manual control. These CLI commands give you full rollback power.
A rollback storm happens when a bad update reaches your entire user base and every device tries to download the previous bundle simultaneously. Your CDN gets hit by a thundering herd. Here's how to survive it.
Add a random delay (0–60s) before each device downloads the rollback bundle. This spreads the thundering herd across a minute instead of creating a single traffic spike.
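A minimal sketch of the jitter calculation, assuming a uniform random delay. The function names are hypothetical, and the `random` parameter exists only so the behavior can be tested deterministically.

```typescript
// Hypothetical helper; the 60-second ceiling is the figure from the text.
// Returns a delay in milliseconds in the range [0, maxSeconds * 1000).
function rollbackJitterMs(maxSeconds: number = 60, random: () => number = Math.random): number {
  if (maxSeconds <= 0) return 0;
  return Math.floor(random() * maxSeconds * 1000);
}

// Average request rate when `devices` downloads are spread evenly over the window.
function averageRequestsPerSecond(devices: number, maxSeconds: number = 60): number {
  return devices / maxSeconds;
}
```

Each device would wait `rollbackJitterMs()` milliseconds before requesting the rollback bundle. As a rough illustration, 600,000 devices spread over 60 seconds average 10,000 requests per second instead of arriving all at once.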
Your bundle hosting must handle burst traffic. Static file hosts with fixed bandwidth caps will buckle under load. Use a CDN with edge caching and auto-scaling, or let AppsOnAir's global edge network handle it.
Keep the previous bundle cached on the device. During a rollback, the SDK loads the local cache instead of re-downloading the bundle: zero network traffic, instant revert. AppsOnAir CodePush retains the last two bundles by default.
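The retention policy can be sketched as follows. This is an assumption about how a "keep the last two bundles" cache might behave, not a description of the AppsOnAir SDK's storage layer; `CachedBundle`, `retainBundles`, and `localRollbackTarget` are hypothetical names.

```typescript
interface CachedBundle {
  hash: string;
  installedAt: number; // epoch milliseconds
}

// Sort comparator: newest bundle first.
function byNewest(a: CachedBundle, b: CachedBundle): number {
  return b.installedAt - a.installedAt;
}

// Add a bundle to the cache and evict everything except the `keep` newest.
function retainBundles(cache: CachedBundle[], incoming: CachedBundle, keep: number = 2): CachedBundle[] {
  return cache
    .filter((b) => b.hash !== incoming.hash) // avoid duplicate entries
    .concat(incoming)
    .sort(byNewest)
    .slice(0, keep);
}

// Pick the newest cached bundle that is not the bad one; no network needed.
function localRollbackTarget(cache: CachedBundle[], badHash: string): CachedBundle | undefined {
  return cache.filter((b) => b.hash !== badHash).sort(byNewest)[0];
}
```

If `localRollbackTarget` returns `undefined`, the device has nothing to fall back to locally and would have to download the previous bundle, which is where jitter and CDN capacity matter.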
The best rollback is the one you never need. Stage your releases and catch problems when they affect 5,000 users, not 500,000.
Push to 5% of your user base, ideally internal team members and opted-in beta users. Monitor crash rates, JS errors, and cold-start times for 30 minutes.
If the canary looks clean, promote to 25%. This catches device-specific issues: older phones, different OS versions, and lower-bandwidth networks. Hold for 2 hours.
Metrics green at 25%? Ship to everyone. If anything goes wrong at this stage, auto-rollback handles it. Staged rollouts mean you almost never get here with a bad build.
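One common way to implement percentage stages is deterministic bucketing: hash each device into a bucket from 0 to 99 and include it when its bucket falls below the current percentage. The sketch below assumes that approach; the function names and the simple string hash are illustrative, not how AppsOnAir assigns devices.

```typescript
// Rollout stages from the text: canary, early adopters, everyone.
const ROLLOUT_STAGES: number[] = [5, 25, 100];

// Map a device to a stable bucket in 0..99 for a given release.
// Assumption: a simple multiplicative string hash; a production system
// would likely use a stronger, well-distributed hash.
function rolloutBucket(deviceId: string, releaseId: string): number {
  const key = releaseId + ":" + deviceId;
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h % 100;
}

// A device is included when its bucket is below the current percentage.
function isInRollout(deviceId: string, releaseId: string, percent: number): boolean {
  return rolloutBucket(deviceId, releaseId) < percent;
}
```

Because the bucket is stable, a device included at 5% stays included at 25% and 100%, so promoting a stage only adds devices. Targeting internal team members and opted-in beta users for the canary would need a separate allowlist on top of this.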
Some updates can't be safely rolled back: payment flow changes, auth migrations, and database schema updates. Pin these versions so no OTA update overwrites them until you're ready.
You can't make a rollback decision without data. These six signals are the minimum telemetry you need to run OTA updates safely.
Crash rate segmented by bundle version. Compare the new release's crash rate against the previous version's baseline within the first 30 minutes.
Unhandled JS exception count per minute, with full stack traces. A 3× spike over baseline within 15 minutes warrants investigation.
The update adoption curve shows how quickly devices are picking up the new bundle. A stalled curve means devices are failing to download or apply the update.
Compare cold-start time before and after the update. A 500ms+ regression signals a bundle-size or initialization problem worth rolling back.
Track how many devices have auto-rolled back. Even a small percentage of auto-rollbacks means some users are experiencing crashes that aggregate metrics might hide.
Automated signals miss UX regressions. Correlate support ticket volume with OTA deployment timestamps to catch silent failures that humans notice but machines don't.
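Several of these signals come with concrete thresholds, so they can be folded into a single check. The sketch below is one possible way to do that, with hypothetical names throughout. The 3× JS error spike and the 500ms cold-start regression come from the text; the crash-rate multiplier is an assumption, since the text only says to compare against the baseline. Adoption curves and support tickets are left out because they need human judgment.

```typescript
interface ReleaseTelemetry {
  crashRate: number;              // crashes per session for the new bundle
  baselineCrashRate: number;      // same metric for the previous bundle
  jsErrorsPerMin: number;
  baselineJsErrorsPerMin: number;
  coldStartMs: number;
  baselineColdStartMs: number;
  autoRollbackCount: number;      // devices that have already auto-rolled back
}

type Verdict = "ok" | "investigate" | "rollback";

function evaluateRelease(
  t: ReleaseTelemetry,
  crashRateMultiplier: number = 2 // assumption: not specified in the text
): { verdict: Verdict; reasons: string[] } {
  const rollback: string[] = [];
  const investigate: string[] = [];

  if (t.crashRate > 0 && t.crashRate >= crashRateMultiplier * t.baselineCrashRate) {
    rollback.push("crash rate well above baseline");
  }
  if (t.coldStartMs - t.baselineColdStartMs >= 500) {
    rollback.push("cold start regressed by 500ms or more");
  }
  if (t.jsErrorsPerMin > 0 && t.jsErrorsPerMin >= 3 * t.baselineJsErrorsPerMin) {
    investigate.push("JS errors at 3x baseline or higher");
  }
  if (t.autoRollbackCount > 0) {
    investigate.push("some devices have auto-rolled back");
  }

  if (rollback.length > 0) return { verdict: "rollback", reasons: rollback.concat(investigate) };
  if (investigate.length > 0) return { verdict: "investigate", reasons: investigate };
  return { verdict: "ok", reasons: [] };
}
```

Returning the reasons alongside the verdict keeps the decision auditable: whoever gets paged can see which signal tripped.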
Here's how AppsOnAir CodePush handles each rollback scenario: no custom code, no infrastructure setup, no manual configuration.
Automatic rollback is triggered by crash-on-launch detection (the app crashes within the detection window after applying a new bundle), a JS error rate exceeding a configurable threshold, or native exceptions linked to the new bundle. AppsOnAir CodePush monitors all three signals and reverts automatically; no custom crash-handler code is needed.
Use the AppsOnAir CLI: appsonair codepush rollback <appName>. This reverts all devices to the previous known-good release. You can also target a specific version with the --targetBinaryVersion flag or use the dashboard's one-click rollback button.
A rollback storm occurs when thousands of devices detect a bad update simultaneously and all request the previous bundle at once, overwhelming your CDN. Prevent it with jittered rollback delays (a random 0–60s delay per device), percentage-based rollouts (so fewer devices receive the bad update), local bundle caching, and a CDN with auto-scaling. AppsOnAir CodePush handles all of these automatically.
Yes, always. Releasing to 5–10% of users first lets you catch crash spikes, JS errors, and user complaints before the update reaches your full user base. If metrics look bad, you roll back only the small affected group, not your entire fleet. This is the single most effective rollback-prevention strategy.
At a minimum: crash rate per bundle version, JS error rate with stack traces, update adoption curve, cold-start time delta, rollback event count, and user-reported issues. AppsOnAir CodePush provides all of these in a real-time dashboard with configurable alert thresholds for Slack, PagerDuty, and email.
Yes. Version pinning locks a device or device group to a specific bundle version, preventing any newer OTA updates from being applied until the pin is removed. This is critical for enterprise apps during audit periods, compliance freezes, or when native and JS versions must stay in sync. Set pin-expiry dates to avoid stale locks.
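The pin-with-expiry rule can be sketched as a single check run before any update is applied. The `VersionPin` shape and `canApplyUpdate` function are hypothetical and only illustrate the logic described above, not AppsOnAir's configuration format.

```typescript
interface VersionPin {
  pinnedVersion: string;
  expiresAt?: number; // epoch milliseconds; omit for a pin that never expires
}

// Decide whether a candidate bundle version may be applied on this device.
function canApplyUpdate(pin: VersionPin | undefined, candidateVersion: string, now: number): boolean {
  if (pin === undefined) return true; // no pin: updates flow normally
  if (pin.expiresAt !== undefined && now >= pin.expiresAt) return true; // stale pin: ignore it
  return candidateVersion === pin.pinnedVersion; // active pin: only the pinned version is allowed
}
```

Treating an expired pin as absent is what prevents stale locks: a pin set for an audit period stops blocking updates on its own once the expiry date passes.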
The rollback logic itself is the same, but the restart mechanism differs. In Bridgeless mode (the default from RN 0.76, where the New Architecture is enabled by default), the old bridge-based reload path is gone, so rollback must use a native restart module. AppsOnAir CodePush handles this automatically with its Bridgeless-aware restart module; no custom native code is needed.