State persistence

Learn how to maintain an Actor's state to prevent data loss during unexpected restarts. Includes code examples for handling server migrations.

Long-running Actor jobs may need to migrate between servers. Without state persistence, your job's progress is lost during migration, causing it to restart from the beginning on the new server. This can be costly and time-consuming.

To prevent data loss, long-running Actors should persist their state so they can resume from where they left off after a migration.

For short-running Actors, the risk of restarts and the cost of repeated runs are low, so you can typically ignore state persistence.

Understand migrations

A migration occurs when a process running on one server must stop and move to another server. During a migration:

All in-progress processes on the current server are stopped
Unless you've saved your state, the Actor run will restart on the new server with an empty internal state
You only have a few seconds to save your work when a migration event occurs

Causes of migration

Migrations can happen for several reasons:

Server workload optimization
Server crashes (rare)
New feature releases and bug fixes

Frequency of migrations

Migrations don't follow a specific schedule. They can occur at any time due to the events mentioned above.

Why state is lost during migration

By default, an Actor keeps its state in the server's memory. During a server switch, the run loses access to the previous server's memory. Even if data were saved on the server's disk, access to that disk would also be lost. Note that the Actor run's default dataset, key-value store, and request queue are preserved across migrations. By "state", we mean the contents of runtime variables in the Actor's code.
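The difference between runtime variables and persisted storage can be illustrated with a small simulation that runs without the platform. This is only a sketch under stated assumptions: `persistent_store` is a plain dict standing in for the run's key-value store, and `simulate_run` is a hypothetical helper. Neither is part of the Apify SDK.

```python
from typing import Optional

# A dict standing in for the run's key-value store, which survives migrations.
# (Assumption for illustration; the real store is accessed through the SDK.)
persistent_store = {}


def simulate_run(items: list, crash_after: Optional[int], persist: bool) -> int:
    """Process items and return how many of them this server had to handle.

    `crash_after` simulates a migration once that many items have been handled.
    """
    # Runtime variable: it starts fresh on every server unless restored.
    processed = persistent_store.get("processed", 0) if persist else 0
    handled_here = 0
    for _item in items[processed:]:
        if crash_after is not None and handled_here == crash_after:
            break  # the server goes away; local variables are lost
        processed += 1
        handled_here += 1
        if persist:
            persistent_store["processed"] = processed
    return handled_here


items = [f"item-{i}" for i in range(10)]

# Without persistence, the second server repeats all 10 items.
simulate_run(items, crash_after=6, persist=False)
redone_without = simulate_run(items, crash_after=None, persist=False)

# With persistence, the second server only handles the remaining 4.
simulate_run(items, crash_after=6, persist=True)
redone_with = simulate_run(items, crash_after=None, persist=True)

print(redone_without, redone_with)  # prints "10 4"
```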

Implement state persistence

Use the JavaScript SDK's Actor.useState() method or the Python SDK's Actor.use_state() method to persist state across migrations. Either method automatically saves your state to the key-value store and restores it when the Actor restarts.

JavaScript

import { Actor } from 'apify';

await Actor.init();

const state = await Actor.useState({ itemCount: 0, lastOffset: 0 });

// The state object is automatically persisted during migrations.
// Update it as your Actor processes data.
state.itemCount += 1;
state.lastOffset = 100;

await Actor.exit();

Python

from apify import Actor

async def main():
    async with Actor:
        state = await Actor.use_state({'item_count': 0, 'last_offset': 0})

        # The state object is automatically persisted during migrations.
        # Update it as your Actor processes data.
        state['item_count'] += 1
        state['last_offset'] = 100
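
One way to put the persisted fields to work is to let them drive the processing loop, so that a restarted run continues from the saved offset. The following is a minimal sketch, not the SDK's own implementation: `process_pages`, `fetch_page`, and `DATA` are hypothetical names, and a plain dict stands in for the object returned by `Actor.use_state()`.

```python
from typing import Callable, Optional


def process_pages(
    state: dict,
    fetch_page: Callable[[int, int], list],
    page_size: int,
    max_pages: Optional[int] = None,
) -> None:
    """Process pages starting at state['last_offset'], updating state in place."""
    pages_done = 0
    while max_pages is None or pages_done < max_pages:
        page = fetch_page(state["last_offset"], page_size)
        if not page:
            break  # no more data
        # ... handle the items here, for example by storing them ...
        state["item_count"] += len(page)
        # Advance the offset only after the page has been fully handled.
        state["last_offset"] += len(page)
        pages_done += 1


DATA = list(range(25))  # hypothetical source data


def fetch_page(offset: int, limit: int) -> list:
    """Return one page of the hypothetical source data."""
    return DATA[offset:offset + limit]


# In an Actor, this dict would come from Actor.use_state(...).
state = {"item_count": 0, "last_offset": 0}

process_pages(state, fetch_page, page_size=10, max_pages=1)  # run cut short after one page
process_pages(state, fetch_page, page_size=10)  # resumed run finishes the rest

print(state)  # prints "{'item_count': 25, 'last_offset': 25}"
```

Because the offset advances only after a page has been handled, an interruption in the middle of a page causes that page to be repeated rather than skipped, so downstream handling should tolerate duplicates.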

For improved Actor performance, consider caching repeated page data.

Speed up migrations and ensure consistency

Once your Actor receives the migrating event, the Apify platform will shut it down and restart it on a new server within one minute. To speed this process up and ensure state consistency, you can manually reboot the Actor in the migrating event handler using the Actor.reboot() method available in the Apify SDK for JavaScript or Apify SDK for Python.

JavaScript

import { Actor } from 'apify';

await Actor.init();
// ...
Actor.on('migrating', async () => {
    // ...
    // save state
    // ...
    await Actor.reboot();
});
// ...
await Actor.exit();

Python

from apify import Actor, Event

async def actor_migrate(_event_data):
    # ...
    # save state
    # ...
    await Actor.reboot()

async def main():
    async with Actor:
        # ...
        Actor.on(Event.MIGRATING, actor_migrate)
        # ...
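
The order of operations in the handler matters: the state has to be written before the reboot is requested. The sketch below makes that ordering explicit and runnable without the platform. It assumes the object passed as `actor` offers `set_value()` and `reboot()` coroutines, as the Apify SDK's `Actor` does. `FakeActor`, `save_and_reboot`, and the `MY_STATE` key are hypothetical names used only for illustration.

```python
import asyncio


class FakeActor:
    """Hypothetical stand-in that records calls instead of talking to the platform."""

    def __init__(self) -> None:
        self.calls = []
        self.store = {}

    async def set_value(self, key, value) -> None:
        self.store[key] = value
        self.calls.append("set_value")

    async def reboot(self) -> None:
        self.calls.append("reboot")


async def save_and_reboot(actor, state: dict, key: str = "MY_STATE") -> None:
    """Persist a snapshot of the state first, then ask for a reboot."""
    # Copy the dict so later mutations do not alter the saved snapshot.
    await actor.set_value(key, dict(state))
    await actor.reboot()


fake = FakeActor()
asyncio.run(save_and_reboot(fake, {"item_count": 42, "last_offset": 100}))
print(fake.calls)  # prints "['set_value', 'reboot']"
```

In an Actor, the function registered for the migrating event could call `save_and_reboot(Actor, state)` in place of the placeholder comments shown in the handler.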