---
title: Data persistence with R2
description: Mount R2 buckets as local filesystem paths to persist data across sandbox lifecycles.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/sandbox/llms.txt  
> Use this file to discover all available pages before exploring further.

[Skip to content](#%5Ftop) 

# Data persistence with R2

**Last reviewed:**  6 months ago 

Mount object storage buckets as local filesystem paths to persist data across sandbox lifecycles. This tutorial uses Cloudflare R2, but the same approach works with any S3-compatible provider.

**Time to complete:** 20 minutes

## What you'll build

A Worker that processes data, stores results in an R2 bucket mounted as a local directory, and demonstrates that data persists even after the sandbox is destroyed and recreated.

**Key concepts you'll learn**:

* Mounting R2 buckets as filesystem paths
* Automatic data persistence across sandbox lifecycles
* Working with mounted storage using standard file operations

## Prerequisites

1. Sign up for a [Cloudflare account ↗](https://dash.cloudflare.com/sign-up/workers-and-pages).
2. Install [Node.js ↗](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm).

Node.js version manager

Use a Node version manager like [Volta ↗](https://volta.sh/) or [nvm ↗](https://github.com/nvm-sh/nvm) to avoid permission issues and change Node.js versions. [Wrangler](https://developers.cloudflare.com/workers/wrangler/install-and-update/), discussed later in this guide, requires a Node version of `16.17.0` or later.

You'll also need:

* [Docker ↗](https://www.docker.com/) running locally
* An R2 bucket (create one in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/?to=/:account/r2))

## 1\. Create your project

 npm  yarn  pnpm 

```
npm create cloudflare@latest -- data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal
```

```
yarn create cloudflare data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal
```

```
pnpm create cloudflare@latest data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal
```

Terminal window

```

cd data-pipeline


```

## 2\. Configure R2 binding

Add an R2 bucket binding to your `wrangler.json`:

wrangler.json

```

{

  "name": "data-pipeline",

  "compatibility_date": "2025-11-09",

  "durable_objects": {

    "bindings": [

      { "name": "Sandbox", "class_name": "Sandbox" }

    ]

  },

  "r2_buckets": [

    {

      "binding": "DATA_BUCKET",

      "bucket_name": "my-data-bucket"

    }

  ]

}


```

Replace `my-data-bucket` with your R2 bucket name. Create the bucket first in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/?to=/:account/r2).

## 3\. Build the data processor

Replace `src/index.ts` with code that mounts R2 and processes data:

* [  JavaScript ](#tab-panel-8173)
* [  TypeScript ](#tab-panel-8174)

JavaScript

```

import { getSandbox } from "@cloudflare/sandbox";


export { Sandbox } from "@cloudflare/sandbox";


export default {

  async fetch(request, env) {

    const url = new URL(request.url);

    const sandbox = getSandbox(env.Sandbox, "data-processor");


    // Mount R2 bucket to /data directory

    await sandbox.mountBucket("my-data-bucket", "/data", {

      endpoint: "https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com",

    });


    if (url.pathname === "/process") {

      // Process data and save to mounted R2

      const result = await sandbox.exec("python", {

        args: [

          "-c",

          `

import json

import os

from datetime import datetime


# Read input (or create sample data)

data = [

    {'id': 1, 'value': 42},

    {'id': 2, 'value': 87},

    {'id': 3, 'value': 15}

]


# Process: calculate sum and average

total = sum(item['value'] for item in data)

avg = total / len(data)


# Save results to mounted R2 (/data is the mounted bucket)

result = {

    'timestamp': datetime.now().isoformat(),

    'total': total,

    'average': avg,

    'processed_count': len(data)

}


os.makedirs('/data/results', exist_ok=True)

with open('/data/results/latest.json', 'w') as f:

    json.dump(result, f, indent=2)


print(json.dumps(result))

        `,

        ],

      });


      return Response.json({

        message: "Data processed and saved to R2",

        result: JSON.parse(result.stdout),

      });

    }


    if (url.pathname === "/results") {

      // Read results from mounted R2

      const result = await sandbox.exec("cat", {

        args: ["/data/results/latest.json"],

      });


      if (!result.success) {

        return Response.json(

          { error: "No results found yet" },

          { status: 404 },

        );

      }


      return Response.json({

        message: "Results retrieved from R2",

        data: JSON.parse(result.stdout),

      });

    }


    if (url.pathname === "/destroy") {

      // Destroy sandbox to demonstrate persistence

      await sandbox.destroy();

      return Response.json({

        message: "Sandbox destroyed. Data persists in R2!",

      });

    }


    return new Response(

      `

Data Pipeline with Persistent Storage


Endpoints:

- POST /process  - Process data and save to R2

- GET /results   - Retrieve results from R2

- POST /destroy  - Destroy sandbox (data survives!)


Try this flow:

1. POST /process  (processes and saves to R2)

2. POST /destroy  (destroys sandbox)

3. GET /results   (data still accessible from R2)

    `,

      { headers: { "Content-Type": "text/plain" } },

    );

  },

};


```

TypeScript

```

import { getSandbox, type Sandbox } from '@cloudflare/sandbox';


export { Sandbox } from '@cloudflare/sandbox';


interface Env {

  Sandbox: DurableObjectNamespace<Sandbox>;

  DATA_BUCKET: R2Bucket;

}


export default {

  async fetch(request: Request, env: Env): Promise<Response> {

    const url = new URL(request.url);

    const sandbox = getSandbox(env.Sandbox, 'data-processor');


    // Mount R2 bucket to /data directory

    await sandbox.mountBucket('my-data-bucket', '/data', {

      endpoint: 'https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com'

    });


    if (url.pathname === '/process') {

      // Process data and save to mounted R2

      const result = await sandbox.exec('python', {

        args: ['-c', `

import json

import os

from datetime import datetime


# Read input (or create sample data)

data = [

    {'id': 1, 'value': 42},

    {'id': 2, 'value': 87},

    {'id': 3, 'value': 15}

]


# Process: calculate sum and average

total = sum(item['value'] for item in data)

avg = total / len(data)


# Save results to mounted R2 (/data is the mounted bucket)

result = {

    'timestamp': datetime.now().isoformat(),

    'total': total,

    'average': avg,

    'processed_count': len(data)

}


os.makedirs('/data/results', exist_ok=True)

with open('/data/results/latest.json', 'w') as f:

    json.dump(result, f, indent=2)


print(json.dumps(result))

        `]

      });


      return Response.json({

        message: 'Data processed and saved to R2',

        result: JSON.parse(result.stdout)

      });

    }


    if (url.pathname === '/results') {

      // Read results from mounted R2

      const result = await sandbox.exec('cat', {

        args: ['/data/results/latest.json']

      });


      if (!result.success) {

        return Response.json({ error: 'No results found yet' }, { status: 404 });

      }


      return Response.json({

        message: 'Results retrieved from R2',

        data: JSON.parse(result.stdout)

      });

    }


    if (url.pathname === '/destroy') {

      // Destroy sandbox to demonstrate persistence

      await sandbox.destroy();

      return Response.json({ message: 'Sandbox destroyed. Data persists in R2!' });

    }


    return new Response(`

Data Pipeline with Persistent Storage


Endpoints:

- POST /process  - Process data and save to R2

- GET /results   - Retrieve results from R2

- POST /destroy  - Destroy sandbox (data survives!)


Try this flow:

1. POST /process  (processes and saves to R2)

2. POST /destroy  (destroys sandbox)

3. GET /results   (data still accessible from R2)

    `, { headers: { 'Content-Type': 'text/plain' } });

  }

};


```

Replace YOUR\_ACCOUNT\_ID

Replace `YOUR_ACCOUNT_ID` in the endpoint URL with your Cloudflare account ID. Find it in the [dashboard ↗](https://dash.cloudflare.com/) under **R2** \> **Overview**.

## 4\. Deploy to production

**Generate R2 API tokens:**

1. Go to **R2** \> **Overview** in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/)
2. Select **Manage R2 API Tokens**
3. Create a token with **Object Read & Write** permissions
4. Copy the **Access Key ID** and **Secret Access Key**

**Set up credentials as Worker secrets:**

Terminal window

```

npx wrangler secret put AWS_ACCESS_KEY_ID

# Paste your R2 Access Key ID


npx wrangler secret put AWS_SECRET_ACCESS_KEY

# Paste your R2 Secret Access Key


```

Worker secrets are encrypted and only accessible to your deployed Worker. The SDK automatically detects these credentials when `mountBucket()` is called.

**Deploy your Worker:**

Terminal window

```

npx wrangler deploy


```

After deployment, wrangler outputs your Worker URL (e.g., `https://data-pipeline.yourname.workers.dev`).

## 5\. Test the persistence flow

Now test against your deployed Worker. Replace `YOUR_WORKER_URL` with your actual Worker URL:

Terminal window

```

# 1. Process data (saves to R2)

curl -X POST https://YOUR_WORKER_URL/process

# Returns: { "message": "Data processed...", "result": { "total": 144, "average": 48, ... } }


# 2. Verify data is accessible

curl https://YOUR_WORKER_URL/results

# Returns the same results from R2


# 3. Destroy the sandbox

curl -X POST https://YOUR_WORKER_URL/destroy

# Returns: { "message": "Sandbox destroyed. Data persists in R2!" }


# 4. Access results again (from new sandbox)

curl https://YOUR_WORKER_URL/results

# Still works! Data persisted across sandbox lifecycle


```

The key insight: After destroying the sandbox, the next request creates a new sandbox instance, mounts the same R2 bucket, and finds the data still there.

## What you learned

In this tutorial, you built a data pipeline that demonstrates filesystem persistence through R2 bucket mounting:

* **Mounting buckets**: Use `mountBucket()` to make R2 accessible as a local directory
* **Standard file operations**: Access mounted buckets using familiar filesystem commands (`cat`, Python `open()`, etc.)
* **Automatic persistence**: Data written to mounted directories survives sandbox destruction
* **Credential management**: Configure R2 access using environment variables or explicit credentials

## Next steps

* [Mount buckets guide](https://developers.cloudflare.com/sandbox/guides/mount-buckets/) \- Comprehensive mounting reference
* [Storage API](https://developers.cloudflare.com/sandbox/api/storage/) \- Complete API documentation
* [Environment variables](https://developers.cloudflare.com/sandbox/configuration/environment-variables/) \- Credential configuration options

## Related resources

* [R2 documentation](https://developers.cloudflare.com/r2/) \- Learn about Cloudflare R2
* [Background processes guide](https://developers.cloudflare.com/sandbox/guides/background-processes/) \- Long-running data processing
* [Sandboxes concept](https://developers.cloudflare.com/sandbox/concepts/sandboxes/) \- Understanding sandbox lifecycle

```json
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/sandbox/","name":"Sandbox SDK"}},{"@type":"ListItem","position":3,"item":{"@id":"/sandbox/tutorials/","name":"Tutorials"}},{"@type":"ListItem","position":4,"item":{"@id":"/sandbox/tutorials/persistent-storage/","name":"Data persistence with R2"}}]}
```
