AWS S3: How to recover deleted files?

A few days back, while fixing a production issue, my team deleted a big database. It had more than 100 tables and around 100 GB of data, so re-processing and reloading the tables before the business people queried them was a next-to-impossible task. The good thing was that we had versioning enabled on the bucket.

If versioning is enabled on your S3 bucket, every change you make to a file creates a new version, similar to git. If you delete a file, S3 does not physically delete it; it just adds a delete marker as the latest version, which hides the file. So, if you want to recover a file, all you need to do is delete the delete marker for that file.
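All of this only works if versioning was already enabled before the delete happened, so it is worth checking first. Below is a small sketch, assuming the AWS CLI is installed and configured; the function name versioning_enabled is made up for this example and is not part of the CLI.

```shell
# Hypothetical helper (not part of the AWS CLI): succeeds only when
# versioning is currently enabled on the given bucket.
versioning_enabled() {
    local bucket="$1" status
    # Status is "Enabled" or "Suspended"; with --output text it prints
    # "None" when versioning has never been configured on the bucket.
    status=$(aws s3api get-bucket-versioning \
        --bucket "$bucket" \
        --query 'Status' \
        --output text) || return 1
    [ "$status" = "Enabled" ]
}

# Usage (bucket name taken from the examples in this post):
#   versioning_enabled aws-dev01-sample-bucket && echo "versioning is on"
```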

Now let’s see how to do it step-by-step.

Let’s take the following bucket and prefix as an example:

[gshah@aws-dev restore]$ aws s3 ls s3://aws-dev01-sample-bucket/SIT/USER/gshah/                                                                              
2020-07-20 13:50:12          0 1.txt                                                                                                                                       
2020-07-20 13:50:18          0 2.txt                                                                                                                                       
2020-07-20 13:50:25          0 3.txt                                                                                                                                       
2020-07-20 13:42:28          0 abc.txt                                                                                                                                     
2020-07-20 13:43:01          0 xyz.txt

Now let’s delete a few files.

[gshah@aws-dev restore]$ aws s3 rm s3://aws-dev01-sample-bucket/SIT/USER/gshah/abc.txt                                                                       
delete: s3://aws-dev01-sample-bucket/SIT/USER/gshah/abc.txt                                                                                                           
[gshah@aws-dev restore]$ aws s3 rm s3://aws-dev01-sample-bucket/SIT/USER/gshah/xyz.txt                                                                       
delete: s3://aws-dev01-sample-bucket/SIT/USER/gshah/xyz.txt                                                                                                           
[gshah@aws-dev restore]$ aws s3 ls s3://aws-dev01-sample-bucket/SIT/USER/gshah/                                                                              
2020-07-20 13:50:12          0 1.txt                                                                                                                                       
2020-07-20 13:50:18          0 2.txt                                                                                                                                       
2020-07-20 13:50:25          0 3.txt

Now let’s look at those deleted files.

[gshah@aws-dev restore]$ aws s3api list-object-versions \                                                                                                         
> --bucket aws-dev01-sample-bucket \                                                                                                                                    
> --prefix SIT/USER/gshah/  \                                                                                                                                            
> --output json \                                                                                                                                                          
> --query 'DeleteMarkers[?IsLatest==`true`]'
[                                                                                                                                                                          
    {                                                                                                                                                                      
        "Owner": {                                                                                                                                                         
            "ID": "514384a9e158b47a163ad2a0b3e7d767dfbc46b167f6899de54b0eb7d4413cd9"                                                                                       
        },                                                                                                                                                                 
        "IsLatest": true,                                                                                                                                                  
        "VersionId": "oGhCi9bGRS_xYNstvGEgjGv24Dv94VzW",                                                                                                                   
        "Key": "SIT/USER/gshah/4.txt",                                                                                                                                   
        "LastModified": "2020-07-20T13:51:33.000Z"                                                                                                                         
    },                                                                                                                                                                     
    {                                                                                                                                                                      
        "Owner": {                                                                                                                                                         
            "ID": "514384a9e158b47a163ad2a0b3e7d767dfbc46b167f6899de54b0eb7d4413cd9"                                                                                       
        },                                                                                                                                                                 
        "IsLatest": true,                                                                                                                                                  
        "VersionId": "_Air1UwjRGOqln65oSp5xiCO06ZPwocP",                                                                                                                   
        "Key": "SIT/USER/gshah/abc.txt",                                                                                                                                 
        "LastModified": "2020-07-20T13:59:52.000Z"                                                                                                                         
    },                                                                                                                                                                     
    {                                                                                                                                                                      
        "Owner": {                                                                                                                                                         
            "ID": "514384a9e158b47a163ad2a0b3e7d767dfbc46b167f6899de54b0eb7d4413cd9"                                                                                       
        },                                                                                                                                                                 
        "IsLatest": true,                                                                                                                                                  
        "VersionId": "e1LYLX92jBtLgbXp92.nZdf0yFKZ8m3I",                                                                                                                   
        "Key": "SIT/USER/gshah/xyz.txt",                                                                                                                                 
        "LastModified": "2020-07-20T13:59:58.000Z"                                                                                                                         
    }                                                                                                                                                                      
]

This shows all the deleted files, not just the recently deleted ones; you can see that it lists 4.txt as well, which I deleted some time in the past. For only the recently deleted files, put a time condition inside the query argument, i.e. 'DeleteMarkers[?IsLatest==`true` && LastModified>=`"2020-07-20T13:59:52.000Z"`]'.
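The time-filtered listing can be wrapped in a small function. This is only a sketch: list_recent_delete_markers is a hypothetical name, and it assumes the timestamp is passed in the same format the CLI prints for LastModified, because the comparison is a plain string comparison done by the CLI's JMESPath implementation.

```shell
# Hypothetical helper: print "key<TAB>version-id" for each current delete
# marker under a prefix whose LastModified is at or after the given time.
list_recent_delete_markers() {
    local bucket="$1" prefix="$2" since="$3" query
    # Inside the single-quoted format string the backticks are literal
    # JMESPath literal delimiters, and %s is replaced by the timestamp.
    query=$(printf 'DeleteMarkers[?IsLatest==`true` && LastModified>=`"%s"`].[Key,VersionId]' "$since")
    aws s3api list-object-versions \
        --bucket "$bucket" \
        --prefix "$prefix" \
        --output text \
        --query "$query"
}

# Usage (values taken from the examples in this post):
#   list_recent_delete_markers aws-dev01-sample-bucket SIT/USER/gshah/ 2020-07-20T13:59:52.000Z
```

With --output text each marker comes back as one tab-separated line, which is easier to feed into a loop than the JSON form.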

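Now let’s recover those files.
For this I need to delete the delete marker for each file, passing the marker’s VersionId from the listing above with --version-id. Without --version-id, delete-object does not remove the marker; it just adds another delete marker on top.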

[gshah@aws-dev restore]$ aws s3api delete-object --bucket aws-dev01-sample-bucket --key SIT/USER/gshah/xyz.txt --version-id e1LYLX92jBtLgbXp92.nZdf0yFKZ8m3I
{
    "VersionId": "e1LYLX92jBtLgbXp92.nZdf0yFKZ8m3I",
    "DeleteMarker": true
}
[gshah@aws-dev restore]$ aws s3api delete-object --bucket aws-dev01-sample-bucket --key SIT/USER/gshah/abc.txt --version-id _Air1UwjRGOqln65oSp5xiCO06ZPwocP
{
    "VersionId": "_Air1UwjRGOqln65oSp5xiCO06ZPwocP",
    "DeleteMarker": true
}
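With more than a couple of files, removing the markers one by one gets tedious. Here is a rough sketch of a loop that does it in bulk. The function name remove_delete_markers and the DRY_RUN switch are made up for this example; it assumes its input is one "key<TAB>version-id" pair per line, which is what list-object-versions prints with --output text and a [Key,VersionId] query.

```shell
# Hypothetical helper: read "key<TAB>version-id" lines on stdin and delete
# each delete marker. With DRY_RUN=1 the commands are only printed.
remove_delete_markers() {
    local bucket="$1" key version tab
    tab=$(printf '\t')
    while IFS="$tab" read -r key version; do
        # Skip blank or incomplete lines.
        [ -n "$key" ] && [ -n "$version" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'aws s3api delete-object --bucket %s --key %s --version-id %s\n' \
                "$bucket" "$key" "$version"
        else
            # </dev/null keeps the command from consuming the loop's input.
            aws s3api delete-object \
                --bucket "$bucket" --key "$key" --version-id "$version" </dev/null
        fi
    done
}

# Usage (bucket and prefix taken from the examples in this post):
#   aws s3api list-object-versions \
#       --bucket aws-dev01-sample-bucket --prefix SIT/USER/gshah/ \
#       --output text --query 'DeleteMarkers[?IsLatest==`true`].[Key,VersionId]' \
#     | DRY_RUN=1 remove_delete_markers aws-dev01-sample-bucket
```

Running it with DRY_RUN=1 first lets you check which markers would be removed before touching anything.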

Let’s see if we got those files back.

[gshah@aws-dev restore]$ aws s3 ls s3://aws-dev01-sample-bucket/SIT/USER/gshah/                                                                              
2020-07-20 13:50:12          0 1.txt
2020-07-20 13:50:18          0 2.txt
2020-07-20 13:50:25          0 3.txt
2020-07-20 13:42:28          0 abc.txt
2020-07-20 13:43:01          0 xyz.txt