Amazon Cloud Drive: Photos Migration

Linux
Documenting the migration of my photos and videos from Amazon Cloud Drive to a self-hosted solution.
Author

Vinh Nguyen

Published

January 26, 2023

Background

I was lured towards Amazon Cloud Drive in 2015 due to their unlimited storage offer. When that went away, I continued to pay for the service for two reasons:

  1. my family’s collection of photos and videos, over a terabyte worth (mainly synced from our phones), was already with Amazon (status quo); and
  2. I really like the Amazon Photos app: automatic syncing in original quality, files stored and viewable in a folder structure via the Cloud Drive front end, creation of albums independent of file location / folder structure, facial detection, past memory reminders, and robust sharing capabilities.

With Amazon’s announcement in July 2022 that Amazon Cloud Drive is shutting down in December 2023, I’ve decided to cancel my remaining Amazon Photos subscription. Although I have the option of continuing to use Amazon Photos for my media storage and viewing needs, I dislike that I no longer have a folder structure view of my files, and who knows, maybe something else will change or get discontinued down the line. This is just like Google Reader all over again.

I explored Google Photos and Nextcloud as replacement services, but opted for the following self-hosted setup to accommodate my family’s growing collection of photos and videos and to retain control:

  • Back end: store all of my photos and videos locally on my NAS via a directory structure to minimize cloud storage cost.
  • Front end: self-host PhotoPrism via Docker and use it to view and share my photos and videos via a web interface or the PWA on Android and iOS devices. I don’t use the AI features (such as facial detection) much, but PhotoPrism makes it easy to create albums and share files.
  • Sync: use the PhotoSync app to automate the backup of photos and videos on mobile devices. This happens several times a day. On Android, I could also use Syncthing (a free option).
  • Update front end: use a cron job to re-index my files in PhotoPrism every hour (to pick up files newly added by syncing).
  • Backup back end: use a cron job to back up my photos and videos to Google Drive via rclone (to have an off-site backup, and to preserve my directory structure).
  • Albums: when creating photo albums in PhotoPrism, try my best to first move the files into an album folder, just in case I decide to use something other than PhotoPrism as a front-end interface in the future.
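
A minimal sketch of the two scheduled jobs from the list above. The directory paths, the Compose service name photoprism, the rclone remote name gdrive, the script name, and the nightly backup time are all assumptions for illustration; only the hourly re-index interval comes from the setup described here.

```shell
#!/bin/sh
# Sketch only: every path, the service name "photoprism", and the rclone
# remote "gdrive" are placeholders (assumptions), not the actual setup.

PHOTOS_DIR=${PHOTOS_DIR:-/srv/photos}
COMPOSE_FILE=${COMPOSE_FILE:-/srv/photoprism/docker-compose.yml}
REMOTE=${REMOTE:-gdrive:photos}

reindex() {
    # -T skips pseudo-TTY allocation, since cron jobs have no terminal
    docker compose -f "$COMPOSE_FILE" exec -T photoprism photoprism index
}

backup() {
    # Mirror the local tree to the remote, keeping the directory layout
    rclone sync "$PHOTOS_DIR" "$REMOTE"
}

# Run the job named by the first argument, if any
case "${1:-}" in
    reindex) reindex ;;
    backup)  backup ;;
esac

# Example crontab entries (hypothetical script path and backup time):
#   0 * * * *   /usr/local/bin/photo-jobs.sh reindex
#   30 2 * * *  /usr/local/bin/photo-jobs.sh backup
```

Note that rclone sync makes the destination match the source, so a file deleted on the NAS is also removed from the remote on the next run; rclone copy is the alternative if the off-site copy should only ever grow.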

Migration of Photos and Videos from Amazon Cloud Drive to NAS

Amazon only allows the user to bulk download their files via the Amazon Photos Desktop app. I really dislike the limited export / take-out options, which is one reason I opted for a self-hosted setup.

The app allows one to download files from Cloud Drive’s folder structure, or download selected albums from Amazon Photos. One struggle I faced is that I have 1.5TB worth of data, and I don’t have that much storage on my Windows laptop (my NAS, which has plenty of storage, runs Linux, but there is no Desktop app for Linux).

I originally tried to download the files (stored in Cloud Drive’s Pictures folder) onto a 500GB external hard drive, then copy the files over to the NAS, in batches. However, this was very time-consuming, and to ensure all files are copied, I would have to determine the batches of files ahead of time on Amazon Cloud Drive.

What I ended up doing was mounting my NAS as a Samba drive on my laptop (say, Z:). I then chose this drive as the destination for exporting all of my files (Pictures/) from the Amazon Photos Desktop app. It took about a week to get all of my files onto my NAS in one run.

The next issue I had to resolve was recovering my album data. In the previous download, the files transferred successfully, but they are organized by the source they were uploaded from rather than by album. For example, my photos and videos are stored in Pictures/Pixel 7/Camera and Pictures/iPhone for two mobile devices. To recover my albums, I used the Amazon Photos Desktop app to download all of my albums onto the Samba drive (so all files in an album are downloaded into its own album folder). These files are essentially duplicates of the files previously downloaded.

On the NAS (Linux), I relied on the fdupes program (installable via apt on Debian-based systems) to export a list of duplicate files. Essentially, I wanted to keep the copy of each photo/video from the albums download and delete the duplicates in Pictures/Pixel 7/Camera and Pictures/iPhone. I wrote the output of fdupes to a file (fdupes.txt) and reviewed it rather than letting fdupes do the deleting. The output looks something like:

Amazon Photos Downloads/Minh Hieu preschool 2019-2020/2020-06-13_09-05-35_348.jpeg
Amazon Photos Downloads/Pictures/Minh Chau's iPhone/2020-06-13_09-05-35_348.jpeg

Amazon Photos Downloads/Minh Hieu preschool 2019-2020/2020-06-13_08-24-03_438.jpeg
Amazon Photos Downloads/Pictures/Minh Chau's iPhone/2020-06-13_08-24-03_438.jpeg

...

As seen, each set of duplicate files is separated from the next by a blank line. In this example, I want to keep the first copy in each set (the one in an album folder) and delete the rest. To be safe, I wrote a script to process this output, grab every copy after the first, and generate bash commands that move those files to an amazon_delete folder instead of deleting them outright (fearing I might delete the wrong files). The new output looks something like:

mkdir -p "`dirname \"amazon_delete/Amazon Photos Downloads/Pictures/Minh Chau's iPhone/2020-06-13_08-24-03_438.jpeg\"`"; mv "Amazon Photos Downloads/Pictures/Minh Chau's iPhone/2020-06-13_08-24-03_438.jpeg" "amazon_delete/Amazon Photos Downloads/Pictures/Minh Chau's iPhone/2020-06-13_08-24-03_438.jpeg"
...
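
The original script is not shown, so the following is only a sketch of how commands like these could be generated from fdupes.txt. The function names quote and gen_move_commands are hypothetical, and the sketch assumes that the first path in every set is the copy to keep, which holds for the sample above but may not hold for every set.

```shell
#!/bin/sh
# Sketch: turn fdupes output into reviewable "move to quarantine" commands.
# Assumption: sets are separated by blank lines and the first path in each
# set is the copy to keep.

quote() {
    # Wrap the argument in single quotes, turning each embedded ' into '\''
    # so names like "Minh Chau's iPhone" survive being re-read by the shell.
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

gen_move_commands() {
    list_file=$1
    quarantine=${2:-amazon_delete}
    keep_next=1                      # 1 while waiting for a set's first path

    while IFS= read -r path || [ -n "$path" ]; do
        if [ -z "$path" ]; then
            keep_next=1              # blank line: the next path starts a new set
            continue
        fi
        if [ "$keep_next" -eq 1 ]; then
            keep_next=0              # first copy in the set: leave it in place
            continue
        fi
        dest="$quarantine/$path"
        # Create the mirrored parent directory, then move the duplicate into it
        printf 'mkdir -p %s && mv %s %s\n' \
            "$(quote "${dest%/*}")" "$(quote "$path")" "$(quote "$dest")"
    done < "$list_file"
}

# Possible usage (the fdupes invocation is an assumption; -r recurses):
#   fdupes -r "Amazon Photos Downloads" > fdupes.txt
#   gen_move_commands fdupes.txt > move_dupes.sh
#   sh move_dupes.sh     # only after reviewing move_dupes.sh
```

Writing the commands to a file rather than running them directly keeps the review step, and single-quoting each path avoids surprises from spaces, apostrophes, or dollar signs in file names.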

After reviewing the generated commands, I then executed them in bash. All duplicate files are now in amazon_delete. This folder can be deleted afterwards.