Data Loss
The theme of this post is data loss. The post is in fact a celebration of being able to post at all, as for 2-3 weeks the raspberry pi I host this site on was down! I shall tell the long, winding, and amateurish tale of my attempts to revive it, involving side quests to the lands of “google drive”, “ubuntu” and, improbably, “baidu wangpan”.
SD Card disaster
It all kicks off when I found my raspberry pi acting up - somehow it was being very slow on ssh, and even unable to open files. This headscratcher wasn’t fixed by rebooting, in fact the pi became unresponsive, and not seemingly for the usual internet issues. Attaching it to a screen, it became apparent that something was majorly off with its SD card (OS and file storage for the Pi).
I immediately became paranoid about my files on the Pi - though nothing very important, I had been patchy in my usage of git, and particularly had invested sweat and tears in config and set-up to make it usable as a website hoster. Viewing the SD card with my macbook (via a dongle’s SD card reader slot), no files seemed apparent (I later learnt that actually probably my mac just couldn’t read the SD card’s format), so I hastened to grab my not-much-used external SSD (500GB), and found a program (balena etcher) to clone/flash the SD card onto it. At least now if the SD card became totally unreadable, perhaps the files could still be recovered from the SSD.
Everything is breaking; more supplies needed
Said dongle’s HDMI port happened to break the next day, with no replugging seeming to be effective. A (rare?) case where buying the cheapest available on amazon doesn’t turn out so well. Thus with a new SD card and dongle on my to-buy list, I trooped to Zurich HB. Running back and forth between Interdiscount, Fust, and Microspot, I found a shinier looking dongle (70 CHF and sadly fewer ports than before), and a 32 GB SD card (9 CHF).
A storm brews in the Baidu cloud
Some context is necessary: I have had a long running endeavour since 2020 trying to find a way to watch the anime hunter x hunter with mandarin dubs and subs. There exists a Taiwanese dub, but due to my incompetence in surfing the chinese internet, I had been unable to find a good place to watch it, especially one without traditional chinese subs - I wanted simplified subs. Now a couple of months ago, I’d found someone posting on bilibili (like a chinese youtube) a short clip of high quality but trad subbed hunter x hunter, and promising to have the whole series - just a DM away (my plan was to either find trad subtitle .srt files to translate to simplified, or use OCR to extract the subtitles myself). I’d made a bilibili account, DM’d them a request, and forgotten about it. Checking back recently, I saw they’d asked to connect on WeChat (like chinese Whatsapp/Messenger). They actually wanted 15 yuan for the promise of sharing, which luckily a kind chinese friend was willing to send on my behalf (chinese paypal is beyond me). They sent a link, and Shazam! I now had access to a Baidu wangpan drive (like chinese google drive) of 150 high quality episodes.
But what has this to do with data loss and the raspberry pi? The answer is twofold:
-
These two events happened synchronously, and both were time pressured. I felt pressure on fixing the Pi as I was using it to host my website, my poor attempt at showing prospective employers some kind of frontend competence on my part; I didn’t want them to click the link on my CV and get a non-functioning personal website. For my anime stash, I had three days to get the files off Baidu before the link expired. Likely my link provider could just give me another, but I didn’t want to tempt fate, or a larger ransom.
-
150 episodes of anime is actually 40 GB, and I was worried about running out of space on my laptop. And now my handy 500 GB drive was temporarily hosting the dead brains of my Pi.
Two chinese sorcerers come to the rescue
Under this pressure, I couldn’t even figure out how to use my precious drive link - to access or download the files I needed a Baidu account. Chinese internet being what it is, to make an account you need a chinese phone number to receive 2fa texts (some alternative sign up for international users seemed to no longer work). I rang up my Mum, who has a chinese phone number, but was unable to make signing up work. I was considering trying again over a VPN to appear to be in Hong Kong / China (an absurd reversal), but luckily another chinese friend came through, and was happy to lend me their own unused account.
I was now able to transfer the anime files from one cloud drive to my friend’s, where I hoped they would be safe and non-expiring. Although Baidu ridiculously said I’d occupied 40 GB / 5 GB available storate space, the files still seemed to be downloadable, albeit at very slow speeds (about the same time to download an episode as its actual runtime. Baidu seems to make money from charging users for higher download rates).
Enter Ubuntu and GParted; A wise sage speaks
The anime situation sorted out for now, I was able to turn my attention back to saving the Pi. Prodding at the SD card with my Mac proved ineffective, and ChatGPT advised trying with a linux computer (it takes one to know one). A failed venture to use Ubuntu on uni desktops later (no sudo access on the built in OS; Impervious to booting from an external thumbdrive), I remembered my old ex-laptop at home, which was happy to be exeternally booted. It turns out that GParted which comes as default with Ubuntu was just what I needed! My flatmate Steven, an experienced sysadmin in a past life, gave helpful pointers on the mysteries of partitions. Et voilà I could now see a whole file system on my old SD card (and also copied into a separate partition on my SSD). Almost all my files were happily still residing under /home/pi/
, however the /dev/
folder was oddly completely missing (my nginx configs ).
Pi-oenix rises from the ashes
With the recovered files transferred onto the new raspbian SD card, the Pi could finally be restored to almost its former glory! I rewrite my nginx configs and set-up port forwarding with a fraction of the time taken previously. I however dare not relaunch the temperature sensing program I’d had merrily running previously on my Pi - perhaps the constant writing to memory might have contributed to the SD card failure? A problem to be solved another day, and my room shall remain unknowingly chilly in the meantime. Wandb annoyingly has no “online only logging” setting , so I think every log command writes locally to disk as well.
Data is backed up, but at what cost? (google drive)
With meditations on the importance of having data backed up and in order, and also realising that my macbook’s Google Drive for Desktop is constantly consuming 7GB of RAM for unknown reasons, I decide to do a thorough cleaning of all the old junk files that are constantly synced on my laptop and the cloud. I flirt with Rclone but it seems far too slow (cloud provider throttling?), and settle on keeping a much smaller subset of my cloud files to be synced rather than literally all of them. Many useless files are also tossed!
A final twist of the blade
The weary adventurer is on the final leg of the journey. Almost everything is as it should be. He starts to write this very blog post on the silently purring Pi, using the latest in adventuring software - VScode remote editor. He’s only a couple of paragraphs in, and suddenly connection to the remote host breaks, and cannot be re-established. No amount of playing around with the VScode settings.json
has effect, even though SSHing normally through the terminal is still fine. The words written so far have vanished into the ether. The adventurer sighs, and admits that, like the nginx config, some things are better rewritten from scratch. And one more better data practice is added to the list - remote code transfer should be done offline through git.
Somethings were lost. Somethings were recovered. Somethings were learnt and changed for the better. The kingdom is at peace again, for now…