After almost an entire week of not being able to back up to Crashplan because of “archive maintenance” on their part, I was informed today that instead of merely maintaining my backup archive, Crashplan LOST THE ENTIRE THING. That’s right, Crashplan lost all of my data, unrecoverably. It’s gone – my entire online backup archive of all of my data – my life’s work – 2.4 TB of everything – completely gone. And Crashplan doesn’t seem to have any remorse.
I’ve been using online backup for a number of years. Before I switched to Crashplan, I was using Backblaze for online backup. Backblaze was great – a lightweight, easy-to-use front end, fast backup speed, low system resources, good support response times, and most importantly, a reliable service. However, the one thing Backblaze didn’t do at the time was offer unlimited or very long retention. Backblaze would delete data from my archive if it didn’t see that data connected to my computer for a month – so if I went on vacation for 2 months and stored away my external hard drives, my data would be automatically expunged at the end of the first month. Because of this little quirk of their service, I decided to switch backup providers to Crashplan. Note that nowadays, I believe that Backblaze retains data for 3-6 months, which is much much better.
Anyway, I decided that having longer than 1 month retention times for my data was important, and switched to Crashplan. My initial impression of Crashplan was good – a flexible service with lots of configuration options. However, over the last few months of using it, and leading up to this giant Crashplan data loss disaster, I have seen a darker side. Backup upload speed was nothing like what I saw using Backblaze. I would usually only see 100kbps or so upload speed, even when I knew I had much more bandwidth. The backup service would load slowly, and would frequently re-scan my computer. The UI was finicky – if the wrong settings were changed, my entire backup archive would be expunged by the system – with no time to go back and undo anything. Sure, they give big warnings of this, but a small grace period would be nice. And throughout all of these issues, Crashplan support sucked. I’d wait days to get a response, and had multiple support agents dealing with my case on multiple threads – the whole thing was a mess.
So when I received the below email this afternoon, I was extremely disappointed and immediately concerned for the safety of my now un-backed-up data, but unfortunately I wasn’t very surprised – this is just the kind of disaster I’ve come to expect from Crashplan.
Below you’ll find a transcript of my current support history, starting from the time when I reported my “unable to backup” issue, and ending with my response to their meager attempt to remedy the situation.
I had a brief call with tech manager Brad W. this evening about the issue – he was a nice, knowledgeable guy, and offered me a 2 TB seed drive to re-seed my data. He also confirmed that mine is the only account this data loss situation happened to, and that they’re interested in finding a suitable solution. I haven’t accepted a solution yet.
For now, the situation is unresolved, and my data is still gone. And since I’m still traveling at the moment, I don’t even know when I’ll be able to get home to re-seed all of my data to a drive, even if they do overnight one to me.
What should I do? Should I accept their 2 TB seed drive resolution and move on? Should I jump ship and back up with somebody else?
Crashplan Support Thread
What happened to my data?! “Unable to restore due to a backup archive I/O error”
Dec-09 2011 01:58 pm
I’m very concerned.. I was looking into my crashplan settings to see
why my backup was taking so long, and noticed that although it says
I’m connected, it’s saying the backup destination is unavailable. Also
online I get a notice when trying to view my flies “Unable to restore
due to a backup archive I/O error”. What’s going on here? Is all my
data gone? Why am I not backing up? Do you need any diagnostic info
from my computer or anything? Help help – there have been way way too
many hiccups with my backup so far since I switched from backblaze to
crashplan. Once again thinking of switching back. I just don’t want to
be always worrying about my backup crashing all the time – that is not
screenshots attached. thanks!
Screen Shot 2011-12-09 at 2.54.43 PM.png
Screen Shot 2011-12-09 at 2.53.16 PM.png
Dec-10 2011 02:05 pm
First, your data is still on our servers and safe. Your archive is in queue for maintenance. Unfortunately, there are a few large archives ahead of yours in the queue. Archive maintenance is necessary to ensure reliable data for restoration. This is also what caused the error that you saw when you went to the Web Restore portal.
As soon as archive maintenance has run on your archive, your backups will resume normally. Unfortunately, you’ll have to wait for the other archives in front of yours in the queue first.
Please let us know if you have further questions.
Dec-10 2011 03:16 pm
Hi David, thanks for your response!
It seems like my archive has been in the queue for quite some time, and as a result I’ve been unable to back up for a few days – how can this happen? One of the reasons I chose Crashplan was the promise of “Real-time continuous backup” – which I took to mean that my files would always be backed up, as my upload bandwidth permitted – so that when I added or changed files, they would be backed up ASAP. I’m a little disappointed that maintenance on your side would result in me being unable to backup for so long. Can anything be done restore my backup functionality I paid for? Also, I understand that planned system maintenance is probably expected – and bet that I actually agreed to it in the license agreement I accepted – although I can’t find any type of availability reference in this doc – http://support.crashplan.com/doku.php/eula – but I’m not a lawyer and am almost definitely not reading/understanding it correctly.
Anyway, let me know if anything can be done to get my backup restarted, or if it would be at all possible for me to get a drive to backup to while I wait for my online backup to be restored.. Could I do another seed drive and have it added to my existing backup archive (not replace it)?
Dec-10 2011 03:48 pm
Unfortunately, another Seed Drive would replace your archive rather than be added to the current archive. Also, the only thing we can do while your computer is in the maintenance queue is to wait for it to complete. This is part of the reason that we highly recommend that you use CrashPlan to backup to more than one destination (i.e. another computer or an external harddrive).
Dec-10 2011 04:38 pm
This is rediculous. It’s been 4 days since I was able to back up, and now I’m receiving warning emails from your system – for a problem that I have no control over! I actually DO backup to other locations with Time Machine – however that has nothing to do with the fact that Crashplan+, which I pay for, has not allowed me to backup for 4 days! What is going on here? Why in the world would I pay for such shitty service? Since I switched from backblaze to crashplan, I’ve had nothing but problems.
Please fix this, immediately.
Dec-12 2011 02:35 pm
Thanks for calling–give me a call back when that scan finishes!
Dec-12 2011 02:46 pm
Could you please send me your CrashPlan log files so I can take a closer look at the behavior you’re seeing? Please do the following:
- Open CrashPlan
- Double click the CrashPlan logo in the upper right
- Type: getlogs [REDACTED]
- Press enter
CrashPlan will automatically zip up your log files and attach them to this ticket.
Dec-12 2011 02:49 pm
Okay–got a response back *right* away! Apparently after the archive move we ran a maintenance job on it (which is normal) and yours says it’s still pending because there were a few folks ahead of you in line. I’ve ‘bumped’ you to the top of the line so your archive should run maintenance and finish tonight, hopefully. It should be ready to access later on tonight/tomorrow. I hope this helps.
Dec-12 2011 02:58 pm
Ok, thanks very much for bumping me up to the front of the line – hopefully I’ll be able to backup soon.
Is suspending a paying customers backup for 4 days normal or acceptable? This seems like a bit long to go without backing up.. what do you think?
Dec-12 2011 03:18 pm
Thank you for your patience as we looked into this issue. Unfortunately, I have some bad news regarding your maintenance job. Your backup archive had corrupted data, and the archive could not be properly repaired by archive maintenance in this situation. This is a very rare situation. We are still looking into the root cause, but unfortunately much of the archive was affected and very little was able to be rebuilt. I sincerely apologize for this situation and the inconvenience of this event.
We definitely want to get you backed up and secured again, and I can expedite shipment of a seed drive to you so that we can upload your data again. We will cover the cost of the seed drive and all shipment, and we can get this out to you today if that would work for you. Could you contact us back with the shipping address, or simply confirm that the shipping address we previous used is the one you wish to use again?
From our records, I have the shipping address as:
Please let us know if we should use this one, or if you’d prefer a different address. Again, please accept our apology for this issue. Please contact me if you have any questions.
Dec-13 2011 03:17 pm
Thanks for the heads up on the issue. After not being able to backup for almost a week, and now being informed that my ENTIRE BACKUP IS LOST, I’m extremely unhappy. How in the world can this situation possibly occur? Crashplan advertises itself as a reliable, real time backup solution. I was sold on the promise that my data with Crashplan would always be protected, maintained, and available to me at all times through the app, website, and mobile. app. To tell me that my data not only became corrupted, but was also unrepairable and then completely lost is absolutely unacceptable, for any level of service. Crashplan was my primary backup, and I’m extremely scared now that my data is unprotected and GONE.
I understand how competitive the online backup industry has become recently, and that one of the main hallmarks of marketing and competition is the reliability and ease of use of your service. You advertise both “Files secured in data centers worldwide” and “Real-time continuous backup”, yet I have experience none of this. My backup has been stalled for almost a week, and now your data center has completely lost my data.
Up until I started having issues with Crashplan last week, I had actually been recommending you to various friends and family. In fact, I set up both my Mother and Sister with Crashplan. How do I explain to them that the service I had once recommended has completely lost my data?
Regarding your “solution” – it’s weak, and highlights the fact that you really don’t care about me or my data. You’re offering to send me ONE measly 1tb seed drive? Did you not look at my account and see that I have 2.5 tb of data selected? At very very very least, you could offer to seed all of my current data to your online data center. A 1TB seed drive service costs $124.99 – so are you saying that all of my data – my life’s work – is only worth $124.99 to you? That’s insane.
I understand that periodically data issues do occur, and I appreciate you owning up to the issue. I’d like to have my confidence in your service restored and continue out the rest of my 4+ year prepaid account. I’d like a swift and appropriate resolution to this issue. Please let me know what you can do for me. Also, look forward to my blog post about this, and associated information sent over to Consumerist, ETC.
Your un-backed-up almost former customer,
Dec-13 2011 05:00 pm
Here’s a recap of our phone call.
Here is what we can do to get you backed up again as quickly as possible.
We can send you a 2TB hard drive so you can seed your data to that. The remainder of your backup archive will need to be uploaded via the internet. We can comp you for the time it takes for the last part of your backup archive to upload to our servers.
We have your email address so we can send out a drive overnight once you make your final decision as to what you want to do.
Dec-13 2011 05:26 pm
**UPDATE 2011-12-14 13:03 EST**
Looks like this is actually not an isolated issue, despite what Crashplan told me point blank on the phone last night. Facebook user Diane Dusek just posted on the Crashplan Facebook wall that she is having a very very similar issue. Screenshot of the Facebook conversation:
**UPDATE 2011-12-14 17:30 EST**
It’s 5:30 – the end of the day. I’ve called Crashplan twice today to talk to somebody about resolving my issue and moving on. On the first call I was put on hold for a brief minute, and when the guy came back on the line I was told that somebody could call me back shortly. 30 minutes later and no call, I called back and was once again put on hold, then told that somebody would definitely call me back within 30 minutes. That was 45 minutes ago. I’m simply being brushed off by Crashplan. Do they think that this issue will simply go away, like all of my data did? I couldn’t have imagined how they could make the situation of losing a customer’s data worse, but somehow they managed to.
**UPDATE 2011-12-14 19:09 EST**
I just received a personal call from Matthew Dornquast of Code 42 Software, makers of Crashplan. He was extremely apologetic, and did a great job of being nice and understanding, and of explaining the entire situation to me in as much detail as he could. I took a few notes from the call, and with Matthew’s permission, am publishing them. In the end, Crashplan is sending me out a 3 TB seed drive so I can get my entire archive back online, and they’re also issuing me an account credit for the previous seed drive service I had purchased. Also of note, Matthew mentioned that he had not in fact read my blog post on the subject, but did read through the support thread, and heard from the Crashplan social media person that there was an issue that needed his attention.
Notes from call with Code 42’s Matthew Dornquast
- The problem with data being lost was due initially to a hardware failure in one of the data centers Crashplan uses
- Crashplan manages 62 petabytes of data across its data centers, which is a huge amount.
- In this incident, a single server had a hardware issue, which normally wouldn’t take down the whole thing and cause any data loss.
- However, there was also an element of human error in this case, which ultimately caused a loss of data for about 20 customers.
- Not all 20 customers lost all of their data – only some did.
- Only 20 users were affected, out of the millions of Crashplan customers
- Human error was a blind spot for Crashplan’s system, and there was no process in place to prevent it.
- As of now, there’s a new process to prevent this type of human error from happening again.
- This year Crashplan is expecting 380% growth. As a growing company, they are obviously still having growing pains.
- For the 20 people affected by this snafu, Crashplan is offering free seeding to get data back into cloud.
- Crashplan is a backup company, not an archive company. Don’t put all eggs in one basket – multiple backups is the way to go so that if one backup fails, there is a second backup. [ed. Which I do - I use Apple's Time Machine as my local backup]
- In this incident, the Crashplan system software detected the data issue and started healing immediately.
- System was able to repair a lot of the data, and of the 20 people affected, very few actually lost all of their data.
- Crashplan normally only offers 1 TB seeding drives. But in this special case I can be sent a 3 TB drive for seeding my 2.4 TB of data. [Although Brad told me last night that it was technically impossible for them to import more than 2 TB of data from a seed drive into their system, Matthew assured me that his team would figure out a way to make it happen on his request]
- Looking forward, Crashplan is coming out with new mobile stuff – iPad, android, windows phone 7 apps
- New features for travelers – don’t use mobile access points, don’t run on battery power.
- Summary – screwup by software people, 20 people affected, not the end of the world, working hard to get affected customers backed up again, putting systems in place to prevent this from happening again.
- Responsible message to remind people of: Crashplan isn’t an archive service, it’s a backup service. Important to remember that multiple backups are the way to go for better security.
**UPDATE 2011-12-15 13:20 EST**
Crashplan released an official statement on the issue that caused the loss of my data:
On Dec 13th a storage node in one of our Minneapolis data centers experienced a hardware failure. While typically a non-event, additional human error escalated this failure into backup archive corruption for 20 of our customers, affecting their backup in varying degrees.
Since CrashPlan automatically detects and heals around archive corruption, the affected customers don’t need to do anything – their backup archives will automatically heal over the internet. (However, to accelerate this process, we have offered overnight free seeding services to those customers.)
While CrashPlan manages over 60 petabytes of consumer data globally and this event was limited to 20 people, we nevertheless take this failure very seriously and sincerely apologize for this lapse.
Further, our operations team has modified our processes to avoid this human error in the future.
**UPDATE 2012-01-07 22:43 MST**
I’ve redacted my actual ticket number from the command line noted above. It seems that a user was able to successfully add their own crashplan’s log files to my support thread by using this line.. not good at all.
**UPDATE 2012-01-07 10:33 EDT**
Reader and beauty photographer Ashley Karyl just commented below that she’s been having big issues with Crashplan too. She’s also noticed some new behavior. According to network monitoring tool Little Snitch running on Ashley’s computer, Crashplan made a connection to the Amazon Web Services S3 cloud storage service this morning. The behavior deviates from the standard Crashplan behavior of connecting directly to Crashplan’s servers for backup. Crashplan has obviously had its share of issues lately. There’s even a static message posted on its support site indicating that they’re getting more support issues than normal:
“We’re currently seeing unprecedented demand for technical support, due to our recent rapid growth. As a result, our support response times are significantly longer than we want them to be.
Realtime live phone support and chat is available for customers experiencing urgent issues (help restoring files, lost or stolen hardware).
We are working diligently to resolve the delayed response times and apologize for the inconvenience.”
So could this Amazon AWS connection attempt indicate that Crashplan is making an emergency switch over to S3 to fix their own server farm’s I/O issues? Maybe, maybe not. Also to be considered is the fact that Crashplan just updated its software, which existing customers’ computers will automatically download and update themselves with. Developers very frequently use S3 to distribute software downloads, so Ashley’s S3 connection may simply have been to download the latest version of the Crashplan client. In fact, I think this is very likely the case.
However, it’s still worth noting that things have been a bit bumpy for Crashplan lately, as evidenced by the static support message, and increased number of comments posted here.
Below is Ashley’s email explanation, and screenshot of her Little Snitch connection log.
I didn’t grab a screenshot of the message as it popped up in Little Snitch, however I did just find it in the rules and I’ve attached a screenshot of that instead. Immediately afterwards the upload started to work normally and has continued for the last few hours. It could be that Crashplan are simply using S3 for the download of updated software, which many developers do, however it would seem strange in a case like this given the network they have.
At the moment it is uploading nicely but this is after a week of problems and I’m still looking at a month to upload the rest assuming there are no further delays. I’m still on the fence about signing up but some options are simply too expensive and it’s a pain for me to start again with this slow connection while I’m waiting for fibre to be enabled in this area.
PS I’m a photographer as well and want my archives backed up offsite to ensure I am covered if there was a fire or theft at the house.
**UPDATE 2012-04-05 17:13 EDT**
In light of the torrent of recent comments, and in the interest of getting some sort of official response from Crashplan on this ongoing issue, I’ve just tweeted directly to Code42 Software’s CEO Matthew Dornquast. During my initial issues with Crashplan, which prompted me to write this blog post, I had a phone call with Matthew – he was generally very nice, knowledgeable, and pleasant to talk to, and helped speed me through the process of getting my backup re-seeded. Hopefully he’ll take heed of this post and let everybody know what’s going on here, and why users are continuing to experience issues.
— Jeffrey Donenfeld (@Jeffzilla) April 5, 2012
After having Crashplan crash multiple times on my MacBook Air running 10.8.2, I wrote in to support to ask for advice. Their response? “You’re running out of memory and need to manually increase memory.” Really? I have to manually fix your app that keeps crashing? Terrible.
Here’s my initial problem:
Whats going on here? My computer has been on and connected to the internet just fine lately… Error message I received: “Computer “My Computer Name” has been unable to reach any backup destinations for 3 DAYS. Back up to multiple destinations to reduce your risk of losing data.”
I forwarded Kelly my Crashplan log files. Then her response after reviewing my log files:
Good day Jeffrey,
I believe that you may not be backing up due to out of memory errors. Looking through your logs it appears that the CrashPlan backup engine is running out of memory. Please follow the instructions below to allocate more memory to the CrashPlan backup engine.
Edit the CrashPlan engine’s com.crashplan.engine.plist (“the plist file”) file to allow it to use more java memory. You will need to use Terminal for this, and edit the file using the ‘sudo’ command. When prompted, type in your computer’s Admin password. Note that you will not see the text-cursor move when you do – this is normal.
1. Stop the backup engine by typing this into the Terminal application:
sudo launchctl unload /Library/LaunchDaemons/com.crashplan.engine.plist
2. Run this command to edit the backup engine:
sudo nano /Library/LaunchDaemons/com.crashplan.engine.plist
3. In /Library/LaunchDaemons/com.crashplan.engine.plist, find this line: -Xmx512m
4. Edit that line to something larger such as 640, 768, 896, or 1024. E.g.: -Xmx1024m
This sets the maximum amount of memory that CrashPlan can use. CrashPlan will not use that much until it needs it. I would recommend starting out setting it to 768, and go higher only if you continue experiencing problems. You can increase it above 1024 if you have a really large file-selection.
5. Hold the Control key and tap the x key to exit. Choose “y” to confirm it.
6. You’ll see the prompt “File Name to Write.” Hit enter to save to the existing location.
7. Start the backup engine by typing:
sudo launchctl load /Library/LaunchDaemons/com.crashplan.engine.plist
If you have any problems with this process, you can call us directly at 1-855-411-4242. We are available during the week 7-7 Central Time. Please let me know if you have any questions.
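For anyone who would rather not edit the file by hand in nano, steps 2 through 6 of Kelly's instructions could be sketched as a small shell function. This is only a sketch under a few assumptions: `set_java_heap` is a hypothetical helper name (not something Crashplan ships), the plist is assumed to contain a `-Xmx<number>m` entry as the email describes, and the backup copy location is my own choice. `-Xmx` is the standard Java option for the maximum heap size, which is why changing the number raises the memory ceiling.

```shell
#!/bin/sh
# Sketch only: set_java_heap is a hypothetical helper, not a Crashplan tool.
# It replaces the -Xmx<N>m value in the given file with a new size in MB.
set_java_heap() {
    plist=$1      # path to the file to edit
    new_mb=$2     # new maximum Java heap size in megabytes, e.g. 768

    # Refuse anything that is not a plain whole number.
    case $new_mb in
        ''|*[!0-9]*)
            echo "heap size must be a whole number of MB" >&2
            return 1
            ;;
    esac

    # Bail out if the file has no -Xmx<number>m setting to change.
    grep -q -e '-Xmx[0-9][0-9]*m' "$plist" || {
        echo "no -Xmx setting found in $plist" >&2
        return 1
    }

    # Keep a backup copy outside the original directory, then rewrite the
    # original from that copy. Avoiding "sed -i" keeps this working with
    # both the BSD sed on a Mac and GNU sed.
    backup=$(mktemp "${TMPDIR:-/tmp}/plist-backup.XXXXXX") || return 1
    cp "$plist" "$backup" || return 1
    sed "s/-Xmx[0-9][0-9]*m/-Xmx${new_mb}m/" "$backup" > "$plist" || return 1
    echo "backup saved to $backup"
}
```

Because the real plist lives under /Library/LaunchDaemons, the function would have to run as root, for example from a script started with sudo, and it would sit between the `launchctl unload` and `launchctl load` commands from steps 1 and 7. Following Kelly's advice, 768 would be the first value to try.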
So, for an out of memory problem, they’re trying to instruct me on how to go in and manually fix it. For a consumer-facing product, this seems much much much too complex to expect me to do on my own. Crashplan, fix your app.
And it seems like they are working on fixing it with a Native Mac Crashplan App!
I got a nice tidbit of information from support rep Kelly – Crashplan is working on a Mac Native Crashplan App. That’s right, native code, no Java wrapper crappiness. Here are Kelly’s words:
The reason that it is not native is because CrashPlan is a cross platform program. Currently we are working on a native Mac client, but this does not have a time frame on when it will be released.
– Kelly S., Crashplan Support
Hoping for that new native crashplan client to be released soon…