A Bloggers Worst Nightmare ...

Four tables in the database that drives my blog was corrupted sometime on Wednesday of last week.  And I did not have a backup.  This is my THS ...

 

Some Background

I use Yahoo! Small Business to host my Moveable Type (MT) blog.  For the past year and a half I have not had any problems at all.  In fact, it worked so well that I did not even know that Moveable Type was storing my entries in a MySQL database.  Looking back, clearly I should have taken the time to understand how MT works, but I didn't really think it was all that important.  I am paying Yahoo! to keep an eye on that stuff - or that's what I thought I was paying for.  I mean check out the second bullet point under Reliable and secure hosting ... 

image

 

Wednesday, July 23rd

I get an email from a nice guy named John who lets me know that he received some errors when trying to post a comment to my blog.  Here are the error messages:

  • Load of blog '1' failed: Loading data failed with SQL error Can't open file: 'mt_blog.MYI' (errno: 145)
  • Can't call method "allow_unreg_comments" on an undefined value at /blog-mt/lib/MT/App/Comments.pm line 263.

So I went Yahoo!'s online help desk and opened a help desk ticket explaining to them what the issue was.  12 hours after I opened the ticket, I received the following response (see below).  Ummm ... are they really asking for the url, user name and passwords?  I sent Jamie a response asking him to please look up this information himself since they are the ones that have it on file.  

image

 

Thursday, July 24th

Well, by 5:00 PM the following day Jamie still hasn't got back to me.  Over the course of Thursday I sent 3 more follow-up emails to Jamie (it says in his email 'Please do not hesitate to reply ...') but I don't get a reply.  So when I get home from work I make a call to Yahoo! tech support.  The automated operator tells me that ...

The current wait time is over 30 minutes - Yahoo! Help Support

'Over 30 minutes' - could you be a little more exact?  Turns out that I ended up waiting on hold for over 1 and 1/2 hours.  Finally I got someone on the line to help.  And here is how that conversation went ...

Me:       My blog isn't letting people post comments.  I opened a help desk ticket, but I am having trouble getting a response

Yahoo!:  Sorry, we don't handle MT support.

Me:       It looks like a MySQL problem.  Can you please log into the MySQL database and see if you see anything wrong?

Yahoo!:  You have four tables that are corrupted.  You will need to drop and re-create these tables.  If you had a backup I could restore these for you.

Me:        I didn't even know until yesterday that MT used a database

Yahoo!:  Here are some urls for the MySQL and MT online forums

Me:       That's it?  What am I paying you for?

Yahoo!:  Sorry, we don't handle MT support.

 

Needless to say that made me really mad.   

 

Friday, July 25th

Jamie finally gets back to.  Here is what he has to say ...  It turns out they don't actually support any of the products they are selling.

image

 

Saturday, July 26th

So I put my frustrations aside for a few minutes and planned my attack for getting my blog up and running again.  The only table that was corrupted that was of any real significance was the one that stored all of the comments people have left.  And I have all of those comments already published and out on mattberseth.com, so I wrote a series of real small console applications to accomplish the following tasks ...

  • Go out to mattberseth.com and download all of the individual entry pages.  Save a local copy of these html files.
   1: static void Main(string[] args)
   2: {
   3:     //  I exported the entry table from the MySQL admin page
   4:     var entries =
   5:         from e in XDocument.Load(@"..\..\blog.xml").Descendants("mt_entry")
   6:         select new
   7:         {
   8:             id = e.Descendants("entry_id").First().Value,
   9:             name = e.Descendants("entry_basename").First().Value,
  10:             postDate = Convert.ToDateTime(e.Descendants("entry_created_on").First().Value)
  11:         };
  12:  
  13:     if(!Directory.Exists("entries"))
  14:     {
  15:         Directory.CreateDirectory("entries");
  16:     }
  17:  
  18:     WebClient wc = new WebClient();
  19:     foreach (var entry in entries)
  20:     {
  21:         string url = string.Format("http://mattberseth.com/blog/{0}/{1}/{2}.html", entry.postDate.Year, string.Format("{0:00}", entry.postDate.Month), entry.name);
  22:  
  23:         //  download the url and save it.
  24:         wc.DownloadFile(url, Path.Combine("entries", string.Format("{0}.html", entry.name)));
  25:     }
  26:  
  27:     //  move each file into its own folder
  28:     foreach (string file in Directory.GetFiles("entries"))
  29:     {
  30:         //  create the directory
  31:         Directory.CreateDirectory(Path.Combine("entries", Path.GetFileNameWithoutExtension(file)));
  32:  
  33:         //  move the file into the dir
  34:         File.Move(file, Path.Combine(Path.Combine("entries", Path.GetFileNameWithoutExtension(file)), Path.GetFileName(file)));
  35:     }
  36: }

 

  • Rip through these files and extract the comments section
   1: static void Main(string[] args)
   2: {
   3:     //  extract the comments xml and save it into the folder
   4:     foreach (string dir in Directory.GetDirectories("entries"))
   5:     {
   6:         string entryContents = File.ReadAllText(Directory.GetFiles(dir, "*.html")[0]);
   7:  
   8:         int commentsStartIndex = entryContents.IndexOf("<div id=\"comments\" class=\"comments\">");
   9:         int commentsEndIndex = entryContents.IndexOf("<form method=\"post\" action=\"http://mattberseth.com/blog-mt/mt-comments.fcgi\"");
  10:  
  11:         string commentsXml = entryContents.Substring(commentsStartIndex, (commentsEndIndex - commentsStartIndex)) + "</div>";
  12:  
  13:         try
  14:         {
  15:             XDocument.Parse(commentsXml).Save(Path.Combine(dir, "comments.xml"), SaveOptions.None);
  16:         }
  17:         catch (Exception ex)
  18:         {
  19:             //  error
  20:             Console.WriteLine(dir);
  21:  
  22:             //  just save the file as text
  23:             File.WriteAllText(Path.Combine(dir, "comments.xml"), commentsXml);
  24:         }
  25:     }
  26: }

 

  • Rip through each of the comments.xml files and rebuild the information stored in MT comments table: author, date, content, url, etc ...  Use this information to construct the MySQL insert script that will insert these values back into the MT comment table
   1: static void Main(string[] args)
   2: {
   3:     int commentID = 1;
   4:     //  extract the comments xml and save it into the folder
   5:     foreach (string dir in Directory.GetDirectories("entries"))
   6:     {
   7:         string commentsXml = File.ReadAllText(Directory.GetFiles(dir, "*.xml")[0]);
   8:  
   9:         if(File.Exists(Path.Combine(dir, "insert.sql")))
  10:         {
  11:             File.Delete(Path.Combine(dir, "insert.sql"));
  12:         }
  13:  
  14:         //  find all of the comments in the document
  15:         foreach(XElement e in XDocument.Parse(commentsXml).Descendants().Where(x => x.HasAttributes && x.Attribute("class") != null && x.Attribute("class").Value == "comment"))
  16:         {
  17:             //  the name of the person leaving the comment
  18:             string comment_author;
  19:             //  the date the comment was entered
  20:             DateTime comment_date;
  21:             //  the payload
  22:             string comment_content = null;
  23:             //  the authors url
  24:             string comment_url = null;
  25:             //  the author emails
  26:             string comment_email = null;
  27:  
  28:             //  the comment header contains both the comment_author and the comment_date.
  29:             XElement commentHeader = e.Elements().First();
  30:  
  31:             //  the last anchor is always the comment date
  32:             comment_date = DateTime.Parse(commentHeader.Elements().Last().Value.Trim());
  33:  
  34:             //  if there are 2 anchors, the first one contains the comment_url as the href
  35:             //  and the payload is the name, otherwise the comment_author is the text between
  36:             //  the Posted by: and on tokens.
  37:             if (commentHeader.Elements().Count(x => x.Name == "a") == 2)
  38:             {
  39:                 XElement element = commentHeader.Elements().First(x => x.Name == "a");
  40:                 comment_author = element.Value.Trim();
  41:                 comment_url = element.Attribute("href").Value.Trim();
  42:  
  43:                 if (comment_url == "http://mattberseth.com")
  44:                 {
  45:                     //  its me!
  46:                     comment_email = "matt@mattberseth.com";
  47:                 }
  48:             }
  49:             else
  50:             {
  51:                 const string beginToken = "Posted by:\n";
  52:                 const string endToken = "on\n";
  53:  
  54:                 int beginIndex = commentHeader.Value.IndexOf(beginToken) + beginToken.Length;
  55:                 int length = commentHeader.Value.LastIndexOf(endToken) - beginIndex;
  56:  
  57:                 comment_author = commentHeader.Value.Substring(beginIndex, length).Trim();
  58:             }
  59:  
  60:             //  the content in the comment-content portion is the payload.  not sure what to
  61:             //  do about the HTML that is embedded ...
  62:             foreach (XElement paragraph in e.Elements().Last().Elements().Where(x => x.Name == "p"))
  63:             {
  64:                 comment_content += string.Format("{0}\n\n", paragraph.Value).Replace("\n", "\\n");
  65:             }
  66:  
  67:             Console.WriteLine("comment_author: {0}", comment_author);
  68:             Console.WriteLine("comment_date: {0}", comment_date);
  69:             Console.WriteLine("comment_url: {0}", comment_url);
  70:             Console.WriteLine("comment_content: {0}", comment_content);
  71:  
  72:             //  get the mapping from entries to id's
  73:             Dictionary<string, int> map = GetEntryToIDMap();
  74:             
  75:             //  Format:
  76:             //  {0} = comment_id
  77:             //  {1} = comment_author
  78:             //  {2} = comment_email
  79:             //  {3} = comment_entry_id
  80:             //  {4} = comment_text
  81:             //  {5} = comment_url
  82:             //  {6} = comment_created_on
  83:             //  INSERT INTO `mt_comment` VALUES ({0}, '{1}', 1, NULL, {2}, 1, NULL, NULL, NULL, 1, '2000-01-01 00:00:00', '{3}', '{4}', 1, '{5} 00:00:00', NULL, '{5} 00:00:00', NULL);
  84:             string insert = string.Format(
  85:                 "INSERT INTO `mt_comment` VALUES ({0}, '{1}', 1, NULL, {2}, {3}, NULL, NULL, NULL, 1, '2000-01-01 00:00:00', '{4}', {5}, 1, '{6} 00:00:00', NULL, '{6} 00:00:00', NULL);",
  86:                 commentID++,
  87:                 comment_author.Replace("'", string.Empty),
  88:                 comment_email == null ? "NULL" : string.Format("'{0}'", comment_email),
  89:                 map[Path.GetFileName(dir)],
  90:                 comment_content.Replace("'", string.Empty),
  91:                 comment_url == null ? "NULL" : string.Format("'{0}'", comment_url.Replace("'", string.Empty)),
  92:                 comment_date.ToString("yyyy-MM-dd"));
  93:  
  94:             File.AppendAllText(Path.Combine(dir, "insert.sql"), insert);
  95:             File.AppendAllText(Path.Combine(dir, "insert.sql"), Environment.NewLine);
  96:  
  97:             //  todo: escape single quotes in the comment text
  98:             //  todo: escape single quotes in the comment author
  99:             //  todo: escape single quotes in the url
 100:  
 101:             Console.WriteLine();
 102:             Console.WriteLine();
 103:         }
 104:     }
 105: }

 

  • Combine these .sql files into the maximum file size the MySQL admin tool accepts
   1: static void Main(string[] args)
   2: {
   3:     int offset = 0;
   4:     foreach (string insert in Directory.GetFiles("entries", "*.sql", SearchOption.AllDirectories))
   5:     {
   6:         long fileSize = new FileInfo(insert).Length;
   7:         long currentCombinedFileSize = File.Exists(string.Format("combined-{0}.sql", offset)) ? new FileInfo(string.Format("combined-{0}.sql", offset)).Length : 0;
   8:  
   9:         if (currentCombinedFileSize + fileSize > 40000)
  10:         {
  11:             offset++;
  12:         }
  13:  
  14:         File.AppendAllText(string.Format("combined-{0}.sql", offset), File.ReadAllText(insert));
  15:         File.AppendAllText(string.Format("combined-{0}.sql", offset), string.Empty);
  16:     }
  17: }

 

Conclusion

Is it Yahoo's problem for not backing up my database for me?  Or was I the moron for not looking into the MT internals and ensuring the database is backup on a scheduled interval?  I don't know, I can see an argument both ways.  Either way their support was terrible and the way they promote their services is misleading.

If you run a blog, do yourself a favor and make sure you have a backup plan in place.  Do it now!

 

That's it.  Enjoy!


TrackBack

TrackBack URL for this entry:
http://mattberseth.com/blog-mt/mt-tb.fcgi/141

Comments


That's why I wrote my own blog platform, on my own server, and do my own backups... Cause using hosted solutions sucks ass.

Looks like maybe one more method call might help, Service.Terminate( Yahoo );

Sorry to hear about your bad luck.

While I self-host my blog (subtext), I never think about backing up the database. After reading this post, I decided to do it and just finished FTP'ing my 200+mb SQL backup to my home system. :-)

Support is over rated... A lot of decisions are made in IT based on support. "We decided to spend an extra million dollars on this because we have support just in case anything happens." Then you run into issues like yours. Here's a problem I had with GoDaddy: http://www.jstawski.com/archive/2007/01/29/virtual-dedicated-hosting.aspx

Ouch! I don't like hosted solutions, or pretty much any solution where I can blame someone other than myself for screwing things up. Note to self: don't use Yahoo!

I'm speechless! But, you did a great job getting your comments back. But, you said the table is corrupted. They were cached on the weblog posts?! I mean you must at least lost few comments.
Anyway Matt, I guess you might need to change your host and your blog provider to something more controllable and easy to manage.
Ok I'm going to get my Database backup
Cheers

Posted by: Jonx on July 28, 2008 06:07 AM

Wow. You know how to have a nice weekend ;) Sorry to hear that you had so much troubles.

I wrote something very similar to import posts/comments from a wordpress blog last year. Now, wordpress offers you the possibility to export everything to XML in standard.

How comes that despite the table being broken your blog was able to display the comments?

Please do not hesitate to reply in case you need further assistance ;)

@Rush -
Leave it to me to learn this the hardway.

@Eric -
Trust me - its in the works.

@Michael -
Good Idea!

@Jonas -
Thats funny - the godaddy Technical Rep sounds just like the guy at Yahoo that I spoke to. Everytime I would ask him a question he would put me on hold while he went to go try and figure out the answer.

@Kevin -
I don't mind buying the web space, but what really gets me is Yahoo!'s Small Business branding. They position it like they sell more than a virtual directory on some server. But from what I can tell they really don't.

I was lucky because I had a way to rebuild the table. But imagine your Mom/Dad/Sister/Wife who is really using Yahoo! for hosting their small business site. They would be completly screwed. The support guy was actually in the processes of googling for links that would help me troubleshoot MySQL and MT. What a joke!

@Muhammad -
Hey Muhammad. The only comments that I lost were the ones that anyone tried to leave between Wednesday and Saturday. I did have to manually modify a handful of the 2K or so comments that I extracted, but nothing too crazy.

In this post I also wanted to point that I was surprised how easy of a task this was to implement using LINQ to XML. When I was talking to my wife on Friday I told her that I would probably need ten hours or so this weekend to fix this problem. It turned out that it took less than 3. The code isn't real pretty, but it saved my blog.

@Jonx -
MT builds the static HTML files from the contents in the MT database. And I built the blog on Tuesday night so all comments I had recieved by Tuesday had been published.

Posted by: Nisar on July 28, 2008 09:41 AM

do you know that yahoo increased the price for domain parking from $2 to $45?
for first time they charge $2 and then once you hooked then they will suck your blood!

i have cancel the plan with yahoo... yahoo service is degrading as i can see....

Man, That sucks. I would be out of there. Just remember going forward with Yahoo! "Fool me once, shame on you; fool me twice, shame on me." I would be gone. Back that db up and roll.

I know your frustrations. I had similar issues with goddady. The y siad they can restore my database for $150 for my $4 hosting. All shared hosting sucks. I got a dedicated server and automated my daily backup system.

It is expensive but man it is awesome to have your own server.

Ouch. Glad you were able to get them back. Lucky that MT at least generates those static HTML files.

I don't know if there's a similar plugin for MT, but I use a plugin for WordPress that automatically handles database backups. Each night it backs up, compresses the backup, and emails it to a GMail account that I use exclusively for storing backups.

Peace of mind: Priceless.

Posted by: some guy on July 28, 2008 09:42 PM

FD: I used to work for a hosting company, but not Y!.

You and Yahoo are equally at fault.

Yahoo should be backing up your database regardless of what you put in it. However, asking them to do anything beyond a restore from a backup should be met with the 'that is unsupported' response*.

You should be backing up your database regardless of whether or not your web host does as they will likely screw it up at some point. Same goes for your emails and anything else you desire to keep.

*at least for software that runs 'on' the website (blog, gallery etc). Management tools (such as a hosting control panel) are the host's responsibility.

Posted by: Ryan on July 29, 2008 10:30 AM

Eek. If it was just your MYI files that were corrupted (your table indexes), then you could have simply repaired/recreated them with a few MySQL commands. I wonder if your Yahoo support tech even bothered to try. It's only when your MYD files get corrupted that you may have recovery issues that necessitate a backup.

Wow, I am in utter disbelief. I would have busted that phone in rage, guaranteed. Aside from all the bullshit, it had to be fun to write that code to resurrect all that you could. I had to do something similiar at work recently, it was a bad situation but I actually had fun fixing it!

All in all, this boils down to poor hosting, a mediocre blogging engine, and extremely bad customer service. Now you know why I took the time to write my own engine, and why I host the blog myself. It's brave and kinda risky, but it puts you in total control. And best of all, and you'll no longer have to fork over cash to fools that don't deserve it.

Good idea to post this, by the way.

Matt, I have to agree with others that hosted solutions are evil. Although it might be a hassle, why don't try blogengine.net?

Rush & Josh, I have to disagree with you. You have several excellent open-source engines, like blogengine.net or wordpress. So you still have a high quality code and full control over it.

Thanks for the heads-up on Yahoo.

I pay $6 month for shared hosting (ICDsoft, FYI) & get daily backups (or 2 clicks to DIY & download the tar.gz of the site &/or dB) and a 5 minute response on support tickets 100% of the time. I'm sure my host is not the only one.

Shop around.

BTW - MT, wordpress et al are all decent engines, just read the install instructions. . Both Blogger & Wordpress.com can host a blog and have a custom URL point to it for $10/yr.

Posted by: Rachel on August 22, 2008 10:32 PM

Yahoo's customer service/support is ABSOLUTELY TERRIBLE (got the same types of responses as Matt)! And, if you Google the error message, you'll notice a lot of the errors posted to forums occuring in the last two months. Now, maybe this kind of thing "happens with databases", but just "coincidentally" all around the same time and all blogs that were hosted with YaPOO? I doubt it...and I DO think Yahoo may be the issue. They CERTAINLY have a problem with customer service. Unlike Matt, I have no expertise in SQL, MySQL, or databases in general...that's why I picked a host whose package handles that. My blog has the same errors & I have NO idea where to begin to fix it. I want to fire Yahoo and move my blog elsewhere...but I'd like to get my 2-years worth of blogging back first. Aaarrrggh!

It would seem to me that if you are paying to be hosted somewhere, the host should be responsible enough to make backups, even if they only keep them for a few days. What happens in the event they crash and everything is lost? You're telling me Yahoo can't restore it and everyone they host would simply be okay with that?

Post a comment