How to migrate from CS2007 to WordPress, Movable Type (or any other blog engine that supports XML-RPC) with C#

Today we’ll talk about migrating from Community Server 2007 to another blog engine when you have no access to the CS database and/or the target database.

Let’s set the goals first:

  • You want to migrate all posts
  • You want to migrate all comments
  • You want to transfer all hosted images and media files
  • You should update all internal links

Looks complicated? Not really. First of all, grab any XML-RPC framework (for example, XML-RPC.NET). Then create a proxy to CS2007, which uses the MetaWeblog API. You can see all defined methods by accessing /blogs/metablog.ashx
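To make the proxy idea concrete, here is a minimal sketch of the request body that a MetaWeblog call puts on the wire. It does not use XML-RPC.NET (a framework generates this for you); it only builds the XML with LINQ to XML. The names XmlRpcSketch and BuildGetRecentPosts are hypothetical and are not part of the original code.

```csharp
using System;
using System.Xml.Linq;

public static class XmlRpcSketch
{
    // Wraps a typed value in the XML-RPC <param><value>...</value></param> envelope.
    private static XElement Param(string type, object value) =>
        new XElement("param", new XElement("value", new XElement(type, value)));

    // Builds the body for metaWeblog.getRecentPosts(blogid, username, password, numberOfPosts).
    public static string BuildGetRecentPosts(string blogId, string user, string password, int count)
    {
        var call = new XElement("methodCall",
            new XElement("methodName", "metaWeblog.getRecentPosts"),
            new XElement("params",
                Param("string", blogId),
                Param("string", user),
                Param("string", password),
                Param("int", count)));
        // XElement escapes special characters such as '&' in the values.
        return call.ToString(SaveOptions.DisableFormatting);
    }
}
```

The resulting string would be sent as an HTTP POST with content type text/xml to the blog's XML-RPC endpoint (for CS2007, the metablog.ashx handler mentioned above).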

With the proxy in place you can fetch the posts. Comments are read from each post's comments RSS feed:

// Build the comments feed URL for the current post
var commentsRSSURL = string.Concat("http://blogs.microsoft.co.il/blogs/tamir/rsscomments.aspx?PostID=", posts[i].postid);
_rssReader = new XmlTextReader(commentsRSSURL);

while (_rssReader.Read()) {
   _rssReader.MoveToContent();
   if (_rssReader.NodeType == XmlNodeType.Element) {
      if (_rssReader.Name == "pubDate") { date = DateTime.Parse(_rssReader.ReadElementContentAsString()); }
      if (_rssReader.Name == "dc:creator") { author = _rssReader.ReadElementContentAsString(); }
      if (_rssReader.Name == "description") {
         if (!shouldSkip) {
            content = _rssReader.ReadElementContentAsString();
            comments.Add(new Comment {
               author = author,
               date_created_gmt = date,
               status = true
               // the original listing is cut off here; remaining fields are not shown
            });
         }
      }
   }
}
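As an alternative to the reader loop above, the same feed can be parsed with LINQ to XML. This is only a sketch under assumptions: the FeedComment type and the CommentFeedSketch class are hypothetical, and the feed is assumed to be standard RSS 2.0 with Dublin Core dc:creator elements. Selecting only item elements skips the channel-level description, which is presumably what shouldSkip guards against.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical holder for one comment read from the feed.
public sealed class FeedComment
{
    public string Author { get; set; }
    public string Content { get; set; }
    public DateTime DateUtc { get; set; }
}

public static class CommentFeedSketch
{
    // Dublin Core namespace used by dc:creator in RSS feeds.
    private static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";

    public static List<FeedComment> Parse(string rssXml)
    {
        var doc = XDocument.Parse(rssXml);
        // Only <item> elements are comments; the channel's own <description> is ignored.
        return doc.Descendants("item").Select(item => new FeedComment
        {
            Author = (string)item.Element(Dc + "creator") ?? "",
            Content = (string)item.Element("description") ?? "",
            // Falls back to DateTime.MinValue when pubDate is missing or unparsable.
            DateUtc = DateTime.TryParse((string)item.Element("pubDate"),
                          CultureInfo.InvariantCulture,
                          DateTimeStyles.AdjustToUniversal, out var d) ? d : DateTime.MinValue
        }).ToList();
    }
}
```

The feed text itself can be fetched with the same WebClient that is used for images (DownloadString on the comments URL).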

As you can see, you now have all the comments. The next step is to detect all images and re-upload them to the new host.

private const string imgRX = "<img[^>]*src=\"?([^\"]*)\"?([^>]*alt=\"?([^\"]*)\"?)?[^>]*>";

var matches = Regex.Matches(posts[i].description, imgRX);
Console.WriteLine("Fixing {0} images…", matches.Count);
for (int j = 0; j < matches.Count; j++) {
   Console.WriteLine("Retrieving image #{0}", j);
   var url = matches[j].Groups[1].Value; // group 1 is the src attribute value
   if (url.Contains(baseURL)) {          // only images hosted on the old blog
      try {
         var data = wc.DownloadData(url);
         Console.WriteLine("Uploading image #{0}", j);
         var uf = newblog.uploadFile(newblogid, newUsername, newPassword, new MediaObject {
            bits = data,
            name = url.Substring(url.LastIndexOf('/') + 1) // file name part of the URL
         });
         // point the post at the newly uploaded copy
         posts[i].description = posts[i].description.Replace(url, uf.url);
      } catch { } // an image that fails to download or upload is skipped silently
   }
}
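The detection part of the loop above can be tried in isolation. The sketch below reuses the same pattern; the class name ImageLinkSketch and its two helper methods are hypothetical. It uses StartsWith instead of Contains, which is a slightly stricter check than the original.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageLinkSketch
{
    // Same pattern as in the post: group 1 captures the src attribute value.
    private const string ImgRx = "<img[^>]*src=\"?([^\"]*)\"?([^>]*alt=\"?([^\"]*)\"?)?[^>]*>";

    // Returns the src URLs of all images that live under baseUrl.
    public static List<string> FindHostedImages(string html, string baseUrl)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(html, ImgRx, RegexOptions.IgnoreCase))
        {
            var url = m.Groups[1].Value;
            if (url.StartsWith(baseUrl, StringComparison.OrdinalIgnoreCase)) result.Add(url);
        }
        return result;
    }

    // Everything after the last '/' is used as the uploaded file name.
    public static string FileNameOf(string url) => url.Substring(url.LastIndexOf('/') + 1);
}
```

Images hosted elsewhere are left untouched, so only files that would disappear together with the old blog are transferred.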

Now all images are stored in the new location, and all image links within the stored posts are updated. The next step is to upload all posts to the new location. CS stores tags as categories, which is wrong. Why? Because categories can be hierarchical, while tags cannot. So we have to convert all categories in the retrieved posts into real tags. After that, we can post everything:

for (int i = posts.Length - 1; i >= 0; i--) {
   // CS categories become tags (mt_keywords) on the new blog
   posts[i].mt_keywords = string.Join(",", posts[i].categories);
   var pid = newblog.newPost(newblogid, newUsername, newPassword, posts[i], true);
   foreach (var comment in posts[i].comments) {
      try {
         var cid = newblog.newComment(newblogid, newUsername, newPassword, pid, comment);
      } catch { } // a comment that fails to post is skipped silently
   }
}

Now we have to update all internal links so that they point to the new location. To do this, fetch all posts back from the new blog to learn their new URLs.

var newPosts = newblog.getRecentPosts(newblogid, newUsername, newPassword, toFetch);
for (int i = 0; i < newPosts.Length; i++) {
   // replace every old internal URL found in the post body with its new URL
   foreach (var pi in _postsIndex) {
      if (newPosts[i].description.Contains(pi.Key)) newPosts[i].description = newPosts[i].description.Replace(string.Concat(baseURL, pi.Key), pi.Value);
   }
   newblog.editPost((string)newPosts[i].postid, newUsername, newPassword, newPosts[i], true);
   // remember which old post each new post came from
   if (!refereces.ContainsKey(newPosts[i].link)) refereces.Add(newPosts[i].link, posts[i].link);
}
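The replacement step can be isolated into a small pure function, which makes it easy to try on a single post body. This is a sketch, not the original code: LinkRewriteSketch and Rewrite are hypothetical names, and the index is assumed to map an old path (relative to baseUrl) to the new absolute URL. Longer keys are replaced first so that an old URL that is a prefix of another one does not corrupt it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkRewriteSketch
{
    // Replaces every occurrence of baseUrl + oldPath in body with the mapped new URL.
    public static string Rewrite(string body, string baseUrl, IDictionary<string, string> index)
    {
        // Longest keys first: "/post-two" must be handled before "/post".
        foreach (var pair in index.OrderByDescending(p => p.Key.Length))
        {
            var oldUrl = string.Concat(baseUrl, pair.Key);
            body = body.Replace(oldUrl, pair.Value);
        }
        return body;
    }
}
```

Without the ordering, replacing the shorter URL first would rewrite part of the longer one and leave a broken link behind.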

We are done. Last but not least, update the old posts with their new URLs so that visitors can find their way to the new location.

csposts = csblog.getRecentPosts(csBlogid, csUsername, csPassword, toFetch);
for (int i = 0; i < csposts.Length; i++) {
   if (_postsIndex.ContainsKey(csposts[i].link)) {
      // prepend a notice with the new URL to the old post
      string write = string.Format("<h3>[This blog was migrated. You will not be able to comment here.<br/>The new URL of this post is <a href=\"{0}\">{0}</a>]</h3><hr/>", _postsIndex[csposts[i].link]);
      csposts[i].description = string.Concat(write, csposts[i].description);
      csblog.editPost((string)csposts[i].postid, csUsername, csPassword, csposts[i], true);
      Console.WriteLine("Post {0} was updated", i);
   }
}

Have a nice day and be good people!


7 Responses to “How to migrate from CS2007 to WordPress, Movable Type (or any other blog engine, supports XML-RPC) with C#”

  1. dotmad Says:

    Great post, welcome to the new location.

  2. Eran Kampf Says:

    Congrats on the new blog :)
    Theme needs a lot of work though…

  3. Tamir Says:

    Thank you. It’s still a lot of work to fix all the glitches here :)

  4. Kristine Says:

    Hi Tamir:)

    I am just a little too non-geeky (though I am trying...) to install the gas price widget.

    It’s lovely to look at; it shows the latest prices of gas in my area, just as I had hoped it would. When you click on it, it gives you a complete list of all the area stations.

    Herein lies the problem. When one clicks the icon for any given station, although the station and address are displayed, the icons which tell you the actual price for that station are little red-lined boxes (“show boxes” doesn’t help).

    Help? This could be a godsend widget. I thought you would be the best guy to ask and answer this question :) Thanks for any assistance, for I’m sure you are busy.

    Thanks in advance. I do appreciate ~

    Kristine

  5. ekspekt Says:

    Excellent article, I’ve bookmarked your blog for future reference

  6. fajerwerki Says:

    Very interesting blog, what template do you use?

  7. Darrell Says:

    Sounds great. Can you provide a link to the code?
