Migrating (or how to escape) from soup.io to WordPress

 

I’ve been using soup.io for over 6 months to save my Twitter stream, collect nice music videos and host some personal articles. Because of the “not that professional” maintenance of soup.io (several long downtimes) and the lack of features (tag clouds and categories can only be created and managed manually, and the search feature does not work), I decided to move to WP and run my own setup.

The migration of 6 months of collected data from soup.io to WP was a long and painful journey. The soup.io RSS export did not contain all my posts (over 500) and was not recognized properly by WP’s native RSS import feature.
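
A quick way to check how complete an RSS export is: count its `<item>` elements and compare the result with the number of posts you expect. Below is a minimal sketch in Python, assuming the feed was saved locally and is plain RSS 2.0. The function name and the file name in the usage note are placeholders chosen for this example, not something soup.io provides.

```python
import xml.etree.ElementTree as ET


def count_rss_items(rss_data):
    """Return the number of <item> elements in an RSS 2.0 document.

    rss_data may be a str or the raw bytes of the feed.
    """
    root = ET.fromstring(rss_data)
    # RSS 2.0 nests items as <rss><channel><item>; iter() finds them at any depth.
    return sum(1 for _ in root.iter("item"))
```

For a feed saved as `soup_export.rss` (hypothetical name), something like `count_rss_items(open("soup_export.rss", "rb").read())` gives the number of posts the export really contains.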

 

So I looked for a way to move my data using a set of different tools. Below is a quick tutorial describing the main steps and the tools I used. At the end of this article there is another solution that might work; I did not try it.

Please note that during this migration your tags, GUIDs and publish dates will be lost. If you have posts from Google Reader, their GUID (i.e. the link to the original article) will be lost too; I guess my cleaning in Google Refine dropped them.

  • Extract all your posts using the soup downloader (thanks monschein). The soup downloader saves each post as a separate XML document, so unless you want to import all your articles one by one, they need to be combined into a single file.
  • Combine them into a single XML document using Replace Pioneer (access tutorial – download).
  • Rename this document with a .txt extension.
  • Now that we have a nice XML document with all your posts, we want to make it ready to be imported into the wp_posts table. To do that we will use Google Refine, a great tool for cleaning messy data.
    • Create a project by importing your .txt document (do not split into columns).
    • Apply the following code using the Apply option. Copy everything and here we go!

 

 

    • Export the data in Excel format.
  • Create a WordPress blog. I won’t describe this step; there are already tons of excellent tutorials all over the web.
  • Import your Excel file into the wp_posts table using the phpMyAdmin interface. Again, there are tons of good tutorials on the web.
    • This was the longest and most painful part. The import did not go smoothly, and I had to run a couple of SQL queries to rearrange the data (right fields, right formats…).
  • Log in to WordPress; all your posts should be listed. Select them all and publish them.
  • Install the FeedWordPress plugin and configure the same imports you previously had in soup.io.
    • FeedWordPress is an RSS importer that turns every element of an RSS feed into a blog post. If the element contains tags or categories, they are imported and created in WordPress as well.
  • Configure and have fun with WordPress
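
If you don’t have Replace Pioneer at hand, the “combine everything into a single file” step can also be scripted. This is not the method used in the steps above, just a minimal sketch in Python. It assumes the soup downloader left one well-formed .xml file per post in a single folder; the `posts` wrapper element and the folder and file names are made up for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def merge_posts(source_dir, output_file, root_tag="posts"):
    """Wrap every *.xml file found in source_dir under one root element.

    Returns the number of files merged. root_tag is an arbitrary wrapper
    name chosen for this sketch, not a soup.io format.
    """
    root = ET.Element(root_tag)
    merged = 0
    # Sort by file name so the output order is reproducible.
    for path in sorted(Path(source_dir).glob("*.xml")):
        try:
            post = ET.parse(path).getroot()
        except ET.ParseError:
            # Skip files that are not well-formed XML instead of aborting.
            print(f"skipped (not well-formed): {path.name}")
            continue
        root.append(post)
        merged += 1
    ET.ElementTree(root).write(output_file, encoding="utf-8", xml_declaration=True)
    return merged
```

A call such as `merge_posts("soup_backup", "all_posts.txt")` would write the merged document directly with the .txt extension used for the Google Refine import. Keep the output file outside the source folder, or give it an extension other than .xml, so a second run does not pick it up as a post.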

 

Another option I did not explore, but which might work, is to:

  1. Install a WordPress instance with the FeedWordPress plugin.
  2. Configure your soup.io RSS feed in it.
  3. Run the FeedWordPress plugin to import the first x elements from soup.io.
  4. Delete the imported elements in soup.io.
  5. Run FeedWordPress again to import the next elements from soup.io.
  6. Keep repeating steps 4 and 5 until your soup.io is empty.
Advantages of this method (if it works):
  • No need to use 4 different interfaces.
  • Post tags are imported and created in WordPress.
Hope this helps. If you have questions or ideas to improve these methods, I’m interested to hear them!