Migrating from Drupal to Ikiwiki
TLPL; j’ai changé de logiciel pour la gestion de mon blog.
TLDR; I have changed my blog from Drupal to Ikiwiki.
Note: since this post uses ikiwiki syntax (I just copied it over here), you may want to read the original version instead of this one.
The old blog will continue operating for a while to
give feed aggregators a chance to catch that article. It will also
give the Internet Archive time to catch up with the static
stylesheets (it turns out it doesn’t like Drupal’s CSS compression at
all!) An archive will therefore continue being available on the
Internet Archive for people who miss the old stylesheet.
Eventually, I will simply redirect the anarcat.koumbit.org URL to
the new blog location. This will likely be my
last blog post written on Drupal, and all new content will be
available on the new URL. RSS feed URLs should not change.
I am migrating away from Drupal because it is basically impossible to
upgrade my blog from Drupal 6 to Drupal 7. Or if it is, I’ll have to
redo the whole freaking thing again when Drupal 8 comes along.
And frankly, I don’t really need Drupal to run a blog. A blog was
originally a really simple thing: a web log. A set of articles
written on the corner of a table. Now with Drupal, I can add
ecommerce, a photo gallery and whatnot to my blog, but why would I do
that? and why does it need to be a dynamic CMS at all, if I get so
So I’m switching to ikiwiki, for the following reasons:
- no upgrades necessary: well, not exactly true, I still need to
upgrade ikiwiki, but that’s covered by the Debian package
maintenance and I only have one patch to it, and there’s no data migration! (the last such migration in ikiwiki was in 2009 and was fully supported)
- offline editing: this is a big thing for me: I can just note
things down and push them when I get back online
- one place for everything: this blog is where I keep my notes, it’s
getting annoying to have to keep track of two places for that stuff
- future-proof: extracting content from ikiwiki is amazingly
simple. Every page is a single markdown-formatted file. That’s it.
Migrating will mean abandoning the
barlow theme, which was
seeing declining usage anyways.
So what should be exported exactly? There’s a bunch of crap in the old
blog that I don’t want: users, caches, logs, "modules", and the list
goes on. Maybe it’s better to create a list of what I need to extract:
- title ([[ikiwiki/directive/meta]] title and guid tags, guid to avoid flooding aggregators)
- body (need to check for "break comments")
- nid (for future reference?)
- tags (should be added as \[[!tag foo bar baz]] at the bottom)
- URL (to keep old addresses)
- published date ([[ikiwiki/directive/meta]] date directive)
- modification date ([[ikiwiki/directive/meta]] updated directive)
- attached files
- RSS feed
- author name
- attached files
- each tag should have its own RSS feed and latest posts displayed
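The extraction list above maps fairly directly onto ikiwiki directives. Here is a minimal sketch of what rendering one post could look like, assuming each node has already been loaded into a plain dictionary. The dictionary keys, the `render_page` and `param` names and the ISO 8601 date format are illustrative assumptions, not taken from the actual conversion script.

```python
from datetime import datetime, timezone


def param(value):
    """Quote a directive value; triple quotes allow embedded double quotes.

    Limitation of this sketch: a value that itself ends with a double
    quote would still need special handling.
    """
    return f'"""{value}"""' if '"' in value else f'"{value}"'


def stamp(timestamp):
    """Format a Unix timestamp (as stored by Drupal) as a UTC ISO 8601 string."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def render_page(node):
    """Render one node dict (hypothetical layout) as the text of a .mdwn page."""
    lines = [
        f'[[!meta title={param(node["title"])}]]',
        f'[[!meta date="{stamp(node["created"])}"]]',
        f'[[!meta updated="{stamp(node["changed"])}"]]',
        # Reusing the guid of the old feed keeps aggregators from
        # treating every migrated post as a new entry.
        f'[[!meta guid={param(node["guid"])}]]',
        "",
        node["body"].rstrip(),
        "",
    ]
    if node["tags"]:
        # Assumes tag names are already single words.
        lines.append("[[!tag " + " ".join(node["tags"]) + "]]")
    return "\n".join(lines) + "\n"
```

The nid is not written out here; it could be kept in the file name or in a comment if it turns out to be useful later.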
When? Some time before summer 2015.
Who? Well, me, who else. You probably really don’t care about that, so let’s get to the meat of it.
How to perform this migration… There are multiple paths:
- MySQL commandline: extracting data using the commandline mysql tool (drush sqlq …)
- Views export: extracting "standard format" dumps from Drupal and
parsing them (JSON, XML, CSV?)
Both approaches had issues, and I found a third way: talk directly to
mysql and generate the files directly, in a Python script. But first,
here are the two previous approaches I know of.
LeLutin switched using MySQL requests,
although he doesn’t specify how the content itself was migrated. Comments
are imported with this script:
echo "select n.title, concat('| [[!comment format=mdwn|| username=\"', c.name, '\"|| ip=\"', c.hostname, '\"|| subject=\"', c.subject, '\"|| date=\"', FROM_UNIXTIME(c.created), '\"|| content=\"\"\"||', b.comment_body_value, '||\"\"\"]]') from node n, comment c, field_data_comment_body b where n.nid=c.nid and c.cid=b.entity_id;" | drush sqlc | tail -n +2 | while read line; do
    if [ -z "$i" ]; then i=0; fi
    title=$(echo "$line" | sed -e 's/\t\+|.*//' -e 's/ /_/g' -e 's/[:(),?/+]//g')
    body=$(echo "$line" | sed 's/[^|]*| //')
    mkdir -p ~/comments/$title
    echo -e "$body" > ~/comments/$title/comment_$i._comment
    i=$((i+1))
done
Kind of ugly, but beats what I had before (which was "nothing").
I do think it is the right direction to take, to simply talk to the
MySQL database, maybe with a native Python script. I know the Drupal
database schema pretty well (still! this is D6 after all) and it’s
simple enough that this should just work.
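For comparison, here is a rough Python sketch of the comment formatting that the shell pipeline above performs. It writes the same `[[!comment ...]]` directive and uses the same `comment_N._comment` file naming. The dictionary keys and function names are hypothetical, and the sketch assumes that user names and subjects contain no double quotes.

```python
from datetime import datetime, timezone


def render_comment(comment):
    """Render one comment dict (hypothetical layout) as an ikiwiki comment file."""
    date = datetime.fromtimestamp(comment["created"], tz=timezone.utc)
    return (
        "[[!comment format=mdwn\n"
        f' username="{comment["name"]}"\n'
        f' ip="{comment["hostname"]}"\n'
        f' subject="{comment["subject"]}"\n'
        f' date="{date:%Y-%m-%dT%H:%M:%SZ}"\n'
        # Triple quotes let the comment body contain double quotes.
        f' content="""\n{comment["body"]}\n"""]]\n'
    )


def comment_path(page, index):
    """Build the file name of the index-th comment of a page."""
    return f"{page}/comment_{index}._comment"
```

Doing this in Python rather than in sed avoids having to squeeze the whole comment through a single line of tab-separated output.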
[[!img 2015-02-03-233846_1440x900_scrot.png class="align-right" size="300x" align="center" alt="screenshot of views 2.x"]]
mvc recommended views data export on LeLutin’s
blog. Unfortunately, my experience with the views export interface has
been somewhat mediocre so far. Yet another reason why I don’t like
using Drupal anymore is this kind of obtuse dialog:
I clicked through those for about an hour to get JSON output that
turned out to be provided by views bonus instead of
views_data_export. And confusingly enough, the path and
format_name fields are null in the JSON output
(whyyy!?). views_data_export unfortunately only supports XML,
which seems hardly better than SQL for structured data, especially
considering I am going to write a script for the conversion anyways.
Basically, it doesn’t seem like any amount of views mangling will provide me with what I need.
Nevertheless, here’s the [[failed-export-view.txt]] that I was able to come up with, may it be useful for future freedom fighters.
I ended up making a fairly simple Python script to talk directly to the MySQL database.
The script exports only nodes and comments, and nothing else. It makes
a bunch of assumptions about the structure of the site, and is
probably only going to work if your site is a simple blog like mine,
but could probably be improved significantly to encompass larger and
more complex datasets. History is not preserved so no interaction is
performed with git.
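To give an idea of what talking directly to the database involves, here is a minimal sketch of the two queries such an export needs. It is written against the Drupal 6 `node`, `node_revisions` and `comments` tables. This is not the actual [[drupal2ikiwiki.py]] code: the function names are made up, and `make_demo_db` builds a tiny in-memory SQLite stand-in so the example runs without a MySQL server. Against the real database, any DB-API cursor should work the same way.

```python
import sqlite3  # stand-in for the MySQL connection, for demonstration only

# Only the current revision of each published node is fetched.
NODES_SQL = """
SELECT n.nid, n.title, n.created, n.changed, r.body
FROM node n
JOIN node_revisions r ON r.vid = n.vid
WHERE n.status = 1
ORDER BY n.created
"""

COMMENTS_SQL = """
SELECT nid, subject, comment, name, hostname, timestamp
FROM comments
ORDER BY timestamp
"""


def fetch_nodes(cursor):
    """Return published nodes as a list of dicts, oldest first."""
    cursor.execute(NODES_SQL)
    columns = ("nid", "title", "created", "changed", "body")
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def fetch_comments(cursor):
    """Return comments grouped by the nid of the node they belong to."""
    cursor.execute(COMMENTS_SQL)
    columns = ("nid", "subject", "body", "name", "hostname", "created")
    by_node = {}
    for row in cursor.fetchall():
        comment = dict(zip(columns, row))
        by_node.setdefault(comment["nid"], []).append(comment)
    return by_node


def make_demo_db():
    """Create a miniature database with only the columns used above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE node (nid INTEGER, vid INTEGER, title TEXT,
                           status INTEGER, created INTEGER, changed INTEGER);
        CREATE TABLE node_revisions (nid INTEGER, vid INTEGER, body TEXT);
        CREATE TABLE comments (cid INTEGER, nid INTEGER, subject TEXT,
                               comment TEXT, name TEXT, hostname TEXT,
                               timestamp INTEGER);
        INSERT INTO node VALUES (1, 10, 'First post', 1, 100, 200);
        INSERT INTO node VALUES (2, 20, 'Draft', 0, 300, 300);
        INSERT INTO node_revisions VALUES (1, 10, 'Hello');
        INSERT INTO node_revisions VALUES (2, 20, 'Unfinished');
        INSERT INTO comments VALUES (1, 1, 'Nice', 'Good read', 'alice',
                                     '127.0.0.1', 150);
    """)
    return conn
```

Joining on `vid` rather than `nid` is what skips the older revisions, and the `status = 1` filter is what leaves unpublished nodes out.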
First, I imported the MySQL dump file on my local mysql server for easier development. It is 13.9 MiB!!
mysql -e 'CREATE DATABASE anarcatblogbak;'
ssh aegir.koumbit.net "cd anarcat.koumbit.org ; drush sql-dump" | pv | mysql anarcatblogbak
I decided to not import revisions. The majority (70%) of the content has
1 or 2 revisions, and those with two revisions are likely just when
the node was actually published, with minor changes. ~80% have 3
revisions or less, 90% have 5 or less, 95% 8 or less, and 98% 10 or
less. Only 5 articles have more than 10 revisions, with two having the
maximum of 15 revisions.
Those stats were generated with:
SELECT title, count(vid) FROM anarcatblogbak.node_revisions GROUP BY nid;
Then throwing the output in a CSV spreadsheet (thanks to
mysql-workbench for the easy export), adding a column numbering the
rows (B1=1,B2=B1+1), another for generating percentages
(C1=B1/count(B$2:B$218)) and generating a simple graph with
that. There were probably ways of doing that more cleanly with R,
and I broke my promise to never use a spreadsheet again, but then
again it was Gnumeric and it’s just to get a rough idea.
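The same cumulative percentages can be computed without a spreadsheet. Here is a small sketch, assuming the `count(vid)` column of the query above has been read into a list of integers. The function name is illustrative.

```python
from collections import Counter


def cumulative_share(revision_counts):
    """Map each revision count k to the percentage of nodes with <= k revisions."""
    total = len(revision_counts)
    histogram = Counter(revision_counts)
    shares = {}
    running = 0
    for k in sorted(histogram):
        running += histogram[k]
        shares[k] = 100.0 * running / total
    return shares
```

For example, `cumulative_share([1, 1, 2, 3])` gives `{1: 50.0, 2: 75.0, 3: 100.0}`: half the nodes have a single revision, three quarters have two or less, and all of them have three or less.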
There are 196 articles to import, with 251 comments, which means an
average of about 1.28 comments per article (not much!). Unpublished articles
(5!) are completely ignored.
Summaries are also not imported as such (break comments are
ignored) because ikiwiki doesn’t support post summaries.
Calling the conversion script
The script is in [[drupal2ikiwiki.py]]. It is called with:
./drupal2ikiwiki.py -u anarcatblogbak -d anarcatblogbak blog -vv
The -n and -l1 flags have been used for first tests as well. Use this
command to generate HTML from the result without having to commit and
ikiwiki --plugin meta --plugin tag --plugin comments --plugin inline . ../anarc.at.html
More plugins are of course enabled in the blog, see the setup file for
more information, or just enable plugins as you want to unbreak
things. Use the --rebuild flag on subsequent runs. The actual
invocation I use is more something like:
ikiwiki --rebuild --no-usedirs --plugin inline --plugin calendar --plugin postsparkline --plugin meta --plugin tag --plugin comments --plugin sidebar . ../anarc.at.html
I had problems with dates, but it turns out that I wasn’t setting
dates in redirects… Instead of doing that, I started adding a
"redirection" tag that gets ignored by the main page.
Files and old URLs
The script should keep the same URLs, as long as pathauto is enabled
on the site. Otherwise, some logic should be easy to add to point to
To redirect to the new blog, rewrite rules on the original blog should be as simple as:
Redirect / http://anarc.at/blog/
When we’re sure:
Redirect permanent / http://anarc.at/blog/
Now, on the new blog, some magic needs to happen for files. Both
/files and /sites/anarcat.koumbit.org/files need to resolve
properly. We can’t use symlinks because
ikiwiki drops symlinks on generation.
So I’ll just drop the files in /blog/files directly:
cp -r $DRUPAL/sites/anarcat.koumbit.org/files $IKIWIKI/blog/files
rm -r .htaccess css/ js/ tmp/ languages/
rm foo/bar # wtf was that.
rmdir *
sed -i 's#/sites/anarcat.koumbit.org/files/#/blog/files/#g' blog/*.mdwn
sed -i 's#http://anarcat.koumbit.org/blog/files/#/blog/files/#g' blog/*.mdwn
chmod -R -x blog/files
sudo chmod -R +X blog/files
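The two sed substitutions could also be folded into the conversion itself. Here is a minimal Python sketch of the same rewriting, using the prefixes from the commands above. The function names are illustrative and this is not part of the actual script.

```python
from pathlib import Path

# Same order as the sed commands: the second prefix only appears in full
# once the first substitution has been applied to absolute URLs.
OLD_PREFIXES = (
    "/sites/anarcat.koumbit.org/files/",
    "http://anarcat.koumbit.org/blog/files/",
)
NEW_PREFIX = "/blog/files/"


def rewrite_file_links(text):
    """Point links to the old Drupal files directory at /blog/files/."""
    for old in OLD_PREFIXES:
        text = text.replace(old, NEW_PREFIX)
    return text


def rewrite_pages(directory):
    """Rewrite every .mdwn page directly under directory, in place."""
    for page in Path(directory).glob("*.mdwn"):
        original = page.read_text(encoding="utf-8")
        updated = rewrite_file_links(original)
        if updated != original:
            page.write_text(updated, encoding="utf-8")
```

Calling `rewrite_pages("blog")` would then have the same effect as the two `sed -i` lines.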
A few pages to test images:
There are some pretty big files in there, 10-30MB MP3s – but those are already in this wiki! so do not import them!
Running fdupes on the result helps find oddities.
The meta guid directive is used to keep the aggregators from finding
duplicate feed entries. I tested it with Liferea, but it may freak out
some other sites.
- postsparkline and calendar archive disrespect meta(date)
- merge the files in /communication with the ones in /blog/files
- import non-published nodes
- check nodes with a format different than markdown (only a few 3=Full
HTML found so far)
- replace links to this wiki in blog posts with internal links
More progress information in [[the script|drupal2ikiwiki.py]] itself.