how-to #1: http://www.analyticsforfun.com/2014/04/how-to-move-your-blog-from-tumblr-to.html
how-to #2: https://yourbusiness.azcentral.com/import-tumblr-blogger-10881.html

The procedures in those links leave the images hosted on Tumblr, and also strip the 'source' URLs (links to the original content) from each post. The Bash snippets included here fix both issues, with only a few potential flaws that can easily be cleaned up manually. It would have been more proper to create the XML files from scratch, or at least to manipulate the resultant XML objects directly. This was contemplated during the process, but was abandoned in favour of simple Bash (sed et al.). Some alternative XML manipulation instructions are listed below.

# N.B. Presented without modification - hard coding (i.e. 'oioiiooixiii' user ID) present
###### Step 1 - Download tumblr account via https://tumblr2wordpress.benapps.net/
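###### Step 2 - Get all image urls
# N.B. 'dedupe' in the original is not a standard command (presumably a local
# alias); awk '!x[$0]++' is used here instead, as in the later steps.
tr \" \\n < blogger-export.xml \
| grep '\.jpg\|\.gif\|\.png' \
| awk '!x[$0]++' > images.list

# And if needed, do further URL cleaning such as...
grep '\.jpg\|\.gif\|\.png' < images.list \
| grep -v \{ \
| grep -v '\\' \
| grep -v \; > revised-images.list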
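
The extraction in Step 2 can be wrapped in a function so it is easy to try on a small sample before running it on the full export. This is only a sketch, not part of the original procedure: the name `extract_image_urls` and the sample URLs are made up, and it assumes every image URL sits inside a double-quoted attribute. It is also stricter than the grep above, since it only keeps strings that start with http(s) and end in an image extension.

```shell
# Hypothetical helper: read XML/HTML on stdin and print each unique image URL
# found inside double quotes, in first-seen order.
extract_image_urls() {
    tr '"' '\n' \
    | grep -E '^https?://[^[:space:]]+\.(jpg|gif|png)$' \
    | awk '!seen[$0]++'
}

# Try it on an inline sample (made-up URLs) before using the real export.
sample='<img src="http://example.com/a.jpg"/><img src="http://example.com/a.jpg"/><a href="http://example.com/b.png">x</a>'
printf '%s\n' "$sample" | extract_image_urls
```

On the real file this would be run as `extract_image_urls < blogger-export.xml > images.list`.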
###### Step 3 - Get all tumblr post urls
tr \> \\n < tumblr_oioiiooixiii.xml \
| grep oioiiooixiii.tumblr.com/post \
| cut -d\< -f 1 \
| awk '!x[$0]++' > tumblr-post.urls
###### Step 4 - Get source html from tumblr posts
html="$(wget -qO- http://oioiiooixiii.tumblr.com/post/149805370851)" #loop
# OR
wget -nc -w 1.5 --random-wait -i ../tumblr-post.urls
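
The later steps treat each downloaded file name as the post ID, so it can help to derive that ID from the URL explicitly instead of relying on wget's default file naming. A rough sketch under that assumption; `post_id_from_url` and `fetch_posts` are hypothetical names and were not part of the original run.

```shell
# Hypothetical helper: print the numeric post ID from a Tumblr post URL.
# Returns non-zero if the URL has no '/post/' segment.
post_id_from_url() {
    local url="$1"
    local rest="${url#*/post/}"          # drop everything up to '/post/'
    [ "$rest" = "$url" ] && return 1     # nothing removed: not a post URL
    printf '%s\n' "${rest%%/*}"          # drop any trailing '/slug' part
}

# Sketch of a download loop: one file per post, named after the post ID.
# Defined only; call it as: fetch_posts ../tumblr-post.urls
fetch_posts() {
    local url id
    while IFS= read -r url; do
        id="$(post_id_from_url "$url")" || continue
        [ -e "$id" ] || wget -q -O "$id" "$url"   # skip files already fetched
        sleep 2                                   # be gentle with the server
    done < "$1"
}
```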
###### Step 5 - Get source URL from each tumblr post
urlDecode() # https://unix.stackexchange.com/a/187256
{
   # urldecode
   local firstPass=$(perl -MURI::Escape -e 'print uri_unescape($ARGV[0])' "$1")
   local urlEncoded="${firstPass//+/ }"
   printf '%b' "${urlEncoded//%/\\x}"
}

getOriginalUrl() # html parser, prints URL string ('|||' is data separator)
{
   local userID="oioiiooixiii"
   for i in *
   do
      raw="$(sed -e 's/<div class="cont content_source\">/\\n/g' \
                 -e 's/<\/a>//g' <<<"$(cat "$i")" \
             | grep '<a href="http' \
             | grep -v '<a href="http://oioiiooixiii' \
             | cut -d\" -f 2)"
      # If URL is encoded, decode it twice (for double-encoded URLs)
      # ('%s' keeps any '%' left in the URL from being read as a printf format)
      [[ $raw == *"%"* ]] \
      && printf '%s|||%s\n' "$i" "$(urlDecode "$(cut -d= -f2 <<<"$raw" \
                                                 | cut -d\& -f1 \
                                                 | head -1)")" \
      || printf '%s|||%s\n' "$i" "$(head -1 <<<"$raw")"
      #sleep .1
   done
}

# Clean up missing source links, by using Tumblr post URLs
while read -r line
do
   postID=${line%|||*}
   url=${line##*|||}
   printf '%s|||' "$postID"
   # If existing url doesn't conform to http(s) (e.g. blank)
   [[ "$url" =~ ^h ]] \
   && printf '%s\n' "$url" \
   || printf 'http://oioiiooixiii.tumblr.com/post/%s\n' "$postID"
done < source-COPY.urls > source-FIXED.urls
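
The clean-up loop boils down to one rule per 'postID|||url' record: keep the URL if it looks like http(s), otherwise fall back to the Tumblr post URL. A small sketch of that rule as a function that can be tested on single records; `fix_record` is a hypothetical name, and the user ID is passed in as an argument rather than hard coded.

```shell
# Hypothetical helper: fix_record 'postID|||url' userID
# Prints the record unchanged if the url field is http(s); otherwise prints
# it with the Tumblr post URL substituted.
fix_record() {
    local line="$1" user="$2"
    local post_id="${line%%|||*}"   # text before the first '|||'
    local url="${line#*|||}"        # text after the first '|||'
    case "$url" in
        http://*|https://*)
            printf '%s|||%s\n' "$post_id" "$url" ;;
        *)
            printf '%s|||http://%s.tumblr.com/post/%s\n' "$post_id" "$user" "$post_id" ;;
    esac
}

# Usage over a whole file:
#   while IFS= read -r line; do fix_record "$line" oioiiooixiii; done \
#       < source-COPY.urls > source-FIXED.urls
```

Passing the values as `printf` arguments rather than inside the format string means a URL containing `%` or a backslash is printed as-is.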
###### Step 6 - Rehost images
# Upload images via Blogger post(s), or upload via Google Photos
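###### Step 7 - Get URLs of uploaded images
# Add all images to one, or multiple, blog posts (this can be tedious and laborious).
# Get the source html of these posts and execute the following.
# N.B. The input is the saved post source; 'rehosted-posts.html' is an assumed
# name. It must differ from the output file, otherwise the redirect truncates
# the file before it is read.
tr \" \\n < rehosted-posts.html \
| grep tumblr \
| grep 1600 \
| awk '!x[$0]++' > rehosted-images.urls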
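
A pipeline that reads from and writes to the same file usually ends up with an empty file, because the shell truncates the output before the first command reads it. If a list does need to be cleaned in place, going through a temporary file avoids that. A minimal sketch; `dedupe_in_place` is a hypothetical helper, not something used in the original run.

```shell
# Hypothetical helper: remove duplicate lines from a file in place,
# keeping the first occurrence of each line.
dedupe_in_place() {
    local file="$1" tmp
    tmp="$(mktemp)" || return 1
    # Write the result to a temp file first, then copy it back, so the
    # input is fully read before it is overwritten.
    awk '!seen[$0]++' "$file" > "$tmp" \
    && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage: dedupe_in_place rehosted-images.urls
```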
###### Step 8 - Replacing old image URLs with new image URLs
# tumblr archive XML: image urls replaced with corresponding Google URLs via sed
while read line
do
   original="$line"
   filename="${original##*/}"
   new="$(grep "$filename" <rehosted-images.urls)"
   echo "SWAPPING: $original FOR: $new" | tee -a image-swapping.log
   sed -i "s|$original|$new|g" tumblr_oioiiooixiii-NEW-IMG-URLS.xml
done < original-images.urls
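
The substitution in Step 8 passes both URLs to sed as sed syntax, so a '.' in the old URL matches any character and a '&' in the new URL expands to the matched text. That is one likely source of the "potential flaws" needing manual clean-up. A sketch of escaping both sides first, assuming '|' stays the delimiter; the helper names are hypothetical.

```shell
# Escape a literal string for the pattern side of: sed "s|PATTERN|...|"
sed_escape_pattern() {
    printf '%s' "$1" | sed -e 's/[][\.*^$|]/\\&/g'
}

# Escape a literal string for the replacement side of: sed "s|...|REPLACEMENT|"
sed_escape_replacement() {
    printf '%s' "$1" | sed -e 's/[\&|]/\\&/g'
}

# Hypothetical filter: swap_url OLD NEW < input > output
# Replaces every literal occurrence of OLD with the literal text NEW.
swap_url() {
    sed "s|$(sed_escape_pattern "$1")|$(sed_escape_replacement "$2")|g"
}
```

This version filters stdin to stdout instead of using `sed -i`, so the result can be inspected before it replaces the XML file.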
###### Step 9 - Remove fluff and ready XML file for source link html
# Adding XML content would be best done using an XML editor like 'xmlstarlet',
# but because this strips '<![CDATA[', and for fear of issues arising during
# conversion, editing was kept at a low level with basic bash, sed, etc.

# Examples of adding data via 'xmlstarlet'
# Find specific item based on 'wp:post_name', return post data formatted via Perl
xmlstarlet sel -t -m "/rss/channel/item[wp:post_name='65528357819']" \
   -v 'content:encoded' test.xml | perl -MHTML::Entities -pe 'decode_entities($_);'
# Add data to object
xmlstarlet ed -u '/rss/channel/item/content:encoded' -v "newHTMLtext" test.xml

# Instead:
# Use sed to remove 'figure' tags, and ']]></content:encoded>'
# Idea being, source URL html will be added along with ']]></content:encoded>'
# above postID in each 'item' object. Thus, re-encapsulating the html data.
sed -e 's/<div class="figure"><figure>//g' \
    -e 's/<\/figure><\/div>//g' \
    -e 's/]]><\/content:encoded>//g' \
    -e '/^\s*$/d' < tumblr_oioiiooixiii.xml
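
To see what the sed clean-up does to a single item before running it on the whole export, the same expressions can be put in a filter and fed a short sample. A sketch only: `strip_fluff` is a hypothetical name, and `[[:space:]]` is used in place of `\s` so it does not depend on GNU sed.

```shell
# Hypothetical filter: remove 'figure' wrappers, the closing CDATA marker,
# and blank lines from export XML read on stdin.
strip_fluff() {
    sed -e 's/<div class="figure"><figure>//g' \
        -e 's/<\/figure><\/div>//g' \
        -e 's/]]><\/content:encoded>//g' \
        -e '/^[[:space:]]*$/d'
}

# Usage on the real file:
#   strip_fluff < tumblr_oioiiooixiii.xml > stripped.xml
```

After this pass each item's `<content:encoded>` is left open on purpose; it is closed again when the source link is added in Step 10.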
###### Step 10 - Add source URLs to each post
# Read each line ('postID|||source_url') of the source url file and add to correct 'item'
while read line
do
   postID=${line%|||*}
   url=${line##*|||}
   searchTerm='<wp:post_name>'"$postID"'</wp:post_name>'
   sourceHTML='</br><div class="sourcelink">source: <a href="'"$url"'">'"$url"'</a></div>'
   prefix="$sourceHTML"']]></content:encoded>'"$searchTerm"
   #echo "$prefix"; sleep 1
   sed -i "s|$searchTerm|$prefix|g" \
      tumblr_oioiiooixiii-NEW-IMAGES-REMOVED-LINES-ITEM-PREFIXED.xml
done < source-FIXED.urls

# Do a visual test for any items that were missed
grep '<wp:post_name>' \
   < tumblr_oioiiooixiii-NEW-IMAGES-REMOVED-LINES-ITEM-PREFIXED.xml \
   > test-for-item-errors.text
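
The visual check can be narrowed down by listing only the items that did not receive the prefix. A sketch, assuming that every fixed item has ']]></content:encoded>' immediately before its '<wp:post_name>' tag on the same line, as the loop above produces; `find_unprefixed_items` is a hypothetical name.

```shell
# Hypothetical filter: read the edited XML on stdin and print the post ID of
# every <wp:post_name> line that lacks the re-added closing CDATA marker.
find_unprefixed_items() {
    grep -F '<wp:post_name>' \
    | grep -vF ']]></content:encoded><wp:post_name>' \
    | sed -e 's/.*<wp:post_name>\([^<]*\)<\/wp:post_name>.*/\1/'
}

# Usage:
#   find_unprefixed_items \
#       < tumblr_oioiiooixiii-NEW-IMAGES-REMOVED-LINES-ITEM-PREFIXED.xml
```

An empty result suggests every item was matched; any IDs printed are the ones to fix by hand.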
###### Step 11 - Convert modified tumblr export XML file to Blogger version
# Upload file to: http://www.wordpress-to-blogger-converter.appspot.com/
###### Step 12 - Import posts to Blogger blog
Go to the 'Settings' tab in the Blogger Dashboard, select 'Other', then 'Import Content'. Blogger will attempt to import all posts but may stop after a certain number. In testing, 900 posts imported on the first attempt, then only 1 or 2 on repeated attempts. Some posts may contain errors that cause the process to hang, and these will need to be fixed, but usually it is just an import limit enforced by Blogger, which will resolve itself after 24 hours.

Re-hosted tumblr account: https://↯.blogspot.com

/* Blogger Notes - 'Dynamic Views' theme with the following custom CSS (mainly remove annoying animations) */
*, *:before, *:after {
  transition-property: none !important;
  transform: none !important;
  animation: none !important;
}
#main {
  margin: 0px 0px !important;
  background-color: #3D3D3D !important;
}
.item {
  border: solid 0px #e3e3e3;
}
.share-controls {
  display: none !important;
}