Google XML Sitemaps

I started to learn a bit about Google Webmaster Tools and how to increase the findability of one's website. First of all, I was looking for a flexible sitemap generator for WordPress and ended up with Google XML Sitemaps.

Google XML Sitemaps Plugin

This plugin generates a sitemap file that can be consumed by search engines like Google or Bing. To make use of it, you have to verify your website with each provider. Usually, this is done by adding some meta tags to your HTML pages to prove you have full control over the server. Currently, this can be achieved with WordPress' Jetpack, so you don't have to fiddle with the header.php file of your WordPress installation.
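For reference, the verification tags that end up in the page head typically look something like the following, where the content values are placeholder tokens issued by Google Webmaster Tools and Bing Webmaster Center respectively:

<meta name="google-site-verification" content="your-google-token" />
<meta name="msvalidate.01" content="your-bing-token" />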

WordPress Jetpack Site verification

You don't have to, but you can sign in to Google Webmaster Tools to check the verification status of your site.

Bing Webmaster Center

Bing Webmaster Center also provides a meta tag you can hand to Jetpack to verify your site and improve its discoverability.

Monitoring your Site with Apex Ping

Since I meanwhile run quite a number of websites, blogs and other services, I was looking for a monitoring solution that does not run on the very server it is supposed to monitor.

I was pointed to Apex Ping, a simple and beautiful service that monitors various aspects of your website.

Apex Ping Homepage

While it is a bit pricey for my use case at the moment, it is a really nice service to consider for non-invasive monitoring of your sites.

Link:
https://apex.sh/

There it Goes – Google Reader Gone for Good

Icon by http://icontexto.blogspot.de/ via Creative Commons Attribution Non-commercial Share Alike (by-nc-sa)

First headline in this morning's news: Google stops Google Reader (bear in mind, this link won't work anymore at some point in the future). Google wants to power down Google Reader, among other APIs and services, as part of the spring cleaning it has been doing since 2011. Personally, this is the third time I am affected by Google's cutbacks, after last year's Feedburner burnout.

While I was annoyed at first, I tried to think the matter through from various perspectives instead of just coming up with yet another rant about Google's attitude.

The Business Point of View
Google is not doing anything wrong (I guess) from a business point of view. They simply cut projects, teams or cost centers with little or no revenue. I have seen this several times during my time at Microsoft, where teams or studios were shut down because their revenue did not meet expectations. Larry Page wanted to focus on core products and less speculative projects, which does make sense considering the shareholders behind Google. Consequently, cutting down free services that nobody pays for, that require manpower for development and maintenance and (not to be underestimated) bare metal down in Google's data centers, is a plan to increase revenue, cut losses and save rather than spend money.

The User's Point of View
As a user, you might rely on these services. Maybe you built your website on various Google APIs (as they were free), you maintained your RSS feed in Google Reader and so on. Even with several weeks of notice, you need to change technologies, maybe rebuild or recode your pages, and even worse, change habits. At some point, after this has happened one, two or three times (depending on your very personal capacity to suffer), you will start to rethink whether relying on such free services is a good idea at all.

The Developer’s Point of View
There are quite a few apps, tools and pages out there that heavily depend on Google's APIs, including the Google Reader API. Not only do these apps and tools stop working; users who bought such products will be forced to stop using them as well. With feedly there is a timely alternative reader, and with Normandy developers get an API they might use for their products. However, Nick Bradbury has already announced that he will stop working on FeedDemon, the Windows client that heavily depends on the Google API for synchronization. More will definitely follow…

The Consequences
As a developer, I was affected once before; as a user, this is now the second time. With both services cut, I am left with Google Calendar. While Google might or might not continue this service in the future, one might rethink whether using it is a good choice. Keep in mind that we as users do not pay for it, and Google Apps Sync is meanwhile only available to business users (who presumably pay for it). Google Calendar Sync was a great tool to synchronize between Outlook and Google Calendar; I fought my way through the setup on Windows 7 three years ago, right after its development was stopped.

There is already a petition to keep Google Reader alive, supported by more than 35,000 users (nothing compared to the 10 million G+ users cited by Larry Page). Still, the chances that Google will continue the service are slim.

The Business Point of View Revisited
I wonder whether Google ever thought about charging for these services, and whether one (e.g. I) would pay for such a service. It would certainly depend on the amount charged. A few bucks a year won't hurt, and with a few ten thousand users that might pay the bills for such a service, one might think. On the other hand, a company like Google is probably not interested in any service with less than ten million dollars of revenue (insert whatever amount you think is suitable) or a million users…

IEEE Data Breach

A few days ago, Radu Drăgușin discovered a data leak on the IEEE servers, enabling him to download about 100,000 plain text passwords (probably mine as well).

On the one hand, it shows how critical it is to consider the security of your systems, no matter whether you are a small company or a worldwide organization such as the IEEE. On the other hand, it shows that even large organizations you would never have suspected can suffer such fatal security leaks.

However, Radu went ahead and (a) decided not to share the information he gained through this security leak with the public (big kudos for this decision), (b) prepared various statistics on ieeelog.com based on the data (which are indeed interesting without revealing traceable information about individuals) and (c) informed the IEEE about the leak (also kudos for this). All in all, he handled the data he obtained quite responsibly and followed at least some of the principles laid out in the IEEE Computer Society Code of Ethics.

One result of his analysis is that almost 300 users use the password 123456, reminding me of Dark Helmet in Mel Brooks' epic Star Wars parody Spaceballs:

“So the combination is… one, two, three, four, five? That’s the stupidest combination I’ve ever heard in my life! That’s the kind of thing an idiot would have on his luggage!”

As a result, I went straight to my IEEE account and changed the password. Luckily, it was a password not used for any other site besides the IEEE. That said, if you have an IEEE account, it is probably a good idea to go there and change yours as well, if you have not already done so.

Most used IEEE passwords

And Radu, if you ever read this post and have the chance, please have a look into the log files and let me know whether the user aheil is listed there as well.

Cross-domain Mash-up using Google Feed API

If you want to retrieve cross-domain content via AJAX/JavaScript to build a mash-up client, browsers might restrict these calls for security reasons.

Digging through the resources on the web, you will find various approaches. I decided against any server-side processing of the request, as I did not want to make an extra call to my own server. Any jQuery-plugin-based approach would not work at the moment either, due to the recent unavailability of the jQuery plugins.

Looking for an alternative approach, I came across the Google Feed API. Basically, it allows you to download any public Atom or RSS feed and consume it in your JavaScript.

Once you have obtained your API key, which is bound to the domain you want to call the API from, you can start using it immediately; the key is valid for all pages within that domain. Using the API involves adding the script to the head of your HTML, loading the API using Google Loader's load() call and finally hooking up your code as a callback via setOnLoadCallback(). The feed is then provided either as JSON or as XML by the Google Feed API and can easily be used within your code without any cross-domain restrictions.
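As a rough sketch of this pattern (the API key and feed URL are placeholders, and the page is assumed to contain a <div id="feed"> element), the whole setup boils down to something like this:

<script type="text/javascript" src="https://www.google.com/jsapi?key=YOUR-API-KEY"></script>
<script type="text/javascript">
  // Load version 1 of the Feed API via the Google Loader.
  google.load("feeds", "1");

  function showFeed() {
    var feed = new google.feeds.Feed("http://www.example.com/feed/");
    feed.load(function (result) {
      if (result.error) {
        return;
      }
      // Each entry exposes title, link, contentSnippet and more.
      for (var i = 0; i < result.feed.entries.length; i++) {
        var entry = result.feed.entries[i];
        document.getElementById("feed").innerHTML +=
          '<p><a href="' + entry.link + '">' + entry.title + '</a></p>';
      }
    });
  }

  // Run the callback once the API has been loaded.
  google.setOnLoadCallback(showFeed);
</script>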

Google Plus Operator

Google has replaced the + (plus) operator in its search. When searching for an expression using the plus operator, Google now tells you that from now on double quotation marks are necessary to search for an exact expression.


I am not sure I like this; however, it does not look like there are many options to ignore the change. It probably has to do with the G+ notation. To me it feels as bad as product and event names like .net or build, which in combination with the new double quotation mark operator yield some 2,490,000,000 results that are not relevant at all.


Migrating dasBlog to WordPress

Over the last couple of years, I ran my blog on the dasBlog engine. When I started hosting the blog on my own server in 2004, I chose dasBlog as it did not need any database on the backend, saved everything in XML and did a great job at full-text search over the XML content. Besides that, a blog engine running on ASP.NET seemed the right choice, as I was familiar with the technology. Over the last few years I applied several fixes and hacks to my installation. Unfortunately, there has been no new release since March 2009. As I like playing with alternative technologies from time to time and WordPress comes with a rich set of features I miss in dasBlog, I decided to migrate to WordPress. In this article I describe the steps of moving to WordPress hosted on a Windows Server 2008.

Overview

Moving to the new platform involves several steps. First of all, the server has to be prepared to host the new platform. After the new blog engine is set up, the content needs to be migrated. Finally, the old engine needs to be shut down and the server needs to be configured to forward requests for the old engine to the new one.

Installing WordPress

Installing WordPress should be relatively easy as it is available through the Microsoft Web Platform Installer 2.0. However, you might encounter issues during the process on machines running IIS 7, as the required Windows Update KB980363 causes the installation process to hang. The update only hangs when started from within the Web Platform Installer, so pick it from the Microsoft Download Page and install the hotfix beforehand. Before installing WordPress you also need to install PHP on the server. In addition to the instructions on how to configure PHP on IIS 7, Ruslan Yakushev provides a very good tutorial on how to set up FastCGI on Windows Server 2008.
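For reference, the essential piece of such a setup is a handler mapping that routes .php requests through IIS' FastCGI module. A minimal sketch (assuming PHP is installed to C:\PHP; adjust the path to your installation) looks roughly like this:

<system.webServer>
  <handlers>
    <add name="PHP-FastCGI" path="*.php" verb="*"
         modules="FastCgiModule"
         scriptProcessor="C:\PHP\php-cgi.exe"
         resourceType="Either" />
  </handlers>
</system.webServer>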

Migrating from dasBlog to WordPress

Originally, I planned to use BlogML to migrate the content from dasBlog to WordPress. Instead I found dasBlogML, a simple GUI wrapper around the original BlogML tooling. First, you export the content of the old blog to your local machine.

dasBlogML

To import the BlogML data, you might want to follow Edgardo Vega's article. To avoid potential problems during the import, also have a look at Daniel Kirstenpfad's tip about replacing all &nbsp; occurrences in the XML file. Using the BlogML Importer plug-in you can finally import the previously exported XML file.

Import BlogML

Redirecting dasBlog

In the final step I had to redirect requests from the old blog to the new one. There are several issues to think about: First of all, all binaries are still referenced from the old blog, so it is not possible to simply shut it down. Furthermore, many entries are linked from several places all over the web.

My solution is to create an IIS module using managed code and the ASP.NET server extensibility APIs. First of all, I had a look at the permalink schemes I had chosen for the old blog

http://www.blog.old/yyyy/mm/dd/articletitle.aspx

and the new one

http://www.blog.new/yyyy/mm/dd/article-title/

Consequently, the HTTP module has to perform several steps: replace the domain, remove the technology-specific information in the form of the .aspx file extension (technology-specific information is not good practice anyway, following Tim Berners-Lee's article about cool URIs) and finally add hyphens to the article title. While the latter is more or less impossible to do reliably, there is an easy workaround: the permalink scheme I have chosen in WordPress lists all articles of a given day if you omit the article title from the URI. Consequently, the requested URI is rewritten by the module to

http://www.blog.new/yyyy/mm/dd/

and sent back in the response with HTTP status code 301 (moved permanently) based on RFC 2616:

“The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise.

The new permanent URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).“

Additional URIs that need to be processed are in the form of

http://www.blog.old/CategoryView,category,categoryname.aspx

This one is also relatively easy, as WordPress expects the category in the form of

http://www.blog.new/category/categoryname/

Finally, the selection from the calendar in dasBlog looks like

http://www.blog.old/default,date,yyyy-mm-dd.aspx

and needs to be transformed into the corresponding day listing

http://www.blog.new/yyyy/mm/dd/

The core of the module's request handler sets up the 301 response accordingly:

HttpApplication application = (HttpApplication)sender;
HttpContext context = application.Context;

// Rewrite the response into a permanent redirect to the new location.
context.Response.StatusCode = 301;
context.Response.RedirectLocation = GetRedirectLocation(context.Request);
context.Response.Cache.SetCacheability(HttpCacheability.Public);

// Unless the request method was HEAD, add a short hypertext note
// pointing to the new URI, as suggested by RFC 2616.
if (!context.Request.HttpMethod.Equals("HEAD"))
{
    ...
}

To create the redirect locations I use a set of Regex objects that cover the most important URI types.

Regex singleUriPattern
  = new Regex("http://" + OLD_DOMAIN
  + @"/[0-9]{4}/[0-9]{2}/[0-9]{2}/([\w%+-]+)\.aspx");
Regex categoryUriPattern
  = new Regex("http://" + OLD_DOMAIN
  + @"/CategoryView,category,([\w%+-]+)\.aspx");
Regex dateUriPattern
  = new Regex("http://" + OLD_DOMAIN
  + @"/default,date,[0-9]{4}-[0-9]{2}-[0-9]{2}\.aspx");

Now everything besides the content can be deleted from the old dasBlog installation. To catch any requests not covered by the previously deployed module, the custom error page for status code 404 is set to a corresponding URI on the new blog.

After deploying the module (into the bin folder of the dasBlog installation), it needs to be registered in the web.config. To do so, you just add it to the httpModules section.

<httpModules>
  <add name="UriRedirector" type="RedirectModule" />
</httpModules>
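Note that this registers the module for the Classic ASP.NET pipeline. If the application pool runs in Integrated mode instead, IIS 7 expects the module to be registered in the system.webServer section; a corresponding entry would look like this:

<system.webServer>
  <modules>
    <add name="UriRedirector" type="RedirectModule" />
  </modules>
</system.webServer>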

Edit Custom Error Page Dialog

If the application pool is running in Classic mode, the custom error pages do not cover any ASP.NET content. Therefore, add the customErrors section to the web.config file as well. Now all requests that do not ask for any content from the old blog, or that are not redirected by your module, end up on the new WordPress blog.

<customErrors mode="On">
     <error statusCode="404" redirect="http://www.blog.new/404/" />
</customErrors>

Conclusion

Now the content from the old dasBlog instance is displayed on the new WordPress blog, the most important links to the old dasBlog pages are covered by the URI redirection to the new blog, and everything else is caught by the WordPress blog as well. You might want to extend the redirect module with further regular expressions (e.g. to cover CommentView.aspx or other dasBlog pages).

Web Activities in Windows Live

 

Windows Live is becoming more open to other web-based platforms. Maybe this feature was there before; however, I had not seen it yet. Windows Live can consume activity events from other platforms; among the supported ones you will find TripIt, Flickr, Twitter and others.

Windows Live Web activities

Adding the applications is quite easy. Sometimes (e.g. for TripIt) you have to sign in and confirm.

Share your activity on Windows Live

I am looking forward to finding even more supported activities in the future. It definitely looks like a step in the right direction.

Photosynth goes Live

Live Maps has come up with a new feature that lets you find your Synths on the map. However, it is not easy to find when looking for it the first time.

Live Maps

To get the Photosynth collections, zoom into any area of interest, and select Collections and Explore Collections from the upper right menus.

Collections

On the left side you can choose which collections the results should be displayed from. The fourth icon provides a list of Synths from the currently viewed area.

Synths

Synthing Karlsruhe

I did a third Synth, and this time I got it 100% synthy. For this attempt, I used about 30 images of Karlsruhe that I took during a night session. I also used a panorama I created out of a few of these images. This seems to be a quite good approach to help Photosynth create a good Synth.

I also used the map feature you can find on your Synth's page for the first time.

Map

You simply select the place your Synth belongs to and save it along with the Synth.

Location of Synth