Microsoft, Information Technologies...



  • From Taiwan, living and working in Tokyo, Japan.

Creative Commons License
This work is licensed under a Creative Commons license.

November 2005 - Posts

URL Rewriting in ASP.NET


Just a follow-up reading about URL Rewriting in ASP.NET.

I have set up URL rewriting on this blog site to point my old blog posts to the new site URLs, and at the same time pointed all my RSS links to FeedBurner. This was all done with CommunityServer's URL rewriting subsystem.

I just ran into another blog post by James Avery about the built-in URL rewriting in ASP.NET 2.0, which seems less useful than it was meant to be, since it does not support regular expressions.

James's post links to a nice conceptual article on how to do URL rewriting in ASP.NET. It should help in understanding how IIS and ASP.NET handle URL requests and perform the rewriting.

As for the lack of regular expression support in ASP.NET 2.0 URL rewriting, the comments on James's post here and here show that Christopher and Will each have an approach for adding regex support. Christopher's approach also uses an HTTP module to handle the rewriting, which is the same way CommunityServer does it.

I do think that if ASP.NET 2.0 does not provide useful built-in rewriting, I should consider building my own on the concepts already established in .NET 1.1.

Technorati Tags: communityserver , , programming


Visual Studio Add-Ins Every Developer Should Download Now


[via Michiko Osada] [via MSDN magazine, December 2005]

Tools matter!!

Check out James Avery's article on Visual Studio add-ins, Ten Essential Tools, in the latest issue of MSDN Magazine. I am sure most developers in the .NET world will need them, just as I did!

Technorati Tags: addin , visualstudio , programming


Firefox 1.5 is out...


[via ijliao][via Slashdot]

Firefox 1.5 is out! I have installed the new version on my notebook. My first impression is the speed improvement while browsing websites. The Ctrl+Tab hotkey for switching between tabs is convenient too!

It's not my main browser, as I am still using IE, but I do need it installed to test my websites for cross-browser compatibility. Firefox has without doubt become one of the mainstream browsers.

Technorati Tags: browser , firefox


The fun of URL Rewriting...


This blog site has finally moved to CommunityServer 1.1.

During the site transfer I moved the blog from the sub-domain to the root domain, and at the same time moved the archived picture files and other publicly referenced files out to another IIS virtual root with a different domain name. That is a big change of URLs for the website, even though few of the actual files moved. In the past I mixed all the domain names I had registered when pointing to files (most of them point to the same virtual root, so any of them worked in a path), so unifying the domain names in the URLs cost me some time spent organizing all the blog content.

The .Text conversion tool provided by Kevin Harder offers a string replacement step in the middle of the conversion, while the old .Text posts are moved to the new CommunityServer database. This saved me a lot of time re-editing posts. Note that it is not possible to change BLOB fields (the post body) with plain SQL statements using the Replace function. To do string replacement in those posts, one must write a small program that reads each post out of the database, modifies it in memory, and then writes it back to the database, which is what Kevin's conversion tool does.
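The read, modify in memory, and write back loop described above can be sketched roughly as follows. This is only an illustration, not Kevin's tool: it uses Python's sqlite3 as a stand-in database, and the `posts` table, its columns, and the example.com URLs are all hypothetical.

```python
import sqlite3

# Hypothetical old-to-new URL prefixes; the real values depend on the site.
REPLACEMENTS = {
    "http://old.example.com/images/": "http://files.example.com/images/",
}

def replace_in_posts(conn, replacements):
    """Read each post body, replace strings in memory, and write it back."""
    changed = 0
    rows = conn.execute("SELECT post_id, body FROM posts").fetchall()
    for post_id, body in rows:
        new_body = body
        for old, new in replacements.items():
            new_body = new_body.replace(old, new)
        if new_body != body:
            conn.execute(
                "UPDATE posts SET body = ? WHERE post_id = ?",
                (new_body, post_id),
            )
            changed += 1
    conn.commit()
    return changed

# Small in-memory demonstration with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO posts (body) VALUES (?)",
    [
        ('<img src="http://old.example.com/images/a.jpg">',),
        ("no links here",),
    ],
)
updated = replace_in_posts(conn, REPLACEMENTS)
```

Only rows whose body actually changed are written back, which keeps the number of updates small when most posts contain no old links.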

After the first conversion had moved all the posts, links, referrals, and so on to the new CommunityServer blog site, the URLs of the files moved to the other virtual root had already been replaced with the correct links. I still needed to run some database queries to find the remaining posts whose links pointed to files under a different domain name, and change them manually to the new domain name. Fortunately it was only about 30 posts or so.

Once all the content had correct URLs, it was time to consider all the outside links around the world that point to my site. Those links all use the old domain name to reach the files, so I needed a way to redirect the old links to the new links on my new blog site.

I found that CommunityServer now has a very good URL rewriting subsystem that does all the rewriting in an HTTP module and stores the rewriting configuration in the SiteUrls.config file. Each rewriting item has two groups of settings for two functions. The first lets the system find the real path where an .aspx file is located, using a format string so that the program can pass parameters to form the real URL. The second applies when the URL is really processed by an HTTP handler in another .aspx file: the same item then carries two more XML attributes, pattern and vanity. A regular expression in pattern matches the original path, which is then transformed into the real URL specified by vanity, with parameters captured by the pattern, to call the real .aspx file. Pretty cool!

So I wanted to use this rewriting system to point my old posts to the new posts. The first step is to point the old domain name to the new virtual root, so that the old addresses also reach the new website and the old blog traffic goes to the new site too. The second step is to map the old .Text URLs to the new blog site. The old .Text site's web.config file contains a bunch of HTTP handler settings that also use pattern matching to map URLs to the .aspx programs that actually process them, and they are a good reference for starting the mapping to the new locations.

The former .Text URL rewriting settings were as follows (inside the web.config file):

<HandlerConfiguration  defaultPageLocation = "DTP.aspx" type = "Dottext.Common.UrlManager.HandlerConfiguration, Dottext.Common">
<HttpHandler pattern = "^(?:/rss20\.aspx)$" type = "Dottext.Common.Syndication.RssHandler, Dottext.Common" handlerType = "Direct" />
<HttpHandler pattern = "^(?:/atom20\.aspx)$" type = "Dottext.Common.Syndication.AtomHandler, Dottext.Common" handlerType = "Direct" />
<HttpHandler pattern = "^(?:/comments/commentRss/\d+\.aspx)$" type = "Dottext.Common.Syndication.RssCommentHandler, Dottext.Common" handlerType = "Direct"/>
<HttpHandler pattern = "^(?:/aggbug/\d+\.aspx)$" type = "Dottext.Framework.Tracking.AggBugHandler, Dottext.Framework" handlerType = "Direct"/>
<HttpHandler pattern = "^(?:/customcss\.aspx)$" type = "Dottext.Web.UI.Handlers.BlogSecondaryCssHandler, Dottext.Web" handlerType = "Direct" />
<HttpHandler pattern = "^(?:/category\/(\d|\w|\s)+\.aspx/rss)$" type = "Dottext.Common.Syndication.RssCategoryHandler, Dottext.Common" handlerType = "Direct" />
<HttpHandler pattern = "^(?:((\/default\.aspx)?|(\/?))?)$"  controls = "homepage.ascx"/>
<HttpHandler pattern = "^(?:/articles/\d+\.aspx)$" controls = "viewpost.ascx,Comments.ascx,PostComment.ascx" />
<HttpHandler pattern = "^(?:/articles/\w+\.aspx)$" controls = "viewpost.ascx,Comments.ascx,PostComment.ascx" />               
<HttpHandler pattern = "^(?:/archive/\d{4}/\d{2}/\d{2}/\d+\.aspx)$" controls = "viewpost.ascx,Comments.ascx,PostComment.ascx" />
<HttpHandler pattern = "^(?:/archive/\d{4}/\d{2}/\d{2}/\w+\.aspx)$" controls = "viewpost.ascx,Comments.ascx,PostComment.ascx" />
<HttpHandler pattern = "^(?:/archive/\d{4}/\d{1,2}/\d{1,2}\.aspx)$" controls = "ArchiveDay.ascx" />
<HttpHandler pattern = "^(?:/archive/\d{4}/\d{1,2}\.aspx)$" controls = "ArchiveMonth.ascx" />
<HttpHandler pattern = "^(?:/contact\.aspx)$" controls="Contact.ascx" />
<HttpHandler pattern = "/posts/|/story/|/archive/" type="Dottext.Web.UI.Handlers.RedirectHandler,Dottext.Web"  handlerType = "Direct"/>
<HttpHandler pattern = "^(?:/gallery\/\d+\.aspx)$" controls="GalleryThumbNailViewer.ascx" />
<HttpHandler pattern = "^(?:/gallery\/image\/\d+\.aspx)$" controls="ViewPicture.ascx" />
<HttpHandler pattern = "^(?:/(?:category|stories)/(\w|\s)+\.aspx)$" controls="CategoryEntryList.ascx" />
<HttpHandler pattern = "^(?:/comments\/\d+\.aspx)$" type = "Dottext.Common.Syndication.CommentHandler, Dottext.Common" handlerType = "Direct" />
<HttpHandler pattern = "^(?:/services\/trackbacks/\d+\.aspx)$" type = "Dottext.Framework.Tracking.TrackBackHandler, Dottext.Framework" handlerType = "Direct" />
<HttpHandler pattern = "^(?:/services\/pingback\.aspx)$" type = "Dottext.Framework.Tracking.PingBackService, Dottext.Framework" handlerType = "Direct" />
<HttpHandler pattern = "^(?:/services\/metablogapi\.aspx)$" type = "Dottext.Framework.XmlRpc.MetaWeblog, Dottext.Framework" handlerType = "Direct" />

Then do a one-to-one mapping of those URLs to the new URLs by editing the SiteUrls.config file. First, add a new location with an empty start path (this goes in the "location" section):

<location name="empty" path="" />

Then start the mapping in the "url" section:

<!-- below is the general mapping from .text to cs files -->
<url name = "oldcat03" location="empty" path="/rss\.aspx" pattern="\.aspx" vanity="" />
<url name = "oldcat04" location="empty" path="/atom\.aspx" pattern="\.aspx" vanity="" />
<url name = "oldcat05" location="empty" path="/comments/commentRss/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1" />
<url name = "oldcat06" location="empty" path="/aggbug/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1" />
<url name = "oldcat07" location="empty" path="/customcss\.aspx" pattern="" vanity="" />
<url name = "oldcat08" location="empty" path="/category\/(\d|\w|\s)+\.aspx/rss" pattern="\d+)\.aspx/rss" vanity="$1" />
<url name = "oldcat09" location="empty" path="((\/default\.aspx)?|(\/?))?" pattern="" vanity="" />
<url name = "oldcat10" location="empty" path="/articles/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1.aspx" />
<url name = "oldcat11" location="empty" path="/articles/\w+\.aspx" pattern="\w+)\.aspx" vanity="$1.aspx" />
<url name = "oldcat12" location="empty" path="/archive/\d{4}/\d{2}/\d{2}/\d+\.aspx" pattern="\d{4})/(\d{1,2})/(\d{1,2})/(\d+)\.aspx" vanity=";y=$1&amp;m=$2&amp;d=$3&amp;PostID=$4" />
<url name = "oldcat13" location="empty" path="/archive/\d{4}/\d{2}/\d{2}/\w+\.aspx" pattern="\d{4})/(\d{1,2})/(\d{1,2})/(\w+)\.aspx" vanity=";y=$1&amp;m=$2&amp;d=$3&amp;PostName=$4" />
<url name = "oldcat14" location="empty" path="/archive/\d{4}/\d{1,2}/\d{1,2}\.aspx" pattern="\d{4})/(\d{1,2})/(\d{1,2})\.aspx" vanity=";y=$1&amp;m=$2&amp;d=$3" />
<url name = "oldcat15" location="empty" path="/archive/\d{4}/\d{1,2}\.aspx" pattern="\d{4})/(\d{1,2})\.aspx" vanity=";y=$1&amp;m=$2&amp;d=1" />
<url name = "oldcat16" location="empty" path="/contact\.aspx" pattern="\.aspx" vanity="" />
<url name = "oldcat17" location="empty" path="/gallery\/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1.aspx" />
<url name = "oldcat18" location="empty" path="/gallery\/image\/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1.aspx" />
<url name = "oldcat19" location="empty" path="/(?:category|stories)/(\w|\s)+\.aspx" pattern="|stories)/(\d+)\.aspx" vanity="$1.aspx" />
<url name = "oldcat20" location="empty" path="/comments\/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1.aspx" />
<url name = "oldcat21" location="empty" path="/services\/trackbacks/\d+\.aspx" pattern="\d+)\.aspx" vanity="$1" />

Since the path attribute exists for programs to pass parameters, it is not used here and just serves as a comment string. The pattern attribute specifies the original .Text site path pattern, and vanity points to the actual CommunityServer program path. Note that the mapped path might not be the final path of a program; it may be rewritten several more times before reaching its real HTTP handler, which is defined elsewhere in the SiteUrls.config file.
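To make the pattern and vanity idea concrete, here is a minimal Python sketch of how a regular expression pattern with capture groups can be expanded into a target URL. It is not CommunityServer's code: the host names, the target path, and the helper names `expand_vanity` and `rewrite` are assumptions made up for illustration, and only the `$1`-style placeholders mirror the config entries above.

```python
import re

# Hypothetical rule modelled on the archive mapping: the host names and the
# target path are placeholders, not the values used on the real site.
RULES = [
    {
        "name": "old_archive_post",
        "pattern": r"^http://old\.example\.com/archive/(\d{4})/(\d{1,2})/(\d{1,2})/(\d+)\.aspx$",
        "vanity": "http://new.example.com/blogs/post.aspx?y=$1&m=$2&d=$3&PostID=$4",
    },
]

def expand_vanity(vanity, match):
    """Substitute $1, $2, ... in a vanity string with the captured groups."""
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), vanity)

def rewrite(url, rules=RULES):
    """Return the rewritten URL for the first matching rule, else None."""
    for rule in rules:
        match = re.match(rule["pattern"], url, re.IGNORECASE)
        if match:
            return expand_vanity(rule["vanity"], match)
    return None

result = rewrite("http://old.example.com/archive/2005/11/30/1234.aspx")
# result == "http://new.example.com/blogs/post.aspx?y=2005&m=11&d=30&PostID=1234"
```

The first matching rule wins, so more specific patterns should be listed before more general ones.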

The CommunityServer source code must be modified to support the rewriting above with full URL paths, including the leading http://; otherwise the output URL will not redirect to the right place. Since the URL rewriting subsystem was originally designed only for rewriting within the site, the HTTP module code needs some care to make the mapping above work.

After all these modifications, my old links correctly reach their new places without breaking. All the outside post links now arrive at the new place with an HTTP 301 status code, indicating that the old URL has permanently moved there (I added code to do this!). Furthermore, the same approach lets me redirect all my RSS links and category links out to another RSS service provider such as FeedBurner, as my site currently does, without modifying skin pages or code to point the category RSS and site RSS outside. Just write one more URL rewrite and it's all out!
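The 301 behaviour can be illustrated with a small Python WSGI middleware. This is a sketch of the general technique only, not the code added to CommunityServer's HTTP module; the class name `PermanentRedirect`, the rule, and the example.com target are hypothetical.

```python
import re

# Hypothetical rule: old path pattern -> absolute URL on the new site.
REDIRECTS = [
    (
        re.compile(r"^/articles/(\d+)\.aspx$", re.IGNORECASE),
        r"http://new.example.com/blogs/articles/\1.aspx",
    ),
]

class PermanentRedirect:
    """WSGI middleware that answers matching old paths with HTTP 301."""

    def __init__(self, app, redirects=REDIRECTS):
        self.app = app
        self.redirects = redirects

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for pattern, target in self.redirects:
            match = pattern.match(path)
            if match:
                # Expand \1, \2, ... in the target with the captured groups.
                location = match.expand(target)
                start_response(
                    "301 Moved Permanently",
                    [("Location", location), ("Content-Length", "0")],
                )
                return [b""]
        # No rule matched: hand the request to the wrapped application.
        return self.app(environ, start_response)

def fallback_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]

app = PermanentRedirect(fallback_app)
```

A 301 tells browsers and feed readers that the move is permanent, so well-behaved clients can update their stored links instead of asking the old address again.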

Everything so far applies to HTTP requests that make it to CommunityServer's HTTP modules. Requests for static HTML files, image files, and so on are served by default IIS behavior, long before CommunityServer's HTTP module could capture and rewrite them. How do we rewrite those?

It cannot be done inside the CommunityServer site with the rewriting subsystem alone. A rewriter that sits lower down, at the IIS level, must be used. I have used ISAPI_Rewrite for a long time to fend off spammers, and it is a good place to rewrite requests for those HTML and image files.

For example, I have a big folder of pictures that some outside links reference through the new domain name. That domain name has now become the blog site, and I moved the files to the other domain name. The rewrite in ISAPI_Rewrite simply looks like this:

RewriteCond Host:
RewriteRule /misc/(.*) http\://archive\.tang\.tw/misc/$1 [I,R]
RewriteCond Host:
RewriteRule /images/(.*) http\://rex\.la/blogs/past/images/$1 [I,R]
RewriteCond Host:
RewriteRule /rex-resume.htm http\://archive\.tang\.tw/rex-resume\.htm [I,R,L]

Note that the Host header is used as the rewrite condition to prevent infinite rewriting loops, since both virtual roots have the same sub-directory pattern.
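The reason the host check stops the loop can be shown with a few lines of Python. This is only a model of the logic, not ISAPI_Rewrite itself, and the two host names are hypothetical placeholders.

```python
import re

# Hypothetical hosts: both serve a /misc/ directory, as in the rules above.
BLOG_HOST = "blog.example.com"
ARCHIVE_HOST = "archive.example.com"

def redirect_target(host, path):
    """Return a redirect URL, or None when the request should be served as-is."""
    # The host check plays the role of the RewriteCond: only requests that
    # arrive on the blog host are redirected.
    if host.lower() != BLOG_HOST:
        return None
    match = re.match(r"^/misc/(.*)$", path, re.IGNORECASE)
    if match:
        return "http://%s/misc/%s" % (ARCHIVE_HOST, match.group(1))
    return None

first_hop = redirect_target("blog.example.com", "/misc/photo.jpg")
second_hop = redirect_target("archive.example.com", "/misc/photo.jpg")
```

Without the host check, the redirected request for /misc/photo.jpg on the archive host would match the same rule again and be redirected forever.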

And that should cover all my outside links without breaking any.

URL rewriting is fun and convenient for site transfer!

Technorati Tags: communityserver , , programming


Let CommunityServer search fulltext instead of keyword search...


For English content, CommunityServer provides a good keyword search that indexes keywords as blog posts or forum posts are added to the database. It uses a separate keyword table (cs_SearchBarrel) to store the keywords and their rankings. The search functions, whether searching all aggregated posts or an individual blog, join the keyword table to the post tables and search by keyword, ordered by ranking and post date (refer to the stored procedure cs_weblog_Search). The keywords to be searched are further hashed to integers to speed up the search. Pretty cool!

However, this does not work for blogs or forums that primarily use languages such as Chinese, Japanese, or Korean. Those languages seldom use spaces to separate words the way English and similar languages do, so the indexing pattern used by CommunityServer cannot build the index, and a keyword search in those languages against the whole database returns either no match or all posts. The problem is shown in the forum post here.
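A toy version of a space-separated keyword index shows where this breaks. The sketch below is not CommunityServer's indexer: the tokenizer, the use of CRC32 as the integer hash, and the sample posts are all assumptions chosen to keep the example short.

```python
import re
import zlib
from collections import defaultdict

def tokenize(text):
    """Split on whitespace and punctuation, as a space-separated index would."""
    return [w.lower() for w in re.split(r"[\s,.!?;:，。！？、]+", text) if w]

def keyword_hash(word):
    """Stable integer hash for a keyword (CRC32 is an arbitrary choice here)."""
    return zlib.crc32(word.encode("utf-8"))

def build_index(posts):
    """Map keyword hash -> set of post ids."""
    index = defaultdict(set)
    for post_id, body in posts.items():
        for word in tokenize(body):
            index[keyword_hash(word)].add(post_id)
    return index

posts = {
    1: "URL rewriting is fun",
    2: "我喜歡在東京工作",  # "I like working in Tokyo", written without spaces
}
index = build_index(posts)

# English keywords are found through the index.
english_hits = index.get(keyword_hash("rewriting"), set())
# The Chinese sentence was indexed as one long "word", so a lookup for the
# word 東京 ("Tokyo") inside it finds nothing.
chinese_hits = index.get(keyword_hash("東京"), set())
# A plain substring scan over the post bodies does find it.
substring_hits = {pid for pid, body in posts.items() if "東京" in body}
```

The index lookup is fast but only as good as the tokenizer, while the substring scan is slower and needs no knowledge of word boundaries.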

A Google search did not turn up useful information on how to solve this. I also visited China's CommunityServer website and found that its search function is broken for Chinese as well. I needed to find a way to solve this.

While I was using .Text as my blog system, there was no built-in search function for blog posts, so I used mainstream search engines to search my own blog site. Although that still works with CommunityServer, it seems a bit of a waste to drop the built-in functions the system provides. The main problem is that languages like Chinese do not fit the "space-separated keyword" pattern. Such languages need a keyword dictionary to find the keywords to index, which is too large a system modification for me. I needed a simple and fast way to solve the problem. The next idea that came to mind was to change the underlying, excellent keyword search mechanism back to an old-fashioned full-text search, and that is what I did a couple of hours ago.

If you look at the stored procedure for weblog search (cs_weblog_Search), you will find the join from the keyword table to the post table, plus some other joined tables used to select only individual forums or blogs. One operation creates a temp table, taking the SQL string generated by code to determine the keyword matches against the keyword table; the temp table is then joined to the post table to fetch the real content sent back to the system. The temp table also serves as the paging table, since it has an auto-identity field that acts as the post order and determines exactly which posts go back on each page. So my original thought of simply removing the temp table from the stored procedure is not a good way to go: it would affect too many higher-level functions, and too much source code would have to be modified.

The easy way is to modify only the search criteria generated by the code, so that the match runs directly against the post table rather than the keyword table. The CommunityServer source code has a nice SqlGenerator helper class for generating dynamic database queries, which can also be used for the new search criteria. After digging into the source code, I found that the search function provided by CommunityServer can do simple AND and OR searches, and that it automatically strips symbols to prevent malicious SQL injection that might harm the system (nice regular expression stripping!). A search like "new york and travel" or " or C#", or any combination of AND and OR, can be performed. Nice!
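As a rough illustration of matching directly against the post body with simple AND and OR terms, here is a Python sketch that builds a parameterized LIKE clause. It does not use or reproduce CommunityServer's SqlGenerator: `build_where`, `search`, the `posts` table, and the sqlite3 stand-in database are all hypothetical.

```python
import re
import sqlite3

def build_where(query, column="body"):
    """Turn 'a and b or c' into a parameterized LIKE clause.

    OR separates groups and AND joins terms within a group. Words with no
    operator between them stay together as one phrase.
    """
    # Replace symbols other than letters, digits, '#', '+', and whitespace.
    cleaned = re.sub(r"[^\w\s#+]|_", " ", query)
    or_groups = []
    params = []
    for group in re.split(r"\s+or\s+", cleaned.strip(), flags=re.IGNORECASE):
        terms = [
            t.strip()
            for t in re.split(r"\s+and\s+", group.strip(), flags=re.IGNORECASE)
            if t.strip()
        ]
        if not terms:
            continue
        or_groups.append(
            "(" + " AND ".join("%s LIKE ?" % column for _ in terms) + ")"
        )
        params.extend("%" + t + "%" for t in terms)
    return " OR ".join(or_groups), params

def search(conn, query):
    """Return the ids of posts whose body matches the query."""
    where, params = build_where(query)
    if not where:
        return []
    sql = "SELECT post_id FROM posts WHERE " + where + " ORDER BY post_id"
    return [row[0] for row in conn.execute(sql, params)]

# Small in-memory demonstration with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO posts (post_id, body) VALUES (?, ?)",
    [
        (1, "A travel diary from New York"),
        (2, "Notes on C# generics"),
        (3, "東京の天気"),  # Japanese: "Tokyo weather"
    ],
)
```

The search terms are passed as bound parameters rather than pasted into the SQL text, so the symbol stripping is a second line of defence, not the only one. Because LIKE scans the text itself, it needs no word boundaries and works the same way for Chinese or Japanese content.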

To minimize the impact on the source code as a whole, the work is limited to the search-related functions near the DataProvider level, without going down to the database level (no modification to the stored procedure), so that the records returned to the data reader object keep the same schema for the later controls to DataBind. A better way would be a real "Search Provider" for CommunityServer, which hopefully Telligent Systems will provide later, but for now I will just make the modification for my own blog site.

The main search box in the UI ultimately leads to a WeblogDataProvider and invokes its GetSearchResult function, and that is where the full-text search modification goes. The same pattern should also fit forum search or even gallery search (I have not tested those).

This is technical enough, and it requires modifying the source code, so I will just provide the main part of the code here. I think my comments should be enough to explain most of it.

Partial source code for CommunityServer full-text modification here.

My site has already adopted the code above and is now in full-text search mode (keyword search and keyword generation were both turned off by modifying the code). Chinese search works, and English of course works. As for Japanese, there are still tiny problems related to the page encoding, the database content, and my OS version (a Traditional Chinese OS). These are not related to the full-text search code, so they should not affect the results the code can produce.



Finally, got this new blog system working...


So, finally got this new blog system working...

I'll post various things about how I got this new blog system working later. I'm just too tired for now, after a couple of days without good sleep.

The old URLs now redirect automatically to the home page here, and all the links of the old blog are correctly redirected to their posts inside this site, so links from the outside world still work without breaking. Thanks go to the good URL rewriting architecture of CommunityServer (with some modification of the source code to let it rewrite to outside locations; I will talk about this later).

The Past blog's RSS link maps to the whole-site RSS link hosted by FeedBurner, and the Work / Life / Japanese blogs each have their own RSS link for you to choose from. Each blog's category RSS does not aggregate its own category and instead maps to the blog's RSS link (that is, I pointed the category RSS at the blog RSS, so category RSS does not function as intended). All the RSS aggregation links are now hosted by FeedBurner.

The old RSS link has already been deleted. FeedBurner provides a 30-day redirect service that lets your RSS reader know the new RSS feed location. The change is negotiated automatically from your RSS reader to FeedBurner and finally to my blog system, so you do not need to modify anything in your RSS reader. Just in case you want to change the feed by hand, replace the old feed with the new whole-site RSS feed, or click the tags above for the RSS of each individual blog and its topic.

Sleepy, but finally finished. (The search box can only search English content for now; Chinese content is still not searchable, and I need to fix this later.)


Microsoft ActiveSync 4.1 released...


[Via MSDN blogs]

ActiveSync 4.1 released!

If you are using a mobile device running Windows Mobile 5, ActiveSync 4.0 or higher is required to synchronize the device with Outlook on your notebook or PC. Version 4.0 had some problems connecting mobile devices over wireless and Bluetooth networks, and this upgraded version is expected to solve them.

Other new features include:

  • New partnership wizard makes syncing easier
  • Faster transfer of data files including media
  • Ability to sync photos assigned to contacts from Outlook on the desktop

Technorati Tags: Microsoft , Windows Mobile , Activesync