I came across a really odd corner case in a customer ticket today. I was unable to find anything about this problem that involved rewrites, so here it is! My first real, kinda non-infosec post. Shoutout to all the sysadmins and ops in the world, the struggle is real! <3
In Microsoft Word 2010, URLs that have been pasted into a document are turned into hyperlinks. However, for a reason I cannot find any reasonable explanation for, this is what happens:
What Word passes to the default browser:
# gets turned into %20-%20 (a URL-encoded " - ")
A desperate search on the intertubes revealed you can actually implement a registry hack to fix this. In this case, that is simply not possible. Word documents with macros that change registry values = malware. Idk what Word is doing here, but it’s doing the WRONG thing.
After a good hour making some of the most monstrous regular expressions I think I’ve ever made, I finally started getting somewhere.
In the end this rewrite rule was born:
rewrite ^(.*)\ -\ (.*)$ $1#$2 redirect;
Note! Nginx will automatically decode the %20 to a space before the URI hits the rewrite block, which is why the regex matches a literal (escaped) space rather than %20.
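For context, here is a minimal sketch of where a rule like this could sit. The listen port, server_name, and log path are assumptions for illustration, not the customer's actual config. The rewrite_log directive is only there so that nginx logs each rewrite match at the notice level, which is handy while debugging.

```nginx
server {
    listen 80;                # assumed port
    server_name localhost;    # assumed host, same one used for local testing

    # Assumed log path. rewrite_log writes at the "notice" level,
    # so error_log has to be set to notice (or something more verbose).
    error_log /var/log/nginx/error.log notice;
    rewrite_log on;

    # nginx matches against the decoded URI, so "%20-%20" shows up as " - ".
    # The first (.*) is greedy, so the LAST " - " in the path becomes the "#".
    # "redirect" sends a 302 back to the client with the rewritten URL.
    rewrite ^(.*)\ -\ (.*)$ $1#$2 redirect;
}
```

The "#" inside $1#$2 is safe because nginx only treats "#" as a comment when it starts a new token, not in the middle of one.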
Here it is in action!
curl -v 'http://localhost/index.html%20-%20thingy'
2015/08/26 13:26:58 [notice] 17521#0: *67 "^(.*)\ -\ (.*)$" matches "/index.html - thingy", client: 127.0.0.1, server: localhost, request: "GET /index.html%20-%20thingy HTTP/1.1", host: "localhost"
< HTTP/1.1 302 Moved Temporarily
< Location: http://localhost/index.html#thingy