Setting apache deflate for pre-compressed file

I could not think of a better title than this, so let me just explain.

The use case is something like this: you want to serve your files from Apache (2.x) in a compressed format using the deflate module, but you don't want Apache to waste time compressing the files on the fly. Instead, you want it to pick up a pre-compressed file from disk and serve it straight away.

Sounds quite simple. Just enable the Apache deflate module and edit the conf file -
sudo a2enmod deflate

This enables straightforward compression, i.e. once this is on, Apache will send all files out in a compressed format, unless you edit the deflate.conf file to exclude some specific files from compression.

Here is a snapshot of my deflate.conf-
          SetOutputFilter DEFLATE
          SetInputFilter DEFLATE
          DeflateCompressionLevel 9
          # Don't compress images
          SetEnvIfNoCase Request_URI \
          \.(?:gif|jpe?g|png|gz)$ no-gzip dont-vary
          DeflateFilterNote Input input_info
          DeflateFilterNote Output output_info
          DeflateFilterNote Ratio ratio_info
          LogFormat '"%r" %{output_info}n/%{input_info}n (%{ratio_info}n%%)' deflate
          CustomLog /var/log/apache2/deflate_log deflate


Doing this alone sets up Apache to serve compressed responses, but not to pick up a pre-compressed file from disk. To let it save some time by serving the compressed file from disk, a few more changes are required.

Adding the MultiViews option along with an encoding mapping does the trick. These options are added in the conf file of the enabled site - /etc/apache2/sites-enabled/sitefoo.

Snapshot (the VirtualHost and Directory tags were eaten by the blog's HTML rendering; restored here) -
          <VirtualHost *:80>
                ServerAlias server
                ServerAdmin admin@localhost
                DocumentRoot /home/user/apacheroot/

                <Directory /home/user/apacheroot/>
                        Order deny,allow
                        Allow from all
                        Options MultiViews
                        AddEncoding x-gzip .gz
                </Directory>
          </VirtualHost>

          
Once these options are enabled and the Apache server has been restarted, requests coming from a client are handled somewhat like this -
1. The client requests the file "/foo/foo.sh"
2. Apache looks for "/foo/foo.sh". If the file "foo.sh" is not found, Apache looks for other file extensions on the same filename in the same directory, like "/foo/foo.*". In this case, if "/foo/foo.gz" is present, Apache serves that file.
3. The AddEncoding directive tells the client which encoding mechanism has been used for which file extension. In the snapshot above, "x-gzip .gz" means that gzip has been used for "*.gz" files, so the client is able to decompress the files and use them in their original form.

This saves a significant amount of time, as Apache no longer has to do the compression itself.
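The pre-compression step itself can be sketched locally. The paths and file names below are made up for the demo; note also that mod_negotiation matches variants on the full requested name, so naming the compressed copy foo.sh.gz (original name plus .gz) is the safest bet for a request to /foo/foo.sh.

```shell
# Hypothetical document root; adjust to your actual DocumentRoot
ROOT=/tmp/apacheroot
mkdir -p "$ROOT/foo"
printf 'echo hello\n' > "$ROOT/foo/foo.sh"

# Pre-compress next to the original; -9 matches DeflateCompressionLevel 9
gzip -9 -c "$ROOT/foo/foo.sh" > "$ROOT/foo/foo.sh.gz"

# Sanity check: the compressed variant round-trips to the original content
gzip -dc "$ROOT/foo/foo.sh.gz"
```

Against a running server, something like `curl -sI -H 'Accept-Encoding: gzip' http://host/foo/foo.sh` should then show a gzip Content-Encoding header instead of Apache compressing on the fly.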




EDIT: Enabling the deflate module is not necessary. I realized later that once MultiViews is enabled, deflate is not required at all.

The Diaspora effect

"Diaspora" is a perfect example of bringing a simple word out of the dictionary into the limelight, and the way it has been done is absolutely remarkable. Remarkable because nobody saw it coming, probably not even the Diaspora guys.

If you have already visited the link above, you know what I am talking about; if not, hear it from me. I am talking about the new wave in the world of social networking that is supposed to replace an existing czar, Facebook. This wave is called Diaspora.

Now this is the most interesting bit about Diaspora: the claim that it will replace Facebook. To me it sounds like someone is out there to replace Google. Although comparing Facebook to Google is not fair at all, it is somewhat justified considering the mammoth size of FB in its own domain (and that too after FB outsmarted Google in number of hits during a week, a few days ago).

I am not at all skeptical about Diaspora's capacity to outsmart FB, but why has this comparison started so early in the day? Where and how did it begin? It is good to hear that so many people are ready to fund a startup and that Diaspora is raising big money, but isn't it too big an expectation of a toddler? Yes, a toddler. Even Google, FB and Twitter were toddlers once, and each had its own share of problems, be it security or the UI. Fortunately, each also had its own share of time to improve and become popular. But when it comes to Diaspora, it looks like no one is ready to give it time. Everyone is looking at it as if, on the very first day, it will come and sweep FB away. Think about it honestly: that is not possible, at least IMHO. And god forbid, if Diaspora's already-famous security is breached and it fails to impress people with its abilities, no one is going to turn their face towards it again.

Every human being goes through situations where he or she does not want people around while making mistakes, let alone startups. Consider wannabe singers: what happens if someone suddenly organizes their live concert at London's O2 without letting them practice? The situation is the same here. Diaspora is being put on a stage to perform without being given enough opportunity to practice and make its own share of mistakes. To me, it is the bumpy ride that results in higher jumps; if the ride is too smooth from the very first day, the speed might slow down.

But I still hope for the best for Diaspora. Looking forward to it.

Capturing the stdout/err of nonterminating binaries

It's work again :)

Came across this issue at work. I did not have any control over the binary; otherwise, I could have done the logging from inside it.

Anyway, it goes something like this. I wanted to capture the standard output and error of a few binaries that run forever. The first option, of course, is output redirection -
$PATH/foobin >> $PATH/foolog 2>&1 &

But what happens to the file size if the binary does not terminate for months? :S

I tried clearing the content of the file once it reached 10MB, but no luck: the writing process keeps its own offset into the file, so as soon as the log is written again the apparent file size is restored, with the truncated region left behind as an empty (sparse) hole.


var=`ls -l $PATH/foolog | awk '{print $5}'`   # column 5 of ls -l is the size in bytes
if [ "$var" -gt 10000000 ]
then
        > $PATH/foolog                        # truncate the log to zero bytes
fi


This does not help :(
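The retained-offset effect can be reproduced locally. The dd seek below stands in for the running binary continuing to write at its old offset after the log has been truncated (the file name is made up for the demo):

```shell
f=/tmp/offset_demo
printf '12345' > "$f"        # the "log" has 5 bytes in it
: > "$f"                     # truncate it, as the size-check script did
# the running binary still writes at offset 5; emulate that with dd's seek
dd if=/dev/zero of="$f" bs=1 count=1 seek=5 conv=notrunc 2>/dev/null
wc -c < "$f"                 # reports 6: the truncated region is back as an empty hole
```

So the apparent size is right back where it was, even though most of the file is now a hole containing no data.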

So the solution is something like this.

Do not redirect the output; instead, pipe it to a script and let the script do the job.
$PATH/foobin 2>&1 | $PATH/foo.sh &

foo.sh -

#!/bin/sh
# Reads the binary's output from the pipe; read fails when the binary
# exits and the pipe closes, which ends the loop.
# ($PATH is used throughout this post as a placeholder for the log directory.)
while read -r line
do
        echo "$line" >> $PATH/foolog
        # column 5 of ls -l is the file size in bytes
        var=`ls -l $PATH/foolog | awk '{print $5}'`
        if [ "$var" -gt 10000000 ]
        then
                # rm (rather than truncate) works here, because the script
                # reopens the log on every write
                rm $PATH/foolog
        fi
done
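The pipeline can be exercised end to end with a short-lived stand-in for the binary (the function name and log path below are made up for the demo):

```shell
#!/bin/sh
LOG=/tmp/foolog_demo
: > "$LOG"

# Stand-in producer: emits to both stdout and stderr, then exits
producer() {
        echo "out line"
        echo "err line" >&2
}

# Same shape as the real invocation: merge stderr into stdout, pipe to the reader
producer 2>&1 | while read -r line
do
        echo "$line" >> "$LOG"
done

cat "$LOG"
```

With the real non-terminating binary, the reader loop simply runs until the binary dies, since read only fails once the write end of the pipe is closed.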


It's done.