Setting up Apache deflate for pre-compressed files

I could not think of a better title than this, so let me explain.

The use-case is something like this - you want to serve your files using Apache (2.x) in a compressed format using the deflate module, but you don't want Apache to waste time compressing each file on the fly. You would rather have it pick a pre-compressed file from the disk and serve it straightaway.

Sounds quite simple. Just enable the Apache deflate module and edit the conf file -
sudo a2enmod deflate

This enables straightforward compression, i.e. once this is enabled, Apache will send all files out in a compressed format, unless you edit the deflate.conf file to exclude some specific files from compression.

Here is a snapshot of my deflate.conf-
          SetOutputFilter DEFLATE
          SetInputFilter DEFLATE
          DeflateCompressionLevel 9
          # Don't compress images
          SetEnvIfNoCase Request_URI \
          \.(?:gif|jpe?g|png|gz)$ no-gzip dont-vary
          DeflateFilterNote Input input_info
          DeflateFilterNote Output output_info
          DeflateFilterNote Ratio ratio_info
          LogFormat '"%r" %{output_info}n/%{input_info}n (%{ratio_info}n%%)' deflate
          CustomLog /var/log/apache2/deflate_log deflate


Doing this alone sets up Apache for compressed serving, but not for picking up a pre-compressed file from the disk. To let it save some time by serving the compressed file from disk, a few more changes are required.

Adding the MultiViews option along with the encoding mechanism does the trick. These options are added in the conf file of the enabled site - /etc/apache2/sites-enabled/sitefoo.

Snapshot -
                <VirtualHost *:80>
                ServerAlias server
                ServerAdmin admin@localhost
                DocumentRoot /home/user/apacheroot/
                <Directory /home/user/apacheroot/>
                Order deny,allow
                Allow from all
                Options MultiViews
                AddEncoding x-gzip .gz
                </Directory>
                </VirtualHost>

          
Once these options are enabled and the Apache server has been restarted, requests coming from a client are handled somewhat like this -
1. The client requests the file "/foo/foo.sh"
2. Apache looks for "/foo/foo.sh". If the file "foo.sh" is not found, Apache looks for other file extensions on the same filename in the same directory, like "/foo/foo.sh.*". In this case, if "/foo/foo.sh.gz" is present, Apache serves this file.
3. The AddEncoding directive tells the client which encoding mechanism has been used for which file extension. In the above snapshot, "x-gzip .gz" means that gzip has been used for "*.gz" files. So the client is able to decompress the files and use them in their original form.

This saves a significant amount of time, as Apache does not have to do the compression anymore.
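For completeness, preparing the pre-compressed copy on disk is just a gzip invocation. A minimal sketch, with example file names (foo.sh here is only a stand-in for whatever you serve):

```shell
# Create a pre-compressed copy next to the original, so that a request
# for foo.sh can be satisfied by foo.sh.gz via MultiViews.
printf 'echo hello\n' > foo.sh
gzip -9 -c foo.sh > foo.sh.gz   # -c keeps the original file in place
ls foo.sh foo.sh.gz
```

You can then verify the negotiation from a client with something like curl -I -H 'Accept-Encoding: gzip' http://yourserver/foo/foo.sh and check the Content-Encoding header in the response.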




EDIT: Enabling the deflate module is not necessary. I realized later that once MultiViews is enabled, deflate is not required at all.

The Diaspora effect

"Diaspora" is a perfect example of bringing a simple word out of the dictionary into the limelight, and the way it has been done is absolutely remarkable. Remarkable because nobody saw it coming, probably not even the Diaspora guys.

If you have already visited the link above, you know what I am talking about; else, hear it from me. I am talking about the new wave in the world of social networking which is supposed to replace an existing czar, "Facebook". This wave is called Diaspora.

Now this is the most interesting bit about Diaspora - the bit that it will replace "Facebook". To me it sounds like someone is out to replace Google. Although comparing Facebook to Google is not fair at all, it is somewhat justified considering the mammoth size of FB in its own domain (especially after FB outsmarted Google in number of hits during a week, a few days ago).

I am not at all skeptical about Diaspora's capacity to outsmart FB, but why has this comparison started so early in the day? Where and how did it start? It is good to hear that so many people are ready to fund a startup and that Diaspora is raising big money, but isn't this too big an expectation from a toddler? Yes, toddler. Even Google, FB and Twitter were toddlers once, and each had its own share of problems, be it security or the UI. And fortunately, each had its own share of time to improve and become popular. But when it comes to Diaspora, it looks like no one is ready to give it time. Everyone is looking at it as if, on the very first day, it will come and sweep FB away. Think about it honestly - it is not possible. At least IMHO it is not possible. And god forbid, if Diaspora's already famous security is breached and it fails to impress people with its abilities, no one is going to turn their face towards it again.

Every human being goes through situations where he/she does not want people around while making mistakes; the same goes for startups. Consider wannabe singers: what happens if someone suddenly organizes their live concert at London's O2 without letting them practice? The situation is the same here. Diaspora is being put on a stage to perform without being given enough opportunity to practice and make its own share of mistakes. To me, it is the bumpy ride that results in higher jumps; if the ride is too smooth from the very first day, the speed might slow down.

But I still hope for the best for Diaspora's success, and am looking forward to it.

Capturing the stdout/err of nonterminating binaries

It's work again :)

I came across this issue at work. I did not have any control over the binary; otherwise, I could have done the logging from inside the binary.

Anyway, it goes something like this. I wanted to capture the stdout and stderr of a few binaries which run forever. The first option, of course, is output redirection -
$DIR/foobin >> $DIR/foolog 2>&1 &
(Note: the directory variable is called $DIR here; naming it $PATH would clobber the shell's search path and break every command lookup.)

But what happens to the file size if the binary does not terminate for months? :S

I tried clearing the content of the file once it reaches 10MB, but no luck, because while clearing the content the file is still in use, and so, as soon as the log is written again, the file size is retained, even if half of the file is empty.


var=`ls -l $DIR/foolog | awk '{print $5}'`   # column 5 of ls -l is the size in bytes
if [ "$var" -gt 10000000 ]
then
> $DIR/foolog
fi


This does not help :(

So the solution is something like this.

Do not redirect the output; instead, pipe it to a script and let the script do the job.
$DIR/foobin 2>&1 | $DIR/foo.sh &

foo.sh -

#!/bin/sh
# read returns non-zero on EOF, so the loop ends when foobin exits
while read line
do
if [ ! -z "$line" ]
then
echo "$line" >> $DIR/foolog
fi
var=`ls -l $DIR/foolog | awk '{print $5}'`
if [ "$var" -gt 10000000 ]
then
rm $DIR/foolog
fi
done
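As an aside, parsing the output of ls is fragile; stat (GNU coreutils) reports the size directly. A small sketch of the same size check, using a throwaway file for demonstration:

```shell
# Check a file's size in bytes without parsing ls output
printf 'some log data\n' > /tmp/foolog_demo
size=$(stat -c %s /tmp/foolog_demo)
echo "$size"   # prints 14 (13 characters plus the newline)
if [ "$size" -gt 10000000 ]
then
    rm /tmp/foolog_demo
fi
```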


It's done.

Adding a startup script to Ubuntu

It's quite trivial, but I have seen a lot of people (including me) struggling with it. So here it goes -

Write the startup script, say foo.sh, in the /etc/init.d folder. I don't need to tell you that this is to be done as root.

#!/bin/sh

start()
{
#launch your binary. For example, if you want to launch svnserve as a daemon at startup then

if ps -ef | grep "svnserve -d" | grep -v grep
then
echo svnserve is already running as a daemon
else
svnserve -d
fi
}
stop()
{
#stop your binary. You can do it by retrieving the PID and killing it
if ps -ef | grep "svnserve -d" | grep -v grep
then
var=`ps -ef | grep "svnserve -d" | grep -v grep | awk '{ print $2 }'`
kill $var
else
echo svnserve is not running
fi
}
case "$1" in
start)
start
;;
stop)
stop
;;
restart)
stop
start
;;
*)
echo "Usage - foo.sh start/stop/restart"
;;
esac



Once the script is done, make it executable -

chmod +x /etc/init.d/foo.sh

then run -

update-rc.d foo.sh defaults

This will add the symbolic links for the different run levels.

Now, whenever you reboot, foo.sh will invoke your binary.

SSH

We use this every now and then, don't we?

I was recently looking into it for setting up a secure remote login on an embedded device.

So, here's the picture that I see.

Installing an SSH server is as easy as everything else in Ubuntu -
sudo apt-get install openssh-server

And of course, for any device other than a PC, a cross-compiled version will be required. Sometimes the x86 binaries work as they are on some devices, provided the device's architecture is x86-based too.

There are two steps in SSH authentication:
1. Host authentication
2. User authentication


1. Host authentication - This is the first step, and it happens as soon as we try to set up a communication channel between two machines using SSH. Considering the two machines here as an embedded device and a Linux box: if we want to log in to the device from the Linux box using SSH, then we need to have the public key of the device listed in the ~/.ssh/known_hosts file of the Linux box, where '~' is the home of the user who wants to log in, say 'root'. If the public key of the device has not been listed in the 'known_hosts' file beforehand then, instead of failing, the Linux box will issue an 'unverified key' warning which can easily be ignored, and the public key of the device (sent by the device to the machine for verification) can be added to the list right there. So, this step does not provide a very strong defence against unauthorized access (unless strictly configured). But it is mandatory, so there can be two approaches -


i) Leave the job of adding the device's public key to the list until the time of login, when the warning can be ignored and the key added forcibly - which does not sound very clean.


ii) Generate a single public/private key pair for the devices and copy the private key to all of them (if it's required on multiple devices). The public key should be added to the 'known_hosts' list on the Linux box before login. The same public key will then facilitate host authentication for all the devices.


2. User authentication - This is the second step, and it happens only after successful host authentication. It happens as a series of requests sent by the Linux box to the device. A request can carry a key of a user or a password; if the first request fails then the second is sent, and so on. To explain -


i) Key based authentication - For key based authentication of a user 'root', we need to copy the public key of 'root' into the 'authorized_keys' file on the device, at the proper location. The location changes based on the server; for example, for the openssh server it will be in the ~/.ssh/ folder, whereas for the "dropbear" server the location is '/etc/dropbear/authorized_keys'. At the time of authentication, the device encrypts a random string with the public key of 'root' and sends it to the Linux box. If the box is able to decrypt it correctly with the private key of 'root' and send the correct string back to the device, then the user is considered authenticated. The keys can be protected with a passphrase at the time of generation, for better security.
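To sketch the setup side of key based authentication (the key path and type here are just examples; in practice you would keep the key under ~/.ssh/ and append the public half to the device's authorized_keys, which ssh-copy-id automates for openssh):

```shell
# Generate a key pair for the user. -N '' means no passphrase, which is
# only for brevity here; pass a real passphrase for better security.
ssh-keygen -t rsa -b 2048 -f /tmp/demo_key -N '' -q
# The public half (/tmp/demo_key.pub) is what goes into the device's
# authorized_keys file; the private half stays with the user.
ls /tmp/demo_key /tmp/demo_key.pub
```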


ii) Password based authentication - Password based authentication is equally secure, since the password is sent over the channel in encrypted form. SSH authentication works like a fallback mechanism, so if the key based authentication fails, it will automatically ask for a password.


note: this division into host authentication and user authentication is not standard terminology. I have just pictured it this way for better understanding.

setuid not working - Ubuntu

Ok, I am posting this here because I struggled a lot with it, sought/got a lot of help, and finally got it working :-)

If you want to run a script as root forever - I mean, no matter who starts the script, it should run as root - then brace yourself, because Ubuntu won't let you do it easily.

Ideally, to run any script as root irrespective of who starts it, root should own it and its setuid bit should be set. It goes like this -

on your root prompt i.e. sudo su - root
create your script, say foo.sh

and let root own it
chown root:root foo.sh

then set the setuid bit
sudo chmod 4755 foo.sh

Exit the root prompt and run your script as a normal user. It should run as root. BUT the sad part is that it does not.

The newer versions of Ubuntu (I have tested on 9.04+) don't honour the setuid bit on scripts (in fact, it is the Linux kernel that ignores setuid on interpreted scripts). So even if you set it, it will not work as expected.

But there is a ray of hope. You can still set the uid for binaries. So here it goes -

Write a small C program which takes the name of your script as a parameter and runs it :-)

#include <stdlib.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
if (argc < 2)
return 1;
/* some shells drop the effective uid when it differs from the real uid,
so promote the effective uid (root, via setuid bit) to the real uid */
setuid(geteuid());
system(argv[1]);
return 0;
}

compile it and set uid for its binary
gcc foo.c -o foo
sudo chown root:root foo
sudo chmod 4755 foo

Make your script executable
chmod 755 foo.sh

and you are done
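If you want to confirm that the bit actually got set, it shows up as an 's' in the owner's execute position of ls -l. Demonstrated here on a scratch file (chmod on your own file does not need root):

```shell
# The setuid bit appears as 's' in place of the owner's execute 'x',
# so the permission string reads -rwsr-xr-x after chmod 4755.
touch /tmp/demo_setuid
chmod 4755 /tmp/demo_setuid
ls -l /tmp/demo_setuid
```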



execute the binary with name of your script as a parameter, something like this

./foo ./foo.sh

and the script will run under root :-)

To mention a use case - I was using the Hudson CI tool, which was supposed to invoke a shell script, and the script required root privileges. This solution worked just perfectly for me.