HallmarcDotNet

Marc Elliot Hall's Blog


Welcome to Marc's Weblog

— also known as my vanity gripe page

From sunny Las Vegas, Nevada, this is the blog of Marc Elliot Hall, leader and system engineer extraordinaire.



Thu, 26 Nov 2015


Left Holding (Open) the Bag


Filesystem inode Trouble Part II

You may recall that in January 2014 I wrote a piece about inodes and filesystems behaving badly. At the behest of a colleague (Hi, Josh!), here is the exciting conclusion to that saga.

When last we left the action, my free disk space looked like this:

[root@localhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 9.9G 7.6G 1.8G 81% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 485M 70M 390M 16% /boot
/dev/sda5 4.0G 808M 3.0G 22% /tmp
/dev/sda7 53G 670M 50G 2% /var
/dev/sda3 5.0G 216M 4.5G 5% /var/log

Yet, ordinary users were unable to log in, and I could not create new files on the root (“/”) filesystem.

To summarize thus far:

  1. My users can’t log in.
  2. I have disks, which the OS has identified.
  3. I have filesystems on the disks.
  4. I have mount points for the filesystems.
  5. I probably even have mounted those filesystems.
  6. I was able to rebuild the mounted filesystem table.

After digging about and recreating the filesystem description table, I determined that the system had run out of inodes:

[root@localhost ~]# df -ih
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 982K 632 982K 1% /dev
tmpfs 985K 1 985K 1% /dev/shm
/dev/sda2 640K 640K 0 100% /
devtmpfs 982K 632 982K 1% /dev
/dev/sda1 126K 50 125K 1% /boot
/dev/sda5 256K 96K 161K 38% /tmp
/dev/sda7 3.4M 3.4K 3.4M 1% /var
/dev/sda3 320K 177 320K 1% /var/log
tmpfs 985K 1 985K 1% /dev/shm
tmpfs 985K 1 985K 1% /dev/shm
tmpfs 985K 1 985K 1% /dev/shm

A quick reminder about inodes: Linux (and other Unix and Unix-like operating systems) in their default configuration use inodes to keep track of what file goes where on the system’s disks, and to keep metadata about the file (user and group ownership, creation time, read and write permissions, etc.). Think of inodes as the index for the file system: at least one inode for each file (you will have more than one inode per file if the file is linked to multiple locations in your filesystem). Unfortunately, there are a finite number of inodes available (varying from filesystem to filesystem and configuration to configuration, but typically numbering well into the tens of thousands), and when they run out — even if the system has more raw space available — it can’t create any more files. Moreover, the number of inodes a filesystem has cannot be (easily) changed after it is created.
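
If you want to poke at inodes on your own system, a few read-only commands will show them; the tune2fs line assumes an ext-family filesystem on /dev/sda2, like the box above, so adjust the device to match yours:

df -i /                                # inode totals and usage for the root filesystem
ls -i /etc/hosts                       # the inode number assigned to a single file
stat /etc/hosts                        # the metadata stored in that inode
tune2fs -l /dev/sda2 | grep -i inode   # the inode count baked in when the filesystem was created (ext2/3/4)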

Fortunately, there is a simple solution!

Unfortunately, I no longer have access to the system that I was troubleshooting when I wrote my earlier post. However, the fix is pretty universal. With a little script-fu, we can find out how many files each directory in the filesystem has. Once we have identified the quantity and location, we can determine whether there is any particular reason to keep those files. Most of the time in a situation like this, some runaway process has been spewing data that doesn’t get cleaned up properly, either because the process never terminates or because the process isn’t properly coordinating with tools like logrotate. If the data being spewed is a bunch of small files, we can then simply delete the files.

To start, then:

cd /                                             # start at the root filesystem so the whole thing gets scanned
echo 'echo $(ls -a "${1}" | wc -l) ${1}' > /tmp/files_`date +%F`   # tiny helper: print a file count, then the directory name
chmod 700 /tmp/files_`date +%F`                  # make the helper executable
find . -mount -type d -print0 | xargs -0 -n1 /tmp/files_`date +%F` | sort -n | tail -10   # count every directory, keep the ten biggest

This will:

  1. Generate a list of directories in the root filesystem;
  2. Count the number of files in each directory;
  3. Spit out the list with two columns:
    • the right column with the directory name,
    • the left column with the file count;
  4. Sort the two-column list by the file count value;
  5. If the list is more than 10 lines, only show you the 10 with the most files.

Usually, the directory with the most files is your culprit. You’ll want to verify that, then determine whether you should just delete the oldest umpteen files in that directory, all the files in the directory, or whatever other subset is appropriate. You’ll also want to correct whatever process is generating the files. Short of rewriting the program that spewed so much data, you can fix this a number of ways. Three straightforward methods are:

  1. Set up a wrapper script that manages it properly,
  2. Create a logrotate script to clean it up, or
  3. Build a cron job that will periodically do that for you (sketched below).
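
As a sketch of the third option, a nightly cron job along these lines keeps the file count in check; the directory and the seven-day window here are placeholders for whatever your actual runaway output turns out to be:

# /etc/cron.d/clean-runaway: every night at 02:30, delete files more than seven days old
30 2 * * * root find /path/to/runaway/output -type f -mtime +7 -delete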

Don’t forget to delete your temporary file:

rm -f /tmp/files_`date +%F`

Happy hunting!

posted at: 20:20 |


Wed, 08 Jan 2014


I’ve Run out of Places to Put Stuff


Filesystem inode Trouble

When I want to find out how much space is available on a server, I use the df command, usually with a flag like -h for “human readable”. This provides me with a nice summary of space in the filesystems; in this case, it shows me that my root (“/”) filesystem has 1.8 GB available out of 9.9 GB total:

[root@localhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 9.9G 7.6G 1.8G 81% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 485M 70M 390M 16% /boot
/dev/sda5 4.0G 808M 3.0G 22% /tmp
/dev/sda7 53G 670M 50G 2% /var
/dev/sda3 5.0G 216M 4.5G 5% /var/log

Sometimes, things go dramatically awry, however. For example, I recently encountered a situation where ordinary users were unable to log in to a host. I could only log in as root. This is generally a bad practice (and not everybody should have the power of root anyway), so I went about troubleshooting. Among the things I did was check whether any filesystems were full with the aforementioned df -h command.

And I got this output:

[root@localhost ~]# df -h
df: cannot read table of mounted filesystems

This is suggestive of a major problem. The system is running, obviously. And, this is good: it means that the system can read at least a couple of the filesystems directly. It just can’t summarize their status for me.

So, I look at the file that is supposed to contain the table of mounted filesystems:

[root@localhost ~]# cat /etc/mtab

No output at all. :-(

Then I look at the partition table (using fdisk -l), to see what the system thinks its disks look like:

[root@localhost ~]# fdisk -l

Disk /dev/sda: 80.5 GB, 80530636800 bytes
255 heads, 63 sectors/track, 9790 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000d5e85

Device Boot Start End Blocks Id System
/dev/sda1 * 1 64 512000 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 64 1370 10485760 83 Linux
/dev/sda3 1370 2022 5242880 83 Linux
/dev/sda4 2022 9791 62401536 5 Extended
/dev/sda5 2023 2545 4194304 83 Linux
/dev/sda6 2545 2800 2048000 82 Linux swap / Solaris
/dev/sda7 2800 9791 56156160 83 Linux

So far so good: this system knows it has a disk (/dev/sda) with partitions (sda1 through sda7); and it at least can identify the type of filesystems they contain.

Just in case any of the filesystems aren’t mounted, using mount -a I attempt to mount them all:

[root@localhost ~]# mount -a
mount: /dev/sda1 already mounted or /boot busy
mount: /dev/sda5 already mounted or /tmp busy
mount: /dev/sda7 already mounted or /var busy
mount: /dev/sda3 already mounted or /var/log busy
can't create lock file /etc/mtab~1605: No space left on device (use -n flag to override)
mount: devpts already mounted or /dev/pts busy
mount: sysfs already mounted or /sys busy

That looks mostly good; they’re already showing as mounted (or just busy, but that’s a rather improbable situation). However, I see the line that says can't create lock file /etc/mtab~1605: No space left on device (use -n flag to override), which worries me. Quite a lot.

Looking a little deeper, I try to see whether /etc/mtab (my mounted file system table file) even exists at all:

[root@localhost ~]# ls -l /etc/mt*
-rw-r--r--. 1 root root 0 Jan 3 09:20 /etc/mtab

It’s there, but has zero bytes! That means the file is empty. It should contain enough information to describe the mounted filesystems — always more than zero bytes.

To summarize thus far:

  1. My users can’t log in.
  2. I have disks, which the OS has identified.
  3. I have filesystems on the disks.
  4. I have mount points for the filesystems.
  5. I probably even have mounted those filesystems.
  6. But, before I can check the status of the filesystems, I’ll have to force the system to rebuild the mounted filesystem table.

Fortunately, Linux has a virtual filesystem, kept entirely in system RAM, containing information about the current running environment (the /proc filesystem). Using grep and I/O redirection, I can export the contents of its known-mounts file (/proc/mounts) into a new /etc/mtab file and try my df command again:

[root@localhost ~]# grep -v rootfs /proc/mounts > /etc/mtab

Now I can see that my /etc/mtab file contains 1423 bytes:

[root@localhost ~]# ls -l /etc/mt*
-rw-r--r--. 1 root root 1423 Jan 3 09:30 /etc/mtab

Then I can check whether the system can tell me about the filesystems using df and the -h flag:

[root@localhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 9.9G 7.6G 1.8G 81% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 485M 70M 390M 16% /boot
/dev/sda5 4.0G 808M 3.0G 22% /tmp
/dev/sda7 53G 670M 50G 2% /var
/dev/sda3 5.0G 216M 4.5G 5% /var/log

It claims I’ve got plenty of space! Why, then, can I not use touch to create a file in the / directory, let alone log in as an ordinary user?
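
For the record, the failed attempt looks something like this (the exact wording varies a bit by coreutils version):

[root@localhost ~]# touch /newfile
touch: cannot touch `/newfile': No space left on device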

Possibly, because the inodes are all used up. But, “What are inodes?” you ask… Linux (and other Unix and Unix-like operating systems) in their default configuration use inodes to keep track of what file goes where on the system’s disks, and to keep metadata about the file (user and group ownership, creation time, read and write permissions, etc.). Think of inodes as the index for the file system: one inode for each file. Unfortunately, there are a finite number of inodes available (varying from filesystem to filesystem and configuration to configuration, but typically numbering well into the tens of thousands), and when they run out — even if the system has more raw space available — I can’t create any more files; thus our current problem.

Fortunately, now that my mounted filesystem table has been rebuilt, I can check what inode usage looks like using df and the -i flag:

[root@localhost ~]# df -ih
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 982K 632 982K 1% /dev
tmpfs 985K 1 985K 1% /dev/shm
/dev/sda2 640K 640K 0 100% /
devtmpfs 982K 632 982K 1% /dev
/dev/sda1 126K 50 125K 1% /boot
/dev/sda5 256K 96K 161K 38% /tmp
/dev/sda7 3.4M 3.4K 3.4M 1% /var
/dev/sda3 320K 177 320K 1% /var/log
tmpfs 985K 1 985K 1% /dev/shm
tmpfs 985K 1 985K 1% /dev/shm
tmpfs 985K 1 985K 1% /dev/shm

Yup, out of inodes on the / filesystem. What to do?

Join us next time for the exciting conclusion!

posted at: 16:37 |


Fri, 03 Jan 2014


Take this job…


I/O Redirection

As previously discussed, sometimes I want the output of a command to go somewhere besides the screen right in front of me.

For example, I have a script running from cron — which has no screen to which it should send its output. On many Linux and Unix systems, it will instead generate an email, which is generally sent to the system administrator. She probably doesn’t want to see the output of my script, however. Especially if there are 50 users on the system, all of whom are sending script output to email. And we certainly don’t want the output to go to the great bit bucket in the sky… At least not until we learn about /dev/null.

Instead, I want both <STDOUT> and <STDERR> to go to a file. Earlier, I showed you how to send either <STDOUT> or <STDERR> to a file. However, I can also combine these into a single I/O redirect, like so:

ls > foo 2>&1

This sends <STDOUT> to the file, then redirects <STDERR> to the same file descriptor as <STDOUT>, so both streams end up in foo. (The order matters: written the other way around, as ls 2>&1 > foo, <STDERR> would be pointed at the terminal before <STDOUT> was moved to the file.) This is rather fiddly to type, so more recent versions of some shells provide a shorter method:

ls &> foo
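
Tying this back to the cron example above, a crontab entry can then keep its chatter in a log file instead of the administrator’s inbox; the script and log paths here are just placeholders:

0 3 * * * /home/marc/bin/nightly-report.sh >> /home/marc/logs/nightly-report.log 2>&1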

But what if I want my <STDOUT> and <STDERR> to go to two different files? Stay tuned…

posted at: 09:43 |



Take this job…


I/O Redirection

Linux shells (and Unix shells before them) have three popular methods for passing data around: STDIN (standard input), STDOUT (standard output), and STDERR (standard error). To keep things simple, think of STDIN as your keyboard, STDOUT as your screen, and STDERR as, uh, your screen when something breaks. As you will see later, there are nuances — STDIN isn’t always the keyboard, nor are STDOUT and STDERR always your screen.

Let’s start with a simple reason for knowing what STDIN, STDOUT, and STDERR do. For example, sometimes I want the output of a command to go somewhere besides the screen right in front of me. Instead, I may want the output to go to a new file, fileout.txt, so I can read it later.

To do this, I would redirect the output from my command, like so:

ls foo > fileout.txt

That “>” means “take standard out from the previous command (ls foo) and redirect it to the named file (fileout.txt); if the file already exists, overwrite it with whatever output I give you”. It’s what’s called an I/O redirection operator. Other such operators are < (take the following file and give its contents to the preceding command) and >> (append the output of the previous operation to the following file; if the file doesn’t already exist, create it).
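
For instance (the file name is just a placeholder):

ls /etc >> fileout.txt    # append this listing to whatever fileout.txt already contains
wc -l < fileout.txt       # hand fileout.txt to wc on its STDIN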

STDIN, STDOUT, and STDERR have numbers, too. STDIN is 0, STDOUT is 1, and STDERR is 2. When using I/O redirection operators without an identifying number, the shell presumes that you want to reference STDOUT. However, you can always choose to explicitly use the numbers. Let’s say I want STDERR to be redirected to a file:

ls foo 2> fileerr.txt

After running the command, fileerr.txt would contain text like this:

ls: cannot access foo: No such file or directory

Presuming, of course, that a file named “foo” does not exist in the current directory.

Naturally, there are ways to combine redirection operators to make scripting output less verbose and avoid having your system administrator contact you about your abuse of the system.

Join us next time for another exciting episode!

posted at: 09:43 |


Thu, 18 Feb 2010


Tripping Myself Up

Caught with My Pants Down

At about 4:15 yesterday afternoon I received an unusual phone call. The guy at the other end didn’t identify himself at first. He just asked me if I was the site admin for eldoradotech.org, which I am. He then somewhat murkily explained that he’d received a phishing email from a third party. Further, the email linked to my web site. No big deal, except that my web site was in fact serving up a page that looked just like a JPMorgan Chase login page. Not good.

After determining that there was a problem, I immediately deleted the unauthorized files from the server and then shut it down. Unfortunately, this resulted in all of my web sites, email, and other services being unavailable, which is a hassle for my millions… thousands… legions… hundreds… scores… dozens of… OK, two friends and fans. However, it was necessary because whoever had put those files on my server the first time could always put them back a second time, and could further exploit not only that server, but the other servers in my gleaming, high-tech ghetto basement datacenter as well as the desktops and laptops around the house.

Fortunately, my workday was largely over; so I rushed out to my car at 5:05 to get home and check things out. All the way there I considered the many possible vectors an attacker might have used to break in to my server. Some of them are difficult and unlikely, while others would simply require access to my password. I generally keep close tabs on my password, but you can never rule out some kind of a slip-up. This is why changing your password regularly is a good idea, even if it is a pain in the ass.

During transit I also thought about the varying consequences of the attack: if it was just the one system, damage could be limited. However, if the attacker had been in the system a long time before using it for nefarious purposes, he or she might have logged other passwords, confidential business information, financial records, or other valuable data. This worried me.

When I arrived at home, I first turned off all the other computers in the house. That’s three other servers, three desktops, and two laptops, currently. This was to prevent the attacker from using one of them to re-infect the first system if he or she was already loose on my network.

After this first step, I booted up the infected server from a clean Knoppix CD image and analyzed the logs. It looks like two IPs were running a dictionary attack against a weakly-passworded mythtv user account:

58.177.188.213
172.173.83.246

Neither IP is responding to ping, now.

The attacker appears to have gained access to the brand-spanking-new mythtv account (no, this server wasn’t being used for MythTV, but I keep accounts synchronized across my hosts to keep things simple) and then used a privilege escalation exploit to create a new user, ‘ftpd’. Then the attacker gave the new ftpd account a UID of ‘0’ (essentially, the same access level as root). From there, it was all downhill.
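
Two quick checks make this sort of mischief visible; both are standard tools, though lastb only works if your system keeps /var/log/btmp:

awk -F: '$3 == 0 {print $1}' /etc/passwd    # list every account with UID 0 -- root should be the only one
lastb | head -n 20                          # recent failed logins, as recorded in /var/log/btmp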

Logs don’t always tell the truth, because they can be edited, deleted, or corrupted. Having something to track back through was nice, but it’s not sufficient: for all I knew, the attacker had left a false trail. So I elected to nuke the site from orbit. It’s the only way to be sure. For remediation, I wiped the system and reinstalled the OS and applications from known clean sources, removed the unauthorized ftpd account, changed passwords left and right, then restored user data from my latest backup.

I’m lucky that this was all it took. If my not-quite-anonymous caller hadn’t clued me in, it might’ve been several hours, or possibly several days, before I noticed a problem. And if the attacker had been more sophisticated about covering tracks, I might still not know what vector had been used to break in to my system. In other words, relatively little damage was done (at least to me; I can’t speak for people who may have been phished) and this was a relatively easy system to get back up and running. Now I just need to be more conscientious about my passwords.

posted at: 13:48 |


Tue, 19 Jan 2010


Blog Coding Updated

Although I've based the code for this blog on the Blosxom framework, and use the Tiny MCE JavaScript library to handle editing chores, the fundamentals are significantly modified from the original.

Among other things, I've custom-coded the blog to support picture uploads, automagically create thumbnails and link to the full-sized images; added an authentication mechanism; and configured custom blogs for each member of the family. 

Unfortunately, due to a misconfiguration on my part, Tiny MCE was substituting a relative path for the absolute path on each of the image includes and links. This worked just fine until I added the ability to browse the blog by category or by date. When those features are active, the blog script generates temporary subdirectories in the URL and, in conjunction with Apache, redirects requests to the category- or calendar-based pages, which breaks the relative image paths.

When I discovered this problem, I did a little research and determined that I could simply tweak the Tiny MCE configuration to eliminate the issue on all new blog posts. However, this did not fix any existing entries. 

Because Blosxom generates web pages based on the datestamp on the individual files created when a user writes a post, simply running a search-and-replace against all files to change the relative paths to absolute paths would result in all existing blog entries showing up as being new as of the time I executed the change.

Obviously, this is not good.

Further, although I could individually modify the files one at a time to restore their original datestamps, the volume of files involved made that a non-starter. 

To resolve the issue, I have written a Perl script that parses through all of the existing blog entries, corrects the paths, and then saves the file with the original datestamp. This wasn't rocket surgery; but it was a new endeavor for me. On the off-chance that you might encounter a similar problem, I am making this script available under the GPL. Feel free to use it, but be sure to make a backup copy of your data before executing it. 
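
The script itself is Perl, but the core trick is simple enough to sketch in shell; the data path and the sed pattern below are placeholders, not my actual values:

# edit each entry in place, then put its original datestamp back so Blosxom sorts it correctly
for f in /path/to/blosxom/entries/*.txt; do
    touch -r "$f" /tmp/stamp.$$                      # remember the file's modification time
    sed -i 's|src="images/|src="/images/|g' "$f"     # placeholder path correction
    touch -r /tmp/stamp.$$ "$f"                      # restore the original modification time
done
rm -f /tmp/stamp.$$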

posted at: 17:43 |


Sat, 02 May 2009


Social Networks

The latest rage on Facebook apparently is apps that ask you to list five of your favorite things in a category. I don't think I can even name five wrestlers, so it's probably just as well I'm not on Facebook. Although my wife is, and she thinks it's wonderful. I did once, long ago, join Classmates.com. None of the people I would have considered renewing relationships with ever seem to have joined. Before I reached my current viewpoints on Internet-enabled social networking, I also joined Friendster.com; but it has been nearly a decade since then.

Facebook, though, for me, is just not a draw. Yes, I do have a blog, which I even occasionally update; but as you can see, it's on my personal website, where I have total control of the context and copyrights. But when it comes to interpersonal relationships, I like to keep my one-on-one communications confidential. The idea of a "wall" where I get drive-by comments from acquaintances, or having the entire subscriber base (or even just the people already on my friends list) know who is in my social network, or having people tag pictures with my name, with or without me actually being in them — even with the "privacy" controls Facebook provides —  just leaves me cold.

Further, having to maintain personas on multiple networks (Linked-in, Facebook, Classmates.com, MySpace, Friendster, Orkut, etc.) to maintain separation between professional and personal lives, as well as hit the "right" sites so that the "right" people see I'm a member of the same communities… well, it's burdensome. The MySpace people want to use MySpace; the Facebook people want to use Facebook. It's just too much work for me. The single option of Geocities in the '90s was easier.

Finally, I've had a website since 1996. My Google PageRank is excellent. If people I already know want to find me online, it's a piece of cake. If people I don't already know want to find someone with my skills (e.g., recruiters), my resume is at or near the top of Google's results in all the markets I care about.

That being said, for many, Facebook and its ilk are pretty darn good. They have reasonably featureful interfaces, a critical mass of users, and the backing of major corporations to ensure they stay available. And that's fine; it's just not for me.

posted at: 01:49 |


Thu, 19 Mar 2009




Unlocking Cryptography

Generating SSL Keys on Debian Linux

As reported in my last entry, I recently created updated SSL keys for my server. This is a somewhat arcane process, involving wizardly incantations on the command line. As a service to the community, I will now describe the process and provide a simple script to streamline it.

First, the reason it was necessary to generate these keys is that the default Debian install creates keys that are only good for one year. Further, these keys are “snakeoil”; that is literally what the configuration calls them, which serves as a reminder to sys-admins that they are the default configuration (generally more exploitable), they are not part of a chain-of-trust (nobody else is vouching that you are who you say you are), and they potentially do not uniquely identify your server (setting up a series of servers with the same configuration can cause confusion among various connecting hosts).
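
Incidentally, you can inspect the certificate you are currently serving before replacing it; the path below is Debian’s default snakeoil location, so adjust it if yours differs:

openssl x509 -in /etc/ssl/certs/ssl-cert-snakeoil.pem -noout -subject -dates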

These instructions apply to generating a self-signed key: just as with the default Debian key, nobody else is vouching that you are who you say you are. If you want an “official” key instead, commercial certificate authorities offer several options of varying expense.

Unless you are going to be selling something to the general public, or will be accepting payments from people you don’t personally know, however, these are all overkill. A self-signed cert will work just fine for you if you are using this server inside an organization where you have control over browser deployments, or if you are working with a technical audience of people you already know.

On to the steps!

Remember, these are specific to Debian GNU/Linux default installs. If your system is on another version of Linux, you’ve customized your install in some unusual way, or you’re using another OS, you will have to modify these instructions to match your environment.

Log in as root:

sudo su -

Change directories to your SSL configuration directory:

cd /etc/ssl;

Create your private key:

openssl genrsa -out example.com.key 1024;

Use the private key to generate a certificate signing request (CSR):

openssl req -new -key example.com.key -out example.com.csr;

Generate and self-sign the certificate:

openssl x509 -req -days 365 -in example.com.csr -signkey example.com.key -out example.com.crt;

Move the CSR aside to a timestamped file:

mv /etc/ssl/example.com.csr "/etc/ssl/example.com.csr.`/bin/date +%Y%m%d`";

Move the old/default Apache certificate to a timestamped file:

mv /etc/apache-ssl/apache.pem "/etc/apache-ssl/apache.pem.`/bin/date +%Y%m%d`";

Copy the new private key to the apache-ssl certificate:

cp -p example.com.key /etc/apache-ssl/apache.pem;

Append the new certificate to the apache-ssl certificate file:

cat example.com.crt >> /etc/apache-ssl/apache.pem;

Change permissions on the certificate to avoid security issues:

chmod 600 /etc/apache-ssl/apache.pem;

Delete the original apache2 certificate:

rm /etc/apache2/apache.pem;

Link the apache-ssl certificate to apache2’s, so you don’t deal with multiple certs when you don’t need to:

ln /etc/apache-ssl/apache.pem /etc/apache2/apache.pem;

Copy the Apache cert to the generic SSL cert directory:

cp -p /etc/apache-ssl/apache.pem /etc/ssl/certs/ssl-cert-example.com.pem;

Move the private key to a restricted area:

mv ./example.com.key /etc/ssl/private/;

Change permissions on the private keys to ensure they remain private:

chmod 600 /etc/ssl/private/*;

Change ownership on the private keys, as well:

chown root.ssl-cert /etc/ssl/private/example.com.key;

Move the public key into the certificate directory:

mv example.com.crt /etc/ssl/certs/;

Change permissions on the public keys, also:

chmod 600 /etc/ssl/example.com*;
chmod 600 /etc/ssl/certs/example.com.crt;
chmod go+r /etc/ssl/certs/ssl-cert-example.com.pem;

Restart Apache and your mailserver (I use Postfix rather than Exim) so that they reload their keys:

/etc/init.d/apache2 restart;

/etc/init.d/postfix reload;

All done!

I’ve also written a script to automate this process. Feel free to use it, but remember I’m not responsible if it breaks anything.

Comments, criticisms, and corrections are welcome.

posted at: 01:00 |


Mon, 16 Mar 2009



Marvelously Modified Mailserver

Geeky Fun!

After spending a whole week of classroom time in a “System p LPAR and Virtualization I: Planning and Configuration” training session, this weekend I was feeling motivated to make a few changes. As I’d been deferring the (completely unrelated) migration of my email and SSH server to a new platform, it was time to take action!

This is a relatively large change for my small environment. Currently, I’m running a web server (Apache), a mail server (Postfix with SpamAssassin), a remote access server (SSH), Windows (Samba) and Unix (NFS) networking servers, some monitoring utilities (Monit), and various smaller functional programs.

Fortunately, the migration process was to be relatively painless. As I had planned for this, I already had mirrored the configuration from my “Old and Busted” system (based on an Intel Pentium III running at 800 MHz, and although rock solid, dreadfully slow), to the “New and Kewl” system (based on an Intel Xeon dual core, dual processor running at 2.3 GHz). All that needed to be done, then, was:

  1. At the router, stop accepting inbound email for the duration of the migration.
  2. Disable the Postfix daemon on OldAndBusted.
  3. Copy the user mailboxes from OldAndBusted to NewAndKewl (see the sketch after this list).
  4. At the router, set inbound email connections to be directed to NewAndKewl.
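
Steps 2 and 3 boil down to something like this; the host nicknames stand in for the real names, and I’m assuming the mailboxes live under /home:

ssh oldandbusted '/etc/init.d/postfix stop'    # step 2: stop Postfix on the old box
rsync -a oldandbusted:/home/ /home/            # step 3: pull the mailboxes (and home directories) across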

And that should do it!

Except for the small item of ensuring that my users’ individual email clients are all configured to talk to NewAndKewl instead of OldAndBusted. Not a problem! I use DNS for my internal network, so I updated the DNS configuration to point mail.hallmarc.net at NewAndKewl, and everything was good.

Except the email clients were using the IP address rather than the fully-qualified domain name for the mail server. Uh. Dumb. Ah! But I can modify the configuration from the command line for all the kids’ accounts by logging in remotely and changing all of Thunderbird’s instances of OldAndBusted’s IP to NewAndKewl’s IP. Done and done. (Yes, I did have to use the GUI on my wife’s Windows XP PC to do this. One more reason not to support Windows.)
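
For the curious, Thunderbird keeps its account settings in each profile’s prefs.js, so on the kids’ machines the change amounts to something like this; the IP addresses and the profile glob are only examples:

sed -i 's/192\.168\.1\.10/192.168.1.20/g' /home/*/.thunderbird/*/prefs.js   # swap the old server IP for the new one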

About using that GUI… Apparently my wife for months had been clicking through a dialog box every time she collected email. The dialog indicated that the mail server’s SSL/TLS certificates had expired. I only learned this because, yup, I used the GUI to change her server setting. So now I needed to update my server certs. Which will be the subject of my next blog entry.

posted at: 19:33 |


Thu, 12 Feb 2009


Time’s up!

Unix time is fun! As I noted in a previous entry about the nature of timekeeping in the computer world, Unix tracks time by counting seconds since January 1, 1970.

The latest milestone for Unix time is at 23:31:30 UTC on February 13, 2009. People who like patterns in their numbers will rejoice as Unix time reaches 1234567890 seconds since the beginning of the Epoch. By coincidence, this day falls on Friday the 13th on the Gregorian calendar.

If you're using a Unix or Unix-like system, you can see how this works:

[on GNU]

$ date -ud@1234567890

[on BSD]

$ date -ur 1234567890

That will look like this:

Fri Feb 13 23:31:30 UTC 2009

For more information see the time_t entry on Wikipedia, the free encyclopedia.

posted at: 12:16 |


Tue, 10 Jun 2008




Bone-headed Maneuvering

Server Outage

While the wife and kiddies are out of town, I thought I’d catch up on a few things with the website.

Among my aspirations was to update the operating system kernel to a more current version with various patches and improvements.

Debian GNU/Linux makes this process relatively simple with the marvel of their dpkg suite of tools, including such amazing functionality as apt-get update, apt-get upgrade, and apt-get dist-upgrade. Some prefer to use aptitude instead of apt-get; and some (who might prefer a GUI), use synaptic. But all of these commands do the same thing: grab the most reasonably current version of the software already installed on the system from a centrally-maintained repository, and upgrade it in-place on the local computer, for free. Normally, this is absolutely painless.
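
For the record, the in-place upgrade amounts to running, as root:

apt-get update          # refresh the package lists from the repository
apt-get upgrade         # upgrade installed packages without adding or removing anything
apt-get dist-upgrade    # also allow dependency changes; this is the step that pulls in a new kernel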

Linux is renowned for never needing a reboot; and to a large extent, this is true. However, when upgrading the kernel the very core of the system is being replaced. Because Linux keeps running programs in memory even when they’ve changed on disk, this requires that the system be rebooted — because the kernel is a running program. Most other programs can be individually stopped and restarted; but the kernel controls all other programs — if it’s stopped everything is stopped. Thus the reboot requirement.
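
You can see the gap for yourself: until the reboot, the kernel running in memory and the newest kernel on disk are two different things:

uname -r              # the kernel version currently running in memory
ls /boot/vmlinuz-*    # the kernel images actually installed on disk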

Unfortunately, I moronically chose to upgrade my kernel with one for the wrong CPU architecture, leaving me with a system that would not restart. Of course, initially, I didn’t know that, and had to figure it out.

Being the experienced and skilled technical professional that I am, I had a reasonably current backup of the system. But just to be completely sure, I booted the system using Knoppix to verify that the filesystem on the server was still intact.

To my surprise and joy, it was.

My next step was to take another backup of the system so that I could be absolutely sure that I had all of my current data. This took roughly six hours to complete, as the system has 500 GB of disk and all of that data was passing through a USB 1.1 connection to an identical external hard drive. Normally, an incremental backup is sufficient; but I wanted the previous full backup to remain available if my new full backup was deficient in some way.

Six hours is a long time, and I still have a job: so I let it run overnight, and then went to work the next morning. Sleeping and working prevented me from determining the kernel incompatibility until last night.

The a-Ha! moment came after multiple attempts to fix the bootloader (I use lilo) failed. lilo would run when I updated the configuration to point to my new kernel; but when I rebooted the system it would hang with a cryptic:

lilo
0x01 "Illegal command"
lilo
0x01 "Illegal command"
lilo
0x01 "Illegal command"
…etc…

Google results led me to this page, which claims, and I quote, “This shouldn’t happen”. Yeesh.

After making various changes to the lilo.conf file, each with similar unwelcome results, I was extremely frustrated. By this time, my Web, email, and media server had been out of commission for more than 24 hours. Had it belonged to anybody else, I would have recommended starting fresh and buying new hardware. However, the Wife Acceptance Factor of that decision would have been strongly negative for me.

Not to be deterred from having a functional system, I then elected to re-install the OS and restore my data from my redundant backup. After the eighth failure on the “Install the Base System” step in the Debian installer:

Jun 10 04:23:25 base-installer: error: exiting on error base-installer/kernel/failed-install
Jun 10 04:23:53 main-menu[1323]: WARNING **: Configuring 'base-installer' failed with error code 1
Jun 10 04:23:53 main-menu[1323]: WARNING **: Menu item 'base-installer' failed.

(and similar variations), I then tried installing a different kernel.

Wow!

By trying to use 2.6.18-4-686 instead of 2.6.18-4-486, I had hosed my system.

The lesson: Intel’s Pentium III processor is not compatible with the full 686 architecture Linux kernel.

Next steps: Install all this stuff on the dual Xeon I bought two months ago, and retain the Pentium III as a redundant backup system, instead.

posted at: 13:59 |


Tue, 29 Apr 2008




“And to think your [sic] paid for this”

The Looming Y2K38 Crisis

Following is a demonstration of how easily distractible I am.

A few months ago, I was doing a little training of some co-workers, and wound up composing this email:

Marc Hall/STL/MASTERCARD
01/14/2008 02:07 PM

While explaining to our newest team members this morning how [edit: redacted project name] works, I ran off on a related tangent about Unix timestamps formatted in seconds since the beginning of the Epoch. This further sent me off on a tangent about how Unix keeps track of time. This reminded me of the Looming Y2K38 Crisis.

In case you are unfamiliar with the idea, Unix keeps track of time by counting seconds since January 1, 1970. This is known as the beginning of the Unix Epoch. Today, around 1,200,160,000 seconds have elapsed. The seconds are represented in 32-bit Unix and Unix-like systems by a four-byte signed integer value. Because a four-byte signed integer has a maximum decimal value of 2,147,483,647, at 03:14:07 on January 19, 2038, Unix will run out of bits to store our seconds.

And time will stop.

No, seriously, either systems relying on the time will terminate in unpredictable ways, or the apparent time will wrap around back to January 1, 1970.

This is bad, for reasons left as an exercise for the reader.

Some time (ha!) between now and 2038, someone will have to go through every line of code in the Unix universe and validate that on the rollover date the system will not crash or behave unpredictably. This will be a project with a scope similar to the Y2K-bug-stomping frenzy that concluded last century. It will make the DST patching we did after Congress last altered timekeeping look like making mud pies. Programmers specializing in Unix will be dragged kicking and screaming out of retirement and handed large sums of cash to evaluate critical systems. And, in the end, after months of trepidation and hype, January 19, 2038, will be a non-event — because, like the Y2K Crisis, enough people will really understand how bad it could get if systems are left unpatched that adequate time and resources will be allocated to be sure that everything is fixed in time.

Some have predicted that all 32-bit Unix systems will be long since retired by 2038, and 64-, 128-, 256-, 512-bit systems will have eliminated this as an issue. However, I have personally dealt with embedded systems more than 20 years old already. I expect there are 8-, 16-, and 32-bit embedded systems out there right now that will still be in use in 2038. Traffic signals. Assembly line controllers. Communications equipment. A lot of these run on 32-bit Unix-like kernels. In addition, there will still be business software running in emulated 32-bit environments, too, much like MasterCard is still using mainframes long after Microsoft’s predicted migration to all-Windows-all-the-time. Legacy systems have a way of hanging around.

You heard it here, first!

More info (so you know I’m not just making stuff up):
http://www.y2k38.info/index.html
http://home.netcom.com/~rogermw/Y2038.html
http://www.hackosis.com/index.php/2007/12/21/linux-is-not-y2k38-compliant/
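
You can even watch the edge approach with GNU date; the comment shows what it reports:

date -ud@2147483647    # Tue Jan 19 03:14:07 UTC 2038, the last second a signed 32-bit time_t can hold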

The Boss’ Response …

And to think your [sic] paid for this :-(

… And My Reply to the Boss’ Response

Hey, I’m just developing my career potential :-)

After all, Consultant-level — no, Senior Consultant-level — work requires strategically oriented thought leadership about the company’s long-term outlook and anticipation of future events that will affect business operations at the limits of the planning horizon. The ability to assimilate, internalize, and communicate these strategic issues is what separates the Senior Consultant from the Engineer.

Further, if training dollars are not available, then it is incumbent on Senior Consultants to provide appropriate knowledge transfer to the various lower-echelon engineering staffers.

Also, I’m paid a premium for my excellent grammar ;-)

posted at: 15:18 |


Mon, 28 Apr 2008




Windows Security for the Insecure

Tools for Keeping Windows Unbroken

I recently had a conversation with an acquaintance about his company-issued laptop, and how it had become significantly slower. He also reported some behaviors (pop-up windows, extra toolbars, etc.) that are symptomatic of a computer infected by a virus, spyware, adware, or worse. His complaints sounded all too familiar, as Microsoft Windows users have had similar issues since the advent of Windows for Workgroups.

Knowing that I have some experience with computers, he asked me for some advice on what to do. After explaining that his laptop is to the systems I work with as a dinghy is to an ocean liner, I agreed to impart some wisdom.

First, I explained that he shouldn’t be worrying about fixing the laptop: it’s a company laptop, and therefore the company’s responsibility. They need to hire someone with domain-specific competence to do routine maintenance and security auditing on these computers. In other words, they need to hire a geek.

Presuming that, for whatever reason, his employer would not be supporting this computer, I also gave him a brief overview of the wide variety of misuses that his laptop could be engaged in. Here are a few, and let me emphasize that this is not a comprehensive list:

  • spam zombie
  • identity theft
  • key logging
  • delivery of unwanted advertising (is there any other kind?)
  • distributed denial of service attacks
  • vector for other malware to be transmitted to other computers
  • industrial espionage

After outlining the risks to him and others that could result from a compromised machine, I agreed to provide him with more information in a follow-up email. This blog entry is an expansion on that email.

Places to seek understanding of the problem

Carnegie Mellon University supports an organization called the Computer Emergency Response Team (CERT), which watches the Internet for trends in computer abuse. CERT maintains a web site dedicated to helping people keep their computers secure; two sections of that site are of particular benefit to my acquaintance.

Another popular resource is Security Novice, which outlines best security practices from the perspective of a novice.

Microsoft also provides a reasonably complete explanation of security basics. Naturally, this is geared specifically for Windows users, but then, most PC users are Windows users.

Organizations seeking to fix the problem

Security is a process, not a product. Nevertheless, here are a few free tools that will improve your overall situation, at least initially. If these are so good, why are they free? Principally, two reasons:

  1. They are loss-leaders for commercial products, or
  2. The Free/Open Source Software (F/OSS) community is a strong force on the Internet, and has, essentially, developed entire environments for PC users to be productive without spending any money. Some groups in this movement are motivated by pragmatism, and some by idealism; but the result is a full suite of operating systems and applications that rival the corporate software world’s offerings in virtually every category.
    F/OSS software includes several variants of Linux, OpenOffice, several of the tools listed below, and many other programs. Development of these programs is sponsored by major companies, like IBM, Sun, Google, and Oracle, as well as largely volunteer organizations, like the Mozilla Foundation, Apache Foundation, and Free Software Foundation.

A comprehensive list is beyond the scope of this blog, so I won’t cover things like firewalls and root kit detection. However, the tools I describe below will give you a glimpse into the variety of precautions you can take immediately.

Web Browsing

For browsing the Web, I recommend Mozilla’s Firefox, a more secure web browser than Microsoft’s Internet Explorer. The biggest reason for choosing a more secure browser is that it is more difficult (although unfortunately still not impossible) for a malicious outsider to use a website to deliver malware to your PC.

Anti-Virus

Every Windows PC should have a virus scanner and removal tool, and Grisoft has an excellent free program, AVG Anti-Virus (the free one does the job, but you can pay ‘em for additional features).

Spyware Detection and Removal

Spyware can be even more dangerous than a typical virus, at least to the computer user whose PC has been compromised. Spybot Search & Destroy is my favorite tool for this purpose.

Adware Removal

Adware is mostly an annoyance; it uses CPU time and RAM that you want for your own purposes to put advertisements on your screen when you’re trying to do work. Lavasoft Ad-Aware Free is my choice for this (the free one does the job, but you can pay ‘em for additional features).

Email Safety

Finally, if you’re using Microsoft Outlook Express for email, that’s just asking for trouble. Either use Microsoft Outlook (without the “Express” ;-) ), or Mozilla’s Thunderbird.

Summary

These tools will make your life much easier, and won’t cost you (or, in the case of my acquaintance, your employer) a lot of money. My advice is to take advantage of them and save yourself many of the headaches associated with using a PC on the Internet.

posted at: 14:45 |


Wed, 15 Nov 2006


Is it blackmail?


A Fascinating Apache Log Entry

Apache is the software that runs the HallmarcDotNet web server. It keeps a record of every request for a web page, including where it came from, the date and time, the method and protocol version used, whether the request was successful or not, the size of the file that was sent (if any) in response to the request, the URL that the requester had most recently visited, and the browser and version the requester is claiming to use.

Two or three times a month I run a script that looks for interesting entries in the log: cracking attempts, broken links, recruiters finding my resume, people reading my novel or blog, that sort of thing.
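
The script itself is nothing fancy; the part that pulls out Google searches, for instance, is roughly this one-liner (the log path is the Debian default, so adjust it for your setup):

# the referer is the fourth double-quoted field in Apache's combined log format
grep 'google\.com/search' /var/log/apache2/access.log | awk -F'"' '{print $4}' | sort | uniq -c | sort -rn | head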

Today when I ran my script, I encountered a fascinating entry - one that piqued my curiosity. Here’s the raw text:

mail.fz.k12.mo.us - - [06/Nov/2006:10:46:25 -0600]
"GET /cgi-bin/blosxom.cgi HTTP/1.1"
200 19898 "http://www.google.com/search?hl=en&lr=&q=dubray+dirty+stuff+mo&btnG=Search"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"

What that means is that someone inside the Fort Zumwalt School District network did a Google search for the combination of terms “dubray+dirty+stuff+mo”. For the uninitiated, Bernard J. DuBray is the current superintendent of that school district. A middle school (as it happens, the one my oldest child attends) in the district is named after Dr. DuBray.

It’s not a very imaginative query… Possibly a student, but more likely a disgruntled employee. Or a very good PR flack, looking to get ahead of the opposition.

Do we have a potential blackmailer? Do I call the papers? The police? The network admin at Fort Zumwalt SD?

It probably means absolutely nothing; but it’s still fun to speculate. And imagine the possibilities.

posted at: 13:58 |


