Journeys in Hosting 2/x - OS template considerations

When you lease a virtual machine (VM), virtual private server (VPS), dedicated server, container, or similar system, you are usually given a choice of operating system to install. Most hosting providers have a selection of templates for a few different operating systems. In this article I highlight a few of the things you should examine after the OS is installed and before you begin using your new system. I assume a basic Debian OS template since that is what I’ve used the most, but much of what I have to say can apply to most Linux distributions if not various OS choices generally.

Default hosting OS setups across providers tend to look the same on the surface, but you can usually find subtle differences. In my experience variations are minor or innocuous. However, I’ve found these differences start to add up when trying to automate the management of a large fleet from many distinct providers.

In no particular order, and surely incomplete, here is a list of system characteristics to examine immediately after the initial OS installation:

Daemons. A base OS install has a relatively limited set of startup services and daemons running in the background. Using variations of ps, look for anything that seems out of place. In some cases you may find a provider-specific service running. Typically these are used to communicate with a provider’s custom back end for monitoring and management. Generally these services can be removed or disabled if you don’t rely on the provider’s custom API and management capabilities.

External network listeners. As with the daemons mentioned above, you should make sure there are no more network listeners reachable from outside the system than you need. Most providers only enable an SSH listener on TCP port 22 by default, but it is worth checking.
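As a rough illustration of the listener check, here is a minimal Python sketch that reads text in the format of /proc/net/tcp and reports IPv4 TCP listeners that are not bound to loopback. The function name external_listeners and the sample data are my own inventions for illustration. It assumes a little-endian host (such as x86-64) and does not cover /proc/net/tcp6 or UDP.

```python
import socket
import struct

LISTEN = "0A"  # TCP state code for LISTEN in /proc/net/tcp


def external_listeners(proc_net_tcp: str) -> list[tuple[str, int]]:
    """Return (address, port) pairs for IPv4 TCP listeners not bound to loopback."""
    found = set()
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != LISTEN:
            continue
        hex_addr, hex_port = fields[1].split(":")
        # Assumption: little-endian host, so the address is printed byte-swapped.
        addr = socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))
        if not addr.startswith("127."):
            found.add((addr, int(hex_port, 16)))
    return sorted(found)


# Hypothetical sample: SSH on all addresses, a loopback-only listener,
# and one established connection that should be ignored.
SAMPLE_TCP = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1234
   1: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1235
   2: 0A00000A:0016 0A000001:C350 01 00000000:00000000 00:00000000 00000000     0        0 1236
"""

print(external_listeners(SAMPLE_TCP))
```

On a real system you would pass in the contents of /proc/net/tcp instead of the sample. A tool such as ss shows the same information interactively; the sketch is only meant for the case where you want to script the check across many hosts.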

cloud-init. This is a common tool used by hosting providers for managing (mostly) network settings. It is generally safe to keep and use if you are unsure, but depending on your personal preferences you may want to override some defaults (e.g., the default name servers listed in /etc/resolv.conf).

User accounts. Most providers give you access to the root account, but some create and restrict remote login to an unprivileged account. It should be very rare, and I’d suggest suspicious, for other non-standard accounts to exist on new systems. The same applies to unexpected groups.
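One way to spot unexpected accounts is to list every entry in /etc/passwd that has a usable login shell. The following is a small sketch of that idea, not a complete audit. The function name login_accounts, the set of non-login shells, and the sample account names are assumptions made for illustration.

```python
# Shells that normally mean "this account cannot log in interactively".
# Assumption: this set covers the common Debian paths; adjust as needed.
NON_LOGIN_SHELLS = {"/usr/sbin/nologin", "/sbin/nologin", "/bin/false", "/usr/bin/false"}


def login_accounts(passwd_text: str) -> list[tuple[str, int, str]]:
    """Return (name, uid, shell) for passwd entries whose shell allows login."""
    result = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:  # skip malformed entries
            continue
        name, _pw, uid, _gid, _gecos, _home, shell = fields
        if shell not in NON_LOGIN_SHELLS:
            result.append((name, int(uid), shell))
    return result


# Hypothetical sample resembling a fresh template with one provider-created user.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sshd:x:101:65534::/run/sshd:/usr/sbin/nologin
debian:x:1000:1000:Debian:/home/debian:/bin/bash
"""

for name, uid, shell in login_accounts(SAMPLE_PASSWD):
    print(name, uid, shell)
```

Anything in the output that you did not create, and that is not the provider's documented default account, deserves a closer look.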

SSH keys. Unless you’ve instructed the installation process to install your own SSH public keys, you shouldn’t expect to find any keys installed for any account. I have seen keys in the /root/.ssh/authorized_keys file on a few providers. Chances are these are benign artifacts of the template build, but you probably don’t want them there. In addition, you might want to generate a fresh pair of server SSH keys after reading the inaugural Precomputed SSH Host Keys post in this series.

sshd_config settings. You may find a file in /etc/ssh/sshd_config.d with SSH server settings. I’ve usually seen PasswordAuthentication yes set in this file so that remote password-based authentication is enabled by default. I’d recommend you disable password authentication for all accounts and remove that setting wherever you find it. Occasionally I’ve seen other non-default settings in the main server config file, so make sure it looks sane before you get too far.
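To find every place a setting such as PasswordAuthentication is defined, you can scan the main config file and the drop-in files together. Here is a minimal sketch that works on file contents passed in as strings. The function name find_keyword and the drop-in file name in the sample are hypothetical, and the sketch ignores Match blocks and Include ordering.

```python
import re


def find_keyword(configs: dict[str, str], keyword: str) -> list[tuple[str, int, str]]:
    """Return (filename, line number, value) for each active use of keyword."""
    hits = []
    for filename, text in configs.items():
        for lineno, raw in enumerate(text.splitlines(), start=1):
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and commented-out settings
            # Keyword and value are separated by whitespace and/or "=".
            parts = re.split(r"[\s=]+", line, maxsplit=1)
            if parts[0].lower() == keyword.lower():  # keywords are case-insensitive
                value = parts[1].strip() if len(parts) > 1 else ""
                hits.append((filename, lineno, value))
    return hits


# Hypothetical file contents; the drop-in file name is made up for this example.
SAMPLE_SSHD = {
    "/etc/ssh/sshd_config": (
        "Include /etc/ssh/sshd_config.d/*.conf\n"
        "#PasswordAuthentication yes\n"
        "PermitRootLogin prohibit-password\n"
    ),
    "/etc/ssh/sshd_config.d/50-provider.conf": "PasswordAuthentication yes\n",
}

print(find_keyword(SAMPLE_SSHD, "PasswordAuthentication"))
```

On a live system, running sshd -T as root prints the effective configuration, which is a good cross-check after you remove or change a setting.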

Shell and system history. Every once in a while you may find your brand new system has some history associated with it. You may find some commands in the shell history or login/reboot events in the last/wtmp database. Anything you find is probably just an innocuous artifact left over from the template build process, but aren’t you at least curious to see if anything is there and what it is if you find something? :-)

Shell environment. I cannot remember a case where I saw something unexpected set in a shell environment, but it wouldn’t hurt to keep an eye on what you might find there (e.g., PATH).

Disks. A few times I’ve found the default install does not utilize the entire disk available to the OS. Unfortunately it hasn’t happened enough, but I’ve even found an entire disk assigned to the system I didn’t expect to have. You can use tools like lsblk and df to verify what disks are installed and check if they are set up as you’d expect.
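The disk check can be scripted by comparing each disk's size to the sum of its partitions. Below is a hedged sketch that parses JSON in the shape produced by lsblk --bytes --json --output NAME,SIZE,TYPE. The function name unpartitioned_bytes and the device names and sizes in the sample are invented for illustration.

```python
import json


def unpartitioned_bytes(lsblk_json: str) -> dict[str, int]:
    """Map each disk name to the bytes not covered by any child partition."""
    report = {}
    for dev in json.loads(lsblk_json).get("blockdevices", []):
        if dev.get("type") != "disk":
            continue
        # int() handles sizes reported either as numbers or as strings.
        used = sum(int(child["size"]) for child in dev.get("children", []))
        report[dev["name"]] = int(dev["size"]) - used
    return report


# Hypothetical sample: a 40 GiB disk with one 20 GiB partition,
# plus a second 10 GiB disk with no partitions at all.
SAMPLE_LSBLK = json.dumps({
    "blockdevices": [
        {"name": "vda", "size": 42949672960, "type": "disk",
         "children": [{"name": "vda1", "size": 21474836480, "type": "part"}]},
        {"name": "vdb", "size": 10737418240, "type": "disk"},
    ]
})

for disk, free in unpartitioned_bytes(SAMPLE_LSBLK).items():
    print(f"{disk}: {free / 2**30:.1f} GiB unpartitioned")
```

A small remainder is normal because of partition table overhead and alignment. Also note that a disk formatted directly, without a partition table, will show up here as entirely unpartitioned, so treat the output as a prompt to look closer rather than a verdict.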

DNS settings. It is usually a good idea to know what resolver(s) your new system is configured to use by checking /etc/resolv.conf. You may find DNS resolution set to use the local host (e.g., 127.0.0.1, or 127.0.0.53 in the case of systemd-resolved). This usually means a package like systemd-resolved and/or resolvconf is arbitrating access to a set of full resolvers. It is rare, but I have seen providers install a full caching resolver (e.g., unbound) on the local host to use by default.
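A quick way to see which resolvers are configured, and whether any of them point back at the local host, is to parse the nameserver lines of /etc/resolv.conf. This is a minimal sketch; the function name nameservers and the sample addresses are assumptions chosen for illustration.

```python
import ipaddress


def nameservers(resolv_conf: str) -> list[tuple[str, bool]]:
    """Return (address, is_loopback) for each nameserver line."""
    servers = []
    for line in resolv_conf.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            # Drop any IPv6 zone suffix such as "%eth0" before parsing.
            addr = fields[1].split("%")[0]
            servers.append((fields[1], ipaddress.ip_address(addr).is_loopback))
    return servers


# Hypothetical sample: a local stub resolver plus one external resolver.
SAMPLE_RESOLV = """\
# Generated by some provider tooling
nameserver 127.0.0.53
nameserver 192.0.2.53
search example.net
"""

for addr, local in nameservers(SAMPLE_RESOLV):
    print(addr, "local stub or cache" if local else "external resolver")
```

When a loopback address shows up, the real upstream resolvers are configured elsewhere, so the next step is to find out which local service is answering and where it forwards queries.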

Time synchronization. Some container-based installations may not need or be able to set the clock, but most likely you’ll want to ensure the system has an accurate notion of time using your favorite time synchronization software. Many systems will come with an NTP daemon by default, but some will not. Practically all providers I’ve seen do not alter the default time servers when using NTP, but it wouldn’t hurt to check. In addition, the default time zone for your system may not be to your liking. Be sure it is what you want.

Firewall. Depending on the OS you choose you may or may not have firewall rules running by default (iptables/nftables/ufw). Some providers can provision firewall filtering for you outside of the OS. You may or may not want this capability and it is good to know if any rules are running. Some providers may block unsolicited ingress flows to your system by default.

Cron/Timers. I dedicated an earlier blog post to a Case of the unpredictable run-parts crontab customization. I think I’ve only seen something like that once since, but it is worth mentioning here for completeness. systemd timers may be replacing cron on newer systems, but the idea is the same. You may wish to check what jobs are scheduled to run and when for your new system.

Host names. A usually minor, mostly cosmetic property is the host name(s) set in the /etc/hostname file or /etc/hosts file. Some providers will use a name you set during provisioning; others will just use a default from the OS template. If you care about these you may wish to change them. Note, if the system uses something like cloud-init, changes you make manually may be reverted unless you ensure otherwise.

Package repositories. I have seen providers specify their own local repositories and mirror sites for the package management system. This may be OK, but if not take a look to see if the defaults are what you want.
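To see at a glance which mirrors a template points at, you can pull the host names out of the APT source lines and compare them to the ones you expect. The sketch below handles only the traditional one-line format found in /etc/apt/sources.list, not the newer deb822 .sources files. The function name repository_hosts, the expected-host set, and the provider mirror name are hypothetical.

```python
from urllib.parse import urlparse


def repository_hosts(sources_list: str) -> set[str]:
    """Return the set of host names used in one-line-style APT source entries."""
    hosts = set()
    for line in sources_list.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith(("deb ", "deb-src ")):
            continue
        fields = line.split()[1:]
        # Skip an optional "[ option=value ... ]" block before the URI.
        if fields and fields[0].startswith("["):
            while fields and not fields[0].endswith("]"):
                fields.pop(0)
            fields = fields[1:]
        if fields:
            host = urlparse(fields[0]).hostname
            if host:
                hosts.add(host)
    return hosts


# Hypothetical sample with one provider-specific mirror.
SAMPLE_SOURCES = """\
deb http://deb.debian.org/debian bookworm main
deb http://security.debian.org/debian-security bookworm-security main
deb [arch=amd64] http://mirror.provider.example/debian bookworm main
# deb-src http://deb.debian.org/debian bookworm main
"""

# Assumption: these are the only hosts you expect on a stock Debian install.
EXPECTED_HOSTS = {"deb.debian.org", "security.debian.org"}
print(sorted(repository_hosts(SAMPLE_SOURCES) - EXPECTED_HOSTS))
```

An unexpected host is not necessarily a problem, since a nearby provider mirror can be faster, but it is worth deciding on purpose rather than inheriting it.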

Logging. Most Linux systems come with rsyslog installed by default these days, but I usually replace it with syslog-ng. The only time I’ve run into a problem was when the Debian syslog-ng package maintainer once set the statistics logging frequency to 1. That told the logging daemon to collect and log statistics every second! That wasn’t a hosting provider-specific issue, but in the rare case an anomaly like this appears and a provider uses that specific OS version by default, you’ll want to fix it as soon as possible. I have never seen strange provider-installed logging statements, such as sending logs to a remote collector by default, but that is the sort of thing I’d be on the lookout for.

Automated upgrades. Automated upgrades may be enabled by default for your system. If you do or don’t want this, be sure to make the necessary adjustments.

There are a number of other unusual and surprising things I’ve seen over the course of my experience with hosting providers. For example, one provider filtered all UDP traffic between their network and the rest of the Internet. Another filtered all TCP port 22 traffic to the system, so if you wanted to run SSH you had to do so on an odd port. Most providers filter outbound TCP port 25 by default in order to limit email spam emitted from their systems. Different virtualization technology (e.g., KVM, OpenVZ, VMware) may introduce slight differences in what you can see or do with your OS. Memory swap partitions can differ widely, if they are even set up at all by default. It may depend on how much memory your system starts out with. And of course dynamic versus static IP address assignment, route table entries, and subnet isolation policies differ widely.

Enumerating every possible unique property you may experience in a hosting environment would probably make this a much longer blog post. I generally prefer to start with what many providers refer to as a “minimal” OS install. This usually means what it implies: very little software beyond what is needed for a base system you can remotely SSH into and interact with, so you can add and customize anything additional as you wish. The less software you start with, the fewer provider-specific anomalies you’re likely to incur.

Hopefully the items above are a good sample of some of the common and interesting customizations you’re likely to see. If there is anything not covered above that you think should be, feel free to reach out and let me know about it.