Tools to Manage Your Servers on a Budget

Computing infrastructures are growing rapidly. Hardware has become less expensive, and form factors keep shrinking. With the advent of grid technologies, virtualization, and other distributed computing methods, we need better ways to manage the sheer number of resources being created.

Data center operations tools have become a big market, and the two big names in this space have already been acquired: Opsware by HP for more than $1B, and BladeLogic by BMC for $800M. Both offer many modules for sending out configurations and managing the assets of an IT department. Unfortunately, their license fees are steep and may not fit in most IT budgets.

If you don't have the budget for such tools, you can put together your own solution using open source software.
I've compiled a list of some of my favorite tools for such tasks:

SystemImager: Software that automates Linux installs and software distribution. It lets you capture an image of a Linux install and distribute it efficiently to other servers.

Pdsh: Lets you send commands to groups of machines at the same time. For example, you may want to make the same file change on 100 servers; pdsh can distribute that change in a matter of seconds. Other uses include restarting services, rebooting a group of machines at once, and getting the status of all machines. Basically, anything you can run at the command line can be done via pdsh.
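
To make the pdsh idea concrete, here is a minimal Python sketch that assembles a pdsh command line for a list of hosts. It relies on pdsh's -w option, which takes a comma-separated list of target hosts. The function name and the host names are made up for illustration, and the sketch only prints the command instead of running it.

```python
import shlex


def build_pdsh_command(hosts, remote_command):
    """Return the argv list that runs remote_command on every host via pdsh."""
    if not hosts:
        raise ValueError("at least one host is required")
    # pdsh's -w option takes a comma-separated list of target hosts.
    return ["pdsh", "-w", ",".join(hosts), remote_command]


if __name__ == "__main__":
    # Hypothetical host names, used only for this example.
    argv = build_pdsh_command(["web01", "web02", "web03"], "uptime")
    # Print the command rather than executing it, so the sketch is safe to try.
    print(shlex.join(argv))
```

On a machine with pdsh installed and SSH access to the hosts, the same list could be handed to subprocess.run(argv, check=True) to actually execute it.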

Cssh: ClusterSSH is similar to pdsh, but cssh launches an xterm for each specified host and runs the same command on every host at the same time. For example, you could run: cssh host1 host2 hostN and an xterm will pop up for each host. You also get a master window that you can type into; whatever you type into the master window shows up in each xterm. It is a powerful tool that you have to be really careful with, especially if you are trying to manage a lot of xterms at once.

Ganglia: A good monitoring tool for your environment. It is a client-based app that collects useful metrics on what your servers are doing in near real time. It has a web GUI that you can also open up to users if you want to show them performance data for the environment. It keeps track of CPU, memory, network, I/O, and other metrics.

SunGridEngine: A good tool if you want to distribute jobs to multiple hosts. The idea is simple: type qsub "command" and it will look at the hosts within your cluster and pick the least loaded machine to run the command. Great for job distribution, load management, or parallel job submissions.
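
The "pick the least loaded machine" idea can be sketched in a few lines of Python. This only illustrates the selection step and is not how Grid Engine itself is implemented; a real scheduler weighs far more than a single load number. The function name, host names, and load values are invented for the example.

```python
def pick_least_loaded(load_by_host):
    """Return the host with the lowest reported load.

    load_by_host maps a host name to a load figure (lower means less busy).
    """
    if not load_by_host:
        raise ValueError("no hosts available")
    # min() over the keys, ordered by each host's load value.
    return min(load_by_host, key=load_by_host.get)


if __name__ == "__main__":
    # Hypothetical snapshot of cluster load, for illustration only.
    cluster = {"node1": 2.5, "node2": 0.3, "node3": 1.1}
    print(pick_least_loaded(cluster))
```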

Cfengine: One of my favorite tools. It lets you define one or more standard configurations for your environment and then push them out automatically through cron. Very powerful tool. Read up more on its capabilities here.

Nagios: Another great open source monitoring solution. You can set it to alert you when various events are triggered (e.g. server down, HTTP down, etc.).
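
Nagios runs small check programs (plugins) and reads their exit code to decide whether to alert: by convention 0 means OK, 1 WARNING, 2 CRITICAL, and 3 UNKNOWN. The Python sketch below shows only the threshold logic such a check might use. The function name and the default thresholds are assumptions made up for this example, not part of Nagios.

```python
# Exit codes that Nagios plugins conventionally use to report state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_load(load, warn=4.0, crit=8.0):
    """Map a load value to an (exit_code, message) pair using two thresholds.

    The warn and crit defaults are arbitrary values chosen for illustration.
    """
    if load >= crit:
        return CRITICAL, f"CRITICAL - load is {load:.2f}"
    if load >= warn:
        return WARNING, f"WARNING - load is {load:.2f}"
    return OK, f"OK - load is {load:.2f}"
```

A real plugin would print the message on one line and exit with the returned code, for example by calling sys.exit(code).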

Swatch: A Perl-based tool that monitors any log file and sends alerts based on regex patterns.
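
The regex-on-logs idea is easy to picture with a short Python sketch. This is not Swatch's own configuration syntax, just the same concept: each rule pairs a pattern with a label, and every matching log line produces an alert. The function name, rule labels, and sample log lines are all made up for the example.

```python
import re


def find_alerts(lines, rules):
    """Return (label, line) pairs for every log line matching a rule.

    rules is a list of (compiled_pattern, label) tuples.
    """
    alerts = []
    for line in lines:
        for pattern, label in rules:
            if pattern.search(line):
                alerts.append((label, line.rstrip("\n")))
    return alerts


if __name__ == "__main__":
    # Hypothetical rules and log lines, for illustration only.
    rules = [
        (re.compile(r"Failed password"), "auth"),
        (re.compile(r"kernel panic", re.IGNORECASE), "panic"),
    ]
    sample = [
        "Jan 1 sshd: Failed password for root\n",
        "Jan 1 cron: job ok\n",
    ]
    for label, line in find_alerts(sample, rules):
        print(f"[{label}] {line}")
```

In practice the lines would come from a log file being read as it grows, and the alert would be an email or page instead of a print.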

This list is not comprehensive by any means.
Feel free to add your favorites or comment below.
_____________________

Vassilios
Co-Founder
OuterVillage.com
http://outervillage.com
