Posted: 9/2/2004 8:50:54 AM EST
Looking to tap the mighty ARFcom well of knowledge.

Since our management has decided that we don't need to have operators in our Network Operations Center anymore (against the advice of the people who actually do the work!!!), our Network Admin group has been tasked with part of the jobs the NOC used to handle. One of these processes is the rebooting of servers, both Windows 2K and Unix, that have hung up.

Realizing that the Unix boxes are more easily remotely rebooted than Windoze machines, are there any products that you guys can recommend that will let us remotely reboot the servers? If we need separate products for the OS's, that's fine.

Thanks!

Michael
Link Posted: 9/2/2004 8:56:49 AM EST
Win2K is remotely administered through either Terminal Services (remote admin mode which doesn't require licenses) or you can even remotely reboot through Computer Management.

Of course if you use HP/Compaq machines you could also use the ILO/RILO card.
Link Posted: 9/2/2004 9:00:15 AM EST
I just use a program from AT&T Laboratories Cambridge called WinVNC to administer all of my servers. It works for reboots. I like it a lot more than the Terminal Services product.
Link Posted: 9/2/2004 9:05:44 AM EST
I was just about to mention the ilo/rilo cards if you have hp servers. Pretty sweet little cards. Allow you to login and start servers up, reboot servers, replay the original shutdown sequence. You can even use a PDA to reboot the servers with the ilo/rilo cards.
Link Posted: 9/2/2004 9:08:45 AM EST
How many servers are we talking about here?

I wrote a program that reboots our Point of Sale servers on a regular schedule (800 units nationwide), checks whether they came back normally, and reports their status, errors, etc ...
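
A minimal sketch of the "did it come back?" check described above. This is not the poster's program: the function names, the TCP probe, and the retry counts are all assumptions chosen for illustration. It polls a port until the server answers or the attempts run out, then returns a small status record.

```python
import socket
import time


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_return(host, port, probe=port_is_open, attempts=30,
                    delay=10.0, sleep=time.sleep):
    """Poll until the probe succeeds; report 'up' or 'no response'."""
    for attempt in range(1, attempts + 1):
        if probe(host, port):
            return {"host": host, "status": "up", "attempts": attempt}
        sleep(delay)
    return {"host": host, "status": "no response", "attempts": attempts}
```

The probe and sleep functions are passed in so the logic can be exercised without touching a real server; in practice you would probe whatever port your remote-control tool listens on.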

Link Posted: 9/2/2004 9:17:57 AM EST

Originally Posted By mrstang01:
Looking to tap the mighty ARFcom well of knowledge.

Since our management has decided that we don't need to have operators in our Network Operations Center anymore (against the advice of the people who actually do the work!!!), our Network Admin group has been tasked with part of the jobs the NOC used to handle. One of these processes is the rebooting of servers, both Windows 2K and Unix, that have hung up.

Realizing that the Unix boxes are more easily remotely rebooted than Windoze machines, are there any products that you guys can recommend that will let us remotely reboot the servers? If we need separate products for the OS's, that's fine.

Thanks!

Michael

doomed, DOOMED.
Link Posted: 9/2/2004 9:31:38 AM EST
We have RIB boards in several of our HP servers, and that is an option, we were just wondering if there is something better out there.

We're talking about 100 W2K servers, and about 85 Unix boxes.

And I agree, something is going to bite us in the ASS, and then they'll decide having operators is cheaper than having the owner of the company bitch about not getting his morning reports when he's used to.

Michael
Link Posted: 9/2/2004 9:34:54 AM EST
Oh, and don't forget that you can use the shutdown.exe command line to shut down a server remotely. Just make sure you add the /r (reboot) switch.
Link Posted: 9/2/2004 9:53:29 AM EST
I can send you what I use if you want it ....... IM or email me ...
Link Posted: 9/2/2004 10:02:39 AM EST
[Last Edit: 9/2/2004 10:03:13 AM EST by RealFastV6]

Originally Posted By Joe_Blacke:
Oh, and don't forget that you can use the shutdown.exe command line to shut down a server remotely. Just make sure you add the /r (reboot) switch.



Indeed.

I have a pile of batch files that run in Task Scheduler on a management XP station that just run down the list once a month and reboot everything preventively on a Sunday. (16 W2K servers, never had a problem.)

One could do the same thing with some creative SSH scripting for the *nix boxes.
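
A rough sketch of that run-down-the-list idea, covering both platforms. Assumptions, none of which come from the posts above: the Windows command uses the switches of the shutdown.exe built into Windows XP and later (the Windows 2000 Resource Kit version may take different arguments), the Unix side assumes key-based SSH login as an account allowed to run `shutdown -r now` (adjust for your Unix flavor), and the host names are made up.

```python
import subprocess


def reboot_command(host, os_type):
    """Build the in-band reboot command for one host."""
    if os_type == "windows":
        # Assumed syntax: built-in shutdown.exe on Windows XP and later.
        return ["shutdown", "/r", "/f", "/t", "0", "/m", "\\\\" + host]
    if os_type == "unix":
        # Assumes passwordless SSH as a user permitted to reboot the box.
        return ["ssh", host, "shutdown", "-r", "now"]
    raise ValueError("unknown os_type: " + os_type)


def reboot_all(hosts, dry_run=True):
    """hosts is a list of (hostname, os_type) pairs; returns the commands used."""
    commands = [reboot_command(host, os_type) for host, os_type in hosts]
    if not dry_run:
        for command in commands:
            # check=False: one failed host should not stop the rest of the list.
            subprocess.run(command, check=False)
    return commands


if __name__ == "__main__":
    # Hypothetical inventory; dry run only prints what would be executed.
    for command in reboot_all([("web01", "windows"), ("db01", "unix")]):
        print(" ".join(command))
```

Like the batch files, this only helps while the OS is still answering; a fully hung server needs one of the out-of-band options.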

Link Posted: 9/2/2004 10:04:07 AM EST
Are you talking about rebooting when the machine is not responsive or not functioning properly?
Link Posted: 9/2/2004 10:10:14 AM EST
I guess I'd better modify my answer, seeing as shutdown /r /f may not work depending on how locked up the server is.

We're a Dell shop and they all have DRAC cards, which allow you to power cycle the box remotely all the way down and all the way up.

However, I will say that in my experience monthly preventive reboots can go a long way in helping avoid emergency reboots.
Link Posted: 9/2/2004 10:24:08 AM EST
Link Posted: 9/2/2004 10:28:20 AM EST
Yes, we're using shutdown /r monthly on the Windoze boxes; this solution would be for those times when the server is unresponsive.

Michael
Link Posted: 9/2/2004 10:29:25 AM EST
The datacenters I manage are 80% lights out.

If you're talking about hung processes, but you can still gain access to the OS, you should use Remote Desktop for MS, and a terminal server (like a Cisco 2600) for the Unix systems. The Cisco product will allow you to connect to it via SSH, then connect to the Unix machine via the console port.

If you're talking about a hung system, then gaining access to either will not give you any advantage. I suggest one of the power management devices; there are plenty out there. Basically an IP-enabled power switch which allows you to execute a turn on, off, or reboot (turn off, wait X seconds, then turn back on).
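
The "reboot" action on such a switch is just those three steps in order. A minimal sketch, assuming a hypothetical switch object with `outlet_off` and `outlet_on` methods; real devices expose this through their own telnet, web, or SNMP interfaces, which are not modeled here.

```python
import time


class FakePowerSwitch:
    """Stand-in for a networked power switch; records calls instead of switching."""

    def __init__(self):
        self.log = []

    def outlet_off(self, outlet):
        self.log.append(("off", outlet))

    def outlet_on(self, outlet):
        self.log.append(("on", outlet))


def power_cycle(switch, outlet, wait_seconds=10, sleep=time.sleep):
    """Turn the outlet off, wait, then turn it back on."""
    switch.outlet_off(outlet)
    sleep(wait_seconds)  # give the power supply time to drain fully
    switch.outlet_on(outlet)
```

The wait matters: switching back on too quickly may not fully reset the hardware, which is why these devices let you set the X seconds.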

Feel free to email me if you need any more specific details.
Link Posted: 9/2/2004 2:48:25 PM EST

Originally Posted By RBAD:
Try the APC MasterSwitch product(s) !
They work great for us. (includes remote cycling of POWER to the server(s) in question)



We have those too at a couple remote sites. Good shit.


Link Posted: 9/2/2004 2:52:52 PM EST
Just use SSH for the UNIX boxes. For Windows though, turn them into Unix boxes. Windows was never made to be a serving platform.
Link Posted: 9/2/2004 2:55:04 PM EST

Originally Posted By x5060:
Just use SSH for the UNIX boxes. For Windows though, turn them into Unix boxes. Windows was never made to be a serving platform.



Link Posted: 9/8/2004 5:34:03 AM EST
Good info, guys. To the top for Dayshift's input after the holiday weekend.

Michael
Link Posted: 9/8/2004 5:41:51 AM EST

Originally Posted By RBAD:
Try the APC MasterSwitch product(s) !

They work great for us. (includes remote cycling of POWER to the server(s) in question)




Just went with this solution for our web boxes and edge equipment. Hope it works. We are having LOTS of power issues right now (Imagine that with 2 hurricanes down and one on the way).
Link Posted: 9/8/2004 5:45:05 AM EST
+1 Joe_Blacke

If it's a Compaq/HP shop, get RILO cards. We have 6 remote datacenters with 70+ DL380s per site . . . I've never set foot in any of them. The only things you need physical access for are network issues, hardware failures and upgrades. They work regardless of the OS and are accessible through a web interface.

The problem with software methods is that if the server is locked, so is your remote management software. RILOs are 100% hardware and powered separately from the main box.

I'm sure that RILO-like devices are available for most other platforms also.

IM for more info if you're interested.

_Disconnector_
Link Posted: 9/8/2004 6:32:01 AM EST

Originally Posted By _disconnector_:
+1 Joe_Blacke

If it's a Compaq/HP shop, get RILO cards. We have 6 remote datacenters with 70+ DL380s per site . . . I've never set foot in any of them. The only things you need physical access for are network issues, hardware failures and upgrades. They work regardless of the OS and are accessible through a web interface.

The problem with software methods is that if the server is locked, so is your remote management software. RILOs are 100% hardware and powered separately from the main box.

I'm sure that RILO-like devices are available for most other platforms also.

IM for more info if you're interested.

_Disconnector_



Yup, DRAC is the Dell version I mentioned above.
Link Posted: 9/8/2004 6:33:58 AM EST

Originally Posted By mrstang01:
We have RIB boards in several of our HP servers, and that is an option, we were just wondering if there is something better out there.

We're talking about 100 W2K servers, and about 85 Unix boxes.

And I agree, something is going to bite us in the ASS, and then they'll decide having operators is cheaper than having the owner of the company bitch about not getting his morning reports when he's used to.

Michael



What could be better than a RIB card?
Link Posted: 9/8/2004 6:47:21 AM EST
You can slap an sshd server onto a Windows box and then just ssh in and reboot from the command line. The Cygwin version works (www.cygwin.com). Free, too.

Of course, it's no help if the server is so wedged that the sshd server is down. But that's true of anything.

Link Posted: 9/8/2004 6:54:52 AM EST
I agree - ILO for HP and DRAC for Dell. This is a minimum requirement before my department will even support a box. We manage thousands of servers... and less than 100 are local. We use Terminal Services/VNC/Tivoli/etc... for day to day remote control, and DRAC/ILO for hung server issues.
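
That split (in-band tools for day-to-day work, the management card only for hung boxes) can be written down as a simple escalation rule. A sketch only: the three callables are hypothetical placeholders for whatever health check, remote-control tool, and DRAC/ILO interface you actually use.

```python
def recover(host, responds, in_band_reboot, out_of_band_reset):
    """Prefer a clean in-band reboot; fall back to the management card."""
    if responds(host):
        # The OS still answers, so let it shut down cleanly.
        in_band_reboot(host)
        return "in-band"
    # The OS is hung: only hardware-level access can help now.
    out_of_band_reset(host)
    return "out-of-band"
```

Trying the in-band path first keeps hard resets, and the filesystem checks that tend to follow them, to a minimum.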

IP-based KVM is also an appealing emerging technology, which promises remote hung-server reboot capability as well. It is an easier and cheaper retrofit than adding DRAC/ILO to every server if you don't already have that.
Link Posted: 9/8/2004 6:55:46 AM EST

Originally Posted By mcgredo:
You can slap an sshd server onto a Windows box and then just ssh in and reboot from the command line. The Cygwin version works (www.cygwin.com). Free, too.

Of course, it's no help if the server is so wedged that the sshd server is down. But that's true of anything.




Anything but the RIB card. There can be no OS on the box and you can still connect. There can be no hard drive, even.
There is a reason that HP/ProLiant servers are the best Wintel box made. And don't chime in with some Dell is better shit. Dell is just a ProLiant clone. Dell has been fighting the suits from Compaq for years.
Link Posted: 9/8/2004 6:59:38 AM EST
Make sure your computers are either using AT power supplies or support motherboards with the option to turn the system on when power is restored. APC makes a product called the MasterSwitch which has a telnet and web interface as well as receptacles on it. Through the interface you can tell the MasterSwitch which receptacles to turn on or off. The switch even has serial ports that can be hooked to the server that, when the server is running APC's software, will give the computer notice to safely shut down in advance if it is not hung up.

Like I said, if the system board does not support system-on when power is restored then you don't have much hope. Perhaps if the board supports wake-on-LAN you could do something to help out there.
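
For reference, a wake-on-LAN "magic packet" is 6 bytes of 0xFF followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. A minimal sketch; the MAC address is made up, and port 9 and the broadcast address are common conventions rather than requirements. Note that this only wakes a machine that is powered down or asleep, so it would pair with a power switch rather than replace one.

```python
import socket


def magic_packet(mac):
    """Build the 102-byte wake-on-LAN payload for a MAC like 00:11:22:33:44:55."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 48-bit MAC address")
    mac_bytes = bytes.fromhex(hex_digits)
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))
```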

-Foxxz
Link Posted: 9/8/2004 7:00:23 AM EST
Guys, he's talking about if the OS locks up. If the OS is locked, you obviously can't get in using Terminal Services or VNC. You need to be able to go in out of band.

The newest generation of Dell PowerEdge servers will have a BMC (baseboard management controller) that operates out of band, so if your OS locks up, you can still remotely access the BMC and bounce the box from there. This will be implemented in the x8xx series of PowerEdge servers. Pretty neat stuff.
Link Posted: 9/8/2004 7:03:37 AM EST
[Last Edit: 9/8/2004 7:04:10 AM EST by SNorman]
I know a lot of places that use remote managed power strips. No matter what kind of OS you use, sooner or later that sucker is going to hang for one reason or another. With the power strip you can remotely power cycle individual power outlets.

Crap Foxxz beat me to it
Link Posted: 9/8/2004 7:18:03 AM EST

Originally Posted By Matthew_Q:
Guys, he's talking about if the OS locks up. If the OS is locked, you obviously can't get in using Terminal Services or VNC. You need to be able to go in out of band.

The newest generation of Dell PowerEdge servers will have a BMC (baseboard management controller) that operates out of band, so if your OS locks up, you can still remotely access the BMC and bounce the box from there. This will be implemented in the x8xx series of PowerEdge servers. Pretty neat stuff.



That BMC is pretty worthless IMHO..... in the Dell 8G line, it will be disabled anytime there is a DRAC installed, and pretty much anything in the 8G line will be coming with DRAC4 on the motherboard. The 2800, 2850, 1850.... etc... all come with DRAC4.

You wanna talk COOL? Wait till you see the DRAC4. They fixed everything that sucked about DRAC3 and then some! It's awesome!

Dell is touting the integrated BMC just because that's going to be an industry standard... blah blah blah. But it's nothing in comparison to DRAC!
Link Posted: 9/8/2004 7:22:55 AM EST
I worked in a MasterSwitch shop. The only problem is that you cannot get a true console connection to see boot-time errors. For Terminal Services to work (and VNC, Tivoli, etc.) the server must be up . . . if you have a boot-time issue you are SOL unless you have a DRAC/RILO. MasterSwitch is basically a remote power switch. RILOs are that and much more.

Our RILOs are powered off of a separate circuit and UPS system. Even if there is an issue with one of the redundant power supplies, we can connect to the server itself.

KVM over IP is super cool, but extremely bandwidth intensive. Be prepared for mega latency unless you are on a gigalan.

I will never again work in a shop that isn't RILO equipped.

_Disconnector_
Link Posted: 9/8/2004 7:29:44 AM EST

Originally Posted By _disconnector_:
KVM over IP is super cool, but extremely bandwidth intensive. Be prepared for mega latency unless you are on a gigalan.



Not any more! The newer stuff uses the same technology as RILO cards... and has similar latency. Works fine over VPN/DSL connections even.... and "bearable" over dialup (if anything on dialup is actually "bearable").
Link Posted: 9/8/2004 7:30:30 AM EST

Originally Posted By ar50troll:

Originally Posted By mrstang01:
We have RIB boards in several of our HP servers, and that is an option, we were just wondering if there is something better out there.

We're talking about 100 W2K servers, and about 85 Unix boxes.

And I agree, something is going to bite us in the ASS, and then they'll decide having operators is cheaper than having the owner of the company bitch about not getting his morning reports when he's used to.

Michael



What could be better than a RIB card?



As others have mentioned, IP based KVM. We use both RILO and IP KVM (Avocent switches with DSView), and I prefer the KVM's. Especially with Windows servers. I haven't had any problems with latency at WAN speeds.
Link Posted: 9/8/2004 7:39:05 AM EST

Originally Posted By Chimborazo:

Originally Posted By ar50troll:

What could be better than a RIB card?



As others have mentioned, IP based KVM. We use both RILO and IP KVM (Avocent switches with DSView), and I prefer the KVM's. Especially with Windows servers. I haven't had any problems with latency at WAN speeds.



Yes, but they are not mutually exclusive. Management cards give you remote floppy, remote CD-ROM, alert management, lights-out power up, power down, and resets, in addition to remote out-of-band KVM capabilities.

I see IP-based KVM as an excellent solution for daily-use remote control, but it does not remove the need for an out-of-band mgmt card... it just enhances your capabilities. The comparison is kind of apples to oranges if you pit them against each other.
Link Posted: 9/8/2004 9:59:24 AM EST
For those of you with RILO's, how do you handle the connections to them, do you have an individual LAN connection to each one? That sure eats up a lot of switch ports quickly, doesn't it?

Michael
Link Posted: 9/8/2004 10:02:47 AM EST

Originally Posted By mrstang01:
For those of you with RILO's, how do you handle the connections to them, do you have an individual LAN connection to each one? That sure eats up a lot of switch ports quickly, doesn't it?

Michael



Yes, you have a separate port. Where I'm at we use teamed NICs as well, so that's a minimum of 3 ports per server. My client has over 800 servers, so you do the math. And that's just front end. PeopleSoft is on Hitachi servers and many *nix boxes too. I also freelance at Enron and Lyondell; both of them do the same thing and have over 500 servers. If you have to worry about port cost, you've got bigger problems.
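
Doing that math as a quick sketch. The server counts come from the thread; the 48-port switch size is an assumption for illustration only, and no ports are reserved for uplinks.

```python
import math


def ports_needed(servers, ports_per_server):
    """Total switch ports consumed by a group of servers."""
    return servers * ports_per_server


def switches_needed(ports, ports_per_switch=48):
    """Whole switches required, assuming every port is usable."""
    return math.ceil(ports / ports_per_switch)


# 800 servers with two teamed NICs plus one management port each.
big_shop = ports_needed(800, 3)        # 2400 ports
# Management ports alone for 100 W2K servers plus 85 Unix boxes.
mgmt_only = ports_needed(100 + 85, 1)  # 185 ports

print(big_shop, switches_needed(big_shop))    # 2400 50
print(mgmt_only, switches_needed(mgmt_only))  # 185 4
```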
Link Posted: 9/8/2004 10:40:00 AM EST

Originally Posted By mrstang01:
For those of you with RILO's, how do you handle the connections to them, do you have an individual LAN connection to each one? That sure eats up a lot of switch ports quickly, doesn't it?

Michael




That's a great issue! I have TRIED to get Dell to put two interfaces on their DRAC mgmt ports.... like a two-port mini-hub, so we could daisy-chain the DRAC ports with only one primary switch port required per rack. I think that's a genius idea for the little-used mgmt ports, but Dell hasn't embraced it yet. They probably will wait and copy HP once HP starts doing it.

But yes.... switch ports aside..... you gotta have out of band management!
Link Posted: 9/8/2004 11:27:31 AM EST
Best thing is to set the RILO on a separate switch and router if possible. This will allow you to connect via the RILO in case there are issues with the network or power.

I've been using RILO for years. It was my best tool when I worked for our Server Monitoring group. I had over 1350 Windows servers I was monitoring in real time with CIM, IBM Director, Dell Openview, BMC Patrol, and a few other tools. Only two of those servers could be considered "local" and all the rest were managed via TS, Computer Management, and scripts. That's everything from domain controllers, Exchange servers, SNA Gateway servers, SQL boxes and everything in between.

If Terminal Services was not responding, I could connect via the RILO and perform a virtual power down and reboot, provided I couldn't kill the offending processes with command line utilities.

It's also a wonderful tool if your server is stuck at an F1 prompt on an abnormal boot sequence. You can literally watch the entire boot sequence, which is something you can't do with many of the software products which require the server to be at least to the user interface (TS, RDP, etc.).
Link Posted: 9/8/2004 11:29:53 AM EST

Originally Posted By Joe_Blacke:
Best thing is to set the RILO on a separate switch and router if possible. This will allow you to connect via the RILO in case there are issues with the network or power.

I've been using RILO for years. It was my best tool when I worked for our Server Monitoring group. I had over 1350 Windows servers I was monitoring in real time with CIM, IBM Director, Dell Openview, BMC Patrol, and a few other tools. Only two of those servers could be considered "local" and all the rest were managed via TS, Computer Management, and scripts. That's everything from domain controllers, Exchange servers, SNA Gateway servers, SQL boxes and everything in between.

If Terminal Services was not responding, I could connect via the RILO and perform a virtual power down and reboot, provided I couldn't kill the offending processes with command line utilities.

It's also a wonderful tool if your server is stuck at an F1 prompt on an abnormal boot sequence. You can literally watch the entire boot sequence, which is something you can't do with many of the software products which require the server to be at least to the user interface (TS, RDP, etc.).



RIBs rule... not the roofie kind.

And yeah, Dell will wait till HP does it and copy them (see above post). Why do you think Dell is the last to market with blade servers? Why do you think Dell's management suite looks just like Insight Manager....hmmmmm