February 2011 Archives

Hopefully this will be the end of my story on this topic.  I really can’t take another weekend like this.  It could have been a nice relaxing 4 day weekend, but didn’t turn out that way.

I woke up and got into work a few minutes after 9 AM on Sunday February 20.  I found that pretty much all the servers had come back up without issue.  The only real problem was that the UPS that powers the networking equipment didn’t power on.  This was most likely due to me moving plugs around on Saturday trying to keep stuff powered up.  I turned on that UPS and the network backbone booted up.  I found that the firewall had again forgotten to pass any traffic, so I had to update its rules to get it working again.  I will have to investigate that again.

Since everything had already been offline for almost 24 hours I decided to do some updates that needed to get done and just leave the servers offline for a few more hours.  I had a series of Windows updates, drivers and firmware changes that needed to be made.  One set of updates was for the hard drive storage which is used by all the servers, which meant I had to shut down all the Virtual Machines and all of the Virtual Servers.  The VMs aren’t too bad since they boot fairly quickly.  The virtual servers however can take 20-45 minutes to reboot.  I had to reboot them a few times to do all the various updates.  Once all of that was done I totally shut off the Virtual Servers and began the process of updating the SAN storage.  That process took another hour or so.

Finally about 3:45 PM I had finished all the updates and gotten all the servers fully working again.  I then left work and headed out to my car.  After getting in and starting my car I realized I forgot my sunglasses inside and headed back in.  On my way back to my car I noticed that my driver side front tire was totally flat, yep my tech problems expanded to automotive.

I then set about removing the normal tire and mounting my space saver spare.  As I was jacking up the flat tire it began to seep air quickly enough for me to hear it.  I never found the hole visually but I could certainly hear it.  I then mounted the spare.  As I let the car down off the jack I noticed the spare was not “fully” inflated.  Which honestly isn’t too surprising since it had been in my trunk for four and a half years.  One small bit of luck is that I figured the maintenance shop at my school had some way to inflate tires.  I slowly drove around the back of my school to the shop.  I found a compressed air hose on a ceiling pulley that was pressurized, lucky for me.  After wrestling with an ill fitting mate point on the hose I was able to inflate the spare properly.

As I drove back to the front of the school I noticed a small shiny metallic cylinder on the ground.  I stopped to look at it since it looked odd.  It is good I did since it was the key for my wheel locks.  I apparently left it on the lock lug as I drove away and it fell off.  If I had succeeded in losing that it would have been a week or so waiting for a new one in the mail.  Another bit of good luck.  I then limped home on my space saver spare.

On Monday February 21 we had the day off for President’s Day.  I had decided to get a full set of new tires since I still had the original ones on my car and they had 45 of 50 thousand miles on them already.  I didn’t think patching a tire that old was worth it.  Kelly and I drove to Costco and I got a new set of 4 Michelin Primacy MXV4 which are the new version or upgrade of the Energy MXV4 that comes default on the Accord.  The tires were a little pricier than I had hoped, but they were cheaper at Costco and also had a $70 discount (for 4).  I can say for sure the new tires are better than the space saver spare, not ready to decide on them overall yet since I have only driven 10 miles on them.

After getting home from the tires and relaxing for a few minutes I was prodded to determine the fate of the TiVo and its potentially dead drive.  I took the “dead” drive and attached it to my arcade PC which I use for this type of diagnosis.  I was initially enthused since the drive showed up and seemed to work when I turned on the computer.  I then ran a command on the drive to prevent it from falling asleep (going idle) and not waking up.  This is a problem that sometimes happens with this model of hard drive in a TiVo.  I returned the “dead” hard drive to the TiVo and it worked, hurrah!  Another stroke of luck it seems.  I will be getting a UPS for this TiVo in hopes of protecting it and keeping it running well into the future.

This story of a weekend of problems ends on a bright note, I suppose.  Tuesday is a snow day, so maybe I can relax a bit after shoveling some more snow.  If you made it through all 4 parts, thanks for reading, and I hope your President’s Day weekend was better than mine.

We left Part 2 with me finally having fixed my home machine when suddenly my cell phone rings.

Somehow within 2 minutes of finishing fixing my home machine I get a call from work telling me the power is out.  Really?  This is great news since you know I was just bored today anyway obviously.  Nothing else to do other than take a shower and go in and see what is happening.

A little backstory here for context.  During the recent snowstorm in early February we had a power outage as well.  In that one, 1-2 phases of our 3-phase power were flickering on and off, causing all sorts of havoc.  I also found that some things were plugged into the wrong UPS.  I spent several hours moving plugs and rebalancing the power load after that outage.  According to the UPS controller card I was supposed to get a little over 2 hours out of everything except our phones, which should get more like 14 hours.

When I got the call from work they told me the power went out about 10:30 AM.  I was hoping to get to work before the batteries ran out so I could check a few things and shut some things down nicely.  I got to work close to 12:30 PM and found everything but the phones dead, sort of.  Apparently 2 of our 3 power phases were out again.  What this means is random outlets are powered on but only at half to 3/4 voltage, which sort of works but not really.  If you walk around my school some emergency lights are on and others aren’t, but in most cases you can’t turn on the real lights either.  Which means it is just DARK.

The data center has no windows and the lights don’t work, and since some power is still there the emergency lights won’t power up.  It is so much fun to work like that, I must tell you.  All the UPSes are dead except for the one for the phones, which says it still has power, again sort of.  Luckily that UPS takes the 80 volts and boosts it to 120 but isn’t draining its batteries as fast.  I grabbed a standard light and plugged it in to the one UPS still on so I would have some light in the room.  Since some power is on I tried moving some plugs around to get stuff on.  This doesn’t work too well since as soon as I draw too much power it basically goes off.  I can get little power adapters to work but not power up full servers.  I try various things for almost 90 minutes and don’t make much progress.
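For the curious, the sag and boost described above work out as simple ratios.  Here is a minimal Python sketch of that arithmetic.  The 80 volt and 120 volt figures come from the paragraph above; the function names are made up for illustration, and this says nothing about how the UPS actually does the boosting internally.

```python
NOMINAL_VOLTS = 120.0  # normal outlet voltage, as mentioned above


def sag_fraction(measured_volts, nominal_volts=NOMINAL_VOLTS):
    """Fraction of the nominal voltage that is actually arriving."""
    return measured_volts / nominal_volts


def boost_ratio(measured_volts, nominal_volts=NOMINAL_VOLTS):
    """How much the input must be stepped up to get back to nominal."""
    return nominal_volts / measured_volts


if __name__ == "__main__":
    measured = 80.0  # the sagging input the phone UPS was seeing
    print(f"{measured:.0f} V is {sag_fraction(measured):.0%} of nominal")
    print(f"boost needed to reach nominal: {boost_ratio(measured):.2f}x")
```

So 80 volts is about two thirds of normal, which fits the "half to 3/4 voltage" seen on the random live outlets, and the UPS has to step it up by half again to reach 120.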

Shortly before 2 PM the rest of the power goes out finally.  I figured at that point I couldn’t do anything else.  I went out to my car and decided to drive around back where the power comes in.  Surprisingly the BGE (power company) guy is there.  I stop and wait for him to come back to his truck and while I am standing there I see a tree that snapped in half and the top half with a y fork in it has fallen directly onto the power lines pinning the 3 phase wires together and potentially doing more damage.  For some reason I didn’t take a picture of this, oh well.  I find out he turned off the one remaining phase to prevent any further damage being caused by drawing too much power at the wrong voltage to their equipment and ours.  The good news at this point was we know what the problem is.  The bad news is it only affects my school which is essentially one customer instead of 100-1000 customers like other outages around the area.  I finally leave and head home close to 2:30 PM.

While I was at school Kelly called about 1 PM and let me know the power went out at home.  I tried to power off the server from school but lost the connection before I could.  The power got turned back on at about 2:20 PM as I was heading home.

When I got home I checked my server and it appears to have stayed up for the entire power outage.  I didn’t think my UPS would last that long but I guess it did.  I then went upstairs and sat down with Kelly in the living room.  I noticed the TiVo had the TiVo symbol on it and all the lights were on instead of showing the time as usual.  ARGH!!!

I turn on the TV and am greeted by the initial TiVo powering on screen that just says “Welcome. Powering up…”  I pull the plug on the box and wait a few minutes.  Plug it back in and same thing.  I repeat and wait longer, again same result.  I did some searching online and most people say this usually indicates a drive problem.  Hard to believe since as I related in Part 1 I just replaced this drive less than 2 months ago.  I opened up the TiVo case and powered it on while listening carefully.  It sounds like the drive is spinning up, but not going any further.  I still actually have the original drive from my TiVo that best I can tell I had never booted up or I at least ran a complete Clear and Delete on.  Once I installed the original drive the TiVo booted right up and took me to the guided setup screen.  I went through and configured the TiVo with OTA and FIOS.  It has OS version 9.4 on it instead of the current 11.x but it works for now.

I still have yet to pull the dead drive and hook it to a PC to see if I can access it and hopefully copy it off.  Also I will have to RMA this replacement drive that I have had for less than 2 months.  Maybe I will take this opportunity to RMA my server drive too.  I may have lost all my recordings and season passes though, which really sucks after having worked so hard to save them and just lose them again.

Fast forward to 9 PM and I get a message on my cell phone (missed it while I was in the basement watching a movie) that the power is back on at school.  I will go in tomorrow to see what is happening.  I would work from home but I can’t connect to anything, which I suspect is due to some UPS not turning back on.

Hopefully this will be concluded in Part 4, although it may be a day or two before I find out the final fate of the TiVo drive.  Stay tuned.

In Part 1 I told the (horror) story of my tech woes around Christmas.  Now I will talk about President’s Day weekend, which is turning out to be even more of a mess.

Friday February 18, 2011 I woke up ready to go with plans for things to do that day.  Friday was a teacher work day, which means no students and usually fewer interruptions.  I was hoping to finish building some new install images based on the freshly released Windows 7 Service Pack 1 and maybe do some upgrades of servers to Windows 2008 R2 Service Pack 1.  However that isn’t what happened.  At about 7:45 AM I got a call at my desk saying “Did you know all the phones in the Ward House are out?”  My response: “Um, no?”  The Ward House is where almost all the employees who aren’t teachers have offices: admissions, business, etc.

I went over to check it out and apparently one of the switches that connects our phones to the main server room and powers them via Power over Ethernet was dead.  It appears that the switch really overheated, most likely caused by it being 70 degrees outside (strange for February) and the heat still being on in the building.  We have two 24 port switches for the phones in the Ward House and I found out we have 32 active phones.  Luckily many people were out on Friday so I set to work getting 24 phones working on the one good switch as quickly as possible.  The first problem was determining which 24 to fix and where they were plugged in.  You might say “don’t you know that already Mr. Network Admin?” and it would be a valid question.  My only excuse is there have been a bunch of people who left or moved in the Ward House in the last year.

Fast forward 2 hours and I have 24 phones working and everyone is happy.  Just for fun I plugged back in the dead switch after having it unplugged and cooling for 2 hours and it came back on.  Just in case I only hooked in the 8 non-needed phones for Friday.  It seemed to still be working fine when I left Friday evening.

As a result of the 2+ hours lost in the morning I never really got back to doing what I had planned to do.  I didn’t really work on the images at all, but I did get to install SP1 on a few 2008 R2 servers for testing.  I had a splitting headache most of the day though, I think due to spending hours in a dusty, hot wiring closet.

Saturday February 19, 2011 started off OK.  After Kelly woke me up with the Cuisinart I got up.  After eating breakfast and relaxing a bit I turned on my desktop PC to pay some bills at 9:11 AM.  At about 9:14 AM I got a virus / malware infection on my machine.  I saw Java load on my machine and I should have known something odd was happening.  I thought it was the updater but it was actually running something.  Then almost every application on my machine instantly closed and I got a taskbar notification of an infection (from a fake program).  Watching this happen in real time right in front of me without any user interaction gives me a much better understanding of what happens to students at my school.  I have often said I wonder how they get this stuff on their machines; now I know how.  Of course there are many other ways and sometimes they help it along, but it obviously can happen without even trying.

My best guess is this particularly nasty little program got on my machine via a Flash exploit via a page I loaded on startup in Opera.  The only other thing running on my machine was iTunes (auto started to synch iPod) and Outlook.  I didn’t have IE open at all.  I know I had a semi out of date Flash version in Opera since it doesn’t seem to prompt to update as much or as well as the ActiveX version in IE.  The sites I had open in Opera are ones I open every time I load Opera and visit almost daily.  Most likely one of those sites hosted an ad that was fake and exploited a hole in Flash to install a dropper.  I immediately tried to start Task Manager and Process Explorer, both of which this fake anti-virus killed.  The fake AV pretty much instantly killed anything I tried to load.

I then logged out of my main account and logged in as an alternate admin account.  Thankfully the fake AV appeared to be contained to my main profile.  When I logged into the alternate account Windows Defender popped up and warned me about finding Rogue:Win32/Winwebsec at 9:21 AM.  I imagine it found it actually when I was logged in as main account but was stopped before it could warn me.  I told it to remove it.  I then loaded Process Explorer to check for anything odd running and didn’t see anything strange. I also ran AutoRuns to look for anything, I didn’t really see much but it doesn’t look at stuff for other profiles very well. After some manual checking I loaded Malwarebytes’ Anti-Malware (my current favorite app for removing fake AV and other spyware) and ran a quick scan.  MBAM like Autoruns works much better as the infected user but in this case the infection was preventing it from running so running it as another user was the only choice.  It ran and found a handful of items all within my main profile.  It found a combination of Spyware.Zbot and Trojan.Hiloti at 9:41 AM.  I also installed updated versions of Flash (both ActiveX IE and Opera versions). While trying to clean the machine I installed a real Anti-Virus on my machine for the first time in a REALLY long time (if ever).  I installed Microsoft Security Essentials because it works arguably as well as any other and it is free.  I started a full machine scan to try and make sure nothing else was hiding on the machine.  While the full scan was running our power went out at 10:18 AM.  It only went out for a few seconds, just long enough to be annoying.

I turned my machine back on at 10:20 AM.  I logged in as my alternate account again.  Immediately MSE popped up and warned me it had found Trojan.Podjot.A in my main profile directory (now the fourth different virus).  I had it remove that.  I then initiated another semi-full scan, I excluded a few directories that literally have millions of files and I am almost 100% sure wouldn’t have the virus in them.  It finished at 10:47 AM and found a bunch of Java based pieces of a virus.  Here is a screenshot of those:
[Screenshot: Java-based virus pieces found by the MSE scan]
As a result I went and installed the latest version of Java just in case.  I don’t think the virus entered through Java, but rather just used it to run.  I then rebooted my machine again after I uninstalled some other unrelated items.

I logged back into my main account at 10:53 AM and was instantly greeted by the fake AV again.  I then promptly logged back off again then back in as my alternate account.  I ran a quick scan with MBAM and a custom scan with MSE of my main profile directory again and found nothing, which was not good.  I had previously used regedit to check the HKLM\Software\Microsoft\Windows\CurrentVersion\Run and RunOnce and also checked with AutoRuns.  I now manually loaded the main account ntuser.dat as a hive so I could check the HKCU\Software\Microsoft\Windows\CurrentVersion\Run and RunOnce for it.  In the RunOnce I found a key pointing to aOfOjPl08200.exe in C:\Programdata\aOfOjPl08200.  I was pretty sure this was my last problem and I verified it by looking at the creation time and seeing it was 9:14 AM.  At this point I was very curious as to why this wasn’t caught by MBAM or MSE so I did a Google search for online virus scanners.  These are sites that you can upload a file to and have it scanned for viruses.  I tried a few and the file kept coming up clean.  I then found Virus Total which uses 43 different Anti-Virus engines to scan a file.  I uploaded the file there and got the results you can see at this link or in the picture below:
[Screenshot: Virus Total scan results]
You may notice that only 13 of the 43 anti-virus engines even picked this up as a virus, and notably MSE did not.  I apparently was the lucky recipient of a freshly created variant of the Zbot Trojan, yippee!  I deleted the registry entry pointing to this file and deleted the file.  I then logged out and back in as my main profile, which didn’t load properly.  I then realized I had forgotten to unload the hive.  I logged back in as the alternate account, unloaded the hive and rebooted the machine just in case.
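In hindsight, the startup entry that brought the fake AV back had a recognizable shape: an exe sitting in a ProgramData folder named after itself.  Here is a minimal Python sketch of a check for that one pattern.  It assumes you have already copied the Run / RunOnce values out of the loaded hive into a plain dictionary by hand; the function names, the value names and the iTunes path in the sample are made up for illustration.  This heuristic is only a guess based on this one infection, not how MBAM, MSE or AutoRuns work.

```python
import ntpath  # Windows-style path handling that works on any OS


def looks_like_dropper(command):
    """True if the command is an exe in a ProgramData folder named after itself.

    Limitation: commands with extra arguments after the path are not parsed.
    """
    path = command.strip().strip('"')
    folder, exe = ntpath.split(path)
    stem, ext = ntpath.splitext(exe)
    parent, folder_name = ntpath.split(folder)
    return (
        ext.lower() == ".exe"
        and ntpath.basename(parent).lower() == "programdata"
        and folder_name.lower() == stem.lower()
    )


def suspicious_entries(run_values):
    """Keep only the startup values whose command matches the pattern."""
    return {name: cmd for name, cmd in run_values.items()
            if looks_like_dropper(cmd)}


if __name__ == "__main__":
    # Value names here are hypothetical; only the dropper path is from above.
    sample = {
        "SomeRandomName": r"C:\Programdata\aOfOjPl08200\aOfOjPl08200.exe",
        "iTunesHelper": r'"C:\Program Files\iTunes\iTunesHelper.exe"',
    }
    for name, cmd in suspicious_entries(sample).items():
        print(f"check this one: {name} -> {cmd}")
```

A match is only a hint to go look at the file, for example at its creation time, the way the 9:14 AM timestamp gave this one away.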

Update: Microsoft now recognizes my file as a virus as you can see at this link.

I logged in at 11:22 AM and found a clean profile.  Success!  It only took me 2 hours to clean these 5-6 viruses off my machine.  All of them were working together on some level to infect and prevent removal on my machine.  Nasty stuff.

Then at 11:24 AM my cell phone rang, read what happened next in Part 3.

This is a somewhat old story but it will lead into a more current one in Part 2 shortly.

Over Christmas Kelly and I went to visit my parents in Pittsburgh, PA.  We left the morning of Wednesday December 22 and returned Sunday December 26.  When we left everything tech wise was working pretty well in the house.  However when we got back a sequence of events started unfolding to make that no longer true.

We arrived home to find the arcade / Windows Media Center computer dead.  This computer records all the stuff we watch on normal over the air networks and feeds it to the Xbox 360 to allow us to watch it on the big screen in the basement.  The computer wouldn’t even boot up and it appeared the main drive was dead.  I tried various methods to recover the drive with almost no success.  Luckily that drive only housed the operating system and none of the real data other than what was scheduled to record.  All the data was on other drives.  I went out on Monday and bought an Intel 40 GB SSD to replace the dead drive and get the machine working.  After a few hours and some reconfiguration I got everything working again.  I also later found out the original drive had a 5 year rather than 3 year warranty so I got it replaced as well.

While I was getting the arcade OS drive replaced I finally got the failed half of the 1 TB mirror in my server replaced as well.  It actually had died a long time ago (like years) but I had never replaced it since the online RMA page wouldn’t process it.  I tried it again since they were both Western Digital drives and it actually worked this time.  I got both drives on Wednesday and was still off work thankfully so I could get them installed.

Also on Wednesday the hard drive in my Series 3 TiVo decided it was also time to quit working.  This was not totally a surprise since the TiVo had a few freezing / garbled picture / rebooting issues before.  This time however it would boot and only work for about 10 minutes at a time.  Knowing it was dying and not wanting to push my luck I immediately removed the drive and started determining what to do.  This was also a WD drive and was still under warranty so I started a third RMA.  The replacement drive arrived on Friday.  After a couple of false starts and a good dose of patience I was able to copy 95-99% of the failing drive to the replacement.  Luckily the lost parts were in recordings and not the operating system part of the drives.  I think we lost a few minutes of a few shows but not much really.

The final problem occurred within a week, not sure of exact day.  Half of my OS 150 GB mirror in server failed as I was doing a consistency check on the mirror set.  As of today (2/19/2011) I still haven’t replaced this drive via RMA with WD (yes again), however I may do it now for reasons you will see in Part 2.

One short note on Western Digital.  You may think by reading this that their drives are junk and prone to failure, but I would really disagree.  Most of these drives have been on 24 hours a day, 7 days a week for 3 or more years.  They served me well for quite a while.  Also the RMA process of getting a replacement, no questions asked, in 48 hours max can’t be beat.

About this Archive

This page is an archive of entries from February 2011 listed from newest to oldest.
