EOC Folding @ Home - Stats News & Updates

What happened yesterday... - 07.05.16, 11:30am CST

In case you didn't notice, the stats were down most of the day yesterday. I finally took the plunge and upgraded the database from an older version of MySQL to the latest MariaDB. It required a full backup & reload of all data because of the wide gap between versions. In the future, DB upgrades should go pretty seamlessly and just require a simple table check.

I took the opportunity while reloading the data to create partitions for the history tables. This should help performance and make life a little easier for things that I plan on doing in the future. Up till this point I had to prune off old USER data to keep things manageable. Currently, the earliest USER data is from January 2013; however, I've kept TEAM data much longer and it dates back to June 2008.

The poor server is really starting to show its age. The last couple of times I've had to reboot it, I've made sure to have a remote console active. For some reason it gets stuck at the BIOS screen and I have to fidget around trying stuff to get it to continue on. Don't know why it just started doing this within the past few months, but thankfully reboots are few and far between. Can't afford a new server, so pray it keeps on running. There is a slight chance it's an issue with the BIOS version, since the latest on Dell's site seems to have reverted back one update (though Dell's file-keeping skills are pretty horrible and I've seen files disappear all the time). Next time I have to reboot the server I'm going to revert the BIOS back to that version and see if it makes a difference; there was only a non-relevant change between the two versions. Until then, as long as things keep running I'm not going to tempt fate.

Also worth noting: when I was making changes to everything, I noticed my I/O scheduler setting had gotten clobbered in the kernel settings, who knows when. It defaults to CFQ, but I changed it to Deadline, as that seems to give better performance with the RAID hardware and heavy database usage.
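
For anyone who wants to check the scheduler on their own Linux box: the kernel lists the available schedulers in /sys/block/<device>/queue/scheduler and wraps the active one in square brackets. Below is a minimal Python sketch for reading that file. The device name "sda" and the helper names are assumptions for illustration only, not details of this server.

```python
from pathlib import Path


def parse_scheduler(text):
    """Return (active, available) from the contents of a sysfs scheduler file.

    The kernel lists every available scheduler and wraps the active one
    in square brackets, e.g. "noop deadline [cfq]".
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_scheduler(device="sda"):
    """Read the scheduler for one block device ("sda" is only an assumption)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())


# Parse a sample string rather than touching the real system.
active, available = parse_scheduler("noop deadline [cfq]")
print("active:", active, "available:", available)
```

Writing a scheduler name into that same file (as root) switches it at runtime, but a runtime change like that does not persist across reboots, so it also has to be set in the boot configuration.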

So what's next for the folding stats, you might ask? I have a lot of little backend code changes that I started over the years but never got around to finishing and rolling out, so that is going to be my first priority. Next, I'm going to merge Team 0 (Default) & 446 (Google) in with the general rankings because their points are more in line with the other teams. I also need to address the issue of updating records instead of inserting new ones when some ranks change but points don't. With only about 1% of a team being "active", that generates a LOT of extra data from the inactive people. I think if I can get that issue resolved I would be able to process more teams and individuals and still store less "new" data each update. I know I need to update the production colors too, and fix the sig images because points have grown so much.

If you have any suggestions, bugs, or whatever, please send me an email and I promise I will respond now.

New Drives Are In... - 11.05.14, 4:56pm CST

First, I want to thank those of you who donated for replacement drives. I really appreciate it.

So here's the story. At first I bought an identical 300 GB replacement drive off eBay that was supposed to be good. After I bought it I figured I would buy a pair of 146 GB drives that matched the rest in the server *just in case*. Well the first drive came in and it was DOA... No huge deal, the guy refunded my money no hassles.

I was a little hesitant anyhow about the supposedly good drive left in the array, as it took way too long to copy the data off (practically a whole day when it should have been maybe 30 minutes). Long story short, I'm glad I replaced both, because when I got them back here and tested them out the controller gave some errors and you could hear the supposedly good drive clicking away, so it was on its last legs as well.

The new drives have been in for probably a little over a week now but I haven't moved anything around yet. I've been backing files up onto them just to make sure they are good and aren't going to give any errors.

This weekend I'm going to try to at minimum move the various databases onto their proper arrays. I *might* decide to upgrade MySQL (the DB software), which means things will probably be down for a few hours. Not sure yet, as I'm still doing some testing on another machine. It would be nice though.

Drive Failure - 10.16.14, 4:53pm CST

Sorry for the downtime today. One of the disks for the stats failed, so rather than risk running the array in a degraded state I moved the data to another array. It just went painfully slow.

I'm kind of on edge about that other disk so I decided to just buy a replacement pair that also matches the other drives in the server. That way if one fails in the future I already have a good spare on hand.

Hopefully they will come in early next week, I can test them out, then go swap them out at the data center.

That's all for now. Wish I had more news. Wish I had more free time.

Where does the time go? - 03.10.14, 11:09am CST

Already March of 2014, where does the time go? Yes, I do get sidetracked from time to time... lol.

I took the folding stats down for a few days in order to clean up and shift some data around. The first problem was that the User History table was growing insanely large and causing issues. We are talking close to TWO BILLION rows of data and around 60GB of storage (just for one table). The second problem was that I needed to alter the WUs & Points fields to accommodate larger numbers, mostly for the anonymous user.
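
The post doesn't say which column types were involved, so the following is only a general illustration of why a numeric field can need widening. It is a small Python sketch that picks the narrowest unsigned MySQL integer type able to hold a value. The limits are MySQL's documented ones; the helper name and the sample totals are hypothetical.

```python
# Maximum values of MySQL's UNSIGNED integer column types.
UNSIGNED_MAX = [
    ("TINYINT", 2**8 - 1),
    ("SMALLINT", 2**16 - 1),
    ("MEDIUMINT", 2**24 - 1),
    ("INT", 2**32 - 1),
    ("BIGINT", 2**64 - 1),
]


def smallest_unsigned_type(value):
    """Return the narrowest unsigned MySQL integer type that can hold `value`."""
    if value < 0:
        raise ValueError("unsigned columns cannot hold negative values")
    for name, limit in UNSIGNED_MAX:
        if value <= limit:
            return name
    raise ValueError("value does not fit in any MySQL integer type")


# Hypothetical totals, not real figures from the stats.
print(smallest_unsigned_type(3_000_000_000))   # still fits in INT UNSIGNED
print(smallest_unsigned_type(5_000_000_000))   # needs BIGINT UNSIGNED
```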

The stats numbers should settle back down in a week's time, but for now they should be churning along without any more issues.

Since I've already started fixing some things, I'm working on the code right now to fix some other minor issues, pretty much all behind-the-scenes type stuff. I will adjust the rank colors in a week or so; I'm waiting for the production numbers to settle back down so I can better gauge what to make each level.

I have a failed drive in the server right now and am waiting on a replacement to arrive before heading down to the data center to swap it out. I've moved all content off that RAID partition to be safe. It shouldn't affect the stats at all, and there shouldn't be any downtime replacing the drive.

Other than that, everything is status quo I suppose. I'll post some more updates in the coming days / weeks as I update and fix stuff.

Here are some query times, now you can see why having a very large table is a bad idea... lol.

Query OK, 739383379 rows affected (6 hours 39 min 20.11 sec)
Query OK, 739383379 rows affected (21 hours 9 min 39.23 sec)
Query OK, 519104842 rows affected (7 hours 4 min 24.29 sec)
Query OK, 519104842 rows affected (14 hours 54 min 37.28 sec)
Query OK, 602639803 rows affected (5 hours 36 min 4.34 sec)
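
To put those timings in perspective, here is a quick Python sketch that turns each line into an average rows-per-second figure. The row counts and durations are copied from the output above; the helper name is just for illustration.

```python
def rows_per_second(rows, hours, minutes, seconds):
    """Average rows processed per second for one statement."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return rows / elapsed


# (rows affected, hours, minutes, seconds) copied from the query output above
timings = [
    (739383379, 6, 39, 20.11),
    (739383379, 21, 9, 39.23),
    (519104842, 7, 4, 24.29),
    (519104842, 14, 54, 37.28),
    (602639803, 5, 36, 4.34),
]

for rows, h, m, s in timings:
    print(f"{rows:>11,} rows  {rows_per_second(rows, h, m, s):>9,.0f} rows/sec")
```

Every statement lands somewhere between a bit under 10,000 and about 31,000 rows per second, which is why anything that touches hundreds of millions of rows takes the better part of a day.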

So I'm a couple days late... - 01.29.13, 5:48pm CST

Oh what a weekend, it seemed like everyone I know managed to get that stupid fake FBI ransom-ware on their PC and needed me to clean it off. AFAIK, even the latest version of Java has possible exploits... yay!

On a different note, I HAVE been working on the FAH stats on my dev box. Basically doing things I planned on doing a year ago. LOL! Yeah how time flies. I'm getting close to pushing it out onto the main server, when that happens expect the stats to be down for a full day. It's going to be a lot of work / processing to fix things up. I will give you guys a couple days warning before I do it, it will more than likely be on the weekend or maybe a Friday. It's not going to be this weekend because of the Super Bowl, hopefully February 9th or 10th.

Yes I plan on changing team colors too during the refresh, please just be patient. Yes I'm aware of the anonymous user not showing proper historical data, that's one reason for this code change. Also *fingers crossed* I'll be able to process more people & teams without adding any overhead.

For you DB gurus, amongst other things I did do an interesting test comparing times and space usage for the historical data as one table vs. individual tables for each team. While having thousands of individual tables overcomes the inherent MyISAM table-locking issue, you run into issues having to tune MySQL parameters to handle having so many tables open, and then there is a file-system issue with having so many files in a single directory. In the end it was a toss-up. All the extra data processing / manipulation takes more time, and there really isn't a bottleneck with DB table-locks. Disk usage also appears to be close to the same; there is no space savings advantage one way or another. So in the end I'm keeping the historical data as one big table like it has been (more or less) and not having to worry about as many code changes. It was a lot of work rewriting all the code for the tests, and while it didn't prove to be beneficial it was still a fun little project in itself.

On my dev box here I put in an old SSD to use for the stats database, and even an old SSD smokes any hard drive. I saw the new Samsung 840 Pro 256GB drives on sale, so I pulled the trigger on one and am waiting for it to be delivered. That's going to be cool to try out and see if there is any improvement.

Now, in other news... A HUGE THANK YOU to everyone that has donated for Horatio's surgery (info & link in post below). He has had the surgery and will hopefully be going to his foster home this week to recover. She did send me a couple pictures right before & after the surgery, they are pretty graphic and I don't think are really appropriate to post. Maybe after he has healed up some and is on his feet I can post something a little nicer. They were very ecstatic about all your generosity, believe me it has really helped out to defray costs.

Yes, I am still here!!! - 01.23.13, 11:25am CST

Howdy Everyone! I've received a few emails lately from people asking if I'm still alive and why I haven't made any posts lately. It's scary to think how fast this past year has gone by, but it seems like every year is like that. Sadly most of that time is just work, work, work...

Anyhow, before I talk about the folding stats, there is something else I wanted to bring up that is dear and personal to me. Please bear with me as I really don't want to go into too many details because it is such a hard thing to talk about. Back on November 11, my dog Squirt passed away. She was a German Shepherd mix and would have been 14 this January. Nobody believed how old she was because she was always in such good health and very active, but I rescued her when she was just a tiny little puppy, only about a month old. She was my little shadow always following me around, and I was fortunate enough to be able to bring her to work with me, so we were always together all these years. When she passed away so suddenly my world just collapsed, I was devastated. And that's about as much as I want to talk about that.

One day in Mid-December I saw on the local news where they had security camera footage of a lady abandoning a little puppy in front of a groomer's shop in the middle of a freezing cold night. The dog had a disorder where part of his brain didn't fully develop and thus he didn't have very good motor skills. Basically he walked around looking like he was drunk. But if you saw how his tail was wagging and how happy he was to see people and other dogs, you could tell that little guy was full of life, love, and happiness. That was kind of like a wakeup call where I realized there are so many pets out there that need loving homes and I knew I could damn sure do better than that lady that abandoned that sweet little puppy.

So I started searching on PetFinder and contacting local rescue groups. At first I found a dog that looked a LOT like Squirt, they could have been family they looked so similar. I called but found out she was already adopted. But then this very nice lady sent me a picture of a 2-yr old boy dog in a local shelter that kind of matched the description of what I was looking for, except he was male and I really wanted a female. I really thought long and hard about it, and the lady contacted me again the next day asking if maybe I would just foster a dog instead since she knew my whole back story. Well, sometimes the universe works in mysterious ways and long story short I decided to adopt the dog that was in the shelter. Only afterwards did I learn that the day I decided to adopt him was the day he was scheduled to be euthanized! I decided "Lucky" would be a good name for him.

This was right before Christmas, and even with everything else going on the lady from the rescue group took the time off from her work to go pick him up (which I also discovered later was pretty far away), take him to the vet to get checked out, fixed, get his shots, etc, etc, so I could legally adopt him, and even watch him for a couple days because I was extremely sick and stuck in bed for a week. On Christmas Eve morning I was finally well enough to meet her and pick up Lucky. He was a little shy at first, but quickly warmed up to everyone, and now he follows me around everywhere. He is such a loving and sweet dog and extremely well behaved, every day I'm in disbelief that this little guy wouldn't be around if all those prior events didn't take place when they did. Like I said, the universe works in mysterious ways.

Fast forward to this week. I got an email from the lady that rescued Lucky. They rescued another dog from a local shelter, and unfortunately one of his legs is crushed so badly that it is going to have to be amputated. They really need donations for his surgery (I'm sure it will cost several thousand dollars!), which is why I'm turning to you guys. I'm asking that instead of the donations you make to these folding stats, you please donate to this rescue group to help them defray the costs for all the good that they do. I'm including a picture of Horatio, the dog needing surgery, below. Please help out if you can, I know you people are very generous already otherwise you wouldn't be doing F@H. I made a ChipIn widget below. If you donate, please be sure to put in a note, "For Horatio's Surgery" so they know. ;)

Okay, I think this post is long enough. I'll post some about the folding stats this weekend. I have been working quite a bit on them behind the scenes with an update very soon.

Several New Things - 01.08.12, 6:08pm CST

First off, I would like to give a HUGE thanks to Nat from Team Anandtech for donating a pair of 300GB SAS drives. This is going to be a tremendous help in that I can now put the Folding Stats on their own dedicated disks, and also not worry about running out of space! I'm hoping I can go down to the datacenter to get these installed one day this week. If you've got the time, go over to Team Anandtech's Forum and tell the guys thanks!

Second, I've taken the CSV query offline for the time being. It is not used by many people, but it does have the capability to overload the server if misused. I am planning on re-implementing it in a slightly different fashion that won't require so much processing power. Please be patient and I will make an announcement once it is back up.

Third, I have not done the DB upgrade yet; I'm waiting until after I get these new drives installed. When I do, it will be done on the weekend, since during the week I'm just too busy to try and do such an overhaul uninterrupted.

Fourth, there was a little downtime this past week. I was doing some firmware updates to various components on the server, all of which required reboots.

Fifth, I've been thinking about how to maximize useful data while minimizing the not so useful data. I do plan on restoring a bunch of the old data that got pruned. But then I'm going to go back through it and remove records where [inactive] users' points stayed the same, but their rank changed [down]. I ran a query the other day, and out of the ~540,000 users being tracked, only about 3-4% are regularly active! So as you can see there is a lot of overhead tracking inactivity.

This week I also plan on making a code change related to the above. For inactive users whose rank goes down, instead of adding a new row with the current update, (same) points, and (new) rank, I'm just going to update their last entry with the new info. This will have one little side-effect in the display of the stats: for inactive people it won't show the little red arrow as they move down the ranks. I'm still thinking about it. If I keep track of two points in time then I can still show the red arrow and not have too much overhead in space, but translating a thought into practical code is another story!
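
As a rough illustration of the update-in-place idea described above, here is a minimal Python sketch that works on an in-memory list rather than the real database. The field names (update_id, points, rank) and the function name are assumptions for the example, not the site's actual schema or code.

```python
def record_update(history, update_id, points, rank):
    """Append a new history row, or overwrite the last one if points are unchanged.

    `history` is a list of dicts for ONE user, oldest first. The field
    names (update_id, points, rank) are assumptions for this sketch.
    """
    row = {"update_id": update_id, "points": points, "rank": rank}
    if history and history[-1]["points"] == points:
        # Inactive since the last stored row: refresh it instead of adding one.
        history[-1] = row
        return "updated"
    history.append(row)
    return "inserted"


history = []
record_update(history, 1, 1000, 50)   # first row is always inserted
record_update(history, 2, 1500, 48)   # points changed, so a new row is added
record_update(history, 3, 1500, 49)   # inactive, rank slipped: overwrite
record_update(history, 4, 1500, 51)   # still inactive: overwrite again
print(len(history), history[-1])
```

Four updates end up as two stored rows. The "two points in time" variant would leave the first row of an inactive stretch alone and only keep overwriting the second one, at the cost of one extra row per inactive user.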

I also am planning on moving team Google into the normal ranks. Their points total has dropped so much that when I do they will be ranked around 47-48 or so.

I know, I still need to adjust production totals for the colors too... It's on my list!

Finally, one last interesting statistic. With a code change implemented around the beginning of 2010 to track more users, the number of rows added to the user tables went from about 12 million per month to over 38 million per month!

Hope everyone had a Happy New Year! Hard to believe it's already 2012!

Sorry For The Down Time - 12.29.11, 11:16am CST

As many have noticed the FAH Stats have been down for the past couple days. I've been pruning old historical data in an attempt to get the database under control. The user history table grew to over 1.1 Billion rows, with the table consuming over 36GB!

I've managed to shave it down to around 767 million rows and the table size is now a little under 22GB. There's some tipping point where things really start to balloon if you compare the two pairs of numbers.
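
For a rough feel of that comparison, the two pairs of numbers can be turned into average bytes per row. This is only back-of-the-envelope Python: the figures above are rounded ("over", "around", "a little under"), and treating 1 GB as 10**9 bytes is an assumption.

```python
def bytes_per_row(size_gb, rows):
    """Average on-disk bytes per row, treating 1 GB as 10**9 bytes."""
    return size_gb * 10**9 / rows


before = bytes_per_row(36, 1.1e9)    # "over 1.1 Billion rows", "over 36GB"
after = bytes_per_row(22, 767e6)     # "around 767 million rows", "a little under 22GB"
print(f"before: {before:.1f} bytes/row, after: {after:.1f} bytes/row")
```

With those rounded inputs, dropping roughly 30% of the rows removed roughly 39% of the table size, so the space used did not shrink in simple proportion to the row count.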

Yes, I did do a backup before I started removing data. I do plan to eventually re-insert some of it once I get things better organized.

I've un-blocked all the IPs for now. A few people emailed me and I don't think I got a chance to respond back to everyone. If you are still having trouble accessing the XML please send me another email.

I haven't gotten around to the color changes yet, that will probably happen next week.

Be Warned Now... I will be upgrading the database (MySQL) probably this Saturday. Meaning the whole site will be completely down for at least 2-4 hours. I've got to do a complete DB dump / reload because I'm upgrading between versions that are not compatible with leaving the data in the tables.

I'm sure there's more that I can't think of right now, if I remember it I'll be sure to make an additional post.

P.S. for those that want to complain... Realistically this is probably only the second down-time in the past seven or eight YEARS for the stats (that have been MY fault and not Stanford's)... The last time being when I moved the stats to the current server.

Making Some Changes - 11.29.11, 2:50pm CST

First things first. We've been having some load issues with the server and I've been trying to track down the root cause. I'm pretty sure the culprit lies in some buggy scripts that people have written that are inadvertently pounding the server with mass requests.

That being said, I've been going through the logs and have blocked some of the worst offenders over-querying the XML. If you have been blocked, but you feel your usage is legitimate, then please email me and we can work things out to figure out what needs to be done.

I have NOT gone through the regular page logs yet, I'm sure there could be some people querying regular pages and parsing that data. If you are one of those people I would HIGHLY recommend you check your scripts to make sure they aren't doing anything crazy.
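
For script authors, one simple way to avoid hammering a server is to cache the last response and refuse to re-fetch until a minimum interval has passed. Here is a minimal Python sketch of that idea. The class name is made up, the fetch function is supplied by the caller, and the three-hour default is an assumption based on the "8 records a day" figure mentioned in this post, not a published policy.

```python
import time


class CachedFetcher:
    """Wrap a fetch function so it runs at most once per `min_interval` seconds."""

    def __init__(self, fetch, min_interval=3 * 60 * 60, clock=time.monotonic):
        self._fetch = fetch              # caller-supplied function doing the real request
        self._min_interval = min_interval
        self._clock = clock              # injectable so the demo needs no real waiting
        self._cached = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._min_interval:
            self._cached = self._fetch()
            self._fetched_at = now
        return self._cached


# Demo with a fake fetch function and a fake clock (no network involved).
calls = []
fake_now = [0.0]


def fake_fetch():
    calls.append(fake_now[0])
    return "<xml/>"


fetcher = CachedFetcher(fake_fetch, clock=lambda: fake_now[0])
fetcher.get()
fetcher.get()                 # served from the cache, no second request
fake_now[0] = 3 * 60 * 60     # pretend three hours have passed
fetcher.get()                 # interval elapsed, so this one really fetches
print("real fetches:", len(calls))
```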

Second, I'm *finally* going to be changing the color levels. I know it's been a long time coming, but as we all know, as processing power increases, so do the points. Hopefully one day this week I will be able to make the change and see how the new numbers pan out.

Third, in case you didn't notice, we have a database with well over 1.1 BILLION rows of data! What I'm planning on doing is making a complete backup (safety first), then pruning out a BUNCH of the old data. First I'm going to try and simply remove old rows where a person's points did not change, simply their rank (i.e. inactive users moving down the list). Past that I will be pruning out data to keep a maximum of one record per-day per person (i.e. reducing potentially from 8 records a day to 1). Finally, if the table is still too crazy big I'll probably just lop off the oldest data first until I can get it back down to something manageable. Realistically you shouldn't notice any change in numbers on the pages.
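
The first two pruning passes described above can be sketched in a few lines of Python. This works on an in-memory list for a single user; the tuple layout, the function name, and the choice to keep the last row of each day (rather than the first) are assumptions for illustration, not the actual queries used.

```python
from datetime import datetime


def prune_history(rows):
    """Apply the first two pruning passes to one user's history.

    `rows` is a list of (timestamp, points, rank) tuples, oldest first.
    Pass 1 drops rows whose points match the previous kept row
    (rank-only changes). Pass 2 keeps only the last row of each day.
    """
    # Pass 1: drop rank-only changes.
    active = []
    for row in rows:
        if not active or row[1] != active[-1][1]:
            active.append(row)

    # Pass 2: at most one row per calendar day (the latest one wins).
    per_day = {}
    for row in active:
        per_day[row[0].date()] = row
    return [per_day[day] for day in sorted(per_day)]


rows = [
    (datetime(2011, 11, 1, 0), 100, 10),
    (datetime(2011, 11, 1, 3), 100, 11),   # rank-only change
    (datetime(2011, 11, 1, 6), 150, 9),
    (datetime(2011, 11, 1, 9), 200, 8),
    (datetime(2011, 11, 2, 0), 200, 9),    # rank-only change
    (datetime(2011, 11, 2, 3), 250, 7),
]
for row in prune_history(rows):
    print(row)
```

In this made-up example six rows shrink to two, one per day, and the latest point totals for each day are preserved.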

I'm sure I'm forgetting something else, if I remember what it is I'll be sure to make another post.

Sorry for no updates in a while! - 06.18.11, 4:11pm CST

How time flies when you are having fun... NOT! Anyhow, I know I haven't made any posts in a while and I apologize for that.

I'm sure some people noticed the site was down for a bit yesterday, there was a little hiccup (pretty sure it was my fault) but everything is okay now.

I've been experimenting on my dev server with the latest MySQL 5.5 and partitioning. The good news is I can partition the existing tables, the bad news is doing an upgrade to the latest MySQL will require a complete dump / import of the data.

Partitioning will allow the data (and indexes) to be physically split into separate database files on the back end. This translates into faster updates, and also I will probably merge the old archive tables so that people can get their full history (assuming they have been folding since back in like 2004).
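
To show why splitting a table by date range helps, here is a toy Python model of range partitioning. It is not how MySQL implements it (there the partitions are declared on the table and handled by the server), and the monthly granularity, class name, and method names are all assumptions for the sketch.

```python
from collections import defaultdict
from datetime import date


class MonthlyPartitions:
    """Toy model of a history table range-partitioned by month."""

    def __init__(self):
        self._parts = defaultdict(list)   # (year, month) -> list of rows

    def insert(self, day, user_id, points):
        self._parts[(day.year, day.month)].append((day, user_id, points))

    def query(self, start, end):
        """Return (rows with start <= day <= end, number of partitions scanned)."""
        wanted = [key for key in self._parts
                  if (start.year, start.month) <= key <= (end.year, end.month)]
        rows = [row for key in sorted(wanted) for row in self._parts[key]
                if start <= row[0] <= end]
        return rows, len(wanted)

    def drop_before(self, year, month):
        """Drop whole partitions older than the given month; returns rows removed."""
        old = [key for key in self._parts if key < (year, month)]
        return sum(len(self._parts.pop(key)) for key in old)


table = MonthlyPartitions()
table.insert(date(2011, 1, 15), 1, 100)
table.insert(date(2011, 2, 10), 1, 150)
table.insert(date(2011, 2, 20), 2, 300)
table.insert(date(2011, 3, 5), 1, 200)

rows, scanned = table.query(date(2011, 2, 1), date(2011, 2, 28))
print(len(rows), "rows found,", scanned, "partition scanned")
print(table.drop_before(2011, 2), "old row dropped")
```

A query for one month only has to look at that month's bucket, and old data can be discarded a whole bucket at a time instead of deleting row by row.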

I know I've been talking about a new version of the stats for a while, but I've just had zero free time. It's all in my head: I can visualize exactly how I want the database to look and how everything is going to be divided and processed and such. Sitting down and actually coding it is where I'm at a loss.

Besides that not a lot to report. I know Stanford has been working diligently on new clients. I've read a lot about the v7 but have not personally tried it out yet. Temperatures here in Houston have been hitting 100F+ for the highs which is ridiculous this early in the summer. To top it off we have had very little rain in the past four months (if not longer)... Maybe like 1/10" a month... that ain't much! Hot and dry means the folding clients get powered down until we get a break in the weather. ;)

