23 December 2018

23rd of December 2018

Farm status
Intel GPUs
Running Einstein O1OD1 work overnight

Nvidia GPUs
Off

Raspberry Pis
Running Einstein BRP4 work


Away
I had a couple of weeks off so things were powered down while I was away. Since I returned, Sydney has had a couple of heavy hail storms that fortunately haven't broken anything of mine. The hail storms seem to have cleared the hot weather so I am able to continue crunching for the moment.


Projects
I have a couple of little projects to finish off this year. Firstly I need to finish off the Pi Cluster, which means I need to buy more Pi3s.

The second one is to replace the proxy server, which also means buying some more hardware. It's a 4th generation Intel i5 and Intel are currently up to the 9th generation, which gives you an idea how old it is. While it doesn't impact the crunching, it's running 24/7 so reliability is everything.

There are a number of things on the "to do" list for next year. Technology advances and hardware needs updating. I try to replace the hardware over a couple of years so it doesn't fall too far behind.

25 November 2018

25th of November

Farm status
Intel GPUs
All running Einstein O1OD1 work overnight

Nvidia GPUs
Two running Einstein O1OD1 work overnight

Raspberry Pis
Twelve running Einstein BRP4 work


Increased storage capacity
I purchased 3 WD Red 8TB disks to put into the Drobo (it's used to back up the storage server). This brings the Drobo up to 16TB of usable space.

At the same time I purchased 3 Toshiba 8TB disks to put in the storage server. I have a new disk controller on order that supports up to 8 drives and will swap the drives around when installing it. The idea is to have a vdev of 3 drives in raidz1 which will give single drive redundancy and more than 16TB of disk space due to compression. I can then add more drives as needed.
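
As a rough check on the raidz1 plan, the sketch below estimates the usable space of a single vdev. It ignores ZFS metadata overhead and the TB versus TiB difference, and the compression ratio is a made-up figure for illustration only, not something measured on this server.

```python
def raidz_usable_tb(drives: int, drive_tb: float, parity: int = 1) -> float:
    """Rough usable space of one raidz vdev: total minus the parity drives' share.

    Ignores ZFS metadata, padding and the TB vs TiB difference.
    """
    if drives <= parity:
        raise ValueError("a raidz vdev needs more drives than parity devices")
    return (drives - parity) * drive_tb


usable = raidz_usable_tb(drives=3, drive_tb=8.0)   # the planned 3 x 8TB raidz1 vdev
assumed_compression = 1.2                           # hypothetical ratio, not a measurement
print(f"usable before compression: {usable} TB")
print(f"effective with assumed compression: {usable * assumed_compression:.1f} TB")
```

With one drive's worth of space going to parity, 3 x 8TB leaves about 16TB, and anything above that depends on how well the data compresses.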


Disk performance
While doing some online research I came across the gnome-disk-utility that can benchmark disk performance so I let it loose on a few different drives that I have. Interestingly the Seagate expansion drives had a high transfer speed and the SSHD came in the slowest for transfer speed and 2nd slowest for seek time. I haven't benchmarked the NVMe SSD yet.

Drive model           | Type      | Interface | Ave read (MB/sec) | Ave seek (msec)
Samsung 850 Pro       | 2.5" SSD  | SATA      | 521.7             | 0.05
WD10j31x              | 2.5" SSHD | SATA      | 89.8              | 17.31
WD2002FAEX            | 3.5" HDD  | SATA      | 124.2             | 12.78
WD4000F9YZ            | 3.5" HDD  | SATA      | 137.3             | 14.32
Seagate expansion 3TB | 3.5" HDD  | USB 3     | 158.8             | 17.51
Seagate expansion 2TB | 3.5" HDD  | USB 3     | 139.6             | 15.42

11 November 2018

11th of the 11th (month)

Farm status
Intel GPUs
All have been running Einstein O1OD1 work overnight

Nvidia GPUs
All off

Raspberry Pis
Twelve running Einstein BRP4 work


Other news
This week has been all about the file server and ZFS.

The Samsung 970 Pro SSD arrived along with the Silverstone ECM21 adapter card. The only gripe I have with it is they didn't include a screw to hold the SSD in place but there are holes for it on the card. Cheapskates. I don't have any in my stockpile that fit so I ended up using a cable tie. I have since gone onto eBay and ordered a packet of 12 screws which I hope are the right size.

Installation is straightforward: no drivers are needed, and Linux saw it as a disk device and gave it an nvme device name. I just used gparted to create a partition table and format it. My original idea was to partition it into 2 drives and have one as a ZFS intent log drive (aka ZIL) and the other partition as a cache drive, however ZFS likes to deal in drives rather than partitions so it refused to mount the 2nd partition. I tried it both as a ZIL and as a cache drive. A ZIL doesn't need to be big: about 8GB is enough even for a 10Gbe network, as it only needs to hold 5 seconds worth of writes.
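
The 8GB figure can be sanity-checked with a little arithmetic. This is only a sketch: it assumes the network link is the bottleneck, uses 8 bits per byte and ignores protocol overhead, so it gives an upper bound rather than a sizing rule.

```python
def zil_size_gb(link_gbit_per_sec: float, seconds: float = 5.0) -> float:
    """Upper bound on the data that can arrive over the link in the window, in GB."""
    gb_per_sec = link_gbit_per_sec / 8   # 8 bits per byte, no protocol overhead
    return gb_per_sec * seconds


for link in (1, 10):
    print(f"{link}Gbe link: up to {zil_size_gb(link)} GB in 5 seconds")
```

A saturated 10Gbe link delivers at most 6.25GB in 5 seconds, which fits inside 8GB with some room to spare.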

I tested by dragging a 3.7GB test folder on and off my Drobo 5N2. I deleted it from the destination before running each test.

The Drobo has 2 x 1Gbe ports bonded together, 4 x 4TB WD red hard disks and a 256GB mSATA SSD. The storage server has a 10Gbe NIC, 4 x 4TB WD SE (enterprise grade) hard drives plugged into an Intel controller in JBOD mode. They're both plugged into the same Netgear switch. One 10Gbe port on the switch was an uplink to a 10Gbe switch which in turn has a 1Gbe link to the router. The other 10Gbe port was plugged into the storage server. The Drobo was plugged into two of the 1Gbe ports. That makes the theoretical maximum speed 1 gigabit/sec or 1Gbe ie the speed of the slowest device in the loop. Assuming a 10 bit signal (start bit, 8 bits data and a stop bit) that means we should be able to push 100MB (that's right megabytes) a second.

The best I could get was 49MB/sec, which drops down to around 20MB/sec before climbing back up, probably because the SSD on the Drobo is buffering data. That was going from the Drobo to the storage server. Setting it up as a cache drive or ZIL didn't really make any difference to this regardless of which direction I was copying the files. As far as I understand it the ZIL is hardly used due to the asynchronous writes and the cache is only useful for data being read for a 2nd or subsequent time. If I had lots of users or random reads happening then it possibly would have made a difference, but for copying stuff in and out it doesn't make any difference in the speed. Ideally I should have used two devices that can do 10Gbe so as to remove the network speed from the equation.
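
To put the 49MB/sec in context, here is the same back-of-envelope sum as code. It keeps the 10 bits per byte assumption used for the 100MB/sec estimate; the function names are only for illustration.

```python
def link_mb_per_sec(gbit_per_sec: float, bits_per_byte: int = 10) -> float:
    """Payload rate in MB/sec, given how many bits on the wire one byte costs."""
    return gbit_per_sec * 1000 / bits_per_byte


def copy_seconds(size_gb: float, mb_per_sec: float) -> float:
    """Time to move size_gb of data at a steady mb_per_sec."""
    return size_gb * 1000 / mb_per_sec


theoretical = link_mb_per_sec(1)   # 1 gigabit/sec with the 10-bit assumption
measured = 49.0                    # best observed rate, Drobo to storage server
print(f"measured is {measured / theoretical:.0%} of the theoretical rate")
print(f"3.7GB folder: about {copy_seconds(3.7, measured):.0f} s measured "
      f"vs {copy_seconds(3.7, theoretical):.0f} s theoretical")
```

Passing `bits_per_byte=8` gives the more optimistic 125MB/sec figure, which makes the measured rate look worse still.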

I know in the reviews they recommend the Intel Optane 900P as a ZIL device but they're AUD $529 for the 280GB model, compared to the $250 that I spent on the Silverstone adapter and Samsung 970 Pro (512GB). That's double the price for half the capacity. You really need two devices for the ZIL to ensure data integrity as well.

I really liked Linus Tech Tips' NVMe-based server (a Supermicro SSG 2028R I believe) connected to his hard disk based server, which sounds cool. Just connect them by that 40Gbe Mellanox controller and a short bit of fibre optic cable between the two and move the files off the fast storage to the slower one at lightning speed, or at least as fast as the hard disks can go. Maybe I should just buy an SSG 2028R...

If anyone has any suggestions on how to optimise it without throwing too much more hardware at it let me know. As always I'm happy to hear any comments.

03 November 2018

3rd of November

Farm status
Intel GPUs
All off

Nvidia GPUs
All off

Raspberry Pis
12 crunching Einstein BRP4 work


File server
I’ve converted it over to Linux and that is when the problems started. The RAID controller wasn’t recognised, even in JBOD mode, so I removed it and plugged the drives directly into the motherboard SATA ports. I set up ZFS on the 4 drives without any issues. When I copied the files back it took ages and was indicating it was only doing 75MB/sec write speed, which I think is pretty slow. Maybe the ZFS overheads are such that this is a good figure.

I managed to get the RAID controller going but write speeds are still 75MB/sec. I have ordered a PCIe to NVMe adapter and an NVMe SSD to use as a cache and logging drive. NVMe SSDs are somewhat faster than SATA ones, which only go up to 6 gigabit/sec. Hopefully this will speed things up. I'm not sure if I need a second one (ie one for cache and another for logging).


Networking updated
I ordered and received a couple of 10Gbit network switches (the ones with 2 x 10Gbe and 8 x 1Gbe) and put them in. I also got a full 8 x 10Gbe switch to plug into the router so that I can get to use this higher speed down to the smaller switches and into the file server. I still have an extra 10Gbe network card that I could put into the file server.


New LIGO search
Einstein have a new search they are conducting on the LIGO O1 data. It's CPU only. I did some testing to see how it performs on the Intel GPU machines using all available threads versus using half of them. It certainly does more work using all threads.

I started doing the same on the AMD machines but after the first batch running on half the threads there isn't any new work available. Being 1st generation Ryzens they are pretty poor on the hyperthreading front, whereas Intel have had quite a few generations to optimise their designs.

21 October 2018

21st of October

Farm status
Intel GPUs
Six i7-8700's running Asteroids and Seti

Nvidia GPUs
Off

Raspberry Pis
12 running Einstein BRP4 work


Other news
I got a PC assembler out and have now got all six of the i7-8700's swapped in. The i7-6700's that I used before are now decommissioned. That brings the Intel GPU part of the farm up to 36 cores/72 threads. They're slower than the 6th generation but due to the increased core count still produce more work.

With the new machines I also had to install Linux, which is easy enough to do when I follow my own step-by-step instructions. I then discovered a slight problem with the way I set up BOINC on them so had to reinstall it on the existing machines. I am now working around a quirk of the Asteroids project server giving some machines the SSE2 or SSE3 app which is slower than the AVX version.

The ECC memory for the file server arrived and has been put into it. There is no obvious benefit at the moment until I get it running Linux and ZFS. That is the next step for it. I may get an SSD to use as a cache drive. I did use one of the other machines and a couple of external USB drives to practice getting ZFS running. Things left to try before I convert it are setting up Windows file shares using Samba and getting the UPS recognised.


The new router/gateway device also showed up so that is something else for me to work on.

13 October 2018

13th of October

Farm status
Intel GPUs
Two i7-8700’s running Seti
Three i7-6700’s running Asteroids and Seti

Nvidia GPUs
Two running GPUgrid and Seti

Raspberry Pis
Twelve running Einstein BRP4 work


Supermicro BIOS update
I haven’t updated the Supermicro BIOS since I got it. It came with BIOS version 1 and they are up to version 3 now. I decided it’s about time it got updated due to all the Intel Spectre patching that is happening and downloaded the latest from their website. The instructions say to put it on a DOS boot disk and run it. How to create a DOS boot disk in this day and age? I have a USB floppy drive and MS-DOS 6 upgrade diskettes, but even that doesn’t work.

After a lot of googling and coming up with suggestions that don’t work I ran across one that actually worked, using Rufus to create it. After that it was fairly straightforward apart from the couple of times when the machine rebooted and appeared to have died (it powers off), but leaving it to do its thing it got there. I expected a reboot after applying the BIOS but not two while booting up again.


Load balanced internet
After last week's issues, when the Telstra internet connection wasn't working to US destinations but the other one was, I had a bit of a look at load balancers. It seems they are usually combined with a network firewall appliance so I might get one of those. I don't think the firewall built into my current routers is particularly effective so it should also improve network security.

I need to get the Telstra internet connection changed back to ADSL and get another modem to get it going. Personally I would like to get rid of the home phone which would solve the problem of scam phone calls from overseas, but the wife wants to keep it. This would all change if/when the NBN comes around as there is no point in having two phone lines when NBN offer speeds up to 100Mbit on a single connection. They don't think they'll be doing my area until 2019.


File server changes
One of the other projects I have is to replace the file server, however I am looking at reusing the existing one running under some derivative of Linux in order to move away from Microsoft products. That was one of the reasons to update the BIOS. I plan on setting the RAID controller to JBOD mode and then using ZFS for its file system, however the server doesn’t have ECC memory which is recommended for ZFS file systems. It can take it but I didn’t buy it at the time due to cost.

06 October 2018

6th of October

Farm status
Intel GPUs
Two i7-8700 running Seti
Three i7-6700 running Seti

Nvidia GPUs
Two running Seti, were doing GPUgrid

Raspberry Pis
All running Einstein BRP4 work


Other news
Bloomberg reported on Supermicro computers being hardware hacked at the factory. If the reports turn out to be true then it's a problem for me as I have two Supermicro machines as storage servers. Okay, one is still in its box but I have been using the other one for a few years now.


Comms Issues
I had some communications issues with Telstra (one of the largest ISPs in Australia). I have been unable to upload to or download from US sites for the last 3 days. European sites were unaffected. It seems there was a problem with the undersea cable connecting us to the rest of the world. The Raspberry Pis are on a separate internet link using a different ISP and they also were unaffected. I probably need a load balancer so I can share both internet connections and to provide failover capability. That will be another little project to do.

30 September 2018

30th of September

Farm status
Intel GPUs
Two i7-8700's running Seti
Six i7-6700's off

Nvidia GPUs
Off

Raspberry Pis
All running Einstein BRP4 work.


Networking
I ordered two of the 10GbE switches which turned out to be the wrong model. I thought I ordered the Netgear GS110EMX but instead I got the MX version which is unmanaged. I might as well have saved myself some money and got the ASUS, which was cheaper, as they have the same features apart from the warranty.

I did a quick test using the Drobo5N2 and file server and I managed to only get 52% of the theoretical 2GbE (ie two 1GbE ports) that I was expecting so it hasn't provided any speed improvement over the 1GbE switch that I had before. While I can get a 10GbE network card and plug that into the file server I can't do anything with the Drobo's networking.
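
Put another way, 52% of a bonded pair is roughly what a single 1GbE link can carry on its own. The sketch below just does that conversion. One possible explanation is that a single copy only ever travels over one link of the bond, but that is an assumption and not something tested here.

```python
def achieved_gbit(links: int, link_gbit: float, fraction: float) -> float:
    """Throughput in gigabit/sec, given a fraction of the bonded theoretical rate."""
    return links * link_gbit * fraction


bonded = achieved_gbit(links=2, link_gbit=1.0, fraction=0.52)
single_link_equivalent = bonded / 1.0   # how many 1GbE links' worth that is
print(f"{bonded:.2f} gigabit/sec, about {single_link_equivalent:.2f} of one 1GbE link")
```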

In other news we had some warmer weather so most of the farm was off and besides that I have also been spending some time working on the Rpi cluster - see marksrpicluster.blogspot.com for details.

23 September 2018

23rd of September

Farm status
Intel GPUs
All running Seti overnight

Nvidia GPUs
Off

Raspberry Pis
All running Einstein BRP4 work


Networking upgrades
I am looking at upgrading the internal network to 10 gigabit. It's a toss-up between a cheaper unmanaged switch, the ASUS XG-2008, which gives 2 x 10GbE plus 8 x 1GbE for $349 AUD, and a Netgear smart-managed switch which gives the same ports but costs $90 more.

I currently have one device with a 10GbE network port in it and a few with dual 1GbE network ports that are teamed/bonded together. To get link aggregation (sometimes called port trunking) you need a managed switch. In addition to that I would also need to get a full 10GbE switch to plug the other ones into and they are around $930 AUD for the XS708T (8 x 10GbE) or $1800 for the XS716T (16 x 10GbE). There are slightly cheaper models such as the XS708E ($690) and XS716E ($1460) which have fewer features. The Netgear ones come with a lifetime warranty unlike the ASUS which only has the statutory 1 year warranty.

01 September 2018

1st of September

Farm status
Intel GPUs
3 x i7-6700 running Seti and occasional Asteroids
2 x i7-8700 running Seti and occasional Asteroids

Nvidia GPUs
2 x Ryzen 1700 running Seti

Raspberry Pis
All running Einstein BRP4 work


Other news
Nvidia launched a new GPU chip family called Turing and also the top of the line GPUs, now designated RTX-2080, RTX-2080Ti and RTX-2070. I have four GTX 1060s so their logical replacement would be an RTX-2060, however they haven’t even announced them yet so I may be looking to replace them closer to Christmas.

I got a kernel update (4.17.15) for all the machines. It promptly broke the one machine with bonded network ports. I’ve raised a bug for that and have to run an older kernel on that machine until it’s fixed. The other machines with single network ports are fine.

19 August 2018

19th of August

Farm status
Intel GPUs
2 x i7-8700 doing Asteroids and Seti work

Nvidia GPUs
2 x Ryzen 1700 doing Asteroids and Seti

Marks Rpi Cluster
All doing Einstein BRP4 work


Other news
Einstein discovered an issue with the preparation of the data for the gravity wave search. They think there may be an issue with the de-jittering they apply before we crunch the work units. The current search is about 75% completed. I have stopped running Einstein gravity wave work while the project determines what to do next.

15 July 2018

15th of July

Farm status
Intel GPUs
Three i7-6700’s running Einstein gravity wave work
Two i7-8700’s running Asteroids and Seti work

Nvidia GPUs
Two running Seti work

Raspberry Pis
All running Einstein BRP4 work


Other news
Asteroids ran out of disk space last week so I wasn’t able to upload completed work. They took the project down for a couple of days so they could copy the data across to a temporary location. There are reports they’ve run out of space again today. They have a storage server coming but when it will be available is not known.

CPDN aka Climate Prediction came back last week after being unavailable for 3 months. Their virtual machine images had all been corrupted so they were forced to restore back to March and rebuild many components. They don’t have much work available and tasks that were in progress before are no longer recognised. I don’t have any machines attached to the project as they don’t have 64 bit science apps (32 bit apps need extra non-standard libraries to run under Linux). I’d like them to build 64 bit apps as that would mean we don’t need the other libraries. OSX meanwhile is going 64 bit only so hopefully we can get some action on this front.


BOINC 7.12 released
The latest version of BOINC was released. The main changes are to support Science United which is like an account manager for BOINC. It allows computer clusters to let the public run some of their tasks. For more information see https://scienceunited.org/

The Linux build is yet to reach stretch-backports so I haven’t given it a try. There is already a 7.12.1 which hasn’t even made it into Debian.

07 July 2018

7th of July

Farm status
Intel GPUs
3 running Einstein gravity wave work
2 running Asteroids and Seti work

Nvidia GPUs
All off

Raspberry Pis
All running Einstein BRP4 work


Hardware updates
I got another of the i7-8700’s installed this week. It replaces an i7-6700. I ended up taking it to a local PC shop. They didn’t do a good job. They left the front case fan unplugged and the power wiring for the motherboard wasn’t passed through the back as it’s supposed to be. I fixed the case fan up myself; it just needed the two rear fans to be on a Y splitter so there was a fan header free on the motherboard for the front case fan. When I feel inclined I will fix up the power wiring. Meanwhile it’s crunching away.

The plan is to replace my eight i7-6700’s (4 cores/8 threads) with six i7-8700’s (6 cores/12 threads). That grows the core count while reducing the number of physical machines. Two have been swapped out already. The rest just need an on-site PC assembler.
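
The core and thread totals behind that plan work out as follows. This is just the arithmetic from the paragraph above; the `Fleet` class is a throwaway name for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    """A group of identical machines."""
    model: str
    machines: int
    cores_each: int
    threads_each: int

    @property
    def cores(self) -> int:
        return self.machines * self.cores_each

    @property
    def threads(self) -> int:
        return self.machines * self.threads_each


old = Fleet("i7-6700", machines=8, cores_each=4, threads_each=8)
new = Fleet("i7-8700", machines=6, cores_each=6, threads_each=12)
for fleet in (old, new):
    print(f"{fleet.machines} x {fleet.model}: {fleet.cores} cores / {fleet.threads} threads")
```

So the swap trades two physical machines for four extra cores and eight extra threads.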


Software updates
We got an updated Linux kernel in stretch-backports. It installed fine on one of the Intel GPU machines. When I went to install it on one of the Nvidia GPU machines, however, it wouldn’t boot. It left a blank screen and I couldn’t ssh into it. I had to boot using the older kernel. There is a Debian bug raised for it (901919). They have since resolved it by patching the Nvidia drivers, but they haven’t yet made it to stretch-backports.

24 June 2018

24th of June

Farm status
Intel GPUs
Four i7-6700's running Einstein gravity wave work
One i7-8700 running Seti work

Nvidia GPUs
Two running bursts of GPUgrid with Seti

Raspberry Pis
All running Einstein BRP4 work


Other news
I have 3 of the i7-6700's off while I am running a couple of the Nvidia GPU machines. The sun is out so it's warm during the day. I let the Nvidia GPUs idle during the day but nights are cool so they run overnight. The remaining i7's are crunching 24/7.


Power9 CPU
I had a look at the IBM AC922. They sport dual Power9 CPUs with up to 22 cores/88 threads. They can also be fitted with 2-6 Tesla V100 GPUs (way too expensive for me). They are an ideal number cruncher. They are installed in a number of computer clusters such as Summit at the Oak Ridge National Lab and Sierra at the Lawrence Livermore National Lab.

There are a couple of other companies also selling Power9 based computers such as Raptor Computing which have a Talos II (dual CPU machine) and a Talos II lite (single CPU machine) that is more affordable at $1399 (USD) without CPU or memory.

I'd love to get one or two of the AC922's. Even if they don't have Teslas in them they'd make a great cruncher. Sadly while they run Linux out of the box and there is a BOINC client for them in Debian I would have to get various science apps and recompile them for the PPC64LE architecture and optimise them. That is something I don't have the expertise to tackle.

16 June 2018

16th of June

Farm status
Intel GPUs
Running Einstein gravity wave work

Nvidia GPUs
Running GPUgrid plus Seti

Raspberry Pis
Running Einstein BRP4 work


Other news
Einstein gravity wave work has been hard to get recently. They have 2.9 million work units left to complete in the current search but there don’t seem to be many ready to send on the project server. That means my computers go idle due to lack of work. Yesterday evening I had one of the i7’s request work for over an hour and each time it got none.

Intel still haven’t managed to get their Neo drivers into Debian. This driver replaces the Beignet package on the 8th generation or later CPUs. It’s available on GitHub but is yet to make it into a package.


Outstanding things
I still need to sort out the i7-8700’s. I just need a PC installer to assemble them and I could then swap out the 6th generation ones. I might have to resort to taking them down to a nearby PC shop and getting them assembled.

I’m not sure how to get a 10Gbe network going. Sure I can get a couple of switches with 10Gbe (ASUS have a fairly cheap one) but the routers I am using don’t have 10Gbe capability which means they would need to be replaced. Most of the network cabling I have is Cat6 and fairly short which is good. The 10Gbe routers however are expensive. Which one to get and how to hook it up to the ADSL are my problems. I need to find a networking guru to consult.

The other thing on my list is to see if I can get HTcondor going and run BOINC as a backfill for the cluster (just like the real clusters do). My experiments using the Raspberry Pis didn’t work out so I need a guru who also knows HTcondor.

27 May 2018

27th of May

Farm status
Intel GPUs
Five running Asteroids, Einstein and Seti

Nvidia GPUs
Two running overnight doing GPUgrid and Seti

Raspberry Pis
All running Einstein BRP4 work


Other news
I managed to pick up some GPU work from GPUgrid. They’ve been concentrating on their multi-core CPU app and GPU work has been in short supply. This time I got some short and long work units which have been running fine. This exposed a problem with my app_config file that wasn’t working for the short work units. I use an app_config file to allocate a whole CPU thread to their GPU work units which makes them run quicker. I resolved the issue with it.
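
For anyone wanting to try the same thing, below is a minimal sketch that writes out an app_config in the layout BOINC expects, with one whole CPU thread reserved per GPU task. The app names `acemdshort` and `acemdlong` are assumptions for illustration; the real names should be taken from the project's applications page or the client's own files. The post doesn't say what the actual fault was, so this only shows one entry per app, which is one way the short work units could have been missed.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_names, gpu_usage=1.0, cpu_usage=1.0):
    """Return app_config XML giving each listed app its GPU and CPU share."""
    root = ET.Element("app_config")
    for name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        versions = ET.SubElement(app, "gpu_versions")
        ET.SubElement(versions, "gpu_usage").text = str(gpu_usage)   # GPUs per task
        ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)   # CPU threads per task
    return ET.tostring(root, encoding="unicode")


# Hypothetical app names: one entry for the short queue and one for the long queue.
xml_text = build_app_config(["acemdshort", "acemdlong"])
print(xml_text)
```

The output would be saved as app_config.xml in the project's folder under the BOINC data directory, after which the client needs to re-read its config files.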

Asteroids and Einstein both passed 50 million credits. Asteroids gave another badge as a result. I have been running Einstein overnight on the i7-6700’s. It sometimes takes a few goes to get the gravity wave work and I have to manually intervene until they’ve got tasks.

I was concentrating on Seti work to keep it ahead of Asteroids and Einstein, however they ran out of work units this weekend so it’s back to Asteroids for a bit. Seti gives less credit than the other two, therefore it takes more processing to keep the credit scores aligned.

The Bramble increased to 12 crunchers and I used all 3 of the Mk II Pi^4 cases I had printed off. See MarksRpiCluster for details.

12 May 2018

12th of May

Farm status
Intel GPUs
All running Einstein gravity wave work

Nvidia GPUs
Off

Raspberry Pis
All running Einstein BRP4 work


Other news
Winter weather has arrived allowing numerous machines to run constantly. I’ve been concentrating on Einstein work but have the Intels running down so the Nvidia GPUs can be used. Due to the power available (domestic grade power circuits) I can’t run all of them at once, even if the weather allows me to.

Additional USB chargers and some Y split fan header cables arrived, as did three Pi3 model B+. I have swapped the NFS server over. One has been put into service crunching, that gives 11 compute nodes for the moment. I am waiting for the L shaped power cables to be able to get the remaining one into my third Pi^4 case.

I haven’t worked out what to do with the remaining Pi2 and Pi3 model B’s, but I do have the prototype Pi^4 case with the 40mm fans that I could use to get some of the Pi3’s going. The Pi2’s don’t need active cooling so can run with the top off their case. I will need more network cables and SD cards to be able to use these. I really should have another look at NFS booting them. Or I could give the Pi2’s away.


BOINC testing
We’re testing 7.10.2 at the moment which is a release candidate. It looks like it will finally fix the BOINC event log (aka messages) stuffing up the time format under Linux. It also moves the boinc data directory in Linux to be in /var/lib/boinc with a symlink to the old one at /var/lib/boinc-client.

25 April 2018

25th of April

Farm status
Intel GPUs
Running Einstein and Seti work

Nvidia GPUs
Two running Seti work

Raspberry Pis
All running Einstein BRP4 work


Other news
I have been able to run some bursts of Seti work and Einstein overnight (their tasks take 9 to 10 hours). Einstein has started processing their 2nd observation run of Ligo gravity waves so that will keep things busy for 3 or 4 months. They have access to the Atlas cluster as well as what additional processing power the rest of us add.

AMD released the Zen+ CPUs, which use a slightly smaller process size (12 nanometre). That has allowed them to increase the speed of their CPUs while using the same power as before. They have also tweaked the design a bit to improve the cache hit rates. I don’t think it’s worthwhile upgrading just for a speed increase.

I still haven’t managed to swap out the i7-6700’s for the i7-8700’s as my PC installer guy seems to have disappeared. Time to find another one I think. At the moment I have one i7-8700 running and seven i7-6700’s which are a bit faster but have fewer cores. There are another five i7-8700’s sitting in boxes.

The Nvidia GPU machines got an updated driver so I have installed it on one machine. The last time they pushed an updated driver out it kept dropping the GPU into low power mode so I am a bit wary of updated drivers as it’s difficult to get back to the previous version.

08 April 2018

8th of April

Farm status
Intel GPUs
All off

Nvidia GPUs
All off

Raspberry Pis
All running Einstein BRP4 work


Other news
This last fortnight has been all about the Raspberry Pis. It’s still too hot to be running the other machines so I have been concentrating on the little ones.

First off was the arrival of the 11 Pi3 model B+ and swapping out the Pi3 model B’s. First problem was a lack of heatsinks. I put as many into service as I could (5 of them) and ordered more heatsinks. Once heatsinks arrived I then decided I would use new SD cards rather than reusing the ones from the older Pis. A trip to the shops fixed that. Then a late night imaging a bunch of SD cards and firing up each Pi3B+ and installing the software.

Because I now had a bunch of spare Pi3 model B’s I decided I would use one of them as an NFS server in conjunction with the PiDrive that wasn’t doing anything. That made life a lot easier as I can now just copy various config files from it into the appropriate directories instead of what I used to do (manually edit files and cut and paste). I know I tried setting up an NFS server a couple of years ago but it wasn’t reliable. This time it seems a lot better.

At the moment I have upgraded 9 out of 10 compute nodes and one support node. I have one more compute node left to swap over that is finishing off the work it has, which takes around 11 hours.

I looked at the 3rd Pi^4 case that I had and thought why not put the two other compute nodes, currently in official Pi cases, into the Pi^4 case and get another two Pis. And while I am at it let’s replace the Pi3B that is running the NFS with a 3B+ as well. I can feel the need to order more parts.

I broke a stand-off in one of the Pi^4 cases due to the screw holding the Pi3B in getting stuck. The head of the screw was stripped so the screwdriver couldn’t get a grip. In the end I had to deliberately break it to get the old Pi out. The M2.5 screws are so tiny and the metal isn’t hard so it’s easy to strip the head on them. It took me half an hour just to get the piece of stand-off and screw separated. Needless to say that screw got thrown away. I will have to glue the stand-off into the case now.


HT Condor
I have been using the freed-up Pi3B’s to experiment a bit with HT Condor. It’s the software they run on a real cluster for scheduling batch jobs and it’s available in the Raspbian and Debian repositories. The HT stands for High Throughput. All was going fine until I enabled the firewall. After that I can’t get the components to talk to each other so I am trying to resolve that.

A number of compute clusters run HT Condor and have BOINC as a backfill task, that is if the cluster doesn’t have anything else to run it will start up a single instance of BOINC for each available core on each compute node. I don’t think that’s going to work too well with the Pis due to the lack of memory, however it should work on the larger machines which don’t have the memory constraints.

17 March 2018

17th of March

Farm status
Intel GPUs
All off

Nvidia GPUs
All off. Ran Einstein gravity wave work during the last week

Raspberry Pis
The ones with fans running Einstein BRP4 work


Debian point release
Stretch had a point release 9.4 so a few updates came out of it, plus various security fixes during the week. Raspbian followed the day after with the same fixes but also updated firmware to support the new Pi.


New Raspberry Pi
The Raspberry Pi foundation released a new Pi called the Pi 3 model B+. It has a faster CPU (1.4GHz now), the same memory (1GB of DDR2) and better WiFi and networking. I promptly ordered 11 to replace my current farm before the distributor ran out of stock. They arrived yesterday.

I have swapped out four Pi3 crunchers and the support Pi3 already with Pi3B+ models. I only had 5 spare copper heatsinks so I need to order more.

I also got another 5 port USB charger which I use to power the Pis. It turns out it’s only rated to supply 3 x 2 amp plus 2 x 1 amp even though I have one running 4 Pi3’s in a Pi^4 (Pi to the power of four) case plus the fans. I might need another charger or two.
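
A quick way to see why that rating matters: with only three 2 amp ports, a fourth Pi has to sit on a 1 amp port. The sketch below pairs devices with ports from the highest rating down. The per-Pi draw is a placeholder figure, not a measurement, and the fans are left out.

```python
def assign_ports(device_draws_a, port_ratings_a):
    """Pair the hungriest device with the highest-rated port, and so on down.

    Returns a list of (draw, rating, within_rating) tuples.
    """
    draws = sorted(device_draws_a, reverse=True)
    ports = sorted(port_ratings_a, reverse=True)
    if len(draws) > len(ports):
        raise ValueError("more devices than ports")
    return [(draw, rating, draw <= rating) for draw, rating in zip(draws, ports)]


PORT_RATINGS_A = [2.0, 2.0, 2.0, 1.0, 1.0]   # 3 x 2 amp plus 2 x 1 amp, as rated
ASSUMED_PI_DRAW_A = 1.2                       # hypothetical peak draw per Pi3, not measured

plan = assign_ports([ASSUMED_PI_DRAW_A] * 4, PORT_RATINGS_A)
for draw, rating, ok in plan:
    print(f"{draw} A device on {rating} A port: {'ok' if ok else 'over rating'}")
print(f"total rated output: {sum(PORT_RATINGS_A)} A")
```

Whether the fourth Pi is really over its port's rating depends on the actual draw under load, which would need measuring.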


Pi^4 case
I have two more Pi^4 cases that are waiting on some M2.5 screws which I had to order off eBay. They are coming from China and take up to 30 days to arrive. I couldn’t find any in the local electronics parts stores or hobby shops.

To make it a bit safer I also have some chrome fan grills on the way but they aren’t holding up the build. Just don’t stick your fingers in the fan. The 60mm fans hurt.

Photos of the Mk I case coming soon - once the screws arrive.

04 March 2018

4th of March

Farm status
Intel GPUs
One running. The rest off.

Nvidia GPUs
Two running, the rest off

Raspberry Pis
All running Einstein BRP4 work


Einstein O2 gravity wave tuning
Weather goes from hot to cool, so in the cool times almost all the farm runs Einstein gravity wave tuning run work. Their first tuning run for the O2 (Observation run 2) had problems so we’re doing it again. They had to issue new apps and new data files.

I find the Ryzen’s aren’t too good at hyper threading this app so I only run 8 at a time. If I run 16 at a time the run time doubles, so nothing is gained. The Intel machines, however, handle it better so I use all available threads on them.


Other news
I still haven’t got the other 5 i7-8700’s assembled. I am still waiting on my PC assembler to return from holidays.

Another annoying thing is Intel have stopped updating Beignet, which provides OpenCL capability for their iGPUs. They apparently had open source drivers (Beignet) and closed source ones. They have now decided to have one set of open source drivers called Neo. Unfortunately it could take a long time before they become available in the Debian repositories.


BOINC testing
BOINC 7.9.2 has made it all the way up to stretch-backports in Debian. It’s also available via locutusofborg’s ppa for Ubuntu and of course from the BOINC download all page. This updates a few things and looks like it might have fixed one annoying bug with the Manager Tasks tab. It still has a problem with the event log losing its time format. This version is to support Science United.

18 February 2018

18th of February

Farm status
Intel GPUs
All off.

Nvidia GPUs
All off

Raspberry Pis
Six with fans running Einstein BRP4 work


Other news
Doing more Einstein gravity wave O2 tuning run work. This time it’s the i7-8700 and a couple of the Ryzen’s. The Ryzen’s seem to take quite a long time with the work units at the higher frequencies, even when running on half the available cores.

I did software updates on all the Pis, however one of them threw seg faults after rebooting. I reimaged the SD card but it was horribly slow. I assumed that meant the SD card was stuffed and reimaged a new one and reinstalled everything. Speeds are much quicker now which seems to confirm my diagnosis.

I had to go over to North Ryde to pick up the parts for the 5 new i7-8700’s. They are currently awaiting assembly. I am hoping my PC assembler will be available soon so I can get them going, especially with the Einstein gravity wave tuning run finishing in a week or so and the actual processing run starting. The hot weather isn’t helping either.

11 February 2018

12th of February

Farm status
Intel GPUs
One i7-8700 running, others off

Nvidia GPUs
One running, the rest off

Raspberry Pis
All running Einstein BRP4 work


Crunching news
The hot humid weather broke for a few days so I had most of the farm running a mix of Asteroids and Einstein work. It got hot again so they’re mostly off now.

Einstein are doing a tuning run for their O2 Gravity wave work so I have been running them on different machines to see how they go. When you look at the times per work unit you can see the i7-8700 is the fastest.

Intel i7-6700 running 8 approx 34,100 sec = 4,262.5 sec/WU
Intel i7-8700 running 12 approx 39,500 sec = 3,291.67 sec/WU
AMD Ryzen 7-1700 running 16 approx 56,000 sec = 3,500 sec/WU
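
The per work unit figures are simply the elapsed time divided by the number of tasks running at once. A minimal Python sketch of that arithmetic, using the approximate times listed above (the function and variable names are just for illustration):

```python
def seconds_per_work_unit(elapsed_seconds, concurrent_tasks):
    """Effective time per work unit when several tasks run side by side."""
    return elapsed_seconds / concurrent_tasks


# Approximate figures from the tuning run: (elapsed seconds, tasks at once)
runs = {
    "Intel i7-6700": (34_100, 8),
    "Intel i7-8700": (39_500, 12),
    "AMD Ryzen 7 1700": (56_000, 16),
}

per_wu = {cpu: seconds_per_work_unit(t, n) for cpu, (t, n) in runs.items()}
fastest = min(per_wu, key=per_wu.get)

for cpu, secs in per_wu.items():
    print(f"{cpu}: {secs:,.2f} sec/WU")
print("Fastest:", fastest)
```

Dividing by the task count is what makes machines with different thread counts comparable. A single task takes longest on the Ryzen, yet its 16 threads bring the per work unit time below the i7-6700.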


Hardware upgrades
I placed the order for another 5 x i7-8700 machines comprising motherboard (ASUS Prime Z370-P), CPU (i7-8700), cooler (Noctua NH-U9S), memory (Kingston 16GB DDR4 2666MHz kit) and one more case (Fractal Design ARC Midi R2). Unfortunately the coolers are out of stock until the 23rd of February.

I am going to reuse four cases that currently have i7-6700’s. I have already got one new machine in a new case running and there is one on this order. The cases appear to be discontinued.

I had hoped to get 24GB of memory in the new builds but Kingston doesn’t appear to have any 8GB (2 x 4GB) memory kits, only the 16GB kits. I wanted to use a 16GB plus an 8GB kit. The price is also rather high at the moment as 2666MHz memory is relatively new.

Updated Ryzen CPU’s are expected to be released in April so that will be the next major upgrade. I think all they will offer is a higher clock speed while still being able to use the existing motherboards and chipsets. I don’t expect extra cores, but maybe they will tweak the cache as well.

28 January 2018

And now it’s a sauna

Farm status
Intel GPUs
All off

Nvidia GPUs
All off

Raspberry Pis
6 with fans running Einstein BRP4 work


Sauna
It’s hot and humid so nothing much is running. The only difference from last week is the humidity has jumped up, but we haven’t had any rain.


i7-8700 build
I got the i7-8700 and it’s been partially set up software-wise. It turns out the ASUS Prime Z370-P has a DVI-D and a DisplayPort built-in. My KVM’s use VGA for the monitor so I have ordered a bunch of adapters. I have DVI-I and even HDMI adapters. I will work on it some more once the adapters arrive later in the week.

I had an issue with the DDR4-2666 memory not running at 2666 (it only wanted to run at 2400MHz) but a BIOS update seems to have taken care of that.

The built-in UHD 630 graphics are not supported properly until the 4.15 kernel (Debian Stretch is on 4.9). The 4.14 kernel which is in stretch-backports has the UHD 630 as alpha-test so you can set the i915.alpha_support=1 kernel parameter to get it recognised. There are also a few start up errors with it but it starts up anyway. I am not sure if they are BIOS or kernel issues.
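
The kernel version rules above can be written down as a small check. This is only a sketch of the thresholds described in this post (4.15 for proper support, 4.14 with the alpha parameter). The function names and status strings are made up for illustration and do not come from any tool.

```python
def parse_kernel(release):
    """Turn a release string such as '4.14.0-0.bpo.3-amd64' into (4, 14)."""
    major, minor = release.split(".")[:2]
    # Keep only the leading digits of the minor part, e.g. '15-rc1' -> '15'
    minor_digits = ""
    for ch in minor:
        if not ch.isdigit():
            break
        minor_digits += ch
    return int(major), int(minor_digits)


def uhd630_status(release):
    """Classify UHD 630 support using the thresholds given in the post."""
    version = parse_kernel(release)
    if version >= (4, 15):
        return "supported"
    if version >= (4, 14):
        return "needs i915.alpha_support=1"
    return "not properly supported"


# Example release strings; the exact suffixes are illustrative.
for release in ("4.9.0-5-amd64", "4.14.0-0.bpo.3-amd64", "4.15.0"):
    print(release, "->", uhd630_status(release))
```

On a running machine the release string is what `uname -r` prints.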

21 January 2018

More scorching

Farm status
Intel GPUs
All off

Nvidia GPUs
All off

Raspberry Pis
Six running.


Hardware purchasing
I jumped on the bandwagon and ordered parts for an i7-8700 build. Only one supplier had them and they are limiting them to 1 per customer. Memory is also rather expensive for the 2666MHz kits so I only got 16GB. Unfortunately two parts are listed as “order only”: the ARC Midi R2 case and the Noctua NH-U9S CPU cooler.

I normally just tell the computer shop what I want and get them to build it, however the usual guy that handles this has left the company. Probably time to find another shop. I couldn’t get my regular power supply (Seasonic G-450) as it appears to be discontinued in favour of higher power models. I settled for Seasonic G-350’s, which are not modular. I expect total power draw to be around 90-100 watts so well within its capabilities (power supplies are most efficient at around 50% load).
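
As a rough check on the power supply sizing, the load is just the expected draw divided by the rated output. A small Python sketch using the 90-100 watt estimate above (an estimate, not a measurement; the names are for illustration only):

```python
def load_fraction(draw_watts, psu_rated_watts):
    """Fraction of the power supply's rated output that is being drawn."""
    return draw_watts / psu_rated_watts


PSU_RATED_WATTS = 350            # Seasonic G-350
expected_draw_watts = (90, 100)  # estimated range from the paragraph above

low, high = (load_fraction(w, PSU_RATED_WATTS) for w in expected_draw_watts)
print(f"Load on a {PSU_RATED_WATTS} W supply: {low:.0%} to {high:.0%}")
```

That works out to a little over a quarter of the rating, so there is plenty of headroom, although it sits below the 50% load point mentioned above.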

My main concern with this build is that Linux won’t recognise the graphics, at least until they update the kernel.

If it works out, the plan is to replace the eight i7-6700’s with six i7-8700’s. Both CPUs are 65 watt parts. The i7-8700 is a 6 core 12 thread CPU so I will end up with more cores with fewer machines.
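
The core count comparison can be sketched as below. The i7-6700 is a 4 core 8 thread part and the i7-8700 a 6 core 12 thread part; the wattage shown is the rated CPU TDP only, not measured power at the wall, and the class name is made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Fleet:
    cpu: str
    machines: int
    cores_per_cpu: int
    threads_per_cpu: int
    tdp_watts: int  # rated CPU TDP, not measured wall power

    @property
    def total_cores(self):
        return self.machines * self.cores_per_cpu

    @property
    def total_threads(self):
        return self.machines * self.threads_per_cpu

    @property
    def total_tdp_watts(self):
        return self.machines * self.tdp_watts


old = Fleet("i7-6700", machines=8, cores_per_cpu=4, threads_per_cpu=8, tdp_watts=65)
new = Fleet("i7-8700", machines=6, cores_per_cpu=6, threads_per_cpu=12, tdp_watts=65)

for fleet in (old, new):
    print(f"{fleet.machines} x {fleet.cpu}: {fleet.total_cores} cores, "
          f"{fleet.total_threads} threads, {fleet.total_tdp_watts} W combined TDP")
```

Six of the newer machines give more cores and threads than eight of the old ones, with two fewer CPUs drawing power.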


Other news
As you would have noted pretty much everything is off due to the hot weather. The only things crunching are the Pis with fans.

07 January 2018

Summer scorcher

Farm status
Intel GPUs
All off

Nvidia GPUs
All off

Raspberry Pis
All running Einstein BRP work


Meltdown and Spectre bugs
Unless you’ve been living under a rock for the last week you could not have failed to hear about the two bugs dubbed Meltdown and Spectre. Meltdown is caused by the CPU doing speculative execution and affects all Intel processors made in the last 10 years as well as some recent ARM processors. Speculative execution is a feature the chip designers add to stop the CPU waiting around for instructions to load, however it also has this nasty side effect.

Yesterday we got patched kernels for the Meltdown bug, both Linux and Windows. The Raspberry Pi’s don’t use the affected ARM processors so they didn’t need kernel updates. I spent some time applying the patches to all computers.

There have already been some lawsuits filed against Intel. There could be some performance impact caused by the patch but I have not been able to tell how much yet. Spectre has not been patched as it’s much harder to counter. It affects AMD, ARM and Intel (and possibly other) CPU brands.

I haven’t seen patches for the Drobo and I expect a number of mobile phones and tablets will need patching, as a lot of them use the affected ARM processors.


Other farm news
Speaking of meltdown, it was like that today, with the temperature getting up to 38 degrees C here, so I had everything off until things cooled down. Even now the only computers I have running are the Raspberry Pi’s.

I do, when it’s cool enough, run the main crunchers overnight. I am still trying to get Asteroids and Einstein up to the 50 million credits that Seti has accumulated.