Computer Systemen 2011: Labbook

Overwritten Labbook 2011

November 8, 2011

Sparta seems to be machine sparta.science.uva.nl, which is reachable from mremote. Also reachable without domain?! Ping gives 146.50.40.202 for sparta and 146.50.4.50 for mremote.

November 3, 2011

You can get the author of a file by ls -galt --author.

October 19, 2011

My favorite Linux machine, u015425, belongs to Efstratios Gavves (C3.250).

October 17, 2011

Repeated psum performance measurements (Fig. 12.33) on machine deze. Values are ((1,7.5),(2,3.8),(4,1.8),(8,0.9), (16,0.49), (32,0.37), (64, 0.38)).
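The measurements above can be turned into speedup and efficiency figures. The following is a minimal sketch, assuming each pair is (number of threads, elapsed seconds) and using the usual definitions speedup = T1 / Tp and efficiency = speedup / p. The function names are illustrative and not part of the course code.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup of a p-thread run relative to the single-thread time t1. */
double speedup(double t1, double tp)
{
    assert(tp > 0.0);
    return t1 / tp;
}

/* Parallel efficiency: speedup divided by the number of threads. */
double efficiency(double t1, double tp, int threads)
{
    assert(threads > 0);
    return speedup(t1, tp) / threads;
}

/* Print a small table for the (threads, seconds) pairs quoted above. */
void print_psum_table(void)
{
    static const int    threads[] = {1, 2, 4, 8, 16, 32, 64};
    static const double seconds[] = {7.5, 3.8, 1.8, 0.9, 0.49, 0.37, 0.38};
    size_t n = sizeof threads / sizeof threads[0];

    printf("%8s %8s %8s %10s\n", "threads", "seconds", "speedup", "efficiency");
    for (size_t i = 0; i < n; i++)
        printf("%8d %8.2f %8.2f %10.2f\n", threads[i], seconds[i],
               speedup(seconds[0], seconds[i]),
               efficiency(seconds[0], seconds[i], threads[i]));
}
```

Calling print_psum_table() from a main function prints one row per measurement; the timings themselves already show that nothing is gained between 32 and 64 threads (0.37 versus 0.38).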

October 11, 2011

Downloaded Go (MinGW port) on u305. Installer dumps everything in C:\go directory.
Started documentation-server with godoc --http=:6060. Compiled one of the tests with 8g printbig.go; 8l printbig.8 -o printbig.exe.
Made the proxylab-handout. The perl-script didn't work, so made the proxy_handout.c manually.
The go-documentation has an example of a simple webserver.
The main-function can't have arguments, but the arguments can be read with the flag-package in Linux-style:
    func main() {
        var value string
        flag.StringVar(&value, "port-number", ":15213", "Use the port-number argument to specify the port of the local machine. Defaults as ':15213'.")
        flag.Parse()
        fmt.Printf("Starting proxy server at %s\n", value)
        http.HandleFunc("/", handler)
        http.ListenAndServe(value, nil)
    }
This works with .\proxy.exe -port-number :15214.
Rewrote so that an integer argument is possible. Looked at http request. Several things can be asked:
The host is localhost:15215! The raw url is /index.html!
Have to look if I can get all the information required to submit the proxy-request. My previous call to lynx seems to ignore the first part and is no proxy-call. Seems that this option has to be set with lynx.cfg (which is called with the -cfg argument).
Tried it again at machine deze, and here it works:
Thread 0: Forwarding request to end server:
GET / HTTP/1.0
Host: www.ai-class.com
Accept: text/html, text/plain, audio/mod, image/*, application/msword, application/pdf, application/postscript, text/sgml, */*;q=0.01
Accept-Encoding: gzip, compress
Accept-Language: en
User-Agent: Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.8e-fips-rhel5
*** End of Request ***
Thread 0: Forwarded 8192 bytes from end server to client
Thread 0: Forwarded 444 bytes from end server to client

October 10, 2011

Next week, I have to start with H8 and H12 (processes and threads).
Wednesday and Thursday I should cover RIO and sockets.

October 6, 2011

Solved the issue with the first wrong measurement by changing the order of the test (descending). Eszter reports that this doesn't work on her laptop (last 4 measurements are twice as large).
Tried out the proxy-lab. Tested proxy-solution with lynx deze:18216 http://www.google.nl. Needed to use a different port every time (check how the port is closed). Nothing in the proxy.log. No code to shut down the client (should add signal handler). Could also test with CentOS and Windows. You can activate a proxy in Explorer->Tools->Internet Options->LAN Network. With Mozilla under CentOS the proxy is under edit preferences->Network->Settings.
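The missing signal handler mentioned above could look roughly like this. It is a minimal sketch using only standard C signal(); the names install_shutdown_handler and shutdown_requested are hypothetical and not part of the proxy-lab code, and the accept loop that would poll the flag is not shown.

```c
#include <signal.h>

/* Set by the handler; a (hypothetical) accept loop would poll this flag
 * and close its listening socket before exiting. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;            /* only one signal is registered */
    stop_requested = 1;     /* writing a sig_atomic_t is safe in a handler */
}

/* Install the handler for Ctrl-C. Returns 0 on success, -1 on failure. */
int install_shutdown_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/* Non-zero once SIGINT has been received. */
int shutdown_requested(void)
{
    return stop_requested != 0;
}
```

How a blocking accept call reacts to the signal depends on how the handler is installed, so this only covers the flag part of a clean shutdown.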

October 5, 2011

Sammie complains that the output of the kernel is not correct.
Tested AenE's code, and also there the Mean is not printed for the current version (which is intended behavior, only the mean speedup is printed).
Changed the code,
    void register_line_functions() {
        /* ... the current version is tested first */
        add_line_function(&version1, VERSION1_DESCR);
        /* ... Register intermediate results here */
        /* ... Remove the naive implementation as fast as possible, because testing this version takes a lot of time */
        add_line_function(&version2, VERSION2_DESCR);
        set_line_baseline(VERSION1_DESCR);
    }
but strangely enough version1 doesn't seem to be tested, while with Remy it seems that the first measurement is wrong (version1 vs version1):
(arnoud@deze) ./driver
Teamname: REMSAM!
Member 1: Remi de Zoeten
Email 1: Remi.de.z@gmail.com
Member 2: Sammie Katt
Email 2: Sammie.katt@gmail.com
Processor Cache Size ~= 12288 Kb
Processor Block Size ~= 64 Kb
Line: Version = version2: test!:
Dim             128    256    512    1024   1536   Mean
Your CPEs       21.9   13.9   16.0   17.1   21.8
Using baseline implementation 'version1: eerste versie!'
Baseline CPEs   11.0   14.0   16.0   17.1   21.6
Speedup         0.5    1.0    1.0    1.0    1.0    0.9
Summary of Your Best Scores:
Line : 0.9 (version2: test!)
Looked at driver.c, but fcyc is called with cleared cache.

October 3, 2011

Downloaded Intel's performance measurement library. To build this library, also Windows Driver Development Kit is needed. The Driver Development Kit is 619 Mb.
Installed the build environments and the Tools. Changed the compiler in Intel's Makefile from g++ to cl.exe. cl.exe -help gives the command line options. In the Win7 x64 Checked Build Environment the nmake fails on missing assert.h. Tried the other Visual Studio command prompts. VS2005, VS2008 and VS2010 all fail on line 64 on the conversion of wchar_t to LPCSTR.
Removing the L from the string solved this issue. cpucounters.cpp compiles, next problem is cpucounterstest.cpp (can not find windriver.h).
windriver.h is in the directory PCM_Win, which also contains a projectfile which nearly works. The remaining problem is that intrin.h and windows.h cannot be loaded at the same time. Solved the issue by not including intrin.h but defining cpuid myself. This works, except that cpuid cannot be found by the linker. Found an example (VS2005 express) of the usage of cpuid, but no special libraries there.
Made a project with the cpuid-example, works fine. Compiled PCM_Win as Win32 without including windows.h, and now suddenly it works. The only remaining problem is that pcm 1 -nc -ns complains about a missing signed msr.sys driver (as specified in Windows_HOWTO.rtf). Unfortunately, you need a certificate from a certification authority ...

September 30, 2011

Looked at answer of first question. Print(%x) gives the same answer for 4e and 4i. Printing the bytes shows that 4i is correct (number previously used was too big, difference was rounded). Tested with -4 and -1.
One student claimed that there was a counterexample for each solution to (a < 0) ? 1 : -1. Tested solution h with (INT_MIN, -1, 0, +1, INT_MAX); all equivalent.
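A boundary-value check like the one described above can be sketched as follows. The branch-free expression is a hypothetical candidate, not necessarily the "solution h" from the exercise, and it assumes a 32-bit two's complement int.

```c
#include <assert.h>
#include <limits.h>

/* Reference: the expression from the exercise, written with a branch. */
int sign_ref(int a)
{
    return (a < 0) ? 1 : -1;
}

/* Branch-free candidate (hypothetical): isolate the sign bit, shift it to
 * bit 1 (giving 2 for negative input, 0 otherwise), then subtract 1. */
int sign_bits(int a)
{
    assert(sizeof(int) * CHAR_BIT == 32);   /* assumption of this sketch */
    unsigned sign = (unsigned)a >> 31;      /* 1 if a < 0, else 0 */
    return (int)(sign << 1) - 1;
}

/* Count mismatches over the boundary values used in the entry above. */
int count_mismatches(void)
{
    static const int samples[] = {INT_MIN, -1, 0, +1, INT_MAX};
    int bad = 0;
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (sign_ref(samples[i]) != sign_bits(samples[i]))
            bad++;
    return bad;
}
```

Five samples cannot prove equivalence, but they cover the places where such expressions usually break: the extremes, zero, and both neighbours of zero.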

September 27, 2011

Problem 2.90. The decimal representation of 1/7 is 0.142857142857..., so y = 142857 (Problem 2.82). 22/7 is 3.142857142857... (same y). M_PI is 3.14159, so both numbers diverge at the third decimal place.
The little endian representation of 22/7 is 93 24 49 40, of M_PI db 0f 49 40, so the difference is in the least significant bits (93 24 versus db 0f).
Should look at getting the fraction first. Problem 2.89 is solved as PROBLEM 4.
The sign is the same. exp = 40 49 = 0100 0000 0100 1001; the eight bits after the sign bit are 1000 0000 = 128, and 128 - 127 = 1. The fraction is 49 24 93 without its top bit = 100 1001 0010 0100 1001 0011 (repeating pattern 100), so the value is (1 + 4793491/2^23) * 2 = (1 + 4,793,491/8,388,608) * 2 = (1.57142865657) * 2 = 3.1428573131561279296875.
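The manual decoding above can be cross-checked with a few lines of C. This is a sketch under the assumption that float is IEEE 754 single precision; the struct and function names are illustrative. The bytes 93 24 49 40 in little endian order correspond to the bit pattern 0x40492493.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct float_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits */
};

/* Split a raw single-precision bit pattern into its three fields. */
struct float_fields decode_bits(uint32_t bits)
{
    struct float_fields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFF;
    f.fraction = bits & 0x7FFFFF;
    return f;
}

/* Reinterpret a float as its bit pattern (memcpy avoids aliasing problems). */
uint32_t float_to_bits(float x)
{
    uint32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Value of a normalized pattern: (1 + fraction / 2^23) * 2^(exponent - 127). */
double normalized_value(struct float_fields f)
{
    double m = 1.0 + f.fraction / 8388608.0;
    int e = (int)f.exponent - 127;
    while (e > 0) { m *= 2.0; e--; }
    while (e < 0) { m /= 2.0; e++; }
    return f.sign ? -m : m;
}
```

For example, decode_bits(0x40492493) yields sign 0, exponent 128 and fraction 4793491, matching the hand calculation; float_to_bits can be used to obtain the pattern of any value computed at run time.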

September 23, 2011

Downloaded CoreInfo, which is a Microsoft TechNet tool to display information about the processor. CoreInfo reports that nb-unreal supports at most SSSE3 instructions (and not yet AVX instructions).
Also downloaded ProcessExplorer, which also gives an estimate of the performance of the GPU.

September 21, 2011

Used u015425 to mount uva_home. Emicro doesn't know cfis (or ssh).

September 19, 2011

In the combination of instruction set and 64bits basis, the goals are only told at the end of the presentation, not at the beginning.
Use voting to get a group response (if possible after each section).

September 7, 2011

Looked at floating point puzzles at slide 34 of floating point presentation. Should say that x == (int)(float) x fails for numbers larger than 2^23. Yet, a direct test didn't fail. Included it in the test-harness of datalab. Still OK. Strange, because Problem 2.49 claims the same as my hypothesis (2^24+1). It turns out that 2^24 is an XOR in C; I should have used pow(2,24). Solved by initializing x with x = 1 + (1 << 24); (warning about parentheses). Now my function fails (btest still OK, is this x not sampled?).
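Both pitfalls from this entry fit in a short sketch: the ^ operator being XOR rather than exponentiation, and the int to float round trip losing the lowest bit just above 2 to the power 24. The function names are illustrative, and the code assumes IEEE 754 single-precision float with round-to-nearest-even.

```c
#include <assert.h>

/* In C, ^ is bitwise XOR, so 2 ^ 24 is 26 and not 16777216. */
int xor_not_power(void)
{
    return 2 ^ 24;
}

/* Returns 1 if x survives the round trip int -> float -> int.
 * The volatile store forces the value to be rounded to single precision
 * instead of possibly staying in a wider register. */
int survives_float_roundtrip(int x)
{
    /* keep the float well inside the int range so the cast back is defined */
    assert(x > -(1 << 30) && x < (1 << 30));
    volatile float f = (float)x;
    return x == (int)f;
}
```

With 24 bits of significand, 1 << 24 is still exact, while 1 + (1 << 24) needs 25 bits and is rounded back to 1 << 24, so the comparison fails for that value.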
A bit clueless on 2/3 == 2/3.0 and d * d >= 0.0. Solution can be found at Practice Problem 2.54. My intuition was OK for both problems.

September 1, 2011

The new CMU slides are now in accordance with Markus Püschel's presentation guidelines.
New slides use the Calibri font, available inside the Microsoft Office Compatibility Pack. This actually didn't work (package already installed). The PowerPoint 2007 viewer works better (Calibri installed, yet not Calibri bold and italic). Yet, the operator not showing up in 01-overview.pptx is due to another missing font (Zapf Dingbats). Zapf Dingbats is a Macintosh font; according to Microsoft support it should be replaced by MonoType Sorts (which is not available at u305). According to Microsoft support, this font is part of the Office 97 distribution. Downloaded the file from findthatfile.com. Followed Microsoft's instructions to install the font. Tried to replace Zapf Dingbats with MonoType Sorts (via Format->ReplaceFonts), but PowerPoint complains that I selected a single-byte font to replace a double-byte font. Also replacing it with Webdings or Wingdings didn't help.
Strangely enough, on nb-unreal I can display the symbol (an arrow), although Zapf Dingbats is also not installed here.

August 30, 2011

Requests work from the staff-domain, not from edu domain.

August 25, 2011

Logged in on ArnoudsUbuntu (146.50.53.34), aka abeel or u002697. Seems that no scp-daemon is running (portscan only showed ports 40045 and 50596). Ran sudo apt-get install ssh, which updated one package and installed two additional ones (openssh-client and openssh-server).
nohup ./puzzle-requestd.pl -l Fall2011 -p 15213 & works for ArnoudsUbuntu.
Exceed can be found under Programs->Open Text Exceed. Only got a grey screen, but this was because I started a failsafe session. Started ssh sessions from the Windows machine. Couldn't start xclock, because DISPLAY was not set. Once set, the connection was not allowed. Fixed this by changing Exceed->Settings->Security from NoHost to AnyHost.
Session had to be restarted. Both Gnome and KDE failed, only default works. Now I get an xclock. firefox starts up, but is unbelievably slow (5 minutes to restore session). lynx works fine and displays the request form, yet it is not clear how the download is delivered.
Started konqueror &, but the program complains that dcopserver is not running (because DiskQuota exceeded). With quota available again, konqueror and firefox work fine again. Only problem is that the response on the request has address http://arnoudsubuntu:15213?, which is not known (http://146.50.53.34:15213 is).
Replaced hostname -f with hostname -I. Fails on trailing whitespace, which doesn't want to be trimmed?! Putting the IP directly in the code solves the issue (temporarily).
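The trimming step that failed above boils down to taking the first whitespace-delimited token of the command output. The daemon itself is a Perl script; the sketch below only illustrates the idea in C, with a hypothetical helper name, on a string such as the one hostname -I would print.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the first whitespace-delimited token of `in` into `out`.
 * Returns the token length, or 0 if there is no token or it does not fit. */
size_t first_token(const char *in, char *out, size_t outlen)
{
    size_t n = 0;
    assert(in != NULL && out != NULL && outlen > 0);

    while (*in != '\0' && isspace((unsigned char)*in))
        in++;                               /* skip leading whitespace */
    while (in[n] != '\0' && !isspace((unsigned char)in[n]))
        n++;                                /* measure the token */
    if (n == 0 || n >= outlen) {
        out[0] = '\0';
        return 0;
    }
    memcpy(out, in, n);
    out[n] = '\0';                          /* drops trailing space/newline */
    return n;
}
```

Taking only the first token also sidesteps the case where the command reports more than one address.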
With quota available again, KDE also works fine with Exceed. Could set Exceed->Settings back to NoHost.
146.50.53.34 is reachable from mremote as abeel.science.uva.nl.

August 22, 2011

svn doesn't work anymore on sparta, version in /usr/bin is too old (v1.4.4). Softpkg doesn't work anymore on sparta, is not installed in /usr/local/bin/softpkg. Download from Gert and Make doesn't work, because command arch is not found.
Tunis (u021055) is no longer reachable. Tunis is now in a different domain: u021055.1x.uva.nl. Still unknown host.
Logging in to emicro fails horribly (rather short path). u015425 works better. Copied the binary on u015425 to my packages and now softpkg works again at sparta (and svn).

August 15, 2011

Jeroen Roodhart suggested disown, which is a bash builtin. Unfortunately, my shell is tcsh.

August 11, 2011

Created puzzle-requestd.pl, which calls makepuzzle.pl
Script works on sparta, only the tar file has two directories
Script puts everything in one directory, btest works, driver is missing Driverlib.pm.
Modified driver.pl so that it only checks the nickname when given.
Script works, but daemon dies when session ends. Tried nohup &, but this also fails.

August 9, 2011

Looked at bomblab/src/makephases.pl how to make a custom selection for the datalab. The datalab makes its selection by including the source files of certain puzzles.
Made for the moment the current selection:
- phase1a -> bitAnd
- phase1b -> bitOr
- phase1c -> bitNor
- ... -> bitXor (all rating 1) : bitlevel manipulation
- phase2a -> getByte
- phase2b -> fitsBits
- phase2c -> divpwr2
- ... -> bitXor (all rating 2, but quite diverse, better:)
- phase2a -> allEvenBits
- phase2b -> fitsBits
- phase2c -> anyOddBit
- Still bitlevel manipulation
- phase3a.c -> IsPositive (8 ops)
- phase3b.c -> IsLessOrEqual (24 ops)
- phase3c.c -> IsAsciiDigit (15 ops)
- ... -> is*.c (rating 2/3) (many only negative of other).
- Phase 3-4 Arithmetic functions
- phase4a.c -> ilog2 (90 ops)
- phase4b.c -> greatestBitPos (70 ops)
- phase4c.c -> howManyBits (90 ops)
- ... (all high rating and many max ops).
- Phase 5-6 Floating point functions
- phase5a.c -> float_neg.c (10 ops)
- phase5b.c -> float_abs.c (10 ops)
- No third! Possible puzzle: bit2double (page 281)
- phase5c.c -> float_abs.c
- puzzles/float_half.c phase6a.c
- puzzles/float_twice.c phase6b.c
- puzzles/float_i2f.c phase6c.c
Note that the rating can be found as comment in the code
./makephases.pl -d ./phases works (after removing the secret phase) and prints the new file to screen. Yet, this is the whole file, not only the student-handout. Phasehead.c was available as bits-header.c. Also decl and tests should be generated (decl-header.c and tests-header.c are available). The files are created from the makefile (make btest).
With an empty (or UvA) header, the output of makephases.pl is equal to selections.c, which seems to be the input for the makefile.
make btest fails, because ./makephases.pl has added an empty bombid. Changed it to int lab_id = 0;. make also creates bits-handout.c. No space behind the lab_id (while specified in makephases.pl, strange, but not essential).
In the bomblab, bombrequestd.pl calls makebomb.pl, which calls makephases.pl.
Made makepuzzle.pl from makebomb.pl, which creates puzzles/puzzle*/selections.c when called with the command makepuzzle.pl -s ../src/ -o puzzles -i 1 -l Fall2011 -u arnoud.
Made a puzzles/Makefile which can create puzzle*/btest from a student's bits.c file, and can create bits-handout.c and bits-solution.c.

August 3, 2011

The De Morgan rules are explained in Ben Bruidegom's book at page 61.
Boole algebra is explained in Bryant's book at page 80, but without De Morgan rules.

August 2, 2011

Looked at the schedule for coming year. Met Vivianne Tolen (professional coach).
Looked at Chapter 2. First Lecture from CMU is huge (80 slides). Datalab assignment runs in Pittsburgh for two weeks, so I have to make a pick in the puzzles (skip at least floating point puzzles?!). According to the release notes, there are 73 integer puzzles.
Looked at first puzzle (looks like Problem 2.14). Tried (~x | ~y), which fails, but indicates the correct solution. When verbose is on I get the following output:
Looking for function 'bitAnd' in ./bddcheck/all-functions.txt
Executing ./bddcheck/checkprogs.pl -t -v -T 10 -f bits.c -p bitAnd -F tests.c -P test_bitAnd
Looking for bitAnd in bits.c
Looking for test_bitAnd in tests.c
./bddcheck/cbit/cbit -t 10 bitAnd-1.c test_bitAnd-2.c
Bug Condition Unsatisfiable
Integer Ops: 34
PI Vars: 64, Boolean Ops: 1
Time: 0.00 sec.
Comparing bits.c:bitAnd to tests.c:test_bitAnd .. OK
In all-functions.txt it was specified that "all" arguments are to be tested for 'bitAnd', yet what "all" means is unclear. The readme of cbit specifies: "Run ./cbit -h for information on how to run the program. The most cryptic part is how to specify range restrictions on the function arguments.". The file cbit/ast-eval.c seems to indicate that 64 bits are tested. When I amend the function with if (y == 123) return 2; this counterexample is found.
My solution is an instantiation of the De Morgan rules from Boolean algebra.
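The entry does not show the final solution, but a De Morgan based bitAnd presumably has the following shape. This is a sketch, with a small brute-force comparison (a hypothetical helper, not part of btest) standing in for the formal check done by cbit.

```c
#include <assert.h>

/* x & y using only ~ and |, by De Morgan: x & y == ~(~x | ~y). */
int bitAnd(int x, int y)
{
    return ~(~x | ~y);
}

/* The first attempt from the entry: ~x | ~y is the complement of x & y. */
int firstAttempt(int x, int y)
{
    return ~x | ~y;
}

/* Compare against the & operator on a small grid; returns the mismatches. */
int check_bitAnd(void)
{
    int bad = 0;
    for (int x = -128; x <= 127; x++)
        for (int y = -128; y <= 127; y++)
            if (bitAnd(x, y) != (x & y))
                bad++;
    assert(bad >= 0);
    return bad;
}
```

The first attempt is exactly one negation away from the answer, which is why its failure already points at the correct solution.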

April 29, 2011

The new AVX capability is reported as a feature by cpuid in register ECX (bit 28).
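Testing this bit could be sketched as follows. The pure helper works on any ECX value returned by CPUID leaf 1; the wrapper assumes GCC or Clang on x86 and uses __get_cpuid from cpuid.h. Both function names are illustrative and not taken from the cpuinfo assignment.

```c
#include <assert.h>
#include <stdint.h>

#define CPUID_ECX_AVX_BIT 28   /* CPUID leaf 1, ECX bit 28 */

/* Pure helper: test the AVX bit in an ECX value from CPUID leaf 1. */
int ecx_has_avx(uint32_t ecx)
{
    assert(CPUID_ECX_AVX_BIT < 32);
    return (int)((ecx >> CPUID_ECX_AVX_BIT) & 1u);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Query the running CPU. Returns 1 or 0, or -1 if leaf 1 is unavailable. */
int cpu_reports_avx(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return ecx_has_avx(ecx);
}
#endif
```

The bit only says that the processor implements AVX; whether the operating system has enabled the wider register state is a separate check that this sketch does not perform.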
Seems that AMD is using sse4a and sse4b instead of sse4_1 and sse4_2. Couldn't find a recent AMD-machine to test it.

April 28, 2011

Looked at Cpuinfo assignment. Found in Bas' Minfo.txt the following machines on the staff domain:
- emicro: sse2 (reachable)
- u015425: ssse3 (reachable)
- abdeel: sse2 (not reachable)
Yet, it is easier to use machines on the edu domain:
- sremote: sse4.1
- ow127: ssse3
- ow132: ssse3
- ow137: ssse3
- ow140: not reachable
- ow147: not reachable
- ow150: ssse3
- ow158: ssse3
Checked what is going on with emicro. The /proc/cpuinfo doesn't show sse3, but bit0 of register0 seems to be active. According to http://fixunix.com/hardware/386709-sse3-not-sse3.html, pni is an alias for sse3. emicro's pni-flag is active. emicro is a dual-core Opteron Processor 2220, which seems to belong to the Santa Rosa.
abdeel is a real sse2.

March 23, 2011

Two errors in question 3:
- line 4 in 3f.
- memory address 0x80038 instead of 0x80032.

March 21, 2011

Inspected /proc/cpuinfo. Nao has a cache size of 128 Kb and a clflush size of 32.
According to :
"The AMD Geode LX-800 used in the Nao has two levels of cache, with the L2 cache of 128 KB having more influence on the performance. The L1 and L2 caches show the expected behavior, with a peak transfer rate of 1.7 GB/sec and 1.5 GB/sec respectively. But the performance of the RAM is much worse, dropping to only 240 MB/sec."
From the AMD specs, the LX-800 has 64K I/64K D L1 cache.
The picture on the specs, indicates that the LX also has an embedded graphics processor. LibBlt seems to be a way to access this embedded graphics processor. Most interesting example is for BltPutBatchOp.
According to the Laptop wiki, the L1 cache miss latency is 10-12 clock cycles, the L2 28-35 clock cycles.

March 14, 2011

Cross-compiled the memory-mountain for the Nao. Program can be executed, but it takes ages to execute. Yet, it seems to work:
Clock frequency is approx. 499.9 MHz
Memory mountain (MB/sec)
     s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19 s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39 s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 s51 s52 s53 s54 s55 s56 s57 s58 s59 s60 s61 s62 s63 s64
32m  171.5 99.7 70.3 55.9 55.6 55.7 55.9 58.1 55.4 55.2 56.1 56.3 56.8 55.7 56.0 59.1 56.8 58.0 56.1 58.2 57.0 57.6 57.7 53.8 57.5 57.5 56.0 56.1 56.9 55.3 55.4 60.7 56.4 62.9 57.2 56.3 62.9 62.9 61.4 59.0 55.5 60.9 62.6 61.7 61.9 60.5 60.9 64.0 61.8 61.1 61.0 61.6 63.9 59.4 60.3 63.9 59.8 59.4 58.7 63.3 58.6 56.8 52.7 67.7
Result is found in duncan.cvs.

March 2, 2011

The Makefile for handin was incorrect. Made a new Makefile.
Some students (quite far along) reported that the first measurement of the first matrix is slower than the rest. Add an extra test to the driver to warm up the cache.
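The warm-up idea can be sketched as one untimed call before the timed runs. This is not the perflab driver, which measures with fcyc; the sketch swaps in the standard clock() function, and sample_work, work_calls and time_with_warmup are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef void (*work_fn)(void);

/* Hypothetical workload: sums a buffer and counts how often it was called. */
int work_calls = 0;
static volatile long sink = 0;

void sample_work(void)
{
    static int buffer[1 << 16];
    long sum = 0;
    for (size_t i = 0; i < sizeof buffer / sizeof buffer[0]; i++)
        sum += buffer[i];
    sink = sum;             /* keep the loop from being optimized away */
    work_calls++;
}

/* Average CPU seconds per call over `runs` calls, after one untimed
 * warm-up call that brings code and data into the cache. */
double time_with_warmup(work_fn fn, int runs)
{
    assert(fn != NULL && runs > 0);
    fn();                                   /* warm-up, not measured */
    clock_t start = clock();
    for (int i = 0; i < runs; i++)
        fn();
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC / runs;
}
```

Whether a warm cache is the right baseline depends on what the measurement is supposed to show; the sketch only removes the one-off cost of the very first call.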
Tim reports that driver is only working without set_baseline.
Merel reports that with an extra variable for intensity the test fails.

March 1, 2011

Solved the man-page problem (yet, not yet for gcc): man -C ~/etc/man.config clock_gettime. Inside the man.config the important change is the usage of groff instead of nroff: NROFF /usr/bin/groff -Tascii -man

February 21, 2011

Also making consistent measurements for the line-version.
First measurements for naive:
- 256x256: 78 -> 3518
- 512x512: 190 -> 8255
- 1024x1024: 260 -> 17898
- 1536x1536: 260 -> 33333
- 2048x2048: 260 -> 49472
Naive is really slow, so reintroduced 64x64 and 128x128:
- 64x64: 78 -> 712
- 128x128: 190 -> 1416
Black_line gives a speedup of 254 (cpe 17).
Adding -O2 doesn't increase the speed of Naive:
- 128x128: 1416 -> 1416
- 256x256: 3518 -> 3562
- 512x512: 8255 -> 8260
- 1024x1024: 17898 -> 17902
- 1536x1536: 33333 -> 33303
intense_line gives a speedup of 1199 (cpe 6.5).
mid_line gives a speedup of 1206 (cpe 6.5).
With find_cpe_v (,,work), the test is repeated multiple times, which is much too slow for the line function. Function also gives negative results.
Running black_line for the range 128 ... 1536 gives a speed up of 750x.

February 16, 2011

The distributed perflab code of last year seems to be located on sremote:/home/arnoud/onderwijs/CS/staff/perflab/working64.

February 15, 2011

On tunis, ddd is installed in /usr/bin/ddd. It is not installed on deze.

February 7, 2011

Found at cmu-site additional resources, including updated gdb-commands for 64-bits programs.
Finally, after 8 years, a new release of the bomblab.
Couldn't request a bomb via the webinterface. Tested ./makebomb.pl -s ./src -b ./bombs. Without problems, bomb0 is created. With disas phase_1 and x/s (0x402408) I could defuse the first phase.
In the old webinterface of the bomb, only two scripts were recently updated. Sendbomb.pl:
< system("cd $bombdir; tar cf - $bombname/{bomb,bomb.c} | gmime-uuencode $bombname.tar $uufile > $uufile") == 0s
---
> system("cd $bombdir; tar cf - $bombname/{bomb,bomb.c} | uuencode $bombname.tar > /tmp/$bombname.$$") == 0
and bomb-requestd.pl
< $form .= "With this form you can request your own personal bomb.\n";
< $form .= " This request only seems to work from the science.uva.nl domain. \n";
119c117
< $form .= "your team and then click the Request button. \n";
---
> $form .= "your team and then click the Submit button. \n";
145c143
< $form .= "\n";
---
> $form .= "\n";
191c189
< $notifyflag = "-q";
---
> $notifyflag = "";
225,227c223
< #$server_dname = hostname();
< #$server_dname = "faro.science.uva.nl";
< chomp ($server_dname = `hostname -f`);
---
> $server_dname = hostname();
Moved the new source to the old webserver scripts. A bomb is now started to be created, only the script is redirected from tunis to u021055. After a restart, bomb2.tar was delivered.

February 2, 2011

Looked at the specs of the processor of deze. Deze has multiple E7530 processors. Found out via linuxforums how I should interpret the information in /proc/cpuinfo: Deze has four processors, each with six cores and hyperthreading (doubling each core into two logical processors). Total number of processors is 48, but cpuid should answer 12 (instead of 20). Intel seems to suggest that the number of logical processors should be queried by command 0xB.
Adjusted cpuinfo code. Seems to work correctly on deze, but both on sparta and tunis cpuinfo reports hyperthreading true (so no double cores, only two threads). Yet, sparta is an E2160, with two cores and no hyperthreading capabilities. Why is this flag true?
At the end, made a scheme where the flag is read, but overwritten to false if no response on the topo_req is received. Intel-64-architecture indicates three possible cases. Anyway, works now correctly on deze, ow150, mremote, sparta, tunis. htt or hyperthreading is not in /proc/cpuinfo, but /home/arnoud/bin/linux/cpuid gives two clues: first that htt can also mean multi-core supported, second that (multi-processing synth): is only (t=2) for the dual cores (without virtual htt), while (t=32!?) for Xeon server with 6 cores (and additional htt).

February 1, 2011

No connection with ow137, ow132, or ow127. Could connect to ow140 and ow158.
No lynx on machine 'deze'.
Making cpuinfo with gcc-4.1 gives error messages:
/home/arnoud/onderwijs/CS/staff/intro/2011/cpuinfo.c:151: error: impossible constraint in 'asm'
/home/arnoud/onderwijs/CS/staff/intro/2011/cpuinfo.c:167: error: impossible constraint in 'asm'
The code in the intel directory of both the staff and edu-domain (including get_cpu_type.c from Intel) uses the i386 registers. In January 7, 2009, I had a working version of cpuid.c with gcc version 4.1.2.
Found the 2009 version at the staff domain in the cpuinfolab/2009 directory.
Also solved the issue with the vendor and name string: asm requests as argument an unsigned long (8 bytes long on 64-bit systems), but cpuid only fills 4 bytes (an unsigned int on both 32- and 64-bit systems). Example works now fine for both ow150 and deze. gdb is not installed on deze, tar (v1.15) doesn't recognize tgz-files on deze. tar is version 1.17 on ow150.
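The width issue described above can be avoided by storing each cpuid register in a 32-bit type and copying the bytes explicitly. The sketch below assumes a little-endian x86 host and a hypothetical helper name; it takes the register values of CPUID leaf 0 as arguments instead of executing the instruction itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build the 12-character vendor string from the CPUID leaf 0 registers.
 * Each register holds exactly 4 bytes, so uint32_t is used instead of
 * unsigned long (8 bytes on 64-bit Linux). Register order: EBX, EDX, ECX. */
void cpuid_vendor(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    assert(out != NULL);
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
}
```

With the register values 0x756e6547, 0x49656e69 and 0x6c65746e this produces the string GenuineIntel.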

Previous Labbook

Labbook 2010